From roland.westrelin at oracle.com Mon Jan 5 15:48:17 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 5 Jan 2015 16:48:17 +0100
Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1
Message-ID: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com>

http://cr.openjdk.java.net/~roland/8027626/webrev.00/

The following subgraph is where the bug shows up:

        If (18732)
        |        \
     IfTrue     IfFalse
     (18734)    (18733)
        |          |
        |       If (12212)
        |          |     \
        |       IfFalse  IfTrue
        |       (12214)
        |          |
       Region (12183)

Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output.
12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed).
An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection.

So we're trying to optimize a dead branch. I fixed it by making the range check code more robust.

Roland.

From roland.westrelin at oracle.com Mon Jan 5 16:38:13 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 5 Jan 2015 17:38:13 +0100
Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls
Message-ID:

http://cr.openjdk.java.net/~roland/8063086/webrev.00/

With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don't have the special case code and as a result an application can see different result for the same computation whether it's executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1.

Roland.
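The x^2 special case described in the message above can be illustrated outside HotSpot with a small sketch. The helper names `pow_with_square_case` and `square_case_is_consistent` are hypothetical and this is not the code in the webrev. The point is only that `x * x` is a single correctly rounded multiplication, while a general-purpose `pow` routine may round differently, so every execution tier has to take the same path to give the same answer.

```cpp
#include <cmath>

// Hypothetical helper, not HotSpot code: models a pow() entry point that
// applies the x^2 shortcut before falling back to the general routine.
double pow_with_square_case(double x, double y) {
  if (y == 2.0) {
    return x * x;         // one correctly rounded multiplication
  }
  return std::pow(x, y);  // general path for every other exponent
}

// Hypothetical check: true when the shortcut agrees with plain multiplication.
// (Not meaningful for NaN inputs, since NaN never compares equal.)
bool square_case_is_consistent(double x) {
  return pow_with_square_case(x, 2.0) == x * x;
}
```

If all tiers route `pow(x, 2.0)` through the same shortcut, repeated calls cannot change their result when the method moves from one tier to another.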
From vladimir.kozlov at oracle.com Mon Jan 5 18:27:15 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 05 Jan 2015 10:27:15 -0800 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: References: Message-ID: <54AAD783.40600@oracle.com> Looks good. Make sure to run new test with -Xcomp and -XX:-TieredCompilation and Client VM. Thanks, Vladimir On 1/5/15 8:38 AM, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8063086/webrev.00/ > > With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don?t have the special case code and as a result an application can see different result for the same computation whether it?s executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1. > > Roland. > From vladimir.kozlov at oracle.com Mon Jan 5 18:32:18 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 05 Jan 2015 10:32:18 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> Message-ID: <54AAD8B2.1090409@oracle.com> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places. Thanks, Vladimir On 1/5/15 7:48 AM, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8027626/webrev.00/ > > The following subgraph is where the bug shows up: > > If (18732) > | \ > IfTrue IfFalse > (18734) (18733) > | | > | If (12212) > | | \ > | IfFalse IfTrue > | (12214) > | | > Region (12183) > > Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output. > 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. 
IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed).
> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection.
>
> So we're trying to optimize a dead branch. I fixed it by making the range check code more robust.
>
> Roland.
>

From zoltan.majo at oracle.com Mon Jan 5 18:38:43 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 05 Jan 2015 19:38:43 +0100
Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
Message-ID: <54AADA33.30203@oracle.com>

Hi,

please review the following patch.

Bug: https://bugs.openjdk.java.net/browse/JDK-8059606

Problem: Controlling compilation thresholds on a per-method level can be useful for debugging and understanding failures, but currently there is no way to control on a per-method level when methods are compiled.

Solution:

This patch adds support for scaling compilation thresholds on a per-method level using the CompileThresholdScaling flag. For example, the option

-XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5

reduces compilation thresholds for method SomeClass.someMethod() by 50% (but leaves global thresholds unaffected) and results in earlier compilation of the method.

Similar to the global CompileThresholdScaling flag (added in JDK-805604), the per-method CompileThresholdScaling flag works with both tiered and non-tiered modes of operation.

Per-method compilation thresholds are available only in non-product builds to avoid the overhead of accessing fields added by the patch to MethodData and MethodCounters.

The proposed patch supports x86_64, x86_32, and sparc. Do you think it is necessary to support other architectures as well?
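As a rough illustration of what scaling a threshold by a per-method factor means, the sketch below multiplies a threshold by a factor such as the 0.5 in the example option. The function name `scale_threshold` and the clamping behaviour are assumptions made for this example; they are not taken from the webrev.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: apply a scaling factor to a compilation threshold.
// A factor of 1.0 leaves the threshold unchanged; 0.5 halves it, so the
// method reaches the threshold after half as many events.
long scale_threshold(long threshold, double scale) {
  assert(threshold >= 0 && scale >= 0.0);
  const double scaled = static_cast<double>(threshold) * scale;
  // Clamp so that very large factors cannot overflow the integer type.
  const double max_value = static_cast<double>(std::numeric_limits<long>::max());
  if (scaled >= max_value) {
    return std::numeric_limits<long>::max();
  }
  return static_cast<long>(scaled);  // truncates toward zero
}
```

With this model, a threshold of 10000 and a factor of 0.5 gives 5000, while global thresholds that are not passed through the per-method factor stay as they are.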
The patch updates the name of the flags Tier2BackEdgeThreshold, Tier3BackEdgeThreshold, Tier4BackEdgeThreshold (lowercase e in "Back*e*dge") so that the naming is consistent with other backedge-related flags (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and Tier3BackedgeNotifyFreqLog).

This patch is the third (and final) part of JDK-8050853: https://bugs.openjdk.java.net/browse/JDK-8050853 .

Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/

Testing: manual testing on all supported architectures, JPRT.

Thank you and best regards,

Zoltan

From vladimir.kozlov at oracle.com Mon Jan 5 19:13:59 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 05 Jan 2015 11:13:59 -0800
Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
In-Reply-To: <54AADA33.30203@oracle.com>
References: <54AADA33.30203@oracle.com>
Message-ID: <54AAE277.3030209@oracle.com>

On 1/5/15 10:38 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the following patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8059606
>
> Problem: Controlling compilation thresholds on a per-method level can be useful for debugging and understanding
> failures, but currently there is no way to control on a per-method level when methods are compiled.
>
>
> Solution:
>
> This patch adds support for scaling compilation thresholds on a per-method level using the CompileThresholdScaling flag.
> For example, the option
>
> -XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5
>
> reduces compilation thresholds for method SomeClass.sometMethod() by 50% (but leaves global thresholds unaffected) and
> results in earlier compilation of the method.
>
> Similar to the global CompileThresholdScaling flag (added in JDK-805604), the per-method CompileThresholdScaling flag
> works with both tiered and non-tiered modes of operation.
>
> Per-method compilation thresholds are available only in non-product builds to avoid the overhead of accessing fields
> added by the patch MethodData and MethodCounters.

Too many ifdefs :)
The interpreter speed is not important. And the feature could be interesting in product VM too.
The only drawback is 2 additional fields in MDO which is fine.
Can you make it product and run through our performance infrastructure.
Also, as John Rose will say, we should have as much as possible a similar code in product as in tested debug code. Otherwise we are not testing product bits and will get into troubles.

>
> The proposed patch supports x86_64, x86_32, and sparc. Do you think it is necessary to support other architectures as well?

Yes. It should be supported on all platforms.

>
> The patch updates the name of the flags Tier2BackEdgeThreshold, Tier3BackEdgeThreshold, Tier4BackEdgeThreshold
> (lowercase e in "Back*e*dge) so that the naming is consistent with other backedge-related flags
> (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and Tier3BackedgeNotifyFreqLog).

It added noise to main changes and may cause some testing (jfr?) failures. Can we do it separately (other RFE?).

>
> This patch is the third (and final) part of JDK-8050853: https://bugs.openjdk.java.net/browse/JDK-8050853 .
>
>
> Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/

In general looks good.

Thanks,
Vladimir

>
> Testing: manual testing on all supported architectures, JPRT.
>
> Thank you and best regards,
>
>
> Zoltan
>

From john.r.rose at oracle.com Mon Jan 5 20:00:24 2015
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 5 Jan 2015 12:00:24 -0800
Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
In-Reply-To: <54AAE277.3030209@oracle.com>
References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com>
Message-ID: <1413CED3-2DBD-4224-B942-A1F24D864FB8@oracle.com>

On Jan 5, 2015, at 11:13 AM, Vladimir Kozlov wrote:
>
> On 1/5/15 10:38 AM, Zoltán Majó wrote:
>> ...
>>
>> Per-method compilation thresholds are available only in non-product builds to avoid the overhead of accessing fields
>> added by the patch MethodData and MethodCounters.
>
> Too many ifdefs :)
> The interpreter speed is not important. And the feature could be interesting in product VM too.
> The only drawback is 2 additional fields in MDO which is fine.
> Can you make it product and run through our performance infrastructure.
> Also, as John Rose will say, we should have as much as possible a similar code in product as in tested debug code. Otherwise we are not testing product bits and will get into troubles.

Vladimir is right; please consider it said by both of us. :-)

Zoltan, I am glad you are tackling the profiling code, because we need experts in it.

In the long run, the invocation counter code started complex and is growing more complex. We need to rationalize the counters more. The notification frequency idea is a good step in the right direction, since it pulls logic out of the assembly code and into high-level event handling code. Another good step would be to reduce the number of distinct counters visible at the assembly level, thus simplifying the assembly code. The complexity of InvocationCounter is a fossil which deserves to be buried and paved over.

Speaking of ifdefs, I am uncomfortable with the ifdef-TIERED branches in the assembly code. Can we move towards merging those code branches?

And, speaking of footprint, I see no reason why we couldn't shrink the Method layout to handle all counter bookkeeping with a single word, instead of the current two. Also (while I'm on the subject) low-count, non-looping methods do not need a full MethodCounter struct, just a simple inline count with a low-tag bit, with CAS-based state changes. This would have the effect of delaying MC and MD allocation until a method has been used a non-trivial amount, which would reduce footprint if "non-trivial amount" turns out to be large.

counters = union {
  uintptr_t simple_count;     // c = (invocation_count << 16 | notify_mask << 8 | other_flags_we_might_like << 1 | 1)
  uintptr_t method_counters;  // c = ((intptr_t)method_counters_addr | 0)
  uintptr_t method_data;      // c = ((intptr_t)method_data_addr | 0)
}

(Assumes that method_data and method_counters can be distinguished suitably by their contents.)

-- John

From roland.westrelin at oracle.com Mon Jan 5 21:33:17 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 5 Jan 2015 22:33:17 +0100
Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1
In-Reply-To: <54AAD8B2.1090409@oracle.com>
References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com>
Message-ID:

Hi Vladimir,

Thanks for looking at this.

> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places.

IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch?

Roland.
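The single-word counter layout that John Rose outlines earlier in this thread (an inline count with a low tag bit, updated by CAS) can be sketched as follows. This is only a model under stated assumptions: the bit positions follow the comment in his pseudocode, but the names `make_inline`, `is_inline`, `inline_count` and `increment_inline` are hypothetical and nothing here is HotSpot code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical encoding, following the layout in the pseudocode:
//   bit 0      : tag (1 = inline count, 0 = pointer to a counters structure)
//   bits 1..7  : spare flags
//   bits 8..15 : notify mask
//   bits 16+   : invocation count
constexpr std::uintptr_t kInlineTag  = 1;
constexpr unsigned       kMaskShift  = 8;
constexpr unsigned       kCountShift = 16;

inline std::uintptr_t make_inline(std::uintptr_t count, std::uintptr_t notify_mask) {
  assert(notify_mask <= 0xFF);
  return (count << kCountShift) | (notify_mask << kMaskShift) | kInlineTag;
}

inline bool is_inline(std::uintptr_t word) { return (word & kInlineTag) != 0; }

inline std::uintptr_t inline_count(std::uintptr_t word) { return word >> kCountShift; }

// Bump the inline count with a CAS loop. Returns false when the word no
// longer holds an inline count, i.e. it has been replaced by a pointer.
inline bool increment_inline(std::atomic<std::uintptr_t>& word) {
  std::uintptr_t old_word = word.load(std::memory_order_relaxed);
  while (is_inline(old_word)) {
    const std::uintptr_t new_word = old_word + (std::uintptr_t{1} << kCountShift);
    if (word.compare_exchange_weak(old_word, new_word, std::memory_order_relaxed)) {
      return true;
    }
    // On failure compare_exchange_weak reloads old_word; re-check the tag and retry.
  }
  return false;
}
```

Because aligned pointers always have a zero low bit, a reader can tell the two states apart from the word alone, which is what would let allocation of the full structure be delayed until the count is large enough to matter.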
> > Thanks, > Vladimir > > On 1/5/15 7:48 AM, Roland Westrelin wrote: >> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >> >> The following subgraph is where the bug shows up: >> >> If (18732) >> | \ >> IfTrue IfFalse >> (18734) (18733) >> | | >> | If (12212) >> | | \ >> | IfFalse IfTrue >> | (12214) >> | | >> Region (12183) >> >> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output. >> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). >> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. >> >> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. >> >> Roland. >> From rednaxelafx at gmail.com Mon Jan 5 23:10:42 2015 From: rednaxelafx at gmail.com (Krystal Mok) Date: Mon, 5 Jan 2015 15:10:42 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> Message-ID: Hi guys, Do you have a regression test case that can show the bug and show the fix works? I noticed the replay file in the bug report, but a runnable test case (not just replaying) would give me a better sense. Thanks! Best regards, Kris On Mon, Jan 5, 2015 at 1:33 PM, Roland Westrelin < roland.westrelin at oracle.com> wrote: > Hi Vladimir, > > Thanks for looking at this. > > > I think correct fix will be eager dead 12212 If node elimination when > 12214 is replaced by 18733. Keep it connected to graph can cause problems > in other places. 
> > IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make > it an Ideal transformation so that I can call igvn->remove_dead_node() on > the dead branch? > > Roland. > > > > > Thanks, > > Vladimir > > > > On 1/5/15 7:48 AM, Roland Westrelin wrote: > >> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ > >> > >> The following subgraph is where the bug shows up: > >> > >> If (18732) > >> | \ > >> IfTrue IfFalse > >> (18734) (18733) > >> | | > >> | If (12212) > >> | | \ > >> | IfFalse IfTrue > >> | (12214) > >> | | > >> Region (12183) > >> > >> Condition to 12212 is always false so 12214 is replaced by 18733 and > both branches of If 18732 are directly connected to Region 12183. 18733 > still has dead 12212 as output. > >> 12183 doesn't have phis so when it's transformed, If 18732 is > considered for removal. IfTrue 18734 doesn't have uses anymore so it goes > away but IfFalse 18733 still has some (dead branch 12212 is not yet > removed). > >> An If in the dead branch 12212 is processed. Range check smearing > follows dominator controls until 18733, tests whether the If is a range > check and the assert fires because the If only has one projection. > >> > >> So we?re trying to optimize a dead branch. I fixed it by making the > range check code more robust. > >> > >> Roland. > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Tue Jan 6 00:34:15 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 05 Jan 2015 16:34:15 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> Message-ID: <54AB2D87.1080007@oracle.com> Is it again the problem with IGVN list processing order? What happens in a simple test case when only one branch is taking? How projection and If nodes are removed? 
Vladimir

On 1/5/15 1:33 PM, Roland Westrelin wrote:
> Hi Vladimir,
>
> Thanks for looking at this.
>
>> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places.
>
> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch?
>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/5/15 7:48 AM, Roland Westrelin wrote:
>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/
>>>
>>> The following subgraph is where the bug shows up:
>>>
>>>         If (18732)
>>>         |        \
>>>      IfTrue     IfFalse
>>>      (18734)    (18733)
>>>         |          |
>>>         |       If (12212)
>>>         |          |     \
>>>         |       IfFalse  IfTrue
>>>         |       (12214)
>>>         |          |
>>>        Region (12183)
>>>
>>> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output.
>>> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed).
>>> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection.
>>>
>>> So we're trying to optimize a dead branch. I fixed it by making the range check code more robust.
>>>
>>> Roland.
>>>
>

From roland.westrelin at oracle.com Tue Jan 6 13:38:01 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 6 Jan 2015 14:38:01 +0100
Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1
In-Reply-To: <54AB2D87.1080007@oracle.com>
References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com>
Message-ID:

> Is it again the problem with IGVN list processing order?
> What happens in a simple test case when only one branch is taking? How projection and If nodes are removed?

IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE
If{False,True}Node::Identity() replaces the taken projection with the If's control input.
ProjNode::Value() returns top for the non taken projection and the dead branch is killed by propagating top.

So yes, it's an IGVN list processing order problem.

Roland.

>
> Vladimir
>
> On 1/5/15 1:33 PM, Roland Westrelin wrote:
>> Hi Vladimir,
>>
>> Thanks for looking at this.
>>
>>> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places.
>>
>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch?
>>
>> Roland.
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/5/15 7:48 AM, Roland Westrelin wrote:
>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/
>>>>
>>>> The following subgraph is where the bug shows up:
>>>>
>>>>         If (18732)
>>>>         |        \
>>>>      IfTrue     IfFalse
>>>>      (18734)    (18733)
>>>>         |          |
>>>>         |       If (12212)
>>>>         |          |     \
>>>>         |       IfFalse  IfTrue
>>>>         |       (12214)
>>>>         |          |
>>>>        Region (12183)
>>>>
>>>> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output.
>>>> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). >>>> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. >>>> >>>> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. >>>> >>>> Roland. >>>> >> From roland.westrelin at oracle.com Tue Jan 6 13:40:38 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Tue, 6 Jan 2015 14:40:38 +0100 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> Message-ID: Hi Kris, > Do you have a regression test case that can show the bug and show the fix works? I don?t have a simple test case. For the bug to appear, the IGVN needs to process the nodes in a specific order. I don?t see how to get it to do that with a simple test case. Roland. > I noticed the replay file in the bug report, but a runnable test case (not just replaying) would give me a better sense. Thanks! > > Best regards, > Kris > > On Mon, Jan 5, 2015 at 1:33 PM, Roland Westrelin wrote: > Hi Vladimir, > > Thanks for looking at this. > > > I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places. > > IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch? > > Roland. 
> > > > > Thanks, > > Vladimir > > > > On 1/5/15 7:48 AM, Roland Westrelin wrote: > >> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ > >> > >> The following subgraph is where the bug shows up: > >> > >> If (18732) > >> | \ > >> IfTrue IfFalse > >> (18734) (18733) > >> | | > >> | If (12212) > >> | | \ > >> | IfFalse IfTrue > >> | (12214) > >> | | > >> Region (12183) > >> > >> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output. > >> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). > >> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. > >> > >> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. > >> > >> Roland. > >> > > From rednaxelafx at gmail.com Tue Jan 6 18:58:32 2015 From: rednaxelafx at gmail.com (Krystal Mok) Date: Tue, 6 Jan 2015 10:58:32 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> Message-ID: Hi Roland, Thanks for the reply! I'll try to understand the issue and see if I can get a simplified test case. - Kris On Tue, Jan 6, 2015 at 5:40 AM, Roland Westrelin < roland.westrelin at oracle.com> wrote: > Hi Kris, > > > Do you have a regression test case that can show the bug and show the > fix works? > > I don?t have a simple test case. For the bug to appear, the IGVN needs to > process the nodes in a specific order. I don?t see how to get it to do that > with a simple test case. > > Roland. 
> > > I noticed the replay file in the bug report, but a runnable test case > (not just replaying) would give me a better sense. Thanks! > > > > Best regards, > > Kris > > > > On Mon, Jan 5, 2015 at 1:33 PM, Roland Westrelin < > roland.westrelin at oracle.com> wrote: > > Hi Vladimir, > > > > Thanks for looking at this. > > > > > I think correct fix will be eager dead 12212 If node elimination when > 12214 is replaced by 18733. Keep it connected to graph can cause problems > in other places. > > > > IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make > it an Ideal transformation so that I can call igvn->remove_dead_node() on > the dead branch? > > > > Roland. > > > > > > > > Thanks, > > > Vladimir > > > > > > On 1/5/15 7:48 AM, Roland Westrelin wrote: > > >> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ > > >> > > >> The following subgraph is where the bug shows up: > > >> > > >> If (18732) > > >> | \ > > >> IfTrue IfFalse > > >> (18734) (18733) > > >> | | > > >> | If (12212) > > >> | | \ > > >> | IfFalse IfTrue > > >> | (12214) > > >> | | > > >> Region (12183) > > >> > > >> Condition to 12212 is always false so 12214 is replaced by 18733 and > both branches of If 18732 are directly connected to Region 12183. 18733 > still has dead 12212 as output. > > >> 12183 doesn't have phis so when it's transformed, If 18732 is > considered for removal. IfTrue 18734 doesn't have uses anymore so it goes > away but IfFalse 18733 still has some (dead branch 12212 is not yet > removed). > > >> An If in the dead branch 12212 is processed. Range check smearing > follows dominator controls until 18733, tests whether the If is a range > check and the assert fires because the If only has one projection. > > >> > > >> So we?re trying to optimize a dead branch. I fixed it by making the > range check code more robust. > > >> > > >> Roland. > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vladimir.kozlov at oracle.com Tue Jan 6 21:39:04 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 06 Jan 2015 13:39:04 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> Message-ID: <54AC55F8.3010806@oracle.com> Can we simple use replace_input_of(in(0), 0, NULL) in If{False,True}Node::Identity()? It will disconnect IfNode's control edge and put it on worklist. Thanks, Vladimir On 1/6/15 5:38 AM, Roland Westrelin wrote: >> Is it again the problem with IGVN list processing order? >> What happens in a simple test case when only one branch is taking? How projection and If nodes are removed? > > IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE > If{False,True}Node::Identity() replaces the taken projection with the If?s control input. > ProjNode::Value() returns top for the non taken projection and the dead branch is killed by propagating top. > > So yes, it?s an IGVN list processing order problem. > > Roland. > > >> >> Vladimir >> >> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>> Hi Vladimir, >>> >>> Thanks for looking at this. >>> >>>> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places. >>> >>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch? >>> >>> Roland. 
>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>> >>>>> The following subgraph is where the bug shows up: >>>>> >>>>> If (18732) >>>>> | \ >>>>> IfTrue IfFalse >>>>> (18734) (18733) >>>>> | | >>>>> | If (12212) >>>>> | | \ >>>>> | IfFalse IfTrue >>>>> | (12214) >>>>> | | >>>>> Region (12183) >>>>> >>>>> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output. >>>>> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). >>>>> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. >>>>> >>>>> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. >>>>> >>>>> Roland. >>>>> >>> > From roland.westrelin at oracle.com Tue Jan 6 21:44:03 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Tue, 6 Jan 2015 22:44:03 +0100 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: <54AC55F8.3010806@oracle.com> References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> Message-ID: > Can we simple use replace_input_of(in(0), 0, NULL) in If{False,True}Node::Identity()? It will disconnect IfNode's control edge and put it on worklist. But that?s a PhaseIterGVN?s method and Identity is passed a PhaseTransform? It seems in other locations in the code, the code is robust to broken if subgraphs. 
This for instance:

IfNode::Ideal()

  // Another variation of a dead if
  if (outcnt() < 2) return NULL;

Why isn't it good enough here?

Roland.

>
> Thanks,
> Vladimir
>
> On 1/6/15 5:38 AM, Roland Westrelin wrote:
>>> Is it again the problem with IGVN list processing order?
>>> What happens in a simple test case when only one branch is taking? How projection and If nodes are removed?
>>
>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE
>> If{False,True}Node::Identity() replaces the taken projection with the If's control input.
>> ProjNode::Value() returns top for the non taken projection and the dead branch is killed by propagating top.
>>
>> So yes, it's an IGVN list processing order problem.
>>
>> Roland.
>>
>>
>>>
>>> Vladimir
>>>
>>> On 1/5/15 1:33 PM, Roland Westrelin wrote:
>>>> Hi Vladimir,
>>>>
>>>> Thanks for looking at this.
>>>>
>>>>> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places.
>>>>
>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch?
>>>>
>>>> Roland.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote:
>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/
>>>>>>
>>>>>> The following subgraph is where the bug shows up:
>>>>>>
>>>>>>         If (18732)
>>>>>>         |        \
>>>>>>      IfTrue     IfFalse
>>>>>>      (18734)    (18733)
>>>>>>         |          |
>>>>>>         |       If (12212)
>>>>>>         |          |     \
>>>>>>         |       IfFalse  IfTrue
>>>>>>         |       (12214)
>>>>>>         |          |
>>>>>>        Region (12183)
>>>>>>
>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output.
>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal.
IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). >>>>>> An If in the dead branch 12212 is processed. Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. >>>>>> >>>>>> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. >>>>>> >>>>>> Roland. >>>>>> >>>> >> From vladimir.kozlov at oracle.com Tue Jan 6 22:31:00 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 06 Jan 2015 14:31:00 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> Message-ID: <54AC6224.1050605@oracle.com> On 1/6/15 1:44 PM, Roland Westrelin wrote: >> Can we simple use replace_input_of(in(0), 0, NULL) in If{False,True}Node::Identity()? It will disconnect IfNode's control edge and put it on worklist. > > But that?s a PhaseIterGVN?s method and Identity is passed a PhaseTransform? An other suggestion is to delay If{False,True}Node::Identity() optimization until not taken branch is removed: (outcnt() < 2). But it will need to place taken ProjNode back on worklist when we removed other branch. > > It seems in other locations in the code, the code is robust to broken if subgraphs. This for instance: I think it is too late as this bug shows. We should not allow 2 control users at any time in a graph (except for IfNode)! Vladimir > > IfNode::Ideal() > > // Another variation of a dead if > if (outcnt() < 2) return NULL; > > Why isn?t it good enough here? > > Roland. > >> >> Thanks, >> Vladimir >> >> On 1/6/15 5:38 AM, Roland Westrelin wrote: >>>> Is it again the problem with IGVN list processing order? 
>>>> What happens in a simple test case when only one branch is taking? How projection and If nodes are removed? >>> >>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE >>> If{False,True}Node::Identity() replaces the taken projection with the If?s control input. >>> ProjNode::Value() returns top for the non taken projection and the dead branch is killed by propagating top. >>> >>> So yes, it?s an IGVN list processing order problem. >>> >>> Roland. >>> >>> >>>> >>>> Vladimir >>>> >>>> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>>>> Hi Vladimir, >>>>> >>>>> Thanks for looking at this. >>>>> >>>>>> I think correct fix will be eager dead 12212 If node elimination when 12214 is replaced by 18733. Keep it connected to graph can cause problems in other places. >>>>> >>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should I make it an Ideal transformation so that I can call igvn->remove_dead_node() on the dead branch? >>>>> >>>>> Roland. >>>>> >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>>>> >>>>>>> The following subgraph is where the bug shows up: >>>>>>> >>>>>>> If (18732) >>>>>>> | \ >>>>>>> IfTrue IfFalse >>>>>>> (18734) (18733) >>>>>>> | | >>>>>>> | If (12212) >>>>>>> | | \ >>>>>>> | IfFalse IfTrue >>>>>>> | (12214) >>>>>>> | | >>>>>>> Region (12183) >>>>>>> >>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 and both branches of If 18732 are directly connected to Region 12183. 18733 still has dead 12212 as output. >>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is considered for removal. IfTrue 18734 doesn't have uses anymore so it goes away but IfFalse 18733 still has some (dead branch 12212 is not yet removed). >>>>>>> An If in the dead branch 12212 is processed. 
Range check smearing follows dominator controls until 18733, tests whether the If is a range check and the assert fires because the If only has one projection. >>>>>>> >>>>>>> So we?re trying to optimize a dead branch. I fixed it by making the range check code more robust. >>>>>>> >>>>>>> Roland. >>>>>>> >>>>> >>> > From vladimir.kozlov at oracle.com Tue Jan 6 22:35:21 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 06 Jan 2015 14:35:21 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: <54AC6224.1050605@oracle.com> References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> <54AC6224.1050605@oracle.com> Message-ID: <54AC6329.9080005@oracle.com> On 1/6/15 2:31 PM, Vladimir Kozlov wrote: > On 1/6/15 1:44 PM, Roland Westrelin wrote: >>> Can we simple use replace_input_of(in(0), 0, NULL) in >>> If{False,True}Node::Identity()? It will disconnect IfNode's control >>> edge and put it on worklist. >> >> But that?s a PhaseIterGVN?s method and Identity is passed a >> PhaseTransform? > > An other suggestion is to delay If{False,True}Node::Identity() > optimization until not taken branch is removed: (outcnt() < 2). But it > will need to place taken ProjNode back on worklist when we removed other > branch. Add IfNode to has_special_unique_user() so the code in PhaseIterGVN::remove_globally_dead_node() will do it for you. Thanks, Vladimir > >> >> It seems in other locations in the code, the code is robust to broken >> if subgraphs. This for instance: > > I think it is too late as this bug shows. We should not allow 2 control > users at any time in a graph (except for IfNode)! > > Vladimir > >> >> IfNode::Ideal() >> >> // Another variation of a dead if >> if (outcnt() < 2) return NULL; >> >> Why isn?t it good enough here? >> >> Roland. 
>> >>> >>> Thanks, >>> Vladimir >>> >>> On 1/6/15 5:38 AM, Roland Westrelin wrote: >>>>> Is it again the problem with IGVN list processing order? >>>>> What happens in a simple test case when only one branch is taking? >>>>> How projection and If nodes are removed? >>>> >>>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE >>>> If{False,True}Node::Identity() replaces the taken projection with >>>> the If?s control input. >>>> ProjNode::Value() returns top for the non taken projection and the >>>> dead branch is killed by propagating top. >>>> >>>> So yes, it?s an IGVN list processing order problem. >>>> >>>> Roland. >>>> >>>> >>>>> >>>>> Vladimir >>>>> >>>>> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> Thanks for looking at this. >>>>>> >>>>>>> I think correct fix will be eager dead 12212 If node elimination >>>>>>> when 12214 is replaced by 18733. Keep it connected to graph can >>>>>>> cause problems in other places. >>>>>> >>>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should >>>>>> I make it an Ideal transformation so that I can call >>>>>> igvn->remove_dead_node() on the dead branch? >>>>>> >>>>>> Roland. >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>>>>> >>>>>>>> The following subgraph is where the bug shows up: >>>>>>>> >>>>>>>> If (18732) >>>>>>>> | \ >>>>>>>> IfTrue IfFalse >>>>>>>> (18734) (18733) >>>>>>>> | | >>>>>>>> | If (12212) >>>>>>>> | | \ >>>>>>>> | IfFalse IfTrue >>>>>>>> | (12214) >>>>>>>> | | >>>>>>>> Region (12183) >>>>>>>> >>>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 >>>>>>>> and both branches of If 18732 are directly connected to Region >>>>>>>> 12183. 18733 still has dead 12212 as output. >>>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is >>>>>>>> considered for removal. 
IfTrue 18734 doesn't have uses anymore >>>>>>>> so it goes away but IfFalse 18733 still has some (dead branch >>>>>>>> 12212 is not yet removed). >>>>>>>> An If in the dead branch 12212 is processed. Range check >>>>>>>> smearing follows dominator controls until 18733, tests whether >>>>>>>> the If is a range check and the assert fires because the If only >>>>>>>> has one projection. >>>>>>>> >>>>>>>> So we?re trying to optimize a dead branch. I fixed it by making >>>>>>>> the range check code more robust. >>>>>>>> >>>>>>>> Roland. >>>>>>>> >>>>>> >>>> >> From roland.westrelin at oracle.com Wed Jan 7 13:20:36 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Wed, 7 Jan 2015 14:20:36 +0100 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: <54AAD783.40600@oracle.com> References: <54AAD783.40600@oracle.com> Message-ID: <6EDEE610-817A-4136-AD7F-D3307F5FAEF6@oracle.com> > Looks good. Make sure to run new test with -Xcomp and -XX:-TieredCompilation and Client VM. Thanks for the review. I checked the test runs fine in these cases. Roland. > > Thanks, > Vladimir > > On 1/5/15 8:38 AM, Roland Westrelin wrote: >> http://cr.openjdk.java.net/~roland/8063086/webrev.00/ >> >> With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don?t have the special case code and as a result an application can see different result for the same computation whether it?s executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1. >> >> Roland. 
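For readers following the 8063086 thread, here is a minimal sketch of why the special case has to be present in every execution tier. It is an illustration only, not the HotSpot change: `pow_generic` and `pow_with_square_special_case` are hypothetical names standing in for a tier without and with the x*x shortcut. A generic pow routine is not guaranteed to return the same bits as x*x, so results stay consistent only if every tier takes the same path when y is 2.

```cpp
#include <cmath>

// Hypothetical stand-in for a tier that always calls the generic routine.
double pow_generic(double x, double y) {
    return std::pow(x, y);
}

// Hypothetical stand-in for a tier that computes x^2 as x * x.
// A NaN exponent compares unequal to 2.0, so it falls through
// to the generic routine, as does every other exponent.
double pow_with_square_special_case(double x, double y) {
    if (y == 2.0) {
        return x * x;
    }
    return pow_generic(x, y);
}
```

If all tiers route through the second function, the result for y == 2 is the plain product regardless of which tier executed the call.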
>> From roland.westrelin at oracle.com Wed Jan 7 22:02:25 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Wed, 7 Jan 2015 23:02:25 +0100 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: <54AC6329.9080005@oracle.com> References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> <54AC6224.1050605@oracle.com> <54AC6329.9080005@oracle.com> Message-ID: >>>> Can we simple use replace_input_of(in(0), 0, NULL) in >>>> If{False,True}Node::Identity()? It will disconnect IfNode's control >>>> edge and put it on worklist. >>> >>> But that?s a PhaseIterGVN?s method and Identity is passed a >>> PhaseTransform? >> >> An other suggestion is to delay If{False,True}Node::Identity() >> optimization until not taken branch is removed: (outcnt() < 2). But it >> will need to place taken ProjNode back on worklist when we removed other >> branch. > > Add IfNode to has_special_unique_user() so the code in PhaseIterGVN::remove_globally_dead_node() will do it for you. Thanks for the suggestion, Vladimir. Does that look ok? http://cr.openjdk.java.net/~roland/8027626/webrev.01/ Roland. > > Thanks, > Vladimir > >> >>> >>> It seems in other locations in the code, the code is robust to broken >>> if subgraphs. This for instance: >> >> I think it is too late as this bug shows. We should not allow 2 control >> users at any time in a graph (except for IfNode)! >> >> Vladimir >> >>> >>> IfNode::Ideal() >>> >>> // Another variation of a dead if >>> if (outcnt() < 2) return NULL; >>> >>> Why isn?t it good enough here? >>> >>> Roland. >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 1/6/15 5:38 AM, Roland Westrelin wrote: >>>>>> Is it again the problem with IGVN list processing order? >>>>>> What happens in a simple test case when only one branch is taking? >>>>>> How projection and If nodes are removed? 
>>>>> >>>>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE >>>>> If{False,True}Node::Identity() replaces the taken projection with >>>>> the If?s control input. >>>>> ProjNode::Value() returns top for the non taken projection and the >>>>> dead branch is killed by propagating top. >>>>> >>>>> So yes, it?s an IGVN list processing order problem. >>>>> >>>>> Roland. >>>>> >>>>> >>>>>> >>>>>> Vladimir >>>>>> >>>>>> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> Thanks for looking at this. >>>>>>> >>>>>>>> I think correct fix will be eager dead 12212 If node elimination >>>>>>>> when 12214 is replaced by 18733. Keep it connected to graph can >>>>>>>> cause problems in other places. >>>>>>> >>>>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should >>>>>>> I make it an Ideal transformation so that I can call >>>>>>> igvn->remove_dead_node() on the dead branch? >>>>>>> >>>>>>> Roland. >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>>>>>> >>>>>>>>> The following subgraph is where the bug shows up: >>>>>>>>> >>>>>>>>> If (18732) >>>>>>>>> | \ >>>>>>>>> IfTrue IfFalse >>>>>>>>> (18734) (18733) >>>>>>>>> | | >>>>>>>>> | If (12212) >>>>>>>>> | | \ >>>>>>>>> | IfFalse IfTrue >>>>>>>>> | (12214) >>>>>>>>> | | >>>>>>>>> Region (12183) >>>>>>>>> >>>>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 >>>>>>>>> and both branches of If 18732 are directly connected to Region >>>>>>>>> 12183. 18733 still has dead 12212 as output. >>>>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is >>>>>>>>> considered for removal. IfTrue 18734 doesn't have uses anymore >>>>>>>>> so it goes away but IfFalse 18733 still has some (dead branch >>>>>>>>> 12212 is not yet removed). >>>>>>>>> An If in the dead branch 12212 is processed. 
Range check >>>>>>>>> smearing follows dominator controls until 18733, tests whether >>>>>>>>> the If is a range check and the assert fires because the If only >>>>>>>>> has one projection. >>>>>>>>> >>>>>>>>> So we?re trying to optimize a dead branch. I fixed it by making >>>>>>>>> the range check code more robust. >>>>>>>>> >>>>>>>>> Roland. >>>>>>>>> >>>>>>> >>>>> >>> From vladimir.kozlov at oracle.com Wed Jan 7 22:58:05 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 07 Jan 2015 14:58:05 -0800 Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1 In-Reply-To: References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> <54AC6224.1050605@oracle.com> <54AC6329.9080005@oracle.com> Message-ID: <54ADB9FD.10507@oracle.com> Good. I don't like *virtual* methods :) and we can do with (t->field_at(1 - _con) == Type::TOP) but your code is more clear. May be change method to return bool: virtual bool always_taken(const TypeTuple* t) const { return (t == TypeTuple::IFFALSE); } Thanks, Vladimir On 1/7/15 2:02 PM, Roland Westrelin wrote: >>>>> Can we simple use replace_input_of(in(0), 0, NULL) in >>>>> If{False,True}Node::Identity()? It will disconnect IfNode's control >>>>> edge and put it on worklist. >>>> >>>> But that?s a PhaseIterGVN?s method and Identity is passed a >>>> PhaseTransform? >>> >>> An other suggestion is to delay If{False,True}Node::Identity() >>> optimization until not taken branch is removed: (outcnt() < 2). But it >>> will need to place taken ProjNode back on worklist when we removed other >>> branch. >> >> Add IfNode to has_special_unique_user() so the code in PhaseIterGVN::remove_globally_dead_node() will do it for you. > > > Thanks for the suggestion, Vladimir. Does that look ok? > > http://cr.openjdk.java.net/~roland/8027626/webrev.01/ > > Roland. 
> >> >> Thanks, >> Vladimir >> >>> >>>> >>>> It seems in other locations in the code, the code is robust to broken >>>> if subgraphs. This for instance: >>> >>> I think it is too late as this bug shows. We should not allow 2 control >>> users at any time in a graph (except for IfNode)! >>> >>> Vladimir >>> >>>> >>>> IfNode::Ideal() >>>> >>>> // Another variation of a dead if >>>> if (outcnt() < 2) return NULL; >>>> >>>> Why isn?t it good enough here? >>>> >>>> Roland. >>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 1/6/15 5:38 AM, Roland Westrelin wrote: >>>>>>> Is it again the problem with IGVN list processing order? >>>>>>> What happens in a simple test case when only one branch is taking? >>>>>>> How projection and If nodes are removed? >>>>>> >>>>>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE >>>>>> If{False,True}Node::Identity() replaces the taken projection with >>>>>> the If?s control input. >>>>>> ProjNode::Value() returns top for the non taken projection and the >>>>>> dead branch is killed by propagating top. >>>>>> >>>>>> So yes, it?s an IGVN list processing order problem. >>>>>> >>>>>> Roland. >>>>>> >>>>>> >>>>>>> >>>>>>> Vladimir >>>>>>> >>>>>>> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>> Thanks for looking at this. >>>>>>>> >>>>>>>>> I think correct fix will be eager dead 12212 If node elimination >>>>>>>>> when 12214 is replaced by 18733. Keep it connected to graph can >>>>>>>>> cause problems in other places. >>>>>>>> >>>>>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should >>>>>>>> I make it an Ideal transformation so that I can call >>>>>>>> igvn->remove_dead_node() on the dead branch? >>>>>>>> >>>>>>>> Roland. 
>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>>>>>>> >>>>>>>>>> The following subgraph is where the bug shows up: >>>>>>>>>> >>>>>>>>>> If (18732) >>>>>>>>>> | \ >>>>>>>>>> IfTrue IfFalse >>>>>>>>>> (18734) (18733) >>>>>>>>>> | | >>>>>>>>>> | If (12212) >>>>>>>>>> | | \ >>>>>>>>>> | IfFalse IfTrue >>>>>>>>>> | (12214) >>>>>>>>>> | | >>>>>>>>>> Region (12183) >>>>>>>>>> >>>>>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 >>>>>>>>>> and both branches of If 18732 are directly connected to Region >>>>>>>>>> 12183. 18733 still has dead 12212 as output. >>>>>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is >>>>>>>>>> considered for removal. IfTrue 18734 doesn't have uses anymore >>>>>>>>>> so it goes away but IfFalse 18733 still has some (dead branch >>>>>>>>>> 12212 is not yet removed). >>>>>>>>>> An If in the dead branch 12212 is processed. Range check >>>>>>>>>> smearing follows dominator controls until 18733, tests whether >>>>>>>>>> the If is a range check and the assert fires because the If only >>>>>>>>>> has one projection. >>>>>>>>>> >>>>>>>>>> So we?re trying to optimize a dead branch. I fixed it by making >>>>>>>>>> the range check code more robust. >>>>>>>>>> >>>>>>>>>> Roland. >>>>>>>>>> >>>>>>>> >>>>>> >>>> > From roland.westrelin at oracle.com Thu Jan 8 09:26:40 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Thu, 8 Jan 2015 10:26:40 +0100 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: <6EDEE610-817A-4136-AD7F-D3307F5FAEF6@oracle.com> References: <54AAD783.40600@oracle.com> <6EDEE610-817A-4136-AD7F-D3307F5FAEF6@oracle.com> Message-ID: When trying to push this through jprt I had an assert failure due to an unreachable address. 
Here is a new webrev:

http://cr.openjdk.java.net/~roland/8063086/webrev.01/

with the following change:

--- a/src/cpu/x86/vm/macroAssembler_x86.cpp
+++ b/src/cpu/x86/vm/macroAssembler_x86.cpp
@@ -3189,7 +3189,9 @@
   static double two = 2.0;
   ExternalAddress two_addr((address)&two);
 
-  fld_d(two_addr); // Stack: 2 X Y
+  // constant maybe too far on 64 bit
+  lea(tmp2, two_addr);
+  fld_d(Address(tmp2, 0)); // Stack: 2 X Y
   fcmp(tmp, 2, true, false); // Stack: X Y
   jcc(Assembler::parity, y_not_2);
   jcc(Assembler::notEqual, y_not_2);

Roland.

> On Jan 7, 2015, at 2:20 PM, Roland Westrelin wrote:
>
>> Looks good. Make sure to run new test with -Xcomp and -XX:-TieredCompilation and Client VM.
>
> Thanks for the review. I checked the test runs fine in these cases.
>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/5/15 8:38 AM, Roland Westrelin wrote:
>>> http://cr.openjdk.java.net/~roland/8063086/webrev.00/
>>>
>>> With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don't have the special case code and as a result an application can see different result for the same computation whether it's executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1.
>>>
>>> Roland.
>>>
>
From roland.westrelin at oracle.com Thu Jan 8 09:27:15 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 8 Jan 2015 10:27:15 +0100
Subject: RFR(XS): 8027626: assert(Opcode() != Op_If || outcnt() == 2) failed: bad if #1
In-Reply-To: <54ADB9FD.10507@oracle.com>
References: <069E61B4-7573-45FD-9F84-95D515D2A88A@oracle.com> <54AAD8B2.1090409@oracle.com> <54AB2D87.1080007@oracle.com> <54AC55F8.3010806@oracle.com> <54AC6224.1050605@oracle.com> <54AC6329.9080005@oracle.com> <54ADB9FD.10507@oracle.com>
Message-ID: <5D70D471-AAF4-4A71-A576-93D1134F001E@oracle.com>

> Good.
>
> I don't like *virtual* methods :) and we can do with (t->field_at(1 - _con) == Type::TOP) but your code is more clear.
> > May be change method to return bool: > > virtual bool always_taken(const TypeTuple* t) const { return (t == TypeTuple::IFFALSE); } Thanks for the review and comments. I?ll make that change before I push it. Roland. > > Thanks, > Vladimir > > > On 1/7/15 2:02 PM, Roland Westrelin wrote: >>>>>> Can we simple use replace_input_of(in(0), 0, NULL) in >>>>>> If{False,True}Node::Identity()? It will disconnect IfNode's control >>>>>> edge and put it on worklist. >>>>> >>>>> But that?s a PhaseIterGVN?s method and Identity is passed a >>>>> PhaseTransform? >>>> >>>> An other suggestion is to delay If{False,True}Node::Identity() >>>> optimization until not taken branch is removed: (outcnt() < 2). But it >>>> will need to place taken ProjNode back on worklist when we removed other >>>> branch. >>> >>> Add IfNode to has_special_unique_user() so the code in PhaseIterGVN::remove_globally_dead_node() will do it for you. >> >> >> Thanks for the suggestion, Vladimir. Does that look ok? >> >> http://cr.openjdk.java.net/~roland/8027626/webrev.01/ >> >> Roland. >> >>> >>> Thanks, >>> Vladimir >>> >>>> >>>>> >>>>> It seems in other locations in the code, the code is robust to broken >>>>> if subgraphs. This for instance: >>>> >>>> I think it is too late as this bug shows. We should not allow 2 control >>>> users at any time in a graph (except for IfNode)! >>>> >>>> Vladimir >>>> >>>>> >>>>> IfNode::Ideal() >>>>> >>>>> // Another variation of a dead if >>>>> if (outcnt() < 2) return NULL; >>>>> >>>>> Why isn?t it good enough here? >>>>> >>>>> Roland. >>>>> >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 1/6/15 5:38 AM, Roland Westrelin wrote: >>>>>>>> Is it again the problem with IGVN list processing order? >>>>>>>> What happens in a simple test case when only one branch is taking? >>>>>>>> How projection and If nodes are removed? 
>>>>>>> >>>>>>> IfNode::Value() returns either TypeTuple::IFFALSE or TypeTuple::IFTRUE >>>>>>> If{False,True}Node::Identity() replaces the taken projection with >>>>>>> the If?s control input. >>>>>>> ProjNode::Value() returns top for the non taken projection and the >>>>>>> dead branch is killed by propagating top. >>>>>>> >>>>>>> So yes, it?s an IGVN list processing order problem. >>>>>>> >>>>>>> Roland. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 1/5/15 1:33 PM, Roland Westrelin wrote: >>>>>>>>> Hi Vladimir, >>>>>>>>> >>>>>>>>> Thanks for looking at this. >>>>>>>>> >>>>>>>>>> I think correct fix will be eager dead 12212 If node elimination >>>>>>>>>> when 12214 is replaced by 18733. Keep it connected to graph can >>>>>>>>>> cause problems in other places. >>>>>>>>> >>>>>>>>> IfFalseNode::Identity() is what optimizes 12212/12214 out. Should >>>>>>>>> I make it an Ideal transformation so that I can call >>>>>>>>> igvn->remove_dead_node() on the dead branch? >>>>>>>>> >>>>>>>>> Roland. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Vladimir >>>>>>>>>> >>>>>>>>>> On 1/5/15 7:48 AM, Roland Westrelin wrote: >>>>>>>>>>> http://cr.openjdk.java.net/~roland/8027626/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> The following subgraph is where the bug shows up: >>>>>>>>>>> >>>>>>>>>>> If (18732) >>>>>>>>>>> | \ >>>>>>>>>>> IfTrue IfFalse >>>>>>>>>>> (18734) (18733) >>>>>>>>>>> | | >>>>>>>>>>> | If (12212) >>>>>>>>>>> | | \ >>>>>>>>>>> | IfFalse IfTrue >>>>>>>>>>> | (12214) >>>>>>>>>>> | | >>>>>>>>>>> Region (12183) >>>>>>>>>>> >>>>>>>>>>> Condition to 12212 is always false so 12214 is replaced by 18733 >>>>>>>>>>> and both branches of If 18732 are directly connected to Region >>>>>>>>>>> 12183. 18733 still has dead 12212 as output. >>>>>>>>>>> 12183 doesn't have phis so when it's transformed, If 18732 is >>>>>>>>>>> considered for removal. 
IfTrue 18734 doesn't have uses anymore
>>>>>>>>>>> so it goes away but IfFalse 18733 still has some (dead branch
>>>>>>>>>>> 12212 is not yet removed).
>>>>>>>>>>> An If in the dead branch 12212 is processed. Range check
>>>>>>>>>>> smearing follows dominator controls until 18733, tests whether
>>>>>>>>>>> the If is a range check and the assert fires because the If only
>>>>>>>>>>> has one projection.
>>>>>>>>>>>
>>>>>>>>>>> So we're trying to optimize a dead branch. I fixed it by making
>>>>>>>>>>> the range check code more robust.
>>>>>>>>>>>
>>>>>>>>>>> Roland.
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>
From albert.noll at oracle.com Thu Jan 8 11:46:18 2015
From: albert.noll at oracle.com (Albert Noll)
Date: Thu, 08 Jan 2015 12:46:18 +0100
Subject: [9] RFR(XXS): 8068661: Exclude compiler/whitebox/ForceNMethodSweepTest.java from nightly runs
Message-ID: <54AE6E0A.3050006@oracle.com>

Hi,

this fix excludes compiler/whitebox/ForceNMethodSweepTest.java from the nightly runs.
The test is unstable for the following reasons:

A) The test is started with: -XX:CompileCommand=compileonly,SimpleTestCase$Helper::*
As a result, all methods of SimpleTestCase$Helper can be compiled. These include the following accessor methods:
SimpleTestCase$Helper.access$1400()I alive
SimpleTestCase$Helper.access$1300(LSimpleTestCase$Helper;)I alive

Since background compilation is enabled, it is possible that methods of the class SimpleTestCase$Helper are compiled just after (3) is executed.

1) int afterCompilation = getTotalUsage();
2) Asserts.assertGT(afterCompilation, usage, "compilation should increase usage");
3) guaranteedSweep();
4) int afterSweep = getTotalUsage();
5) Asserts.assertLTE(afterSweep, afterCompilation, "sweep shouldn't increase usage");

B) Another possible problem is that we have class loading in (2). Since adapters are created eagerly, there is a potential allocation in the code cache for adapters.
In the executions I observed, this was not a problem due to adapter sharing (there already exists an adapter, since a class that contains a method with the same signature was loaded before). However, there is no guarantee that adapter sharing will also make this test work in the future. Here is the webrev: http://cr.openjdk.java.net/~anoll/8068661/webrev.00/ Thanks, Albert From david.r.chase at oracle.com Thu Jan 8 14:22:13 2015 From: david.r.chase at oracle.com (David Chase) Date: Thu, 8 Jan 2015 09:22:13 -0500 Subject: [9] RFR(XXS): 8068661: Exclude compiler/whitebox/ForceNMethodSweepTest.java from nightly runs In-Reply-To: <54AE6E0A.3050006@oracle.com> References: <54AE6E0A.3050006@oracle.com> Message-ID: <11D63676-3310-4B52-8910-4C4C7880E5D6@oracle.com> I (not a Reviewer) approve this change and reasoning. Is there a followup bug for someone somewhere else to fix the white box test package to get rid of these problems? David On 2015-01-08, at 6:46 AM, Albert Noll wrote: > Hi, > > this fix excludes compiler/whitebox/ForceNMethodSweepTest.java from the nightly runs. > The test is unstable for the following reasons: > > A) The test is started with: -XX:CompileCommand=compileonly,SimpleTestCase$Helper::* > As a result, all methods of SimpleTestCase$Helper can be compiled. These include the following accessor methods: > SimpleTestCase$Helper.access$1400()I alive > SimpleTestCase$Helper.access$1300(LSimpleTestCase$Helper;)I alive > > Since background compilation is enabled, it is possible that methods of the class SimpleTestCase$Helper are compiled just after (3) is executed. > > 1) int afterCompilation = getTotalUsage(); > 2) Asserts.assertGT(afterCompilation, usage, "compilation should increase usage"); > 3) guaranteedSweep(); > 4) int afterSweep = getTotalUsage(); > 5) Asserts.assertLTE(afterSweep, afterCompilation, "sweep shouldn't increase usage"); > > B) Another possible problem is that we have class loading in (2). 
Since adapters are created eagerly, there is a potential allocation in the code cache for adapters. In the executions I observed, this was not a problem due to adapter sharing (there already exists an adapter, since a class that contains a method with the same signature was loaded before). However, there is no guarantee that adapter sharing will also make this test work in the future. > > Here is the webrev: > http://cr.openjdk.java.net/~anoll/8068661/webrev.00/ > > Thanks, > Albert From vladimir.kozlov at oracle.com Thu Jan 8 16:58:51 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 08 Jan 2015 08:58:51 -0800 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: References: <54AAD783.40600@oracle.com> <6EDEE610-817A-4136-AD7F-D3307F5FAEF6@oracle.com> Message-ID: <54AEB74B.1010400@oracle.com> Okay. Thanks, Vladimir On 1/8/15 1:26 AM, Roland Westrelin wrote: > When trying to push this through jprt I had an assert failure due to an unreachable address. Here is a new webrev: > > http://cr.openjdk.java.net/~roland/8063086/webrev.01/ > > with the following change: > > --- a/src/cpu/x86/vm/macroAssembler_x86.cpp > +++ b/src/cpu/x86/vm/macroAssembler_x86.cpp > @@ -3189,7 +3189,9 @@ > static double two = 2.0; > ExternalAddress two_addr((address)&two); > > - fld_d(two_addr); // Stack: 2 X Y > + // constant maybe too far on 64 bit > + lea(tmp2, two_addr); > + fld_d(Address(tmp2, 0)); // Stack: 2 X Y > fcmp(tmp, 2, true, false); // Stack: X Y > jcc(Assembler::parity, y_not_2); > jcc(Assembler::notEqual, y_not_2); > > Roland. > >> On Jan 7, 2015, at 2:20 PM, Roland Westrelin wrote: >> >>> Looks good. Make sure to run new test with -Xcomp and -XX:-TieredCompilation and Client VM. >> >> Thanks for the review.
I checked the test runs fine in these cases. >> >> Roland. >> >>> >>> Thanks, >>> Vladimir >>> >>> On 1/5/15 8:38 AM, Roland Westrelin wrote: >>>> http://cr.openjdk.java.net/~roland/8063086/webrev.00/ >>>> >>>> With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don?t have the special case code and as a result an application can see different result for the same computation whether it?s executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1. >>>> >>>> Roland. >>>> >> > From roland.westrelin at oracle.com Thu Jan 8 16:59:59 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Thu, 8 Jan 2015 17:59:59 +0100 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: <54AEB74B.1010400@oracle.com> References: <54AAD783.40600@oracle.com> <6EDEE610-817A-4136-AD7F-D3307F5FAEF6@oracle.com> <54AEB74B.1010400@oracle.com> Message-ID: <5FD9FA4F-F65B-4413-910F-C748F4FAF590@oracle.com> Thanks for the re-review, Vladimir. Roland. > On Jan 8, 2015, at 5:58 PM, Vladimir Kozlov wrote: > > Okay. > > Thanks, > Vladimir > > On 1/8/15 1:26 AM, Roland Westrelin wrote: >> When trying to push this through jprt I had an assert failure due to an unreachable address. Here is a new webrev: >> >> http://cr.openjdk.java.net/~roland/8063086/webrev.01/ >> >> with the following change: >> >> --- a/src/cpu/x86/vm/macroAssembler_x86.cpp >> +++ b/src/cpu/x86/vm/macroAssembler_x86.cpp >> @@ -3189,7 +3189,9 @@ >> static double two = 2.0; >> ExternalAddress two_addr((address)&two); >> >> - fld_d(two_addr); // Stack: 2 X Y >> + // constant maybe too far on 64 bit >> + lea(tmp2, two_addr); >> + fld_d(Address(tmp2, 0)); // Stack: 2 X Y >> fcmp(tmp, 2, true, false); // Stack: X Y >> jcc(Assembler::parity, y_not_2); >> jcc(Assembler::notEqual, y_not_2); >> >> Roland. >> >>> On Jan 7, 2015, at 2:20 PM, Roland Westrelin wrote: >>> >>>> Looks good. 
Make sure to run new test with -Xcomp and -XX:-TieredCompilation and Client VM. >>> >>> Thanks for the review. I checked the test runs fine in these cases. >>> >>> Roland. >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 1/5/15 8:38 AM, Roland Westrelin wrote: >>>>> http://cr.openjdk.java.net/~roland/8063086/webrev.00/ >>>>> >>>>> With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don?t have the special case code and as a result an application can see different result for the same computation whether it?s executed by the interpreter/c1 or c2. Fixed by adding the special case to the interpreter and C1. >>>>> >>>>> Roland. >>>>> >>> >> From vladimir.kozlov at oracle.com Thu Jan 8 17:09:33 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 08 Jan 2015 09:09:33 -0800 Subject: [9] RFR(XXS): 8068661: Exclude compiler/whitebox/ForceNMethodSweepTest.java from nightly runs In-Reply-To: <54AE6E0A.3050006@oracle.com> References: <54AE6E0A.3050006@oracle.com> Message-ID: <54AEB9CD.8060208@oracle.com> Good. Thanks, Vladimir On 1/8/15 3:46 AM, Albert Noll wrote: > Hi, > > this fix excludes compiler/whitebox/ForceNMethodSweepTest.java from the nightly runs. > The test is unstable for the following reasons: > > A) The test is started with: -XX:CompileCommand=compileonly,SimpleTestCase$Helper::* > As a result, all methods of SimpleTestCase$Helper can be compiled. These include the following accessor methods: > SimpleTestCase$Helper.access$1400()I alive > SimpleTestCase$Helper.access$1300(LSimpleTestCase$Helper;)I alive > > Since background compilation is enabled, it is possible that methods of the class SimpleTestCase$Helper are > compiled just after (3) is executed. 
> > 1) int afterCompilation = getTotalUsage(); > 2) Asserts.assertGT(afterCompilation, usage, "compilation should increase usage"); > 3) guaranteedSweep(); > 4) int afterSweep = getTotalUsage(); > 5) Asserts.assertLTE(afterSweep, afterCompilation, "sweep shouldn't increase usage"); > > B) Another possible problem is that we have class loading in (2). Since adapters are created eagerly, there is a > potential allocation in the code cache for adapters. In the executions I observed, this was not a problem due to adapter > sharing (there already exists an adapter, since a class that contains a method with the same signature was loaded > before). However, there is no guarantee that adapter sharing will also make this test work in the future. > > Here is the webrev: > http://cr.openjdk.java.net/~anoll/8068661/webrev.00/ > > Thanks, > Albert From marc.b.reynolds at gmail.com Fri Jan 9 08:27:43 2015 From: marc.b.reynolds at gmail.com (Marc Reynolds) Date: Fri, 9 Jan 2015 09:27:43 +0100 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: References: Message-ID: <54af911f.f23ac20a.1967.38c6@mx.google.com> Why is this considered a bug? Doing so seems to be opening a can of worms. As an example any expression which lowers to contain FMA like instructions will yield different results once compiled. It seems more reasonable to declare it proper and expected behavior. -----Original Message----- From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Roland Westrelin Sent: Monday, January 05, 2015 5:38 PM To: hotspot compiler Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls http://cr.openjdk.java.net/~roland/8063086/webrev.00/ With 8029302, C2 computes x^2 as x*x. The interpreter and C1 code don?t have the special case code and as a result an application can see different result for the same computation whether it?s executed by the interpreter/c1 or c2. 
Fixed by adding the special case to the interpreter and C1. Roland. From roland.westrelin at oracle.com Fri Jan 9 09:04:22 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Fri, 9 Jan 2015 10:04:22 +0100 Subject: RFR(S): 8063086: Math.pow yields different results upon repeated calls In-Reply-To: <54af911f.f23ac20a.1967.38c6@mx.google.com> References: <54af911f.f23ac20a.1967.38c6@mx.google.com> Message-ID: > Why is this considered a bug? Doing so seems to be opening a can of worms. As an example any expression which lowers to contain FMA like instructions will yield different results once compiled. It seems more reasonable to declare it proper and expected behavior. That comment: https://bugs.openjdk.java.net/browse/JDK-8063086?focusedCommentId=13594090&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13594090 should answer the question. Roland. From zoltan.majo at oracle.com Fri Jan 9 10:10:15 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 09 Jan 2015 11:10:15 +0100 Subject: [9] RFR(S): CodeHeap::next_free should be renamed Message-ID: <54AFA907.3090709@oracle.com> Hi, please review the following small patch. Bug: https://bugs.openjdk.java.net/browse/JDK-8065894 Problem: The method 'CodeHeap::next_free' does not return the next free HeapBlock as the name and the comment suggest. Actually, it returns the following block that is _not_ free, i.e., the next used block. Solution: The name of the method and the comment about the method is changed accordingly. Webrev: http://cr.openjdk.java.net/~zmajo/8065894/webrev.00/ Testing: JPRT.
Thank you and best regards, Zoltan From tobias.hartmann at oracle.com Fri Jan 9 10:17:24 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 09 Jan 2015 11:17:24 +0100 Subject: [9] RFR(S): CodeHeap::next_free should be renamed In-Reply-To: <54AFA907.3090709@oracle.com> References: <54AFA907.3090709@oracle.com> Message-ID: <54AFAAB4.1090001@oracle.com> Hi Zoltan, that looks good (not a reviewer). I like the comment! Thanks, Tobias On 09.01.2015 11:10, Zolt?n Maj? wrote: > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8065894 > > Problem: The method 'CodeHeap::next_free' does not return the next free > HeapBlock as the name and the comment suggest. Actually, it returns the > following block that is _not_ free, i.e., the next used block. > > Solution: The name of the method and the comment about the method is changed > accordingly. > > Webrev: http://cr.openjdk.java.net/~zmajo/8065894/webrev.00/ > > Testing: JPRT. > > Thank you and best regards, > > > Zoltan > From martin.doerr at sap.com Fri Jan 9 14:39:18 2015 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 9 Jan 2015 14:39:18 +0000 Subject: RFR(M): 8068724: ppc64: update assembler: SPR access, CR logic, HTM Message-ID: <7C9B87B351A4BA4AA9EC95BB418116566ACFA348@DEWDFEMB19C.global.corp.sap> Hi, Here's an update of the ppc assembler: http://cr.openjdk.java.net/~mdoerr/8068724_ppc_asm/webrev.00/ It contains changes in the following areas (as described in https://bugs.openjdk.java.net/browse/JDK-8068724): Fix bug in encoding of special purpose registers. Provide more convenient version of condition register logic instructions. Enhance support for hardware transactional memory. Please review. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From albert.noll at oracle.com Fri Jan 9 14:49:17 2015 From: albert.noll at oracle.com (Albert Noll) Date: Fri, 09 Jan 2015 15:49:17 +0100 Subject: [9] RFR(XXS): 8068661: Exclude compiler/whitebox/ForceNMethodSweepTest.java from nightly runs In-Reply-To: <54AEB9CD.8060208@oracle.com> References: <54AE6E0A.3050006@oracle.com> <54AEB9CD.8060208@oracle.com> Message-ID: <54AFEA6D.5080000@oracle.com> David, Vladimir, thanks for the reviews. @David: The original bug is still open. Best, Albert On 01/08/2015 06:09 PM, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 1/8/15 3:46 AM, Albert Noll wrote: >> Hi, >> >> this fix excludes compiler/whitebox/ForceNMethodSweepTest.java from >> the nightly runs. >> The test is unstable for the following reasons: >> >> A) The test is started with: >> -XX:CompileCommand=compileonly,SimpleTestCase$Helper::* >> As a result, all methods of SimpleTestCase$Helper can be >> compiled. These include the following accessor methods: >> SimpleTestCase$Helper.access$1400()I alive >> SimpleTestCase$Helper.access$1300(LSimpleTestCase$Helper;)I alive >> >> Since background compilation is enabled, it is possible that >> methods of the class SimpleTestCase$Helper are >> compiled just after (3) is executed. >> >> 1) int afterCompilation = getTotalUsage(); >> 2) Asserts.assertGT(afterCompilation, usage, "compilation should >> increase usage"); >> 3) guaranteedSweep(); >> 4) int afterSweep = getTotalUsage(); >> 5) Asserts.assertLTE(afterSweep, afterCompilation, "sweep >> shouldn't increase usage"); >> >> B) Another possible problem is that we have class loading in (2). >> Since adapters are created eagerly, there is a >> potential allocation in the code cache for adapters. In the >> executions I observed, this was not a problem due to adapter >> sharing (there already exists an adapter, since a class that contains >> a method with the same signature was loaded >> before).
However, there is no guarantee that adapter sharing will >> also make this test work in the future. >> >> Here is the webrev: >> http://cr.openjdk.java.net/~anoll/8068661/webrev.00/ >> >> Thanks, >> Albert From vladimir.kozlov at oracle.com Fri Jan 9 19:27:52 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 09 Jan 2015 11:27:52 -0800 Subject: RFR(M): 8068724: ppc64: update assembler: SPR access, CR logic, HTM In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116566ACFA348@DEWDFEMB19C.global.corp.sap> References: <7C9B87B351A4BA4AA9EC95BB418116566ACFA348@DEWDFEMB19C.global.corp.sap> Message-ID: <54B02BB8.4010701@oracle.com> Looks good. Since it is ppc64 only changes they could be pushed by SAP. Thanks, Vladimir On 1/9/15 6:39 AM, Doerr, Martin wrote: > Hi, > > Here?s an update of the ppc assembler: > > http://cr.openjdk.java.net/~mdoerr/8068724_ppc_asm/webrev.00/ > > It contains changes in the following areas (as described in https://bugs.openjdk.java.net/browse/JDK-8068724): > > Fix bug in encoding of special purpose registers. > Provide more convenient version of condition register logic instructions. > Enhance support for hardware transactional memory. > > Please review. > > Best regards, > > Martin > From igor.veresov at oracle.com Fri Jan 9 21:52:21 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 9 Jan 2015 13:52:21 -0800 Subject: [9] RFR(S): CodeHeap::next_free should be renamed In-Reply-To: <54AFA907.3090709@oracle.com> References: <54AFA907.3090709@oracle.com> Message-ID: <86C33CB4-78A9-434F-AD3C-EEAE8407161F@oracle.com> Good. igor > On Jan 9, 2015, at 2:10 AM, Zolt?n Maj? wrote: > > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8065894 > > Problem: The method 'CodeHeap::next_free' does not return the next free HeapBlock as the name and the comment suggest. Actually, it returns the following block that is _not_ free, i.e., the next used block. 
> > Solution: The name of the method and the comment about the method is changed accordingly. > > Webrev: http://cr.openjdk.java.net/~zmajo/8065894/webrev.00/ > > Testing: JPRT. > > Thank you and best regards, > > > Zoltan > From dean.long at oracle.com Fri Jan 9 22:21:33 2015 From: dean.long at oracle.com (Dean Long) Date: Fri, 09 Jan 2015 14:21:33 -0800 Subject: [9] RFR(XXS): 8068746: Exclude hotspot/test/compiler/codecache/jmx/PoolsIndependenceTest.java from nightly runs Message-ID: <54B0546D.7060508@oracle.com> This is needed so that hs-comp pushes can get through JPRT. See https://bugs.openjdk.java.net/browse/JDK-8068385 for the test bug. http://cr.openjdk.java.net/~dlong/8068746/webrev/ thanks, dl From vladimir.kozlov at oracle.com Fri Jan 9 22:26:39 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 09 Jan 2015 14:26:39 -0800 Subject: [9] RFR(XXS): 8068746: Exclude hotspot/test/compiler/codecache/jmx/PoolsIndependenceTest.java from nightly runs In-Reply-To: <54B0546D.7060508@oracle.com> References: <54B0546D.7060508@oracle.com> Message-ID: <54B0559F.7020404@oracle.com> Good. Thanks, Vladimir On 1/9/15 2:21 PM, Dean Long wrote: > This is needed so that hs-comp pushes can get through JPRT. See > https://bugs.openjdk.java.net/browse/JDK-8068385 for the test bug. > > http://cr.openjdk.java.net/~dlong/8068746/webrev/ > > thanks, > > dl From dean.long at oracle.com Fri Jan 9 22:37:20 2015 From: dean.long at oracle.com (Dean Long) Date: Fri, 09 Jan 2015 14:37:20 -0800 Subject: [9] RFR(XXS): 8068746: Exclude hotspot/test/compiler/codecache/jmx/PoolsIndependenceTest.java from nightly runs In-Reply-To: <54B0559F.7020404@oracle.com> References: <54B0546D.7060508@oracle.com> <54B0559F.7020404@oracle.com> Message-ID: <54B05820.9070503@oracle.com> Thanks. Is one review enough for this small change? dl On 1/9/2015 2:26 PM, Vladimir Kozlov wrote: > Good. 
> > Thanks, > Vladimir > > On 1/9/15 2:21 PM, Dean Long wrote: >> This is needed so that hs-comp pushes can get through JPRT. See >> https://bugs.openjdk.java.net/browse/JDK-8068385 for the test bug. >> >> http://cr.openjdk.java.net/~dlong/8068746/webrev/ >> >> thanks, >> >> dl From vladimir.kozlov at oracle.com Fri Jan 9 22:42:21 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 09 Jan 2015 14:42:21 -0800 Subject: [9] RFR(XXS): 8068746: Exclude hotspot/test/compiler/codecache/jmx/PoolsIndependenceTest.java from nightly runs In-Reply-To: <54B05820.9070503@oracle.com> References: <54B0546D.7060508@oracle.com> <54B0559F.7020404@oracle.com> <54B05820.9070503@oracle.com> Message-ID: <54B0594D.1000901@oracle.com> Yes, for small changes and specifically for excluding tests. Vladimir On 1/9/15 2:37 PM, Dean Long wrote: > Thanks. Is one review enough for this small change? > > dl > > On 1/9/2015 2:26 PM, Vladimir Kozlov wrote: >> Good. >> >> Thanks, >> Vladimir >> >> On 1/9/15 2:21 PM, Dean Long wrote: >>> This is needed so that hs-comp pushes can get through JPRT. See >>> https://bugs.openjdk.java.net/browse/JDK-8068385 for the test bug. >>> >>> http://cr.openjdk.java.net/~dlong/8068746/webrev/ >>> >>> thanks, >>> >>> dl > From goetz.lindenmaier at sap.com Sat Jan 10 11:18:36 2015 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Sat, 10 Jan 2015 11:18:36 +0000 Subject: RFR(M): 8068724: ppc64: update assembler: SPR access, CR logic, HTM In-Reply-To: <54B02BB8.4010701@oracle.com> References: <7C9B87B351A4BA4AA9EC95BB418116566ACFA348@DEWDFEMB19C.global.corp.sap> <54B02BB8.4010701@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF6C9C8@DEWDFEMB12A.global.corp.sap> Hi Martin, the change looks good. Vladimir, yes, we will push it ourselves. There's a SAPJVM comment left we will remove. Best regards, Goetz. 
-----Original Message----- From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov Sent: Friday, January 09, 2015 8:28 PM To: Doerr, Martin; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(M): 8068724: ppc64: update assembler: SPR access, CR logic, HTM Looks good. Since it is ppc64 only changes they could be pushed by SAP. Thanks, Vladimir On 1/9/15 6:39 AM, Doerr, Martin wrote: > Hi, > > Here's an update of the ppc assembler: > > http://cr.openjdk.java.net/~mdoerr/8068724_ppc_asm/webrev.00/ > > It contains changes in the following areas (as described in https://bugs.openjdk.java.net/browse/JDK-8068724): > > Fix bug in encoding of special purpose registers. > Provide more convenient version of condition register logic instructions. > Enhance support for hardware transactional memory. > > Please review. > > Best regards, > > Martin > From marc.b.reynolds at gmail.com Sat Jan 10 15:44:35 2015 From: marc.b.reynolds at gmail.com (Marc Reynolds) Date: Sat, 10 Jan 2015 16:44:35 +0100 Subject: Logically extend strictfp to intrinsics (was RFR(S): 8063086: Math.pow yields different results upon repeated calls) In-Reply-To: References: <54af911f.f23ac20a.1967.38c6@mx.google.com> Message-ID: <54b14906.6360b40a.5451.ffffef1f@mx.google.com> >> Why is this considered a bug? Doing so seems to be opening a can of worms. >> As an example any expression which lowers to contain FMA like instructions will >> yield different results once compiled. It seems more reasonable to declare it >> proper and expected behavior. > That comment: > https://bugs.openjdk.java.net/browse/JDK-8063086?focusedCommentId=13594090&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13594090 > should answer the question. Mr. Darcy's comment doesn't address my attempted point, which is that the problem is too complex if you include fused operations as well as transforms on intrinsics.
I'll use these comments as a jumping-off point to attempt to clarify.

comment 1: "Whenever a numerical intrinsification is added, it should be added to the interpreter, C1, C2, (C3, C4, etc.) so that the numerical behavior is humane."

Going this route implies the following:

1) The interpreter and all compilers within a given VM version must be kept in lockstep.
2) For each transform, the modified interpreter must detect that the compiler(s) 'must' perform the transform in question, and only then compute the alternate form. If there are any errors in this, then the 'bug' still exists and simply occurs in different situations.
3) Likewise, the compiler can only perform the transforms in the cases where the interpreter successfully detected that it must. Those in turn will only be the most trivial of cases, thereby marginalizing the transform to near uselessness.

As an example from the original thread, take transforming pow(a,b) -> a*a if b==2. The interpreter cannot simply test whether 'b' is 2 at each invoke, since 'b' may have been computed; doing so would make the interpreter and the compiler disagree in that case. Simply checking whether a constant of '2' has been loaded makes the transform not worth implementing. At additional complexity (increased engineering cost and reduced performance of the interpreter) this can be somewhat improved upon by adding additional cases, but it cannot reach the point of what the compiler might be able to deduce, since that requires analysis, some of which requires that the interpreting phase has already occurred. The AOT front end is in a much better position to perform these types of limited transforms than the runtime.

comment 2: "However, it is clearly ugly misbehavior of the system when this kind of inconsistency occurs external from any sort of reasonable user control."

I claim that in the vast majority of cases this is desirable behavior. Lower latency and decreased error bounds are of much greater interest than bit-exact computations in floating-point.
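[Editorial illustration, not part of the original message.] The fused-operation concern raised in this thread can be made concrete with a small sketch. It is written in C++ rather than Java, and the helper names mul_then_add and fused_mul_add are hypothetical; nothing here is HotSpot code. It only shows that rounding a product before adding can give a different double than a single-rounding fused multiply-add on the same inputs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiply, round to double, then add and round again.
// The volatile store is there so the compiler cannot contract the
// expression into a single fused multiply-add instruction.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;  // first rounding happens here
    return product + c;               // second rounding happens here
}

// Hypothetical helper: a*b + c computed with one rounding at the very end,
// using the standard library's std::fma.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 as a double. The two-step version therefore returns 0.0, while the fused version returns -2^-54. Neither result is "wrong"; they simply differ, which is the kind of divergence between execution modes being debated here.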
In the case of fused operations, the user does have control over the situation: specify strictfp. It seems like a simple logical extension to extend this notion to include non-bit-exact transforms related to intrinsic functions. (Obviously requiring the use of StrictMath wouldn't be a reasonable solution.) This would mean that no intrinsic-related code is needed in the interpreter and all compilers are independent of one another. The compiler only needs to know to not perform any non-bit-exact transforms if the invoke is inside strictfp. From zoltan.majo at oracle.com Mon Jan 12 08:27:41 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Mon, 12 Jan 2015 09:27:41 +0100 Subject: [9] RFR(S): CodeHeap::next_free should be renamed In-Reply-To: <86C33CB4-78A9-434F-AD3C-EEAE8407161F@oracle.com> References: <54AFA907.3090709@oracle.com> <86C33CB4-78A9-434F-AD3C-EEAE8407161F@oracle.com> Message-ID: <54B3857D.7080804@oracle.com> Thank you, Tobias and Igor, for the reviews! This is a small change, so I assume we can push it with only one Reviewer's review. Best regards, Zoltan On 01/09/2015 10:52 PM, Igor Veresov wrote: > Good. > > igor > >> On Jan 9, 2015, at 2:10 AM, Zoltán Majó wrote: >> >> Hi, >> >> >> please review the following small patch. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8065894 >> >> Problem: The method 'CodeHeap::next_free' does not return the next free HeapBlock as the name and the comment suggest. Actually, it returns the following block that is _not_ free, i.e., the next used block. >> >> Solution: The name of the method and the comment about the method is changed accordingly. >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8065894/webrev.00/ >> >> Testing: JPRT.
>> >> Thank you and best regards, >> >> >> Zoltan >> From vladimir.kozlov at oracle.com Tue Jan 13 02:00:06 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 12 Jan 2015 18:00:06 -0800 Subject: RFR(XS) 8068864: C2 failed: modified node is not on IGVN._worklist Message-ID: <54B47C26.10409@oracle.com> http://cr.openjdk.java.net/~kvn/8068864/webrev/ Use igvn.replace_input_of() instead of set_req(). Same problem and the same place as in 8053915. Very complex code during javac compilation which cause a particular order of igvn._worklist processing. I can't write a test. Tested with JPRT. Thanks, Vladimir From igor.veresov at oracle.com Tue Jan 13 02:26:47 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 12 Jan 2015 18:26:47 -0800 Subject: RFR(XS) 8068864: C2 failed: modified node is not on IGVN._worklist In-Reply-To: <54B47C26.10409@oracle.com> References: <54B47C26.10409@oracle.com> Message-ID: <4675F869-4CF2-45B3-A456-F125D3368F3C@oracle.com> Looks good. igor > On Jan 12, 2015, at 6:00 PM, Vladimir Kozlov wrote: > > http://cr.openjdk.java.net/~kvn/8068864/webrev/ > > Use igvn.replace_input_of() instead of set_req(). Same problem and the same place as in 8053915. > > Very complex code during javac compilation which cause a particular order of igvn._worklist processing. I can't write a test. > > Tested with JPRT. > > Thanks, > Vladimir From vladimir.kozlov at oracle.com Tue Jan 13 02:52:39 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 12 Jan 2015 18:52:39 -0800 Subject: RFR(XS) 8068864: C2 failed: modified node is not on IGVN._worklist In-Reply-To: <4675F869-4CF2-45B3-A456-F125D3368F3C@oracle.com> References: <54B47C26.10409@oracle.com> <4675F869-4CF2-45B3-A456-F125D3368F3C@oracle.com> Message-ID: <54B48877.4070503@oracle.com> Thank you, Igor Vladimir On 1/12/15 6:26 PM, Igor Veresov wrote: > Looks good. 
> > igor > >> On Jan 12, 2015, at 6:00 PM, Vladimir Kozlov wrote: >> >> http://cr.openjdk.java.net/~kvn/8068864/webrev/ >> >> Use igvn.replace_input_of() instead of set_req(). Same problem and the same place as in 8053915. >> >> Very complex code during javac compilation which cause a particular order of igvn._worklist processing. I can't write a test. >> >> Tested with JPRT. >> >> Thanks, >> Vladimir > From vladimir.x.ivanov at oracle.com Tue Jan 13 08:55:48 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 13 Jan 2015 11:55:48 +0300 Subject: RFR(XS) 8068864: C2 failed: modified node is not on IGVN._worklist In-Reply-To: <54B47C26.10409@oracle.com> References: <54B47C26.10409@oracle.com> Message-ID: <54B4DD94.4000200@oracle.com> Looks good. Best regards, Vladimir Ivanov On 1/13/15 5:00 AM, Vladimir Kozlov wrote: > http://cr.openjdk.java.net/~kvn/8068864/webrev/ > > Use igvn.replace_input_of() instead of set_req(). Same problem and the > same place as in 8053915. > > Very complex code during javac compilation which cause a particular > order of igvn._worklist processing. I can't write a test. > > Tested with JPRT. > > Thanks, > Vladimir From zoltan.majo at oracle.com Tue Jan 13 11:41:11 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 13 Jan 2015 12:41:11 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method Message-ID: <54B50457.9080609@oracle.com> Hi, please review the following small patch. Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 Problem: There are some locations in the source code that search for users of a node of a particular type. Solution: To simplify the code, this patch adds a new method, Node::find_user(int opc) that can be used for searching. This patch also updates some comments in the source code. 
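[Editorial illustration, not part of the original message.] A minimal sketch of what a find_user-style lookup could look like. Only the name and signature Node* Node::find_user(int opc) come from the subject line of this request; the Node struct, the outs vector, the add_user helper and the three opcodes below are toy stand-ins assumed for illustration and are not HotSpot's actual classes or the code in the webrev.

```cpp
#include <cassert>
#include <vector>

// Toy opcode values (assumed for this sketch, NOT HotSpot's opcode table).
enum Opcode { Op_LoadP, Op_StoreP, Op_AddI };

// Toy node with a list of users (assumed layout, NOT HotSpot's Node class).
struct Node {
    int opcode;
    std::vector<Node*> outs;  // nodes that use this node as an input

    explicit Node(int opc) : opcode(opc) {}

    // Hypothetical helper to register a user of this node.
    void add_user(Node* user) { outs.push_back(user); }

    // Return the first user whose opcode matches opc, or nullptr if none.
    Node* find_user(int opc) const {
        for (Node* use : outs) {
            if (use->opcode == opc) {
                return use;
            }
        }
        return nullptr;
    }
};
```

The point of such a helper is that call sites which previously looped over a node's users by hand can be reduced to a single call and a null check.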
Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ Testing: JPRT Thank you and best regards, Zoltan From albert.noll at oracle.com Tue Jan 13 12:19:15 2015 From: albert.noll at oracle.com (Albert Noll) Date: Tue, 13 Jan 2015 13:19:15 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B50457.9080609@oracle.com> References: <54B50457.9080609@oracle.com> Message-ID: <54B50D43.3020609@oracle.com> Hi Zoltan, What do you think about the following interface to find users? If we do not need the return value, we add the following function: bool find_user(int opcode); If we look for more than 1 user like in this example: if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != NULL || n->find_user(Op_StoreN) != NULL || n->find_user(Op_LoadN)) { would it make sense to have the following function? bool find_user(int opc1, int opc2, int opc3, int opc4); Best, Albert On 01/13/2015 12:41 PM, Zoltán Majó wrote: > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 > > Problem: There are some locations in the source code that search for > users of a node of a particular type. > > Solution: To simplify the code, this patch adds a new method, > Node::find_user(int opc) that can be used for searching. This patch > also updates some comments in the source code. > > Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ > > Testing: JPRT > > Thank you and best regards, > > > Zoltan > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From zoltan.majo at oracle.com Tue Jan 13 12:32:29 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 13 Jan 2015 13:32:29 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <1413CED3-2DBD-4224-B942-A1F24D864FB8@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <1413CED3-2DBD-4224-B942-A1F24D864FB8@oracle.com> Message-ID: <54B5105D.6010207@oracle.com> Hi John, On 01/05/2015 09:00 PM, John Rose wrote: > On Jan 5, 2015, at 11:13 AM, Vladimir Kozlov > > wrote: >> >> On 1/5/15 10:38 AM, Zolt?n Maj? wrote: >>> ... >>> >>> Per-method compilation thresholds are available only in non-product >>> builds to avoid the overhead of accessing fields >>> added by the patch MethodData and MethodCounters. >> >> Too many ifdefs :) >> The interpreter speed is not important. And the feature could be >> interesting in product VM too. >> The only drawback is 2 additional fields in MDO which is fine. >> Can you make it product and run through our performance infrastructure. >> Also, as John Rose will say, we should have as much as possible a >> similar code in product as in tested debug code. Otherwise we are not >> testing product bits and will get into troubles. > > Vladimir is right; please consider it said by both of us. :-) thank you for the review! > Zoltan, I am glad you are tackling the profiling code, because we need > experts in it. Then I'm glad I've touched this part of the code. > > In the long run, the invocation counter code started complex and is > growing more complex. We need to rationalize the counters more. The > notification frequency idea is a good step in the right direction, > since it pulls logic out of the assembly code and into high-level > event handling code. Another good step would be to reduce the number > of distinct counters visible at the assembly level, thus simplifying > the assembly code. 
The complexity of InvocationCounter is a fossil > which deserves to be buried and paved over. Currently, the VM uses runtime notifications only with tiered compilation enabled. But runtime notification could be used in the non-tiered part of the code as well. > > Speaking of ifdefs, I am uncomfortable with the ifdef-TIERED branches > in the assembly code. Can we move towards merging those code branches? If the VM is changed to use notification frequencies also with tiered compilation disabled, I think it won't be difficult to merge those two code paths. > > And, speaking of footprint, I see no reason why we couldn't shrink the > Method layout to handle all counter bookkeeping with a single word, > instead of the current two. Also (while I'm on the subject) > low-count, non-looping methods do not need a full MethodCounter > struct, just a simple inline count with a low-tag bit, with CAS-based > state changes. This would have the effect of delaying MC and MD > allocation until a method has been used non-trivial amount, which > would reduce footprint if "non-trivial amount" turns out to be large. > > counters = union { > uintptr_t simple_count; // c = (invocation_count << 16 | > notify_mask << 8 | other_flags_we_might_like << 1 | 1) > uintptr_t method_counters; // c = > ((intptr_t)method_counters_addr | 0) > uintptr_t method_data; // c = ((intptr_t)method_data_addr | 0) > } > > (Assumes that method_data and method_counters can be distinguished > suitably by their contents.) That is also a good idea. In general, I agree that profiling infrastructure in the interpreter is too complex. I also think your ideas could simplify the profiling infrastructure quite a lot. So I opened a separate issue, 8068667 -- "simplify interpreter profiling infrastructure", that is dedicated to reducing the complexity of interpreter code: https://bugs.openjdk.java.net/browse/JDK-8068667 I hope I can take care of it soon. Thank you! Best regards, Zoltan > ? 
John From zoltan.majo at oracle.com Tue Jan 13 12:58:56 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Tue, 13 Jan 2015 13:58:56 +0100 Subject: [9] RFR(S): 8067374: Use %f instead of %g for LogCompilation output Message-ID: <54B51690.4080004@oracle.com> Hi, please review the following small patch. Bug: https://bugs.openjdk.java.net/browse/JDK-8067374 Problem: At some places, the format specifier '%g' is used. As a result, output can be harder to parse than with other formats. Solution: Replace '%g' with '%f'. Webrev: http://cr.openjdk.java.net/~zmajo/8067374/webrev.00/ Testing: Manual testing, JPRT Thank you and best regards, Zoltan From vladimir.kozlov at oracle.com Tue Jan 13 16:35:08 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 13 Jan 2015 08:35:08 -0800 Subject: RFR(XS) 8068864: C2 failed: modified node is not on IGVN._worklist In-Reply-To: <54B4DD94.4000200@oracle.com> References: <54B47C26.10409@oracle.com> <54B4DD94.4000200@oracle.com> Message-ID: <54B5493C.4020604@oracle.com> Thanks. Vladimir K On 1/13/15 12:55 AM, Vladimir Ivanov wrote: > Looks good. > > Best regards, > Vladimir Ivanov > > On 1/13/15 5:00 AM, Vladimir Kozlov wrote: >> http://cr.openjdk.java.net/~kvn/8068864/webrev/ >> >> Use igvn.replace_input_of() instead of set_req(). Same problem and the >> same place as in 8053915. >> >> Very complex code during javac compilation which cause a particular >> order of igvn._worklist processing. I can't write a test. >> >> Tested with JPRT. >> >> Thanks, >> Vladimir From zoltan.majo at oracle.com Tue Jan 13 16:48:20 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Tue, 13 Jan 2015 17:48:20 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B50D43.3020609@oracle.com> References: <54B50457.9080609@oracle.com> <54B50D43.3020609@oracle.com> Message-ID: <54B54C54.5030400@oracle.com> Hi Albert, thank you for the feedback!
On 01/13/2015 01:19 PM, Albert Noll wrote: > Hi Zoltan, > > What do you think about the following interface to find users? > > If we do not need the return value, we add the following function: > bool find_user(int opcode); that is a good idea. I added the method. > If we look for more than 1 user like in this example: > > if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != NULL > 2014 || n->find_user(Op_StoreN) != NULL || n->find_user(Op_LoadN)) { > > would it make sense to have the following function? > > bool find_user(int opc1, int opc2, int opc3, int opc4); I added that method as well. Here is the updated webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.01/ All JPRT tests pass. Thank you and best regards, Zoltan > Best, > Albert > > On 01/13/2015 12:41 PM, Zolt?n Maj? wrote: >> Hi, >> >> >> please review the following small patch. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >> >> Problem: There are some locations in the source code that search for >> users of a node of a particular type. >> >> Solution: To simplify the code, this patch adds a new method, >> Node::find_user(int opc) that can be used for searching. This patch >> also updates some comments in the source code. >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >> >> Testing: JPRT >> >> Thank you and best regards, >> >> >> Zoltan >> > From vladimir.kozlov at oracle.com Tue Jan 13 18:21:11 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 13 Jan 2015 10:21:11 -0800 Subject: [9] RFR(S): 8067374: Use %f instead of %g for LogCompilation output In-Reply-To: <54B51690.4080004@oracle.com> References: <54B51690.4080004@oracle.com> Message-ID: <54B56217.2010609@oracle.com> Looks good. Thanks, Vladimir On 1/13/15 4:58 AM, Zolt?n Maj? wrote: > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8067374 > > Problem: At some places, the format specifier '%g' is used. 
As a result, > output can be harder to parse than other formats. > > Solution: Replace '%g' with '%f'. > > Webrev: http://cr.openjdk.java.net/~zmajo/8067374/webrev.00/ > > Testing: Manual testing, JPRT > > Thank you and best regards, > > > Zoltan > From vladimir.kozlov at oracle.com Tue Jan 13 19:09:33 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 13 Jan 2015 11:09:33 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B5101C.5020609@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> Message-ID: <54B56D6D.3040707@oracle.com> Thank you, Zoltan, for performance testing! templateInterpreter_sparc.cpp - why did you switch from G3 to G1 on lines 325 and 344? templateTable_x86_64.cpp - indentation: __ movptr(rcx, Address(rcx, Method::method_counters_offset())); + const Address mask(rcx, in_bytes(MethodCounters::backedge_mask_offset())); Can you move the new code in method.cpp into the MethodCounters() constructor? I don't see why it should be in method.cpp. advancedThresholdPolicy.hpp - some renaming left. Thanks, Vladimir On 1/13/15 4:31 AM, Zoltán Majó wrote: > Hi Vladimir, > > > thank you for the feedback! Please see comments below. > > On 01/05/2015 08:13 PM, Vladimir Kozlov wrote: >> On 1/5/15 10:38 AM, Zoltán Majó wrote: >>> Hi, >>> >>> >>> please review the following patch. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8059606 >>> >>> Problem: Controlling compilation thresholds on a per-method level can >>> be useful for debugging and understanding >>> failures, but currently there is no way to control on a per-method >>> level when methods are compiled. >>> >>> >>> Solution: >>> >>> This patch adds support for scaling compilation thresholds on a >>> per-method level using the CompileThresholdScaling flag.
>>> For example, the option >>> >>> -XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5 >>> >>> >>> reduces compilation thresholds for method SomeClass.someMethod() by >>> 50% (but leaves global thresholds unaffected) and >>> results in earlier compilation of the method. >>> >>> Similar to the global CompileThresholdScaling flag (added in >>> JDK-805604), the per-method CompileThresholdScaling flag >>> works with both tiered and non-tiered modes of operation. >>> >>> Per-method compilation thresholds are available only in non-product >>> builds to avoid the overhead of accessing fields >>> added by the patch to MethodData and MethodCounters. >> >> Too many ifdefs :) > > I made per-method compilation thresholds available in product builds as > well. That helps reduce the number of ifdefs :-). > >> The interpreter speed is not important. And the feature could be >> interesting in product VM too. >> The only drawback is 2 additional fields in MDO which is fine. >> Can you make it product and run through our performance infrastructure? > > Performance data show that per-method compilation thresholds do not > result in a statistically significant change of performance. One > benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I > think that is negligible. > >> Also, as John Rose will say, we should have as much as possible a >> similar code in product as in tested debug code. Otherwise we are not >> testing product bits and will get into trouble. >> >>> >>> The proposed patch supports x86_64, x86_32, and sparc. Do you think >>> it is necessary to support other architectures as well? >> >> Yes. It should be supported on all platforms. > > The current patch supports all architectures except PPC64.
> >> >>> The patch updates the name of the flags Tier2BackEdgeThreshold, >>> Tier3BackEdgeThreshold, Tier4BackEdgeThreshold >>> (lowercase e in "Back*e*dge") so that the naming is consistent with >>> other backedge-related flags >>> (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and >>> Tier3BackedgeNotifyFreqLog). >> >> It added noise to main changes and may cause some testing (jfr?) >> failures. Can we do it separately (other RFE?). > > I created issue 8068506 for that. > > Here is the new webrev: > http://cr.openjdk.java.net/~zmajo/8059606/webrev.01/ > > Testing: manual testing, JPRT > >>> This patch is the third (and final) part of JDK-8050853: >>> https://bugs.openjdk.java.net/browse/JDK-8050853 . >>> >>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/ >> >> In general looks good. > > Thank you and best regards, > > > Zoltan > >> >> Thanks, >> Vladimir >> >>> >>> Testing: manual testing on all supported architectures, JPRT. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> > From vladimir.kozlov at oracle.com Tue Jan 13 19:46:12 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 13 Jan 2015 11:46:12 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B54C54.5030400@oracle.com> References: <54B50457.9080609@oracle.com> <54B50D43.3020609@oracle.com> <54B54C54.5030400@oracle.com> Message-ID: <54B57604.3020306@oracle.com> ifg.cpp: could be + return !def->has_user(Op_SCMemProj); Otherwise looks good. I thought we had more cases. Thanks, Vladimir On 1/13/15 8:48 AM, Zoltán Majó wrote: > Hi Albert, > > > thank you for the feedback! > > On 01/13/2015 01:19 PM, Albert Noll wrote: >> Hi Zoltan, >> >> What do you think about the following interface to find users? >> >> If we do not need the return value, we add the following function: >> bool find_user(int opcode); > > That is a good idea. I added the method.
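For readers without the webrev at hand, the interface discussed in the 8066312 thread (a lookup that returns the user, a boolean variant, and a variant that accepts several opcodes) could look roughly like the sketch below. This is an assumption-laden illustration, not the code from the webrev: the Node struct, the Opcode enum and has_any_user are simplified stand-ins invented here, and only the names find_user, has_user and the Op_* opcodes come from the thread.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for HotSpot's opcode constants.
enum Opcode { Op_LoadP, Op_StoreP, Op_LoadN, Op_StoreN, Op_SCMemProj };

// Simplified stand-in for a C2 Node; the real class is far richer.
struct Node {
  Opcode opcode;
  std::vector<Node*> outs;  // users ("out" edges) of this node

  explicit Node(Opcode opc) : opcode(opc) {}

  // Returns the first user with the given opcode, or nullptr if there is none.
  Node* find_user(Opcode opc) const {
    for (Node* use : outs) {
      assert(use != nullptr);
      if (use->opcode == opc) return use;
    }
    return nullptr;
  }

  // Boolean variant for callers that do not need the user itself.
  bool has_user(Opcode opc) const { return find_user(opc) != nullptr; }

  // True if any user has one of the listed opcodes (hypothetical name).
  bool has_any_user(std::initializer_list<Opcode> opcs) const {
    for (Opcode opc : opcs) {
      if (has_user(opc)) return true;
    }
    return false;
  }
};
```

Under these assumptions the four-way test quoted in the thread collapses to n->has_any_user({Op_StoreP, Op_LoadP, Op_StoreN, Op_LoadN}).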
> >> If we look for more than 1 user like in this example: >> >> if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != >> NULL >> || n->find_user(Op_StoreN) != NULL || >> n->find_user(Op_LoadN)) { >> >> would it make sense to have the following function? >> >> bool find_user(int opc1, int opc2, int opc3, int opc4); > > I added that method as well. > > Here is the updated webrev: > http://cr.openjdk.java.net/~zmajo/8066312/webrev.01/ > > All JPRT tests pass. > > Thank you and best regards, > > > Zoltan > >> Best, >> Albert >> >> On 01/13/2015 12:41 PM, Zoltán Majó wrote: >>> Hi, >>> >>> >>> please review the following small patch. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>> >>> Problem: There are some locations in the source code that search for >>> users of a node of a particular type. >>> >>> Solution: To simplify the code, this patch adds a new method, >>> Node::find_user(int opc) that can be used for searching. This patch >>> also updates some comments in the source code. >>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>> >>> Testing: JPRT >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >> > From john.r.rose at oracle.com Wed Jan 14 00:12:50 2015 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Jan 2015 16:12:50 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B50457.9080609@oracle.com> References: <54B50457.9080609@oracle.com> Message-ID: <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> Good. One suggestion: Call it "find_out", not "find_user". The term "out" is more in use for Node than "user"; cf. Node::unique_out, raw_out. -- John On Jan 13, 2015, at 3:41 AM, Zoltán Majó wrote: > > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 > > Problem: There are some locations in the source code that search for users of a node of a particular type.
> > Solution: To simplify the code, this patch adds a new method, Node::find_user(int opc) that can be used for searching. This patch also updates some comments in the source code. > > Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ > > Testing: JPRT > > Thank you and best regards, > > > Zoltan > From john.r.rose at oracle.com Wed Jan 14 00:14:21 2015 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Jan 2015 16:14:21 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B5105D.6010207@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <1413CED3-2DBD-4224-B942-A1F24D864FB8@oracle.com> <54B5105D.6010207@oracle.com> Message-ID: On Jan 13, 2015, at 4:32 AM, Zoltán Majó wrote: > >> >> (Assumes that method_data and method_counters can be distinguished suitably by their contents.) > > That is also a good idea. > > In general, I agree that profiling infrastructure in the interpreter is too complex. I also think your ideas could simplify the profiling infrastructure quite a lot. So I opened a separate issue, 8068667 -- "simplify interpreter profiling infrastructure", that is dedicated to reducing the complexity of interpreter code: > > https://bugs.openjdk.java.net/browse/JDK-8068667 > > I hope I can take care of it soon. That makes me happy! -- John -------------- next part -------------- An HTML attachment was scrubbed...
URL: From vladimir.kozlov at oracle.com Wed Jan 14 02:06:53 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 13 Jan 2015 18:06:53 -0800 Subject: A hotspot patch for stack profiling (frame pointer) In-Reply-To: <5486CB9E.2090505@oracle.com> References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> Message-ID: <54B5CF3D.4070307@oracle.com> Filed RFE: https://bugs.openjdk.java.net/browse/JDK-8068945 Regards, Vladimir On 12/9/14 2:14 AM, Erik Helin wrote: > I should also add that I don't have enough knowledge of the compiler > internals to review this patch, sorry. > > Thanks, > Erik > > On 2014-12-09 10:53, Erik Helin wrote: >> I applied the patch on top of jdk9/hs-comp and created a webrev: >> http://cr.openjdk.java.net/~ehelin/brendan/frame-pointer/webrev/ >> >> I also successfully run the patch through JPRT. >> >> Thanks, >> Erik >> >> On 2014-12-05 20:57, Brendan Gregg wrote: >>> >>> >>> On Thu, Dec 4, 2014 at 2:55 PM, Brendan Gregg >> > wrote: >>> >>> G'Day, >>> >>> I've hacked hotspot to return the frame pointer, in part to see what >>> this involves, and also to have a working prototype for analysis. >>> Along with an agent to resolve symbols, this has allowed full stack >>> profiling using Linux perf_events. The following flame graphs show >>> the resulting profiles. >>> >>> A mixed mode CPU flame graph of a vert.x benchmark (click to zoom): >>> >>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg >>> >>> Same thing, but this time disabling inlining, to show more frames: >>> >>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-flamegraph.svg >>> >>> As expected, performance is worse without inlining. You can compare >>> the flame graphs side by side to see why. Less time spent doing work >>> / I/O! >>> >>> >>> https://github.com/brendangregg/Misc/blob/master/java/openjdk8_b132-fp.diff >>> >>> >>> is my patch, >>> >>> [...] 
>>> >>> >>> In case there's problems with the patch URL, the patch is: >>> >>> --- openjdk8clean/hotspot/src/cpu/x86/vm/x86_64.ad >>> 2014-03-04 02:52:11.000000000 +0000 >>> +++ openjdk8/hotspot/src/cpu/x86/vm/x86_64.ad >>> 2014-11-08 01:10:49.686044933 +0000 >>> @@ -166,10 +166,9 @@ >>> // 3) reg_class stack_slots( /* one chunk of stack-based "registers" >>> */ ) >>> // >>> >>> -// Class for all pointer registers (including RSP) >>> +// Class for all pointer registers (including RSP, excluding RBP) >>> reg_class any_reg(RAX, RAX_H, >>> RDX, RDX_H, >>> - RBP, RBP_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -184,10 +183,9 @@ >>> R14, R14_H, >>> R15, R15_H); >>> >>> -// Class for all pointer registers except RSP >>> +// Class for all pointer registers except RSP and RBP >>> reg_class ptr_reg(RAX, RAX_H, >>> RDX, RDX_H, >>> - RBP, RBP_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -199,9 +197,8 @@ >>> R13, R13_H, >>> R14, R14_H); >>> >>> -// Class for all pointer registers except RAX and RSP >>> +// Class for all pointer registers except RAX, RSP and RBP >>> reg_class ptr_no_rax_reg(RDX, RDX_H, >>> - RBP, RBP_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -226,9 +223,8 @@ >>> R13, R13_H, >>> R14, R14_H); >>> >>> -// Class for all pointer registers except RAX, RBX and RSP >>> +// Class for all pointer registers except RAX, RBX, RSP and RBP >>> reg_class ptr_no_rax_rbx_reg(RDX, RDX_H, >>> - RBP, RBP_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -260,10 +256,9 @@ >>> // Singleton class for TLS pointer >>> reg_class ptr_r15_reg(R15, R15_H); >>> >>> -// Class for all long registers (except RSP) >>> +// Class for all long registers (except RSP and RBP) >>> reg_class long_reg(RAX, RAX_H, >>> RDX, RDX_H, >>> - RBP, RBP_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -275,9 +270,8 @@ >>> R13, R13_H, >>> R14, R14_H); >>> >>> -// Class for all long registers except RAX, RDX (and RSP) >>> -reg_class 
long_no_rax_rdx_reg(RBP, RBP_H, >>> - RDI, RDI_H, >>> +// Class for all long registers except RAX, RDX (and RSP, RBP) >>> +reg_class long_no_rax_rdx_reg(RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> RBX, RBX_H, >>> @@ -288,9 +282,8 @@ >>> R13, R13_H, >>> R14, R14_H); >>> >>> -// Class for all long registers except RCX (and RSP) >>> -reg_class long_no_rcx_reg(RBP, RBP_H, >>> - RDI, RDI_H, >>> +// Class for all long registers except RCX (and RSP, RBP) >>> +reg_class long_no_rcx_reg(RDI, RDI_H, >>> RSI, RSI_H, >>> RAX, RAX_H, >>> RDX, RDX_H, >>> @@ -302,9 +295,8 @@ >>> R13, R13_H, >>> R14, R14_H); >>> >>> -// Class for all long registers except RAX (and RSP) >>> -reg_class long_no_rax_reg(RBP, RBP_H, >>> - RDX, RDX_H, >>> +// Class for all long registers except RAX (and RSP, RBP) >>> +reg_class long_no_rax_reg(RDX, RDX_H, >>> RDI, RDI_H, >>> RSI, RSI_H, >>> RCX, RCX_H, >>> @@ -325,10 +317,9 @@ >>> // Singleton class for RDX long register >>> reg_class long_rdx_reg(RDX, RDX_H); >>> >>> -// Class for all int registers (except RSP) >>> +// Class for all int registers (except RSP and RBP) >>> reg_class int_reg(RAX, >>> RDX, >>> - RBP, >>> RDI, >>> RSI, >>> RCX, >>> @@ -340,10 +331,9 @@ >>> R13, >>> R14); >>> >>> -// Class for all int registers except RCX (and RSP) >>> +// Class for all int registers except RCX (and RSP, RBP) >>> reg_class int_no_rcx_reg(RAX, >>> RDX, >>> - RBP, >>> RDI, >>> RSI, >>> RBX, >>> @@ -355,8 +345,7 @@ >>> R14); >>> >>> // Class for all int registers except RAX, RDX (and RSP) >>> -reg_class int_no_rax_rdx_reg(RBP, >>> - RDI, >>> +reg_class int_no_rax_rdx_reg(RDI, >>> RSI, >>> RCX, >>> RBX, >>> @@ -718,6 +707,7 @@ >>> st->print("# stack bang"); >>> st->print("\n\t"); >>> st->print("pushq rbp\t# Save rbp"); >>> + // BDG consider: st->print("movq rbp, rsp\t# "); >>> if (framesize) { >>> st->print("\n\t"); >>> st->print("subq rsp, #%d\t# Create frame",framesize); >>> --- openjdk8clean/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp >>> 2014-03-04 
02:52:11.000000000 +0000 >>> +++ openjdk8/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp 2014-11-07 >>> 23:57:11.589593723 +0000 >>> @@ -5236,6 +5236,7 @@ >>> // We always push rbp, so that on return to interpreter rbp, >>> will be >>> // restored correctly and we can correct the stack. >>> push(rbp); >>> + mov(rbp, rsp); >>> // Remove word for ebp >>> framesize -= wordSize; >>> >>> --- openjdk8clean/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp >>> 2014-03-04 02:52:10.000000000 +0000 >>> +++ openjdk8/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp >>> 2014-11-07 23:57:21.933257882 +0000 >>> @@ -358,6 +358,7 @@ >>> generate_stack_overflow_check(frame_size_in_bytes); >>> >>> push(rbp); >>> + mov(rbp, rsp); >>> #ifdef TIERED >>> // c2 leaves fpu stack dirty. Clean it on entry >>> if (UseSSE < 2 ) { >>> >>> >>> Brendan > From roland.westrelin at oracle.com Wed Jan 14 09:34:35 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Wed, 14 Jan 2015 10:34:35 +0100 Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths Message-ID: http://cr.openjdk.java.net/~roland/6912521/webrev.00/ Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores. Roland. From axel.siebenborn at sap.com Wed Jan 14 10:53:10 2015 From: axel.siebenborn at sap.com (Siebenborn, Axel) Date: Wed, 14 Jan 2015 10:53:10 +0000 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat Message-ID: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> Hi, I investigated a crash with jdk8_u25 and opened the following bug: https://bugs.openjdk.java.net/browse/JDK-8068909 I would suggest the following fix: http://www.sapjvm.com/as/webrevs/8068909/ If the control of the inserted load, to NULL. 
In this case, its corresponding nullcheck will be found as required edge to the CastPP during MemNode::Ideal_common_DU_postCCP. Thanks, Axel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bertrand.delsart at oracle.com Wed Jan 14 14:42:30 2015 From: bertrand.delsart at oracle.com (Bertrand Delsart) Date: Wed, 14 Jan 2015 15:42:30 +0100 Subject: A hotspot patch for stack profiling (frame pointer) In-Reply-To: <54B5CF3D.4070307@oracle.com> References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> <54B5CF3D.4070307@oracle.com> Message-ID: <54B68056.3090706@oracle.com> Hi, I'm surprised not to see any change related to JSR292 in the webrev. While the fix looks safe (not breaking hotspot), it probably does not work as you expect... and this might make it useless. RBP is used by the JITs to memorize SP before it may be adapted during a call sequence (look for instance for method_handle_invoke_SP_save_opr or rbp_mh_SP_save in the code). Hence the value we push on the stack may not be the RBP of the caller. It can be its SP just before the call... which would probably confuse your profiler, possibly crashing it if you really assume this is the beginning of a frame. IMHO, the RFE does not break hotspot and profiling seems to work... but if you try to profile code which uses invoke dynamic, you will hit bugs in the profiler. I would not prevent the JITs from using RBP as long as the changeset is not sufficient to guarantee the profiling will work... and IMHO solving the JSR292 issue will be much more intrusive (impacting HotSpot stack walking code). Regards, Bertrand. On 14/01/2015 03:06, Vladimir Kozlov wrote: > Filed RFE: > > https://bugs.openjdk.java.net/browse/JDK-8068945 > > Regards, > Vladimir > > On 12/9/14 2:14 AM, Erik Helin wrote: >> I should also add that I don't have enough knowledge of the compiler >> internals to review this patch, sorry. 
>> >> Thanks, >> Erik >> >> On 2014-12-09 10:53, Erik Helin wrote: >>> I applied the patch on top of jdk9/hs-comp and created a webrev: >>> http://cr.openjdk.java.net/~ehelin/brendan/frame-pointer/webrev/ >>> >>> I also successfully run the patch through JPRT. >>> >>> Thanks, >>> Erik >>> >>> On 2014-12-05 20:57, Brendan Gregg wrote: >>>> >>>> >>>> On Thu, Dec 4, 2014 at 2:55 PM, Brendan Gregg >>>> >>> > wrote: >>>> >>>> G'Day, >>>> >>>> I've hacked hotspot to return the frame pointer, in part to see >>>> what >>>> this involves, and also to have a working prototype for analysis. >>>> Along with an agent to resolve symbols, this has allowed full stack >>>> profiling using Linux perf_events. The following flame graphs show >>>> the resulting profiles. >>>> >>>> A mixed mode CPU flame graph of a vert.x benchmark (click to zoom): >>>> >>>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg >>>> >>>> Same thing, but this time disabling inlining, to show more frames: >>>> >>>> >>>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-flamegraph.svg >>>> >>>> As expected, performance is worse without inlining. You can compare >>>> the flame graphs side by side to see why. Less time spent doing >>>> work >>>> / I/O! >>>> >>>> >>>> https://github.com/brendangregg/Misc/blob/master/java/openjdk8_b132-fp.diff >>>> >>>> >>>> >>>> is my patch, >>>> >>>> [...] 
>>>> >>>> >>>> In case there's problems with the patch URL, the patch is: >>>> >>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/x86_64.ad >>>> 2014-03-04 02:52:11.000000000 +0000 >>>> +++ openjdk8/hotspot/src/cpu/x86/vm/x86_64.ad >>>> 2014-11-08 01:10:49.686044933 +0000 >>>> @@ -166,10 +166,9 @@ >>>> // 3) reg_class stack_slots( /* one chunk of stack-based "registers" >>>> */ ) >>>> // >>>> >>>> -// Class for all pointer registers (including RSP) >>>> +// Class for all pointer registers (including RSP, excluding RBP) >>>> reg_class any_reg(RAX, RAX_H, >>>> RDX, RDX_H, >>>> - RBP, RBP_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -184,10 +183,9 @@ >>>> R14, R14_H, >>>> R15, R15_H); >>>> >>>> -// Class for all pointer registers except RSP >>>> +// Class for all pointer registers except RSP and RBP >>>> reg_class ptr_reg(RAX, RAX_H, >>>> RDX, RDX_H, >>>> - RBP, RBP_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -199,9 +197,8 @@ >>>> R13, R13_H, >>>> R14, R14_H); >>>> >>>> -// Class for all pointer registers except RAX and RSP >>>> +// Class for all pointer registers except RAX, RSP and RBP >>>> reg_class ptr_no_rax_reg(RDX, RDX_H, >>>> - RBP, RBP_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -226,9 +223,8 @@ >>>> R13, R13_H, >>>> R14, R14_H); >>>> >>>> -// Class for all pointer registers except RAX, RBX and RSP >>>> +// Class for all pointer registers except RAX, RBX, RSP and RBP >>>> reg_class ptr_no_rax_rbx_reg(RDX, RDX_H, >>>> - RBP, RBP_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -260,10 +256,9 @@ >>>> // Singleton class for TLS pointer >>>> reg_class ptr_r15_reg(R15, R15_H); >>>> >>>> -// Class for all long registers (except RSP) >>>> +// Class for all long registers (except RSP and RBP) >>>> reg_class long_reg(RAX, RAX_H, >>>> RDX, RDX_H, >>>> - RBP, RBP_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -275,9 +270,8 @@ >>>> R13, R13_H, >>>> R14, R14_H); >>>> >>>> -// Class for all long 
registers except RAX, RDX (and RSP) >>>> -reg_class long_no_rax_rdx_reg(RBP, RBP_H, >>>> - RDI, RDI_H, >>>> +// Class for all long registers except RAX, RDX (and RSP, RBP) >>>> +reg_class long_no_rax_rdx_reg(RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> RBX, RBX_H, >>>> @@ -288,9 +282,8 @@ >>>> R13, R13_H, >>>> R14, R14_H); >>>> >>>> -// Class for all long registers except RCX (and RSP) >>>> -reg_class long_no_rcx_reg(RBP, RBP_H, >>>> - RDI, RDI_H, >>>> +// Class for all long registers except RCX (and RSP, RBP) >>>> +reg_class long_no_rcx_reg(RDI, RDI_H, >>>> RSI, RSI_H, >>>> RAX, RAX_H, >>>> RDX, RDX_H, >>>> @@ -302,9 +295,8 @@ >>>> R13, R13_H, >>>> R14, R14_H); >>>> >>>> -// Class for all long registers except RAX (and RSP) >>>> -reg_class long_no_rax_reg(RBP, RBP_H, >>>> - RDX, RDX_H, >>>> +// Class for all long registers except RAX (and RSP, RBP) >>>> +reg_class long_no_rax_reg(RDX, RDX_H, >>>> RDI, RDI_H, >>>> RSI, RSI_H, >>>> RCX, RCX_H, >>>> @@ -325,10 +317,9 @@ >>>> // Singleton class for RDX long register >>>> reg_class long_rdx_reg(RDX, RDX_H); >>>> >>>> -// Class for all int registers (except RSP) >>>> +// Class for all int registers (except RSP and RBP) >>>> reg_class int_reg(RAX, >>>> RDX, >>>> - RBP, >>>> RDI, >>>> RSI, >>>> RCX, >>>> @@ -340,10 +331,9 @@ >>>> R13, >>>> R14); >>>> >>>> -// Class for all int registers except RCX (and RSP) >>>> +// Class for all int registers except RCX (and RSP, RBP) >>>> reg_class int_no_rcx_reg(RAX, >>>> RDX, >>>> - RBP, >>>> RDI, >>>> RSI, >>>> RBX, >>>> @@ -355,8 +345,7 @@ >>>> R14); >>>> >>>> // Class for all int registers except RAX, RDX (and RSP) >>>> -reg_class int_no_rax_rdx_reg(RBP, >>>> - RDI, >>>> +reg_class int_no_rax_rdx_reg(RDI, >>>> RSI, >>>> RCX, >>>> RBX, >>>> @@ -718,6 +707,7 @@ >>>> st->print("# stack bang"); >>>> st->print("\n\t"); >>>> st->print("pushq rbp\t# Save rbp"); >>>> + // BDG consider: st->print("movq rbp, rsp\t# "); >>>> if (framesize) { >>>> st->print("\n\t"); >>>> st->print("subq 
rsp, #%d\t# Create frame",framesize); >>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp >>>> 2014-03-04 02:52:11.000000000 +0000 >>>> +++ openjdk8/hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp >>>> 2014-11-07 >>>> 23:57:11.589593723 +0000 >>>> @@ -5236,6 +5236,7 @@ >>>> // We always push rbp, so that on return to interpreter rbp, >>>> will be >>>> // restored correctly and we can correct the stack. >>>> push(rbp); >>>> + mov(rbp, rsp); >>>> // Remove word for ebp >>>> framesize -= wordSize; >>>> >>>> --- openjdk8clean/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp >>>> 2014-03-04 02:52:10.000000000 +0000 >>>> +++ openjdk8/hotspot/src/cpu/x86/vm/c1_MacroAssembler_x86.cpp >>>> 2014-11-07 23:57:21.933257882 +0000 >>>> @@ -358,6 +358,7 @@ >>>> generate_stack_overflow_check(frame_size_in_bytes); >>>> >>>> push(rbp); >>>> + mov(rbp, rsp); >>>> #ifdef TIERED >>>> // c2 leaves fpu stack dirty. Clean it on entry >>>> if (UseSSE < 2 ) { >>>> >>>> >>>> Brendan >> -- Bertrand Delsart, Grenoble Engineering Center Oracle, 180 av. de l'Europe, ZIRST de Montbonnot 38330 Montbonnot Saint Martin, FRANCE bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From vladimir.kozlov at oracle.com Wed Jan 14 18:46:34 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 14 Jan 2015 10:46:34 -0800 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> References: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> Message-ID: <54B6B98A.5070302@oracle.com> Hi Axel, Thank you for looking into this issue. Before we start reviewing it, please publish the webrev on cr.openjdk.java.net. It is a requirement for accepting changes. Your colleagues can help you. Regards, Vladimir On 1/14/15 2:53 AM, Siebenborn, Axel wrote: > Hi, > > I investigated a crash with jdk8_u25 and opened the following bug: > > https://bugs.openjdk.java.net/browse/JDK-8068909 > > I would suggest the following fix: > > http://www.sapjvm.com/as/webrevs/8068909/ > > If the control of the inserted load, to NULL. In this case, its corresponding nullcheck will be found as required edge > to the CastPP during MemNode::Ideal_common_DU_postCCP. > > Thanks, > > Axel > From john.r.rose at oracle.com Wed Jan 14 19:12:30 2015 From: john.r.rose at oracle.com (John Rose) Date: Wed, 14 Jan 2015 11:12:30 -0800 Subject: A hotspot patch for stack profiling (frame pointer) In-Reply-To: <54B68056.3090706@oracle.com> References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> <54B5CF3D.4070307@oracle.com> <54B68056.3090706@oracle.com> Message-ID: On Jan 14, 2015, at 6:42 AM, Bertrand Delsart wrote: > > I would not prevent the JITs from using RBP as long as the changeset is not sufficient to guarantee the profiling will work... and IMHO solving the JSR292 issue will be much more intrusive (impacting HotSpot stack walking code). Here are some thoughts on that. SPARC uses L7 (L7_mh_SP_save) for the same purpose of method handle support as x86 uses RBP (rbp_mh_SP_save).
So there's not a hard requirement for x86 to take over RBP. (Deep background: This purpose, in method handle support, is to allow an adapter to make changes to the caller's SP. The adapter is the initial callee from the caller, but may change argument shape, and tail-calls the ultimate callee. Because it is a tail-call, the original caller must have a spot where his original SP can be preserved. The preservation works because the original caller knows he is calling a MH.invoke method, which requires the extra argument preservation. The repertoire of argument shape changes is quite small, actually; it is not a very general mechanism since the LF machinery was put in. Perhaps the whole thing could be removed somehow, by finding alternative techniques for the few remaining changes. OTOH, this SP-restoring mechanism may be helpful in doing a more general tail-call mechanism, and perhaps in managing int/comp mode changes more cleanly, so I'd like us to keep it. And document it better.) Any register or stack slot will do for this purpose, as long as (i) its value can be recovered after the MH.invoke call returns to the caller, and (ii) its value can be dug up somehow during stack walking. There are only a couple of places where stack walking code needs to sample the value, so they should be adjustable. Both x86 and SPARC use registers which are callee-save (or "non-volatile across calls") which satisfy properties (i) and (ii). A standard stack slot (addressed based on caller's RBP) would probably also satisfy those properties. A variably-positioned stack slot would also work, which would require registering the position in each CodeBlob. That's unpleasant extra detail, but it would align somewhat with the current logic which allows each CodeBlob (nmethod, actually) to advertise which call sites need the special processing (see the function is_method_handle_return(caller_pc)).
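To make properties (i) and (ii) concrete for the "standard stack slot" option, here is a toy model in C++. It is emphatically not HotSpot code: the stack is just a vector of words that grows upward, and every name in it (ToyFrame, ToyStack, kSavedSpSlot and the four functions) is hypothetical. It only shows that a word at a fixed offset from the frame base lets the caller restore its SP after the call, and lets a stack walker find the original SP while the callee is still running.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy frame descriptor: fp stands in for the frame's RBP.
struct ToyFrame {
  std::size_t fp;
  bool makes_mh_invoke_calls;  // stands in for is_method_handle_return()
};

// Fixed position of the reserved word, relative to the frame base.
const std::size_t kSavedSpSlot = 1;

// Toy stack: a vector of words and a stack pointer index.
struct ToyStack {
  std::vector<std::size_t> words;
  std::size_t sp;
  ToyStack() : words(64, 0), sp(0) {}
};

// Caller side, before the MH.invoke call: remember SP in the reserved word.
void save_sp_before_mh_invoke(ToyStack& s, const ToyFrame& caller) {
  assert(caller.makes_mh_invoke_calls);
  assert(caller.fp + kSavedSpSlot < s.words.size());
  s.words[caller.fp + kSavedSpSlot] = s.sp;
}

// Adapter: reshapes arguments, which moves SP.
void adapter_adjust_sp(ToyStack& s, std::size_t extra_words) {
  s.sp += extra_words;
}

// Property (i): the caller recovers its SP after the call returns.
void restore_sp_after_mh_invoke(ToyStack& s, const ToyFrame& caller) {
  s.sp = s.words[caller.fp + kSavedSpSlot];
}

// Property (ii): a stack walker can dig up the original SP from the frame.
std::size_t walker_original_sp(const ToyStack& s, const ToyFrame& caller) {
  return s.words[caller.fp + kSavedSpSlot];
}
```

Because the slot sits at a fixed offset, the walker needs nothing beyond the frame base; a variably-positioned slot would need the per-CodeBlob registration mentioned above.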
I recommend reserving a dead word of space in every stack frame that makes MH.invoke calls, at a fixed position relative to that frame's RBP. -- John From vladimir.kozlov at oracle.com Wed Jan 14 20:22:40 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 14 Jan 2015 12:22:40 -0800 Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths In-Reply-To: References: Message-ID: <54B6D010.20706@oracle.com> The logic which chooses the direction of copying in ArrayCopyNode::Ideal() is strange. I would like to see more explicit checks there. Something like: if (is_array_copy_overlap()) { array_copy_backward() } else { array_copy_forward() } Can you move the ArrayCopy class code from callnode.?pp to new arraycopynode.?pp files? The code has become too large. Can you add a comment in library_call.cpp explaining the new validation/casting logic? Why do you do that? Thanks, Vladimir On 1/14/15 1:34 AM, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/6912521/webrev.00/ > > Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores. > > Roland. > From roland.westrelin at oracle.com Wed Jan 14 21:32:25 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Wed, 14 Jan 2015 22:32:25 +0100 Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths In-Reply-To: <54B6D010.20706@oracle.com> References: <54B6D010.20706@oracle.com> Message-ID: <7A0EDCD9-3EC0-4CB1-A427-6764F4AF907D@oracle.com> Hi Vladimir. Thanks for taking a look at this. > The logic which chooses the direction of copying in ArrayCopyNode::Ideal() is strange. I would like to see more explicit checks there.
Something like: > > if (is_array_copy_overlap()) { > array_copy_backward() > } else { > array_copy_forward() > } I'm not following you. In the general case, the test needs to be done at runtime. Your code above seems to imply that we would always decide at compile time? > Can you move the ArrayCopy class code from callnode.?pp to new arraycopynode.?pp files? The code has become too large. Ok. > Can you add a comment in library_call.cpp explaining the new validation/casting logic? Why do you do that? I will add comments. To be legal, the transformation of the ArrayCopyNode to loads/stores can only happen if we're sure the Arrays.copyOf would succeed. So we need all input arguments to the copyOf to be validated, including that the copy to the new array won't trigger an ArrayStoreException. That's why there's a subtype check. That subtype check can be optimized if we know something about the type of the input array from type speculation. Roland. > > Thanks, > Vladimir > > On 1/14/15 1:34 AM, Roland Westrelin wrote: >> http://cr.openjdk.java.net/~roland/6912521/webrev.00/ >> >> Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores. >> >> Roland. >> From vladimir.kozlov at oracle.com Wed Jan 14 23:29:25 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 14 Jan 2015 15:29:25 -0800 Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths In-Reply-To: <7A0EDCD9-3EC0-4CB1-A427-6764F4AF907D@oracle.com> References: <54B6D010.20706@oracle.com> <7A0EDCD9-3EC0-4CB1-A427-6764F4AF907D@oracle.com> Message-ID: <54B6FBD5.2070009@oracle.com> On 1/14/15 1:32 PM, Roland Westrelin wrote: > Hi Vladimir. Thanks for taking a look at this. > >> The logic which chooses the direction of copying in ArrayCopyNode::Ideal() is strange. I would like to see more explicit checks there.
Something like:
>>
>> if (is_array_copy_overlap()) {
>>   array_copy_backward()
>> } else {
>>   array_copy_forward()
>> }
>
> I'm not following you. In the general case, the test needs to be done at runtime. Your code above seems to imply that we would always decide at compile time?

My bad, you are right.

>
>> Can you move the ArrayCopy class code from callnode.?pp to new arraycopynode.?pp files? The code has become too large.
>
> Ok.
>
>> Can you add a comment in library_call.cpp explaining the new validation/casting logic? Why do you do that?
>
> I will add comments.
> To be legal, the transformation of the ArrayCopyNode to loads/stores can only happen if we're sure the Arrays.copyOf would succeed. So we need all input arguments to the copyOf to be validated, including that the copy to the new array won't trigger an ArrayStoreException. That's why there's a subtype check. That subtype check can be optimized if we know something about the type of the input array from type speculation.

Okay.

Thanks,
Vladimir

>
> Roland.
>
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/14/15 1:34 AM, Roland Westrelin wrote:
>>> http://cr.openjdk.java.net/~roland/6912521/webrev.00/
>>>
>>> Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores.
>>>
>>> Roland.
>>>
>

From vladimir.kozlov at oracle.com Wed Jan 14 23:53:17 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 14 Jan 2015 15:53:17 -0800
Subject: RFR(XS) 8069021: Exclude compiler/codecache/stress tests from JPRT runs
Message-ID: <54B7016D.8070802@oracle.com>

http://cr.openjdk.java.net/~kvn/8069021/webrev/

The new codecache stress tests added by 8059551 added 45 min to the hotspot_compiler_2 execution time. My job was killed because its total execution took > 1 hour.

Fix tested in JPRT.
I filed separate bug to investigate long execution time: https://bugs.openjdk.java.net/browse/JDK-8069020 Thanks, Vladimir From igor.veresov at oracle.com Wed Jan 14 23:57:27 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 14 Jan 2015 15:57:27 -0800 Subject: RFR(XS) 8069021: Exclude compiler/codecache/stress tests from JPRT runs In-Reply-To: <54B7016D.8070802@oracle.com> References: <54B7016D.8070802@oracle.com> Message-ID: <1A347F7E-67E0-4160-B4E7-14BA592890FE@oracle.com> Good. igor > On Jan 14, 2015, at 3:53 PM, Vladimir Kozlov wrote: > > http://cr.openjdk.java.net/~kvn/8069021/webrev/ > > New codecache stress tests added by 8059551 added 45 min to hotspot_compiler_2 execution time. I got my job killed because total execution of it took > 1 hour. > > Fix tested in JPRT. > > I filed separate bug to investigate long execution time: > > https://bugs.openjdk.java.net/browse/JDK-8069020 > > Thanks, > Vladimir From vladimir.kozlov at oracle.com Wed Jan 14 23:59:53 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 14 Jan 2015 15:59:53 -0800 Subject: RFR(XS) 8069021: Exclude compiler/codecache/stress tests from JPRT runs In-Reply-To: <1A347F7E-67E0-4160-B4E7-14BA592890FE@oracle.com> References: <54B7016D.8070802@oracle.com> <1A347F7E-67E0-4160-B4E7-14BA592890FE@oracle.com> Message-ID: <54B702F9.1030206@oracle.com> Thanks! Vladimir On 1/14/15 3:57 PM, Igor Veresov wrote: > Good. > > igor > >> On Jan 14, 2015, at 3:53 PM, Vladimir Kozlov wrote: >> >> http://cr.openjdk.java.net/~kvn/8069021/webrev/ >> >> New codecache stress tests added by 8059551 added 45 min to hotspot_compiler_2 execution time. I got my job killed because total execution of it took > 1 hour. >> >> Fix tested in JPRT. 
>> >> I filed separate bug to investigate long execution time: >> >> https://bugs.openjdk.java.net/browse/JDK-8069020 >> >> Thanks, >> Vladimir > From volker.simonis at gmail.com Thu Jan 15 08:24:32 2015 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 15 Jan 2015 09:24:32 +0100 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat In-Reply-To: <54B6B98A.5070302@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> <54B6B98A.5070302@oracle.com> Message-ID: Hi Vladimir, please find Axels webrev here: http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909/ Regards, Volker PS: unfortunately Axel can't access his OpenJDK webrev-space any more since it was moved sometimes last summer but we will try once again to reactivate it now. On Wed, Jan 14, 2015 at 7:46 PM, Vladimir Kozlov wrote: > Hi Axel, > > Thank you for looking on this issue. > Before we start reviewing it, please, publish webrev on cr.openjdk.java.net. > It is requirement for accepting changes. Your colleagues can help you. > > Regards, > Vladimir > > > On 1/14/15 2:53 AM, Siebenborn, Axel wrote: >> >> Hi, >> >> I investigated a crash with jdk8_u25 and opened the following bug: >> >> https://bugs.openjdk.java.net/browse/JDK-8068909 >> >> I would suggest the following fix: >> >> http://www.sapjvm.com/as/webrevs/8068909/ >> >> If the control of the inserted load, to NULL. In this case, its >> corresponding nullcheck will be found as required edge >> to the CastPP during MemNode::Ideal_common_DU_postCCP. >> >> Thanks, >> >> Axel >> > From tobias.hartmann at oracle.com Thu Jan 15 08:58:22 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 15 Jan 2015 09:58:22 +0100 Subject: [9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34 Message-ID: <54B7812E.9080504@oracle.com> Hi, please review the following patch. 
https://bugs.openjdk.java.net/browse/JDK-8064940
http://cr.openjdk.java.net/~thartmann/8064940/webrev.01/

Problem:
Promotion testing revealed a performance regression for the JMH-Javac benchmarks on Solaris Sparc introduced in b34 by JDK-8015774. While investigating, I noticed that the number of iTLB misses greatly increases with code cache segmentation enabled (40190 vs. 129806235), causing the regression. This is due to large page support (-XX:+UseLargePages) being enabled on Sparc.

Without code cache segmentation the single code heap uses only large (4M) pages:

Address           Kbytes    RSS   Anon Locked Pgsz Mode
FFFFFFFF69000000   32768  32768  32768      -   4M rwx--

iTLB misses: 40190 (one run)

With code cache segmentation the code heaps do not use large pages and due to JDK-8066875 not even the middle region of the underlying virtual space uses large pages:

Address           Kbytes    RSS   Anon Locked Pgsz Mode
FFFFFFFF69000000    4544   4544   4544      -  64K rwx--

FFFFFFFF697CE000       8      8      8      -   8K rwx--
FFFFFFFF697D0000    1984   1984   1984      -  64K rwx--
FFFFFFFF699C0000   16456  16456  16456      -   8K rwx--
FFFFFFFF6A9D2000      48      -      -      -    - rwx--

FFFFFFFF70BE8000      32     32     32      -   8K rwx--
FFFFFFFF70BF0000    1984   1984   1984      -  64K rwx--
FFFFFFFF70DE0000   10040  10040  10040      -   8K rwx--
FFFFFFFF717AE000      40      -      -      -    - rwx--

iTLB misses: 129806235 (one run)

As a result, a high number of 8K and 64K pages are used to cover the code cache, resulting in an increased number of iTLB misses and degraded performance.

Solution:
By aligning the code heap sizes to the large page size we make sure that each code heap can be covered by large pages:

Address           Kbytes    RSS   Anon Locked Pgsz Mode
FFFFFFFF69000000    8192   8192   8192      -   4M rwx--
FFFFFFFF69800000   16384  16384  16384      -   4M rwx--
FFFFFFFF70C00000    4096   4096   4096      -   4M rwx--

iTLB misses: 40054 (one run)

I also had to adapt the 'print code cache' test because it assumes that the code heap sizes set on the command line are equal to the runtime sizes. This is not true if we align them to large pages.
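For illustration only (this is not taken from the webrev): a minimal sketch of the size rounding that "aligning to the large page size" implies, assuming the page size is a power of two. The names align_up, is_page_multiple and kLargePage are hypothetical and are not HotSpot functions, and whether the actual patch rounds up or down is not stated in this message.

```cpp
#include <cstddef>

// Hypothetical helper, not the HotSpot API: round `size` up to the next
// multiple of `alignment`. Assumes `alignment` is a power of two.
constexpr std::size_t align_up(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: true if `size` is a whole number of pages,
// i.e. a region of that size can be covered by pages of `page_size` alone.
constexpr bool is_page_multiple(std::size_t size, std::size_t page_size) {
    return (size & (page_size - 1)) == 0;
}

constexpr std::size_t K = 1024;
constexpr std::size_t M = 1024 * K;

// 4M is the large page size that appears in the pmap output above.
constexpr std::size_t kLargePage = 4 * M;
```

With these definitions, a heap size that is not a multiple of 4M (say 6536K, an arbitrary example value) would be rounded up to 8192K, and a size that is already a multiple of 4M would be left unchanged. This also shows why a test cannot assume that the runtime size equals the size given on the command line.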
There is an existing RFE for additional alignment tests [1] that will cover this case. Note: The fix depends on [2]. Testing: - JPRT - Performance testing (see separate email) - Manually tested on Windows with large pages enabled Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8067135 [2] https://bugs.openjdk.java.net/browse/JDK-8066875 From albert.noll at oracle.com Thu Jan 15 09:59:08 2015 From: albert.noll at oracle.com (Albert Noll) Date: Thu, 15 Jan 2015 10:59:08 +0100 Subject: [9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34 In-Reply-To: <54B7812E.9080504@oracle.com> References: <54B7812E.9080504@oracle.com> Message-ID: <54B78F6C.9010603@oracle.com> Hi Tobias, thanks for the detailed analysis. The fix looks good to me (not a reviewer). Best, Albert On 01/15/2015 09:58 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch. > > https://bugs.openjdk.java.net/browse/JDK-8064940 > http://cr.openjdk.java.net/~thartmann/8064940/webrev.01/ > > Problem: > Promotion testing revealed a performance regression for the JMH-Javac benchmarks > on Solaris Sparc introduced in b34 by JDK-8015774. While investigating, I > noticed that the number of iTLB misses greatly increases with code cache > segmentation enabled (40190 vs. 129806235) causing the regression. This is due > to large page support (-XX:+UseLargePages) being enabled on Sparc. 
> > Without code cache segmentation the single code heap uses only large (4M) pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 32768 32768 32768 - 4M rwx-- > > iTLB misses: 40190 (one run) > > With code cache segmentation the code heaps do not use large pages and due to > JDK-8066875 not even the middle region of the underlying virtual space uses > large pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 4544 4544 4544 - 64K rwx-- > > FFFFFFFF697CE000 8 8 8 - 8K rwx-- > FFFFFFFF697D0000 1984 1984 1984 - 64K rwx-- > FFFFFFFF699C0000 16456 16456 16456 - 8K rwx-- > FFFFFFFF6A9D2000 48 - - - - rwx-- > > FFFFFFFF70BE8000 32 32 32 - 8K rwx-- > FFFFFFFF70BF0000 1984 1984 1984 - 64K rwx-- > FFFFFFFF70DE0000 10040 10040 10040 - 8K rwx-- > FFFFFFFF717AE000 40 - - - - rwx-- > > iTLB misses: 129806235 (one run) > > As a result a high number 8K and 64K pages are used to cover the code cache, > resulting in an increased number of iTLB misses, degrading performance. > > Solution: > By aligning the code heap sizes to the large page size we make sure that each > code heap can be covered by large pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 8192 8192 8192 - 4M rwx-- > FFFFFFFF69800000 16384 16384 16384 - 4M rwx-- > FFFFFFFF70C00000 4096 4096 4096 - 4M rwx-- > > iTLB misses: 40054 (one run) > > I also had to adapt the 'print code cache' test because it assumes that the code > heap sizes set on the command line are equal the runtime sizes. This is not true > if we align them to large pages. There is an existing RFE for additional > alignment tests [1] that will cover this case. > > Note: The fix depends on [2]. 
> > Testing: > - JPRT > - Performance testing (see separate email) > - Manually tested on Windows with large pages enabled > > Thanks, > Tobias > > > [1] https://bugs.openjdk.java.net/browse/JDK-8067135 > [2] https://bugs.openjdk.java.net/browse/JDK-8066875 From tobias.hartmann at oracle.com Thu Jan 15 10:00:51 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 15 Jan 2015 11:00:51 +0100 Subject: [9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34 In-Reply-To: <54B78F6C.9010603@oracle.com> References: <54B7812E.9080504@oracle.com> <54B78F6C.9010603@oracle.com> Message-ID: <54B78FD3.9030707@oracle.com> Thanks, Albert! Best, Tobias On 15.01.2015 10:59, Albert Noll wrote: > Hi Tobias, > > thanks for the detailed analysis. The fix looks good to me (not a reviewer). > > Best, > Albert > > On 01/15/2015 09:58 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch. >> >> https://bugs.openjdk.java.net/browse/JDK-8064940 >> http://cr.openjdk.java.net/~thartmann/8064940/webrev.01/ >> >> Problem: >> Promotion testing revealed a performance regression for the JMH-Javac benchmarks >> on Solaris Sparc introduced in b34 by JDK-8015774. While investigating, I >> noticed that the number of iTLB misses greatly increases with code cache >> segmentation enabled (40190 vs. 129806235) causing the regression. This is due >> to large page support (-XX:+UseLargePages) being enabled on Sparc. 
>> >> Without code cache segmentation the single code heap uses only large (4M) pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 32768 32768 32768 - 4M rwx-- >> >> iTLB misses: 40190 (one run) >> >> With code cache segmentation the code heaps do not use large pages and due to >> JDK-8066875 not even the middle region of the underlying virtual space uses >> large pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 4544 4544 4544 - 64K rwx-- >> >> FFFFFFFF697CE000 8 8 8 - 8K rwx-- >> FFFFFFFF697D0000 1984 1984 1984 - 64K rwx-- >> FFFFFFFF699C0000 16456 16456 16456 - 8K rwx-- >> FFFFFFFF6A9D2000 48 - - - - rwx-- >> >> FFFFFFFF70BE8000 32 32 32 - 8K rwx-- >> FFFFFFFF70BF0000 1984 1984 1984 - 64K rwx-- >> FFFFFFFF70DE0000 10040 10040 10040 - 8K rwx-- >> FFFFFFFF717AE000 40 - - - - rwx-- >> >> iTLB misses: 129806235 (one run) >> >> As a result a high number 8K and 64K pages are used to cover the code cache, >> resulting in an increased number of iTLB misses, degrading performance. >> >> Solution: >> By aligning the code heap sizes to the large page size we make sure that each >> code heap can be covered by large pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 8192 8192 8192 - 4M rwx-- >> FFFFFFFF69800000 16384 16384 16384 - 4M rwx-- >> FFFFFFFF70C00000 4096 4096 4096 - 4M rwx-- >> >> iTLB misses: 40054 (one run) >> >> I also had to adapt the 'print code cache' test because it assumes that the code >> heap sizes set on the command line are equal the runtime sizes. This is not true >> if we align them to large pages. There is an existing RFE for additional >> alignment tests [1] that will cover this case. >> >> Note: The fix depends on [2]. 
>>
>> Testing:
>> - JPRT
>> - Performance testing (see separate email)
>> - Manually tested on Windows with large pages enabled
>>
>> Thanks,
>> Tobias
>>
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8067135
>> [2] https://bugs.openjdk.java.net/browse/JDK-8066875
>

From zoltan.majo at oracle.com Thu Jan 15 10:10:37 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Thu, 15 Jan 2015 11:10:37 +0100
Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method
In-Reply-To: <54B57604.3020306@oracle.com>
References: <54B50457.9080609@oracle.com> <54B50D43.3020609@oracle.com> <54B54C54.5030400@oracle.com> <54B57604.3020306@oracle.com>
Message-ID: <54B7921D.8090701@oracle.com>

Hi Vladimir,

thank you for the review!

On 01/13/2015 08:46 PM, Vladimir Kozlov wrote:
> ifg.cpp: could be
>
> + return !def->has_user(Op_SCMemProj);

In ifg.cpp on lines 538-539 we return only if def->has_user(Op_SCMemProj) is true. If it is false, we don't return yet. So I think the code in webrev.01 is correct, but I might be missing something.

> Otherwise looks good. I thought we have more cases.

I hoped, too, that we have more cases.

Thank you and best regards,

Zoltan

> Thanks,
> Vladimir
>
> On 1/13/15 8:48 AM, Zoltán Majó wrote:
>> Hi Albert,
>>
>>
>> thank you for the feedback!
>>
>> On 01/13/2015 01:19 PM, Albert Noll wrote:
>>> Hi Zoltan,
>>>
>>> What do you think about the following interface to find users?
>>>
>>> If we do not need the return value, we add the following function:
>>> bool find_user(int opcode);
>>
>> that is a good idea. I added the method.
>>
>>> If we look for more than 1 user like in this example:
>>>
>>> if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != NULL
>>>     || n->find_user(Op_StoreN) != NULL || n->find_user(Op_LoadN)) {
>>>
>>> would it make sense to have the following function?
>>>
>>> bool find_user(int opc1, int opc2, int opc3, int opc4);
>>
>> I added that method as well.
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8066312/webrev.01/
>>
>> All JPRT tests pass.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>> Best,
>>> Albert
>>>
>>> On 01/13/2015 12:41 PM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the following small patch.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312
>>>>
>>>> Problem: There are some locations in the source code that search for
>>>> users of a node of a particular type.
>>>>
>>>> Solution: To simplify the code, this patch adds a new method,
>>>> Node::find_user(int opc) that can be used for searching. This patch
>>>> also updates some comments in the source code.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/
>>>>
>>>> Testing: JPRT
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>
>>

From pavel.chistyakov at oracle.com Thu Jan 15 10:10:53 2015
From: pavel.chistyakov at oracle.com (Pavel Chistyakov)
Date: Thu, 15 Jan 2015 13:10:53 +0300
Subject: RFR(XXS): 8068231: Several tests are still excluded
Message-ID: <54B7922D.2000108@oracle.com>

Hi all,

could you please take a look into this extra small change:
Test compiler/loopopts/7052494/Test7052494.java is removed from exclude list (removed @ignore tag)

webrev: http://cr.openjdk.java.net/~pchistyakov/8068231/webrev.00/

JBS: https://bugs.openjdk.java.net/browse/JDK-8068231

-------------------
Thanks,
Pavel

From pavel.chistyakov at oracle.com Thu Jan 15 10:10:57 2015
From: pavel.chistyakov at oracle.com (Pavel Chistyakov)
Date: Thu, 15 Jan 2015 13:10:57 +0300
Subject: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list
Message-ID: <54B79231.9050203@oracle.com>

Hi all,

could you please take a look into this extra small change:
Test is removed from exclude list (removed from ProblemList.txt)

Webrev: http://cr.openjdk.java.net/~pchistyakov/8068234/webrev.00/

JBS: https://bugs.openjdk.java.net/browse/JDK-8068234
------------------- Thanks, Pavel From igor.ignatyev at oracle.com Thu Jan 15 10:26:40 2015 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 15 Jan 2015 13:26:40 +0300 Subject: RFR(XXS): 8068231: Several tests are still excluded In-Reply-To: <54B7922D.2000108@oracle.com> References: <54B7922D.2000108@oracle.com> Message-ID: <54B795E0.8000805@oracle.com> Hi Pavel, looks good to me -Igor On 01/15/2015 01:10 PM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test compiler/loopopts/7052494/Test7052494.java is removed from exclude > list (removed @ignore tag) > > webrev: http://cr.openjdk.java.net/~pchistyakov/8068231/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068231 > > ------------------- > Thanks, > Pavel From igor.ignatyev at oracle.com Thu Jan 15 10:26:53 2015 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 15 Jan 2015 13:26:53 +0300 Subject: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list In-Reply-To: <54B79231.9050203@oracle.com> References: <54B79231.9050203@oracle.com> Message-ID: <54B795ED.1000201@oracle.com> Hi Pavel, looks good to me -Igor On 01/15/2015 01:10 PM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test is removed from exclude list (removed from ProblemList.txt) > > Webrev: http://cr.openjdk.java.net/~pchistyakov/8068234/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068234 > > ------------------- > Thanks, > Pavel From pavel.chistyakov at oracle.com Thu Jan 15 10:20:45 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Thu, 15 Jan 2015 13:20:45 +0300 Subject: RFR(XXS): 8068231: Several tests are still excluded In-Reply-To: <54B795E0.8000805@oracle.com> References: <54B7922D.2000108@oracle.com> <54B795E0.8000805@oracle.com> Message-ID: <54B7947D.5010209@oracle.com> Igor, thank you for review. 
--------------- Regards, Pavel On 15.01.2015 13:26, Igor Ignatyev wrote: > Hi Pavel, > > looks good to me > > -Igor > > On 01/15/2015 01:10 PM, Pavel Chistyakov wrote: >> Hi all, >> >> could you please take a look into this extra small change: >> Test compiler/loopopts/7052494/Test7052494.java is removed from exclude >> list (removed @ignore tag) >> >> webrev: http://cr.openjdk.java.net/~pchistyakov/8068231/webrev.00/ >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8068231 >> >> ------------------- >> Thanks, >> Pavel From pavel.chistyakov at oracle.com Thu Jan 15 10:20:52 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Thu, 15 Jan 2015 13:20:52 +0300 Subject: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list In-Reply-To: <54B795ED.1000201@oracle.com> References: <54B79231.9050203@oracle.com> <54B795ED.1000201@oracle.com> Message-ID: <54B79484.90606@oracle.com> Igor, thank you for review. --------------- Regards, Pavel On 15.01.2015 13:26, Igor Ignatyev wrote: > Hi Pavel, > > looks good to me > > -Igor > > On 01/15/2015 01:10 PM, Pavel Chistyakov wrote: >> Hi all, >> >> could you please take a look into this extra small change: >> Test is removed from exclude list (removed from ProblemList.txt) >> >> Webrev: http://cr.openjdk.java.net/~pchistyakov/8068234/webrev.00/ >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8068234 >> >> ------------------- >> Thanks, >> Pavel From zoltan.majo at oracle.com Thu Jan 15 10:58:27 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Thu, 15 Jan 2015 11:58:27 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> Message-ID: <54B79D53.5090408@oracle.com> Hi John, thank you for the review! On 01/14/2015 01:12 AM, John Rose wrote: > Good. 
>
> One suggestion: Call it "find_out", not "find_user". The term "out" is more in use for Node than "user"; cf. Node::unique_out, raw_out.

I changed the names of the methods, as you suggested.

Here is the updated webrev:
http://cr.openjdk.java.net/~zmajo/8066312/webrev.02/

Thank you and best regards,

Zoltan

>
> - John
>
> On Jan 13, 2015, at 3:41 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the following small patch.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312
>>
>> Problem: There are some locations in the source code that search for users of a node of a particular type.
>>
>> Solution: To simplify the code, this patch adds a new method, Node::find_user(int opc) that can be used for searching. This patch also updates some comments in the source code.
>>
>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/
>>
>> Testing: JPRT
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>

From bertrand.delsart at oracle.com Thu Jan 15 11:13:08 2015
From: bertrand.delsart at oracle.com (Bertrand Delsart)
Date: Thu, 15 Jan 2015 12:13:08 +0100
Subject: A hotspot patch for stack profiling (frame pointer)
In-Reply-To:
References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> <54B5CF3D.4070307@oracle.com> <54B68056.3090706@oracle.com>
Message-ID: <54B7A0C4.6040009@oracle.com>

On 14/01/2015 20:12, John Rose wrote:
> On Jan 14, 2015, at 6:42 AM, Bertrand Delsart wrote:
>>
>> I would not prevent the JITs from using RBP as long as the changeset
>> is not sufficient to guarantee the profiling will work... and IMHO
>> solving the JSR292 issue will be much more intrusive (impacting
>> HotSpot stack walking code).
>
> Here are some thoughts on that.
>
> SPARC uses L7 (L7_mh_SP_save) for the same purpose of method handle
> support as x86 uses RBP (rbp_mh_SP_save). So there's not a hard
> requirement for x86 to take over RBP.
> > (Deep background: This purpose, in method handle support, is to allow > an adapter to make changes to the caller's SP. The adapter is the > initial callee from the caller, but may change argument shape, and > tail-calls the ultimate callee. Because it is a tail-call, the original > caller must have a spot where his original SP can be preserved. The > preservation works because the original caller knows he is calling a > MH.invoke method, which requires the extra argument preservation. The > repertoire of argument shape changes is quite small, actually; it is not > a very general mechanism since the LF machinery was put in. Perhaps the > whole thing could be removed somehow, by finding alternative techniques > for the few remaining changes. OTOH, this SP-restoring mechanism may be > helpful in doing more a general tail-call mechanism, and perhaps in > managing int/comp mode changes more cleanly, so I'd like us to keep it. > And document it better.) > > Any register or stack slot will do for this purpose, as long as (i) its > value can be recovered after the MH.invoke call returns to the caller, > and (ii) its value can be dug up somehow during stack walking. There > are only a couple of places where stack walking code needs to sample the > value, so they should be adjustable. > > Both x86 and SPARC use registers which are callee-save (or "non-volatile > across calls") which satisfy properties (i) and (ii). A standard stack > slot (addressed based on caller's RBP) would probably also satisfy those > properties. > > A variably-positioned stack slot would also work, which would require > registering the position in each CodeBlob. That's unpleasant extra > detail, but it would align somewhat with the current logic which allows > each CodeBlob (nmethod, actually) to advertise which call sites need the > special processing (see the function is_method_handle_return(caller_pc)). 
>
> I recommend reserving a dead word of space in every stack frame that
> makes MH.invoke calls, at a fixed position relative to that frame's RBP.
>
> - John

I perfectly agree that it is doable (and with your proposed approach).

I just wanted to be sure people were aware that the RFE is more complex than what the current changeset may suggest. We are not just talking about reviewing and integrating a complete changeset contributed by the community. There is more work needed, either by the community or by Oracle. This will require changes at least in C1 and C2 call sequences, in the stack walking, in the creation and sizing of compiled frames...

Regards,

Bertrand.

--
Bertrand Delsart, Grenoble Engineering Center
Oracle, 180 av. de l'Europe, ZIRST de Montbonnot
38330 Montbonnot Saint Martin, FRANCE
bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23

From vladimir.kozlov at oracle.com Thu Jan 15 17:47:24 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 15 Jan 2015 09:47:24 -0800
Subject: [9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34
In-Reply-To: <54B7812E.9080504@oracle.com>
References: <54B7812E.9080504@oracle.com>
Message-ID: <54B7FD2C.9050307@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/15/15 12:58 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch.
> > https://bugs.openjdk.java.net/browse/JDK-8064940 > http://cr.openjdk.java.net/~thartmann/8064940/webrev.01/ > > Problem: > Promotion testing revealed a performance regression for the JMH-Javac benchmarks > on Solaris Sparc introduced in b34 by JDK-8015774. While investigating, I > noticed that the number of iTLB misses greatly increases with code cache > segmentation enabled (40190 vs. 129806235) causing the regression. This is due > to large page support (-XX:+UseLargePages) being enabled on Sparc. > > Without code cache segmentation the single code heap uses only large (4M) pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 32768 32768 32768 - 4M rwx-- > > iTLB misses: 40190 (one run) > > With code cache segmentation the code heaps do not use large pages and due to > JDK-8066875 not even the middle region of the underlying virtual space uses > large pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 4544 4544 4544 - 64K rwx-- > > FFFFFFFF697CE000 8 8 8 - 8K rwx-- > FFFFFFFF697D0000 1984 1984 1984 - 64K rwx-- > FFFFFFFF699C0000 16456 16456 16456 - 8K rwx-- > FFFFFFFF6A9D2000 48 - - - - rwx-- > > FFFFFFFF70BE8000 32 32 32 - 8K rwx-- > FFFFFFFF70BF0000 1984 1984 1984 - 64K rwx-- > FFFFFFFF70DE0000 10040 10040 10040 - 8K rwx-- > FFFFFFFF717AE000 40 - - - - rwx-- > > iTLB misses: 129806235 (one run) > > As a result a high number 8K and 64K pages are used to cover the code cache, > resulting in an increased number of iTLB misses, degrading performance. 
> > Solution: > By aligning the code heap sizes to the large page size we make sure that each > code heap can be covered by large pages: > > Address Kbytes RSS Anon Locked Pgsz Mode > FFFFFFFF69000000 8192 8192 8192 - 4M rwx-- > FFFFFFFF69800000 16384 16384 16384 - 4M rwx-- > FFFFFFFF70C00000 4096 4096 4096 - 4M rwx-- > > iTLB misses: 40054 (one run) > > I also had to adapt the 'print code cache' test because it assumes that the code > heap sizes set on the command line are equal the runtime sizes. This is not true > if we align them to large pages. There is an existing RFE for additional > alignment tests [1] that will cover this case. > > Note: The fix depends on [2]. > > Testing: > - JPRT > - Performance testing (see separate email) > - Manually tested on Windows with large pages enabled > > Thanks, > Tobias > > > [1] https://bugs.openjdk.java.net/browse/JDK-8067135 > [2] https://bugs.openjdk.java.net/browse/JDK-8066875 > From vladimir.kozlov at oracle.com Thu Jan 15 17:50:01 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 09:50:01 -0800 Subject: A hotspot patch for stack profiling (frame pointer) In-Reply-To: <54B7A0C4.6040009@oracle.com> References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> <54B5CF3D.4070307@oracle.com> <54B68056.3090706@oracle.com> <54B7A0C4.6040009@oracle.com> Message-ID: <54B7FDC9.8020103@oracle.com> Thank you, Bertrand and John I added this conversation to the bug report. Thanks, Vladimir On 1/15/15 3:13 AM, Bertrand Delsart wrote: > On 14/01/2015 20:12, John Rose wrote: >> On Jan 14, 2015, at 6:42 AM, Bertrand Delsart >> > wrote: >>> >>> I would not prevent the JITs from using RBP as long as the changeset >>> is not sufficient to guarantee the profiling will work... and IMHO >>> solving the JSR292 issue will be much more intrusive (impacting >>> HotSpot stack walking code). >> >> Here are some thoughts on that. 
>> >> SPARC uses L7 (L7_mh_SP_save) for the same purpose of method handle >> support as x86 uses RBP (rbp_mh_SP_save). So there's not a hard >> requirement for x86 to take over RBP. >> >> (Deep background: This purpose, in method handle support, is to allow >> an adapter to make changes to the caller's SP. The adapter is the >> initial callee from the caller, but may change argument shape, and >> tail-calls the ultimate callee. Because it is a tail-call, the original >> caller must have a spot where his original SP can be preserved. The >> preservation works because the original caller knows he is calling a >> MH.invoke method, which requires the extra argument preservation. The >> repertoire of argument shape changes is quite small, actually; it is not >> a very general mechanism since the LF machinery was put in. Perhaps the >> whole thing could be removed somehow, by finding alternative techniques >> for the few remaining changes. OTOH, this SP-restoring mechanism may be >> helpful in doing more a general tail-call mechanism, and perhaps in >> managing int/comp mode changes more cleanly, so I'd like us to keep it. >> And document it better.) >> >> Any register or stack slot will do for this purpose, as long as (i) its >> value can be recovered after the MH.invoke call returns to the caller, >> and (ii) its value can be dug up somehow during stack walking. There >> are only a couple of places where stack walking code needs to sample the >> value, so they should be adjustable. >> >> Both x86 and SPARC use registers which are callee-save (or "non-volatile >> across calls") which satisfy properties (i) and (ii). A standard stack >> slot (addressed based on caller's RBP) would probably also satisfy those >> properties. >> >> A variably-positioned stack slot would also work, which would require >> registering the position in each CodeBlob. 
That's unpleasant extra >> detail, but it would align somewhat with the current logic which allows >> each CodeBlob (nmethod, actually) to advertise which call sites need the >> special processing (see the function is_method_handle_return(caller_pc)). >> >> I recommend reserving a dead word of space in every stack frame that >> makes MH.invoke calls, at a fixed position relative to that frame's RBP. >> >> - John > > I perfectly agree that it is doable (and with your proposed approach). > > I just wanted to be sure people were aware that the RFE is more complex > than what the current changeset may suggest. We are not just talking > about reviewing and integrating a complete changeset contributed by the > community. There is more work needed, either by the community or by > Oracle. This will require changes at least in C1 and C2 call sequences, > in the stack walking, in the creation and sizing of compiled frames... > > Regards, > > Bertrand. From vladimir.kozlov at oracle.com Thu Jan 15 18:09:46 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 10:09:46 -0800 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat In-Reply-To: References: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> <54B6B98A.5070302@oracle.com> Message-ID: <54B8026A.8030705@oracle.com> Yes, the fix looks reasonable. We had other similar cases which were fixed in a similar way. Add a comment explaining why we have NULL control there. And you need to add the test attached to the bug report to the compiler regression tests. We will verify it with 8030976 disabled. Also, it is too late for 8u40, so we will backport it to 8u60.
Thanks, Vladimir On 1/15/15 12:24 AM, Volker Simonis wrote: > Hi Vladimir, > > please find Axels webrev here: > > http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909/ > > Regards, > Volker > > PS: unfortunately Axel can't access his OpenJDK webrev-space any more > since it was moved sometimes last summer but we will try once again to > reactivate it now. > > On Wed, Jan 14, 2015 at 7:46 PM, Vladimir Kozlov > wrote: >> Hi Axel, >> >> Thank you for looking on this issue. >> Before we start reviewing it, please, publish webrev on cr.openjdk.java.net. >> It is requirement for accepting changes. Your colleagues can help you. >> >> Regards, >> Vladimir >> >> >> On 1/14/15 2:53 AM, Siebenborn, Axel wrote: >>> >>> Hi, >>> >>> I investigated a crash with jdk8_u25 and opened the following bug: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8068909 >>> >>> I would suggest the following fix: >>> >>> http://www.sapjvm.com/as/webrevs/8068909/ >>> >>> If the control of the inserted load, to NULL. In this case, its >>> corresponding nullcheck will be found as required edge >>> to the CastPP during MemNode::Ideal_common_DU_postCCP. >>> >>> Thanks, >>> >>> Axel >>> >> From vladimir.kozlov at oracle.com Thu Jan 15 18:12:02 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 10:12:02 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B7921D.8090701@oracle.com> References: <54B50457.9080609@oracle.com> <54B50D43.3020609@oracle.com> <54B54C54.5030400@oracle.com> <54B57604.3020306@oracle.com> <54B7921D.8090701@oracle.com> Message-ID: <54B802F2.9060001@oracle.com> On 1/15/15 2:10 AM, Zolt?n Maj? wrote: > Hi Vladimir, > > > thank you for the review! > > On 01/13/2015 08:46 PM, Vladimir Kozlov wrote: >> ifg.cpp: could be >> >> + return !def->has_user(Op_SCMemProj); > > In ifg.cpp on lines 538-539 we return only if > def->has_user(Op_SCMemProj) is true. If it is false, we don't return > yet. 
So I think the code in webrev.01 is correct, but I might miss > something. Yes, you are correct. Sorry for noise. The fix is good to go. Thanks, Vladimir > >> Otherwise looks good. I thought we have more cases. > > I hoped, too, that we have more cases. > > Thank you and best regards, > > > Zoltan > >> Thanks, >> Vladimir >> >> On 1/13/15 8:48 AM, Zoltán Majó wrote: >>> Hi Albert, >>> >>> >>> thank you for the feedback! >>> >>> On 01/13/2015 01:19 PM, Albert Noll wrote: >>>> Hi Zoltan, >>>> >>>> What do you think about the following interface to find users? >>>> >>>> If we do not need the return value, we add the following function: >>>> bool find_user(int opcode); >>> >>> that is a good idea. I added the method. >>> >>>> If we look for more than 1 user like in this example: >>>> >>>> if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != >>>> NULL >>>> || n->find_user(Op_StoreN) != NULL || >>>> n->find_user(Op_LoadN)) { >>>> >>>> would it make sense to have the following function? >>>> >>>> bool find_user(int opc1, int opc2, int opc3, int opc4); >>> >>> I added that method as well. >>> >>> Here is the updated webrev: >>> http://cr.openjdk.java.net/~zmajo/8066312/webrev.01/ >>> >>> All JPRT tests pass. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>>> Best, >>>> Albert >>>> >>>> On 01/13/2015 12:41 PM, Zoltán Majó wrote: >>>>> Hi, >>>>> >>>>> >>>>> please review the following small patch. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>>>> >>>>> Problem: There are some locations in the source code that search for >>>>> users of a node of a particular type. >>>>> >>>>> Solution: To simplify the code, this patch adds a new method, >>>>> Node::find_user(int opc) that can be used for searching. This patch >>>>> also updates some comments in the source code.
>>>>> >>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>>>> >>>>> Testing: JPRT >>>>> >>>>> Thank you and best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>>> >>> > From vladimir.kozlov at oracle.com Thu Jan 15 18:14:13 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 10:14:13 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B79D53.5090408@oracle.com> References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> <54B79D53.5090408@oracle.com> Message-ID: <54B80375.8050804@oracle.com> New webrev is good. Thanks, Vladimir On 1/15/15 2:58 AM, Zoltán Majó wrote: > Hi John, > > > thank you for the review! > > On 01/14/2015 01:12 AM, John Rose wrote: >> Good. >> >> One suggestion: Call it "find_out", not "find_user". The term "out" >> is more in use for Node than "user"; cf. Node::unique_out, raw_out. > > I changed the names of the methods, as you suggested. Here is the > updated webrev: > > http://cr.openjdk.java.net/~zmajo/8066312/webrev.02/ > > Thank you and best regards, > > > Zoltan > >> >> - John >> >> On Jan 13, 2015, at 3:41 AM, Zoltán Majó wrote: >>> Hi, >>> >>> >>> please review the following small patch. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>> >>> Problem: There are some locations in the source code that search for >>> users of a node of a particular type. >>> >>> Solution: To simplify the code, this patch adds a new method, >>> Node::find_user(int opc) that can be used for searching. This patch >>> also updates some comments in the source code.
>>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>> >>> Testing: JPRT >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> > From vladimir.kozlov at oracle.com Thu Jan 15 18:22:40 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 10:22:40 -0800 Subject: RFR(XXS): 8068231: Several tests are still excluded In-Reply-To: <54B7922D.2000108@oracle.com> References: <54B7922D.2000108@oracle.com> Message-ID: <54B80570.1080707@oracle.com> Looks good. I want to ask (because of 8069020) that you should verify that any test you added to tests running in JPRT will not affect its execution time significantly. Thanks, Vladimir On 1/15/15 2:10 AM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test compiler/loopopts/7052494/Test7052494.java is removed from exclude > list (removed @ignore tag) > > webrev: http://cr.openjdk.java.net/~pchistyakov/8068231/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068231 > > ------------------- > Thanks, > Pavel From vladimir.kozlov at oracle.com Thu Jan 15 18:34:42 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 15 Jan 2015 10:34:42 -0800 Subject: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list In-Reply-To: <54B79231.9050203@oracle.com> References: <54B79231.9050203@oracle.com> Message-ID: <54B80842.20201@oracle.com> Good. 
Thanks, Vladimir On 1/15/15 2:10 AM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test is removed from exclude list (removed from ProblemList.txt) > > Webrev: http://cr.openjdk.java.net/~pchistyakov/8068234/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068234 > > ------------------- > Thanks, > Pavel From pavel.chistyakov at oracle.com Thu Jan 15 19:14:01 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Thu, 15 Jan 2015 11:14:01 -0800 (PST) Subject: RFR(XXS): 8068231: Several tests are still excluded Message-ID: <53248b52-1caa-47ec-a482-64f19f5434b0@default> Vladimir, thank you for review! ------------ Regards, Pavel ----- Original Message ----- From: vladimir.kozlov at oracle.com To: hotspot-compiler-dev at openjdk.java.net Sent: Thursday, January 15, 2015 9:22:22 PM GMT +04:00 Abu Dhabi / Muscat Subject: Re: RFR(XXS): 8068231: Several tests are still excluded Looks good. I want to ask (because of 8069020) that you should verify that any test you added to tests running in JPRT will not affect its execution time significantly. Thanks, Vladimir On 1/15/15 2:10 AM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test compiler/loopopts/7052494/Test7052494.java is removed from exclude > list (removed @ignore tag) > > webrev: http://cr.openjdk.java.net/~pchistyakov/8068231/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068231 > > ------------------- > Thanks, > Pavel From pavel.chistyakov at oracle.com Thu Jan 15 19:14:14 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Thu, 15 Jan 2015 11:14:14 -0800 (PST) Subject: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list Message-ID: <0e8adeaf-7304-4371-a18d-64818ad17568@default> Vladimir, thank you for review! 
------------ Regards, Pavel ----- Original Message ----- From: vladimir.kozlov at oracle.com To: hotspot-compiler-dev at openjdk.java.net Sent: Thursday, January 15, 2015 9:34:21 PM GMT +04:00 Abu Dhabi / Muscat Subject: Re: RFR(XXS): 8068234: java/lang/instrument/NativeMethodPrefixAgent.java is still in exclude list Good. Thanks, Vladimir On 1/15/15 2:10 AM, Pavel Chistyakov wrote: > Hi all, > > could you please take a look into this extra small change: > Test is removed from exclude list (removed from ProblemList.txt) > > Webrev: http://cr.openjdk.java.net/~pchistyakov/8068234/webrev.00/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8068234 > > ------------------- > Thanks, > Pavel From nils.eliasson at oracle.com Thu Jan 15 19:23:04 2015 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 15 Jan 2015 20:23:04 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure Message-ID: <54B81398.90409@oracle.com> Hi, Please review this change. The push of JDK-8027829 caused a nightly failure. The fix for allowing the signature to follow the method without a whitespace wasn't complete and broke the option parsing. 
This change includes these fixes: * Fixing the removal of whitespace in option command parsing * Adding the test to TEST.groups (~7s test time on a fast machine) * Added additional test cases for the CompilerCommandFile * Added testcases for option parsing together with a method pattern that includes a signature * Moved some testcases together that didn't require separate VMs Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 Best regards, Nils Eliasson From serguei.spitsyn at oracle.com Thu Jan 15 21:26:19 2015 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 15 Jan 2015 13:26:19 -0800 Subject: A hotspot patch for stack profiling (frame pointer) In-Reply-To: <54B7A0C4.6040009@oracle.com> References: <5486C67C.2030302@oracle.com> <5486CB9E.2090505@oracle.com> <54B5CF3D.4070307@oracle.com> <54B68056.3090706@oracle.com> <54B7A0C4.6040009@oracle.com> Message-ID: <54B8307B.5040208@oracle.com> On 1/15/15 3:13 AM, Bertrand Delsart wrote: > On 14/01/2015 20:12, John Rose wrote: >> On Jan 14, 2015, at 6:42 AM, Bertrand Delsart >> > >> wrote: >>> >>> I would not prevent the JITs from using RBP as long as the changeset >>> is not sufficient to guarantee the profiling will work... and IMHO >>> solving the JSR292 issue will be much more intrusive (impacting >>> HotSpot stack walking code). >> >> Here are some thoughts on that. >> >> SPARC uses L7 (L7_mh_SP_save) for the same purpose of method handle >> support as x86 uses RBP (rbp_mh_SP_save). So there's not a hard >> requirement for x86 to take over RBP. >> >> (Deep background: This purpose, in method handle support, is to allow >> an adapter to make changes to the caller's SP. The adapter is the >> initial callee from the caller, but may change argument shape, and >> tail-calls the ultimate callee. Because it is a tail-call, the original >> caller must have a spot where his original SP can be preserved. 
The >> preservation works because the original caller knows he is calling a >> MH.invoke method, which requires the extra argument preservation. The >> repertoire of argument shape changes is quite small, actually; it is not >> a very general mechanism since the LF machinery was put in. Perhaps the >> whole thing could be removed somehow, by finding alternative techniques >> for the few remaining changes. OTOH, this SP-restoring mechanism may be >> helpful in doing a more general tail-call mechanism, and perhaps in >> managing int/comp mode changes more cleanly, so I'd like us to keep it. >> And document it better.) >> >> Any register or stack slot will do for this purpose, as long as (i) its >> value can be recovered after the MH.invoke call returns to the caller, >> and (ii) its value can be dug up somehow during stack walking. There >> are only a couple of places where stack walking code needs to sample the >> value, so they should be adjustable. >> >> Both x86 and SPARC use registers which are callee-save (or "non-volatile >> across calls") which satisfy properties (i) and (ii). A standard stack >> slot (addressed based on caller's RBP) would probably also satisfy those >> properties. >> >> A variably-positioned stack slot would also work, which would require >> registering the position in each CodeBlob. That's unpleasant extra >> detail, but it would align somewhat with the current logic which allows >> each CodeBlob (nmethod, actually) to advertise which call sites need the >> special processing (see the function >> is_method_handle_return(caller_pc)). >> >> I recommend reserving a dead word of space in every stack frame that >> makes MH.invoke calls, at a fixed position relative to that frame's RBP. >> >> - John > > I perfectly agree that it is doable (and with your proposed approach). > > I just wanted to be sure people were aware that the RFE is more > complex than what the current changeset may suggest.
We are not just > talking about reviewing and integrating a complete changeset > contributed by the community. There is more work needed, either by the > community or by Oracle. This will require changes at least in C1 and > C2 call sequences, in the stack walking, in the creation and sizing of > compiled frames... Just a note about the stack walking... It also impacts some places that people are normally unaware of: - SA-based stack walking (jstack utility) - Solaris-specific: jhelper.d (dtrace jstack action support) and libjvm_db.so (pstack utility support) Thanks, Serguei > > Regards, > > Bertrand. > From tobias.hartmann at oracle.com Fri Jan 16 08:10:59 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 16 Jan 2015 09:10:59 +0100 Subject: [9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34 In-Reply-To: <54B7FD2C.9050307@oracle.com> References: <54B7812E.9080504@oracle.com> <54B7FD2C.9050307@oracle.com> Message-ID: <54B8C793.8040509@oracle.com> Thanks, Vladimir. Best, Tobias On 15.01.2015 18:47, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 1/15/15 12:58 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch. >> >> https://bugs.openjdk.java.net/browse/JDK-8064940 >> http://cr.openjdk.java.net/~thartmann/8064940/webrev.01/ >> >> Problem: >> Promotion testing revealed a performance regression for the JMH-Javac benchmarks >> on Solaris Sparc introduced in b34 by JDK-8015774. While investigating, I >> noticed that the number of iTLB misses greatly increases with code cache >> segmentation enabled (40190 vs. 129806235) causing the regression. This is due >> to large page support (-XX:+UseLargePages) being enabled on Sparc.
>> >> Without code cache segmentation the single code heap uses only large (4M) pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 32768 32768 32768 - 4M rwx-- >> >> iTLB misses: 40190 (one run) >> >> With code cache segmentation the code heaps do not use large pages and due to >> JDK-8066875 not even the middle region of the underlying virtual space uses >> large pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 4544 4544 4544 - 64K rwx-- >> >> FFFFFFFF697CE000 8 8 8 - 8K rwx-- >> FFFFFFFF697D0000 1984 1984 1984 - 64K rwx-- >> FFFFFFFF699C0000 16456 16456 16456 - 8K rwx-- >> FFFFFFFF6A9D2000 48 - - - - rwx-- >> >> FFFFFFFF70BE8000 32 32 32 - 8K rwx-- >> FFFFFFFF70BF0000 1984 1984 1984 - 64K rwx-- >> FFFFFFFF70DE0000 10040 10040 10040 - 8K rwx-- >> FFFFFFFF717AE000 40 - - - - rwx-- >> >> iTLB misses: 129806235 (one run) >> >> As a result a high number of 8K and 64K pages are used to cover the code cache, >> resulting in an increased number of iTLB misses, degrading performance. >> >> Solution: >> By aligning the code heap sizes to the large page size we make sure that each >> code heap can be covered by large pages: >> >> Address Kbytes RSS Anon Locked Pgsz Mode >> FFFFFFFF69000000 8192 8192 8192 - 4M rwx-- >> FFFFFFFF69800000 16384 16384 16384 - 4M rwx-- >> FFFFFFFF70C00000 4096 4096 4096 - 4M rwx-- >> >> iTLB misses: 40054 (one run) >> >> I also had to adapt the 'print code cache' test because it assumes that the code >> heap sizes set on the command line are equal to the runtime sizes. This is not true >> if we align them to large pages. There is an existing RFE for additional >> alignment tests [1] that will cover this case. >> >> Note: The fix depends on [2].
>> >> Testing: >> - JPRT >> - Performance testing (see separate email) >> - Manually tested on Windows with large pages enabled >> >> Thanks, >> Tobias >> >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8067135 >> [2] https://bugs.openjdk.java.net/browse/JDK-8066875 >> From albert.noll at oracle.com Fri Jan 16 09:27:07 2015 From: albert.noll at oracle.com (Albert Noll) Date: Fri, 16 Jan 2015 10:27:07 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54B81398.90409@oracle.com> References: <54B81398.90409@oracle.com> Message-ID: <54B8D96B.4010402@oracle.com> Hi Nils, On 01/15/2015 08:23 PM, Nils Eliasson wrote: > Hi, > > Please review this change. The push of JDK-8027829 caused a nightly > failure. The fix for allowing the signature to follow the method > without a whitespace wasn't complete and broke the option parsing. > > This change includes these fixes: > * Fixing the removal of whitespace in option command parsing > * Adding the test to TEST.groups (~7s test time on a fast machine) That means the test will be run with every jprt push? In my view it is enough to run that test in nightlies. 
Best, Albert > * Added additional test cases for the CompilerCommandFile > * Added testcases for option parsing together with a method pattern > that includes a signature > * Moved some testcases together that didn't require separate VMs > > Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 > Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 > > Best regards, > Nils Eliasson > From nils.eliasson at oracle.com Fri Jan 16 10:15:17 2015 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 16 Jan 2015 11:15:17 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54B8D96B.4010402@oracle.com> References: <54B81398.90409@oracle.com> <54B8D96B.4010402@oracle.com> Message-ID: <54B8E4B5.3080606@oracle.com> Hi Albert, I'll remove it from the groups. Nightlies should be enough. Thanks, Nils On 2015-01-16 10:27, Albert Noll wrote: > Hi Nils, > > On 01/15/2015 08:23 PM, Nils Eliasson wrote: >> Hi, >> >> Please review this change. The push of JDK-8027829 caused a nightly >> failure. The fix for allowing the signature to follow the method >> without a whitespace wasn't complete and broke the option parsing. >> >> This change includes these fixes: >> * Fixing the removal of whitespace in option command parsing >> * Adding the test to TEST.groups (~7s test time on a fast machine) > That means the test will be run with every jprt push? In my view it is > enough to run that test in nightlies. 
> > Best, > Albert >> * Added additional test cases for the CompilerCommandFile >> * Added testcases for option parsing together with a method pattern >> that includes a signature >> * Moved some testcases together that didn't require separate VMs >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 >> Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 >> >> Best regards, >> Nils Eliasson >> > From zoltan.majo at oracle.com Fri Jan 16 12:19:43 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 13:19:43 +0100 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java Message-ID: <54B901DF.9040107@oracle.com> Hi, please review the following small patch. Bug: https://bugs.openjdk.java.net/browse/JDK-8069162 Problem: The test serviceability/dcmd/compiler/CompilerQueueTest.java is currently unstable as it passes only if the method sun.reflect.GeneratedConstructorAccessor1::newInstance is *not* in the compile queue. The test does not control when (and if) that method is compiled, therefore the test can fail non-deterministically. Solution: The problem with the test is addressed by JDK-806912. Quarantine the test until 806912 is fixed. Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ Testing: manual testing, JPRT with the option -onlytests '.*hotspot_serviceability.*' Thank you and best regards, Zoltan From zoltan.majo at oracle.com Fri Jan 16 12:22:20 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 13:22:20 +0100 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java In-Reply-To: <54B901DF.9040107@oracle.com> References: <54B901DF.9040107@oracle.com> Message-ID: <54B9027C.3080506@oracle.com> Hi, small mistake in the mail I've just sent out: On 01/16/2015 01:19 PM, Zolt?n Maj? wrote: > Solution: The problem with the test is addressed by JDK-806912. 
> Quarantine the test until 806912 is fixed. The problem with the test is addressed by *JDK-8069160*. Quarantine the test until *8069160* is fixed. https://bugs.openjdk.java.net/browse/JDK-8069160 I'm sorry for the noise. Thank you and best regards, Zoltan > > Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ > > Testing: manual testing, JPRT with the option -onlytests > '.*hotspot_serviceability.*' > > Thank you and best regards, > > > Zoltan > From zoltan.majo at oracle.com Fri Jan 16 12:29:34 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 13:29:34 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B802F2.9060001@oracle.com> References: <54B50457.9080609@oracle.com> <54B50D43.3020609@oracle.com> <54B54C54.5030400@oracle.com> <54B57604.3020306@oracle.com> <54B7921D.8090701@oracle.com> <54B802F2.9060001@oracle.com> Message-ID: <54B9042E.3020408@oracle.com> Hi Vladimir, On 01/15/2015 07:12 PM, Vladimir Kozlov wrote: > The fix is good to go. Thank you for the review! Best regards, Zoltan > > Thanks, > Vladimir > >> >>> Otherwise looks good. I thought we have more cases. >> >> I hoped, too, that we have more cases. >> >> Thank you and best regards, >> >> >> Zoltan >> >>> Thanks, >>> Vladimir >>> >>> On 1/13/15 8:48 AM, Zolt?n Maj? wrote: >>>> Hi Albert, >>>> >>>> >>>> thank your for the feedback! >>>> >>>> On 01/13/2015 01:19 PM, Albert Noll wrote: >>>>> Hi Zoltan, >>>>> >>>>> What do you think about the following interface to find users? >>>>> >>>>> If we do not need the return value, we add the following function: >>>>> bool find_user(int opcode); >>>> >>>> that is a good idea. I added the method. 
>>>> >>>>> If we look for more than 1 user like in this example: >>>>> >>>>> if (n->find_user(Op_StoreP) != NULL || n->find_user(Op_LoadP) != >>>>> NULL >>>>> 2014 || n->find_user(Op_StoreN) != NULL || >>>>> n->find_user(Op_LoadN)) { >>>>> >>>>> would it make sense to have the following function? >>>>> >>>>> bool find_user(int opc1, int opc2, int opc3, int opc4); >>>> >>>> I added that method as well. >>>> >>>> Here is the updated webrev: >>>> http://cr.openjdk.java.net/~zmajo/8066312/webrev.01/ >>>> >>>> All JPRT tests pass. >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>>>> Best, >>>>> Albert >>>>> >>>>> On 01/13/2015 12:41 PM, Zolt?n Maj? wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> please review the following small patch. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>>>>> >>>>>> Problem: There are some locations in the source code that search for >>>>>> users of a node of a particular type. >>>>>> >>>>>> Solution: To simplify the code, this patch adds a new method, >>>>>> Node::find_user(int opc) that can be used for searching. This patch >>>>>> also updates some comments in the source code. >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>>>>> >>>>>> Testing: JPRT >>>>>> >>>>>> Thank you and best regards, >>>>>> >>>>>> >>>>>> Zoltan >>>>>> >>>>> >>>> >> From zoltan.majo at oracle.com Fri Jan 16 12:30:36 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 13:30:36 +0100 Subject: [9] RFR(S): 8067374: Use %f instead of %g for LogCompilation output In-Reply-To: <54B56217.2010609@oracle.com> References: <54B51690.4080004@oracle.com> <54B56217.2010609@oracle.com> Message-ID: <54B9046C.2000501@oracle.com> Thank you, Vladimir! Best regards, Zoltan On 01/13/2015 07:21 PM, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 1/13/15 4:58 AM, Zolt?n Maj? wrote: >> Hi, >> >> >> please review the following small patch. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8067374 >> >> Problem: At some places, the format specifier '%g' is used. As a result, >> output can be harder to parse than with other formats. >> >> Solution: Replace '%g' with '%f'. >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8067374/webrev.00/ >> >> Testing: Manual testing, JPRT >> >> Thank you and best regards, >> >> >> Zoltan >> From zoltan.majo at oracle.com Fri Jan 16 13:05:35 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 14:05:35 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B56D6D.3040707@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> Message-ID: <54B90C9F.4000202@oracle.com> Hi Vladimir, thank you for the feedback! On 01/13/2015 08:09 PM, Vladimir Kozlov wrote: > Thank you, Zoltan, for performance testing! > > templateInterpreter_sparc.cpp - why did you switch from G3 to G1 in > 325,344 lines? The reason is that the register Rcounter, which is used to store the address of MethodCounters, is an alias of G3. In the old code, we do not need MethodCounters after incrementing the invocation counter (lines 317--326), so that is why it is OK that the instruction 331: __ load_contents(profile_limit, G3_scratch); overwrites the contents of G3. In the new version, however, we do need MethodCounters afterwards (e.g., lines 339--344), so that is why I thought we should use G1 instead of G3. > > templateTable_x86_64.cpp - indentation: > > __ movptr(rcx, Address(rcx, Method::method_counters_offset())); > + const Address mask(rcx, > in_bytes(MethodCounters::backedge_mask_offset())); Thanks for noticing, I've corrected the indentation. > Can you move new code in method.cpp into MethodCounters() constructor? > I don't see why it should be in method.cpp. I did that.
> > advancedThresholdPolicy.hpp - some renaming left. Updated. Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ Thank you and best regards, Zoltan > > > Thanks, > Vladimir > > On 1/13/15 4:31 AM, Zoltán Majó wrote: >> Hi Vladimir, >> >> >> thank you for the feedback! Please see comments below. >> >> On 01/05/2015 08:13 PM, Vladimir Kozlov wrote: >>> On 1/5/15 10:38 AM, Zoltán Majó wrote: >>>> Hi, >>>> >>>> >>>> please review the following patch. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8059606 >>>> >>>> Problem: Controlling compilation thresholds on a per-method level can >>>> be useful for debugging and understanding >>>> failures, but currently there is no way to control on a per-method >>>> level when methods are compiled. >>>> >>>> >>>> Solution: >>>> >>>> This patch adds support for scaling compilation thresholds on a >>>> per-method level using the CompileThresholdScaling flag. >>>> For example, the option >>>> >>>> -XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5 >>>> >>>> >>>> >>>> reduces compilation thresholds for method SomeClass.someMethod() by >>>> 50% (but leaves global thresholds unaffected) and >>>> results in earlier compilation of the method. >>>> >>>> Similar to the global CompileThresholdScaling flag (added in >>>> JDK-805604), the per-method CompileThresholdScaling flag >>>> works with both tiered and non-tiered modes of operation. >>>> >>>> Per-method compilation thresholds are available only in non-product >>>> builds to avoid the overhead of accessing fields >>>> added by the patch to MethodData and MethodCounters. >>> >>> Too many ifdefs :) >> >> I made per-method compilation thresholds available in product builds as >> well. That helps reduce the number of ifdefs :-). >> >>> The interpreter speed is not important. And the feature could be >>> interesting in product VM too. >>> The only drawback is 2 additional fields in MDO which is fine.
>>> Can you make it product and run through our performance infrastructure. >> >> Performance data show that per-method compilation thresholds do not >> result in a statistically significant change of performance. One >> benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I >> think that is negligible. >> >>> Also, as John Rose will say, we should have as much as possible a >>> similar code in product as in tested debug code. Otherwise we are not >>> testing product bits and will get into troubles. >>> >>>> >>>> The proposed patch supports x86_64, x86_32, and sparc. Do you think >>>> it is necessary to support other architectures as well? >>> >>> Yes. It should be supported on all platforms. >> >> The current patch supports all architectures except PPC64. >> >>> >>>> The patch updates the name of the flags Tier2BackEdgeThreshold, >>>> Tier3BackEdgeThreshold, Tier4BackEdgeThreshold >>>> (lowercase e in "Back*e*dge) so that the naming is consistent with >>>> other backedge-related flags >>>> (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and >>>> Tier3BackedgeNotifyFreqLog). >>> >>> It added noise to main changes and may cause some testing (jfr?) >>> failures. Can we do it separately (other RFE?). >> >> I created issue 8068506 for that. >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~zmajo/8059606/webrev.01/ >> >> Testing: manual testing, JPRT >> >>>> This patch is the third (and final) part of JDK-8050853: >>>> https://bugs.openjdk.java.net/browse/JDK-8050853 . >>>> >>>> >>>> Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/ >>> >>> In general looks good. >> >> Thank you and best regards, >> >> >> Zoltan >> >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Testing: manual testing on all supported architectures, JPRT. 
>>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >> From albert.noll at oracle.com Fri Jan 16 14:23:08 2015 From: albert.noll at oracle.com (Albert Noll) Date: Fri, 16 Jan 2015 15:23:08 +0100 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java In-Reply-To: <54B9027C.3080506@oracle.com> References: <54B901DF.9040107@oracle.com> <54B9027C.3080506@oracle.com> Message-ID: <54B91ECC.8010100@oracle.com> Hi Zoltan, I agree to quarantine the test. Shouldn't quarantining the test look like this? @ignore 8069160 Best, Albert On 01/16/2015 01:22 PM, Zolt?n Maj? wrote: > Hi, > > > small mistake in the mail I've just sent out: > > On 01/16/2015 01:19 PM, Zolt?n Maj? wrote: >> Solution: The problem with the test is addressed by JDK-806912. >> Quarantine the test until 806912 is fixed. > > The problem with the test is addressed by *JDK-8069160*. Quarantine > the test until *8069160* is fixed. > > https://bugs.openjdk.java.net/browse/JDK-8069160 > > I'm sorry for the noise. > > Thank you and best regards, > > > Zoltan > > >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ >> >> Testing: manual testing, JPRT with the option -onlytests >> '.*hotspot_serviceability.*' >> >> Thank you and best regards, >> >> >> Zoltan >> > From axel.siebenborn at sap.com Fri Jan 16 14:31:28 2015 From: axel.siebenborn at sap.com (Siebenborn, Axel) Date: Fri, 16 Jan 2015 14:31:28 +0000 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat In-Reply-To: <54B8026A.8030705@oracle.com> References: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> <54B6B98A.5070302@oracle.com> <54B8026A.8030705@oracle.com> Message-ID: <02D5D45C1F8DB848A7AE20E80EE61A5C398DAA03@DEWDFEMB20C.global.corp.sap> Hi Vladimir, thanks for your review. 
Please, find the updated webrev here: http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909.v2/ Regards, Axel On 1/15/2015 7:09 PM Vladimir Kozlov wrote: > Yes, the fix looks reasonable. We had other similar cases which were fixed similar way. > > Add a comment why we have NULL control there. > > And you need to add the attached to bug report test to compiler regression tests. We will verify it with 8030976 disabled. > > Also it is too late for 8u40 so will backport it to 8u60. > > Thanks, > Vladimir > > On 1/15/15 12:24 AM, Volker Simonis wrote: >> Hi Vladimir, >> >> please find Axels webrev here: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909/ >> >> Regards, >> Volker >> >> PS: unfortunately Axel can't access his OpenJDK webrev-space any more >> since it was moved sometimes last summer but we will try once again to >> reactivate it now. >> >> On Wed, Jan 14, 2015 at 7:46 PM, Vladimir Kozlov >> wrote: >>> Hi Axel, >>> >>> Thank you for looking on this issue. >>> Before we start reviewing it, please, publish webrev on cr.openjdk.java.net. >>> It is requirement for accepting changes. Your colleagues can help you. >>> >>> Regards, >>> Vladimir >>> >>> >>> On 1/14/15 2:53 AM, Siebenborn, Axel wrote: >>>> >>>> Hi, >>>> >>>> I investigated a crash with jdk8_u25 and opened the following bug: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8068909 >>>> >>>> I would suggest the following fix: >>>> >>>> http://www.sapjvm.com/as/webrevs/8068909/ >>>> >>>> If the control of the inserted load, to NULL. In this case, its >>>> corresponding nullcheck will be found as required edge >>>> to the CastPP during MemNode::Ideal_common_DU_postCCP. 
>>>> >>>> Thanks, >>>> >>>> Axel From zoltan.majo at oracle.com Fri Jan 16 14:34:09 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 16 Jan 2015 15:34:09 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54B81398.90409@oracle.com> References: <54B81398.90409@oracle.com> Message-ID: <54B92161.3090109@oracle.com> Hi Nils, in general, your changes look good to me (I'm not a Reviewer). Please see my comments below. ** On 01/15/2015 08:23 PM, Nils Eliasson wrote: > Hi, > > Please review this change. The push of JDK-8027829 caused a nightly > failure. The fix for allowing the signature to follow the method > without a whitespace wasn't complete and broke the option parsing. > > This change includes these fixes: > * Fixing the removal of whitespace in option command parsing > * Adding the test to TEST.groups (~7s test time on a fast machine) I agree with Albert that it might be better to execute the test/compiler/oracle/CheckCompileCommandOption.java only in nightlies. Especially that the test might take much longer on the slower machines that we have. > * Added additional test cases for the CompilerCommandFile That is a good idea, but I think you've forgotten to include those command files into the webrev. 
As a result, when I run the test with jtreg on my own machine, it fails with the following error message: STATUS:Failed.`main' threw exception: java.lang.RuntimeException: 'CompileCommand: option com/oracle/Test.test bool MyBoolOption1 = true' missing from stdout/stder Thank you and best regards, Zoltan > * Added testcases for option parsing together with a method pattern > that includes a signature > * Moved some testcases together that didn't require separate VMs > > Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 > Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 > > Best regards, > Nils Eliasson > From vladimir.x.ivanov at oracle.com Fri Jan 16 17:16:22 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 16 Jan 2015 20:16:22 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared Message-ID: <54B94766.2080102@oracle.com> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ https://bugs.openjdk.java.net/browse/JDK-8063137 After GuardWithTest (GWT) LambdaForms became shared, profile pollution significantly distorted compilation decisions. It affected inlining and hindered some optimizations. It causes significant performance regressions for Nashorn (on Octane benchmarks). Inlining was fixed by 8059877 [1], but it didn't cover the case when a branch is never taken. It can cause missed optimization opportunity, and not just increase in code size. For example, non-pruned branch can break escape analysis. Currently, there are 2 problems: - branch frequencies profile pollution - deoptimization counts pollution Branch frequency pollution hides from JIT the fact that a branch is never taken. Since GWT LambdaForms (and hence their bytecode) are heavily shared, but the behavior is specific to MethodHandle, there's no way for JIT to understand how particular GWT instance behaves. 
The solution I propose is to do the profiling in Java code and feed it to the JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where the profiling info is stored. Once the JIT kicks in, it can retrieve these counts if the corresponding MethodHandle is a compile-time constant (which is usually the case). To communicate the profile data from Java code to the JIT, MethodHandleImpl::profileBranch() is used. If the GWT MethodHandle isn't a compile-time constant, profiling should proceed. This happens when the corresponding LambdaForm is already shared: for newly created GWT MethodHandles, profiling can occur only in native code (a dedicated nmethod for a single LambdaForm). So, when compilation of the whole MethodHandle chain is triggered, the profile should already be gathered. Overriding branch frequencies is not enough. Statistics on deoptimization events are also polluted. Even if a branch is never taken, the JIT emits an uncommon trap there only if the corresponding bytecode doesn't trap too much and doesn't cause too many recompiles. I added @IgnoreProfile and placed it only on GWT LambdaForms. When the JIT sees it on some method, Compile::too_many_traps & Compile::too_many_recompiles for that method always return false. This allows the JIT to prune the branch based on the custom profile and to recompile the method if the branch is visited. For now, I wanted to keep the fix very focused. The next thing I plan to do is to experiment with ignoring deoptimization counts for other LambdaForms which are heavily shared. I have already seen problems caused by deoptimization counts pollution (see JDK-8068915 [2]). I plan to backport the fix into 8u40 once I finish extensive performance testing. Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). Thanks! PS: as a summary, my experiments show that the fixes for 8063137 & 8068915 [2] almost completely recover peak performance after LambdaForm sharing [3].
There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8059877 8059877: GWT branch frequencies pollution due to LF sharing [2] https://bugs.openjdk.java.net/browse/JDK-8068915 [3] https://bugs.openjdk.java.net/browse/JDK-8046703 JEP 210: LambdaForm Reduction and Caching From vladimir.kozlov at oracle.com Fri Jan 16 18:34:36 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 16 Jan 2015 10:34:36 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B90C9F.4000202@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> Message-ID: <54B959BC.3050004@oracle.com> Looks good. You need second review for this - it is not small :) Thanks, Vladimir On 1/16/15 5:05 AM, Zolt?n Maj? wrote: > Hi Vladimir, > > > thank you for the feedback! > > On 01/13/2015 08:09 PM, Vladimir Kozlov wrote: >> Thank you, Zoltan, for performance testing! >> >> templateInterpreter_sparc.cpp - why you switched from G3 to G1 in 325,344 lines? > > The reason is that the register Rcounter, which is used to store the address of MethodCounters, is an alias of G3. > > In the old code, we do not need MethodCounters after incrementing the invocation counter (lines 317--326), so that is > why it is OK that the instruction > > 331: __ load_contents(profile_limit, G3_scratch); > > overwrites the contents of G3. > > In the new version, however, we do need MethodCounters afterwards (e.g., lines 339--344), so that is why I thought we > should use G1 instead of G3. 
> >> >> templateTable_x86_64.cpp - indention: >> >> __ movptr(rcx, Address(rcx, Method::method_counters_offset())); >> + const Address mask(rcx, in_bytes(MethodCounters::backedge_mask_offset())); > > Thanks for noticing, I've corrected the indentation. > >> Can you move new code in method.cpp into MethodCounters() constructor? I don't see why it should be in method.cpp. > > I did that. > >> >> advancedThresholdPolicy.hpp - some renaming left. > > Updated. > > Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ > > Thank you and best regards, > > > Zoltan > >> >> >> Thanks, >> Vladimir >> >> On 1/13/15 4:31 AM, Zolt?n Maj? wrote: >>> Hi Vladimir, >>> >>> >>> thank you for the feedback! Please see comments below. >>> >>> On 01/05/2015 08:13 PM, Vladimir Kozlov wrote: >>>> On 1/5/15 10:38 AM, Zolt?n Maj? wrote: >>>>> Hi, >>>>> >>>>> >>>>> please review the following patch. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8059606 >>>>> >>>>> Problem: Controlling compilation thresholds on a per-method level can >>>>> be useful for debugging and understanding >>>>> failures, but currently there is no way to control on a per-method >>>>> level when methods are compiled. >>>>> >>>>> >>>>> Solution: >>>>> >>>>> This patch adds support for scaling compilation thresholds on a >>>>> per-method level using the CompileThresholdScaling flag. >>>>> For example, the option >>>>> >>>>> -XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5 >>>>> >>>>> >>>>> reduces compilation thresholds for method SomeClass.sometMethod() by >>>>> 50% (but leaves global thresholds unaffected) and >>>>> results in earlier compilation of the method. >>>>> >>>>> Similar to the global CompileThresholdScaling flag (added in >>>>> JDK-805604), the per-method CompileThresholdScaling flag >>>>> works with both tiered and non-tiered modes of operation. 
>>>>> >>>>> Per-method compilation thresholds are available only in non-product >>>>> builds to avoid the overhead of accessing fields >>>>> added by the patch MethodData and MethodCounters. >>>> >>>> Too many ifdefs :) >>> >>> I made per-method compilation thresholds available in product builds as >>> well. That helps reducing the number of ifdefs :-). >>> >>>> The interpreter speed is not important. And the feature could be >>>> interesting in product VM too. >>>> The only drawback is 2 additional fields in MDO which is fine. >>>> Can you make it product and run through our performance infrastructure. >>> >>> Performance data show that per-method compilation thresholds do not >>> result in a statistically significant change of performance. One >>> benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I >>> think that is negligible. >>> >>>> Also, as John Rose will say, we should have as much as possible a >>>> similar code in product as in tested debug code. Otherwise we are not >>>> testing product bits and will get into troubles. >>>> >>>>> >>>>> The proposed patch supports x86_64, x86_32, and sparc. Do you think >>>>> it is necessary to support other architectures as well? >>>> >>>> Yes. It should be supported on all platforms. >>> >>> The current patch supports all architectures except PPC64. >>> >>>> >>>>> The patch updates the name of the flags Tier2BackEdgeThreshold, >>>>> Tier3BackEdgeThreshold, Tier4BackEdgeThreshold >>>>> (lowercase e in "Back*e*dge) so that the naming is consistent with >>>>> other backedge-related flags >>>>> (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and >>>>> Tier3BackedgeNotifyFreqLog). >>>> >>>> It added noise to main changes and may cause some testing (jfr?) >>>> failures. Can we do it separately (other RFE?). >>> >>> I created issue 8068506 for that. 
>>> >>> Here is the new webrev: >>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.01/ >>> >>> Testing: manual testing, JPRT >>> >>>>> This patch is the third (and final) part of JDK-8050853: >>>>> https://bugs.openjdk.java.net/browse/JDK-8050853 . >>>>> >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/ >>>> >>>> In general looks good. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> >>>>> Testing: manual testing on all supported architectures, JPRT. >>>>> >>>>> Thank you and best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>> > From vladimir.x.ivanov at oracle.com Fri Jan 16 18:34:39 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 16 Jan 2015 21:34:39 +0300 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations Message-ID: <54B959BF.1030709@oracle.com> http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8068915 The fix for 8063137 [1] (just sent out for review) uncovered another issue with deoptimization counts pollution. In some cases, speculative traps can get stuck continuously deoptimizing and never reach recompilation. When that happens, it usually ruins performance. When a speculative guard is considered, Compile::too_many_traps() is consulted to decide whether to add it or not. However, the uncommon trap action can be changed to Action_none in GraphKit::uncommon_trap if Compile::too_many_recompiles() fires. The fix is to (1) forbid changing the uncommon trap action under the hood, and (2) consult Compile::too_many_recompiles() when adding a speculative guard. The fix is based on 8063137 [2] (uncommon_trap_exact is used). Testing: JPRT, java/lang/invoke tests, nashorn. Thanks!
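The two-part fix described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code in the webrev: `DeoptStats`, its fields and limits, the two-value `TrapAction` enum, and the `*_old` / `should_add_speculative_guard` helpers are all hypothetical stand-ins for the state that Compile::too_many_traps() and Compile::too_many_recompiles() consult in HotSpot.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for per-method deoptimization statistics.
struct DeoptStats {
  int trap_count;
  int recompile_count;
  int trap_limit;
  int recompile_limit;

  bool too_many_traps() const { return trap_count >= trap_limit; }
  bool too_many_recompiles() const { return recompile_count >= recompile_limit; }
};

// Toy action set; the real enumeration has more values.
enum TrapAction { Action_none, Action_maybe_recompile };

// Old behaviour as described in the mail: only the trap count decides whether
// the speculative guard is added ...
inline bool should_add_guard_old(const DeoptStats& s) {
  return !s.too_many_traps();
}

// ... while the action is silently downgraded when recompiles are exhausted,
// so a failing guard deoptimizes again and again without a recompilation.
inline TrapAction emitted_action_old(const DeoptStats& s) {
  return s.too_many_recompiles() ? Action_none : Action_maybe_recompile;
}

// Fixed behaviour: both counters are consulted up front, and the action that
// was requested is the action that is emitted.
inline bool should_add_speculative_guard(const DeoptStats& s) {
  return !s.too_many_traps() && !s.too_many_recompiles();
}

inline TrapAction emitted_action(const DeoptStats&) {
  return Action_maybe_recompile;
}
```

With few traps but an exhausted recompile budget, the old helpers add the guard and pair it with Action_none, which is the "stuck deoptimizing" state; the new predicate declines to add the guard in that situation.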
Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8063137 [2] http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ From vladimir.kozlov at oracle.com Fri Jan 16 19:33:02 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 16 Jan 2015 11:33:02 -0800 Subject: RFR(XS) 8068909: SIGSEGV in c2 compiled code with OptimizeStringConcat In-Reply-To: <02D5D45C1F8DB848A7AE20E80EE61A5C398DAA03@DEWDFEMB20C.global.corp.sap> References: <02D5D45C1F8DB848A7AE20E80EE61A5C398D9D8A@DEWDFEMB20C.global.corp.sap> <54B6B98A.5070302@oracle.com> <54B8026A.8030705@oracle.com> <02D5D45C1F8DB848A7AE20E80EE61A5C398DAA03@DEWDFEMB20C.global.corp.sap> Message-ID: <54B9676E.5030905@oracle.com> Looks good. I will sponsor it. Thanks, Vladimir On 1/16/15 6:31 AM, Siebenborn, Axel wrote: > Hi Vladimir, > thanks for your review. > > Please, find the updated webrev here: > > http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909.v2/ > > Regards, > Axel > > > On 1/15/2015 7:09 PM Vladimir Kozlov wrote: >> Yes, the fix looks reasonable. We had other similar cases which were fixed similar way. >> >> Add a comment why we have NULL control there. >> >> And you need to add the attached to bug report test to compiler regression tests. We will verify it with 8030976 disabled. >> >> Also it is too late for 8u40 so will backport it to 8u60. >> >> Thanks, >> Vladimir >> >> On 1/15/15 12:24 AM, Volker Simonis wrote: >>> Hi Vladimir, >>> >>> please find Axels webrev here: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8068909/ >>> >>> Regards, >>> Volker >>> >>> PS: unfortunately Axel can't access his OpenJDK webrev-space any more >>> since it was moved sometimes last summer but we will try once again to >>> reactivate it now. >>> >>> On Wed, Jan 14, 2015 at 7:46 PM, Vladimir Kozlov >>> wrote: >>>> Hi Axel, >>>> >>>> Thank you for looking on this issue. >>>> Before we start reviewing it, please, publish webrev on cr.openjdk.java.net. 
>>>> It is requirement for accepting changes. Your colleagues can help you. >>>> >>>> Regards, >>>> Vladimir >>>> >>>> >>>> On 1/14/15 2:53 AM, Siebenborn, Axel wrote: >>>>> >>>>> Hi, >>>>> >>>>> I investigated a crash with jdk8_u25 and opened the following bug: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8068909 >>>>> >>>>> I would suggest the following fix: >>>>> >>>>> http://www.sapjvm.com/as/webrevs/8068909/ >>>>> >>>>> If the control of the inserted load, to NULL. In this case, its >>>>> corresponding nullcheck will be found as required edge >>>>> to the CastPP during MemNode::Ideal_common_DU_postCCP. >>>>> >>>>> Thanks, >>>>> >>>>> Axel From john.r.rose at oracle.com Fri Jan 16 19:59:34 2015 From: john.r.rose at oracle.com (John Rose) Date: Fri, 16 Jan 2015 11:59:34 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B90C9F.4000202@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> Message-ID: <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> On Jan 16, 2015, at 5:05 AM, Zolt?n Maj? wrote: > > Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ Reviewed, with some comments. Overall, I like the cleanups along the way. The basic idea of replacing a hard-coded 'mask' with an addressable variable is sound and nicely executed. I suppose that idea by itself is "S" small, but this really is a "M" or even "L" change, as Vladimir says, especially since the enhanced logic is spread all around many files. How have you regression tested this? Have you verified that the compilation sequence doesn't change for a non-trivial use case? A slip in the assembly code (for example) might cause a comparison against a garbage mask or limit that could cause compilation decisions to go crazy quietly. 
I didn't spot any such bug, but they are hard to spot and sometimes quiet. In the sparc assembly code (more than one file), the live range of Rcounters has increased, since it is used to supply limits as well as to update the counter (which happens early in the code). To make it easier to maintain the code, I suggest renaming Rcounters to G3_method_counters. (As you can see, when a register has a logical name but has a complicated live range, we add the hardware name to the logical name, to make it easier to spot interfering uses when manually editing code.) If scale==0.0 is a valid input checked specially in compileBroker, perhaps the effect of a zero should be documented? Suggest adding to globals.hpp: "; values greater than one delay counter overflow; zero forces counter overflow immediately" "; can be set as a per-method option." Question: What if a global scale and a method option for scale are both set? Is the global one ignored? Do they multiply? It's worth specifying it explicitly (and checking that the logic DTRT). Question: How are the log values (like Tier0InvokeNotifyFreqLog or the result of get_scaled_freq_log) constrained to fit in a reasonable range? (I would suppose that range is 0..31 or less.) Should we have range clipping code in the scaler functions? It would give notably simpler code, in MethodData::init, to use a branch-free setup of tier_0_invoke_notify_freq_log etc. Set scale = 1.0 and then update it conditionally. Special-case scale=1.0 in get_scaled_freq_log to taste. Same comment about less-branchy code for methodCounters.hpp. (It's better to have a one-way branch that sets up 'scale' than a two-way branch with duplicate setups of one or more variables.) In MethodCounters, I think the conditional scaling of _interpreter_backward_branch_limit is going to confuse someone, at some point. It should be scaled, unconditionally, like its sibling variables. (That would remove another somewhat verbose initialization branch!)
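The suggestions above (range clipping in the scaler, a special case for scale 1.0, and a one-way branch that only sets up 'scale') might look roughly like the following. It is a minimal sketch, not the code from webrev.02: the names `scaled_freq_log`, `MethodLimits`, `init_limits` and the bound `kMaxFreqLog = 30` are assumptions made for illustration, as is the choice to scale a log2 value by adding log2(scale).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed upper bound for a frequency log; the thread only guesses "0..31 or less".
static const int kMaxFreqLog = 30;

// Clips a frequency log into [0, kMaxFreqLog].
inline int clip_freq_log(int v) {
  return std::min(std::max(v, 0), kMaxFreqLog);
}

// Scales a log2 notification frequency by `scale` and clips the result.
// scale == 1.0 is special-cased; scale <= 0.0 maps to 0 (notify on every event).
inline int scaled_freq_log(int freq_log, double scale) {
  if (scale == 1.0) return clip_freq_log(freq_log);
  if (scale <= 0.0) return 0;
  // 2^freq_log * scale == 2^(freq_log + log2(scale))
  const double scaled = freq_log + std::log2(scale);
  return clip_freq_log(static_cast<int>(std::lround(scaled)));
}

// Convenience overload kept inline, so the default scale is visible in the header.
inline int scaled_freq_log(int freq_log) {
  return scaled_freq_log(freq_log, 1.0);
}

// Hypothetical stand-in for the per-method fields set up during initialization.
struct MethodLimits {
  int invoke_notify_freq_log;
  int backedge_notify_freq_log;
};

// One-way branch: start from the neutral scale and override it only when a
// per-method option is present; every field is then scaled unconditionally.
inline MethodLimits init_limits(int invoke_log, int backedge_log,
                                bool has_method_scale, double method_scale) {
  double scale = 1.0;
  if (has_method_scale) scale = method_scale;
  MethodLimits limits = { scaled_freq_log(invoke_log, scale),
                          scaled_freq_log(backedge_log, scale) };
  return limits;
}
```

Because the fields are always computed through the scaler, a method without a per-method option goes through exactly the same code path with scale 1.0, and no field needs its own initialization branch.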
Small style nit: The noise-word "get_" is discouraged in the style doc: > • Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or "has_". > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide Arguments.cpp follows this rule partially (in older code maybe?). It would be better to decrease counter-examples to the rule rather than increase them. Bigger style nit: Since the functions are not getting a preset value (from the arguments) but rather normalizing a provided argument value, I suggest naming them "scale_compile_threshold" (i.e., a verb phrase instead of a noun phrase). Again from the style doc: > • Other method names are verb phrases, as if commands to the receiver. Since you are providing overloads of the scaling functions, the header file should either contain inline code for the convenience methods, or else document how the optional argument ('scale') defaults. I'd prefer inline code, since it is simple. It's as much text to document with a comment as just to supply the inline overload. As I said before, nice work! -- John From vladimir.kozlov at oracle.com Fri Jan 16 20:34:50 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 16 Jan 2015 12:34:50 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54B94766.2080102@oracle.com> References: <54B94766.2080102@oracle.com> Message-ID: <54B975EA.6040005@oracle.com> Nice! At least the HotSpot part, since I don't understand the JDK part :) I would suggest adding a more detailed comment (instead of the simple "Stop profiling") to the inline_profileBranch() intrinsic, explaining what it is doing, because it is not strictly an "intrinsic": it does not implement the profileBranch() Java code when counts is constant. You forgot to mark Opaque4Node as a macro node. I would suggest basing it on Opaque2Node; then you will get some methods from it.
Thanks, Vladimir On 1/16/15 9:16 AM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ > https://bugs.openjdk.java.net/browse/JDK-8063137 > > After GuardWithTest (GWT) LambdaForms became shared, profile pollution > significantly distorted compilation decisions. It affected inlining and > hindered some optimizations. It causes significant performance > regressions for Nashorn (on Octane benchmarks). > > Inlining was fixed by 8059877 [1], but it didn't cover the case when a > branch is never taken. It can cause missed optimization opportunity, and > not just increase in code size. For example, non-pruned branch can break > escape analysis. > > Currently, there are 2 problems: > - branch frequencies profile pollution > - deoptimization counts pollution > > Branch frequency pollution hides from JIT the fact that a branch is > never taken. Since GWT LambdaForms (and hence their bytecode) are > heavily shared, but the behavior is specific to MethodHandle, there's no > way for JIT to understand how particular GWT instance behaves. > > The solution I propose is to do profiling in Java code and feed it to > JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where > profiling info is stored. Once JIT kicks in, it can retrieve these > counts, if corresponding MethodHandle is a compile-time constant (and it > is usually the case). To communicate the profile data from Java code to > JIT, MethodHandleImpl::profileBranch() is used. > > If GWT MethodHandle isn't a compile-time constant, profiling should > proceed. It happens when corresponding LambdaForm is already shared, for > newly created GWT MethodHandles profiling can occur only in native code > (dedicated nmethod for a single LambdaForm). So, when compilation of the > whole MethodHandle chain is triggered, the profile should be already > gathered. > > Overriding branch frequencies is not enough. 
Statistics on > deoptimization events is also polluted. Even if a branch is never taken, > JIT doesn't issue an uncommon trap there unless corresponding bytecode > doesn't trap too much and doesn't cause too many recompiles. > > I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT > sees it on some method, Compile::too_many_traps & > Compile::too_many_recompiles for that method always return false. It > allows JIT to prune the branch based on custom profile and recompile the > method, if the branch is visited. > > For now, I wanted to keep the fix very focused. The next thing I plan to > do is to experiment with ignoring deoptimization counts for other > LambdaForms which are heavily shared. I already saw problems caused by > deoptimization counts pollution (see JDK-8068915 [2]). > > I plan to backport the fix into 8u40, once I finish extensive > performance testing. > > Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane). > > Thanks! > > PS: as a summary, my experiments show that fixes for 8063137 & 8068915 > [2] almost completely recovers peak performance after LambdaForm sharing > [3]. There's one more problem left (non-inlined MethodHandle invocations > are more expensive when LFs are shared), but it's a story for another day. 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8059877 > 8059877: GWT branch frequencies pollution due to LF sharing > [2] https://bugs.openjdk.java.net/browse/JDK-8068915 > [3] https://bugs.openjdk.java.net/browse/JDK-8046703 > JEP 210: LambdaForm Reduction and Caching From zoltan.majo at oracle.com Fri Jan 16 20:34:16 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 16 Jan 2015 21:34:16 +0100 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B80375.8050804@oracle.com> References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> <54B79D53.5090408@oracle.com> <54B80375.8050804@oracle.com> Message-ID: <54B975C8.6070306@oracle.com> Hi, On 01/15/2015 07:14 PM, Vladimir Kozlov wrote: > New webrev is good. could you please take another look at this issue? The previous webrev, webrev.02, broke the product build because the methods has_out(int opcode) and has_out(DUIterator i) are ambiguous to GCC in the product build. (When working on webrev.02, I considered changing a method's name to be a minor change, so I tested only locally with fastdebug.) Now I renamed has_out(int opcode) -> has_out_with(int opcode) find_out(int opcode) -> find_out_with(int opcode) I renamed both methods so that the naming stays consistent among them. JPRT tests pass on all platforms. Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.03/ Sorry for the extra work. Thank you and best regards, Zoltan > > Thanks, > Vladimir > > On 1/15/15 2:58 AM, Zoltán Majó wrote: >> Hi John, >> >> >> thank you for the review! >> >> On 01/14/2015 01:12 AM, John Rose wrote: >>> Good. >>> >>> One suggestion: Call it "find_out", not "find_user".
The term "out" >>> is more in use for Node than "user"; cf. Node::unique_out, raw_out. >> >> I changed the names of the methods, as you suggested. Here is the >> updated webrev: >> >> http://cr.openjdk.java.net/~zmajo/8066312/webrev.02/ >> >> Thank you and best regards, >> >> >> Zoltan >> >>> >>> ? John >>> >>> On Jan 13, 2015, at 3:41 AM, Zolt?n Maj? >>> wrote: >>>> Hi, >>>> >>>> >>>> please review the following small patch. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>>> >>>> Problem: There are some locations in the source code that search for >>>> users of a node of a particular type. >>>> >>>> Solution: To simplify the code, this patch adds a new method, >>>> Node::find_user(int opc) that can be used for searching. This patch >>>> also updates some comments in the source code. >>>> >>>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>>> >>>> Testing: JPRT >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >> From vladimir.kozlov at oracle.com Fri Jan 16 20:49:19 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 16 Jan 2015 12:49:19 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B975C8.6070306@oracle.com> References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> <54B79D53.5090408@oracle.com> <54B80375.8050804@oracle.com> <54B975C8.6070306@oracle.com> Message-ID: <54B9794F.3060007@oracle.com> Good. Thanks, Vladimir On 1/16/15 12:34 PM, Zolt?n Maj? wrote: > Hi, > > > On 01/15/2015 07:14 PM, Vladimir Kozlov wrote: >> New web is good. > > could you please take another look at this issue? The previous webrev, > webrev.02 broke the product build because the methods > > has_out(int upcode) and has_out(DUIterator i) > > are ambiguous to GCC in the product build. (When working on webrev.02, I > considered that changing a method's name is a minor change, so I tested > only locally with fastdebug.) 
> > Now I renamed > > has_out(int opcode) -> has_out_with(int opcode) > find_out(int opcode) -> find_out_with(int opcode) > > I renamed both methods so that the naming stays consistent among them. > > JPRT tests pass on all platforms. > > Here is the new webrev: > http://cr.openjdk.java.net/~zmajo/8066312/webrev.03/ > > Sorry for the extra work. > > Thank you and best regards, > > > Zoltan > >> >> Thanks, >> Vladimir >> >> On 1/15/15 2:58 AM, Zolt?n Maj? wrote: >>> Hi John, >>> >>> >>> thank you for the review! >>> >>> On 01/14/2015 01:12 AM, John Rose wrote: >>>> Good. >>>> >>>> One suggestion: Call it "find_out", not "find_user". The term "out" >>>> is more in use for Node than "user"; cf. Node::unique_out, raw_out. >>> >>> I changed the names of the methods, as you suggested. Here is the >>> updated webrev: >>> >>> http://cr.openjdk.java.net/~zmajo/8066312/webrev.02/ >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>>> >>>> ? John >>>> >>>> On Jan 13, 2015, at 3:41 AM, Zolt?n Maj? >>>> wrote: >>>>> Hi, >>>>> >>>>> >>>>> please review the following small patch. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066312 >>>>> >>>>> Problem: There are some locations in the source code that search for >>>>> users of a node of a particular type. >>>>> >>>>> Solution: To simplify the code, this patch adds a new method, >>>>> Node::find_user(int opc) that can be used for searching. This patch >>>>> also updates some comments in the source code. 
>>>>> >>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8066312/webrev.00/ >>>>> >>>>> Testing: JPRT >>>>> >>>>> Thank you and best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>> > From john.r.rose at oracle.com Fri Jan 16 21:56:00 2015 From: john.r.rose at oracle.com (John Rose) Date: Fri, 16 Jan 2015 13:56:00 -0800 Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method In-Reply-To: <54B975C8.6070306@oracle.com> References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> <54B79D53.5090408@oracle.com> <54B80375.8050804@oracle.com> <54B975C8.6070306@oracle.com> Message-ID: <2786C141-F66A-4198-9FA9-7FC08465F3EA@oracle.com> On Jan 16, 2015, at 12:34 PM, Zolt?n Maj? wrote: > > has_out(int opcode) -> has_out_with(int opcode) > find_out(int opcode) -> find_out_with(int opcode) > > I renamed both methods so that the naming stays consistent among them. Fine with me too. Thanks for doing the rename. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Jan 16 23:13:48 2015 From: john.r.rose at oracle.com (John Rose) Date: Fri, 16 Jan 2015 15:13:48 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54B94766.2080102@oracle.com> References: <54B94766.2080102@oracle.com> Message-ID: <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov wrote: > > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ > https://bugs.openjdk.java.net/browse/JDK-8063137 > ... > PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recovers peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day. This performance bump is excellent news. 
LFs are supposed to express emergently common behaviors, like hidden classes. We are much closer to that goal now.

I'm glad to see that the library-assisted profiling turns out to be relatively clean.

In effect this restores the pre-LF CountingMethodHandle logic from 2011, which was so beneficial in JDK 7:
  http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java

I have some suggestions to make this version a little cleaner; see below.

Starting with the JDK changes:

In LambdaForm.java, I'm feeling flag pressure from all the little boolean fields and constructor parameters.

(Is it time to put in a bit-encoded field "private byte LambdaForm.flags", or do we wait for another boolean to come along? But see next questions, which are more important.)

What happens when a GWT LF gets inlined into a larger LF? Then there might be two or more selectAlternative calls. Will this confuse anything or will it Just Work? The combined LF will get profiled as usual, and the selectAlternative calls will also collect profile (or not?).

This leads to another question: Why have a boolean 'isGWT' at all? Why not just check for one or more occurrences of selectAlternative, and declare that those guys override (some of) the profiling? Something like:

-+ if (PROFILE_GWT && lambdaForm.isGWT)
++ if (PROFILE_GWT && lambdaForm.containsFunction(NF_selectAlternative))
(...where LF.containsFunction(NamedFunction) is a variation of LF.contains(Name).)

I suppose the answer may be that you want to inline GWTs (if ever) into customized code where the JVM profiling should get maximum benefit. In that case you might want to set the boolean to "false" to distinguish "immature" GWT combinators from customized ones. If that's the case, perhaps the real boolean flag you want is not 'isGWT' but 'sharedProfile' or 'immature' or some such, or (inverting) 'customized'. (I like the feel of a 'customized' flag.)
Then @IgnoreProfile would get attached to a LF that (a) contains selectAlternative and (b) is marked as non-customized/immature/shared. You might also want to adjust the call to 'profileBranch' based on whether the containing LF was shared or customized.

What I'm mainly poking at here is that 'isGWT' is not informative about the intended use of the flag.

In 'updateCounters', if the counter overflows, you'll get continuous creation of ArithmeticExceptions. Will that optimize or will it cause a permanent slowdown? Consider a hack like this on the exception path:
   counters[idx] = Integer.MAX_VALUE / 2;

On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in the VM) promises too much "ignorance", since it suppresses branch counts and traps, but allows type profiles to be consulted. Maybe something positive like "@ManyTraps" or "@SharedMegamorphic"? (It's just a name, and this is just a suggestion.)

Going to the JVM:

In library_call.cpp, I think you should change the assert to a guard:
-+ assert(aobj->length() == 2, "");
++ && aobj->length() == 2) {

In Parse::dynamic_branch_prediction, the mere presence of the Opaque4 node is enough to trigger replacement of profiling. I think there should *not* be a test of method()->ignore_profile(). That should provide better integration between the two sources of profile data to JVM profiling?

Also, I think the name 'Opaque4Node' is way too... opaque. Suggest 'ProfileBranchNode', since that's exactly what it does.

Suggest changing the log element "profile_branch" to "observe source='profileBranch'", to make a better hint as to the source of the info.

-- John

From john.r.rose at oracle.com Fri Jan 16 23:27:31 2015
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 16 Jan 2015 15:27:31 -0800
Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations
In-Reply-To: <54B959BF.1030709@oracle.com>
References: <54B959BF.1030709@oracle.com>
Message-ID: <1398BE34-717B-46BB-B833-2BA7EFD39CA9@oracle.com>

On Jan 16, 2015, at 10:34 AM, Vladimir Ivanov wrote:
>
> http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00
> https://bugs.openjdk.java.net/browse/JDK-8068915
>
> The fix for 8063137 [1] (just sent out for review) uncovered another issue with deoptimization counts pollution. In some cases, speculative traps can get stuck continuously deoptimizing and never reach recompilation. Usually, it ruins performance if it happens.

Good; reviewed.

While you are in graphKit.cpp, it would be nice to remove this not-reached line:
  http://hg.openjdk.java.net/jdk9/dev/hotspot/file/f35435a37581/src/share/vm/opto/graphKit.cpp#l2176

-- John

From filipp.zhinkin at gmail.com Sat Jan 17 08:07:58 2015
From: filipp.zhinkin at gmail.com (Filipp Zhinkin)
Date: Sat, 17 Jan 2015 12:07:58 +0400
Subject: [9] RFR (XS): 8069126: compiler/rtm/locking/TestRTMTotalCountIncrRate.java nightly failure
Message-ID:

Hi,

please review a tiny fix for 8069126.

While working on JDK-8068269 I executed the tests on one host, but the webrev was prepared on another, so I made a typo which caused a compilation failure in the nightly.

I'm really sorry for such a silly bug.

Bug id: https://bugs.openjdk.java.net/browse/JDK-8069126
Webrev: http://cr.openjdk.java.net/~fzhinkin/8069126/webrev.00/
Testing: executed compiler/rtm tests on my laptop, which has an Ivy Bridge CPU, but that is enough to ensure that there are no compile-time issues

Thanks and sorry once again,
Filipp.
From vladimir.kozlov at oracle.com Sat Jan 17 18:12:44 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Sat, 17 Jan 2015 10:12:44 -0800 Subject: [9] RFR (XS): 8069126: compiler/rtm/locking/TestRTMTotalCountIncrRate.java nightly failure In-Reply-To: References: Message-ID: <54BAA61C.1030905@oracle.com> Good. Thanks, Vladimir On 1/17/15 12:07 AM, Filipp Zhinkin wrote: > Hi, > > please review a tiny fix for 8069126. > > While working on JDK-8068269 I've executed tests on one host, > but webrev was prepared on another, so I've made a typo which > caused compilation failure in nightly. > > I'm really sorry for such a silly bug. > > Bug id: https://bugs.openjdk.java.net/browse/JDK-8069126 > Webrev: http://cr.openjdk.java.net/~fzhinkin/8069126/webrev.00/ > Testing: executed compiler/rtm tests on my laptop, which have Ivy Bridge CPU, > but it's enough to ensure that there are no compile-time issues > > Thanks and sorry once again, > Filipp. > From filipp.zhinkin at gmail.com Sun Jan 18 10:34:14 2015 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Sun, 18 Jan 2015 14:34:14 +0400 Subject: [9] RFR (XS): 8069126: compiler/rtm/locking/TestRTMTotalCountIncrRate.java nightly failure In-Reply-To: <54BAA61C.1030905@oracle.com> References: <54BAA61C.1030905@oracle.com> Message-ID: Thank you, Vladimir. Regards, Filipp. On Sat, Jan 17, 2015 at 9:12 PM, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > > On 1/17/15 12:07 AM, Filipp Zhinkin wrote: >> >> Hi, >> >> please review a tiny fix for 8069126. >> >> While working on JDK-8068269 I've executed tests on one host, >> but webrev was prepared on another, so I've made a typo which >> caused compilation failure in nightly. >> >> I'm really sorry for such a silly bug. 
>>
>> Bug id: https://bugs.openjdk.java.net/browse/JDK-8069126
>> Webrev: http://cr.openjdk.java.net/~fzhinkin/8069126/webrev.00/
>> Testing: executed compiler/rtm tests on my laptop, which have Ivy Bridge CPU,
>> but it's enough to ensure that there are no compile-time issues
>>
>> Thanks and sorry once again,
>> Filipp.
>>
>

From igor.veresov at oracle.com Sun Jan 18 20:25:25 2015
From: igor.veresov at oracle.com (Igor Veresov)
Date: Sun, 18 Jan 2015 12:25:25 -0800
Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions.
Message-ID: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com>

After register allocation we may end up with nodes in the same block using different inputs that are in fact part of a multidef lrg. Since that would confuse the scheduler, the post-allocation copy removal attempts to select one of the inputs and replace all the uses that refer to the same value within the block. That works most of the time, except when we try to replace an input coming from a phi that has a single user. In that case the phi goes dead along with the spill copy that merges the values, which produces incorrect code. Unfortunately there is no way to make a proper selection of a reaching def; the easiest counterexample is having to select from two phis.

The solution to the problem is to introduce a node that acts like a phi but is without control (or rather something like MergeMem) that would merge the defs (which are really the same value and are part of a multidef lrg). The following change adds a new node (MachMerge) and a pass after the post-allocation copy removal to insert such nodes when needed. Even though it's a separate pass, it's a very fast linear traversal.
Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/

Tested with the failing method in weblogic, jprt, CTW

Thanks,
igor

From vladimir.kozlov at oracle.com Sun Jan 18 23:18:08 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sun, 18 Jan 2015 15:18:08 -0800
Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions.
In-Reply-To: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com>
References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com>
Message-ID: <54BC3F30.3000407@oracle.com>

Looks very good.

Thanks,
Vladimir

On 1/18/15 12:25 PM, Igor Veresov wrote:
> After register allocation we may end up with nodes in the same block using different inputs that are in fact a part of a multidef lrg. Since that would confuse the scheduler, the post-allocation copy removal attempts to select a one of the inputs and replace all the uses that refer to the same value within the block. That works most of the time except when we try to replace an input coming from a phi that has the only user. In that case the phi goes dead along with the spill copy that merges the values, which produces incorrect code. Unfortunately there is no way to make a proper selection of a reaching def - the easiest counter example is having to select from two phis.
>
> The solution to the problem is to introduce a node that acts like a phi but is without control (or rather something like MergeMem) that would merge the defs (that are really the same value and are a part of a multidef lrg). The following change adds a new node (MachMerge) and a pass after the post-allocation copy removal to insert them when needed. Even though it's a separate pass it's a very fast linear traversal.
> > Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/ > > Tested with the failing method in weblogic, jprt, CTW > > Thanks, > igor > From igor.veresov at oracle.com Mon Jan 19 00:42:49 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Sun, 18 Jan 2015 16:42:49 -0800 Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions. In-Reply-To: <54BC3F30.3000407@oracle.com> References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> <54BC3F30.3000407@oracle.com> Message-ID: <6CA51394-1D15-4708-8752-5EA40900101F@oracle.com> Thanks, Vladimir! igor > On Jan 18, 2015, at 3:18 PM, Vladimir Kozlov wrote: > > Looks very good. > > Thanks, > Vladimir > > On 1/18/15 12:25 PM, Igor Veresov wrote: >> After register allocation we may end up with nodes in the same block using different inputs that are in fact a part of a multidef lrg. Since that would confuse the scheduler, the post-allocation copy removal attempts to select a one of the inputs and replace all the uses that refer to the same value within the block. That works most of the time except when we try to replace an input coming from a phi that has the only user. In that case the phi goes dead along with the spill copy that merges the values, which produces incorrect code. Unfortunately there is no way to make a proper selection of a reaching def - the easiest counter example is having to select from two phis. >> >> The solution to the problem is to introduce a node that acts like a phi but is without control (or rather something like MergeMem) that would merge the defs (that are really the same value and are a part of a multidef lrg). The following change adds a new node (MachMerge) and a pass after the post-allocation copy removal to insert them when needed. Even though it?s a separate pass it?s a very fast linear traversal. 
>>
>> Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/
>>
>> Tested with the failing method in weblogic, jprt, CTW
>>
>> Thanks,
>> igor
>>

From zoltan.majo at oracle.com Mon Jan 19 08:32:20 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 19 Jan 2015 09:32:20 +0100
Subject: [9] RFR(S): 8066312: Add new Node* Node::find_user(int opc) method
In-Reply-To: <2786C141-F66A-4198-9FA9-7FC08465F3EA@oracle.com>
References: <54B50457.9080609@oracle.com> <71C5E452-5B2E-484B-82C5-5DD29F83B08B@oracle.com> <54B79D53.5090408@oracle.com> <54B80375.8050804@oracle.com> <54B975C8.6070306@oracle.com> <2786C141-F66A-4198-9FA9-7FC08465F3EA@oracle.com>
Message-ID: <54BCC114.3030401@oracle.com>

Thank you, Vladimir and John, for the review!

Best regards,

Zoltan

From zoltan.majo at oracle.com Mon Jan 19 09:06:31 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 19 Jan 2015 10:06:31 +0100
Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java
In-Reply-To: <54B91ECC.8010100@oracle.com>
References: <54B901DF.9040107@oracle.com> <54B9027C.3080506@oracle.com> <54B91ECC.8010100@oracle.com>
Message-ID: <54BCC917.1090209@oracle.com>

Hi Albert,

On 01/16/2015 03:23 PM, Albert Noll wrote:
> Hi Zoltan,
>
> I agree to quarantine the test.
>
> Shouldn't quarantining the test look like this?
> @ignore 8069160

you're right, thanks for pointing that out.

Here is the new webrev:

http://cr.openjdk.java.net/~zmajo/8069162/webrev.01/

Best wishes,

Zoltan

>
> Best,
> Albert
>
> On 01/16/2015 01:22 PM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> small mistake in the mail I've just sent out:
>>
>> On 01/16/2015 01:19 PM, Zoltán Majó wrote:
>>> Solution: The problem with the test is addressed by JDK-806912.
>>> Quarantine the test until 806912 is fixed.
>>
>> The problem with the test is addressed by *JDK-8069160*. Quarantine
>> the test until *8069160* is fixed.
>> >> https://bugs.openjdk.java.net/browse/JDK-8069160 >> >> I'm sorry for the noise. >> >> Thank you and best regards, >> >> >> Zoltan >> >> >>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ >>> >>> Testing: manual testing, JPRT with the option -onlytests >>> '.*hotspot_serviceability.*' >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >> > From zoltan.majo at oracle.com Mon Jan 19 09:15:44 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 19 Jan 2015 10:15:44 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54B959BC.3050004@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <54B959BC.3050004@oracle.com> Message-ID: <54BCCB40.5010803@oracle.com> On 01/16/2015 07:34 PM, Vladimir Kozlov wrote: > Looks good. You need second review for this - it is not small :) thank you, Vladimir, for the review. You're right, this change started small and got bigger, so it's definitely not small anymore. But I won't change the subject now so that we don't move discussion to a different thread. Best regards, Zoltan > > Thanks, > Vladimir > > On 1/16/15 5:05 AM, Zolt?n Maj? wrote: >> Hi Vladimir, >> >> >> thank you for the feedback! >> >> On 01/13/2015 08:09 PM, Vladimir Kozlov wrote: >>> Thank you, Zoltan, for performance testing! >>> >>> templateInterpreter_sparc.cpp - why you switched from G3 to G1 in >>> 325,344 lines? >> >> The reason is that the register Rcounter, which is used to store the >> address of MethodCounters, is an alias of G3. >> >> In the old code, we do not need MethodCounters after incrementing the >> invocation counter (lines 317--326), so that is >> why it is OK that the instruction >> >> 331: __ load_contents(profile_limit, G3_scratch); >> >> overwrites the contents of G3. 
>> >> In the new version, however, we do need MethodCounters afterwards >> (e.g., lines 339--344), so that is why I thought we >> should use G1 instead of G3. >> >>> >>> templateTable_x86_64.cpp - indention: >>> >>> __ movptr(rcx, Address(rcx, Method::method_counters_offset())); >>> + const Address mask(rcx, >>> in_bytes(MethodCounters::backedge_mask_offset())); >> >> Thanks for noticing, I've corrected the indentation. >> >>> Can you move new code in method.cpp into MethodCounters() >>> constructor? I don't see why it should be in method.cpp. >> >> I did that. >> >>> >>> advancedThresholdPolicy.hpp - some renaming left. >> >> Updated. >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ >> >> Thank you and best regards, >> >> >> Zoltan >> >>> >>> >>> Thanks, >>> Vladimir >>> >>> On 1/13/15 4:31 AM, Zolt?n Maj? wrote: >>>> Hi Vladimir, >>>> >>>> >>>> thank you for the feedback! Please see comments below. >>>> >>>> On 01/05/2015 08:13 PM, Vladimir Kozlov wrote: >>>>> On 1/5/15 10:38 AM, Zolt?n Maj? wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> please review the following patch. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8059606 >>>>>> >>>>>> Problem: Controlling compilation thresholds on a per-method level >>>>>> can >>>>>> be useful for debugging and understanding >>>>>> failures, but currently there is no way to control on a per-method >>>>>> level when methods are compiled. >>>>>> >>>>>> >>>>>> Solution: >>>>>> >>>>>> This patch adds support for scaling compilation thresholds on a >>>>>> per-method level using the CompileThresholdScaling flag. >>>>>> For example, the option >>>>>> >>>>>> -XX:CompileCommand=option,SomeClass.someMethod,double,CompileThresholdScaling,0.5 >>>>>> >>>>>> >>>>>> >>>>>> reduces compilation thresholds for method SomeClass.sometMethod() by >>>>>> 50% (but leaves global thresholds unaffected) and >>>>>> results in earlier compilation of the method. 
>>>>>> >>>>>> Similar to the global CompileThresholdScaling flag (added in >>>>>> JDK-805604), the per-method CompileThresholdScaling flag >>>>>> works with both tiered and non-tiered modes of operation. >>>>>> >>>>>> Per-method compilation thresholds are available only in non-product >>>>>> builds to avoid the overhead of accessing fields >>>>>> added by the patch MethodData and MethodCounters. >>>>> >>>>> Too many ifdefs :) >>>> >>>> I made per-method compilation thresholds available in product >>>> builds as >>>> well. That helps reducing the number of ifdefs :-). >>>> >>>>> The interpreter speed is not important. And the feature could be >>>>> interesting in product VM too. >>>>> The only drawback is 2 additional fields in MDO which is fine. >>>>> Can you make it product and run through our performance >>>>> infrastructure. >>>> >>>> Performance data show that per-method compilation thresholds do not >>>> result in a statistically significant change of performance. One >>>> benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, >>>> but I >>>> think that is negligible. >>>> >>>>> Also, as John Rose will say, we should have as much as possible a >>>>> similar code in product as in tested debug code. Otherwise we are not >>>>> testing product bits and will get into troubles. >>>>> >>>>>> >>>>>> The proposed patch supports x86_64, x86_32, and sparc. Do you think >>>>>> it is necessary to support other architectures as well? >>>>> >>>>> Yes. It should be supported on all platforms. >>>> >>>> The current patch supports all architectures except PPC64. >>>> >>>>> >>>>>> The patch updates the name of the flags Tier2BackEdgeThreshold, >>>>>> Tier3BackEdgeThreshold, Tier4BackEdgeThreshold >>>>>> (lowercase e in "Back*e*dge) so that the naming is consistent with >>>>>> other backedge-related flags >>>>>> (Tier0BackedgeNotifyFreqLog, Tier2BackedgeNotifyFreqLog, and >>>>>> Tier3BackedgeNotifyFreqLog). 
>>>>> >>>>> It added noise to main changes and may cause some testing (jfr?) >>>>> failures. Can we do it separately (other RFE?). >>>> >>>> I created issue 8068506 for that. >>>> >>>> Here is the new webrev: >>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.01/ >>>> >>>> Testing: manual testing, JPRT >>>> >>>>>> This patch is the third (and final) part of JDK-8050853: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8050853 . >>>>>> >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.00/ >>>>> >>>>> In general looks good. >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> >>>>>> Testing: manual testing on all supported architectures, JPRT. >>>>>> >>>>>> Thank you and best regards, >>>>>> >>>>>> >>>>>> Zoltan >>>>>> >>>> >> From albert.noll at oracle.com Mon Jan 19 09:17:06 2015 From: albert.noll at oracle.com (Albert Noll) Date: Mon, 19 Jan 2015 10:17:06 +0100 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java In-Reply-To: <54BCC917.1090209@oracle.com> References: <54B901DF.9040107@oracle.com> <54B9027C.3080506@oracle.com> <54B91ECC.8010100@oracle.com> <54BCC917.1090209@oracle.com> Message-ID: <54BCCB92.8090607@oracle.com> Hi Zoltan, On 01/19/2015 10:06 AM, Zolt?n Maj? wrote: > Hi Albert, > > > On 01/16/2015 03:23 PM, Albert Noll wrote: >> Hi Zoltan, >> >> I agree to quarantine the test. >> >> Shouldn't quarantining the test look like this? >> @ignore 8069160 > > you're right, thank for pointing that out. > > Here is the new webrev: > > http://cr.openjdk.java.net/~zmajo/8069162/webrev.01/ This looks good to me (not a reviewer). Best, Albert > > Best wishes, > > > Zoltan > >> >> Best, >> Albert >> >> On 01/16/2015 01:22 PM, Zolt?n Maj? wrote: >>> Hi, >>> >>> >>> small mistake in the mail I've just sent out: >>> >>> On 01/16/2015 01:19 PM, Zolt?n Maj? wrote: >>>> Solution: The problem with the test is addressed by JDK-806912. 
>>>> Quarantine the test until 806912 is fixed.
>>>
>>> The problem with the test is addressed by *JDK-8069160*. Quarantine
>>> the test until *8069160* is fixed.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8069160
>>>
>>> I'm sorry for the noise.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/
>>>>
>>>> Testing: manual testing, JPRT with the option -onlytests
>>>> '.*hotspot_serviceability.*'
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>
>>
>

From roland.westrelin at oracle.com Mon Jan 19 09:26:14 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 19 Jan 2015 10:26:14 +0100
Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations
In-Reply-To: <54B959BF.1030709@oracle.com>
References: <54B959BF.1030709@oracle.com>
Message-ID: <182C621D-3DA4-4F71-9832-2A036D0A4120@oracle.com>

Hi Vladimir,

> The fix is to (1) forbid changing uncommon trap action under the hood, and (2) consult Compile::too_many_recompiles when adding a speculative guard.

I'm not sure I understand what code (1) above is referring to in your webrev. Also, why is the problem restricted to speculative traps? Wouldn't the same checks be required for non speculative traps as well?

Roland.
From roland.westrelin at oracle.com Mon Jan 19 09:47:35 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 19 Jan 2015 10:47:35 +0100
Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths
In-Reply-To: <54B6FBD5.2070009@oracle.com>
References: <54B6D010.20706@oracle.com> <7A0EDCD9-3EC0-4CB1-A427-6764F4AF907D@oracle.com> <54B6FBD5.2070009@oracle.com>
Message-ID: <1808916F-D1E1-4C43-AD59-D7300B92C2B6@oracle.com>

Here is a new webrev:

http://cr.openjdk.java.net/~roland/6912521/webrev.01/

As suggested by Vladimir:
- the ArrayCopyNode stuff is in its own file
- I added comments for the library_call.cpp Arrays.copyOf changes

As suggested by John privately:
- I moved conv_I2X_offset to Compile
- I now use Node::find_int_con and find_intptr_t_con in ArrayCopyNode::get_length_if_constant

I also modified the test cases to check that once a copy is compiled as a series of loads/stores, we don't allow copies that should throw ArrayStoreException to proceed.

Roland.

> On Jan 15, 2015, at 12:29 AM, Vladimir Kozlov wrote:
>
> On 1/14/15 1:32 PM, Roland Westrelin wrote:
>> Hi Vladimir. Thanks for taking a look at this.
>>
>>> The logic which choose direction of coping in ArrayCopyNode::Ideal() is strange. I would like to see more explicit checks there. Something like:
>>>
>>> if (is_array_copy_overlap()) {
>>>   array_copy_backward()
>>> } else {
>>>   array_copy_forward()
>>> }
>>
>> I'm not following you. In the general case, the test needs to be done at runtime. Your code above seems to imply that we would always decide at compile time?
>
> My bad, you are right.
>
>>
>>> Can you move ArraCopy class code from callnode.?pp to new arraycopynode.?pp files? the code become too large.
>>
>> Ok.
>>
>>> Can you add comment in library_call.cpp explaining new validation/casting logic? Why you do that?
>>
>> I will add comments.
>> To be legal, the transformation of the ArrayCopyNode to loads/stores can only happen if we're sure the Arrays.copyOf would succeed. So we need all input arguments to the copyOf to be validated, including that the copy to the new array won't trigger an ArrayStoreException. That's why there's a subtype check. That subtype check can be optimized if we know something on the type of the input array from type speculation.
>
> Okay.
>
> Thanks,
> Vladimir
>
>>
>> Roland.
>>
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/14/15 1:34 AM, Roland Westrelin wrote:
>>>> http://cr.openjdk.java.net/~roland/6912521/webrev.00/
>>>>
>>>> Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores.
>>>>
>>>> Roland.
>>>>
>>

From zoltan.majo at oracle.com Mon Jan 19 16:56:03 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 19 Jan 2015 17:56:03 +0100
Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
In-Reply-To: <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com>
References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com>
Message-ID: <54BD3723.4070405@oracle.com>

Hi John,

On 01/16/2015 08:59 PM, John Rose wrote:
> On Jan 16, 2015, at 5:05 AM, Zoltán Majó wrote:
>> Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/
> Reviewed, with some comments.

thank you for your review and for the feedback! Please see detailed comments below.

> Overall, I like the cleanups along the way.
>
> The basic idea of replacing a hard-coded 'mask' with an addressable variable is sound and nicely executed.
>
> I suppose that idea by itself is "S" small, but this really is a "M" or even "L" change, as Vladimir says, especially since the enhanced logic is spread all around many files.

I agree that this is not a small change any more, but I would like to keep the subject of this discussion unchanged so that we don't move to a different thread (unless you or Vladimir think it is better to change it).

> How have you regression tested this?

I used our performance infrastructure. I collected performance data on 6 different architectures for 41 benchmark programs/suites. The data show that per-method compilation thresholds do not result in a statistically significant change in performance. One benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I think that is negligible.

> Have you verified that the compilation sequence doesn't change for a non-trivial use case? A slip in the assembly code (for example) might cause a comparison against a garbage mask or limit that could cause compilation decisions to go crazy quietly. I didn't spot any such bug, but they are hard to spot and sometimes quiet.

I did extensive manual testing on all architectures targeted by the patch. I checked that a target method (a method for which per-method compilation thresholds have been specified) is indeed compiled sooner/later than methods for which global compilation thresholds are in effect.

I ran tests for all combinations of +/-TieredCompilation, +/-ProfileInterpreter, and +/-UseOnStackReplacement on each architecture that we support. While working on this issue I discovered and reported two interpreter-related bugs, 8068652 and 8068505.

I cannot, unfortunately, guarantee that my changes are error-free, but I did my best to catch every possible error that I can think of and that I can check.
> In the sparc assembly code (more than one file), the live range of Rcounters has increased, since it is used to supply limits as well as to update the counter (which happens early in the code).
>
> To make it easier to maintain the code, I suggest renaming Rcounters to G3_method_counters.
> (As you can see, when a register has a logical name but has a complicated live range, we add the hardware name to the logical name, to make it easier to spot interfering uses, when manually editing code.)

Thanks for the suggestion, I've changed the register's name.

> If scale==0.0 is a valid input checked specially in compileBroker, perhaps the effect of a zero should be documented?
> Suggest adding to globals.hpp:
> "; values greater than one delay counter overflow; zero forces counter overflow immediately"
> "; can be set as a per-method option."

If CompileThresholdScaling is set to 0.0, all methods are interpreted. The reason is that CompileThreshold==0 has historically been equivalent to specifying -Xint. Setting CompileThresholdScaling to 0.0 scales down CompileThreshold to 0, and we wanted to keep VM behavior consistent when we added support for the global CompileThresholdScaling flag in JDK-8059604.

If you think it would be good to change the meaning of CompileThreshold==0, please let me know. I'll file an RFE for it and change it. As CompileThreshold is a product flag, we will need CCC approval for that change.

As you've suggested, I added a comment to globals.hpp that precisely describes the behavior of CompileThresholdScaling.

> Question: What if both a global scale and a method option for scale are both set? Is the global one ignored? Do they multiply? It's worth specifying it explicitly (and checking that the logic DTRT).

Global and per-method values multiply. That behavior is now described in the comment in globals.hpp.
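
A minimal sketch of the multiplication rule described above, for illustration only. The function names `scale_threshold` and `effective_threshold` are hypothetical and are not taken from the webrev; saturating at the largest representable value and rejecting negative scales are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the code from the webrev): scale a threshold by a
// non-negative factor, saturating instead of overflowing (assumed behavior).
std::int64_t scale_threshold(std::int64_t threshold, double scale) {
    assert(threshold >= 0);
    assert(scale >= 0.0);
    if (scale == 1.0) {
        return threshold;  // nothing to scale
    }
    const double scaled = static_cast<double>(threshold) * scale;
    const double max = static_cast<double>(std::numeric_limits<std::int64_t>::max());
    if (scaled >= max) {
        return std::numeric_limits<std::int64_t>::max();
    }
    return static_cast<std::int64_t>(scaled);
}

// The global and the per-method scaling factors multiply, as stated in the
// mail. A factor of 0.0 on either side yields a threshold of 0.
std::int64_t effective_threshold(std::int64_t threshold,
                                 double global_scale,
                                 double method_scale) {
    assert(global_scale >= 0.0 && method_scale >= 0.0);
    return scale_threshold(threshold, global_scale * method_scale);
}
```

With an illustrative threshold of 10000, a global scale of 1.0 and a per-method scale of 0.5 give an effective threshold of 5000; a global scale of 2.0 combined with a per-method scale of 0.5 leaves the threshold at 10000.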
> Question: How are the log values (like Tier0InvokeNotifyFreqLog or the result of get_scaled_freq_log) constrained to fit in a reasonable range? (I would suppose that range is 0..31 or less.) Should we have range clipping code in the scaler functions? Currently InvocationCounter::number_of_count_bits=29 bits are reserved for counting. As a result, the value of the log2 of the notification frequency can be at most 30. I updated the source code accordingly. > It would give notably simpler code, in MethodData::init, to use a branch-free setup of tier_0_invoke_notify_freq_log etc. Set scale = 1.0 and then update it conditionally. Special-case scale=1.0 in get_scaled_freq_log to taste. I did that. > Same comment about less-branchy code for methodCounters.hpp. (It's better to have a one-way branch that sets up 'scale' than a two-way branch with duplicate setups of one or more variables.) I changed the code according to your suggestions. > In MethodCounters, I think the conditional scaling of _interpreter_backward_branch_limit is going to confuse someone, at some point. It should be scaled, unconditionally, like its sibling variables. (That would remove another somewhat verbose initialization branch!) The value of the *global* interpreter backward branch limit (InvocationCounter::InterpreterBackwardBranchLimit) is computed based on the value of CompileThreshold (in InvocationCounter::reinitialize()). The backward branch limit is assigned a different value depending on whether interpreter profiling is enabled or not. The logic of the *per-method* interpreter backward branch limit (MethodCounters::_interpreter_backward_branch_limit) is intended to be identical to that of the global branch limit. Therefore, I'm afraid that we have to keep the if-then-else construct initializing _interpreter_backward_branch_limit in methodCounters.hpp. I hope that I understood right what you've previously suggested. 
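The range limit on scaled log values mentioned earlier in this reply can be sketched roughly as follows. The count-bit width of 29 and the resulting upper bound of 30 come from the message; the names kCountBits, kMaxFreqLog and scale_freq_log are hypothetical, and the actual patch may clip differently.

```cpp
#include <cassert>
#include <cstdint>

// Taken from the message: 29 bits are reserved for counting, so the log2 of
// the notification frequency can be at most 30. Names are hypothetical.
constexpr int kCountBits = 29;
constexpr int kMaxFreqLog = kCountBits + 1;

// Scales the frequency 2^freq_log by 'scale' and returns floor(log2) of the
// result, clipped to the range [0, kMaxFreqLog].
inline int scale_freq_log(int freq_log, double scale) {
  assert(freq_log >= 0);
  if (freq_log > kMaxFreqLog) freq_log = kMaxFreqLog;
  if (scale == 1.0) return freq_log;  // common case: nothing to scale

  const double scaled =
      static_cast<double>(std::int64_t{1} << freq_log) * scale;
  if (!(scaled >= 1.0)) return 0;  // zero, negative, or NaN scale
  if (scaled >= static_cast<double>(std::int64_t{1} << kMaxFreqLog)) {
    return kMaxFreqLog;  // clip before converting to an integer
  }

  // floor(log2(scaled)) by repeated halving; the value is below 2^30 here.
  std::int64_t value = static_cast<std::int64_t>(scaled);
  int log = 0;
  while (value > 1) {
    value >>= 1;
    ++log;
  }
  return log;
}
```

Clipping before the integer conversion keeps very large scaling factors from overflowing, which is the concern raised in the question about range clipping.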
> Small style nit: The noise-word "get_" is discouraged in the style doc: > >> - Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or "has_". >> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide > Arguments.cpp follows this rule partially (in older code maybe?). It would be better to decrease counter-examples to the rule instead of increasing them. > > Bigger style nit: Since the functions are not getting a preset value (from the arguments) but rather normalizing a provided argument value, I suggest naming them "scale_compile_threshold" (i.e., a verb phrase instead of a noun phrase). Again from the style doc: > >> - Other method names are verb phrases, as if commands to the receiver. I did not know that, thanks for pointing it out to me. I changed get_scaled_compile_threshold -> scaled_compile_threshold get_scaled_freq_log -> scaled_freq_log. > Since you are providing overloads of the scaling functions, the header file should either contain inline code for the convenience methods, or else document how the optional argument ('scale') defaults. I'd prefer inline code, since it is simple. It's as much text to document with a comment as just to supply the inline overload. I inlined the convenience methods into the header file. Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ > > As I said before, nice work! Thank you! Best regards, Zoltan > > -- John From vladimir.x.ivanov at oracle.com Mon Jan 19 17:05:49 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 19 Jan 2015 20:05:49 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54B975EA.6040005@oracle.com> References: <54B94766.2080102@oracle.com> <54B975EA.6040005@oracle.com> Message-ID: <54BD396D.2050907@oracle.com> Thanks, Vladimir!
> I would suggest to add more detailed comment (instead of simple "Stop > profiling") to inline_profileBranch() intrinsic explaining what it is > doing because it is not strictly "intrinsic" - it does not implement > profileBranch() java code when counts is constant. Sure, will do. > You forgot to mark Opaque4Node as macro node. I would suggest to base it > on Opaque2Node then you will get some methods from it. Do I really need to do so? I expect it to go away during IGVN pass right after parsing is over. That's why I register the node for igvn in LibraryCallKit::inline_profileBranch(). Changes in macro.cpp & compile.cpp are leftovers from the version when Opaque4 was macro node. I plan to remove them. Best regards, Vladimir Ivanov > On 1/16/15 9:16 AM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >> https://bugs.openjdk.java.net/browse/JDK-8063137 >> >> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >> significantly distorted compilation decisions. It affected inlining and >> hindered some optimizations. It causes significant performance >> regressions for Nashorn (on Octane benchmarks). >> >> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >> branch is never taken. It can cause missed optimization opportunity, and >> not just increase in code size. For example, non-pruned branch can break >> escape analysis. >> >> Currently, there are 2 problems: >> - branch frequencies profile pollution >> - deoptimization counts pollution >> >> Branch frequency pollution hides from JIT the fact that a branch is >> never taken. Since GWT LambdaForms (and hence their bytecode) are >> heavily shared, but the behavior is specific to MethodHandle, there's no >> way for JIT to understand how particular GWT instance behaves. >> >> The solution I propose is to do profiling in Java code and feed it to >> JIT. 
Every GWT MethodHandle holds an auxiliary array (int[2]) where >> profiling info is stored. Once JIT kicks in, it can retrieve these >> counts, if corresponding MethodHandle is a compile-time constant (and it >> is usually the case). To communicate the profile data from Java code to >> JIT, MethodHandleImpl::profileBranch() is used. >> >> If GWT MethodHandle isn't a compile-time constant, profiling should >> proceed. It happens when corresponding LambdaForm is already shared, for >> newly created GWT MethodHandles profiling can occur only in native code >> (dedicated nmethod for a single LambdaForm). So, when compilation of the >> whole MethodHandle chain is triggered, the profile should be already >> gathered. >> >> Overriding branch frequencies is not enough. Statistics on >> deoptimization events is also polluted. Even if a branch is never taken, >> JIT doesn't issue an uncommon trap there unless corresponding bytecode >> doesn't trap too much and doesn't cause too many recompiles. >> >> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >> sees it on some method, Compile::too_many_traps & >> Compile::too_many_recompiles for that method always return false. It >> allows JIT to prune the branch based on custom profile and recompile the >> method, if the branch is visited. >> >> For now, I wanted to keep the fix very focused. The next thing I plan to >> do is to experiment with ignoring deoptimization counts for other >> LambdaForms which are heavily shared. I already saw problems caused by >> deoptimization counts pollution (see JDK-8068915 [2]). >> >> I plan to backport the fix into 8u40, once I finish extensive >> performance testing. >> >> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >> Octane). >> >> Thanks! >> >> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >> [2] almost completely recovers peak performance after LambdaForm sharing >> [3]. 
There's one more problem left (non-inlined MethodHandle invocations >> are more expensive when LFs are shared), but it's a story for another >> day. >> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >> 8059877: GWT branch frequencies pollution due to LF sharing >> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >> JEP 210: LambdaForm Reduction and Caching >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > _______________________________________________ > mlvm-dev mailing list > mlvm-dev at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From vladimir.kozlov at oracle.com Mon Jan 19 18:23:29 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 19 Jan 2015 10:23:29 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54BD396D.2050907@oracle.com> References: <54B94766.2080102@oracle.com> <54B975EA.6040005@oracle.com> <54BD396D.2050907@oracle.com> Message-ID: <54BD4BA1.2060300@oracle.com> On 1/19/15 9:05 AM, Vladimir Ivanov wrote: > Thanks, Vladimir! > >> I would suggest to add more detailed comment (instead of simple "Stop >> profiling") to inline_profileBranch() intrinsic explaining what it is >> doing because it is not strictly "intrinsic" - it does not implement >> profileBranch() java code when counts is constant. > Sure, will do. > >> You forgot to mark Opaque4Node as macro node. I would suggest to base it >> on Opaque2Node then you will get some methods from it. > Do I really need to do so? I expect it to go away during IGVN pass right after parsing is over. That's why I register > the node for igvn in LibraryCallKit::inline_profileBranch(). 
Changes in macro.cpp & compile.cpp are leftovers from the > version when Opaque4 was macro node. I plan to remove them. I see, this is why you did not inherit from it. Okay. I would suggest leaving an assert in compile.cpp to make sure no Opaque4 node is left behind. I found a typo when I looked today (should be '&&'): + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) { + if (can_reshape & _delay_removal) { Thanks, Vladimir > > Best regards, > Vladimir Ivanov > >> On 1/16/15 9:16 AM, Vladimir Ivanov wrote: >>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >>> https://bugs.openjdk.java.net/browse/JDK-8063137 >>> >>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >>> significantly distorted compilation decisions. It affected inlining and >>> hindered some optimizations. It causes significant performance >>> regressions for Nashorn (on Octane benchmarks). >>> >>> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >>> branch is never taken. It can cause missed optimization opportunity, and >>> not just increase in code size. For example, non-pruned branch can break >>> escape analysis. >>> >>> Currently, there are 2 problems: >>> - branch frequencies profile pollution >>> - deoptimization counts pollution >>> >>> Branch frequency pollution hides from JIT the fact that a branch is >>> never taken. Since GWT LambdaForms (and hence their bytecode) are >>> heavily shared, but the behavior is specific to MethodHandle, there's no >>> way for JIT to understand how particular GWT instance behaves. >>> >>> The solution I propose is to do profiling in Java code and feed it to >>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >>> profiling info is stored. Once JIT kicks in, it can retrieve these >>> counts, if corresponding MethodHandle is a compile-time constant (and it >>> is usually the case).
To communicate the profile data from Java code to >>> JIT, MethodHandleImpl::profileBranch() is used. >>> >>> If GWT MethodHandle isn't a compile-time constant, profiling should >>> proceed. It happens when corresponding LambdaForm is already shared, for >>> newly created GWT MethodHandles profiling can occur only in native code >>> (dedicated nmethod for a single LambdaForm). So, when compilation of the >>> whole MethodHandle chain is triggered, the profile should be already >>> gathered. >>> >>> Overriding branch frequencies is not enough. Statistics on >>> deoptimization events is also polluted. Even if a branch is never taken, >>> JIT doesn't issue an uncommon trap there unless corresponding bytecode >>> doesn't trap too much and doesn't cause too many recompiles. >>> >>> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >>> sees it on some method, Compile::too_many_traps & >>> Compile::too_many_recompiles for that method always return false. It >>> allows JIT to prune the branch based on custom profile and recompile the >>> method, if the branch is visited. >>> >>> For now, I wanted to keep the fix very focused. The next thing I plan to >>> do is to experiment with ignoring deoptimization counts for other >>> LambdaForms which are heavily shared. I already saw problems caused by >>> deoptimization counts pollution (see JDK-8068915 [2]). >>> >>> I plan to backport the fix into 8u40, once I finish extensive >>> performance testing. >>> >>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >>> Octane). >>> >>> Thanks! >>> >>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >>> [2] almost completely recovers peak performance after LambdaForm sharing >>> [3]. There's one more problem left (non-inlined MethodHandle invocations >>> are more expensive when LFs are shared), but it's a story for another >>> day. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >>> 8059877: GWT branch frequencies pollution due to LF sharing >>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >>> JEP 210: LambdaForm Reduction and Caching From vladimir.kozlov at oracle.com Mon Jan 19 18:26:39 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 19 Jan 2015 10:26:39 -0800 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java In-Reply-To: <54BCC917.1090209@oracle.com> References: <54B901DF.9040107@oracle.com> <54B9027C.3080506@oracle.com> <54B91ECC.8010100@oracle.com> <54BCC917.1090209@oracle.com> Message-ID: <54BD4C5F.6000706@oracle.com> Good. Thanks, Vladimir On 1/19/15 1:06 AM, Zoltán Majó wrote: > Hi Albert, > > > On 01/16/2015 03:23 PM, Albert Noll wrote: >> Hi Zoltan, >> >> I agree to quarantine the test. >> >> Shouldn't quarantining the test look like this? >> @ignore 8069160 > > you're right, thanks for pointing that out. > > Here is the new webrev: > > http://cr.openjdk.java.net/~zmajo/8069162/webrev.01/ > > Best wishes, > > > Zoltan > >> >> Best, >> Albert >> >> On 01/16/2015 01:22 PM, Zoltán Majó wrote: >>> Hi, >>> >>> >>> small mistake in the mail I've just sent out: >>> >>> On 01/16/2015 01:19 PM, Zoltán Majó wrote: >>>> Solution: The problem with the test is addressed by JDK-806912. Quarantine the test until 806912 is fixed. >>> >>> The problem with the test is addressed by *JDK-8069160*. Quarantine the test until *8069160* is fixed.
>>> >>> https://bugs.openjdk.java.net/browse/JDK-8069160 >>> >>> I'm sorry for the noise. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>> >>>> >>>> Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ >>>> >>>> Testing: manual testing, JPRT with the option -onlytests '.*hotspot_serviceability.*' >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>> >> > From igor.veresov at oracle.com Mon Jan 19 18:35:23 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 19 Jan 2015 10:35:23 -0800 Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions. In-Reply-To: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> Message-ID: <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> I just amended the change with an extended comment on why it's enough to track base registers only and ignore multi-register effects (like multi-register lrg definitions and fat projections). It is subtle, but it saves unnecessary work: // We just updated the last edge, now null out the value produced by the instruction itself, // since we're only interested in defs implicitly defined by the uses. We are actually interested // in tracking only redefinitions of the multidef lrgs in the same register. For that matter it's enough // to track changes in the base register only and ignore other effects of multi-register lrgs // and fat projections. It is also ok to ignore defs coming from singledefs. After an implicit // overwrite by one of those our register is guaranteed to be used by another lrg and we won't // attempt to merge it. I also tightened the code following the comment to do updates only for multidef lrgs (postaloc.cpp): lrg = _lrg_map.live_range_id(n); - if (lrg > 0) { + if (lrg > 0 && lrgs(lrg).is_multidef()) { OptoReg::Name reg = lrgs(lrg).reg(); reg2defuse.at(reg).clear(); } Webrev updated in place.
Sorry about the last minute change. igor > On Jan 18, 2015, at 12:25 PM, Igor Veresov wrote: > > After register allocation we may end up with nodes in the same block using different inputs that are in fact a part of a multidef lrg. Since that would confuse the scheduler, the post-allocation copy removal attempts to select one of the inputs and replace all the uses that refer to the same value within the block. That works most of the time except when we try to replace an input coming from a phi that has only one user. In that case the phi goes dead along with the spill copy that merges the values, which produces incorrect code. Unfortunately there is no way to make a proper selection of a reaching def - the easiest counter example is having to select from two phis. > > The solution to the problem is to introduce a node that acts like a phi but is without control (or rather something like MergeMem) that would merge the defs (that are really the same value and are a part of a multidef lrg). The following change adds a new node (MachMerge) and a pass after the post-allocation copy removal to insert them when needed. Even though it's a separate pass it's a very fast linear traversal. > > Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/ > > Tested with the failing method in weblogic, jprt, CTW > > Thanks, > igor
From vladimir.kozlov at oracle.com Mon Jan 19 18:36:49 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 19 Jan 2015 10:36:49 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BD3723.4070405@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> Message-ID: <54BD4EC1.5060607@oracle.com> What was changed in globalDefinitions.hpp? Webrev shows nothing (sometimes it does not show spacing changes). In arguments.cpp, please add a check that CompileThresholdScaling is not negative. Otherwise this looks good to me. Thanks, Vladimir On 1/19/15 8:56 AM, Zoltán Majó wrote: > Hi John, > > > On 01/16/2015 08:59 PM, John Rose wrote: >> On Jan 16, 2015, at 5:05 AM, Zoltán Majó wrote: >>> Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ >> Reviewed, with some comments. > > thank you for your review and for the feedback! Please see detailed comments below. > >> Overall, I like the cleanups along the way. >> >> The basic idea of replacing a hard-coded 'mask' with an addressable variable is sound and nicely executed. >> >> I suppose that idea by itself is "S" small, but this really is a "M" or even "L" change, as Vladimir says, especially >> since the enhanced logic is spread all around many files. > > I agree that this is not a small change any more, but I would like to keep the subject of this discussion unchanged so > that we don't move to a different thread (unless you or Vladimir think it is better to change it). > >> How have you regression tested this? > > I used our performance infrastructure. I collected performance data on 6 different architectures for 41 benchmark > programs/suites.
The data show that per-method compilation thresholds do not result in a statistically significant > change of performance. One benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I think that is > negligible. > >> Have you verified that the compilation sequence doesn't change for a non-trivial use case? A slip in the assembly >> code (for example) might cause a comparison against a garbage mask or limit that could cause compilation decisions to >> go crazy quietly. I didn't spot any such bug, but they are hard to spot and sometimes quiet. > > I did extensive manual testing on all architectures targeted by the patch. > > I checked that a target method (a method for which per-method compilation thresholds have been specified) is indeed > compiled sooner/later than methods for which global compilation thresholds are in effect. I ran tests for all > combinations of +/-TieredCompilation, +/-ProfileInterpreter, and +/-UseOnStackReplacement > on each architecture that we support. While working on this issue I've discovered and reported two interpreter-related > bugs, 8068652 and 8068505. > > I cannot, unfortunately, guarantee that my changes are error-free, but I did my best to catch any possible error that I > can think of and that I can check. > >> In the sparc assembly code (more than one file), the live range of Rcounters has increased, since it is used to supply >> limits as well as to update the counter (which happens early in the code). >> >> To make it easier to maintain the code, I suggest renaming Rcounters to G3_method_counters. >> (As you can see, when a register has a logical name but has a complicated live range, we add the hardware name is to >> the logical name, to make it easier to spot interfering uses, when manually editing code.) > > Thanks for the suggestion, I've changed the register's name. > >> If scale==0.0 is a valid input checked specially in compileBroker, perhaps the effect of a zero should be documented? 
>> Suggest adding to globals.hpp: >> "; values greater than one delay counter overflow; zero forces counter overflow immediately" >> "; can be set as a per-method option." > > If CompileThresholdScaling is set to 0.0, all methods are interpreted. > > The reason is that CompileThreshold==0 has been historically equivalent to setting -Xint to true. Setting > CompileThresholdScaling to 0.0 scales down CompileThreshold to 0 and we wanted to keep VM behavior consistent when we've > added support for the global CompileThresholdScaling flag in JDK-8059604. > > If you think it would be good if we changed the meaning of CompileThreshold==0, please let me know. I'll file an RFE for > it and change it. As CompileThreshold is a product flag, we will need CCC approval for that change. > > As you've suggested, I added a comment to globals.hpp that precisely describes the behavior of CompileThresholdScaling. > >> Question: What if both a global scale and a method option for scale are both set? Is the global one ignored? Do >> they multiply? It's worth specifying it explicitly (and checking that the logic DTRT). > > Global and per-method values multiply. That behavior is now described in the comment in globals.hpp. > >> Question: How are the log values (like Tier0InvokeNotifyFreqLog or the result of get_scaled_freq_log) constrained to >> fit in a reasonable range? (I would suppose that range is 0..31 or less.) Should we have range clipping code in the >> scaler functions? > > Currently InvocationCounter::number_of_count_bits=29 bits are reserved for counting. As a result, the value of the log2 > of the notification frequency can be at most 30. I updated the source code accordingly. > >> It would give notably simpler code, in MethodData::init, to use a branch-free setup of tier_0_invoke_notify_freq_log >> etc. Set scale = 1.0 and then update it conditionally. Special-case scale=1.0 in get_scaled_freq_log to taste. > > I did that. 
> >> Same comment about less-branchy code for methodCounters.hpp. (It's better to have a one-way branch that sets up >> 'scale' than a two-way branch with duplicate setups of one or more variables.) > > I changed the code according to your suggestions. > >> In MethodCounters, I think the conditional scaling of _interpreter_backward_branch_limit is going to confuse someone, >> at some point. It should be scaled, unconditionally, like its sibling variables. (That would remove another somewhat >> verbose initialization branch!) > > The value of the *global* interpreter backward branch limit (InvocationCounter::InterpreterBackwardBranchLimit) is > computed based on the value of CompileThreshold (in InvocationCounter::reinitialize()). The backward branch limit is > assigned a different value depending on whether interpreter profiling is enabled or not. > > The logic of the *per-method* interpreter backward branch limit (MethodCounters::_interpreter_backward_branch_limit) is > intended to be identical to that of the global branch limit. Therefore, I'm afraid that we have to keep the if-then-else > construct initializing _interpreter_backward_branch_limit in methodCounters.hpp. I hope that I understood right what > you've previously suggested. > >> Small style nit: The noise-word "get_" is discouraged in the style doc: >> >>> ? Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or >>> "has_". >>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >> Arguments.cpp follows this rule partially (in older code maybe?). It would be better to decrease counter-examples to >> the rule instead of increase them. >> >> Bigger style nit: Since the functions are not getting a preset value (from the arguments) but rather normalizing a >> provided argument value, I suggest naming them "scale_compile_threshold" (i.e., a verb phrase instead of a noun >> phrase). Again from the style doc: >> >>> ? 
Other method names are verb phrases, as if commands to the receiver. > > I did not know that, thanks for pointing it out to me. I changed > > get_scaled_compile_threshold -> scaled_compile_threshold > get_scaled_freq_log -> scaled_freq_log. > >> Since you are providing overloads of the scaling functions, the header file should either contain inline code for the >> convenience methods, or else document how the optional argument ('scale') defaults. I'd prefer inline code, since it >> is simple. It's as much text to document with a comment as just to supply the inline overload. > > I inlined the convenience methods into the header file. > > Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ > >> >> As I said before, nice work! > > Thank you! > > Best regards, > > > Zoltan > >> >> -- John > From vladimir.kozlov at oracle.com Mon Jan 19 18:38:04 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 19 Jan 2015 10:38:04 -0800 Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions. In-Reply-To: <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> Message-ID: <54BD4F0C.6040300@oracle.com> Good. Thanks, Vladimir On 1/19/15 10:35 AM, Igor Veresov wrote: > I just amended the change with an extended comment on why it's enough to track base registers only and ignore > multi-register effects (like multi-register lrg definitions and fat projections). It is subtle, but it saves unnecessary work: > > // We just updated the last edge, now null out the value produced by the instruction itself, > // since we're only interested in defs implicitly defined by the uses. We are actually interested > // in tracking only redefinitions of the multidef lrgs in the same register.
For that matter it's enough > // to track changes in the base register only and ignore other effects of multi-register lrgs > // and fat projections. It is also ok to ignore defs coming from singledefs. After an implicit > // overwrite by one of those our register is guaranteed to be used by another lrg and we won't > // attempt to merge it. > > > I also tightened the code following the comment to do updates only for multidef lrgs (postaloc.cpp): > > lrg = _lrg_map.live_range_id(n); > - if (lrg > 0) { > + if (lrg > 0 && lrgs(lrg).is_multidef()) { > OptoReg::Name reg = lrgs(lrg).reg(); > reg2defuse.at(reg).clear(); > } > > Webrev updated in place. > > Sorry about the last minute change. > igor > >> On Jan 18, 2015, at 12:25 PM, Igor Veresov wrote: >> >> After register allocation we may end up with nodes in the same block using different inputs that are in fact a part of >> a multidef lrg. Since that would confuse the scheduler, the post-allocation copy removal attempts to select one of >> the inputs and replace all the uses that refer to the same value within the block. That works most of the time except >> when we try to replace an input coming from a phi that has only one user. In that case the phi goes dead along with >> the spill copy that merges the values, which produces incorrect code. Unfortunately there is no way to make a proper >> selection of a reaching def - the easiest counter example is having to select from two phis. >> >> The solution to the problem is to introduce a node that acts like a phi but is without control (or rather something >> like MergeMem) that would merge the defs (that are really the same value and are a part of a multidef lrg). The >> following change adds a new node (MachMerge) and a pass after the post-allocation copy removal to insert them when >> needed. Even though it's a separate pass it's a very fast linear traversal.
>> >> Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/ >> >> Tested with the failing method in weblogic, jprt, CTW >> >> Thanks, >> igor > From vladimir.kozlov at oracle.com Mon Jan 19 19:06:20 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 19 Jan 2015 11:06:20 -0800 Subject: RFR(L): 6912521: System.arraycopy works slower than the simple loop for little lengths In-Reply-To: <1808916F-D1E1-4C43-AD59-D7300B92C2B6@oracle.com> References: <54B6D010.20706@oracle.com> <7A0EDCD9-3EC0-4CB1-A427-6764F4AF907D@oracle.com> <54B6FBD5.2070009@oracle.com> <1808916F-D1E1-4C43-AD59-D7300B92C2B6@oracle.com> Message-ID: <54BD55AC.1030800@oracle.com> The comment in library_call.cpp has strange letters "we???re": + // ArrayCopyNode:Ideal may transform the ArrayCopyNode to + // loads/stores but it is legal only if we???re sure the + // Arrays.copyOf would succeed. So we need all input arguments + // to the copyOf to be validated, including that the copy to the + // new array won???t trigger an ArrayStoreException. That subtype + // check can be optimized if we know something on the type of + // the input array from type speculation. Otherwise looks good. Thanks, Vladimir On 1/19/15 1:47 AM, Roland Westrelin wrote: > Here is a new webrev: > > http://cr.openjdk.java.net/~roland/6912521/webrev.01/ > > As suggested by Vladimir: > - the ArrayCopyNode stuff is its own file > - I added comments for library_call.cpp Arrays.CopyOf changes > > As suggested by John privately: > - I moved conv_I2X_offset to Compile > - I now use Node::find_int_con and find_intptr_t_con in ArrayCopyNode::get_length_if_constant > > I also modified the test cases to check that once a copy is compiled as a series of loads/stores, we don't allow copies that should throw ArrayStoreException to proceed. > > Roland. > > >> On Jan 15, 2015, at 12:29 AM, Vladimir Kozlov wrote: >> >> On 1/14/15 1:32 PM, Roland Westrelin wrote: >>> Hi Vladimir. Thanks for taking a look at this.
>>> >>>> The logic which choose direction of coping in ArrayCopyNode::Ideal() is strange. I would like to see more explicit checks there. Something like: >>>> >>>> if (is_array_copy_overlap()) { >>>> array_copy_backward() >>>> } else { >>>> array_copy_forward() >>>> } >>> >>> I?m not following you. In the general case, the test needs to be done at runtime. Your code above seems to imply that we would always decide at compile time? >> >> My bad, you are right. >> >>> >>>> Can you move ArraCopy class code from callnode.?pp to new arraycopynode.?pp files? the code become too large. >>> >>> Ok. >>> >>>> Can you add comment in library_call.cpp explaining new validation/casting logic? Why you do that? >>> >>> I will add comments. >>> To be legal, the transformation of the ArrayCopyNode to loads/stores can only happen if we?re sure the Arrays.copyOf would succeed. So we need all input arguments to the copyOf to be validated, including that the copy to the new array won?t trigger an ArrayStoreException. That?s why there?s a subtype check. That subtype check can be optimized if we know something on the type of the input array from type speculation. >> >> Okay. >> >> Thanks, >> Vladimir >> >>> >>> Roland. >>> >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 1/14/15 1:34 AM, Roland Westrelin wrote: >>>>> http://cr.openjdk.java.net/~roland/6912521/webrev.00/ >>>>> >>>>> Follow up to 6700100 (instance clone as series of loads/stores): convert ArrayCopyNode for small array copies (clone of arrays, System.arraycopy, Arrays.copyOf) to series of loads and stores. >>>>> >>>>> Roland. >>>>> >>> > From vladimir.x.ivanov at oracle.com Mon Jan 19 19:22:52 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 19 Jan 2015 22:22:52 +0300 Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions. 
In-Reply-To: <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> Message-ID: <54BD598C.3000006@oracle.com> Looks good. Just a cosmetic suggestion (feel free to ignore it): it looks like if (k == n->req() - 1) is better suited for PhaseChaitin::merge_multidefs() than PhaseChaitin::possibly_merge_multidef(). Best regards, Vladimir Ivanov On 1/19/15 9:35 PM, Igor Veresov wrote: > I just amended the change with an extended comment on why it?s enough to > track base registers only and ignore multi-register effects (like > multi-registers lrg definitions and fat projections), it is subtle, but > saves unnecessary work: > > // We just updated the last edge, now null out the value produced by the > instruction itself, > // since we're only interested in defs implicitly defined by the uses. > We are actually interested > // in tracking only redefinitions of the multidef lrgs in the same > register. For that matter it's enough > // to track changes in the base register only and ignore other effects > of multi-register lrgs > // and fat projections. It is also ok to ignore defs coming from > singledefs. After an implicit > // overwrite by one of those our register is guaranteed to be used by > another lrg and we won't > // attempt to merge it. > > > I also tightened the code following the comment to do updates only for > multidefs lrgs (postaloc.cpp): > > lrg = _lrg_map.live_range_id(n); > - if (lrg > 0) { > + if (lrg > 0 && lrgs(lrg).is_multidef()) { > OptoReg::Name reg = lrgs(lrg).reg(); > reg2defuse.at (reg).clear(); > } > > Webrev updated in place. > > Sorry about the last minute change. > igor > >> On Jan 18, 2015, at 12:25 PM, Igor Veresov > > wrote: >> >> After register allocation we may end up with nodes in the same block >> using different inputs that are in fact a part of a multidef lrg. 
>> Since that would confuse the scheduler, the post-allocation copy >> removal attempts to select a one of the inputs and replace all the >> uses that refer to the same value within the block. That works most of >> the time except when we try to replace an input coming from a phi that >> has the only user. In that case the phi goes dead along with the spill >> copy that merges the values, which produces incorrect code. >> Unfortunately there is no way to make a proper selection of a reaching >> def - the easiest counter example is having to select from two phis. >> >> The solution to the problem is to introduce a node that acts like a >> phi but is without control (or rather something like MergeMem) that >> would merge the defs (that are really the same value and are a part of >> a multidef lrg). The following change adds a new node (MachMerge) and >> a pass after the post-allocation copy removal to insert them when >> needed. Even though it?s a separate pass it?s a very fast linear >> traversal. >> >> Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/ >> >> Tested with the failing method in weblogic, jprt, CTW >> >> Thanks, >> igor > From vladimir.x.ivanov at oracle.com Mon Jan 19 19:27:31 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 19 Jan 2015 22:27:31 +0300 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <182C621D-3DA4-4F71-9832-2A036D0A4120@oracle.com> References: <54B959BF.1030709@oracle.com> <182C621D-3DA4-4F71-9832-2A036D0A4120@oracle.com> Message-ID: <54BD5AA3.6050005@oracle.com> Roland, thanks for the feedback! >> The fix is to (1) forbid changing uncommon trap action under the hood, and (2) consult Compile::too_many_recompiles when adding a speculative guard. > > I?m not sure I understand what code (1) above is referring to in your webrev. The fix is based on 8063137 which I've sent for review earlier. 
GraphKit::uncommon_trap_exact [1] delegates to GraphKit::uncommon_trap(..., /*keep_exact_action=*/true). keep_exact_action guards the logic which rewrites the action. > Also, why is the problem restricted to speculative traps? Wouldn't the same checks be required for non speculative traps as well? Other trap types could be affected as well, but speculative traps and unstable_if are the main source of action transitions. My experiments on Octane show that all cases where action substitution happens involve one of these trap reasons. With this change I wanted to address the speculative traps case. I'll experiment with making unstable_if traps exact as well and get back to you with an updated webrev. I'd prefer to change the default for keep_exact_action to true (and fix only the case mentioned in GraphKit::uncommon_trap [3]), but since I consider backporting the fix into 8u40, I want to keep it very focused. Best regards, Vladimir Ivanov [1] http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/src/share/vm/opto/graphKit.hpp.udiff.html [2] http://cr.openjdk.java.net/~vlivanov/8063137 [3] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/tip/src/share/vm/opto/graphKit.cpp#l1986 From igor.veresov at oracle.com Mon Jan 19 19:44:33 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 19 Jan 2015 11:44:33 -0800 Subject: RFR(M) 8068881: SIGBUS in C2 compiled method weblogic.wsee.jaxws.framework.jaxrpc.EnvironmentFactory$SimulatedWsdlDefinitions. In-Reply-To: <54BD598C.3000006@oracle.com> References: <79E8800E-AB67-4889-B5E4-784F31EB8014@oracle.com> <554922D7-725E-4B91-9DCA-144216502DD9@oracle.com> <54BD598C.3000006@oracle.com> Message-ID: <59DDE323-9553-456D-81DF-6E553D5E348B@oracle.com> Thanks, Vladimir! I made the proposed change. Here's the updated webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.02/ Thanks! igor > On Jan 19, 2015, at 11:22 AM, Vladimir Ivanov wrote: > > Looks good.
> > Just a cosmetic suggestion (feel free to ignore it): > it looks like if (k == n->req() - 1) is better suited for PhaseChaitin::merge_multidefs() than PhaseChaitin::possibly_merge_multidef(). > > Best regards, > Vladimir Ivanov > > On 1/19/15 9:35 PM, Igor Veresov wrote: >> I just amended the change with an extended comment on why it?s enough to >> track base registers only and ignore multi-register effects (like >> multi-registers lrg definitions and fat projections), it is subtle, but >> saves unnecessary work: >> >> // We just updated the last edge, now null out the value produced by the >> instruction itself, >> // since we're only interested in defs implicitly defined by the uses. >> We are actually interested >> // in tracking only redefinitions of the multidef lrgs in the same >> register. For that matter it's enough >> // to track changes in the base register only and ignore other effects >> of multi-register lrgs >> // and fat projections. It is also ok to ignore defs coming from >> singledefs. After an implicit >> // overwrite by one of those our register is guaranteed to be used by >> another lrg and we won't >> // attempt to merge it. >> >> >> I also tightened the code following the comment to do updates only for >> multidefs lrgs (postaloc.cpp): >> >> lrg = _lrg_map.live_range_id(n); >> - if (lrg > 0) { >> + if (lrg > 0 && lrgs(lrg).is_multidef()) { >> OptoReg::Name reg = lrgs(lrg).reg(); >> reg2defuse.at >(reg).clear(); >> } >> >> Webrev updated in place. >> >> Sorry about the last minute change. >> igor >> >>> On Jan 18, 2015, at 12:25 PM, Igor Veresov >>> >> wrote: >>> >>> After register allocation we may end up with nodes in the same block >>> using different inputs that are in fact a part of a multidef lrg. >>> Since that would confuse the scheduler, the post-allocation copy >>> removal attempts to select a one of the inputs and replace all the >>> uses that refer to the same value within the block. 
That works most of >>> the time except when we try to replace an input coming from a phi that >>> has the only user. In that case the phi goes dead along with the spill >>> copy that merges the values, which produces incorrect code. >>> Unfortunately there is no way to make a proper selection of a reaching >>> def - the easiest counter example is having to select from two phis. >>> >>> The solution to the problem is to introduce a node that acts like a >>> phi but is without control (or rather something like MergeMem) that >>> would merge the defs (that are really the same value and are a part of >>> a multidef lrg). The following change adds a new node (MachMerge) and >>> a pass after the post-allocation copy removal to insert them when >>> needed. Even though it?s a separate pass it?s a very fast linear >>> traversal. >>> >>> Webrev: http://cr.openjdk.java.net/~iveresov/8068881/webrev.01/ >>> >>> Tested with the failing method in weblogic, jprt, CTW >>> >>> Thanks, >>> igor -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Mon Jan 19 20:15:42 2015 From: john.r.rose at oracle.com (John Rose) Date: Mon, 19 Jan 2015 12:15:42 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BD3723.4070405@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> Message-ID: <8A04F1FE-B3B8-4FB8-B9CA-EEB1EB5E658B@oracle.com> On Jan 19, 2015, at 8:56 AM, Zolt?n Maj? wrote: > ... >> How have you regression tested this? > > I used our performance infrastructure. I collected performance data on 6 different architectures for 41 benchmark programs/suites. 
The data show that per-method compilation thresholds do not result in a statistically significant change of performance. One benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I think that is negligible. > >> Have you verified that the compilation sequence doesn't change for a non-trivial use case? A slip in the assembly code (for example) might cause a comparison against a garbage mask or limit that could cause compilation decisions to go crazy quietly. I didn't spot any such bug, but they are hard to spot and sometimes quiet. > > I did extensive manual testing on all architectures targeted by the patch. > > I checked that a target method (a method for which per-method compilation thresholds have been specified) is indeed compiled sooner/later than methods for which global compilation thresholds are in effect. I ran tests for all combinations of +/-TieredCompilation, +/-ProfileInterpreter, and +/-UseOnStackReplacement > on each architecture that we support. While working on this issue I've discovered and reported two interpreter-related bugs, 8068652 and 8068505. > > I cannot, unfortunately, guarantee that my changes are error-free, but I did my best to catch any possible error that I can think of and that I can check. No, but you have clearly done the "due diligence". Thanks for explaining. The comments in globals.hpp are good, but watch the spacing. String concatenation will cause some words to run into each other. This isn't very important, but it might show up with java -XX:+PrintFlagsWithComments (in a debug build). > >> In MethodCounters, I think the conditional scaling of _interpreter_backward_branch_limit is going to confuse someone, at some point. It should be scaled, unconditionally, like its sibling variables. (That would remove another somewhat verbose initialization branch!) 
> > The value of the *global* interpreter backward branch limit (InvocationCounter::InterpreterBackwardBranchLimit) is computed based on the value of CompileThreshold (in InvocationCounter::reinitialize()). The backward branch limit is assigned a different value depending on whether interpreter profiling is enabled or not. > > The logic of the *per-method* interpreter backward branch limit (MethodCounters::_interpreter_backward_branch_limit) is intended to be identical to that of the global branch limit. Therefore, I'm afraid that we have to keep the if-then-else construct initializing _interpreter_backward_branch_limit in methodCounters.hpp. I hope that I understood right what you've previously suggested. I see; you are duplicating an odd distinction that was already there. I still think it is a little confusing, but it's not something you can change easily. > Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ Reviewed; it's good. One last nit, and this doesn't need re-review (from me): Please use the macro right_n_bits instead of a C expression (1 << x)-1, if possible. There are lots of ways that the hand-written C expression can go wrong, and the macro is safer. One more quote from the style guide: > - Use functions from globalDefinitions.hpp when performing bitwise operations on integers. Do not code directly as C operators, unless they are extremely simple. (Examples: round_to, is_power_of_2, exact_log2.) -- John
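To illustrate the right_n_bits point above: a hand-written (1 << x) - 1 can go wrong quietly, for example when x reaches the width of the operand type (undefined behavior in C++) or when the literal 1 is narrower than the intended result. The sketch below is not HotSpot's macro; low_bits_mask is a hypothetical helper, used here only to show the kind of guard a shared utility can provide.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper (NOT HotSpot's right_n_bits): returns a word with the
// low n bits set. It guards the n >= word-width case, where a plain shift
// would be undefined behavior.
inline uintptr_t low_bits_mask(unsigned n) {
  const unsigned bits_per_word = sizeof(uintptr_t) * CHAR_BIT;
  if (n >= bits_per_word) {
    return ~uintptr_t(0);            // all bits set
  }
  return (uintptr_t(1) << n) - 1;    // shift performed at full word width
}

// Small self-check for the boundary cases discussed above.
inline void low_bits_mask_examples() {
  assert(low_bits_mask(0) == 0);
  assert(low_bits_mask(3) == 0x7);
  assert(low_bits_mask(sizeof(uintptr_t) * CHAR_BIT) == ~uintptr_t(0));
}
```

Keeping the guard in one shared helper means each call site does not have to remember the boundary case.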
From duncan.macgregor at ge.com Mon Jan 19 20:21:59 2015 From: duncan.macgregor at ge.com (MacGregor, Duncan (GE Energy Management)) Date: Mon, 19 Jan 2015 20:21:59 +0000 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54B94766.2080102@oracle.com> References: <54B94766.2080102@oracle.com> Message-ID: Okay, I've done some tests of this with the micro benchmarks for our language & runtime which show pretty much no change except for one test which is now almost 3x slower. It uses nested loops to iterate over an array and concatenate the string-like objects it contains, and replaces elements with these new longer string-like objects. It's a bit of a pathological case, and I haven't seen the same sort of degradation in the other benchmarks or in real applications, but I haven't done serious benchmarking of them with this change. I shall see if the test case can be reduced down to anything simpler while still showing the same performance behaviour, and try adding some compilation logging options to narrow down what's going on. Duncan. On 16/01/2015 17:16, "Vladimir Ivanov" wrote: >http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >https://bugs.openjdk.java.net/browse/JDK-8063137 > >After GuardWithTest (GWT) LambdaForms became shared, profile pollution >significantly distorted compilation decisions. It affected inlining and >hindered some optimizations. It causes significant performance >regressions for Nashorn (on Octane benchmarks). > >Inlining was fixed by 8059877 [1], but it didn't cover the case when a >branch is never taken. It can cause missed optimization opportunity, and >not just increase in code size. For example, non-pruned branch can break >escape analysis.
> >Currently, there are 2 problems: > - branch frequencies profile pollution > - deoptimization counts pollution > >Branch frequency pollution hides from JIT the fact that a branch is >never taken. Since GWT LambdaForms (and hence their bytecode) are >heavily shared, but the behavior is specific to MethodHandle, there's no >way for JIT to understand how particular GWT instance behaves. > >The solution I propose is to do profiling in Java code and feed it to >JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >profiling info is stored. Once JIT kicks in, it can retrieve these >counts, if corresponding MethodHandle is a compile-time constant (and it >is usually the case). To communicate the profile data from Java code to >JIT, MethodHandleImpl::profileBranch() is used. > >If GWT MethodHandle isn't a compile-time constant, profiling should >proceed. It happens when corresponding LambdaForm is already shared, for >newly created GWT MethodHandles profiling can occur only in native code >(dedicated nmethod for a single LambdaForm). So, when compilation of the >whole MethodHandle chain is triggered, the profile should be already >gathered. > >Overriding branch frequencies is not enough. Statistics on >deoptimization events is also polluted. Even if a branch is never taken, >JIT doesn't issue an uncommon trap there unless corresponding bytecode >doesn't trap too much and doesn't cause too many recompiles. > >I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >sees it on some method, Compile::too_many_traps & >Compile::too_many_recompiles for that method always return false. It >allows JIT to prune the branch based on custom profile and recompile the >method, if the branch is visited. > >For now, I wanted to keep the fix very focused. The next thing I plan to >do is to experiment with ignoring deoptimization counts for other >LambdaForms which are heavily shared. 
I already saw problems caused by >deoptimization counts pollution (see JDK-8068915 [2]). > >I plan to backport the fix into 8u40, once I finish extensive >performance testing. > >Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >Octane). > >Thanks! > >PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >[2] almost completely recovers peak performance after LambdaForm sharing >[3]. There's one more problem left (non-inlined MethodHandle invocations >are more expensive when LFs are shared), but it's a story for another day. > >Best regards, >Vladimir Ivanov > >[1] https://bugs.openjdk.java.net/browse/JDK-8059877 > 8059877: GWT branch frequencies pollution due to LF sharing >[2] https://bugs.openjdk.java.net/browse/JDK-8068915 >[3] https://bugs.openjdk.java.net/browse/JDK-8046703 > JEP 210: LambdaForm Reduction and Caching >_______________________________________________ >mlvm-dev mailing list >mlvm-dev at openjdk.java.net >http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From zoltan.majo at oracle.com Tue Jan 20 08:41:36 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 20 Jan 2015 09:41:36 +0100 Subject: [9] RFR(S): 8069162: quarantine serviceability/dcmd/compiler/CompilerQueueTest.java In-Reply-To: <54BD4C5F.6000706@oracle.com> References: <54B901DF.9040107@oracle.com> <54B9027C.3080506@oracle.com> <54B91ECC.8010100@oracle.com> <54BCC917.1090209@oracle.com> <54BD4C5F.6000706@oracle.com> Message-ID: <54BE14C0.20807@oracle.com> Thank you, Albert and Vladimir, for the review! Best regards, Zoltan On 01/19/2015 07:26 PM, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 1/19/15 1:06 AM, Zolt?n Maj? wrote: >> Hi Albert, >> >> >> On 01/16/2015 03:23 PM, Albert Noll wrote: >>> Hi Zoltan, >>> >>> I agree to quarantine the test. >>> >>> Shouldn't quarantining the test look like this? >>> @ignore 8069160 >> >> you're right, thank for pointing that out. 
>> >> Here is the new webrev: >> >> http://cr.openjdk.java.net/~zmajo/8069162/webrev.01/ >> >> Best wishes, >> >> >> Zoltan >> >>> >>> Best, >>> Albert >>> >>> On 01/16/2015 01:22 PM, Zoltán Majó wrote: >>>> Hi, >>>> >>>> >>>> small mistake in the mail I've just sent out: >>>> >>>> On 01/16/2015 01:19 PM, Zoltán Majó wrote: >>>>> Solution: The problem with the test is addressed by JDK-806912. >>>>> Quarantine the test until 806912 is fixed. >>>> >>>> The problem with the test is addressed by *JDK-8069160*. Quarantine >>>> the test until *8069160* is fixed. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8069160 >>>> >>>> I'm sorry for the noise. >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>>> >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8069162/webrev.00/ >>>>> >>>>> Testing: manual testing, JPRT with the option -onlytests >>>>> '.*hotspot_serviceability.*' >>>>> >>>>> Thank you and best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>>> >>> >> From andrew_nuss at yahoo.com Tue Jan 20 10:22:56 2015 From: andrew_nuss at yahoo.com (Andy Nuss) Date: Tue, 20 Jan 2015 10:22:56 +0000 (UTC) Subject: safe temp array size that will always prefer TLAB when created Message-ID: <1670520695.3776497.1421749376477.JavaMail.yahoo@jws10673.mail.bf1.yahoo.com> Hi, I have a ctor for a class that creates a staging char[] buffer, which can be smallish; if necessary, a size of 16 suffices, for example. In C++, this would be declared as a simple temporary exact-sized array as a local stack variable. I wish to know what is the maximum length I can use for my char[] temp buffer, and have a reasonable guarantee that it will go into the Java TLAB when constructed so that its impact on the heap is most like a hypothetical C++ small buffer on the stack. Further, are there conditions under which HotSpot would not locate the smallish array in the TLAB? For example, based on whether or not the reference to the buffer is saved only in a local variable.
In fact, I wish to construct a worker class that in turn constructs the small temp char[], saves it in a member, and the worker is used only in a loop and does not escape the function that creates it, so the worker reference is never kept anywhere but on the stack. Andy From vladimir.x.ivanov at oracle.com Tue Jan 20 12:40:50 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 20 Jan 2015 15:40:50 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: References: <54B94766.2080102@oracle.com> Message-ID: <54BE4CD2.30805@oracle.com> Duncan, thanks a lot for giving it a try! If you plan to spend more time on it, please, apply 8068915 [1] as well. I saw huge intermittent performance regressions due to a continuous deoptimization storm. You can look into -XX:+LogCompilation output and look for repeated deoptimization events in steady state w/ Action_none. Also, there are deoptimization statistics in the log (at least, in jdk9). They are located right before the compilation_log tag. Thanks again for the valuable feedback! Best regards, Vladimir Ivanov [1] http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 On 1/19/15 11:21 PM, MacGregor, Duncan (GE Energy Management) wrote: > Okay, I've done some tests of this with the micro benchmarks for our > language & runtime which show pretty much no change except for one test > which is now almost 3x slower. It uses nested loops to iterate over an > array and concatenate the string-like objects it contains, and replaces > elements with these new longer string-like objects. It's a bit of a > pathological case, and I haven't seen the same sort of degradation in the > other benchmarks or in real applications, but I haven't done serious > benchmarking of them with this change.
> > I shall see if the test case can be reduced down to anything simpler while > still showing the same performance behaviour, and try add some compilation > logging options to narrow down what?s going on. > > Duncan. > > On 16/01/2015 17:16, "Vladimir Ivanov" > wrote: > >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >> https://bugs.openjdk.java.net/browse/JDK-8063137 >> >> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >> significantly distorted compilation decisions. It affected inlining and >> hindered some optimizations. It causes significant performance >> regressions for Nashorn (on Octane benchmarks). >> >> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >> branch is never taken. It can cause missed optimization opportunity, and >> not just increase in code size. For example, non-pruned branch can break >> escape analysis. >> >> Currently, there are 2 problems: >> - branch frequencies profile pollution >> - deoptimization counts pollution >> >> Branch frequency pollution hides from JIT the fact that a branch is >> never taken. Since GWT LambdaForms (and hence their bytecode) are >> heavily shared, but the behavior is specific to MethodHandle, there's no >> way for JIT to understand how particular GWT instance behaves. >> >> The solution I propose is to do profiling in Java code and feed it to >> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >> profiling info is stored. Once JIT kicks in, it can retrieve these >> counts, if corresponding MethodHandle is a compile-time constant (and it >> is usually the case). To communicate the profile data from Java code to >> JIT, MethodHandleImpl::profileBranch() is used. >> >> If GWT MethodHandle isn't a compile-time constant, profiling should >> proceed. 
It happens when corresponding LambdaForm is already shared, for >> newly created GWT MethodHandles profiling can occur only in native code >> (dedicated nmethod for a single LambdaForm). So, when compilation of the >> whole MethodHandle chain is triggered, the profile should be already >> gathered. >> >> Overriding branch frequencies is not enough. Statistics on >> deoptimization events is also polluted. Even if a branch is never taken, >> JIT doesn't issue an uncommon trap there unless corresponding bytecode >> doesn't trap too much and doesn't cause too many recompiles. >> >> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >> sees it on some method, Compile::too_many_traps & >> Compile::too_many_recompiles for that method always return false. It >> allows JIT to prune the branch based on custom profile and recompile the >> method, if the branch is visited. >> >> For now, I wanted to keep the fix very focused. The next thing I plan to >> do is to experiment with ignoring deoptimization counts for other >> LambdaForms which are heavily shared. I already saw problems caused by >> deoptimization counts pollution (see JDK-8068915 [2]). >> >> I plan to backport the fix into 8u40, once I finish extensive >> performance testing. >> >> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >> Octane). >> >> Thanks! >> >> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >> [2] almost completely recovers peak performance after LambdaForm sharing >> [3]. There's one more problem left (non-inlined MethodHandle invocations >> are more expensive when LFs are shared), but it's a story for another day. 
>> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >> 8059877: GWT branch frequencies pollution due to LF sharing >> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >> JEP 210: LambdaForm Reduction and Caching >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > _______________________________________________ > mlvm-dev mailing list > mlvm-dev at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > From roland.westrelin at oracle.com Tue Jan 20 13:03:30 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Tue, 20 Jan 2015 14:03:30 +0100 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <54BD5AA3.6050005@oracle.com> References: <54B959BF.1030709@oracle.com> <182C621D-3DA4-4F71-9832-2A036D0A4120@oracle.com> <54BD5AA3.6050005@oracle.com> Message-ID: <4F0DB021-7B33-4476-8ACA-3ED93425444A@oracle.com> > Roland, thanks for the feedback! > >>> The fix is to (1) forbid changing uncommon trap action under the hood, and (2) consult Compile::too_many_recompiles when adding a speculative guard. >> >> I?m not sure I understand what code (1) above is referring to in your webrev. > The fix is based on 8063137 which I've sent for review earlier. > GraphKit::uncommon_trap_exact [1] delegates to GraphKit::uncommon_trap(..., /*keep_exact_action=*/true). keep_exact_action guards the logic which rewrites the action. > >> Also, why is the problem restricted to speculative traps? Wouldn?t the same checks be required for non speculative traps as well? > Other trap types could be affected as well, but speculative traps and unstable_if are the main source of action transitions. 
My experiments on Octane show that all cases where action substitution happens involve one of these trap reasons. > > With this change I wanted to address the speculative traps case. I'll experiment with making unstable_if traps exact as well and get back to you with an updated webrev. > > I'd prefer to change the default for keep_exact_action to true (and fix only the case mentioned in GraphKit::uncommon_trap [3]), but since I consider backporting the fix into 8u40, I want to keep it very focused. Thanks for the explanation, Vladimir. That looks good to me as it is. Roland. From zoltan.majo at oracle.com Tue Jan 20 13:29:32 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Tue, 20 Jan 2015 14:29:32 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BD4EC1.5060607@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <54BD4EC1.5060607@oracle.com> Message-ID: <54BE583C.3030700@oracle.com> Hi Vladimir, thanks for the review! On 01/19/2015 07:36 PM, Vladimir Kozlov wrote: > What was changed in globalDefinitions.hpp? Webrev shows nothing > (sometimes it does not show spacing changes). Yes, it was some whitespace I've removed around line 1150: - uintptr_t p = 1; + uintptr_t p = 1; The newest webrev does not show that either. In addition to that, I've updated two comments in the newest webrev. > In arguments.cpp, please, add check that CompileThresholdScaling is > not negative. I added the check. If CompileThresholdScaling is negative, we set its value to 1.0 (the default value). Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/ The webrev link is the same as the one in my latest reply to John. All JPRT tests pass. Thank you!
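The negative-value check described above can be sketched as follows. This is only an illustration of the stated behavior (a negative CompileThresholdScaling falls back to the default of 1.0); it assumes that scaling is a plain multiplication of the threshold, and the function names are hypothetical, not the ones used in arguments.cpp or in the webrev.

```cpp
#include <cassert>

// Hypothetical helper: normalize a user-supplied scaling factor.
// Negative values are replaced by the default value, 1.0.
inline double sanitize_threshold_scaling(double scale) {
  const double default_scale = 1.0;
  return (scale < 0.0) ? default_scale : scale;
}

// Hypothetical helper: apply the scaling factor to a compile threshold.
// Assumption: scaling is a plain multiplication, truncated to an integer.
inline int scale_threshold(int threshold, double scale) {
  return static_cast<int>(threshold * sanitize_threshold_scaling(scale));
}

// Small self-check of the cases mentioned in the thread.
inline void scale_threshold_examples() {
  assert(scale_threshold(10000, -2.0) == 10000);  // negative: default 1.0 is used
  assert(scale_threshold(10000, 0.5) == 5000);    // smaller factor: lower threshold
  assert(scale_threshold(10000, 2.0) == 20000);   // larger factor: higher threshold
}
```

Under this assumption, a factor below one lowers the threshold and a factor above one raises it.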
Best regards, Zoltan > > Otherwise this looks good to me. > > Thanks, > Vladimir > > On 1/19/15 8:56 AM, Zoltán Majó wrote: >> Hi John, >> >> >> On 01/16/2015 08:59 PM, John Rose wrote: >>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó >>> wrote: >>>> Here is the new webrev: >>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ >>> Reviewed, with some comments. >> >> thank you for your review and for the feedback! Please see detailed >> comments below. >> >>> Overall, I like the cleanups along the way. >>> >>> The basic idea of replacing a hard-coded 'mask' with an addressable >>> variable is sound and nicely executed. >>> >>> I suppose that idea by itself is "S" small, but this really is a "M" >>> or even "L" change, as Vladimir says, especially >>> since the enhanced logic is spread all around many files. >> >> I agree that this is not a small change any more, but I would like to >> keep the subject of this discussion unchanged so >> that we don't move to a different thread (unless you or Vladimir >> think it is better to change it). >> >>> How have you regression tested this? >> >> I used our performance infrastructure. I collected performance data >> on 6 different architectures for 41 benchmark >> programs/suites. The data show that per-method compilation thresholds >> do not result in a statistically significant >> change of performance. One benchmark, Footprint3-Client, degrades >> ~0.5% on the X86 Client VM, but I think that is >> negligible. >> >>> Have you verified that the compilation sequence doesn't change for a >>> non-trivial use case? A slip in the assembly >>> code (for example) might cause a comparison against a garbage mask >>> or limit that could cause compilation decisions to >>> go crazy quietly. I didn't spot any such bug, but they are hard to >>> spot and sometimes quiet. >> >> I did extensive manual testing on all architectures targeted by the >> patch.
>> >> I checked that a target method (a method for which per-method >> compilation thresholds have been specified) is indeed >> compiled sooner/later than methods for which global compilation >> thresholds are in effect. I ran tests for all >> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and >> +/-UseOnStackReplacement >> on each architecture that we support. While working on this issue >> I've discovered and reported two interpreter-related >> bugs, 8068652 and 8068505. >> >> I cannot, unfortunately, guarantee that my changes are error-free, >> but I did my best to catch any possible error that I >> can think of and that I can check. >> >>> In the sparc assembly code (more than one file), the live range of >>> Rcounters has increased, since it is used to supply >>> limits as well as to update the counter (which happens early in the >>> code). >>> >>> To make it easier to maintain the code, I suggest renaming Rcounters >>> to G3_method_counters. >>> (As you can see, when a register has a logical name but has a >>> complicated live range, we add the hardware name is to >>> the logical name, to make it easier to spot interfering uses, when >>> manually editing code.) >> >> Thanks for the suggestion, I've changed the register's name. >> >>> If scale==0.0 is a valid input checked specially in compileBroker, >>> perhaps the effect of a zero should be documented? >>> Suggest adding to globals.hpp: >>> "; values greater than one delay counter overflow; zero forces >>> counter overflow immediately" >>> "; can be set as a per-method option." >> >> If CompileThresholdScaling is set to 0.0, all methods are interpreted. >> >> The reason is that CompileThreshold==0 has been historically >> equivalent to setting -Xint to true. Setting >> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0 and >> we wanted to keep VM behavior consistent when we've >> added support for the global CompileThresholdScaling flag in >> JDK-8059604. 
>> >> If you think it would be good if we changed the meaning of >> CompileThreshold==0, please let me know. I'll file an RFE for >> it and change it. As CompileThreshold is a product flag, we will need >> CCC approval for that change. >> >> As you've suggested, I added a comment to globals.hpp that precisely >> describes the behavior of CompileThresholdScaling. >> >>> Question: What if both a global scale and a method option for scale >>> are both set? Is the global one ignored? Do >>> they multiply? It's worth specifying it explicitly (and checking >>> that the logic DTRT). >> >> Global and per-method values multiply. That behavior is now described >> in the comment in globals.hpp. >> >>> Question: How are the log values (like Tier0InvokeNotifyFreqLog or >>> the result of get_scaled_freq_log) constrained to >>> fit in a reasonable range? (I would suppose that range is 0..31 or >>> less.) Should we have range clipping code in the >>> scaler functions? >> >> Currently InvocationCounter::number_of_count_bits=29 bits are >> reserved for counting. As a result, the value of the log2 >> of the notification frequency can be at most 30. I updated the source >> code accordingly. >> >>> It would give notably simpler code, in MethodData::init, to use a >>> branch-free setup of tier_0_invoke_notify_freq_log >>> etc. Set scale = 1.0 and then update it conditionally. Special-case >>> scale=1.0 in get_scaled_freq_log to taste. >> >> I did that. >> >>> Same comment about less-branchy code for methodCounters.hpp. (It's >>> better to have a one-way branch that sets up >>> 'scale' than a two-way branch with duplicate setups of one or more >>> variables.) >> >> I changed the code according to your suggestions. >> >>> In MethodCounters, I think the conditional scaling of >>> _interpreter_backward_branch_limit is going to confuse someone, >>> at some point. It should be scaled, unconditionally, like its >>> sibling variables. 
(That would remove another somewhat >>> verbose initialization branch!) >> >> The value of the *global* interpreter backward branch limit >> (InvocationCounter::InterpreterBackwardBranchLimit) is >> computed based on the value of CompileThreshold (in >> InvocationCounter::reinitialize()). The backward branch limit is >> assigned a different value depending on whether interpreter profiling >> is enabled or not. >> >> The logic of the *per-method* interpreter backward branch limit >> (MethodCounters::_interpreter_backward_branch_limit) is >> intended to be identical to that of the global branch limit. >> Therefore, I'm afraid that we have to keep the if-then-else >> construct initializing _interpreter_backward_branch_limit in >> methodCounters.hpp. I hope that I understood correctly what >> you've previously suggested. >> >>> Small style nit: The noise-word "get_" is discouraged in the style >>> doc: >>> >>>> * Getter accessor names are noun phrases, with no "get_" noise >>>> word. Boolean getters can also begin with "is_" or >>>> "has_". >>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>> Arguments.cpp follows this rule partially (in older code maybe?). >>> It would be better to decrease counter-examples to >>> the rule instead of increase them. >>> >>> Bigger style nit: Since the functions are not getting a preset >>> value (from the arguments) but rather normalizing a >>> provided argument value, I suggest naming them >>> "scale_compile_threshold" (i.e., a verb phrase instead of a noun >>> phrase). Again from the style doc: >>> >>>> * Other method names are verb phrases, as if commands to the >>>> receiver. >> >> I did not know that, thanks for pointing it out to me. I changed >> >> get_scaled_compile_threshold -> scaled_compile_threshold >> get_scaled_freq_log -> scaled_freq_log.
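As a rough illustration of the scaling rules discussed in this thread (global and per-method scales multiply, a scale of 0.0 drives the threshold to 0, and the log2 of the notification frequency is capped at 30), a minimal C++ sketch might look like the following. It is not the code from the webrev: effective_scale, kMaxFreqLog, and the use of std::int64_t in place of intx are assumptions made for illustration, and negative scales are assumed to be rejected before these functions are called.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed cap, taken from the thread: the log2 of the notification
// frequency can be at most 30.
constexpr int kMaxFreqLog = 30;

// Hypothetical helper: the thread states that the global and the
// per-method CompileThresholdScaling values multiply.
double effective_scale(double global_scale, double method_scale) {
    return global_scale * method_scale;
}

// Scales a compile threshold. A scale of 0.0 yields 0, which the thread
// describes as "all methods are interpreted".
// Assumes scale >= 0 and that the product fits in 64 bits.
std::int64_t scaled_compile_threshold(std::int64_t threshold, double scale) {
    assert(scale >= 0.0);
    if (scale == 1.0) return threshold;  // common case, no rounding
    return static_cast<std::int64_t>(static_cast<double>(threshold) * scale);
}

// Scales a frequency given as a log2 value and clips the result to
// [0, kMaxFreqLog].
int scaled_freq_log(int freq_log, double scale) {
    assert(freq_log >= 0 && freq_log <= kMaxFreqLog);
    assert(scale >= 0.0);
    if (scale == 1.0) return freq_log;
    double freq = std::ldexp(1.0, freq_log) * scale;  // 2^freq_log * scale
    if (freq < 1.0) return 0;
    if (freq >= std::ldexp(1.0, kMaxFreqLog)) return kMaxFreqLog;
    std::int64_t v = static_cast<std::int64_t>(freq);
    int log = 0;
    while (v > 1) { v >>= 1; ++log; }  // floor(log2(v))
    return log;
}
```

Keeping the clipping inside the scaler means callers never see a log value outside the range the counters can represent, which is the concern raised in the review.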
>> >>> Since you are providing overloads of the scaling functions, the >>> header file should either contain inline code for the >>> convenience methods, or else document how the optional argument >>> ('scale') defaults. I'd prefer inline code, since it >>> is simple. It's as much text to document with a comment as just to >>> supply the inline overload. >> >> I inlined the convenience methods into the header file. >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ >> >>> >>> As I said before, nice work! >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> >>> >>> -- John >> From zoltan.majo at oracle.com Tue Jan 20 13:29:36 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 20 Jan 2015 14:29:36 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <8A04F1FE-B3B8-4FB8-B9CA-EEB1EB5E658B@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <8A04F1FE-B3B8-4FB8-B9CA-EEB1EB5E658B@oracle.com> Message-ID: <54BE5840.7050505@oracle.com> Hi John, thank you for the feedback! On 01/19/2015 09:15 PM, John Rose wrote: > > The comments in globals.hpp are good, but watch the spacing. String > concatenation will cause some words to run into each other. This > isn't very important, but it might show up with java > -XX:+PrintFlagsWithComments (in a debug build). Thank you for catching that. I updated the comments and now they print correctly with -XX:+PrintFlagsWithComments. >> Here is the new >> webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ >> > Reviewed; it's good. > > One last nit, and this doesn't need re-review (from me): Please use > the macro right_n_bits instead of a C expression (1 << x)-1, if > possible.
There are lots of ways that the hand-written C expression > can go wrong, and the macro is safer. One more quote from the style > guide: > >> * Use functions from globalDefinitions.hpp when performing bitwise >> operations on integers. Do not code directly as C operators, unless >> they are extremely >> simple. (Examples: round_to, is_power_of_2, exact_log2.) I changed the code to use the right_n_bits() and the nth_bit() macros wherever it makes sense. Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/ The webrev link is the same as the one in my latest reply to Vladimir. Thank you for the review! Best regards, Zoltan > > -- John > From vitalyd at gmail.com Tue Jan 20 14:04:57 2015 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 20 Jan 2015 09:04:57 -0500 Subject: safe temp array size that will always prefer TLAB when created In-Reply-To: <1670520695.3776497.1421749376477.JavaMail.yahoo@jws10673.mail.bf1.yahoo.com> References: <1670520695.3776497.1421749376477.JavaMail.yahoo@jws10673.mail.bf1.yahoo.com> Message-ID: See this SO answer for TLAB allocation: http://stackoverflow.com/a/24620205 TLAB allocation doesn't depend on whether the reference is kept in a local variable or not; perhaps you're thinking of escape analysis, which can stack allocate objects in some cases. sent from my phone On Jan 20, 2015 5:27 AM, "Andy Nuss" wrote: > Hi, > > I have a ctor for a class that creates a staging char[] buffer, which can > be smallish, certainly, if necessary a size of 16 suffices for example. > > In C++, this would be declared as a simple temporary exact sized array as > a local stack variable. I wish to know what is the maximum length I can > use for my char[] temp buffer, and have a reasonable guarantee that it will
> > Further, are there conditions under which hotspot would not locate the > smallish array in TLAB? For example, based on whether or not the reference > to the buffer is saved only in a local variable. In fact, I wish to > construct a worker class that in turn constructs the small temp char[], > saves it in a member, and the worker is used only in a loop and does not > escape the function that creates it, so the worker reference is never kept > anywhere but on the stack. > > Andy > From pavel.chistyakov at oracle.com Tue Jan 20 17:37:05 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Tue, 20 Jan 2015 20:37:05 +0300 Subject: RFR(S): 8069125: compiler/codecache/stress tests timeout in nightlies Message-ID: <54BE9241.2060007@oracle.com> Hi all, please review the fix for JDK-8069125. webrev: http://cr.openjdk.java.net/~pchistyakov/8069125/webrev.00/ Problem: the new stress tests for the code cache sometimes fail due to a timeout. There are two potential problems: * OverloadCompileQueueTest failed in nightlies in Xcomp mode. This behavior is caused by the lock-unlocker thread, which locks compilation to create a pause during which we can fill the compilation queue. This thread starts at the beginning of the test and runs indefinitely as a daemon. Currently, compilation locking is implemented with a MonitorLockerEx object that the compiler thread waits on. To guard against spurious wakeups, wait is called inside a while loop that checks the volatile boolean flag Whitebox::compilation_locked. Without any pause between iterations of the lock-unlocker thread, the compiler thread may be unable to exit the loop: notify_all signals it to leave wait and continue execution, but by the time it re-checks the flag, Whitebox::compilation_locked has become true again. (See compileBroker.cpp:1967-1971 and whitebox.cpp:802-810 for details of the compilation locking process.) * We got one RandomAllocationTest failure caused by a timeout. But it happened on a slow system, where jtreg didn't have enough time to do its post-processing work after test execution. This can be solved by decreasing the test execution time from 90% of the overall jtreg timeout (set up at test initialization) to 80%. Solution: * add a pause between iterations of the lock-unlocker thread in OverloadCompileQueueTest to give the compiler a chance to work (critical in Xcomp mode) * decrease the test execution time to 80% of the timeout, which leaves more time for jtreg and for VM init and exit, and should prevent timeouts in the other stress tests, if any Testing: manual locally, JPRT ---------------- Thanks, Pavel From pavel.chistyakov at oracle.com Tue Jan 20 17:37:12 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Tue, 20 Jan 2015 20:37:12 +0300 Subject: RFR(XS): 8066998: [TESTBUG] compiler/whitebox/ForceNMethodSweepTest.java : sweep shouldn't increase usage Message-ID: <54BE9248.9010106@oracle.com> Hi all, please review the small fix for JDK-8066998. webrev: http://cr.openjdk.java.net/~pchistyakov/8066998/webrev.00/ Problem: ForceNMethodSweepTest sometimes fails with the message "sweep shouldn't increase usage", because the total counted code cache usage increases after sweeping instead of decreasing. After some investigation (thanks, Albert Noll) we found that this behavior can be caused by background compilation: it happens after the controlled sweep and unexpectedly increases code cache usage. Solution: disable background compilation for this test. Testing: manual, locally (about 1000 runs with and without background compilation) ---------------- Thanks, Pavel -------------- next part -------------- An HTML attachment was scrubbed...
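The lock-unlocker problem in the 8069125 review above can be sketched in portable C++ as follows. This is an illustration only, not the WhiteBox or CompileBroker code: CompilationLock, run_lock_unlocker, and the hold and pause durations are hypothetical, and std::condition_variable stands in for the MonitorLockerEx mentioned in the message. The waiter re-checks the flag in a loop to tolerate spurious wakeups; the pause after each unlock is what gives it a window to observe the flag as false.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the VM-side compilation lock.
class CompilationLock {
public:
    void lock() {
        std::lock_guard<std::mutex> guard(mu_);
        locked_ = true;
    }
    void unlock() {
        {
            std::lock_guard<std::mutex> guard(mu_);
            locked_ = false;
        }
        cv_.notify_all();
    }
    // Compiler-thread side: the loop guards against spurious wakeups, but it
    // also means a wakeup is wasted if the flag is already true again.
    void wait_until_unlocked() {
        std::unique_lock<std::mutex> lk(mu_);
        while (locked_) cv_.wait(lk);
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    bool locked_ = false;
};

// Runs `iterations` lock/unlock cycles against one waiting "compiler" thread
// and returns how many times that thread got past the wait loop.
int run_lock_unlocker(int iterations,
                      std::chrono::milliseconds hold,
                      std::chrono::milliseconds pause) {
    assert(iterations > 0);
    CompilationLock lock;
    std::atomic<bool> done{false};
    std::atomic<int> passes{0};

    std::thread compiler([&] {
        do {
            lock.wait_until_unlocked();
            passes.fetch_add(1);
            std::this_thread::yield();
        } while (!done.load());
    });

    for (int i = 0; i < iterations; ++i) {
        lock.lock();
        std::this_thread::sleep_for(hold);   // window for filling the queue
        lock.unlock();
        std::this_thread::sleep_for(pause);  // window for the compiler to run
    }
    done.store(true);  // the lock is left unlocked, so the waiter can exit
    compiler.join();
    return passes.load();
}
```

With a pause of zero, the unlocker can set the flag again before the woken thread re-acquires the mutex, which is the starvation the message describes; in this sketch the waiter is then only guaranteed to get through once, after the final unlock.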
From vladimir.kozlov at oracle.com Tue Jan 20 17:48:42 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 20 Jan 2015 09:48:42 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BE583C.3030700@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <54BD4EC1.5060607@oracle.com> <54BE583C.3030700@oracle.com> Message-ID: <54BE94FA.2060806@oracle.com> We usually exit with error message if a flag's value is incorrect - vm_exit_during_initialization(). We never overwrite incorrect value and hide what we did. Thanks, Vladimir On 1/20/15 5:29 AM, Zoltán Majó wrote: >> In arguments.cpp, please, add check that CompileThresholdScaling is not negative. > > I added the check. If CompileThresholdScaling is negative, we set its value to 1.0 (the default value). > > Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/ > From vladimir.kozlov at oracle.com Tue Jan 20 17:52:27 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 20 Jan 2015 09:52:27 -0800 Subject: RFR(XS): 8066998: [TESTBUG] compiler/whitebox/ForceNMethodSweepTest.java : sweep shouldn't increase usage In-Reply-To: <54BE9248.9010106@oracle.com> References: <54BE9248.9010106@oracle.com> Message-ID: <54BE95DB.5020208@oracle.com> Looks good. Thanks, Vladimir On 1/20/15 9:37 AM, Pavel Chistyakov wrote: > Hi all, > > please review small fix for JDK-8066998 . > > webrev: http://cr.openjdk.java.net/~pchistyakov/8066998/webrev.00/ > > Problem: sometimes ForceNMethodSweepTest fails w/ "sweep shouldn't increase usage" message that is caused by increasing > of total counted code cache usage after sweeping instead of decreasing. After some investigations (thanks Albert Noll) > we found that this behavior can be forced by background compilation. It happens after controlled sweep process and > increases code cache usage unexpectedly. > > Solution: disable background compilation for this test. > > Testing: manual locally (about 1000 runs w/ and w/o background compilation) > > ---------------- > Thanks, > Pavel From vladimir.kozlov at oracle.com Tue Jan 20 17:59:07 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 20 Jan 2015 09:59:07 -0800 Subject: RFR(S): 8069125: compiler/codecache/stress tests timeout in nightlies In-Reply-To: <54BE9241.2060007@oracle.com> References: <54BE9241.2060007@oracle.com> Message-ID: <54BE976B.30201@oracle.com> Seems reasonable. Thanks, Vladimir On 1/20/15 9:37 AM, Pavel Chistyakov wrote: > Hi all, > > please review fix for JDK-8069125 .
> > webrev: http://cr.openjdk.java.net/~pchistyakov/8069125/webrev.00/ > > Problem: new stress tests for code cache sometimes fails due to timeout. There are two potential problems: > > * OverloadCompileQueueTest failed in nightlies in Xcomp mode. This behavior forced by lock-unlocker thread that is > used to lock compilation and create a pause during which we can fill compilation queue. This thread starts at test > beginning and works infinitely as daemon. For now compilation locking implemented using MonitorLockerEx object and > wait on it. To prevent spurious wake up wait function is called inside while loop checking > Whitebox::compilation_locked volatile boolean flag. W/o any timeout between lock-unlocker thread iterations we get a > situation when compiler thread can not exit the loop because while notify_all signals lock to exit wait and continue > execution Whitebox::compilation_locked flag become true again. (see compileBroker.cpp:1967-1971 and > whitebox.cpp:802-810 for details of compilation locking process) > * we got one RandomAllocationTest failure caused by timeout. But it happened on slow system and there jtreg didn't > have enough time to do its postprocess work after test execution. This can be solved by decreasing test execution > time (for now test gets 90% of overall jtreg timeout that is set up on test initialization) to 80%. 
> > > Solution: > > * add timeout in iterations for lock-unlocker thread in OverloadCompileQueueTest to give a chance for compiler to work > (very critical in Xcomp mode) > * decrease test execution time to 80% that will give more time for jtreg and vm init and exit process to prevent > timeout of other stress tests if any > > > Testing: manual locally, JPRT > > ---------------- > Thanks, > Pavel From zoltan.majo at oracle.com Tue Jan 20 18:46:24 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 20 Jan 2015 19:46:24 +0100 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BE94FA.2060806@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <54BD4EC1.5060607@oracle.com> <54BE583C.3030700@oracle.com> <54BE94FA.2060806@oracle.com> Message-ID: <54BEA280.6010408@oracle.com> Hi Vladimir, thank you for the feedback! On 01/20/2015 06:48 PM, Vladimir Kozlov wrote: > We usually exit with error message if a flag's value is incorrect - > vm_exit_during_initialization(). > We never overwrite incorrect value and hide what we did. I was not aware of that convention. I changed the check in arguments.cpp so that the VM exits if the *global* CompileThresholdScaling flag has a negative value (instead of overwriting it). Currently, CompileCommand=option does not accept negative values for flags of type double, so a *per-method* CompileThresholdScaling cannot have a negative value. 
But the behavior of CompileCommand might change in the future, so I changed the Arguments::scaled_freq_log(intx freq_log, double scale) and Arguments::scaled_compile_threshold(intx threshold, double scale) methods to return the unscaled value of parameters freq_log and threshold, respectively, if scale is negative. I hope this change is fine. Flags specified using CompileCommand=option are always listed on the output at startup, so maybe this does not count as hiding overwritten flag values. Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.05/ Thank you and best regards, Zoltan > > Thanks, > Vladimir > > On 1/20/15 5:29 AM, Zoltán Majó wrote: >> Hi Vladimir, >> >> >> thanks for the review! >> >> On 01/19/2015 07:36 PM, Vladimir Kozlov wrote: >>> What was changed in globalDefinitions.hpp? Webrev shows nothing >>> (sometimes it does not show spacing changes). >> >> Yes, it was some whitespace I've removed around line 1150: >> >> - uintptr_t p = 1; >> + uintptr_t p = 1; >> >> The newest webrev does not show that either. In addition to that, >> I've updated two comments in the newest webrev. >> >>> In arguments.cpp, please, add check that CompileThresholdScaling is >>> not negative. >> >> I added the check. If CompileThresholdScaling is negative, we set its >> value to 1.0 (the default value). >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/ >> >> The webrev link is the same as the one in my latest reply to John. >> >> All JPRT tests pass. >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> >>> >>> Otherwise this looks good to me. >>> >>> Thanks, >>> Vladimir >>> >>> On 1/19/15 8:56 AM, Zoltán Majó wrote: >>>> Hi John, >>>> >>>> >>>> On 01/16/2015 08:59 PM, John Rose wrote: >>>>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó >>>>> wrote: >>>>>> Here is the new webrev: >>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/ >>>>> Reviewed, with some comments. >>>> >>>> thank you for your review and for the feedback!
Please see detailed >>>> comments below. >>>> >>>>> Overall, I like the cleanups along the way. >>>>> >>>>> The basic idea of replacing a hard-coded 'mask' with an >>>>> addressable variable is sound and nicely executed. >>>>> >>>>> I suppose that idea by itself is "S" small, but this really is a >>>>> "M" or even "L" change, as Vladimir says, especially >>>>> since the enhanced logic is spread all around many files. >>>> >>>> I agree that this is not a small change any more, but I would like >>>> to keep the subject of this discussion unchanged so >>>> that we don't move to a different thread (unless you or Vladimir >>>> think it is better to change it). >>>> >>>>> How have you regression tested this? >>>> >>>> I used our performance infrastructure. I collected performance data >>>> on 6 different architectures for 41 benchmark >>>> programs/suites. The data show that per-method compilation >>>> thresholds do not result in a statistically significant >>>> change of performance. One benchmark, Footprint3-Client, degrades >>>> ~0.5% on the X86 Client VM, but I think that is >>>> negligible. >>>> >>>>> Have you verified that the compilation sequence doesn't change for >>>>> a non-trivial use case? A slip in the assembly >>>>> code (for example) might cause a comparison against a garbage mask >>>>> or limit that could cause compilation decisions to >>>>> go crazy quietly. I didn't spot any such bug, but they are hard >>>>> to spot and sometimes quiet. >>>> >>>> I did extensive manual testing on all architectures targeted by the >>>> patch. >>>> >>>> I checked that a target method (a method for which per-method >>>> compilation thresholds have been specified) is indeed >>>> compiled sooner/later than methods for which global compilation >>>> thresholds are in effect. I ran tests for all >>>> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and >>>> +/-UseOnStackReplacement >>>> on each architecture that we support. 
While working on this issue >>>> I've discovered and reported two interpreter-related >>>> bugs, 8068652 and 8068505. >>>> >>>> I cannot, unfortunately, guarantee that my changes are error-free, >>>> but I did my best to catch any possible error that I >>>> can think of and that I can check. >>>> >>>>> In the sparc assembly code (more than one file), the live range of >>>>> Rcounters has increased, since it is used to supply >>>>> limits as well as to update the counter (which happens early in >>>>> the code). >>>>> >>>>> To make it easier to maintain the code, I suggest renaming >>>>> Rcounters to G3_method_counters. >>>>> (As you can see, when a register has a logical name but has a >>>>> complicated live range, we add the hardware name to >>>>> the logical name, to make it easier to spot interfering uses, when >>>>> manually editing code.) >>>> >>>> Thanks for the suggestion, I've changed the register's name. >>>> >>>>> If scale==0.0 is a valid input checked specially in compileBroker, >>>>> perhaps the effect of a zero should be documented? >>>>> Suggest adding to globals.hpp: >>>>> "; values greater than one delay counter overflow; zero forces >>>>> counter overflow immediately" >>>>> "; can be set as a per-method option." >>>> >>>> If CompileThresholdScaling is set to 0.0, all methods are interpreted. >>>> >>>> The reason is that CompileThreshold==0 has been historically >>>> equivalent to setting -Xint to true. Setting >>>> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0 >>>> and we wanted to keep VM behavior consistent when we've >>>> added support for the global CompileThresholdScaling flag in >>>> JDK-8059604. >>>> >>>> If you think it would be good if we changed the meaning of >>>> CompileThreshold==0, please let me know. I'll file an RFE for >>>> it and change it. As CompileThreshold is a product flag, we will >>>> need CCC approval for that change.
>>>> >>>> As you've suggested, I added a comment to globals.hpp that >>>> precisely describes the behavior of CompileThresholdScaling. >>>> >>>>> Question: What if both a global scale and a method option for >>>>> scale are both set? Is the global one ignored? Do >>>>> they multiply? It's worth specifying it explicitly (and checking >>>>> that the logic DTRT). >>>> >>>> Global and per-method values multiply. That behavior is now >>>> described in the comment in globals.hpp. >>>> >>>>> Question: How are the log values (like Tier0InvokeNotifyFreqLog >>>>> or the result of get_scaled_freq_log) constrained to >>>>> fit in a reasonable range? (I would suppose that range is 0..31 >>>>> or less.) Should we have range clipping code in the >>>>> scaler functions? >>>> >>>> Currently InvocationCounter::number_of_count_bits=29 bits are >>>> reserved for counting. As a result, the value of the log2 >>>> of the notification frequency can be at most 30. I updated the >>>> source code accordingly. >>>> >>>>> It would give notably simpler code, in MethodData::init, to use a >>>>> branch-free setup of tier_0_invoke_notify_freq_log >>>>> etc. Set scale = 1.0 and then update it conditionally. >>>>> Special-case scale=1.0 in get_scaled_freq_log to taste. >>>> >>>> I did that. >>>> >>>>> Same comment about less-branchy code for methodCounters.hpp. >>>>> (It's better to have a one-way branch that sets up >>>>> 'scale' than a two-way branch with duplicate setups of one or more >>>>> variables.) >>>> >>>> I changed the code according to your suggestions. >>>> >>>>> In MethodCounters, I think the conditional scaling of >>>>> _interpreter_backward_branch_limit is going to confuse someone, >>>>> at some point. It should be scaled, unconditionally, like its >>>>> sibling variables. (That would remove another somewhat >>>>> verbose initialization branch!) 
>>>> >>>> The value of the *global* interpreter backward branch limit >>>> (InvocationCounter::InterpreterBackwardBranchLimit) is >>>> computed based on the value of CompileThreshold (in >>>> InvocationCounter::reinitialize()). The backward branch limit is >>>> assigned a different value depending on whether interpreter >>>> profiling is enabled or not. >>>> >>>> The logic of the *per-method* interpreter backward branch limit >>>> (MethodCounters::_interpreter_backward_branch_limit) is >>>> intended to be identical to that of the global branch limit. >>>> Therefore, I'm afraid that we have to keep the if-then-else >>>> construct initializing _interpreter_backward_branch_limit in >>>> methodCounters.hpp. I hope that I understood right what >>>> you've previously suggested. >>>> >>>>> Small style nit: The noise-word "get_" is discouraged in the >>>>> style doc: >>>>> >>>>>> • Getter accessor names are noun phrases, with no "get_" noise >>>>>> word. Boolean getters can also begin with "is_" or >>>>>> "has_". >>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>>>> Arguments.cpp follows this rule partially (in older code maybe?). >>>>> It would be better to decrease counter-examples to >>>>> the rule instead of increase them. >>>>> >>>>> Bigger style nit: Since the functions are not getting a preset >>>>> value (from the arguments) but rather normalizing a >>>>> provided argument value, I suggest naming them >>>>> "scale_compile_threshold" (i.e., a verb phrase instead of a noun >>>>> phrase). Again from the style doc: >>>>> >>>>>> • Other method names are verb phrases, as if commands to the >>>>>> receiver. >>>> >>>> I did not know that, thanks for pointing it out to me. I changed >>>> >>>> get_scaled_compile_threshold -> scaled_compile_threshold >>>> get_scaled_freq_log -> scaled_freq_log.
>>>> >>>>> Since you are providing overloads of the scaling functions, the >>>>> header file should either contain inline code for the >>>>> convenience methods, or else document how the optional argument >>>>> ('scale') defaults. I'd prefer inline code, since it >>>>> is simple. It's as much text to document with a comment as just >>>>> to supply the inline overload. >>>> >>>> I inlined the convenience methods into the header file. >>>> >>>> Here is the new webrev: >>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/ >>>> >>>>> >>>>> As I said before, nice work! >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> >>>> >>>> Zoltan >>>> >>>>> >>>>> -- John >>>> >> From vladimir.kozlov at oracle.com Tue Jan 20 19:09:19 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 20 Jan 2015 11:09:19 -0800 Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds) In-Reply-To: <54BEA280.6010408@oracle.com> References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <54BD4EC1.5060607@oracle.com> <54BE583C.3030700@oracle.com> <54BE94FA.2060806@oracle.com> <54BEA280.6010408@oracle.com> Message-ID: <54BEA7DF.9010303@oracle.com> Looks good. Thanks, Vladimir On 1/20/15 10:46 AM, Zoltán Majó wrote: > Hi Vladimir, > > > thank you for the feedback! > > On 01/20/2015 06:48 PM, Vladimir Kozlov wrote: >> We usually exit with error message if a flag's value is incorrect - >> vm_exit_during_initialization(). >> We never overwrite incorrect value and hide what we did. > > I was not aware of that convention. I changed the check in arguments.cpp > so that the VM exits if the *global* CompileThresholdScaling flag has a > negative value (instead of overwriting it).
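The scaling rules discussed in the 8059606 thread (a negative scale leaves the value unscaled, global and per-method scales multiply, a scale of 0.0 drives the threshold to 0, and the log2 of the notification frequency is at most 30) can be sketched as below. This is a Java model of the arithmetic only, not the HotSpot C++ code from the webrev: the class and method names are made up for illustration, and rounding the scaled frequency down to a power of two is an assumption.

```java
// Illustrative model only; the real logic lives in HotSpot's C++ Arguments class.
// All names in this class are hypothetical.
final class ThresholdScalingSketch {
    // Cap mentioned in the thread: log2 of the notification frequency is at most 30.
    static final int MAX_FREQ_LOG = 30;

    // Global and per-method CompileThresholdScaling values multiply.
    static double effectiveScale(double globalScale, double perMethodScale) {
        return globalScale * perMethodScale;
    }

    // A negative scale leaves the threshold unscaled; a scale of 0.0 yields 0.
    static long scaledCompileThreshold(long threshold, double scale) {
        if (scale < 0.0 || scale == 1.0) {
            return threshold;
        }
        return (long) (threshold * scale);
    }

    // For a negative scale or a scale of 1.0 the input is returned unchanged.
    // Otherwise 2^freqLog is scaled and floor(log2) of the result is returned,
    // limited to MAX_FREQ_LOG. Flooring is an assumption of this sketch.
    static int scaledFreqLog(int freqLog, double scale) {
        if (scale < 0.0 || scale == 1.0) {
            return freqLog;
        }
        long scaledFreq = scaledCompileThreshold(1L << freqLog, scale);
        if (scaledFreq <= 0) {
            return 0;
        }
        int log = 63 - Long.numberOfLeadingZeros(scaledFreq);
        return Math.min(log, MAX_FREQ_LOG);
    }
}
```

With a global scale of 2.0 and a per-method scale of 0.5, effectiveScale gives 1.0, so in this model the method keeps its default thresholds.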
From vladimir.x.ivanov at oracle.com Tue Jan 20 19:09:11 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 20 Jan 2015 22:09:11 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> Message-ID: <54BEA7D7.6080008@oracle.com> John, thanks for the review! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/jdk See my answers inline.
On 1/17/15 2:13 AM, John Rose wrote:
> On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov wrote:
>>
>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>> https://bugs.openjdk.java.net/browse/JDK-8063137
>> ...
>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915
>> [2] almost completely recover peak performance after LambdaForm
>> sharing [3]. There's one more problem left (non-inlined MethodHandle
>> invocations are more expensive when LFs are shared), but it's a story
>> for another day.
>
> This performance bump is excellent news. LFs are supposed to express
> emergently common behaviors, like hidden classes. We are much closer to
> that goal now.
>
> I'm glad to see that the library-assisted profiling turns out to be
> relatively clean.
>
> In effect this restores the pre-LF CountingMethodHandle logic from 2011,
> which was so beneficial in JDK 7:
> http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java
>
> I have some suggestions to make this version a little cleaner; see below.
>
> Starting with the JDK changes:
>
> In LambdaForm.java, I'm feeling flag pressure from all the little
> boolean fields and constructor parameters.
>
> (Is it time to put in a bit-encoded field "private byte
> LambdaForm.flags", or do we wait for another boolean to come along? But
> see next questions, which are more important.)
>
> What happens when a GWT LF gets inlined into a larger LF? Then there
> might be two or more selectAlternative calls.
> Will this confuse anything or will it Just Work? The combined LF will
> get profiled as usual, and the selectAlternative calls will also collect
> profile (or not?).
>
> This leads to another question: Why have a boolean 'isGWT' at all? Why
> not just check for one or more occurrences of selectAlternative, and
> declare that those guys override (some of) the profiling.
> Something like:
>
> -+ if (PROFILE_GWT && lambdaForm.isGWT)
> ++ if (PROFILE_GWT && lambdaForm.containsFunction(NF_selectAlternative))
> (...where LF.containsFunction(NamedFunction) is a variation of
> LF.contains(Name).)
>
> I suppose the answer may be that you want to inline GWTs (if ever) into
> customized code where the JVM profiling should get maximum benefit. In
> that case you might want to set the boolean to "false" to
> distinguish "immature" GWT combinators from customized ones.
>
> If that's the case, perhaps the real boolean flag you want is not
> 'isGWT' but 'sharedProfile' or 'immature' or some such, or (inverting)
> 'customized'. (I like the feel of a 'customized' flag.) Then
> @IgnoreProfile would get attached to a LF that (a) contains
> selectAlternative and (b) is marked as non-customized/immature/shared.
> You might also want to adjust the call to 'profileBranch' based on
> whether the containing LF was shared or customized.
>
> What I'm mainly poking at here is that 'isGWT' is not informative about
> the intended use of the flag.

I agree. It was an interim solution. Initially, I planned to introduce
customization and guide the logic based on that property. But it's not
there yet and I needed something for the GWT case. Unfortunately, I
missed the case when a GWT is edited. In that case, the isGWT flag is
lost and no annotation is set. So, I removed the isGWT flag and
introduced a check for a selectAlternative occurrence in the LambdaForm
shape, as you suggested.

> In 'updateCounters', if the counter overflows, you'll get continuous
> creation of ArithmeticExceptions. Will that optimize or will it cause a
> permanent slowdown? Consider a hack like this on the exception path:
> counters[idx] = Integer.MAX_VALUE / 2;

I had the impression that the VM optimizes overflows in Math.exact*
intrinsics, but that's not the case - it always inserts an uncommon
trap. I used the workaround you proposed.
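The overflow workaround discussed above can be sketched as follows. This is a minimal illustration rather than the code from the webrev: the class name, the method signature, the use of Math.addExact, and the mapping of the test result to index 1 or 0 are assumptions; only the int[2] counter array, the ArithmeticException on overflow, and the reset to Integer.MAX_VALUE / 2 come from the thread.

```java
// Hypothetical sketch of a per-handle branch counter with overflow back-off.
final class BranchCountersSketch {
    // Increments the counter for the observed branch. Which index stands for
    // "true" is an assumption of this sketch.
    static void updateCounters(int[] counters, boolean result) {
        int idx = result ? 1 : 0;
        try {
            // Math.addExact throws ArithmeticException on int overflow.
            counters[idx] = Math.addExact(counters[idx], 1);
        } catch (ArithmeticException e) {
            // Back off to half the range so the exception path is not taken
            // again on every subsequent call.
            counters[idx] = Integer.MAX_VALUE / 2;
        }
    }
}
```

Halving instead of saturating means the next overflow is about a billion increments away, at the cost of distorting the ratio between the two counters once.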
> On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in
> the VM) promises too much "ignorance", since it suppresses branch counts
> and traps, but allows type profiles to be consulted. Maybe something
> positive like "@ManyTraps" or "@SharedMegamorphic"? (It's just a name,
> and this is just a suggestion.)

What do you think about @LambdaForm.Shared?

> Going to the JVM:
>
> In library_call.cpp, I think you should change the assert to a guard:
> -+ assert(aobj->length() == 2, "");
> ++ && aobj->length() == 2) {

Done.

> In Parse::dynamic_branch_prediction, the mere presence of the Opaque4
> node is enough to trigger replacement of profiling. I think there
> should *not* be a test of method()->ignore_profile(). That should
> provide better integration between the two sources of profile data to
> JVM profiling?

Done.

> Also, I think the name 'Opaque4Node' is way too... opaque. Suggest
> 'ProfileBranchNode', since that's exactly what it does.

Done.

> Suggest changing the log element "profile_branch" to "observe
> source='profileBranch'", to make a better hint as to the source of the info.

Done.

Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com Tue Jan 20 19:09:30 2015
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 20 Jan 2015 22:09:30 +0300
Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations
In-Reply-To: <4F0DB021-7B33-4476-8ACA-3ED93425444A@oracle.com>
References: <54B959BF.1030709@oracle.com> <182C621D-3DA4-4F71-9832-2A036D0A4120@oracle.com> <54BD5AA3.6050005@oracle.com> <4F0DB021-7B33-4476-8ACA-3ED93425444A@oracle.com>
Message-ID: <54BEA7EA.4030700@oracle.com>

Thank you, Roland.

Best regards,
Vladimir Ivanov

On 1/20/15 4:03 PM, Roland Westrelin wrote:
>> Roland, thanks for the feedback!
>>
>>>> The fix is to (1) forbid changing uncommon trap action under the hood, and (2) consult Compile::too_many_recompiles when adding a speculative guard.
>>>
>>> I'm not sure I understand what code (1) above is referring to in your webrev.
>> The fix is based on 8063137 which I've sent for review earlier.
>> GraphKit::uncommon_trap_exact [1] delegates to GraphKit::uncommon_trap(..., /*keep_exact_action=*/true). keep_exact_action guards the logic which rewrites the action.
>>
>>> Also, why is the problem restricted to speculative traps? Wouldn't the same checks be required for non speculative traps as well?
>> Other trap types could be affected as well, but speculative traps and unstable_if are the main source of action transitions. My experiments on Octane show that all cases when action substitution happens have one of these trap reasons.
>>
>> With this change I wanted to address the speculative traps case. I'll experiment with making unstable_if traps exact as well and get back to you with an updated webrev.
>>
>> I'd prefer to change the default for keep_exact_action to true (and fix only the case mentioned in GraphKit::uncommon_trap [3]), but since I consider backporting the fix into 8u40, I want to keep it very focused.
>
> Thanks for the explanation, Vladimir. That looks good to me as it is.
>
> Roland.
>

From vladimir.x.ivanov at oracle.com Tue Jan 20 19:10:44 2015
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 20 Jan 2015 22:10:44 +0300
Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
In-Reply-To: <54BD4BA1.2060300@oracle.com>
References: <54B94766.2080102@oracle.com> <54B975EA.6040005@oracle.com> <54BD396D.2050907@oracle.com> <54BD4BA1.2060300@oracle.com>
Message-ID: <54BEA834.80106@oracle.com>

>>> You forgot to mark Opaque4Node as macro node. I would suggest to base it
>>> on Opaque2Node then you will get some methods from it.
>> Do I really need to do so?
I expect it to go away during IGVN pass >> right after parsing is over. That's why I register >> the node for igvn in LibraryCallKit::inline_profileBranch(). Changes >> in macro.cpp & compile.cpp are leftovers from the >> version when Opaque4 was macro node. I plan to remove them. > > I see, this is why you did not inherited it. Okay. I would suggest to > leave an assert in compile.cpp to make sure it is not left. > > I found typo when looked today (should be '&&'): > > + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) { > + if (can_reshape & _delay_removal) { Good catch! Fixed in the latest version: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot Best regards, Vladimir Ivanov > > Thanks, > Vladimir > >> >> Best regards, >> Vladimir Ivanov >> >>> On 1/16/15 9:16 AM, Vladimir Ivanov wrote: >>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >>>> https://bugs.openjdk.java.net/browse/JDK-8063137 >>>> >>>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >>>> significantly distorted compilation decisions. It affected inlining and >>>> hindered some optimizations. It causes significant performance >>>> regressions for Nashorn (on Octane benchmarks). >>>> >>>> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >>>> branch is never taken. It can cause missed optimization opportunity, >>>> and >>>> not just increase in code size. For example, non-pruned branch can >>>> break >>>> escape analysis. >>>> >>>> Currently, there are 2 problems: >>>> - branch frequencies profile pollution >>>> - deoptimization counts pollution >>>> >>>> Branch frequency pollution hides from JIT the fact that a branch is >>>> never taken. Since GWT LambdaForms (and hence their bytecode) are >>>> heavily shared, but the behavior is specific to MethodHandle, >>>> there's no >>>> way for JIT to understand how particular GWT instance behaves. 
>>>> >>>> The solution I propose is to do profiling in Java code and feed it to >>>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >>>> profiling info is stored. Once JIT kicks in, it can retrieve these >>>> counts, if corresponding MethodHandle is a compile-time constant >>>> (and it >>>> is usually the case). To communicate the profile data from Java code to >>>> JIT, MethodHandleImpl::profileBranch() is used. >>>> >>>> If GWT MethodHandle isn't a compile-time constant, profiling should >>>> proceed. It happens when corresponding LambdaForm is already shared, >>>> for >>>> newly created GWT MethodHandles profiling can occur only in native code >>>> (dedicated nmethod for a single LambdaForm). So, when compilation of >>>> the >>>> whole MethodHandle chain is triggered, the profile should be already >>>> gathered. >>>> >>>> Overriding branch frequencies is not enough. Statistics on >>>> deoptimization events is also polluted. Even if a branch is never >>>> taken, >>>> JIT doesn't issue an uncommon trap there unless corresponding bytecode >>>> doesn't trap too much and doesn't cause too many recompiles. >>>> >>>> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >>>> sees it on some method, Compile::too_many_traps & >>>> Compile::too_many_recompiles for that method always return false. It >>>> allows JIT to prune the branch based on custom profile and recompile >>>> the >>>> method, if the branch is visited. >>>> >>>> For now, I wanted to keep the fix very focused. The next thing I >>>> plan to >>>> do is to experiment with ignoring deoptimization counts for other >>>> LambdaForms which are heavily shared. I already saw problems caused by >>>> deoptimization counts pollution (see JDK-8068915 [2]). >>>> >>>> I plan to backport the fix into 8u40, once I finish extensive >>>> performance testing. >>>> >>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >>>> Octane). >>>> >>>> Thanks! 
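The branch-frequency pollution described in the quoted RFR can be illustrated with a toy model. This is only a sketch under stated assumptions: the Guard class, its fields, and the neverTaken helper are hypothetical and do not correspond to any JDK or HotSpot API; the one detail taken from the RFR is that each handle carries its own int[2] of counts next to the profile attached to the shared code.

```java
// Toy model of branch-profile pollution; all names are hypothetical.
final class ProfilePollutionSketch {
    // Stand-in for a GWT method handle whose test always has the same outcome.
    static final class Guard {
        final boolean outcome;
        final int[] sharedCounts;            // profile attached to the shared code
        final int[] ownCounts = new int[2];  // per-handle profile (int[2], as in the RFR)

        Guard(boolean outcome, int[] sharedCounts) {
            this.outcome = outcome;
            this.sharedCounts = sharedCounts;
        }

        // Records the outcome in both profiles and returns it.
        boolean test() {
            int idx = outcome ? 1 : 0;
            sharedCounts[idx]++;
            ownCounts[idx]++;
            return outcome;
        }
    }

    // In this model a branch is a pruning candidate only if its count is zero.
    static boolean neverTaken(int[] counts, int idx) {
        return counts[idx] == 0;
    }
}
```

If one always-true and one always-false Guard share the same sharedCounts array, both shared counters become non-zero and neither branch looks prunable, while each Guard's ownCounts still shows one branch that was never taken.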
>>>> >>>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >>>> [2] almost completely recovers peak performance after LambdaForm >>>> sharing >>>> [3]. There's one more problem left (non-inlined MethodHandle >>>> invocations >>>> are more expensive when LFs are shared), but it's a story for another >>>> day. >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >>>> 8059877: GWT branch frequencies pollution due to LF sharing >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >>>> JEP 210: LambdaForm Reduction and Caching >>>> _______________________________________________ >>>> mlvm-dev mailing list >>>> mlvm-dev at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >>> _______________________________________________ >>> mlvm-dev mailing list >>> mlvm-dev at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From vladimir.kozlov at oracle.com Tue Jan 20 19:22:56 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 20 Jan 2015 11:22:56 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54BEA834.80106@oracle.com> References: <54B94766.2080102@oracle.com> <54B975EA.6040005@oracle.com> <54BD396D.2050907@oracle.com> <54BD4BA1.2060300@oracle.com> <54BEA834.80106@oracle.com> Message-ID: <54BEAB10.9020307@oracle.com> Looks good. Webrev has empty changes for macro.cpp. Please, make sure nothing in it when you push. Thanks, Vladimir On 1/20/15 11:10 AM, Vladimir Ivanov wrote: >>>> You forgot to mark Opaque4Node as macro node. I would suggest to >>>> base it >>>> on Opaque2Node then you will get some methods from it. >>> Do I really need to do so? I expect it to go away during IGVN pass >>> right after parsing is over. 
That's why I register >>> the node for igvn in LibraryCallKit::inline_profileBranch(). Changes >>> in macro.cpp & compile.cpp are leftovers from the >>> version when Opaque4 was macro node. I plan to remove them. >> >> I see, this is why you did not inherited it. Okay. I would suggest to >> leave an assert in compile.cpp to make sure it is not left. >> >> I found typo when looked today (should be '&&'): >> >> + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) { >> + if (can_reshape & _delay_removal) { > Good catch! Fixed in the latest version: > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot > > Best regards, > Vladimir Ivanov > >> >> Thanks, >> Vladimir >> >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> On 1/16/15 9:16 AM, Vladimir Ivanov wrote: >>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8063137 >>>>> >>>>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >>>>> significantly distorted compilation decisions. It affected inlining >>>>> and >>>>> hindered some optimizations. It causes significant performance >>>>> regressions for Nashorn (on Octane benchmarks). >>>>> >>>>> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >>>>> branch is never taken. It can cause missed optimization opportunity, >>>>> and >>>>> not just increase in code size. For example, non-pruned branch can >>>>> break >>>>> escape analysis. >>>>> >>>>> Currently, there are 2 problems: >>>>> - branch frequencies profile pollution >>>>> - deoptimization counts pollution >>>>> >>>>> Branch frequency pollution hides from JIT the fact that a branch is >>>>> never taken. Since GWT LambdaForms (and hence their bytecode) are >>>>> heavily shared, but the behavior is specific to MethodHandle, >>>>> there's no >>>>> way for JIT to understand how particular GWT instance behaves. 
>>>>> >>>>> The solution I propose is to do profiling in Java code and feed it to >>>>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >>>>> profiling info is stored. Once JIT kicks in, it can retrieve these >>>>> counts, if corresponding MethodHandle is a compile-time constant >>>>> (and it >>>>> is usually the case). To communicate the profile data from Java >>>>> code to >>>>> JIT, MethodHandleImpl::profileBranch() is used. >>>>> >>>>> If GWT MethodHandle isn't a compile-time constant, profiling should >>>>> proceed. It happens when corresponding LambdaForm is already shared, >>>>> for >>>>> newly created GWT MethodHandles profiling can occur only in native >>>>> code >>>>> (dedicated nmethod for a single LambdaForm). So, when compilation of >>>>> the >>>>> whole MethodHandle chain is triggered, the profile should be already >>>>> gathered. >>>>> >>>>> Overriding branch frequencies is not enough. Statistics on >>>>> deoptimization events is also polluted. Even if a branch is never >>>>> taken, >>>>> JIT doesn't issue an uncommon trap there unless corresponding bytecode >>>>> doesn't trap too much and doesn't cause too many recompiles. >>>>> >>>>> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >>>>> sees it on some method, Compile::too_many_traps & >>>>> Compile::too_many_recompiles for that method always return false. It >>>>> allows JIT to prune the branch based on custom profile and recompile >>>>> the >>>>> method, if the branch is visited. >>>>> >>>>> For now, I wanted to keep the fix very focused. The next thing I >>>>> plan to >>>>> do is to experiment with ignoring deoptimization counts for other >>>>> LambdaForms which are heavily shared. I already saw problems caused by >>>>> deoptimization counts pollution (see JDK-8068915 [2]). >>>>> >>>>> I plan to backport the fix into 8u40, once I finish extensive >>>>> performance testing. 
>>>>> >>>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >>>>> Octane). >>>>> >>>>> Thanks! >>>>> >>>>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >>>>> [2] almost completely recovers peak performance after LambdaForm >>>>> sharing >>>>> [3]. There's one more problem left (non-inlined MethodHandle >>>>> invocations >>>>> are more expensive when LFs are shared), but it's a story for another >>>>> day. >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >>>>> 8059877: GWT branch frequencies pollution due to LF sharing >>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >>>>> JEP 210: LambdaForm Reduction and Caching >>>>> _______________________________________________ >>>>> mlvm-dev mailing list >>>>> mlvm-dev at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >>>> _______________________________________________ >>>> mlvm-dev mailing list >>>> mlvm-dev at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From vladimir.x.ivanov at oracle.com Tue Jan 20 19:34:09 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 20 Jan 2015 22:34:09 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54BEAB10.9020307@oracle.com> References: <54B94766.2080102@oracle.com> <54B975EA.6040005@oracle.com> <54BD396D.2050907@oracle.com> <54BD4BA1.2060300@oracle.com> <54BEA834.80106@oracle.com> <54BEAB10.9020307@oracle.com> Message-ID: <54BEADB1.2090404@oracle.com> Thanks, Vladimir! > Webrev has empty changes for macro.cpp. Please, make sure nothing in it > when you push. Yes, that's the problem with webrev script when there are multiple patches with inverse changes. 
Best regards, Vladimir Ivanov > > Thanks, > Vladimir > > On 1/20/15 11:10 AM, Vladimir Ivanov wrote: >>>>> You forgot to mark Opaque4Node as macro node. I would suggest to >>>>> base it >>>>> on Opaque2Node then you will get some methods from it. >>>> Do I really need to do so? I expect it to go away during IGVN pass >>>> right after parsing is over. That's why I register >>>> the node for igvn in LibraryCallKit::inline_profileBranch(). Changes >>>> in macro.cpp & compile.cpp are leftovers from the >>>> version when Opaque4 was macro node. I plan to remove them. >>> >>> I see, this is why you did not inherited it. Okay. I would suggest to >>> leave an assert in compile.cpp to make sure it is not left. >>> >>> I found typo when looked today (should be '&&'): >>> >>> + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) { >>> + if (can_reshape & _delay_removal) { >> Good catch! Fixed in the latest version: >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot >> >> Best regards, >> Vladimir Ivanov >> >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>> On 1/16/15 9:16 AM, Vladimir Ivanov wrote: >>>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >>>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >>>>>> https://bugs.openjdk.java.net/browse/JDK-8063137 >>>>>> >>>>>> After GuardWithTest (GWT) LambdaForms became shared, profile >>>>>> pollution >>>>>> significantly distorted compilation decisions. It affected inlining >>>>>> and >>>>>> hindered some optimizations. It causes significant performance >>>>>> regressions for Nashorn (on Octane benchmarks). >>>>>> >>>>>> Inlining was fixed by 8059877 [1], but it didn't cover the case >>>>>> when a >>>>>> branch is never taken. It can cause missed optimization opportunity, >>>>>> and >>>>>> not just increase in code size. For example, non-pruned branch can >>>>>> break >>>>>> escape analysis. 
>>>>>> >>>>>> Currently, there are 2 problems: >>>>>> - branch frequencies profile pollution >>>>>> - deoptimization counts pollution >>>>>> >>>>>> Branch frequency pollution hides from JIT the fact that a branch is >>>>>> never taken. Since GWT LambdaForms (and hence their bytecode) are >>>>>> heavily shared, but the behavior is specific to MethodHandle, >>>>>> there's no >>>>>> way for JIT to understand how particular GWT instance behaves. >>>>>> >>>>>> The solution I propose is to do profiling in Java code and feed it to >>>>>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >>>>>> profiling info is stored. Once JIT kicks in, it can retrieve these >>>>>> counts, if corresponding MethodHandle is a compile-time constant >>>>>> (and it >>>>>> is usually the case). To communicate the profile data from Java >>>>>> code to >>>>>> JIT, MethodHandleImpl::profileBranch() is used. >>>>>> >>>>>> If GWT MethodHandle isn't a compile-time constant, profiling should >>>>>> proceed. It happens when corresponding LambdaForm is already shared, >>>>>> for >>>>>> newly created GWT MethodHandles profiling can occur only in native >>>>>> code >>>>>> (dedicated nmethod for a single LambdaForm). So, when compilation of >>>>>> the >>>>>> whole MethodHandle chain is triggered, the profile should be already >>>>>> gathered. >>>>>> >>>>>> Overriding branch frequencies is not enough. Statistics on >>>>>> deoptimization events is also polluted. Even if a branch is never >>>>>> taken, >>>>>> JIT doesn't issue an uncommon trap there unless corresponding >>>>>> bytecode >>>>>> doesn't trap too much and doesn't cause too many recompiles. >>>>>> >>>>>> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >>>>>> sees it on some method, Compile::too_many_traps & >>>>>> Compile::too_many_recompiles for that method always return false. It >>>>>> allows JIT to prune the branch based on custom profile and recompile >>>>>> the >>>>>> method, if the branch is visited. 
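To make the profiling scheme quoted above more concrete, here is a minimal Java sketch. It is not the JDK code: it only assumes a helper shaped like the profileBranch() named in the message, taking the branch outcome plus the per-handle int[2] array. The class name GwtProfileSketch, the counter layout (index 0 = false, index 1 = true), the saturating increment and the neverTaken() helper are all illustrative assumptions.

```java
// Illustrative sketch only; names and counter layout are assumptions, not JDK internals.
final class GwtProfileSketch {
    /** Per-handle counters: [0] counts false outcomes, [1] counts true outcomes (assumed layout). */
    final int[] counts = new int[2];

    /** Records the test outcome in the given counters and returns it unchanged. */
    static boolean profileBranch(boolean result, int[] counters) {
        int idx = result ? 1 : 0;
        if (counters[idx] < Integer.MAX_VALUE) {
            counters[idx]++; // saturate instead of overflowing
        }
        return result;
    }

    /** Guard-with-test style selection that profiles the test result per instance. */
    String select(boolean test, String target, String fallback) {
        return profileBranch(test, counts) ? target : fallback;
    }

    /** True if a profile exists and the given branch direction was never observed. */
    static boolean neverTaken(int[] counters, boolean branch) {
        int seen = counters[branch ? 1 : 0];
        int other = counters[branch ? 0 : 1];
        return seen == 0 && other > 0;
    }

    public static void main(String[] args) {
        GwtProfileSketch gwt = new GwtProfileSketch();
        for (int i = 0; i < 1000; i++) {
            gwt.select(true, "target", "fallback");
        }
        System.out.println("false branch never taken: " + neverTaken(gwt.counts, false));
    }
}
```

Because the counters live in each instance rather than in the shared code, two instances that run the same select() logic keep separate profiles, which is the property the proposal relies on for GWT MethodHandles that share one LambdaForm.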
>>>>>> >>>>>> For now, I wanted to keep the fix very focused. The next thing I >>>>>> plan to >>>>>> do is to experiment with ignoring deoptimization counts for other >>>>>> LambdaForms which are heavily shared. I already saw problems >>>>>> caused by >>>>>> deoptimization counts pollution (see JDK-8068915 [2]). >>>>>> >>>>>> I plan to backport the fix into 8u40, once I finish extensive >>>>>> performance testing. >>>>>> >>>>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >>>>>> Octane). >>>>>> >>>>>> Thanks! >>>>>> >>>>>> PS: as a summary, my experiments show that fixes for 8063137 & >>>>>> 8068915 >>>>>> [2] almost completely recovers peak performance after LambdaForm >>>>>> sharing >>>>>> [3]. There's one more problem left (non-inlined MethodHandle >>>>>> invocations >>>>>> are more expensive when LFs are shared), but it's a story for another >>>>>> day. >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >>>>>> 8059877: GWT branch frequencies pollution due to LF sharing >>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >>>>>> JEP 210: LambdaForm Reduction and Caching >>>>>> _______________________________________________ >>>>>> mlvm-dev mailing list >>>>>> mlvm-dev at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >>>>> _______________________________________________ >>>>> mlvm-dev mailing list >>>>> mlvm-dev at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From pavel.chistyakov at oracle.com Tue Jan 20 21:13:32 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Tue, 20 Jan 2015 13:13:32 -0800 (PST) Subject: RFR(XS): 8066998: [TESTBUG] compiler/whitebox/ForceNMethodSweepTest.java : sweep shouldn't increase usage Message-ID: <2ab2d3d6-eef2-4a4a-9dea-e647414887d8@default> Vladimir, thank you for review! 
----------- Regards, Pavel ----- Original Message ----- From: vladimir.kozlov at oracle.com To: hotspot-compiler-dev at openjdk.java.net Sent: Tuesday, January 20, 2015 8:52:44 PM GMT +04:00 Abu Dhabi / Muscat Subject: Re: RFR(XS): 8066998: [TESTBUG] compiler/whitebox/ForceNMethodSweepTest.java : sweep shouldn't increase usage Looks good. Thanks, Vladimir On 1/20/15 9:37 AM, Pavel Chistyakov wrote: > Hi all, > > please review small fix for JDK-8066998 . > > webrev: http://cr.openjdk.java.net/~pchistyakov/8066998/webrev.00/ > > Problem: sometimes ForceNMethodSweepTest fails w/ "sweep shouldn't increase usage" message that is caused by increasing > of total counted code cache usage after sweeping instead of decreasing. After some investigations (thanks Albert Noll) > we found that this behavior can be forced by background compilation. It happens after controlled sweep process and > increases code cache usage unexpectedly. > > Solution: disable background compilation for this test. > > Testing: manual locally (about 1000 runs w/ and w/o background compilation) > > ---------------- > Thanks, > Pavel From pavel.chistyakov at oracle.com Tue Jan 20 21:13:43 2015 From: pavel.chistyakov at oracle.com (Pavel Chistyakov) Date: Tue, 20 Jan 2015 13:13:43 -0800 (PST) Subject: RFR(S): 8069125: compiler/codecache/stress tests timeout in nightlies Message-ID: <3d4be358-dc3a-4037-9cb8-063118433dee@default> Vladimir, thank you for review! ----------- Regards, Pavel ----- Original Message ----- From: vladimir.kozlov at oracle.com To: hotspot-compiler-dev at openjdk.java.net Sent: Tuesday, January 20, 2015 8:59:26 PM GMT +04:00 Abu Dhabi / Muscat Subject: Re: RFR(S): 8069125: compiler/codecache/stress tests timeout in nightlies Seems reasonable. Thanks, Vladimir On 1/20/15 9:37 AM, Pavel Chistyakov wrote: > Hi all, > > please review fix for JDK-8069125 . 
>
> webrev: http://cr.openjdk.java.net/~pchistyakov/8069125/webrev.00/
>
> Problem: the new stress tests for the code cache sometimes fail due to timeout. There are two potential problems:
>
> * OverloadCompileQueueTest failed in nightlies in Xcomp mode. This behavior is forced by the lock-unlocker thread that is
> used to lock compilation and create a pause during which we can fill the compilation queue. This thread starts at test
> beginning and runs indefinitely as a daemon. For now, compilation locking is implemented using a MonitorLockerEx object and
> a wait on it. To guard against spurious wakeups, the wait function is called inside a while loop that checks the
> Whitebox::compilation_locked volatile boolean flag. Without any timeout between lock-unlocker thread iterations we get a
> situation where the compiler thread cannot exit the loop: while notify_all signals the lock to leave the wait and continue
> execution, the Whitebox::compilation_locked flag becomes true again. (See compileBroker.cpp:1967-1971 and
> whitebox.cpp:802-810 for details of the compilation locking process.)
> * We got one RandomAllocationTest failure caused by timeout. But it happened on a slow system, and there jtreg didn't
> have enough time to do its postprocessing work after test execution. This can be solved by decreasing the test execution
> time (for now the test gets 90% of the overall jtreg timeout that is set up on test initialization) to 80%.
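The first problem in the list above can be pictured with a small Java sketch. It is only an analogy for the flag-plus-wait pattern the message describes, not the HotSpot or WhiteBox code: CompilationGate, awaitUnlocked() and startLockUnlocker() are hypothetical names, and the pause length is an arbitrary assumption. The waiter re-checks a volatile flag in a loop around wait(), so if the toggling thread re-locks immediately after notifyAll() the waiter may never observe the unlocked state; a pause after unlock() gives it a window.

```java
// Illustrative analogy only; not the HotSpot/WhiteBox implementation.
final class CompilationGate {
    private final Object monitor = new Object();
    private volatile boolean locked;

    /** Sets the flag without notifying anyone, like the lock step in the message. */
    void lock() {
        locked = true;
    }

    /** Clears the flag and wakes all waiters. */
    void unlock() {
        synchronized (monitor) {
            locked = false;
            monitor.notifyAll();
        }
    }

    /**
     * Blocks while the gate is locked. The loop guards against spurious wakeups,
     * and it also means a waiter that wakes up after the flag was set again goes
     * straight back to waiting. Returns false if the timeout expires first.
     */
    boolean awaitUnlocked(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (monitor) {
            while (locked) {
                long remaining = (deadline - System.nanoTime()) / 1_000_000L;
                if (remaining <= 0) {
                    return false;
                }
                monitor.wait(remaining);
            }
            return true;
        }
    }

    /** Daemon toggler; the sleep after unlock() is the pause that lets waiters through. */
    static Thread startLockUnlocker(CompilationGate gate, long pauseMillis) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    gate.lock();
                    Thread.sleep(pauseMillis); // gate held: work piles up
                    gate.unlock();
                    Thread.sleep(pauseMillis); // pause before re-locking
                }
            } catch (InterruptedException e) {
                // stop toggling
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Removing the second sleep in startLockUnlocker() reproduces the shape of the reported hang: the flag is set again almost immediately after notifyAll(), so a woken waiter is likely to find it still true.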
>
>
> Solution:
>
> * add timeout in iterations for lock-unlocker thread in OverloadCompileQueueTest to give a chance for compiler to work
> (very critical in Xcomp mode)
> * decrease test execution time to 80% that will give more time for jtreg and vm init and exit process to prevent
> timeout of other stress tests if any
>
>
> Testing: manual locally, JPRT
>
> ----------------
> Thanks,
> Pavel

From zoltan.majo at oracle.com Wed Jan 21 09:57:09 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Wed, 21 Jan 2015 10:57:09 +0100
Subject: [9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
In-Reply-To: <54BEA7DF.9010303@oracle.com>
References: <54AADA33.30203@oracle.com> <54AAE277.3030209@oracle.com> <54B5101C.5020609@oracle.com> <54B56D6D.3040707@oracle.com> <54B90C9F.4000202@oracle.com> <87F38F22-A0A0-496D-BEC7-7669E9A36881@oracle.com> <54BD3723.4070405@oracle.com> <54BD4EC1.5060607@oracle.com> <54BE583C.3030700@oracle.com> <54BE94FA.2060806@oracle.com> <54BEA280.6010408@oracle.com> <54BEA7DF.9010303@oracle.com>
Message-ID: <54BF77F5.605@oracle.com>

Thank you, Vladimir, for the review!

Best regards,

Zoltan

On 01/20/2015 08:09 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 1/20/15 10:46 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback!
>>
>> On 01/20/2015 06:48 PM, Vladimir Kozlov wrote:
>>> We usually exit with error message if a flag's value is incorrect -
>>> vm_exit_during_initialization().
>>> We never overwrite incorrect value and hide what we did.
>>
>> I was not aware of that convention. I changed the check in arguments.cpp
>> so that the VM exits if the *global* CompileThresholdScaling flag has a
>> negative value (instead of overwriting it).
>>
>> Currently, CompileCommand=option does not accept negative values for
>> flags of type double, so a *per-method* CompileThresholdScaling cannot
>> have a negative value.
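For readers following the 8059606 review, the scaling rules stated across this thread can be sketched in Java as below. This is a paraphrase under assumptions, not the arguments.cpp code: the method names are Java renderings of the C++ names mentioned in the thread, the class name and the MAX_FREQ_LOG constant are hypothetical, and the rules encoded are only those stated in the messages (a negative scale leaves the value unscaled, a scale of 0.0 drives the threshold to 0, global and per-method scales multiply, and the log2 of a notification frequency is at most 30).

```java
// Illustrative sketch only; a Java paraphrase of rules stated in the thread, not HotSpot code.
final class ThresholdScalingSketch {
    /** Upper bound on the log2 of a notification frequency ("at most 30" in the thread). */
    static final int MAX_FREQ_LOG = 30;

    /** Scales a threshold; a scale of 1.0 or a negative scale returns it unchanged. */
    static long scaledCompileThreshold(long threshold, double scale) {
        if (scale == 1.0 || scale < 0.0) {
            return threshold;
        }
        return (long) (threshold * scale);
    }

    /** Scales a log2 frequency, clipping the result to the range 0..MAX_FREQ_LOG. */
    static int scaledFreqLog(int freqLog, double scale) {
        if (scale == 1.0 || scale < 0.0) {
            return freqLog;
        }
        if (scale == 0.0 || freqLog == 0) {
            return 0; // avoid taking log2 of 0
        }
        long scaledFreq = scaledCompileThreshold(1L << freqLog, scale);
        if (scaledFreq == 0) {
            return 0;
        }
        int log = 63 - Long.numberOfLeadingZeros(scaledFreq); // floor(log2(scaledFreq))
        return Math.min(log, MAX_FREQ_LOG);
    }

    /** Global and per-method scaling factors multiply. */
    static double effectiveScale(double globalScale, double perMethodScale) {
        return globalScale * perMethodScale;
    }

    public static void main(String[] args) {
        double scale = effectiveScale(0.5, 0.5);
        System.out.println(scaledCompileThreshold(10_000, scale));
    }
}
```

The sketch assumes freqLog stays well below 63 so that the shift cannot overflow.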
>>
>> But the behavior of CompileCommand might change in the future, so I
>> changed the
>>
>> Arguments::scaled_freq_log(intx freq_log, double scale)
>> Arguments::scaled_compile_threshold(intx threshold, double scale)
>>
>> methods to return the unscaled value of parameters freq_log and
>> threshold, respectively, if scale is negative.
>>
>> I hope this change is fine. Flags specified using CompileCommand=option
>> are always listed on the output at startup, so maybe this does not count
>> as hiding overwritten flag values.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.05/
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/20/15 5:29 AM, Zoltán Majó wrote:
>>>> Hi Vladimir,
>>>>
>>>>
>>>> thanks for the review!
>>>>
>>>> On 01/19/2015 07:36 PM, Vladimir Kozlov wrote:
>>>>> What was changed in globalDefinitions.hpp? Webrev shows nothing
>>>>> (sometimes it does not show spacing changes).
>>>>
>>>> Yes, it was some whitespace I've removed around line 1150:
>>>>
>>>> - uintptr_t p = 1;
>>>> + uintptr_t p = 1;
>>>>
>>>> The newest webrev does not show that either. In addition to that,
>>>> I've updated two comments in the newest webrev.
>>>>
>>>>> In arguments.cpp, please, add check that CompileThresholdScaling is
>>>>> not negative.
>>>>
>>>> I added the check. If CompileThresholdScaling is negative, we set its
>>>> value to 1.0 (the default value).
>>>>
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/
>>>>
>>>> The webrev link is the same as the one in my latest reply to John.
>>>>
>>>> All JPRT tests pass.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>>
>>>>> Otherwise this looks good to me.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 1/19/15 8:56 AM, Zoltán Majó wrote:
>>>>>> Hi John,
>>>>>>
>>>>>>
>>>>>> On 01/16/2015 08:59 PM, John Rose wrote:
>>>>>>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó
>>>>>>> wrote:
>>>>>>>> Here is the new webrev:
>>>>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/
>>>>>>> Reviewed, with some comments.
>>>>>>
>>>>>> thank you for your review and for the feedback! Please see detailed
>>>>>> comments below.
>>>>>>
>>>>>>> Overall, I like the cleanups along the way.
>>>>>>>
>>>>>>> The basic idea of replacing a hard-coded 'mask' with an
>>>>>>> addressable variable is sound and nicely executed.
>>>>>>>
>>>>>>> I suppose that idea by itself is "S" small, but this really is a
>>>>>>> "M" or even "L" change, as Vladimir says, especially
>>>>>>> since the enhanced logic is spread all around many files.
>>>>>>
>>>>>> I agree that this is not a small change any more, but I would like
>>>>>> to keep the subject of this discussion unchanged so
>>>>>> that we don't move to a different thread (unless you or Vladimir
>>>>>> think it is better to change it).
>>>>>>
>>>>>>> How have you regression tested this?
>>>>>>
>>>>>> I used our performance infrastructure. I collected performance data
>>>>>> on 6 different architectures for 41 benchmark
>>>>>> programs/suites. The data show that per-method compilation
>>>>>> thresholds do not result in a statistically significant
>>>>>> change of performance. One benchmark, Footprint3-Client, degrades
>>>>>> ~0.5% on the X86 Client VM, but I think that is
>>>>>> negligible.
>>>>>>
>>>>>>> Have you verified that the compilation sequence doesn't change for
>>>>>>> a non-trivial use case? A slip in the assembly
>>>>>>> code (for example) might cause a comparison against a garbage mask
>>>>>>> or limit that could cause compilation decisions to
>>>>>>> go crazy quietly. I didn't spot any such bug, but they are hard
>>>>>>> to spot and sometimes quiet.
>>>>>>
>>>>>> I did extensive manual testing on all architectures targeted by the
>>>>>> patch.
>>>>>>
>>>>>> I checked that a target method (a method for which per-method
>>>>>> compilation thresholds have been specified) is indeed
>>>>>> compiled sooner/later than methods for which global compilation
>>>>>> thresholds are in effect. I ran tests for all
>>>>>> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and
>>>>>> +/-UseOnStackReplacement
>>>>>> on each architecture that we support. While working on this issue
>>>>>> I've discovered and reported two interpreter-related
>>>>>> bugs, 8068652 and 8068505.
>>>>>>
>>>>>> I cannot, unfortunately, guarantee that my changes are error-free,
>>>>>> but I did my best to catch any possible error that I
>>>>>> can think of and that I can check.
>>>>>>
>>>>>>> In the sparc assembly code (more than one file), the live range of
>>>>>>> Rcounters has increased, since it is used to supply
>>>>>>> limits as well as to update the counter (which happens early in
>>>>>>> the code).
>>>>>>>
>>>>>>> To make it easier to maintain the code, I suggest renaming
>>>>>>> Rcounters to G3_method_counters.
>>>>>>> (As you can see, when a register has a logical name but has a
>>>>>>> complicated live range, we add the hardware name to
>>>>>>> the logical name, to make it easier to spot interfering uses, when
>>>>>>> manually editing code.)
>>>>>>
>>>>>> Thanks for the suggestion, I've changed the register's name.
>>>>>>
>>>>>>> If scale==0.0 is a valid input checked specially in compileBroker,
>>>>>>> perhaps the effect of a zero should be documented?
>>>>>>> Suggest adding to globals.hpp:
>>>>>>> "; values greater than one delay counter overflow; zero forces
>>>>>>> counter overflow immediately"
>>>>>>> "; can be set as a per-method option."
>>>>>>
>>>>>> If CompileThresholdScaling is set to 0.0, all methods are
>>>>>> interpreted.
>>>>>>
>>>>>> The reason is that CompileThreshold==0 has been historically
>>>>>> equivalent to setting -Xint to true.
Setting
>>>>>> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0
>>>>>> and we wanted to keep VM behavior consistent when we've
>>>>>> added support for the global CompileThresholdScaling flag in
>>>>>> JDK-8059604.
>>>>>>
>>>>>> If you think it would be good if we changed the meaning of
>>>>>> CompileThreshold==0, please let me know. I'll file an RFE for
>>>>>> it and change it. As CompileThreshold is a product flag, we will
>>>>>> need CCC approval for that change.
>>>>>>
>>>>>> As you've suggested, I added a comment to globals.hpp that
>>>>>> precisely describes the behavior of CompileThresholdScaling.
>>>>>>
>>>>>>> Question: What if both a global scale and a method option for
>>>>>>> scale are both set? Is the global one ignored? Do
>>>>>>> they multiply? It's worth specifying it explicitly (and checking
>>>>>>> that the logic DTRT).
>>>>>>
>>>>>> Global and per-method values multiply. That behavior is now
>>>>>> described in the comment in globals.hpp.
>>>>>>
>>>>>>> Question: How are the log values (like Tier0InvokeNotifyFreqLog
>>>>>>> or the result of get_scaled_freq_log) constrained to
>>>>>>> fit in a reasonable range? (I would suppose that range is 0..31
>>>>>>> or less.) Should we have range clipping code in the
>>>>>>> scaler functions?
>>>>>>
>>>>>> Currently InvocationCounter::number_of_count_bits=29 bits are
>>>>>> reserved for counting. As a result, the value of the log2
>>>>>> of the notification frequency can be at most 30. I updated the
>>>>>> source code accordingly.
>>>>>>
>>>>>>> It would give notably simpler code, in MethodData::init, to use a
>>>>>>> branch-free setup of tier_0_invoke_notify_freq_log
>>>>>>> etc. Set scale = 1.0 and then update it conditionally.
>>>>>>> Special-case scale=1.0 in get_scaled_freq_log to taste.
>>>>>>
>>>>>> I did that.
>>>>>>
>>>>>>> Same comment about less-branchy code for methodCounters.hpp.
(It's
>>>>>>> better to have a one-way branch that sets up
>>>>>>> 'scale' than a two-way branch with duplicate setups of one or more
>>>>>>> variables.)
>>>>>>
>>>>>> I changed the code according to your suggestions.
>>>>>>
>>>>>>> In MethodCounters, I think the conditional scaling of
>>>>>>> _interpreter_backward_branch_limit is going to confuse someone,
>>>>>>> at some point. It should be scaled, unconditionally, like its
>>>>>>> sibling variables. (That would remove another somewhat
>>>>>>> verbose initialization branch!)
>>>>>>
>>>>>> The value of the *global* interpreter backward branch limit
>>>>>> (InvocationCounter::InterpreterBackwardBranchLimit) is
>>>>>> computed based on the value of CompileThreshold (in
>>>>>> InvocationCounter::reinitialize()). The backward branch limit is
>>>>>> assigned a different value depending on whether interpreter
>>>>>> profiling is enabled or not.
>>>>>>
>>>>>> The logic of the *per-method* interpreter backward branch limit
>>>>>> (MethodCounters::_interpreter_backward_branch_limit) is
>>>>>> intended to be identical to that of the global branch limit.
>>>>>> Therefore, I'm afraid that we have to keep the if-then-else
>>>>>> construct initializing _interpreter_backward_branch_limit in
>>>>>> methodCounters.hpp. I hope that I understood right what
>>>>>> you've previously suggested.
>>>>>>
>>>>>>> Small style nit: The noise-word "get_" is discouraged in the
>>>>>>> style doc:
>>>>>>>
>>>>>>>> * Getter accessor names are noun phrases, with no "get_" noise
>>>>>>>> word. Boolean getters can also begin with "is_" or
>>>>>>>> "has_".
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>>>>>> Arguments.cpp follows this rule partially (in older code maybe?).
>>>>>>> It would be better to decrease counter-examples to
>>>>>>> the rule instead of increase them.
>>>>>>>
>>>>>>> Bigger style nit: Since the functions are not getting a preset
>>>>>>> value (from the arguments) but rather normalizing a
>>>>>>> provided argument value, I suggest naming them
>>>>>>> "scale_compile_threshold" (i.e., a verb phrase instead of a noun
>>>>>>> phrase). Again from the style doc:
>>>>>>>
>>>>>>>> * Other method names are verb phrases, as if commands to the
>>>>>>>> receiver.
>>>>>>
>>>>>> I did not know that, thanks for pointing it out to me. I changed
>>>>>>
>>>>>> get_scaled_compile_threshold -> scaled_compile_threshold
>>>>>> get_scaled_freq_log -> scaled_freq_log.
>>>>>>
>>>>>>> Since you are providing overloads of the scaling functions, the
>>>>>>> header file should either contain inline code for the
>>>>>>> convenience methods, or else document how the optional argument
>>>>>>> ('scale') defaults. I'd prefer inline code, since it
>>>>>>> is simple. It's as much text to document with a comment as just
>>>>>>> to supply the inline overload.
>>>>>>
>>>>>> I inlined the convenience methods into the header file.
>>>>>>
>>>>>> Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/
>>>>>>
>>>>>>>
>>>>>>> As I said before, nice work!
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>>>
>>>>>>> - John
>>>>>>
>>>>
>>

From duncan.macgregor at ge.com Wed Jan 21 10:39:54 2015
From: duncan.macgregor at ge.com (MacGregor, Duncan (GE Energy Management))
Date: Wed, 21 Jan 2015 10:39:54 +0000
Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared
In-Reply-To:
References: <54B94766.2080102@oracle.com>
Message-ID:

This version seems to have inconsistent removal of ignore profile in the hotspot patch. It's no longer added to vmSymbols but is still referenced in classFileParser.

On 19/01/2015 20:21, "MacGregor, Duncan (GE Energy Management)" wrote:

>Okay, I've done some tests of this with the micro benchmarks for our
>language & runtime which show pretty much no change except for one test
>which is now almost 3x slower. It uses nested loops to iterate over an
>array and concatenate the string-like objects it contains, and replaces
>elements with these new longer string-like objects. It's a bit of a
>pathological case, and I haven't seen the same sort of degradation in the
>other benchmarks or in real applications, but I haven't done serious
>benchmarking of them with this change.
>
>I shall see if the test case can be reduced down to anything simpler while
>still showing the same performance behaviour, and try to add some compilation
>logging options to narrow down what's going on.
>
>Duncan.
>
>On 16/01/2015 17:16, "Vladimir Ivanov"
>wrote:
>
>>http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>>http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>>https://bugs.openjdk.java.net/browse/JDK-8063137
>>
>>After GuardWithTest (GWT) LambdaForms became shared, profile pollution
>>significantly distorted compilation decisions. It affected inlining and
>>hindered some optimizations. It causes significant performance
>>regressions for Nashorn (on Octane benchmarks).
>>
>>Inlining was fixed by 8059877 [1], but it didn't cover the case when a
>>branch is never taken. It can cause missed optimization opportunity, and
>>not just increase in code size. For example, non-pruned branch can break
>>escape analysis.
>>
>>Currently, there are 2 problems:
>> - branch frequencies profile pollution
>> - deoptimization counts pollution
>>
>>Branch frequency pollution hides from JIT the fact that a branch is
>>never taken. Since GWT LambdaForms (and hence their bytecode) are
>>heavily shared, but the behavior is specific to MethodHandle, there's no
>>way for JIT to understand how particular GWT instance behaves.
>>
>>The solution I propose is to do profiling in Java code and feed it to
>>JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
>>profiling info is stored. Once JIT kicks in, it can retrieve these
>>counts, if corresponding MethodHandle is a compile-time constant (and it
>>is usually the case). To communicate the profile data from Java code to
>>JIT, MethodHandleImpl::profileBranch() is used.
>>
>>If GWT MethodHandle isn't a compile-time constant, profiling should
>>proceed. It happens when corresponding LambdaForm is already shared, for
>>newly created GWT MethodHandles profiling can occur only in native code
>>(dedicated nmethod for a single LambdaForm). So, when compilation of the
>>whole MethodHandle chain is triggered, the profile should be already
>>gathered.
>>
>>Overriding branch frequencies is not enough. Statistics on
>>deoptimization events is also polluted. Even if a branch is never taken,
>>JIT doesn't issue an uncommon trap there unless corresponding bytecode
>>doesn't trap too much and doesn't cause too many recompiles.
>>
>>I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
>>sees it on some method, Compile::too_many_traps &
>>Compile::too_many_recompiles for that method always return false. It
>>allows JIT to prune the branch based on custom profile and recompile the
>>method, if the branch is visited.
>>
>>For now, I wanted to keep the fix very focused. The next thing I plan to
>>do is to experiment with ignoring deoptimization counts for other
>>LambdaForms which are heavily shared. I already saw problems caused by
>>deoptimization counts pollution (see JDK-8068915 [2]).
>>
>>I plan to backport the fix into 8u40, once I finish extensive
>>performance testing.
>>
>>Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
>>Octane).
>>
>>Thanks!
>>
>>PS: as a summary, my experiments show that fixes for 8063137 & 8068915
>>[2] almost completely recovers peak performance after LambdaForm sharing
>>[3].
There's one more problem left (non-inlined MethodHandle invocations >>are more expensive when LFs are shared), but it's a story for another >>day. >> >>Best regards, >>Vladimir Ivanov >> >>[1] https://bugs.openjdk.java.net/browse/JDK-8059877 >> 8059877: GWT branch frequencies pollution due to LF sharing >>[2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>[3] https://bugs.openjdk.java.net/browse/JDK-8046703 >> JEP 210: LambdaForm Reduction and Caching >>_______________________________________________ >>mlvm-dev mailing list >>mlvm-dev at openjdk.java.net >>http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > >_______________________________________________ >mlvm-dev mailing list >mlvm-dev at openjdk.java.net >http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From vladimir.x.ivanov at oracle.com Wed Jan 21 11:41:15 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 21 Jan 2015 14:41:15 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: References: <54B94766.2080102@oracle.com> Message-ID: <54BF905B.7020407@oracle.com> Duncan, sorry for that. Updated webrev in place. Best regards, Vladimir Ivanov On 1/21/15 1:39 PM, MacGregor, Duncan (GE Energy Management) wrote: > This version seems to have inconsistent removal of ignore profile in the > hotspot patch. It's no longer added to vmSymbols but is still referenced > in classFileParser. > > On 19/01/2015 20:21, "MacGregor, Duncan (GE Energy Management)" > wrote: > >> Okay, I've done some tests of this with the micro benchmarks for our >> language & runtime which show pretty much no change except for one test >> which is now almost 3x slower. It uses nested loops to iterate over an >> array and concatenate the string-like objects it contains, and replaces >> elements with these new longer string-like objects.
It's a bit of a >> pathological case, and I haven't seen the same sort of degradation in the >> other benchmarks or in real applications, but I haven't done serious >> benchmarking of them with this change. >> >> I shall see if the test case can be reduced down to anything simpler while >> still showing the same performance behaviour, and try to add some compilation >> logging options to narrow down what's going on. >> >> Duncan. >> >> On 16/01/2015 17:16, "Vladimir Ivanov" >> wrote: >> >>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ >>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ >>> https://bugs.openjdk.java.net/browse/JDK-8063137 >>> >>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution >>> significantly distorted compilation decisions. It affected inlining and >>> hindered some optimizations. It causes significant performance >>> regressions for Nashorn (on Octane benchmarks). >>> >>> Inlining was fixed by 8059877 [1], but it didn't cover the case when a >>> branch is never taken. It can cause missed optimization opportunity, and >>> not just increase in code size. For example, non-pruned branch can break >>> escape analysis. >>> >>> Currently, there are 2 problems: >>> - branch frequencies profile pollution >>> - deoptimization counts pollution >>> >>> Branch frequency pollution hides from JIT the fact that a branch is >>> never taken. Since GWT LambdaForms (and hence their bytecode) are >>> heavily shared, but the behavior is specific to MethodHandle, there's no >>> way for JIT to understand how particular GWT instance behaves. >>> >>> The solution I propose is to do profiling in Java code and feed it to >>> JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where >>> profiling info is stored. Once JIT kicks in, it can retrieve these >>> counts, if corresponding MethodHandle is a compile-time constant (and it >>> is usually the case).
To communicate the profile data from Java code to >>> JIT, MethodHandleImpl::profileBranch() is used. >>> >>> If GWT MethodHandle isn't a compile-time constant, profiling should >>> proceed. It happens when corresponding LambdaForm is already shared, for >>> newly created GWT MethodHandles profiling can occur only in native code >>> (dedicated nmethod for a single LambdaForm). So, when compilation of the >>> whole MethodHandle chain is triggered, the profile should be already >>> gathered. >>> >>> Overriding branch frequencies is not enough. Statistics on >>> deoptimization events is also polluted. Even if a branch is never taken, >>> JIT doesn't issue an uncommon trap there unless corresponding bytecode >>> doesn't trap too much and doesn't cause too many recompiles. >>> >>> I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT >>> sees it on some method, Compile::too_many_traps & >>> Compile::too_many_recompiles for that method always return false. It >>> allows JIT to prune the branch based on custom profile and recompile the >>> method, if the branch is visited. >>> >>> For now, I wanted to keep the fix very focused. The next thing I plan to >>> do is to experiment with ignoring deoptimization counts for other >>> LambdaForms which are heavily shared. I already saw problems caused by >>> deoptimization counts pollution (see JDK-8068915 [2]). >>> >>> I plan to backport the fix into 8u40, once I finish extensive >>> performance testing. >>> >>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, >>> Octane). >>> >>> Thanks! >>> >>> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 >>> [2] almost completely recovers peak performance after LambdaForm sharing >>> [3]. There's one more problem left (non-inlined MethodHandle invocations >>> are more expensive when LFs are shared), but it's a story for another >>> day. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877 >>> 8059877: GWT branch frequencies pollution due to LF sharing >>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703 >>> JEP 210: LambdaForm Reduction and Caching >>> _______________________________________________ >>> mlvm-dev mailing list >>> mlvm-dev at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > From tobias.hartmann at oracle.com Wed Jan 21 13:25:57 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 21 Jan 2015 14:25:57 +0100 Subject: [9] RFR(S): 8069580: String intrinsic related cleanups Message-ID: <54BFA8E5.1060300@oracle.com> Hi, please review this small cleanup of string intrinsic related code. https://bugs.openjdk.java.net/browse/JDK-8069580 http://cr.openjdk.java.net/~thartmann/8069580/webrev.00/ Thanks, Tobias From roland.westrelin at oracle.com Wed Jan 21 13:35:10 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Wed, 21 Jan 2015 14:35:10 +0100 Subject: [9] RFR(S): 8069580: String intrinsic related cleanups In-Reply-To: <54BFA8E5.1060300@oracle.com> References: <54BFA8E5.1060300@oracle.com> Message-ID: > http://cr.openjdk.java.net/~thartmann/8069580/webrev.00/ That looks good to me. Roland. From tobias.hartmann at oracle.com Wed Jan 21 13:36:17 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 21 Jan 2015 14:36:17 +0100 Subject: [9] RFR(S): 8069580: String intrinsic related cleanups In-Reply-To: References: <54BFA8E5.1060300@oracle.com> Message-ID: <54BFAB51.9010603@oracle.com> Thanks, Roland. Best, Tobias On 21.01.2015 14:35, Roland Westrelin wrote: >> http://cr.openjdk.java.net/~thartmann/8069580/webrev.00/ > > That looks good to me. 
> > Roland. > From nils.eliasson at oracle.com Wed Jan 21 15:08:37 2015 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 21 Jan 2015 16:08:37 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54B92161.3090109@oracle.com> References: <54B81398.90409@oracle.com> <54B92161.3090109@oracle.com> Message-ID: <54BFC0F5.1040405@oracle.com> Hi Zoltan, On 2015-01-16 15:34, Zoltán Majó wrote: > > I agree with Albert that it might be better to execute the > test/compiler/oracle/CheckCompileCommandOption.java only in nightlies. > Especially that the test might take much longer on the slower machines > that we have. OK, I removed the test. > >> * Added additional test cases for the CompilerCommandFile > > That is a good idea, but I think you've forgotten to include those > command files into the webrev. As a result, when I run the test with > jtreg on my own machine, it fails with the following error message: Yes I did. Thanks for finding that. I also added the bug number to the test attributes. Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.03/ Thanks for your help, Nils > > Thank you and best regards, > > > Zoltan > >> * Added testcases for option parsing together with a method pattern >> that includes a signature >> * Moved some testcases together that didn't require separate VMs >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 >> Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 >> >> Best regards, >> Nils Eliasson >> > From nils.eliasson at oracle.com Wed Jan 21 15:16:24 2015 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 21 Jan 2015 16:16:24 +0100 Subject: RFR (S): 8069389: CompilerOracle prefix wildcarding is broken for long strings on Mac and Solaris Message-ID: <54BFC2C8.3000805@oracle.com> Hi, Can I please have a review for this small change. Tom found an error in the CompilerOracle method matcher. This bug can be seen on Mac and Solaris.
I have changed from strcpy to memmove as Tom suggested and created a simple regression test. Bug: https://bugs.openjdk.java.net/browse/JDK-8069389 Webrev: http://cr.openjdk.java.net/~neliasso/8069389/webrev.01/ Best regards, Nils From zoltan.majo at oracle.com Wed Jan 21 15:47:46 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Wed, 21 Jan 2015 16:47:46 +0100 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54BFC0F5.1040405@oracle.com> References: <54B81398.90409@oracle.com> <54B92161.3090109@oracle.com> <54BFC0F5.1040405@oracle.com> Message-ID: <54BFCA22.1020400@oracle.com> Hi Nils, On 01/21/2015 04:08 PM, Nils Eliasson wrote: > Hi Zoltan, > > On 2015-01-16 15:34, Zoltán Majó wrote: >> >> I agree with Albert that it might be better to execute the >> test/compiler/oracle/CheckCompileCommandOption.java only in >> nightlies. Especially that the test might take much longer on the >> slower machines that we have. > > ok, I removed the test. > >> >>> * Added additional test cases for the CompilerCommandFile >> >> That is a good idea, but I think you've forgotten to include those >> command files into the webrev. As a result, when I run the test with >> jtreg on my own machine, it fails with the following error message: > > Yes I did. Thanks for finding. > > I also added the bug number to the test attributes. > > Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.03/ This version looks good to me (not a Reviewer).
Thank you and best regards, Zoltan > > Thanks for your help, > Nils > >> >> Thank you and best regards, >> >> >> Zoltan >> >>> * Added testcases for option parsing together with a method pattern >>> that includes a signature >>> * Moved some testcases together that didn't require separate VMs >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 >>> Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 >>> >>> Best regards, >>> Nils Eliasson >>> >> > From vladimir.kozlov at oracle.com Wed Jan 21 17:24:24 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 21 Jan 2015 09:24:24 -0800 Subject: [9] RFR(S): 8069580: String intrinsic related cleanups In-Reply-To: References: <54BFA8E5.1060300@oracle.com> Message-ID: <54BFE0C8.50700@oracle.com> On 1/21/15 5:35 AM, Roland Westrelin wrote: >> http://cr.openjdk.java.net/~thartmann/8069580/webrev.00/ > > That looks good to me. +1 Vladimir > > Roland. > From vladimir.kozlov at oracle.com Wed Jan 21 17:26:53 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 21 Jan 2015 09:26:53 -0800 Subject: RFR(M): 8069035: compiler/oracle/CheckCompileCommandOption.java nightly failure In-Reply-To: <54BFC0F5.1040405@oracle.com> References: <54B81398.90409@oracle.com> <54B92161.3090109@oracle.com> <54BFC0F5.1040405@oracle.com> Message-ID: <54BFE15D.2050006@oracle.com> Looks good. Thanks, Vladimir On 1/21/15 7:08 AM, Nils Eliasson wrote: > Hi Zoltan, > > On 2015-01-16 15:34, Zolt?n Maj? wrote: >> >> I agree with Albert that it might be better to execute the test/compiler/oracle/CheckCompileCommandOption.java only in >> nightlies. Especially that the test might take much longer on the slower machines that we have. > > ok, I removed the test. > >> >>> * Added additional test cases for the CompilerCommandFile >> >> That is a good idea, but I think you've forgotten to include those command files into the webrev. 
As a result, when I >> run the test with jtreg on my own machine, it fails with the following error message: > > Yes I did. Thanks for finding. > > I also added the added the bug number to the test attributes. > > Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.03/ > > Thanks for your help, > Nils > >> >> Thank you and best regards, >> >> >> Zoltan >> >>> * Added testcases for option parsing together with a method pattern that includes a signature >>> * Moved some testcases together that didn't require separate VMs >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8069035 >>> Webrev: http://cr.openjdk.java.net/~neliasso/8069035/webrev.01 >>> >>> Best regards, >>> Nils Eliasson >>> >> > From tobias.hartmann at oracle.com Wed Jan 21 17:28:13 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 21 Jan 2015 18:28:13 +0100 Subject: [9] RFR(S): 8069580: String intrinsic related cleanups In-Reply-To: <54BFE0C8.50700@oracle.com> References: <54BFA8E5.1060300@oracle.com> <54BFE0C8.50700@oracle.com> Message-ID: <54BFE1AD.6030303@oracle.com> Thanks, Vladimir. Best, Tobias On 21.01.2015 18:24, Vladimir Kozlov wrote: > On 1/21/15 5:35 AM, Roland Westrelin wrote: >>> http://cr.openjdk.java.net/~thartmann/8069580/webrev.00/ >> >> That looks good to me. > > +1 > > Vladimir > >> >> Roland. >> From vladimir.kozlov at oracle.com Wed Jan 21 17:29:36 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 21 Jan 2015 09:29:36 -0800 Subject: RFR (S): 8069389: CompilerOracle prefix wildcarding is broken for long strings on Mac and Solaris In-Reply-To: <54BFC2C8.3000805@oracle.com> References: <54BFC2C8.3000805@oracle.com> Message-ID: <54BFE200.7040202@oracle.com> Looks good. Thanks, Vladimir On 1/21/15 7:16 AM, Nils Eliasson wrote: > Hi, > > Can I please have a review for this small change. > > Tom found a error in the CompilerOracle method matcher. This bug can be seen on Mac and Solaris. 
> > I have changed from strcpy to memmove as Tom suggested and created a simple regression test. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8069389 > Webrev: http://cr.openjdk.java.net/~neliasso/8069389/webrev.01/ > > Best regards, > Nils From zoltan.majo at oracle.com Thu Jan 22 14:16:16 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 22 Jan 2015 15:16:16 +0100 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails Message-ID: <54C10630.5070104@oracle.com> Hi, please review the following small patch. Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 Problem: The test compiler/arguments/CheckCompileThresholdScaling.java fails due to changes by 8059606. The reason for the failure is the changed logic for the case when CompileThresholdScaling==0.0. Solution: This patch changes the way the VM handles CompileThresholdScaling==0.0. By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. Before, CompileThresholdScaling==0 set the value of all compilation thresholds to 0 *silently*. This behavior is inconsistent with the logic of -Xint, which leaves the values of the compilation thresholds unaffected. With this change, if CompileThresholdScaling==0, the -Xint flag is set but the value of compilation thresholds is left unchanged. I changed the compiler/arguments/CheckCompileThresholdScaling.java test to reflect this behavior. Added a test case for CompileThresholdScaling == 0 with tiered compilation disabled.
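The intended behavior can be sketched roughly as follows. This is only a minimal Java model of the decision, not the VM code: the class ThresholdScalingSketch and both method names are hypothetical, the non-tiered CompileThreshold == 0 case is an assumption about the surrounding flag check, and the plain multiplication stands in for whatever rounding or clamping the VM really applies.

```java
final class ThresholdScalingSketch {
    // Models the mode decision: a scaling factor of 0.0 is treated like -Xint.
    // Assumption: without tiered compilation, CompileThreshold == 0 also
    // implies interpreter-only execution.
    static boolean impliesInterpreterOnly(boolean tieredCompilation,
                                          double scaling,
                                          long compileThreshold) {
        return scaling == 0.0 || (!tieredCompilation && compileThreshold == 0);
    }

    // Models threshold scaling: with scaling == 0.0 the threshold is left
    // unchanged (as with -Xint) instead of being silently set to 0.
    static long scaledThreshold(long threshold, double scaling) {
        if (scaling == 0.0) {
            return threshold;
        }
        return (long) (threshold * scaling);
    }

    public static void main(String[] args) {
        System.out.println(impliesInterpreterOnly(true, 0.0, 10000));
        System.out.println(scaledThreshold(10000, 0.0));
        System.out.println(scaledThreshold(10000, 0.5));
    }
}
```

The point of the model is that interpreter-only mode and the threshold values are decided independently, so a scaling factor of 0.0 no longer has a hidden side effect on the thresholds.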
Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ Testing: manual testing, JPRT (all standard JPRT tests + CheckCompileThresholdScaling) Thank you and best regards, Zoltan From fredrik at dolda2000.com Thu Jan 22 14:36:03 2015 From: fredrik at dolda2000.com (Fredrik Tolf) Date: Thu, 22 Jan 2015 15:36:03 +0100 (CET) Subject: C2 compiler hangs and eats memory in escape analysis Message-ID: Dear list, I'm not all too familiar with the OpenJDK project, so if this is not the right place to discuss bugs, please tell me off and I'll know better. Otherwise, I seem to have encountered a bug in the C2 compiler where it appears to hang in the optimizer/escapeAnalysis/connectionGraph phase if the compilation log is anything to go by. I have no idea thus far what actually triggers the problem, but I appear to have found a reliable (though complex, hard-to-communicate, and without any obvious way to reduce to an SSCCE) procedure for reproducing it. When it happens, both the C2 compiler threads hang, consuming 100% CPU each and very often (but not always) eating all my memory. I've been able to reproduce it on three distinct systems, and with both Oracle JRE 8u31 and the Debian-supplied OpenJDK-8 JRE, so it seems to be a bug in "stock" Hotspot rather than some local configuration issue. All the systems are running x86_64 Linux, but I doubt that is related to the problem. I tried debugging it using GDB but either something confuses GDB, or the debug symbols are bogus somehow, or Hotspot uses some very weird flow control that I don't understand. As such, I'm at a bit of a loss as to how to debug this further, and I'd appreciate any help. As this seems to be perfectly reproducible, I can provide any number of compilation logs, core dumps, stack traces or anything else you might want. To begin with, here's a compilation log: I had this JVM running for some 90 seconds, and triggered the bug at slightly under 30. 
You can see both the C2 compiler threads ending with a "fragment" that ends with the connectionGraph phase. It seems common, also, to this problem that, when I actually trigger the problem, both threads encounter a single compilation with a connectionGraph phase that takes "longer than normal" (some 5-10 seconds, where as "normal" appears to be almost instantaneous), and only then comes another compilation that never finishes. In the case of this log, those "harder" compilations are IDs 4929 and 4930, for each thread. Now I'm really not sure how accurate this is given how weirdly GDB behaved, but just for the record, here's the backtrace of one of the compiler threads at an interrupt during this hang, from Debian's OpenJDK-8 JVM: Please let me know if I can provide more information. In the meantime, I'll try to get acquainted with the build procedure so that I can build my own Hotspot with, hopefully, more useful debugging characteristics. Thanks for reading! -- Fredrik Tolf From aleksey.shipilev at oracle.com Thu Jan 22 14:41:46 2015 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Thu, 22 Jan 2015 17:41:46 +0300 Subject: C2 compiler hangs and eats memory in escape analysis In-Reply-To: References: Message-ID: <54C10C2A.4050708@oracle.com> On 22.01.2015 17:36, Fredrik Tolf wrote: > I have no idea thus far what actually triggers the problem, but I appear > to have found a reliable (though complex, hard-to-communicate, and > without any obvious way to reduce to an SSCCE) procedure for reproducing > it. When it happens, both the C2 compiler threads hang, consuming 100% > CPU each and very often (but not always) eating all my memory. I've been > able to reproduce it on three distinct systems, and with both Oracle JRE > 8u31 and the Debian-supplied OpenJDK-8 JRE, so it seems to be a bug in > "stock" Hotspot rather than some local configuration issue. All the > systems are running x86_64 Linux, but I doubt that is related to the > problem. 
I would suggest that you try building the latest JDK 9, or using the JDK 9 EA, to see if this is already fixed. Vladimir has fixed issues like these before, and here is a recent one: https://bugs.openjdk.java.net/browse/JDK-8066199 The bug, if any, would be easier to demonstrate with the Timeline view in Solaris Studio Performance Analyzer, like this: http://cr.openjdk.java.net/~shade/8066199/timeline.png Thanks, -Aleksey. From david.r.chase at oracle.com Thu Jan 22 16:15:10 2015 From: david.r.chase at oracle.com (David Chase) Date: Thu, 22 Jan 2015 11:15:10 -0500 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: <54C10630.5070104@oracle.com> References: <54C10630.5070104@oracle.com> Message-ID: <53CFB4AE-5A4A-4F07-B345-A63FB4F7A5B5@oracle.com> Some of the explanation below ought to appear in the code in the form of a comment, perhaps before this:

  if ((TieredCompilation && CompileThresholdScaling == 0)
      || (!TieredCompilation && (CompileThreshold == 0 || CompileThresholdScaling == 0.0))) {
    set_mode_flags(_int);
  }

And could the if above be reorganized? So maybe something like (assuming I read it correctly):

  // CompileThresholdScaling == 0.0 is the same as -Xint; enable the interpreter but, like -Xint, leave other thresholds unaffected.
  // And if [ describe next condition in if ], use of the interpreter is also implied.
  if (CompileThresholdScaling == 0.0 || (!TieredCompilation && CompileThreshold == 0)) {
    set_mode_flags(_int);
  }

I think we should always err on the side of leaving too many breadcrumbs in our code, rather than too few. David On 2015-01-22, at 9:16 AM, Zoltán Majó wrote: > Solution: > > This patch changes the way the VM handles CompileThresholdScaling==0.0.
> > By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. > > Before, CompileThresholdScaling==0 set the value of all compilation thresholds to 0 *silently*. This behavior is inconsistent with the logic of -Xint, which leaves the values of the compilation thresholds unaffected. > > With this change, if CompileThresholdScaling==0, the -Xint flag is set but the value of compilation thresholds is left unchanged. > > I changed compiler/arguments/CheckCompileThresholdScaling.java test to reflect this behavior. Added test case for CompileThresholdScaling == 0 with tiered compilation disabled. > > > Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ > > Testing: manual testing, JPRT (all standard JPRT tests + CheckCompileThresholdScaling) > > Thank you and best regards, > > > Zoltan > From igor.veresov at oracle.com Thu Jan 22 17:37:13 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 22 Jan 2015 09:37:13 -0800 Subject: RFR(XS) 8071302: assert(!_reg_node[reg_lo] || edge_from_to(_reg_node[reg_lo], def)) failed: after block local scheduling Message-ID: This is caused by JDK-8068881. MachMerge node is scheduled incorrectly because it is not mapped to an appropriate block. Webrev: http://cr.openjdk.java.net/~iveresov/8071302/webrev.00/ Thanks, igor From vladimir.kozlov at oracle.com Thu Jan 22 17:40:19 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 22 Jan 2015 09:40:19 -0800 Subject: RFR(XS) 8071302: assert(!_reg_node[reg_lo] || edge_from_to(_reg_node[reg_lo], def)) failed: after block local scheduling In-Reply-To: References: Message-ID: <54C13603.20909@oracle.com> Looks good. Thanks, Vladimir On 1/22/15 9:37 AM, Igor Veresov wrote: > This is caused by JDK-8068881.
MachMerge node is scheduled incorrectly because it is not mapped to an appropriate block. > > Webrev: http://cr.openjdk.java.net/~iveresov/8071302/webrev.00/ > > Thanks, > igor > From vladimir.kozlov at oracle.com Thu Jan 22 17:42:11 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 22 Jan 2015 09:42:11 -0800 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: <54C10630.5070104@oracle.com> References: <54C10630.5070104@oracle.com> Message-ID: <54C13673.4060702@oracle.com> Good. Thanks, Vladimir On 1/22/15 6:16 AM, Zolt?n Maj? wrote: > Hi, > > > please review the following small patch. > > Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 > > Problem: The test compiler/arguments/CheckCompileThresholdScaling.java fails due to changes by 8059606. The reason of > the failure is the changed logic for the case when CompileThresholdScaling==0.0. > > > Solution: > > This patch changes the way the VM handles CompileThresholdScaling==0.0. > > By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. > > Before, CompileThresholdScaling==0 set the value of all compilation thresholds to 0 *silently*. This behavior is > inconsistent with the logic of -Xint, which leaves the values of the compilation thresholds unaffected. > > With this change, if CompileThresholdScaling==0, the -Xint flag is set but the value of compilation thresholds is left > unchanged. > > I changed compiler/arguments/CheckCompileThresholdScaling.java test to reflect this behavior. Added test case for > CompileThresholdScaling == 0 with tiered compilation disabled. 
> > > Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ > > Testing: manual testing, JPRT (all standard JPRT tests + CheckCompileThresholdScaling) > > Thank you and best regards, > > > Zoltan > From vladimir.x.ivanov at oracle.com Thu Jan 22 17:43:03 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 22 Jan 2015 20:43:03 +0300 Subject: RFR(XS) 8071302: assert(!_reg_node[reg_lo] || edge_from_to(_reg_node[reg_lo], def)) failed: after block local scheduling In-Reply-To: References: Message-ID: <54C136A7.3080602@oracle.com> Looks good. Best regards, Vladimir Ivanov On 1/22/15 8:37 PM, Igor Veresov wrote: > This is caused by JDK-8068881. MachMerge node is scheduled incorrectly because it is not mapped to an appropriate block. > > Webrev: http://cr.openjdk.java.net/~iveresov/8071302/webrev.00/ > > Thanks, > igor > From vladimir.kozlov at oracle.com Thu Jan 22 17:45:56 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 22 Jan 2015 09:45:56 -0800 Subject: C2 compiler hangs and eats memory in escape analysis In-Reply-To: <54C10C2A.4050708@oracle.com> References: <54C10C2A.4050708@oracle.com> Message-ID: <54C13754.50506@oracle.com> It is also https://bugs.openjdk.java.net/browse/JDK-8041984 Both bugs are fixed in jdk 8u40 too. Fredrik, you can download early access 8u40 and try it: https://jdk8.java.net/download.html Regards, Vladimir On 1/22/15 6:41 AM, Aleksey Shipilev wrote: > On 22.01.2015 17:36, Fredrik Tolf wrote: >> I have no idea thus far what actually triggers the problem, but I appear >> to have found a reliable (though complex, hard-to-communicate, and >> without any obvious way to reduce to an SSCCE) procedure for reproducing >> it. When it happens, both the C2 compiler threads hang, consuming 100% >> CPU each and very often (but not always) eating all my memory. 
I've been >> able to reproduce it on three distinct systems, and with both Oracle JRE >> 8u31 and the Debian-supplied OpenJDK-8 JRE, so it seems to be a bug in >> "stock" Hotspot rather than some local configuration issue. All the >> systems are running x86_64 Linux, but I doubt that is related to the >> problem. > > I would suggest you to try building the latest JDK 9, or using the JDK 9 > EA to see if this is fixed in already. Vladimir had fixed the issues > like these before, and here is the recent one: > https://bugs.openjdk.java.net/browse/JDK-8066199 > > The bug, if any, would be easier to demonstrate with Timeline view in > Solaris Studio Performance Analyzer, like this: > http://cr.openjdk.java.net/~shade/8066199/timeline.png > > Thanks, > -Aleksey. > From igor.veresov at oracle.com Thu Jan 22 18:11:16 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 22 Jan 2015 10:11:16 -0800 Subject: RFR(XS) 8071302: assert(!_reg_node[reg_lo] || edge_from_to(_reg_node[reg_lo], def)) failed: after block local scheduling In-Reply-To: <54C13603.20909@oracle.com> References: <54C13603.20909@oracle.com> Message-ID: Thanks, Vladimirs! igor > On Jan 22, 2015, at 9:40 AM, Vladimir Kozlov wrote: > > Looks good. > > Thanks, > Vladimir > > On 1/22/15 9:37 AM, Igor Veresov wrote: >> This is caused by JDK-8068881. MachMerge node is scheduled incorrectly because it is not mapped to an appropriate block. 
>> >> Webrev: http://cr.openjdk.java.net/~iveresov/8071302/webrev.00/ >> >> Thanks, >> igor >> From vladimir.x.ivanov at oracle.com Thu Jan 22 18:17:28 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 22 Jan 2015 21:17:28 +0300 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <54B959BF.1030709@oracle.com> References: <54B959BF.1030709@oracle.com> Message-ID: <54C13EB8.4060600@oracle.com> Updated version: http://cr.openjdk.java.net/~vlivanov/8068915/webrev.01 Added diagnostic message in compilation log. Added too_many_recompiles() for null_assert. Best regards, Vladimir Ivanov On 1/16/15 9:34 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 > https://bugs.openjdk.java.net/browse/JDK-8068915 > > The fix for 8063137 [1] (just sent out for review) uncovered another > issue with deoptimization counts pollution. In some cases, speculative > traps can stuck continuously deoptimizing and never reach recompilation. > Usually, it ruins performance if it happens. > > When a speculative guard is considered, Compile::too_many_traps() is > consulted to decide whether to add it or not. But, uncommon trap action > can be changed to Action_none in GraphKit::uncommon_trap, if > Compile::too_many_recompiles() fires. > > The fix is to (1) forbid changing uncommon trap action under the hood, > and (2) consult Compile::too_many_recompiles when adding a speculative > guard. > > The fix is based on 8063137 [2] (uncommon_trap_exact is used). > > Testing: JPRT, java/lang/invoke tests, nashorn. > > Thanks! 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8063137 > [2] http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ From vladimir.kozlov at oracle.com Thu Jan 22 18:41:28 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 22 Jan 2015 10:41:28 -0800 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <54C13EB8.4060600@oracle.com> References: <54B959BF.1030709@oracle.com> <54C13EB8.4060600@oracle.com> Message-ID: <54C14458.9050007@oracle.com> Okay. Thanks, Vladimir K On 1/22/15 10:17 AM, Vladimir Ivanov wrote: > Updated version: > http://cr.openjdk.java.net/~vlivanov/8068915/webrev.01 > > Added diagnostic message in compilation log. > Added too_many_recompiles() for null_assert. > > Best regards, > Vladimir Ivanov > > On 1/16/15 9:34 PM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 >> https://bugs.openjdk.java.net/browse/JDK-8068915 >> >> The fix for 8063137 [1] (just sent out for review) uncovered another >> issue with deoptimization counts pollution. In some cases, speculative >> traps can stuck continuously deoptimizing and never reach recompilation. >> Usually, it ruins performance if it happens. >> >> When a speculative guard is considered, Compile::too_many_traps() is >> consulted to decide whether to add it or not. But, uncommon trap action >> can be changed to Action_none in GraphKit::uncommon_trap, if >> Compile::too_many_recompiles() fires. >> >> The fix is to (1) forbid changing uncommon trap action under the hood, >> and (2) consult Compile::too_many_recompiles when adding a speculative >> guard. >> >> The fix is based on 8063137 [2] (uncommon_trap_exact is used). >> >> Testing: JPRT, java/lang/invoke tests, nashorn. >> >> Thanks! 
>> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8063137 >> [2] http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ From alejandro.murillo at oracle.com Thu Jan 22 19:20:03 2015 From: alejandro.murillo at oracle.com (Alejandro E Murillo) Date: Thu, 22 Jan 2015 12:20:03 -0700 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <54C13EB8.4060600@oracle.com> References: <54B959BF.1030709@oracle.com> <54C13EB8.4060600@oracle.com> Message-ID: <54C14D63.4080509@oracle.com> Hi Vladimir, looks like you want this for 8u40? if so, please move the 8u40-critical-watch from the backport to the main bug and get the approval from SQE cheers Alejandro On 1/22/2015 11:17 AM, Vladimir Ivanov wrote: > Updated version: > http://cr.openjdk.java.net/~vlivanov/8068915/webrev.01 > > Added diagnostic message in compilation log. > Added too_many_recompiles() for null_assert. > > Best regards, > Vladimir Ivanov > > On 1/16/15 9:34 PM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00 >> https://bugs.openjdk.java.net/browse/JDK-8068915 >> >> The fix for 8063137 [1] (just sent out for review) uncovered another >> issue with deoptimization counts pollution. In some cases, speculative >> traps can stuck continuously deoptimizing and never reach recompilation. >> Usually, it ruins performance if it happens. >> >> When a speculative guard is considered, Compile::too_many_traps() is >> consulted to decide whether to add it or not. But, uncommon trap action >> can be changed to Action_none in GraphKit::uncommon_trap, if >> Compile::too_many_recompiles() fires. >> >> The fix is to (1) forbid changing uncommon trap action under the hood, >> and (2) consult Compile::too_many_recompiles when adding a speculative >> guard. >> >> The fix is based on 8063137 [2] (uncommon_trap_exact is used). 
>> >> Testing: JPRT, java/lang/invoke tests, nashorn. >> >> Thanks! >> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8063137 >> [2] http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ -- Alejandro From john.r.rose at oracle.com Fri Jan 23 01:31:59 2015 From: john.r.rose at oracle.com (John Rose) Date: Thu, 22 Jan 2015 17:31:59 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54BEA7D7.6080008@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> Message-ID: <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov wrote: > >> What I'm mainly poking at here is that 'isGWT' is not informative about >> the intended use of the flag. > I agree. It was an interim solution. Initially, I planned to introduce customization and guide the logic based on that property. But it's not there yet and I needed something for GWT case. Unfortunately, I missed the case when GWT is edited. In that case, isGWT flag is missed and no annotation is set. > So, I removed isGWT flag and introduced a check for selectAlternative occurence in LambdaForm shape, as you suggested. Good. I think there is a sweeter spot just a little further on. Make profileBranch be an LF intrinsic and expose it like this: GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); } Then selectAlternative triggers branchy bytecodes in the IBGen, and profileBranch injects profiling in C2. The presence of profileBranch would then trigger the @Shared annotation, if you still need it. After thinking about it some more, I still believe it would be better to detect the use of profileBranch during a C2 compile task, and feed that to the too_many_traps logic. 
I agree it is much easier to stick the annotation on in the IBGen; the problem is that because of a minor phase ordering problem you are introducing an annotation which flows from the JDK to the VM. Here's one more suggestion at reducing this coupling? Note that C->set_trap_count is called when each Parse phase processes a whole method. This means that information about the contents of the nmethod accumulates during the parse. Likewise, add a flag method C->{has,set}_injected_profile, and set the flag whenever the parser sees a profileBranch intrinsic (with or without a constant profile array; your call). Then consult that flag from too_many_traps. It is true that code which is parsed upstream of the very first profileBranch will potentially issue a non-trapping fallback, but by definition that code would be unrelated to the injected profile, so I don't see a harm in that. If this approach works, then you can remove the annotation altogether, which is clearly preferable. We understand the annotation now, but it has the danger of becoming a maintainer's puzzlement. > >> In 'updateCounters', if the counter overflows, you'll get continuous >> creation of ArithmeticExceptions. Will that optimize or will it cause a >> permanent slowdown? Consider a hack like this on the exception path: >> counters[idx] = Integer.MAX_VALUE / 2; > I had an impression that VM optimizes overflows in Math.exact* intrinsics, but it's not the case - it always inserts an uncommon trap. I used the workaround you proposed. Good. > >> On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in >> the VM) promises too much "ignorance", since it suppresses branch counts >> and traps, but allows type profiles to be consulted. Maybe something >> positive like "@ManyTraps" or "@SharedMegamorphic"? (It's just a name, >> and this is just a suggestion.) > What do you think about @LambdaForm.Shared? That's fine. 
Suggest changing the JVM accessor to is_lambda_form_shared, because the term "shared" is already overused in the VM. Or, to be much more accurate, s/@Shared/@CollectiveProfile/. Better yet, get rid of it, as suggested above. (I just realized that profile pollution looks logically parallel to the http://en.wikipedia.org/wiki/Tragedy_of_the_commons .) Also, in the comment explaining the annotation: s/mostly useless/probably polluted by conflicting behavior from multiple call sites/ I very much like the fact that profileBranch is the VM intrinsic, not selectAlternative. A VM intrinsic should be nice and narrow like that. In fact, you can delete selectAlternative from vmSymbols while you are at it. (We could do profileInteger and profileClass in a similar way, if that turned out to be useful.) -- John From filipp.zhinkin at gmail.com Fri Jan 23 06:44:33 2015 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Fri, 23 Jan 2015 10:44:33 +0400 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: <54C10630.5070104@oracle.com> References: <54C10630.5070104@oracle.com> Message-ID: Hi Zoltan, maybe you can eliminate unnecessary else-if clause in Arguments::scaled_compile_threshold? if (scale == 1.0 || scale <= 0.0) { return threshold; } else { return (intx)(threshold * scale); } Thanks, Filipp. On Thu, Jan 22, 2015 at 5:16 PM, Zoltán Majó wrote: > Hi, > > > please review the following small patch. > > Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 > > Problem: The test compiler/arguments/CheckCompileThresholdScaling.java fails > due to changes by 8059606. The reason of the failure is the changed logic > for the case when CompileThresholdScaling==0.0. > > > Solution: > > This patch changes the way the VM handles CompileThresholdScaling==0.0. > > By convention, CompileThresholdScaling==0.0 is equivalent to -Xint.
> > Before, CompileThresholdScaling==0 set the value of all compilation > thresholds to 0 *silently*. This behavior is inconsistent with the logic of > -Xint, which leaves the values of the compilation thresholds unaffected. > > With this change, if CompileThresholdScaling==0, the -Xint flag is set but > the value of compilation thresholds is left unchanged. > > I changed compiler/arguments/CheckCompileThresholdScaling.java test to > reflect this behavior. Added test case for CompileThresholdScaling == 0 with > tiered compilation disabled. > > > Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ > > Testing: manual testing, JPRT (all standard JPRT tests + > CheckCompileThresholdScaling) > > Thank you and best regards, > > > Zoltan > From roland.westrelin at oracle.com Fri Jan 23 08:54:11 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Fri, 23 Jan 2015 09:54:11 +0100 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <54C13EB8.4060600@oracle.com> References: <54B959BF.1030709@oracle.com> <54C13EB8.4060600@oracle.com> Message-ID: <17EA4F8F-FD7D-4AB5-B30E-384F6FC2A15A@oracle.com> > Updated version: > http://cr.openjdk.java.net/~vlivanov/8068915/webrev.01 Looks good. Roland. From zoltan.majo at oracle.com Fri Jan 23 10:11:33 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 23 Jan 2015 11:11:33 +0100 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: References: <54C10630.5070104@oracle.com> Message-ID: <54C21E55.40402@oracle.com> Thank you, David, Vladimir, and Filipp, for the feedback! Here is the newest webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.01/ Please let me know if you think other changes are necessary! 
Best regards, Zoltan On 01/23/2015 07:44 AM, Filipp Zhinkin wrote: > Hi Zoltan, > > maybe you can eliminate unnecessary else-if clause in > Arguments::scaled_compile_threshold? > > if (scale == 1.0 || scale <= 0.0) { > return threshold; > } else { > return (intx)(threshold * scale); > } > > Thanks, > Filipp. > > On Thu, Jan 22, 2015 at 5:16 PM, Zoltán Majó wrote: >> Hi, >> >> >> please review the following small patch. >> >> Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 >> >> Problem: The test compiler/arguments/CheckCompileThresholdScaling.java fails >> due to changes by 8059606. The reason of the failure is the changed logic >> for the case when CompileThresholdScaling==0.0. >> >> >> Solution: >> >> This patch changes the way the VM handles CompileThresholdScaling==0.0. >> >> By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. >> >> Before, CompileThresholdScaling==0 set the value of all compilation >> thresholds to 0 *silently*. This behavior is inconsistent with the logic of >> -Xint, which leaves the values of the compilation thresholds unaffected. >> >> With this change, if CompileThresholdScaling==0, the -Xint flag is set but >> the value of compilation thresholds is left unchanged. >> >> I changed compiler/arguments/CheckCompileThresholdScaling.java test to >> reflect this behavior. Added test case for CompileThresholdScaling == 0 with >> tiered compilation disabled.
>> >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ >> >> Testing: manual testing, JPRT (all standard JPRT tests + >> CheckCompileThresholdScaling) >> >> Thank you and best regards, >> >> >> Zoltan >> From vladimir.x.ivanov at oracle.com Fri Jan 23 15:02:31 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 23 Jan 2015 18:02:31 +0300 Subject: [9, 8u40] RFR (XS): 8068915: uncommon trap w/ Reason_speculate_class_check causes performance regression due to continuous deoptimizations In-Reply-To: <17EA4F8F-FD7D-4AB5-B30E-384F6FC2A15A@oracle.com> References: <54B959BF.1030709@oracle.com> <54C13EB8.4060600@oracle.com> <17EA4F8F-FD7D-4AB5-B30E-384F6FC2A15A@oracle.com> Message-ID: <54C26287.1050009@oracle.com> Vladimir, Roland, John, thanks for review! Best regards, Vladimir Ivanov On 1/23/15 11:54 AM, Roland Westrelin wrote: >> Updated version: >> http://cr.openjdk.java.net/~vlivanov/8068915/webrev.01 > > Looks good. > > Roland. > From fredrik at dolda2000.com Fri Jan 23 23:41:02 2015 From: fredrik at dolda2000.com (Fredrik Tolf) Date: Sat, 24 Jan 2015 00:41:02 +0100 (CET) Subject: C2 compiler hangs and eats memory in escape analysis In-Reply-To: <54C13754.50506@oracle.com> References: <54C10C2A.4050708@oracle.com> <54C13754.50506@oracle.com> Message-ID: On Thu, 22 Jan 2015, Vladimir Kozlov wrote: > It is also https://bugs.openjdk.java.net/browse/JDK-8041984 > Both bugs are fixed in jdk 8u40 too. > > Fredrik, you can download early access 8u40 and try it: > > https://jdk8.java.net/download.html Ah, thank you and sorry for not checking the EA in advance. It does indeed catch the problem I'm seeing. However, judging from the description ("[...] in a very rare situation") of the linked bug report and the nature of the solution (a timeout if the CG phase takes too long), it seems to me that this might be a workaround to a problem that you might not have been able to reproduce reliably enough to find any root cause. 
Correct me if I'm wrong, of course. If that is the case, I do indeed have a close to perfectly reproducible test-case over here, so if you want me to produce information for you to find the actual root cause, I'd be happy to do so. -- Fredrik Tolf From vladimir.kozlov at oracle.com Fri Jan 23 23:57:55 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 23 Jan 2015 15:57:55 -0800 Subject: C2 compiler hangs and eats memory in escape analysis In-Reply-To: References: <54C10C2A.4050708@oracle.com> <54C13754.50506@oracle.com> Message-ID: <54C2E003.4070809@oracle.com> On 1/23/15 3:41 PM, Fredrik Tolf wrote: > On Thu, 22 Jan 2015, Vladimir Kozlov wrote: >> It is also https://bugs.openjdk.java.net/browse/JDK-8041984 >> Both bugs are fixed in jdk 8u40 too. >> >> Fredrik, you can download early access 8u40 and try it: >> >> https://jdk8.java.net/download.html > > Ah, thank you and sorry for not checking the EA in advance. It does > indeed catch the problem I'm seeing. Did it catch or solve your problem? > > However, judging from the description ("[...] in a very rare situation") > of the linked bug report and the nature of the solution (a timeout if > the CG phase takes too long), it seems to me that this might be a > workaround to a problem that you might not have been able to reproduce > reliably enough to find any root cause. Correct me if I'm wrong, of course. No, I reproduced it reliable with GraphHopper program as I reported in 8041984. The data connection graph is way too complex and it take a lot of time to build (about 20000 nodes in graph). I did not find a solution other than to check time on each build's iteration and bailout. I understand that the algorithm used to build the graph may be not optimal but we don't have resources to work on it now. 
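The bailout described above (check the elapsed time on each iteration of the graph build and give up once a limit is exceeded) can be illustrated with a minimal Java sketch. This is not the HotSpot code, which is C++; the class name, the method name `runWithTimeLimit`, and the limits are hypothetical and chosen only for illustration.

```java
import java.util.function.BooleanSupplier;

public class TimeBoundedBuild {
    /**
     * Hypothetical helper: runs `iteration` repeatedly while it reports that
     * more work remains (returns true). Returns true if the work converged,
     * or false if the time limit was exceeded and the loop bailed out.
     */
    static boolean runWithTimeLimit(BooleanSupplier iteration, long limitNanos) {
        final long start = System.nanoTime();
        while (iteration.getAsBoolean()) {
            // Check the clock once per iteration, as described in the message.
            if (System.nanoTime() - start > limitNanos) {
                return false; // bail out; the caller falls back to not optimizing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A build that converges after three iterations, with a generous limit.
        int[] remaining = {3};
        boolean converged = runWithTimeLimit(() -> --remaining[0] > 0, 1_000_000_000L);
        System.out.println("converged: " + converged);

        // A build that never converges: the 1 ms limit forces a bailout.
        boolean runaway = runWithTimeLimit(() -> true, 1_000_000L);
        System.out.println("runaway converged: " + runaway);
    }
}
```

The trade-off is the one stated in the message: the check bounds compile time without addressing why the graph build is slow in the first place.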
> > If that is the case, I do indeed have a close to perfectly reproducible > test-case over here, so if you want me to produce information for you to > find the actual root cause, I'd be happy to do so. Yes, it would be nice if you can provide a simple (small number of nodes) test which we can run. It may help to identify particular parts in code which we can improve. Regards, Vladimir > > > -- > Fredrik Tolf From fredrik at dolda2000.com Sat Jan 24 00:11:57 2015 From: fredrik at dolda2000.com (Fredrik Tolf) Date: Sat, 24 Jan 2015 01:11:57 +0100 (CET) Subject: C2 compiler hangs and eats memory in escape analysis In-Reply-To: <54C2E003.4070809@oracle.com> References: <54C10C2A.4050708@oracle.com> <54C13754.50506@oracle.com> <54C2E003.4070809@oracle.com> Message-ID: On Fri, 23 Jan 2015, Vladimir Kozlov wrote: >> Ah, thank you and sorry for not checking the EA in advance. It does >> indeed catch the problem I'm seeing. > > Did it catch or solve your problem? What I meant was that it catches the runaway compiler, so yes, it does solve my problem. >> If that is the case, I do indeed have a close to perfectly reproducible >> test-case over here, so if you want me to produce information for you to >> find the actual root cause, I'd be happy to do so. > > Yes, it would be nice if you can provide a simple (small number of nodes) > test which we can run. It may help to identify particular parts in code which > we can improve. Sorry, but I don't think I'll be able to do that. While it's perfectly reproducible, I can't find a way to isolate it out of the rather complex context I use to reproduce it. Thanks for your helpfulness! 
-- Fredrik Tolf From filipp.zhinkin at gmail.com Sat Jan 24 19:24:26 2015 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Sat, 24 Jan 2015 23:24:26 +0400 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: <54C21E55.40402@oracle.com> References: <54C10630.5070104@oracle.com> <54C21E55.40402@oracle.com> Message-ID: Hi Zoltan, thank you for fixing it. The change looks good to me (not a R-reviewer). Regards, Filipp. On Fri, Jan 23, 2015 at 1:11 PM, Zoltán Majó wrote: > Thank you, David, Vladimir, and Filipp, for the feedback! > > Here is the newest webrev: > http://cr.openjdk.java.net/~zmajo/8071312/webrev.01/ > > Please let me know if you think other changes are necessary! > > Best regards, > > > Zoltan > > > On 01/23/2015 07:44 AM, Filipp Zhinkin wrote: >> >> Hi Zoltan, >> >> maybe you can eliminate unnecessary else-if clause in >> Arguments::scaled_compile_threshold? >> >> if (scale == 1.0 || scale <= 0.0) { >> return threshold; >> } else { >> return (intx)(threshold * scale); >> } >> >> Thanks, >> Filipp. >> >> On Thu, Jan 22, 2015 at 5:16 PM, Zoltán Majó >> wrote: >>> >>> Hi, >>> >>> >>> please review the following small patch. >>> >>> Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 >>> >>> Problem: The test compiler/arguments/CheckCompileThresholdScaling.java >>> fails >>> due to changes by 8059606. The reason of the failure is the changed logic >>> for the case when CompileThresholdScaling==0.0. >>> >>> >>> Solution: >>> >>> This patch changes the way the VM handles CompileThresholdScaling==0.0. >>> >>> By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. >>> >>> Before, CompileThresholdScaling==0 set the value of all compilation >>> thresholds to 0 *silently*. This behavior is inconsistent with the logic >>> of >>> -Xint, which leaves the values of the compilation thresholds unaffected.
>>> >>> With this change, if CompileThresholdScaling==0, the -Xint flag is set >>> but >>> the value of compilation thresholds is left unchanged. >>> >>> I changed compiler/arguments/CheckCompileThresholdScaling.java test to >>> reflect this behavior. Added test case for CompileThresholdScaling == 0 >>> with >>> tiered compilation disabled. >>> >>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ >>> >>> Testing: manual testing, JPRT (all standard JPRT tests + >>> CheckCompileThresholdScaling) >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> > From vladimir.x.ivanov at oracle.com Mon Jan 26 16:41:50 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 26 Jan 2015 19:41:50 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> Message-ID: <54C66E4E.9050805@oracle.com> John, What do you think about the following version? http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 As you suggested, I reified MHI::profileBranch at the LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Shared was that profile pollution can affect the code before the profileBranch call (akin to 8068915 [1]), and that seems to be the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()). I can leave @Shared as is for now, or remove it and work on the fix for the deoptimization counts pollution. What do you prefer?
Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8068915 On 1/23/15 4:31 AM, John Rose wrote: > On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov > > wrote: >> >>> What I'm mainly poking at here is that 'isGWT' is not informative about >>> the intended use of the flag. >> I agree. It was an interim solution. Initially, I planned to introduce >> customization and guide the logic based on that property. But it's not >> there yet and I needed something for GWT case. Unfortunately, I missed >> the case when GWT is edited. In that case, isGWT flag is missed and no >> annotation is set. >> So, I removed isGWT flag and introduced a check for selectAlternative >> occurence in LambdaForm shape, as you suggested. > > Good. > > I think there is a sweeter spot just a little further on. Make > profileBranch be an LF intrinsic and expose it like this: > GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { > selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); } > > Then selectAlternative triggers branchy bytecodes in the IBGen, and > profileBranch injects profiling in C2. > The presence of profileBranch would then trigger the @Shared annotation, > if you still need it. > > After thinking about it some more, I still believe it would be better to > detect the use of profileBranch during a C2 compile task, and feed that > to the too_many_traps logic. I agree it is much easier to stick the > annotation on in the IBGen; the problem is that because of a minor phase > ordering problem you are introducing an annotation which flows from the > JDK to the VM. Here's one more suggestion at reducing this coupling? > > Note that C->set_trap_count is called when each Parse phase processes a > whole method. This means that information about the contents of the > nmethod accumulates during the parse. 
Likewise, add a flag method > C->{has,set}_injected_profile, and set the flag whenever the parser sees > a profileBranch intrinsic (with or without a constant profile array; > your call). Then consult that flag from too_many_traps. It is true > that code which is parsed upstream of the very first profileBranch will > potentially issue a non-trapping fallback, but by definition that code > would be unrelated to the injected profile, so I don't see a harm in > that. If this approach works, then you can remove the annotation > altogether, which is clearly preferable. We understand the annotation > now, but it has the danger of becoming a maintainer's puzzlement. > >> >>> In 'updateCounters', if the counter overflows, you'll get continuous >>> creation of ArithmeticExceptions. Will that optimize or will it cause a >>> permanent slowdown? Consider a hack like this on the exception path: >>> counters[idx] = Integer.MAX_VALUE / 2; >> I had an impression that VM optimizes overflows in Math.exact* >> intrinsics, but it's not the case - it always inserts an uncommon >> trap. I used the workaround you proposed. > > Good. > >> >>> On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in >>> the VM) promises too much "ignorance", since it suppresses branch counts >>> and traps, but allows type profiles to be consulted. Maybe something >>> positive like "@ManyTraps" or "@SharedMegamorphic"? (It's just a name, >>> and this is just a suggestion.) >> What do you think about @LambdaForm.Shared? > > That's fine. Suggest changing the JVM accessor to > is_lambda_form_shared, because the term "shared" is already overused in > the VM. > > Or, to be much more accurate, s/@Shared/@CollectiveProfile/. Better > yet, get rid of it, as suggested above. > > (I just realized that profile pollution looks logically parallel to the > http://en.wikipedia.org/wiki/Tragedy_of_the_commons .) 
> > Also, in the comment explaining the annotation: > s/mostly useless/probably polluted by conflicting behavior from > multiple call sites/ > > I very much like the fact that profileBranch is the VM intrinsic, not > selectAlternative. A VM intrinsic should be nice and narrow like that. > In fact, you can delete selectAlternative from vmSymbols while you are > at it. > > (We could do profileInteger and profileClass in a similar way, if that > turned out to be useful.) > > ? John From vladimir.x.ivanov at oracle.com Mon Jan 26 18:31:30 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 26 Jan 2015 21:31:30 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54C66E4E.9050805@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> Message-ID: <54C68802.7020105@oracle.com> > As you suggested, I reified MHI::profileBranch on LambdaForm level and > removed @LambdaForm.Shared. My main concern about removing @Sharen was > that profile pollution can affect the code before profileBranch call > (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is > sensitive to that change (there's a 10% difference in peak performance > between @Shared and has_injected_profile()). Ignore that. Additional runs don't prove there's a regression on Gbemu. There's some variance on Gbemu and it's present w/ and w/o @Shared. Best regards, Vladimir Ivanov > I can leave @Shared as is for now or remove it and work on the fix to > the deoptimization counts pollution. What do you prefer? 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8068915 > > On 1/23/15 4:31 AM, John Rose wrote: >> On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov >> > >> wrote: >>> >>>> What I'm mainly poking at here is that 'isGWT' is not informative about >>>> the intended use of the flag. >>> I agree. It was an interim solution. Initially, I planned to introduce >>> customization and guide the logic based on that property. But it's not >>> there yet and I needed something for GWT case. Unfortunately, I missed >>> the case when GWT is edited. In that case, isGWT flag is missed and no >>> annotation is set. >>> So, I removed isGWT flag and introduced a check for selectAlternative >>> occurence in LambdaForm shape, as you suggested. >> >> Good. >> >> I think there is a sweeter spot just a little further on. Make >> profileBranch be an LF intrinsic and expose it like this: >> GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { >> selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); } >> >> Then selectAlternative triggers branchy bytecodes in the IBGen, and >> profileBranch injects profiling in C2. >> The presence of profileBranch would then trigger the @Shared annotation, >> if you still need it. >> >> After thinking about it some more, I still believe it would be better to >> detect the use of profileBranch during a C2 compile task, and feed that >> to the too_many_traps logic. I agree it is much easier to stick the >> annotation on in the IBGen; the problem is that because of a minor phase >> ordering problem you are introducing an annotation which flows from the >> JDK to the VM. Here's one more suggestion at reducing this coupling? >> >> Note that C->set_trap_count is called when each Parse phase processes a >> whole method. This means that information about the contents of the >> nmethod accumulates during the parse. 
Likewise, add a flag method >> C->{has,set}_injected_profile, and set the flag whenever the parser sees >> a profileBranch intrinsic (with or without a constant profile array; >> your call). Then consult that flag from too_many_traps. It is true >> that code which is parsed upstream of the very first profileBranch will >> potentially issue a non-trapping fallback, but by definition that code >> would be unrelated to the injected profile, so I don't see a harm in >> that. If this approach works, then you can remove the annotation >> altogether, which is clearly preferable. We understand the annotation >> now, but it has the danger of becoming a maintainer's puzzlement. >> >>> >>>> In 'updateCounters', if the counter overflows, you'll get continuous >>>> creation of ArithmeticExceptions. Will that optimize or will it >>>> cause a >>>> permanent slowdown? Consider a hack like this on the exception path: >>>> counters[idx] = Integer.MAX_VALUE / 2; >>> I had an impression that VM optimizes overflows in Math.exact* >>> intrinsics, but it's not the case - it always inserts an uncommon >>> trap. I used the workaround you proposed. >> >> Good. >> >>> >>>> On the Name Bikeshed: It looks like @IgnoreProfile (ignore_profile in >>>> the VM) promises too much "ignorance", since it suppresses branch >>>> counts >>>> and traps, but allows type profiles to be consulted. Maybe something >>>> positive like "@ManyTraps" or "@SharedMegamorphic"? (It's just a name, >>>> and this is just a suggestion.) >>> What do you think about @LambdaForm.Shared? >> >> That's fine. Suggest changing the JVM accessor to >> is_lambda_form_shared, because the term "shared" is already overused in >> the VM. >> >> Or, to be much more accurate, s/@Shared/@CollectiveProfile/. Better >> yet, get rid of it, as suggested above. >> >> (I just realized that profile pollution looks logically parallel to the >> http://en.wikipedia.org/wiki/Tragedy_of_the_commons .) 
>> >> Also, in the comment explaining the annotation: >> s/mostly useless/probably polluted by conflicting behavior from >> multiple call sites/ >> >> I very much like the fact that profileBranch is the VM intrinsic, not >> selectAlternative. A VM intrinsic should be nice and narrow like that. >> In fact, you can delete selectAlternative from vmSymbols while you are >> at it. >> >> (We could do profileInteger and profileClass in a similar way, if that >> turned out to be useful.) >> >> -- John From zoltan.majo at oracle.com Mon Jan 26 22:00:52 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 26 Jan 2015 23:00:52 +0100 Subject: [9] RFR(S): 8071312: compiler/arguments/CheckCompileThresholdScaling.java fails In-Reply-To: References: <54C10630.5070104@oracle.com> <54C21E55.40402@oracle.com> Message-ID: <54C6B914.9070807@oracle.com> Thank you, Filipp! Best regards, Zoltan On 01/24/2015 08:24 PM, Filipp Zhinkin wrote: > Hi Zoltan, > > thank you for fixing it. > The change looks good to me (not a R-reviewer). > > Regards, > Filipp. > > On Fri, Jan 23, 2015 at 1:11 PM, Zoltán Majó wrote: >> Thank you, David, Vladimir, and Filipp, for the feedback! >> >> Here is the newest webrev: >> http://cr.openjdk.java.net/~zmajo/8071312/webrev.01/ >> >> Please let me know if you think other changes are necessary! >> >> Best regards, >> >> >> Zoltan >> >> >> On 01/23/2015 07:44 AM, Filipp Zhinkin wrote: >>> Hi Zoltan, >>> >>> maybe you can eliminate unnecessary else-if clause in >>> Arguments::scaled_compile_threshold? >>> >>> if (scale == 1.0 || scale <= 0.0) { >>> return threshold; >>> } else { >>> return (intx)(threshold * scale); >>> } >>> >>> Thanks, >>> Filipp. >>> >>> On Thu, Jan 22, 2015 at 5:16 PM, Zoltán Majó >>> wrote: >>>> Hi, >>>> >>>> >>>> please review the following small patch.
>>>> >>>> Bug (nightly failure): https://bugs.openjdk.java.net/browse/JDK-8071312 >>>> >>>> Problem: The test compiler/arguments/CheckCompileThresholdScaling.java >>>> fails >>>> due to changes by 8059606. The reason of the failure is the changed logic >>>> for the case when CompileThresholdScaling==0.0. >>>> >>>> >>>> Solution: >>>> >>>> This patch changes the way the VM handles CompileThresholdScaling==0.0. >>>> >>>> By convention, CompileThresholdScaling==0.0 is equivalent to -Xint. >>>> >>>> Before, CompileThresholdScaling==0 set the value of all compilation >>>> thresholds to 0 *silently*. This behavior is inconsistent with the logic >>>> of >>>> -Xint, which leaves the values of the compilation thresholds unaffected. >>>> >>>> With this change, if CompileThresholdScaling==0, the -Xint flag is set >>>> but >>>> the value of compilation thresholds is left unchanged. >>>> >>>> I changed compiler/arguments/CheckCompileThresholdScaling.java test to >>>> reflect this behavior. Added test case for CompileThresholdScaling == 0 >>>> with >>>> tiered compilation disabled. >>>> >>>> >>>> Webrev: http://cr.openjdk.java.net/~zmajo/8071312/webrev.00/ >>>> >>>> Testing: manual testing, JPRT (all standard JPRT tests + >>>> CheckCompileThresholdScaling) >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> From john.r.rose at oracle.com Tue Jan 27 00:04:03 2015 From: john.r.rose at oracle.com (John Rose) Date: Mon, 26 Jan 2015 16:04:03 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54C66E4E.9050805@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> Message-ID: <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov wrote: > > What do you think about the following version? 
> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 > > As you suggested, I reified MHI::profileBranch on LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Sharen was that profile pollution can affect the code before profileBranch call (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()). > > I can leave @Shared as is for now or remove it and work on the fix to the deoptimization counts pollution. What do you prefer? Generic advice here: It's better to leave it out, if in doubt. If it has a real benefit, and we don't have time to make it clean, put it in and file a tracking bug to clean it up. I re-read the change. It's simpler and more coherent now. I see one more issue which we should fix now, while we can. It's the sort of thing which is hard to clean up later. The two fields of the profileBranch array have obscure and inconsistent labelings. It took me some hard thought and the inspection of three files to decide what "taken" and "not taken" mean in the C2 code that injects the profile. The problem is that, when you look at profileBranch, all you see is an integer (boolean) argument and an array, and no clear indication about which array element corresponds to which argument value. It's made worse by the fact that "taken" and "not taken" are not mentioned at all in the JDK code, which instead wires together the branches of selectAlternative without much comment. My preferred formulation, for making things clearer: Decouple the idea of branching from the idea of profile injection. Name the intrinsic (yes, one more bikeshed color) "profileBoolean" (or even "injectBooleanProfile"), and use the natural indexing of the array: 0 (Java false) is a[0], and 1 (Java true) is a[1]. 
We might later extend this to work with "booleans" (more generally, small-integer flags), of more than two possible values, klasses, etc.

This line then goes away, and 'result' is used directly as the profile index:
+ int idx = result ? 0 : 1;

The ProfileBooleanNode should have an embedded (or simply indirect) array of ints which is a simple copy of the profile array, so there's no doubt about which count is which.

The parsing of the predicate that contains "profileBoolean" should probably be more robust, at least allowing for 'eq' and 'ne' versions of the test. (C2 freely flips comparison senses, in various places.) The check for Op_AndI must be more precise; make sure n->in(2) is a constant of the expected value (1). The most robust way to handle it (but try this another time, I think) would be to make two temp copies of the predicate, substituting the occurrence of ProfileBoolean with '0' and '1', respectively; if they both fold to '0' and '1' or '1' and '0', then you take the indicated action.

I suggest putting the new code in Parse::dynamic_branch_prediction, which pattern-matches for injected profiles, into its own subroutine. Maybe:
  bool use_mdo = true;
  if (has_injected_profile(btest, test, &taken, &not_taken)) {
    use_mdo = false;
  }
  if (use_mdo) { ... old code

I see why you used the opposite order in the existing code: It mirrors the order of the second and third arguments to selectAlternative. But the JVM knows nothing about selectAlternative, so it's just confusing when reading the VM code to know which profile array element means what.

-- John

P.S. Long experience with byte-order bugs in HotSpot convinces me that if you are not scrupulously clear in your terms, when working with equal and opposite configuration pairs, you will have a long bug tail, especially if you have to maintain agreement about the configurations through many layers of software. This is one of those cases. The best chance to fix such bugs is not to allow them in the first place.
In the case of byte-order, we have "first" vs. "second", "MSB" vs. "LSB", and "high" vs. "low" parts of values, for values in memory and in registers, and all possible misunderstandings about them and their relation have probably happened and caused bugs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.x.ivanov at oracle.com Tue Jan 27 16:05:19 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 27 Jan 2015 19:05:19 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> Message-ID: <54C7B73F.50404@oracle.com> Thanks for the feedback, John! Updated webrev: http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot Changes: - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to ProfileBooleanNode; - restructured profile layout ([0] => false_cnt, [1] => true_cnt) - factored out profile injection in a separate function (has_injected_profile() in parse2.cpp) - ProfileBooleanNode stores true/false counts instead of taken/not_taken counts - matching from value counts to taken/not_taken happens in has_injected_profile(); - added BoolTest::ne support - sharpened test for AndI case: now it checks AndI (ProfileBoolean) (ConI 1) shape Best regards, Vladimir Ivanov On 1/27/15 3:04 AM, John Rose wrote: > On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov > > wrote: >> >> What do you think about the following version? 
>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 >> >> As you suggested, I reified MHI::profileBranch on LambdaForm level and >> removed @LambdaForm.Shared. My main concern about removing @Sharen was >> that profile pollution can affect the code before profileBranch call >> (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is >> sensitive to that change (there's a 10% difference in peak performance >> between @Shared and has_injected_profile()). >> >> I can leave @Shared as is for now or remove it and work on the fix to >> the deoptimization counts pollution. What do you prefer? > > Generic advice here: It's better to leave it out, if in doubt. If it > has a real benefit, and we don't have time to make it clean, put it in > and file a tracking bug to clean it up. > > I re-read the change. It's simpler and more coherent now. > > I see one more issue which we should fix now, while we can. It's the > sort of thing which is hard to clean up later. > > The two fields of the profileBranch array have obscure and inconsistent > labelings. It took me some hard thought and the inspection of three > files to decide what "taken" and "not taken" mean in the C2 code that > injects the profile. The problem is that, when you look at > profileBranch, all you see is an integer (boolean) argument and an > array, and no clear indication about which array element corresponds to > which argument value. It's made worse by the fact that "taken" and "not > taken" are not mentioned at all in the JDK code, which instead wires > together the branches of selectAlternative without much comment. > > My preferred formulation, for making things clearer: Decouple the idea > of branching from the idea of profile injection. Name the intrinsic > (yes, one more bikeshed color) "profileBoolean" (or even > "injectBooleanProfile"), and use the natural indexing of the array: 0 > (Java false) is a[0], and 1 (Java true) is a[1]. 
We might later extend
> this to work with "booleans" (more generally, small-integer flags), of
> more than two possible values, klasses, etc.
>
> This line then goes away, and 'result' is used directly as the profile
> index:
> + int idx = result ? 0 : 1;
>
> The ProfileBooleanNode should have an embedded (or simply indirect)
> array of ints which is a simple copy of the profile array, so there's no
> doubt about which count is which.
>
> The parsing of the predicate that contains "profileBoolean" should
> probably be more robust, at least allowing for 'eq' and 'ne' versions of
> the test. (C2 freely flips comparison senses, in various places.) The
> check for Op_AndI must be more precise; make sure n->in(2) is a constant
> of the expected value (1). The most robust way to handle it (but try
> this another time, I think) would be to make two temp copies of the
> predicate, substituting the occurrence of ProfileBoolean with '0' and
> '1', respectively; if they both fold to '0' and '1' or '1' and '0', then
> you take the indicated action.
>
> I suggest putting the new code in Parse::dynamic_branch_prediction,
> which pattern-matches for injected profiles, into its own subroutine.
> Maybe:
>   bool use_mdo = true;
>   if (has_injected_profile(btest, test, &taken, &not_taken)) {
>     use_mdo = false;
>   }
>   if (use_mdo) { ... old code
>
> I see why you used the opposite order in the existing code: It mirrors
> the order of the second and third arguments to selectAlternative. But
> the JVM knows nothing about selectAlternative, so it's just confusing
> when reading the VM code to know which profile array element means what.
>
> -- John
>
> P.S. Long experience with byte-order bugs in HotSpot convinces me that
> if you are not scrupulously clear in your terms, when working with equal
> and opposite configuration pairs, you will have a long bug tail,
> especially if you have to maintain agreement about the configurations
> through many layers of software. This is one of those cases.
The best > chance to fix such bugs is not to allow them in the first place. In the > case of byte-order, we have "first" vs. "second", "MSB" vs. "LSB", and > "high" vs. "low" parts of values, for values in memory and in registers, > and all possible misunderstandings about them and their relation have > probably happened and caused bugs. From tatiana.pivovarova at oracle.com Tue Jan 27 16:06:38 2015 From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova) Date: Tue, 27 Jan 2015 19:06:38 +0300 Subject: RFR(S): 7605373: Decrease count of CTW testlists Message-ID: <54C7B78E.80402@oracle.com> Hi all, Please review this small enhancement. bug-id: https://bugs.openjdk.java.net/browse/INTJDK-7605373 webrev: http://cr.openjdk.java.net/~tpivovarova/7605373/webrev.00/ Problem: There are 106 testlist for ctw, so for 6 platforms, 4 tiered levels we have 2544 run jobs. Some of them have very small execution time (< 1min), but others execute more than 3hour. Solution: This tool enumerates all jar files in specified dirs and then grouping them by approximate equal running time (the running time calculated for all methods in all classes) Thanks, Tatiana From tatiana.pivovarova at oracle.com Tue Jan 27 16:52:48 2015 From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova) Date: Tue, 27 Jan 2015 19:52:48 +0300 Subject: RFR(S): 7605373: Decrease count of CTW testlists In-Reply-To: <54C7B78E.80402@oracle.com> References: <54C7B78E.80402@oracle.com> Message-ID: <54C7C260.306@oracle.com> Sorry guys, This RFR was sent by mistake Cancel my request. Thanks, Tatiana On 01/27/2015 07:06 PM, Tatiana Pivovarova wrote: > Hi all, > > Please review this small enhancement. > > bug-id: https://bugs.openjdk.java.net/browse/INTJDK-7605373 > webrev: http://cr.openjdk.java.net/~tpivovarova/7605373/webrev.00/ > > Problem: > There are 106 testlist for ctw, so for 6 platforms, 4 tiered levels we > have 2544 run jobs. 
Some of them have very small execution time (< > 1min), but others execute more than 3hour. > > Solution: > This tool enumerates all jar files in specified dirs and then grouping > them by approximate equal running time (the running time calculated > for all methods in all classes) > > Thanks, > Tatiana From john.r.rose at oracle.com Tue Jan 27 21:08:47 2015 From: john.r.rose at oracle.com (John Rose) Date: Tue, 27 Jan 2015 13:08:47 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54C7B73F.50404@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> <54C7B73F.50404@oracle.com> Message-ID: <8AD9A8CC-E570-4DE6-ABB1-10B00FACB8AB@oracle.com> Looking very good, thanks. Ship it! Actually, can you insert a comment why the injected counts are not scaled? (Or perhaps they should be??) Also, we may need a followup bug for the code with this comment: // Look for the following shape: AndI (ProfileBoolean) (ConI 1)) Since profileBoolean returns a TypeInt::BOOL, the AndI with (ConI 1) should fold up. So there's some work to do in MulNode, which may allow that special pattern match to go away. But I don't want to divert the present bug by a possibly complex dive into fixing AndI::Ideal. (Generally speaking, pattern matching should assume strong normalization of its inputs. Otherwise you end up duplicating pattern match code in many places, inconsistently. Funny one-off idiom checks like this are evidence of incomplete IR normalization. See http://en.wikipedia.org/wiki/Rewriting for some background on terms like "normalization" and "confluence" which are relevant to C2.) ? John On Jan 27, 2015, at 8:05 AM, Vladimir Ivanov wrote: > > Thanks for the feedback, John! 
> > Updated webrev: > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot > > Changes: > - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to ProfileBooleanNode; > - restructured profile layout ([0] => false_cnt, [1] => true_cnt) > - factored out profile injection in a separate function (has_injected_profile() in parse2.cpp) > - ProfileBooleanNode stores true/false counts instead of taken/not_taken counts > - matching from value counts to taken/not_taken happens in has_injected_profile(); > - added BoolTest::ne support > - sharpened test for AndI case: now it checks AndI (ProfileBoolean) (ConI 1) shape > > Best regards, > Vladimir Ivanov From vladimir.x.ivanov at oracle.com Wed Jan 28 09:00:55 2015 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 28 Jan 2015 12:00:55 +0300 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <8AD9A8CC-E570-4DE6-ABB1-10B00FACB8AB@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> <54C7B73F.50404@oracle.com> <8AD9A8CC-E570-4DE6-ABB1-10B00FACB8AB@oracle.com> Message-ID: <54C8A547.6050607@oracle.com> > Looking very good, thanks. Ship it! Thanks, John! > Actually, can you insert a comment why the injected counts are not scaled? (Or perhaps they should be??) Sure! I intentionally don't scale the counts because I don't see any reason to do so. Profiling is done on per-MethodHandle basis, so the counts should be very close (considering racy updates) to the actual behavior. 
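For readers following the count matching, here is a minimal standalone sketch of how the [0] => false_cnt, [1] => true_cnt layout could be turned into taken/not_taken counts. It is not the webrev code: TestKind and injected_branch_counts are hypothetical names, and it assumes the profiled boolean is compared against 0, so an eq test is taken when the value is false and an ne test when it is true. The counts are passed through unscaled, as described above.

```cpp
#include <cassert>

// Hypothetical stand-in for the comparison kinds C2 distinguishes;
// this is not HotSpot's BoolTest.
enum TestKind { TEST_EQ, TEST_NE, TEST_LT };

// Sketch: translate an injected boolean profile into branch counts for a
// test of the form "value <kind> 0" (assumed shape of the predicate).
//   profile[0] = number of times the value was false
//   profile[1] = number of times the value was true
// Returns false and leaves the outputs untouched for unsupported test kinds,
// in which case a caller would fall back to the regular profile data.
bool injected_branch_counts(TestKind kind, const int profile[2],
                            int* taken, int* not_taken) {
  assert(taken != nullptr && not_taken != nullptr);
  if (kind != TEST_EQ && kind != TEST_NE) {
    return false;
  }
  const int false_cnt = profile[0];
  const int true_cnt  = profile[1];
  // "value == 0" is taken when the value is false;
  // "value != 0" is taken when the value is true.
  *taken     = (kind == TEST_EQ) ? false_cnt : true_cnt;
  *not_taken = (kind == TEST_EQ) ? true_cnt  : false_cnt;
  return true;
}
```

With a profile of {3, 97}, an ne test would get taken = 97 and not_taken = 3, and an eq test the reverse; any other test kind returns false so that the caller can keep using the MDO counts.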
> Also, we may need a followup bug for the code with this comment:
> // Look for the following shape: AndI (ProfileBoolean) (ConI 1))
>
> Since profileBoolean returns a TypeInt::BOOL, the AndI with (ConI 1) should fold up.
> So there's some work to do in MulNode, which may allow that special pattern match to go away.
> But I don't want to divert the present bug by a possibly complex dive into fixing AndI::Ideal.

Good catch! It's an oversight on my side. The following change for ProfileBooleanNode solves the problem:
- virtual const Type *bottom_type() const { return TypeInt::INT; }
+ virtual const Type *bottom_type() const { return TypeInt::BOOL; }

I polished the change a little according to your comments (diff against v03):
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot

Changes:
- added short explanation why injected counts aren't scaled
- adjusted ProfileBooleanNode type to TypeInt::BOOL and removed excessive pattern matching in has_injected_profile()
- added an assert when ProfileBooleanNode is removed to catch the cases when the injected profile isn't used: if we decide to generalize the API, I'd be happy to remove it, but current usages assume that injected counts are always consumed during parsing, and missing cases can cause hard-to-diagnose performance problems.

Best regards,
Vladimir Ivanov

>
> (Generally speaking, pattern matching should assume strong normalization of its inputs. Otherwise you end up duplicating pattern match code in many places, inconsistently. Funny one-off idiom checks like this are evidence of incomplete IR normalization. See http://en.wikipedia.org/wiki/Rewriting for some background on terms like "normalization" and "confluence" which are relevant to C2.)
>
> -- John
>
> On Jan 27, 2015, at 8:05 AM, Vladimir Ivanov wrote:
>>
>> Thanks for the feedback, John!
>> >> Updated webrev: >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk >> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot >> >> Changes: >> - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to ProfileBooleanNode; >> - restructured profile layout ([0] => false_cnt, [1] => true_cnt) >> - factored out profile injection in a separate function (has_injected_profile() in parse2.cpp) >> - ProfileBooleanNode stores true/false counts instead of taken/not_taken counts >> - matching from value counts to taken/not_taken happens in has_injected_profile(); >> - added BoolTest::ne support >> - sharpened test for AndI case: now it checks AndI (ProfileBoolean) (ConI 1) shape >> >> Best regards, >> Vladimir Ivanov > From pavel.punegov at oracle.com Wed Jan 28 12:18:18 2015 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Wed, 28 Jan 2015 15:18:18 +0300 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters Message-ID: <54C8D38A.6050100@oracle.com> Hi, please review the following small change. With the fix for JDK-8056071 [*] constant getters are now compiled on level 1, so there are no need to create MDO for constant getter methods in TieredCompilation Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ Testing: done locally and with JPRT. -- [*] https://bugs.openjdk.java.net/browse/JDK-8056071 -- Thanks, Pavel Punegov From zoltan.majo at oracle.com Wed Jan 28 15:51:18 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Wed, 28 Jan 2015 16:51:18 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly Message-ID: <54C90576.4070806@oracle.com> Hi, please review the following small patch. Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 Problem: The disassembler has two different ways of handling OOPs embedded into instructions. 
1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. The comment block following the instruction contains the name of the OOP's class. For example:

0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = 'java/lang/invoke/MethodHandle')}

2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class name (this functionality was available only before PermGen removal, see JDK-8066438 for more details). The comment block following the disassembled instruction contains the same information *again* (the name of the OOP's class). For example:

0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')}

(output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly)

Whether an OOP is replaced with its class name depends on the external binutils library. For some types of instructions (e.g., compares, jumps and calls on x86), binutils indicates to the VM that there are addresses embedded into the instruction (and then the VM applies Way 2, as in the cmp example). For some instructions (e.g., movs on x86), binutils does not indicate to the VM that the instruction contains an embedded address and the VM applies Way 1, as in the mov example.

Solution: This patch proposes that the disassembler handles OOPs in a single way (Way 1): An embedded OOP is printed in hexadecimal within the instruction; the comment following the disassembled instruction prints the OOP's class. This way both the OOP as a raw address and the OOP's class are available in the disassembly by default.

The patch proposes to not print an OOP's class in decode_env::print_address() anymore; printing should be done only in nmethod::print_code_comment_on(). The nmethod::embeddedOop_at method is not needed any more and is therefore removed.

Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/

Testing: manual testing, built + minimal tests with JPRT on all supported platforms.

Thank you!
Best regards,

Zoltan

From vladimir.kozlov at oracle.com Wed Jan 28 17:07:08 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Jan 2015 09:07:08 -0800
Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters
In-Reply-To: <54C8D38A.6050100@oracle.com>
References: <54C8D38A.6050100@oracle.com>
Message-ID: <54C9173C.6000904@oracle.com>

Looks good. Does this change affect any tests which have to be modified too? I remember some WB tests use methods that return only a constant to test compilation.

Thanks,
Vladimir

On 1/28/15 4:18 AM, Pavel Punegov wrote:
> Hi,
>
> please review the following small change.
>
> With the fix for JDK-8056071 [*] constant getters are now compiled on level 1,
> so there are no need to create MDO for constant getter methods in TieredCompilation
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8067012
> webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/
>
> Testing: done locally and with JPRT.
>
> --
> [*] https://bugs.openjdk.java.net/browse/JDK-8056071
>

From vladimir.kozlov at oracle.com Wed Jan 28 17:33:29 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Jan 2015 09:33:29 -0800
Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly
In-Reply-To: <54C90576.4070806@oracle.com>
References: <54C90576.4070806@oracle.com>
Message-ID: <54C91D69.1020203@oracle.com>

So this is just clean up since Universe::heap()->is_in(obj->klass()) is false after PermGen removal.

Now you get only a hexadecimal value and no comment when binutils does not indicate to the VM that the instruction has an embedded oop. Am I right or wrong?

If I am right, can we do something about that? I agree that we should not replace the hexadecimal address, but can we generate the comment too? Even if binutils does not report it, the VM can see it: _nm->embeddedOop_at(cur_insn())

Thanks,
Vladimir

On 1/28/15 7:51 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the following small patch.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 > > > Problem: The disassembler has two different ways of handling OOPs embedded into instructions. > > 1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. The comment block following the > instruction contains the name of the OOP's class. For example: > > 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = 'java/lang/invoke/MethodHandle')} > > 2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class name (this functionality was available > only before PermGen removal, see JDK-8066438 for more details). The comment block following the disassembled instruction > contains the same information *again* (the name of the OOP's class). For example: > > 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')} > > (output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) > > Whether an OOP is replaced with its class name depends on the external binutils library. For some types of instructions > (e.g., compares, jumps and calls on x86), binutils indicates to the VM that there are addresses embedded into the > instruction (and then the VM applies Way 1). For some instructions (e.g., movs on x86), binutils does not indicate to > the VM that the instruction contains an embedded address and the VM applies Way 2. > > > Solution: This patch proposes that the disassembler handles OOPs in a single way (Way 1): An embedded OOP's is printed > in hexadecimal within the instruction; the comment following the disassembled instruction prints the OOP's class. This > way both the OOP as raw address and the OOP's class is available in the disassembly by default. > > The patch proposes to not print an OOP's class in decode_env::print_address() anymore; printing should be done only in > nmethod::print_code_comment_on(). Thenmethod::embeddedOop_at method is not needed any more and is therefore removed. 
> > Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ > > Testing: manual testing, built + minimal tests with JPRT on all supported platforms. > > Thank you! > > Best regards, > > > Zoltan > From pavel.punegov at oracle.com Wed Jan 28 17:47:29 2015 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Wed, 28 Jan 2015 20:47:29 +0300 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters In-Reply-To: <54C9173C.6000904@oracle.com> References: <54C8D38A.6050100@oracle.com> <54C9173C.6000904@oracle.com> Message-ID: <54C920B1.7030102@oracle.com> Vladimir, It doesn't affect any test. There is a compiler/tiered/ConstantGettersTransitionsTest.java that checks that such methods will go to the 1st level only, but they go there anyway regardless they have or not MDO. On 28.01.2015 20:07, Vladimir Kozlov wrote: > Looks good. Does this change affect any tests which have to be > modified too? I remember some WB tests use method which return only > constant to test compilation. > > Thanks, > Vladimir > > On 1/28/15 4:18 AM, Pavel Punegov wrote: >> Hi, >> >> please review the following small change. >> >> With the fix for JDK-8056071 [*] constant getters are now compiled >> on level 1, >> so there are no need to create MDO for constant getter methods in >> TieredCompilation >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 >> webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ >> >> Testing: done locally and with JPRT. >> >> -- >> [*] https://bugs.openjdk.java.net/browse/JDK-8056071 >> -- Thanks, Pavel Punegov From vladimir.kozlov at oracle.com Wed Jan 28 17:55:45 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 28 Jan 2015 09:55:45 -0800 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters In-Reply-To: <54C920B1.7030102@oracle.com> References: <54C8D38A.6050100@oracle.com> <54C9173C.6000904@oracle.com> <54C920B1.7030102@oracle.com> Message-ID: <54C922A1.6070108@oracle.com> Okay. 
Thanks, Vladimir On 1/28/15 9:47 AM, Pavel Punegov wrote: > Vladimir, > > It doesn't affect any test. There is a compiler/tiered/ConstantGettersTransitionsTest.java that checks that such methods > will go to the 1st level only, but they go there anyway regardless they have or not MDO. > > On 28.01.2015 20:07, Vladimir Kozlov wrote: >> Looks good. Does this change affect any tests which have to be modified too? I remember some WB tests use method which >> return only constant to test compilation. >> >> Thanks, >> Vladimir >> >> On 1/28/15 4:18 AM, Pavel Punegov wrote: >>> Hi, >>> >>> please review the following small change. >>> >>> With the fix for JDK-8056071 [*] constant getters are now compiled on level 1, >>> so there are no need to create MDO for constant getter methods in TieredCompilation >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 >>> webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ >>> >>> Testing: done locally and with JPRT. >>> >>> -- >>> [*] https://bugs.openjdk.java.net/browse/JDK-8056071 >>> > From zoltan.majo at oracle.com Wed Jan 28 17:59:03 2015 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Wed, 28 Jan 2015 18:59:03 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C91D69.1020203@oracle.com> References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> Message-ID: <54C92367.3010907@oracle.com> Hi Vladimir, thank you for the feedback. On 01/28/2015 06:33 PM, Vladimir Kozlov wrote: > So this is just clean up since Universe::heap()->is_in(obj->klass() is > false after PermGen removal. That is right. > Now you get only hexadecimal value and no comment when binutils does > not indicate VM that the instruction has embedded oop. Right or I am > wrong? No, we still get a comment in that case. If binutils does not indicate to the VM that an instruction contains an embedded OOP, decode_env::print_address() is not called. 
Therefore, _nm->embeddedOop_at() is not called either. As a result, we don't process the relocation information for _nm and a hexadecimal value is printed.

But decode_env::end_insn() is always called right after the current instruction has been processed. Therefore, _nm->print_code_comment_on() is called as well and all relocation entries (including those of type 'oop_type') are printed.

I checked by changing binutils to do some extra VM notifications.

Please let me know if you think I missed or overlooked anything.

Thank you very much!

Best regards,

Zoltan

> If I am right, can we do something about that? I agree that we should
> not replace hexadecimal address but can we generate comment too
> because even if binutils does not report VM can see it:
> _nm->embeddedOop_at(cur_insn())
>
> Thanks,
> Vladimir
>
> On 1/28/15 7:51 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the following small patch.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654
>>
>>
>> Problem: The disassembler has two different ways of handling OOPs
>> embedded into instructions.
>>
>> 1) Embedded OOPs are printed in hexadecimal within a disassembled
>> instruction. The comment block following the
>> instruction contains the name of the OOP's class. For example:
>>
>> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a
>> 'java/lang/Class' = 'java/lang/invoke/MethodHandle')}
>>
>> 2) Embedded OOPs are replaced in the disassembled instruction by the
>> OOP's class name (this functionality was available
>> only before PermGen removal, see JDK-8066438 for more details). The
>> comment block following the disassembled instruction
>> contains the same information *again* (the name of the OOP's class).
>> For example:
>>
>> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a
>> 'sun/dyn/ToGeneric$A2')}
>>
>> (output taken from
>> https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly)
>>
>> Whether an OOP is replaced with its class name depends on the
>> external binutils library. For some types of instructions
>> (e.g., compares, jumps and calls on x86), binutils indicates to the
>> VM that there are addresses embedded into the
>> instruction (and then the VM applies Way 1). For some instructions
>> (e.g., movs on x86), binutils does not indicate to
>> the VM that the instruction contains an embedded address and the VM
>> applies Way 2.
>>
>>
>> Solution: This patch proposes that the disassembler handles OOPs in a
>> single way (Way 1): An embedded OOP's is printed
>> in hexadecimal within the instruction; the comment following the
>> disassembled instruction prints the OOP's class. This
>> way both the OOP as raw address and the OOP's class is available in
>> the disassembly by default.
>>
>> The patch proposes to not print an OOP's class in
>> decode_env::print_address() anymore; printing should be done only in
>> nmethod::print_code_comment_on(). The nmethod::embeddedOop_at method
>> is not needed any more and is therefore removed.
>>
>> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/
>>
>> Testing: manual testing, built + minimal tests with JPRT on all
>> supported platforms.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan
>>

From vladimir.kozlov at oracle.com Wed Jan 28 18:09:52 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Jan 2015 10:09:52 -0800
Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly
In-Reply-To: <54C92367.3010907@oracle.com>
References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com>
Message-ID: <54C925F0.8040403@oracle.com>

Good. Thank you for clarifying. I was confused because the bug says it produces different results.
But you are saying that currently it produces the same output and we need only to cleanup code which is not used anymore. Right? Thanks, Vladimir On 1/28/15 9:59 AM, Zoltán Majó wrote: > Hi Vladimir, > > > thank you for the feedback. > > On 01/28/2015 06:33 PM, Vladimir Kozlov wrote: >> So this is just clean up since Universe::heap()->is_in(obj->klass()) is false after PermGen removal. > > That is right. > >> Now you get only hexadecimal value and no comment when binutils does not indicate VM that the instruction has embedded >> oop. Right or I am wrong? > > No, we still get a comment in that case. > > If binutils does not indicate to the VM that an instruction contains an embedded OOP, decode_env::print_address() is not > called. Therefore _nm->embeddedOop_at() is not called either. As a result, we don't process the relocation information > for _nm and a hexadecimal value is printed. > > But decode_env::end_insn() is always called right after the current instruction has been processed. Therefore, > _nm->print_code_comment_on() is called as well and all relocation information (including those with type 'oop_type') are > printed. > > I checked by changing binutils to do some extra VM notifications. > > Please let me know if you think I miss/oversee anything. > > Thank you very much! > > Best regards, > > > Zoltan > >> If I am right, can we do something about that? I agree that we should not replace hexadecimal address but can we >> generate comment too because even if binutils does not report VM can see it: _nm->embeddedOop_at(cur_insn()) >> >> Thanks, >> Vladimir >> >> On 1/28/15 7:51 AM, Zoltán Majó wrote: >>> Hi, >>> >>> >>> please review the following small patch. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 >>> >>> >>> Problem: The disassembler has two different ways of handling OOPs embedded into instructions. >>> >>> 1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. 
The comment block following the >>> instruction contains the name of the OOP's class. For example: >>> >>> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = 'java/lang/invoke/MethodHandle')} >>> >>> 2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class name (this functionality was available >>> only before PermGen removal, see JDK-8066438 for more details). The comment block following the disassembled instruction >>> contains the same information *again* (the name of the OOP's class). For example: >>> >>> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')} >>> >>> (output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) >>> >>> Whether an OOP is replaced with its class name depends on the external binutils library. For some types of instructions >>> (e.g., compares, jumps and calls on x86), binutils indicates to the VM that there are addresses embedded into the >>> instruction (and then the VM applies Way 1). For some instructions (e.g., movs on x86), binutils does not indicate to >>> the VM that the instruction contains an embedded address and the VM applies Way 2. >>> >>> >>> Solution: This patch proposes that the disassembler handles OOPs in a single way (Way 1): An embedded OOP is printed >>> in hexadecimal within the instruction; the comment following the disassembled instruction prints the OOP's class. This >>> way both the OOP as raw address and the OOP's class is available in the disassembly by default. >>> >>> The patch proposes to not print an OOP's class in decode_env::print_address() anymore; printing should be done only in >>> nmethod::print_code_comment_on(). The nmethod::embeddedOop_at method is not needed any more and is therefore removed. >>> >>> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ >>> >>> Testing: manual testing, built + minimal tests with JPRT on all supported platforms. >>> >>> Thank you! 
>>> >>> Best regards, >>> >>> >>> Zoltan >>> > From pavel.punegov at oracle.com Wed Jan 28 18:32:35 2015 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Wed, 28 Jan 2015 21:32:35 +0300 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters In-Reply-To: <54C922A1.6070108@oracle.com> References: <54C8D38A.6050100@oracle.com> <54C9173C.6000904@oracle.com> <54C920B1.7030102@oracle.com> <54C922A1.6070108@oracle.com> Message-ID: <54C92B43.9060904@oracle.com> Vladimir, thank you for review On 28.01.2015 20:55, Vladimir Kozlov wrote: > Okay. > > Thanks, > Vladimir > > On 1/28/15 9:47 AM, Pavel Punegov wrote: >> Vladimir, >> >> It doesn't affect any test. There is a >> compiler/tiered/ConstantGettersTransitionsTest.java that checks that >> such methods >> will go to the 1st level only, but they go there anyway regardless >> they have or not MDO. >> >> On 28.01.2015 20:07, Vladimir Kozlov wrote: >>> Looks good. Does this change affect any tests which have to be >>> modified too? I remember some WB tests use method which >>> return only constant to test compilation. >>> >>> Thanks, >>> Vladimir >>> >>> On 1/28/15 4:18 AM, Pavel Punegov wrote: >>>> Hi, >>>> >>>> please review the following small change. >>>> >>>> With the fix for JDK-8056071 [*] constant getters are now compiled >>>> on level 1, >>>> so there are no need to create MDO for constant getter methods in >>>> TieredCompilation >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 >>>> webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ >>>> >>>> Testing: done locally and with JPRT. 
>>>> >>>> -- >>>> [*] https://bugs.openjdk.java.net/browse/JDK-8056071 >>>> >> -- Thanks, Pavel Punegov From zoltan.majo at oracle.com Wed Jan 28 18:37:44 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Wed, 28 Jan 2015 19:37:44 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C925F0.8040403@oracle.com> References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com> <54C925F0.8040403@oracle.com> Message-ID: <54C92C78.1060809@oracle.com> Hi Vladimir, thank you for the feedback. On 01/28/2015 07:09 PM, Vladimir Kozlov wrote: > Good. Thank you for clarifying. > I was confused because the bug says it produces different results. But you > are saying that currently it produces the same output and we need only > to cleanup code which is not used anymore. Right? that is right. Currently the same output is produced and it would be nice to keep it that way. I think the bug description was not clear enough, sorry for that. Thank you and best regards, Zoltan > > Thanks, > Vladimir > > On 1/28/15 9:59 AM, Zoltán Majó wrote: >> Hi Vladimir, >> >> >> thank you for the feedback. >> >> On 01/28/2015 06:33 PM, Vladimir Kozlov wrote: >>> So this is just clean up since Universe::heap()->is_in(obj->klass()) >>> is false after PermGen removal. >> >> That is right. >> >>> Now you get only hexadecimal value and no comment when binutils does >>> not indicate VM that the instruction has embedded >>> oop. Right or I am wrong? >> >> No, we still get a comment in that case. >> >> If binutils does not indicate to the VM that an instruction contains >> an embedded OOP, decode_env::print_address() is not >> called. Therefore _nm->embeddedOop_at() is not called either. As a >> result, we don't process the relocation information >> for _nm and a hexadecimal value is printed. >> >> But decode_env::end_insn() is always called right after the current >> instruction has been processed. 
Therefore, >> _nm->print_code_comment_on() is called as well and all relocation >> information (including those with type 'oop_type') are >> printed. >> >> I checked by changing binutils to do some extra VM notifications. >> >> Please let me know if you think I miss/oversee anything. >> >> Thank you very much! >> >> Best regards, >> >> >> Zoltan >> >>> If I am right, can we do something about that? I agree that we >>> should not replace hexadecimal address but can we >>> generate comment too because even if binutils does not report VM can >>> see it: _nm->embeddedOop_at(cur_insn()) >>> >>> Thanks, >>> Vladimir >>> >>> On 1/28/15 7:51 AM, Zoltán Majó wrote: >>>> Hi, >>>> >>>> >>>> please review the following small patch. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 >>>> >>>> >>>> Problem: The disassembler has two different ways of handling OOPs >>>> embedded into instructions. >>>> >>>> 1) Embedded OOPs are printed in hexadecimal within a disassembled >>>> instruction. The comment block following the >>>> instruction contains the name of the OOP's class. For example: >>>> >>>> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a >>>> 'java/lang/Class' = 'java/lang/invoke/MethodHandle')} >>>> >>>> 2) Embedded OOPs are replaced in the disassembled instruction by >>>> the OOP's class name (this functionality was available >>>> only before PermGen removal, see JDK-8066438 for more details). The >>>> comment block following the disassembled instruction >>>> contains the same information *again* (the name of the OOP's >>>> class). For example: >>>> >>>> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a >>>> 'sun/dyn/ToGeneric$A2')} >>>> >>>> (output taken from >>>> https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) >>>> >>>> Whether an OOP is replaced with its class name depends on the >>>> external binutils library. 
For some types of instructions >>>> (e.g., compares, jumps and calls on x86), binutils indicates to the >>>> VM that there are addresses embedded into the >>>> instruction (and then the VM applies Way 1). For some instructions >>>> (e.g., movs on x86), binutils does not indicate to >>>> the VM that the instruction contains an embedded address and the VM >>>> applies Way 2. >>>> >>>> >>>> Solution: This patch proposes that the disassembler handles OOPs in >>>> a single way (Way 1): An embedded OOP is printed >>>> in hexadecimal within the instruction; the comment following the >>>> disassembled instruction prints the OOP's class. This >>>> way both the OOP as raw address and the OOP's class is available in >>>> the disassembly by default. >>>> >>>> The patch proposes to not print an OOP's class in >>>> decode_env::print_address() anymore; printing should be done only in >>>> nmethod::print_code_comment_on(). The nmethod::embeddedOop_at method >>>> is not needed any more and is therefore removed. >>>> >>>> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ >>>> >>>> Testing: manual testing, built + minimal tests with JPRT on all >>>> supported platforms. >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> >>>> >>>> Zoltan >>>> >> From igor.veresov at oracle.com Wed Jan 28 20:12:44 2015 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 28 Jan 2015 12:12:44 -0800 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters In-Reply-To: <54C8D38A.6050100@oracle.com> References: <54C8D38A.6050100@oracle.com> Message-ID: <02DDBA3C-CA40-43CA-A01E-D304C5F59164@oracle.com> Looks good. igor > On Jan 28, 2015, at 4:18 AM, Pavel Punegov wrote: > > Hi, > > please review the following small change. 
> > With the fix for JDK-8056071 [*] constant getters are now compiled on level 1, > so there are no need to create MDO for constant getter methods in TieredCompilation > > Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 > webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ > > Testing: done locally and with JPRT. > > -- > [*] https://bugs.openjdk.java.net/browse/JDK-8056071 > > -- > Thanks, > Pavel Punegov > From john.r.rose at oracle.com Wed Jan 28 20:30:37 2015 From: john.r.rose at oracle.com (John Rose) Date: Wed, 28 Jan 2015 12:30:37 -0800 Subject: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared In-Reply-To: <54C8A547.6050607@oracle.com> References: <54B94766.2080102@oracle.com> <7B03B9FB-17B4-4AE0-92B8-F2DC5B231294@oracle.com> <54BEA7D7.6080008@oracle.com> <5BA1E369-ED87-4EBD-8408-B73B726D91BD@oracle.com> <54C66E4E.9050805@oracle.com> <915998BE-25E9-4196-BAC7-FE5527E10F83@oracle.com> <54C7B73F.50404@oracle.com> <8AD9A8CC-E570-4DE6-ABB1-10B00FACB8AB@oracle.com> <54C8A547.6050607@oracle.com> Message-ID: On Jan 28, 2015, at 1:00 AM, Vladimir Ivanov wrote: > I polished the change a little according to your comments (diff against v03): > http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot +1 Glad to see the AndI folds up easily; thanks for the cleanup. From pavel.punegov at oracle.com Wed Jan 28 22:47:28 2015 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Thu, 29 Jan 2015 01:47:28 +0300 Subject: RFR (XXS): 8067157: Closed compiler tests should not be in hotspot/test/TEST.groups Message-ID: <54C96700.1020606@oracle.com> Hi, please review this small change. 
Issue: closed/compiler do not belong to hotspot/test, so they shouldn't be in hotspot/test/TEST.groups Fix: Replace closed/compiler tests with sanity/ExecuteInternalVMTests.java to keep the hotspot_compiler_closed group Bug: https://bugs.openjdk.java.net/browse/JDK-8067157 webrev: http://cr.openjdk.java.net/~ppunegov/8067157/webrev.00/ Testing: executed this group with jprt and locally -- Thanks, Pavel Punegov From pavel.punegov at oracle.com Wed Jan 28 22:49:43 2015 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Thu, 29 Jan 2015 01:49:43 +0300 Subject: RFR (XXS): 8067012 : Don't create MDO for constant getters In-Reply-To: <02DDBA3C-CA40-43CA-A01E-D304C5F59164@oracle.com> References: <54C8D38A.6050100@oracle.com> <02DDBA3C-CA40-43CA-A01E-D304C5F59164@oracle.com> Message-ID: <54C96787.9080301@oracle.com> Thanks, Igor. On 01/28/2015 11:12 PM, Igor Veresov wrote: > Looks good. > > igor > >> On Jan 28, 2015, at 4:18 AM, Pavel Punegov wrote: >> >> Hi, >> >> please review the following small change. >> >> With the fix for JDK-8056071 [*] constant getters are now compiled on level 1, >> so there are no need to create MDO for constant getter methods in TieredCompilation >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8067012 >> webrev: http://cr.openjdk.java.net/~ppunegov/8067012/webrev/ >> >> Testing: done locally and with JPRT. >> >> -- >> [*] https://bugs.openjdk.java.net/browse/JDK-8056071 >> >> -- >> Thanks, >> Pavel Punegov >> -- Thanks, Pavel Punegov From vladimir.kozlov at oracle.com Wed Jan 28 23:58:17 2015 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 28 Jan 2015 15:58:17 -0800 Subject: RFR (XXS): 8067157: Closed compiler tests should not be in hotspot/test/TEST.groups In-Reply-To: <54C96700.1020606@oracle.com> References: <54C96700.1020606@oracle.com> Message-ID: <54C97799.6060505@oracle.com> Good. Thanks, Vladimir On 1/28/15 2:47 PM, Pavel Punegov wrote: > Hi, > > please review this small change. 
> > Issue: closed/compiler do not belong to hotspot/test, so they shouldn't > be in hotspot/test/TEST.groups > > Fix: Replace closed/compiler tests with > sanity/ExecuteInternalVMTests.java to keep the hotspot_compiler_closed > group > > Bug: https://bugs.openjdk.java.net/browse/JDK-8067157 > webrev: http://cr.openjdk.java.net/~ppunegov/8067157/webrev.00/ > > Testing: executed this group with jprt and locally > From tobias.hartmann at oracle.com Thu Jan 29 08:36:04 2015 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 29 Jan 2015 09:36:04 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C90576.4070806@oracle.com> References: <54C90576.4070806@oracle.com> Message-ID: <54C9F0F4.5000507@oracle.com> Hi Zoltan, looks good to me (not a reviewer). Maybe you should change the bug type to "enhancement" to clarify that there is no error in the current implementation but dead code is removed instead of being re-enabled. I assume you'll close JDK-8066438 after pushing this, right? Best, Tobias On 28.01.2015 16:51, Zoltán Majó wrote: > Hi, > > > please review the following small patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 > > > Problem: The disassembler has two different ways of handling OOPs embedded into > instructions. > > 1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. > The comment block following the instruction contains the name of the OOP's > class. For example: > > 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = > 'java/lang/invoke/MethodHandle')} > > 2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class > name (this functionality was available only before PermGen removal, see > JDK-8066438 for more details). The comment block following the disassembled > instruction contains the same information *again* (the name of the OOP's class). 
> For example: > > 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')} > > (output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) > > Whether an OOP is replaced with its class name depends on the external binutils > library. For some types of instructions (e.g., compares, jumps and calls on > x86), binutils indicates to the VM that there are addresses embedded into the > instruction (and then the VM applies Way 1). For some instructions (e.g., movs > on x86), binutils does not indicate to the VM that the instruction contains an > embedded address and the VM applies Way 2. > > > Solution: This patch proposes that the disassembler handles OOPs in a single way > (Way 1): An embedded OOP is printed in hexadecimal within the instruction; the > comment following the disassembled instruction prints the OOP's class. This way > both the OOP as raw address and the OOP's class is available in the disassembly > by default. > > The patch proposes to not print an OOP's class in decode_env::print_address() > anymore; printing should be done only in nmethod::print_code_comment_on(). > The nmethod::embeddedOop_at method is not needed any more and is therefore removed. > > Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ > > Testing: manual testing, built + minimal tests with JPRT on all supported > platforms. > > Thank you! > > Best regards, > > > Zoltan > From zoltan.majo at oracle.com Thu Jan 29 09:01:52 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 29 Jan 2015 10:01:52 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C9F0F4.5000507@oracle.com> References: <54C90576.4070806@oracle.com> <54C9F0F4.5000507@oracle.com> Message-ID: <54C9F700.5020802@oracle.com> Hi Tobias, On 01/29/2015 09:36 AM, Tobias Hartmann wrote: > Hi Zoltan, > > looks good to me (not a reviewer). thank you for the feedback! 
> Maybe you should change the bug type to "enhancement" to clarify that there is > no error in the current implementation but dead code is removed instead of being > re-enabled. That's a good idea, I changed the type. > I assume you'll close JDK-8066438 after pushing this, right? That is correct. We don't want to re-enable the dead code as suggested by 8066438. Thank you and best regards, Zoltan > > Best, > Tobias > > On 28.01.2015 16:51, Zoltán Majó wrote: >> Hi, >> >> >> please review the following small patch. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 >> >> >> Problem: The disassembler has two different ways of handling OOPs embedded into >> instructions. >> >> 1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. >> The comment block following the instruction contains the name of the OOP's >> class. For example: >> >> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = >> 'java/lang/invoke/MethodHandle')} >> >> 2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class >> name (this functionality was available only before PermGen removal, see >> JDK-8066438 for more details). The comment block following the disassembled >> instruction contains the same information *again* (the name of the OOP's class). >> For example: >> >> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')} >> >> (output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) >> >> Whether an OOP is replaced with its class name depends on the external binutils >> library. For some types of instructions (e.g., compares, jumps and calls on >> x86), binutils indicates to the VM that there are addresses embedded into the >> instruction (and then the VM applies Way 1). For some instructions (e.g., movs >> on x86), binutils does not indicate to the VM that the instruction contains an >> embedded address and the VM applies Way 2. 
>> >> >> Solution: This patch proposes that the disassembler handles OOPs in a single way >> (Way 1): An embedded OOP is printed in hexadecimal within the instruction; the >> comment following the disassembled instruction prints the OOP's class. This way >> both the OOP as raw address and the OOP's class is available in the disassembly >> by default. >> >> The patch proposes to not print an OOP's class in decode_env::print_address() >> anymore; printing should be done only in nmethod::print_code_comment_on(). >> The nmethod::embeddedOop_at method is not needed any more and is therefore removed. >> >> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ >> >> Testing: manual testing, built + minimal tests with JPRT on all supported >> platforms. >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> From zoltan.majo at oracle.com Thu Jan 29 09:07:44 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 29 Jan 2015 10:07:44 +0100 Subject: RFR (XXS): 8067157: Closed compiler tests should not be in hotspot/test/TEST.groups In-Reply-To: <54C97799.6060505@oracle.com> References: <54C96700.1020606@oracle.com> <54C97799.6060505@oracle.com> Message-ID: <54C9F860.9040800@oracle.com> Hi Pavel, it looks good to me as well. Thank you and best regards, Zoltan On 01/29/2015 12:58 AM, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 1/28/15 2:47 PM, Pavel Punegov wrote: >> Hi, >> >> please review this small change. 
>> >> Issue: closed/compiler do not belong to hotspot/test, so they shouldn't >> be in hotspot/test/TEST.groups >> >> Fix: Replace closed/compiler tests with >> sanity/ExecuteInternalVMTests.java to keep the hotspot_compiler_closed >> group >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8067157 >> webrev: http://cr.openjdk.java.net/~ppunegov/8067157/webrev.00/ >> >> Testing: executed this group with jprt and locally >> From zoltan.majo at oracle.com Thu Jan 29 09:15:14 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 29 Jan 2015 10:15:14 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C92C78.1060809@oracle.com> References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com> <54C925F0.8040403@oracle.com> <54C92C78.1060809@oracle.com> Message-ID: <54C9FA22.90008@oracle.com> Hi Vladimir, now that Tobias has also reviewed this change, is it OK to push it? Thank you! Best regards, Zoltan On 01/28/2015 07:37 PM, Zoltán Majó wrote: > Hi Vladimir, > > > thank you for the feedback. > > On 01/28/2015 07:09 PM, Vladimir Kozlov wrote: >> Good. Thank you for clarifying. >> I was confused because the bug says it produces different results. But >> you are saying that currently it produces the same output and we need >> only to cleanup code which is not used anymore. Right? > > that is right. Currently the same output is produced and it would be > nice to keep it that way. > > I think the bug description was not clear enough, sorry for that. > > Thank you and best regards, > > > Zoltan > > >> >> Thanks, >> Vladimir >> >> On 1/28/15 9:59 AM, Zoltán Majó wrote: >>> Hi Vladimir, >>> >>> >>> thank you for the feedback. >>> >>> On 01/28/2015 06:33 PM, Vladimir Kozlov wrote: >>>> So this is just clean up since Universe::heap()->is_in(obj->klass()) >>>> is false after PermGen removal. >>> >>> That is right. 
>>> >>>> Now you get only hexadecimal value and no comment when binutils >>>> does not indicate VM that the instruction has embedded >>>> oop. Right or I am wrong? >>> >>> No, we still get a comment in that case. >>> >>> If binutils does not indicate to the VM that an instruction contains >>> an embedded OOP, decode_env::print_address() is not >>> called. Therefore _nm->embeddedOop_at() is not called either. As a >>> result, we don't process the relocation information >>> for _nm and a hexadecimal value is printed. >>> >>> But decode_env::end_insn() is always called right after the current >>> instruction has been processed. Therefore, >>> _nm->print_code_comment_on() is called as well and all relocation >>> information (including those with type 'oop_type') are >>> printed. >>> >>> I checked by changing binutils to do some extra VM notifications. >>> >>> Please let me know if you think I miss/oversee anything. >>> >>> Thank you very much! >>> >>> Best regards, >>> >>> >>> Zoltan >>> >>>> If I am right, can we do something about that? I agree that we >>>> should not replace hexadecimal address but can we >>>> generate comment too because even if binutils does not report VM >>>> can see it: _nm->embeddedOop_at(cur_insn()) >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 1/28/15 7:51 AM, Zoltán Majó wrote: >>>>> Hi, >>>>> >>>>> >>>>> please review the following small patch. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654 >>>>> >>>>> >>>>> Problem: The disassembler has two different ways of handling OOPs >>>>> embedded into instructions. >>>>> >>>>> 1) Embedded OOPs are printed in hexadecimal within a disassembled >>>>> instruction. The comment block following the >>>>> instruction contains the name of the OOP's class. 
For example: >>>>> >>>>> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a >>>>> 'java/lang/Class' = 'java/lang/invoke/MethodHandle')} >>>>> >>>>> 2) Embedded OOPs are replaced in the disassembled instruction by >>>>> the OOP's class name (this functionality was available >>>>> only before PermGen removal, see JDK-8066438 for more details). >>>>> The comment block following the disassembled instruction >>>>> contains the same information *again* (the name of the OOP's >>>>> class). For example: >>>>> >>>>> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a >>>>> 'sun/dyn/ToGeneric$A2')} >>>>> >>>>> (output taken from >>>>> https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly) >>>>> >>>>> Whether an OOP is replaced with its class name depends on the >>>>> external binutils library. For some types of instructions >>>>> (e.g., compares, jumps and calls on x86), binutils indicates to >>>>> the VM that there are addresses embedded into the >>>>> instruction (and then the VM applies Way 1). For some instructions >>>>> (e.g., movs on x86), binutils does not indicate to >>>>> the VM that the instruction contains an embedded address and the >>>>> VM applies Way 2. >>>>> >>>>> >>>>> Solution: This patch proposes that the disassembler handles OOPs >>>>> in a single way (Way 1): An embedded OOP is printed >>>>> in hexadecimal within the instruction; the comment following the >>>>> disassembled instruction prints the OOP's class. This >>>>> way both the OOP as raw address and the OOP's class is available >>>>> in the disassembly by default. >>>>> >>>>> The patch proposes to not print an OOP's class in >>>>> decode_env::print_address() anymore; printing should be done only in >>>>> nmethod::print_code_comment_on(). The nmethod::embeddedOop_at >>>>> method is not needed any more and is therefore removed. 
>>>>> >>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/ >>>>> >>>>> Testing: manual testing, built + minimal tests with JPRT on all >>>>> supported platforms. >>>>> >>>>> Thank you! >>>>> >>>>> Best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>> > From albert.noll at oracle.com Thu Jan 29 09:47:28 2015 From: albert.noll at oracle.com (Albert Noll) Date: Thu, 29 Jan 2015 10:47:28 +0100 Subject: [9] RFR(S): 8068440: Test6857159.java times out Message-ID: <54CA01B0.60509@oracle.com> Hi, Could I get reviews for this small patch? Bug: https://bugs.openjdk.java.net/browse/JDK-8068440 Problem The test times out on Windows. The test uses a script to check that a particular method can be compiled. I suspect that there is a problem (deadlock) when using the script on Windows. However, I cannot prove it. Solution: Remove the script and do the check if the method was compiled correctly in Java. If the time-out goes away, the bug is fixed. If the time-out persists, we need look for a different cause. Testing: Local testing. Make sure that the method is still compiled in the new (Java-only) version. Webrev: http://cr.openjdk.java.net/~anoll/8068440/webrev.00/ Many thanks in advance, Albert From roland.westrelin at oracle.com Thu Jan 29 09:54:42 2015 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Thu, 29 Jan 2015 10:54:42 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <54C9FA22.90008@oracle.com> References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com> <54C925F0.8040403@oracle.com> <54C92C78.1060809@oracle.com> <54C9FA22.90008@oracle.com> Message-ID: <075680D3-5966-404E-9B20-80B746C96D07@oracle.com> > now that Tobias has also reviewed this change, is it OK to push it? That looks good to me as well. Roland. 
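A minimal sketch of the callback flow discussed in the 8071654 thread, for readers following along: print_address() runs only when the external disassembler reports an embedded address, while end_insn() runs after every instruction and is where the OOP comment comes from. This is a toy model, not HotSpot code. ToyInsn, ToyDecodeEnv, the reported_by_binutils flag and decode_demo() are hypothetical names introduced for illustration, and the real decode_env and nmethod signatures are not reproduced here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one disassembled instruction.
struct ToyInsn {
    std::string text;           // instruction text, OOP kept as a raw hex address
    bool reported_by_binutils;  // assumption: did the disassembler flag an address?
    std::string oop_comment;    // what relocation info would say about the OOP
};

// Hypothetical stand-in for the decoding environment described in the thread.
struct ToyDecodeEnv {
    std::ostringstream out;

    // Reached only when the disassembler reports an embedded address.
    // In this sketch it deliberately prints no class name.
    void print_address(const ToyInsn&) {}

    // Reached after every instruction: the comment is emitted here,
    // independent of whether print_address() was called.
    void end_insn(const ToyInsn& insn) {
        out << insn.text;
        if (!insn.oop_comment.empty()) {
            out << " ; {" << insn.oop_comment << "}";
        }
        out << "\n";
    }

    void decode(const std::vector<ToyInsn>& code) {
        for (const ToyInsn& insn : code) {
            if (insn.reported_by_binutils) {
                print_address(insn);
            }
            end_insn(insn);
        }
    }
};

// Decodes two toy instructions, one reported by the disassembler and one not.
std::string decode_demo() {
    ToyDecodeEnv env;
    env.decode({
        {"mov $0x7f52c4e03d78,%rdi", false, "oop(a 'java/lang/Class')"},
        {"cmp $0x7f52c4e03d78,%rdx", true,  "oop(a 'java/lang/Class')"},
    });
    return env.out.str();
}
```

Calling decode_demo() yields one line per instruction, each with the raw address in the instruction text and the same trailing comment, which is the uniform behaviour the patch aims to keep.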
From zoltan.majo at oracle.com Thu Jan 29 10:09:06 2015 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 29 Jan 2015 11:09:06 +0100 Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly In-Reply-To: <075680D3-5966-404E-9B20-80B746C96D07@oracle.com> References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com> <54C925F0.8040403@oracle.com> <54C92C78.1060809@oracle.com> <54C9FA22.90008@oracle.com> <075680D3-5966-404E-9B20-80B746C96D07@oracle.com> Message-ID: <54CA06C2.4090906@oracle.com> Thank you, Roland, for the review! Best wishes, Zoltan On 01/29/2015 10:54 AM, Roland Westrelin wrote: >> now that Tobias has also reviewed this change, is it OK to push it? > That looks good to me as well. > > Roland. From igor.ignatyev at oracle.com Thu Jan 29 10:30:54 2015 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 29 Jan 2015 13:30:54 +0300 Subject: [9] RFR(S): 8068440: Test6857159.java times out In-Reply-To: <54CA01B0.60509@oracle.com> References: <54CA01B0.60509@oracle.com> Message-ID: <54CA0BDE.101@oracle.com> Hi Albert, could you also check that process was finished gracefully? otherwise the fix looks good to me. Thanks Igor On 01/29/2015 12:47 PM, Albert Noll wrote: > Hi, > > Could I get reviews for this small patch? > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8068440 > > Problem > The test times out on Windows. The test uses a script to check that a > particular method can be compiled. I suspect that there is a problem > (deadlock) when using the script on Windows. However, I cannot prove it. > > Solution: > Remove the script and do the check if the method was compiled correctly > in Java. If the time-out goes away, the bug is fixed. If the time-out > persists, we need look for a different cause. > > Testing: > Local testing. Make sure that the method is still compiled in the new > (Java-only) version. 
> > Webrev: > http://cr.openjdk.java.net/~anoll/8068440/webrev.00/ > > Many thanks in advance, > Albert > From albert.noll at oracle.com Thu Jan 29 11:28:47 2015 From: albert.noll at oracle.com (Albert Noll) Date: Thu, 29 Jan 2015 12:28:47 +0100 Subject: [9] RFR(S): 8068440: Test6857159.java times out In-Reply-To: <54CA0BDE.101@oracle.com> References: <54CA01B0.60509@oracle.com> <54CA0BDE.101@oracle.com> Message-ID: <54CA196F.2020005@oracle.com> Hi Igor, Thanks for looking at this. What do you mean by 'finish gracefully'? The process was killed after the timeout was hit. Thanks, Albert On 01/29/2015 11:30 AM, Igor Ignatyev wrote: > Hi Albert, > > could you also check that process was finished gracefully? > > otherwise the fix looks good to me. > > Thanks > Igor > > On 01/29/2015 12:47 PM, Albert Noll wrote: >> Hi, >> >> Could I get reviews for this small patch? >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8068440 >> >> Problem >> The test times out on Windows. The test uses a script to check that a >> particular method can be compiled. I suspect that there is a problem >> (deadlock) when using the script on Windows. However, I cannot prove it. >> >> Solution: >> Remove the script and do the check if the method was compiled correctly >> in Java. If the time-out goes away, the bug is fixed. If the time-out >> persists, we need look for a different cause. >> >> Testing: >> Local testing. Make sure that the method is still compiled in the new >> (Java-only) version. >> >> Webrev: >> http://cr.openjdk.java.net/~anoll/8068440/webrev.00/ >> >> Many thanks in advance, >> Albert >> From albert.noll at oracle.com Thu Jan 29 11:47:24 2015 From: albert.noll at oracle.com (Albert Noll) Date: Thu, 29 Jan 2015 12:47:24 +0100 Subject: [9] RFR(XXS): 8071906: Quarantine OverloadCompileQueueTest until the reason for timeout is known Message-ID: <54CA1DCC.4050309@oracle.com> Hi, could I get reviews for this small patch? The test times out. The cause is not yet determined. 
To keep nightly failures at a reasonable level, we should quarantine this test until the root cause of the timeout is known.

Webrev:
http://cr.openjdk.java.net/~anoll/8071906/webrev.00/

Thanks,
Albert

From pavel.punegov at oracle.com Thu Jan 29 12:09:54 2015
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Thu, 29 Jan 2015 15:09:54 +0300
Subject: RFR (XXS): 8067157: Closed compiler tests should not be in hotspot/test/TEST.groups
In-Reply-To: <54C9F860.9040800@oracle.com>
References: <54C96700.1020606@oracle.com> <54C97799.6060505@oracle.com> <54C9F860.9040800@oracle.com>
Message-ID: <54CA2312.7070704@oracle.com>

Zoltan,

Thank you for the review.

On 29.01.2015 12:07, Zoltán Majó wrote:
> Hi Pavel,
>
>
> it looks good to me as well.
>
> Thank you and best regards,
>
>
> Zoltan
>
> On 01/29/2015 12:58 AM, Vladimir Kozlov wrote:
>> Good.
>>
>> Thanks,
>> Vladimir
>>
>> On 1/28/15 2:47 PM, Pavel Punegov wrote:
>>> Hi,
>>>
>>> please review this small change.
>>>
>>> Issue: closed/compiler do not belong to hotspot/test, so they shouldn't
>>> be in hotspot/test/TEST.groups
>>>
>>> Fix: Replace closed/compiler tests with
>>> sanity/ExecuteInternalVMTests.java to keep the hotspot_compiler_closed
>>> group
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8067157
>>> webrev: http://cr.openjdk.java.net/~ppunegov/8067157/webrev.00/
>>>
>>> Testing: executed this group with jprt and locally
>>>
>

--
Thanks,
Pavel Punegov

From vladimir.kozlov at oracle.com Thu Jan 29 17:14:08 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 09:14:08 -0800
Subject: [9] RFR(S): 8071654: disassembler handles embedded OOPs not uniformly
In-Reply-To: <54C9FA22.90008@oracle.com>
References: <54C90576.4070806@oracle.com> <54C91D69.1020203@oracle.com> <54C92367.3010907@oracle.com> <54C925F0.8040403@oracle.com> <54C92C78.1060809@oracle.com> <54C9FA22.90008@oracle.com>
Message-ID: <54CA6A60.1070708@oracle.com>

Yes, you can push; the changes are small.
Thanks,
Vladimir

On 1/29/15 1:15 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> now that Tobias has also reviewed this change, is it OK to push it?
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
> On 01/28/2015 07:37 PM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback.
>>
>> On 01/28/2015 07:09 PM, Vladimir Kozlov wrote:
>>> Good. Thank you for clarifying.
>>> I was confused because the bug says it produces different results. But you are saying that currently it produces the same
>>> output and we need only to clean up code which is not used anymore. Right?
>>
>> that is right. Currently the same output is produced and it would be nice to keep it that way.
>>
>> I think the bug description was not clear enough, sorry for that.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/28/15 9:59 AM, Zoltán Majó wrote:
>>>> Hi Vladimir,
>>>>
>>>>
>>>> thank you for the feedback.
>>>>
>>>> On 01/28/2015 06:33 PM, Vladimir Kozlov wrote:
>>>>> So this is just clean up since Universe::heap()->is_in(obj->klass()) is false after PermGen removal.
>>>>
>>>> That is right.
>>>>
>>>>> Now you get only hexadecimal value and no comment when binutils does not indicate VM that the instruction has embedded
>>>>> oop. Right or I am wrong?
>>>>
>>>> No, we still get a comment in that case.
>>>>
>>>> If binutils does not indicate to the VM that an instruction contains an embedded OOP, decode_env::print_address() is
>>>> not called. Therefore _nm->embeddedOop_at() is not called either. As a result, we don't process the relocation information
>>>> for _nm and a hexadecimal value is printed.
>>>>
>>>> But decode_env::end_insn() is always called right after the current instruction has been processed. Therefore,
>>>> _nm->print_code_comment_on() is called as well and all relocation information (including those with type 'oop_type')
>>>> are printed.
>>>>
>>>> I checked by changing binutils to do some extra VM notifications.
>>>>
>>>> Please let me know if you think I miss/oversee anything.
>>>>
>>>> Thank you very much!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>> If I am right, can we do something about that? I agree that we should not replace hexadecimal address but can we
>>>>> generate comment too because even if binutils does not report VM can see it: _nm->embeddedOop_at(cur_insn())
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 1/28/15 7:51 AM, Zoltán Majó wrote:
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>> please review the following small patch.
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071654
>>>>>>
>>>>>>
>>>>>> Problem: The disassembler has two different ways of handling OOPs embedded into instructions.
>>>>>>
>>>>>> 1) Embedded OOPs are printed in hexadecimal within a disassembled instruction. The comment block following the
>>>>>> instruction contains the name of the OOP's class. For example:
>>>>>>
>>>>>> 0x00007f52ed58e3e4: mov $0x7f52c4e03d78,%rdi ; {oop(a 'java/lang/Class' = 'java/lang/invoke/MethodHandle')}
>>>>>>
>>>>>> 2) Embedded OOPs are replaced in the disassembled instruction by the OOP's class name (this functionality was
>>>>>> available only before PermGen removal, see JDK-8066438 for more details). The comment block following the disassembled
>>>>>> instruction contains the same information *again* (the name of the OOP's class). For example:
>>>>>>
>>>>>> 0x01882738: cmp edx, a 'sun/dyn/ToGeneric$A2'; {oop(a 'sun/dyn/ToGeneric$A2')}
>>>>>>
>>>>>> (output taken from https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly)
>>>>>>
>>>>>> Whether an OOP is replaced with its class name depends on the external binutils library. For some types of
>>>>>> instructions (e.g., compares, jumps and calls on x86), binutils indicates to the VM that there are addresses embedded into the
>>>>>> instruction (and then the VM applies Way 1).
For some instructions (e.g., movs on x86), binutils does not indicate to
>>>>>> the VM that the instruction contains an embedded address and the VM applies Way 2.
>>>>>>
>>>>>>
>>>>>> Solution: This patch proposes that the disassembler handles OOPs in a single way (Way 1): An embedded OOP is
>>>>>> printed in hexadecimal within the instruction; the comment following the disassembled instruction prints the OOP's class.
>>>>>> This way both the OOP as raw address and the OOP's class are available in the disassembly by default.
>>>>>>
>>>>>> The patch proposes to not print an OOP's class in decode_env::print_address() anymore; printing should be done
>>>>>> only in nmethod::print_code_comment_on(). The nmethod::embeddedOop_at method is not needed any more and is therefore removed.
>>>>>>
>>>>>> Webrev: http://cr.openjdk.java.net/~zmajo/8071654/webrev.00/
>>>>>>
>>>>>> Testing: manual testing, built + minimal tests with JPRT on all supported platforms.
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>
>>
>

From vladimir.kozlov at oracle.com Thu Jan 29 17:21:34 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 09:21:34 -0800
Subject: [9] RFR(S): 8068440: Test6857159.java times out
In-Reply-To: <54CA01B0.60509@oracle.com>
References: <54CA01B0.60509@oracle.com>
Message-ID: <54CA6C1E.303@oracle.com>

I know that the script did the same, but the code does not check that the method is really compiled; it only checks that the output does not contain "COMPILE SKIPPED".
We need to check that the method was really compiled, for example that the PrintCompilation output has a line for the method's compilation.

Thanks,
Vladimir

On 1/29/15 1:47 AM, Albert Noll wrote:
> Hi,
>
> Could I get reviews for this small patch?
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8068440
>
> Problem
> The test times out on Windows. The test uses a script to check that a particular method can be compiled.
I suspect that
> there is a problem (deadlock) when using the script on Windows. However, I cannot prove it.
>
> Solution:
> Remove the script and do the check if the method was compiled correctly in Java. If the time-out goes away, the bug is
> fixed. If the time-out persists, we need to look for a different cause.
>
> Testing:
> Local testing. Make sure that the method is still compiled in the new (Java-only) version.
>
> Webrev:
> http://cr.openjdk.java.net/~anoll/8068440/webrev.00/
>
> Many thanks in advance,
> Albert
>

From igor.ignatyev at oracle.com Thu Jan 29 17:38:22 2015
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 29 Jan 2015 20:38:22 +0300
Subject: [9] RFR(S): 8068440: Test6857159.java times out
In-Reply-To: <54CA196F.2020005@oracle.com>
References: <54CA01B0.60509@oracle.com> <54CA0BDE.101@oracle.com> <54CA196F.2020005@oracle.com>
Message-ID: <54CA700E.3070501@oracle.com>

I meant checking that the exit code of the spawned Java process is zero.

Also, I missed that you use ProcessTools.createJavaProcessBuilder instead of ProcessTools.executeTestJvm to spawn the tested JVM, so the created JVM ignores external VM flags.

Igor

On 01/29/2015 02:28 PM, Albert Noll wrote:
> Hi Igor,
>
> Thanks for looking at this. What do you mean by 'finish gracefully'?
> The process was killed after the timeout was hit.
>
> Thanks,
> Albert
>
> On 01/29/2015 11:30 AM, Igor Ignatyev wrote:
>> Hi Albert,
>>
>> could you also check that process was finished gracefully?
>>
>> otherwise the fix looks good to me.
>>
>> Thanks
>> Igor
>>
>> On 01/29/2015 12:47 PM, Albert Noll wrote:
>>> Hi,
>>>
>>> Could I get reviews for this small patch?
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8068440
>>>
>>> Problem
>>> The test times out on Windows. The test uses a script to check that a
>>> particular method can be compiled. I suspect that there is a problem
>>> (deadlock) when using the script on Windows. However, I cannot prove it.
>>>
>>> Solution:
>>> Remove the script and do the check if the method was compiled correctly
>>> in Java. If the time-out goes away, the bug is fixed. If the time-out
>>> persists, we need to look for a different cause.
>>>
>>> Testing:
>>> Local testing. Make sure that the method is still compiled in the new
>>> (Java-only) version.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~anoll/8068440/webrev.00/
>>>
>>> Many thanks in advance,
>>> Albert
>>>
>

From vladimir.kozlov at oracle.com Thu Jan 29 17:45:57 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 09:45:57 -0800
Subject: [9] RFR(XXS): 8071906: Quarantine OverloadCompileQueueTest until the reason for timeout is known
In-Reply-To: <54CA1DCC.4050309@oracle.com>
References: <54CA1DCC.4050309@oracle.com>
Message-ID: <54CA71D5.3050707@oracle.com>

Okay.

Thanks,
Vladimir

On 1/29/15 3:47 AM, Albert Noll wrote:
> Hi,
>
> could I get reviews for this small patch?
>
> The test times out. The cause is not yet determined. To keep nightly failures at a reasonable level, we should
> quarantine this test until the root cause of the timeout is known.
>
> Webrev:
> http://cr.openjdk.java.net/~anoll/8071906/webrev.00/
>
> Thanks,
> Albert

From zoltan.majo at oracle.com Thu Jan 29 19:00:42 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Thu, 29 Jan 2015 20:00:42 +0100
Subject: [9] RFR(S): 8071818: incorrect addressing mode used for ldf in SPARC assembler
Message-ID: <54CA835A.2050302@oracle.com>

Hi,


please review the following small patch.

Bug: https://bugs.openjdk.java.net/browse/JDK-8071818

Problem: For the 'ldf' instruction, the SPARC assembler uses only the
addressing mode with 'base + displacement + offset'. In some cases,
however, an addressing mode with 'base + index' is needed. The necessary
functionality is not in place, which results in a VM crash.

Solution: Add support for index-based addressing to MacroAssembler::ldf.
'ldf' determines the addressing mode needed by using
Address::has_index(). The resulting code is analogous to the code in
'ld', 'st', and variations of them.

Webrev: http://cr.openjdk.java.net/~zmajo/8071818/webrev.00/

Testing: manual testing of failing test case, JPRT tests on Solaris SPARC.

The patch was originally contributed by Andrew Gross.

Thank you!

Best regards,


Zoltan

From vladimir.kozlov at oracle.com Thu Jan 29 19:04:03 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 11:04:03 -0800
Subject: [9] RFR(S): 8071818: incorrect addressing mode used for ldf in SPARC assembler
In-Reply-To: <54CA835A.2050302@oracle.com>
References: <54CA835A.2050302@oracle.com>
Message-ID: <54CA8423.8010200@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/29/15 11:00 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the following small patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8071818
>
> Problem: For the 'ldf' instruction, the SPARC assembler uses only the
> addressing mode with 'base + displacement + offset'. In some cases,
> however, an addressing mode with 'base + index' is needed. The necessary
> functionality is not in place, which results in a VM crash.
>
> Solution: Add support for index-based addressing to MacroAssembler::ldf.
> 'ldf' determines the addressing mode needed by using
> Address::has_index(). The resulting code is analogous to the code in
> 'ld', 'st', and variations of them.
>
> Webrev: http://cr.openjdk.java.net/~zmajo/8071818/webrev.00/
>
> Testing: manual testing of failing test case, JPRT tests on Solaris SPARC.
>
> The patch was originally contributed by Andrew Gross.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan

From dean.long at oracle.com Thu Jan 29 21:41:48 2015
From: dean.long at oracle.com (Dean Long)
Date: Thu, 29 Jan 2015 13:41:48 -0800
Subject: [9] RFR(S): 8071818: incorrect addressing mode used for ldf in SPARC assembler
In-Reply-To: <54CA835A.2050302@oracle.com>
References: <54CA835A.2050302@oracle.com>
Message-ID: <54CAA91C.30904@oracle.com>

This looks consistent with ld and st, but I'm wondering if in all of
them, the assert would be better as offset == 0 && !a.has_disp().
It does appear that has_index() and has_disp() are mutually exclusive,
however, so feel free to ignore this minor issue.

dl

On 1/29/2015 11:00 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the following small patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8071818
>
> Problem: For the 'ldf' instruction, the SPARC assembler uses only the
> addressing mode with 'base + displacement + offset'. In some cases,
> however, an addressing mode with 'base + index' is needed. The
> necessary functionality is not in place, which results in a VM crash.
>
> Solution: Add support for index-based addressing to
> MacroAssembler::ldf. 'ldf' determines the addressing mode needed by
> using Address::has_index(). The resulting code is analogous to the
> code in 'ld', 'st', and variations of them.
>
> Webrev: http://cr.openjdk.java.net/~zmajo/8071818/webrev.00/
>
> Testing: manual testing of failing test case, JPRT tests on Solaris
> SPARC.
>
> The patch was originally contributed by Andrew Gross.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan

From vladimir.kozlov at oracle.com Fri Jan 30 01:14:24 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 17:14:24 -0800
Subject: [9] RFR (XS) 8071534: assert(!failing()) failed: Must not have pending failure.
Reason is: out of memory
Message-ID: <54CADAF0.30705@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8071534

http://cr.openjdk.java.net/~kvn/8071534/webrev

Add missing C->failing() check after Connection graph construction.
Note, there are C->failing() checks before EA starts Connection graph construction.

EA uses the bytecode analyzer BCEscapeAnalyzer to look into called but not inlined methods. BCEscapeAnalyzer tries to resolve symbols referenced in bytecode, and the VM may throw an exception during resolution, for example if it needs to allocate in a metaspace that has no space left.

CI cleans exceptions and converts them to compilation failure:

  void ciEnv::record_out_of_memory_failure() {
    // If memory is low, we stop compiling methods.
    record_method_not_compilable("out of memory");
  }

It is called when there is a pending exception. For example:

  // Make a ciSymbol from a C string (implementation).
  ciSymbol* ciSymbol::make_impl(const char* s) {
    EXCEPTION_CONTEXT;
    TempNewSymbol sym = SymbolTable::new_symbol(s, THREAD);
    if (HAS_PENDING_EXCEPTION) {
      CLEAR_PENDING_EXCEPTION;
      CURRENT_THREAD_ENV->record_out_of_memory_failure();
      return ciEnv::_unloaded_cisymbol;
    }
    return CURRENT_THREAD_ENV->get_symbol(sym);
  }

That is the only explanation I can think of because after 2 days I was not able to reproduce the problem. And it is unknown which particular CI code produced the exception.

The test stresses the VM with limited metaspace and I see one metaspace OOM exception in the hs_err file (in a different thread):

Event: 253.978 Thread 0x28069a60 Exception (0x05d93ad8) thrown at [hotspot\src\share\vm\memory\metaspace.cpp, line 3597]

Thanks,
Vladimir

From igor.veresov at oracle.com Fri Jan 30 01:45:12 2015
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 29 Jan 2015 17:45:12 -0800
Subject: [9] RFR (XS) 8071534: assert(!failing()) failed: Must not have pending failure.
Reason is: out of memory
In-Reply-To: <54CADAF0.30705@oracle.com>
References: <54CADAF0.30705@oracle.com>
Message-ID: <3999B2D5-C1E4-475C-A204-30B5BD89CDA3@oracle.com>

Looks good.

igor

> On Jan 29, 2015, at 5:14 PM, Vladimir Kozlov wrote:
>
> https://bugs.openjdk.java.net/browse/JDK-8071534
>
> http://cr.openjdk.java.net/~kvn/8071534/webrev
>
> Add missing C->failing() check after Connection graph construction.
> Note, there are C->failing() checks before EA starts Connection graph construction.
>
> EA uses the bytecode analyzer BCEscapeAnalyzer to look into called but not inlined methods. BCEscapeAnalyzer tries to resolve symbols referenced in bytecode, and the VM may throw an exception during resolution, for example if it needs to allocate in a metaspace that has no space left.
>
> CI cleans exceptions and converts them to compilation failure:
>
>   void ciEnv::record_out_of_memory_failure() {
>     // If memory is low, we stop compiling methods.
>     record_method_not_compilable("out of memory");
>   }
>
> It is called when there is a pending exception. For example:
>
>   // Make a ciSymbol from a C string (implementation).
>   ciSymbol* ciSymbol::make_impl(const char* s) {
>     EXCEPTION_CONTEXT;
>     TempNewSymbol sym = SymbolTable::new_symbol(s, THREAD);
>     if (HAS_PENDING_EXCEPTION) {
>       CLEAR_PENDING_EXCEPTION;
>       CURRENT_THREAD_ENV->record_out_of_memory_failure();
>       return ciEnv::_unloaded_cisymbol;
>     }
>     return CURRENT_THREAD_ENV->get_symbol(sym);
>   }
>
> That is the only explanation I can think of because after 2 days I was not able to reproduce the problem. And it is unknown which particular CI code produced the exception.
>
> The test stresses the VM with limited metaspace and I see one metaspace OOM exception in the hs_err file (in a different thread):
>
> Event: 253.978 Thread 0x28069a60 Exception (0x05d93ad8) thrown at [hotspot\src\share\vm\memory\metaspace.cpp, line 3597]
>
> Thanks,
> Vladimir

From vladimir.kozlov at oracle.com Fri Jan 30 01:52:48 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Jan 2015 17:52:48 -0800
Subject: [9] RFR (XS) 8071534: assert(!failing()) failed: Must not have pending failure. Reason is: out of memory
In-Reply-To: <3999B2D5-C1E4-475C-A204-30B5BD89CDA3@oracle.com>
References: <54CADAF0.30705@oracle.com> <3999B2D5-C1E4-475C-A204-30B5BD89CDA3@oracle.com>
Message-ID: <54CAE3F0.7000400@oracle.com>

Thank you, Igor

Vladimir

On 1/29/15 5:45 PM, Igor Veresov wrote:
> Looks good.
>
> igor
>
>> On Jan 29, 2015, at 5:14 PM, Vladimir Kozlov wrote:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8071534
>>
>> http://cr.openjdk.java.net/~kvn/8071534/webrev
>>
>> Add missing C->failing() check after Connection graph construction.
>> Note, there are C->failing() checks before EA starts Connection graph construction.
>>
>> EA uses the bytecode analyzer BCEscapeAnalyzer to look into called but not inlined methods. BCEscapeAnalyzer tries to resolve symbols referenced in bytecode, and the VM may throw an exception during resolution, for example if it needs to allocate in a metaspace that has no space left.
>>
>> CI cleans exceptions and converts them to compilation failure:
>>
>>   void ciEnv::record_out_of_memory_failure() {
>>     // If memory is low, we stop compiling methods.
>>     record_method_not_compilable("out of memory");
>>   }
>>
>> It is called when there is a pending exception. For example:
>>
>>   // Make a ciSymbol from a C string (implementation).
>>   ciSymbol* ciSymbol::make_impl(const char* s) {
>>     EXCEPTION_CONTEXT;
>>     TempNewSymbol sym = SymbolTable::new_symbol(s, THREAD);
>>     if (HAS_PENDING_EXCEPTION) {
>>       CLEAR_PENDING_EXCEPTION;
>>       CURRENT_THREAD_ENV->record_out_of_memory_failure();
>>       return ciEnv::_unloaded_cisymbol;
>>     }
>>     return CURRENT_THREAD_ENV->get_symbol(sym);
>>   }
>>
>> That is the only explanation I can think of because after 2 days I was not able to reproduce the problem. And it is unknown which particular CI code produced the exception.
>>
>> The test stresses the VM with limited metaspace and I see one metaspace OOM exception in the hs_err file (in a different thread):
>>
>> Event: 253.978 Thread 0x28069a60 Exception (0x05d93ad8) thrown at [hotspot\src\share\vm\memory\metaspace.cpp, line 3597]
>>
>> Thanks,
>> Vladimir
>

From zoltan.majo at oracle.com Fri Jan 30 09:28:25 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Fri, 30 Jan 2015 10:28:25 +0100
Subject: [9] RFR(S): 8071818: incorrect addressing mode used for ldf in SPARC assembler
In-Reply-To: <54CAA91C.30904@oracle.com>
References: <54CA835A.2050302@oracle.com> <54CAA91C.30904@oracle.com>
Message-ID: <54CB4EB9.9010903@oracle.com>

Hi Dean,


thank you for the feedback!

On 01/29/2015 10:41 PM, Dean Long wrote:
> This looks consistent with ld and st, but I'm wondering if in all of
> them, the assert would be better as offset == 0 && !a.has_disp().
> It does appear that has_index() and has_disp() are mutually exclusive,
> however, so feel free to ignore this minor issue.

Yes, they are mutually exclusive, but I think it is a good idea to have stronger asserts. We need to update all instructions in the 'ld' and 'st' family, so I've filed a separate RFE for that (JDK-8071986).

Thank you and best regards,


Zoltan

>
> dl
>
> On 1/29/2015 11:00 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the following small patch.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071818
>>
>> Problem: For the 'ldf' instruction, the SPARC assembler uses only the
>> addressing mode with 'base + displacement + offset'. In some cases,
>> however, an addressing mode with 'base + index' is needed. The
>> necessary functionality is not in place, which results in a VM crash.
>>
>> Solution: Add support for index-based addressing to
>> MacroAssembler::ldf. 'ldf' determines the addressing mode needed by
>> using Address::has_index(). The resulting code is analogous to the
>> code in 'ld', 'st', and variations of them.
>>
>> Webrev: http://cr.openjdk.java.net/~zmajo/8071818/webrev.00/
>>
>> Testing: manual testing of failing test case, JPRT tests on Solaris
>> SPARC.
>>
>> The patch was originally contributed by Andrew Gross.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan
>

From zoltan.majo at oracle.com Fri Jan 30 09:54:55 2015
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Fri, 30 Jan 2015 10:54:55 +0100
Subject: [9] RFR(S): 8071818: incorrect addressing mode used for ldf in SPARC assembler
In-Reply-To: <54CA8423.8010200@oracle.com>
References: <54CA835A.2050302@oracle.com> <54CA8423.8010200@oracle.com>
Message-ID: <54CB54EF.5090308@oracle.com>

Thank you, Vladimir and Dean, for the review!

Best regards,


Zoltan

On 01/29/2015 08:04 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 1/29/15 11:00 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the following small patch.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8071818
>>
>> Problem: For the 'ldf' instruction, the SPARC assembler uses only the
>> addressing mode with 'base + displacement + offset'. In some cases,
>> however, an addressing mode with 'base + index' is needed. The necessary
>> functionality is not in place, which results in a VM crash.
>>
>> Solution: Add support for index-based addressing to MacroAssembler::ldf.
>> 'ldf' determines the addressing mode needed by using
>> Address::has_index(). The resulting code is analogous to the code in
>> 'ld', 'st', and variations of them.
>>
>> Webrev: http://cr.openjdk.java.net/~zmajo/8071818/webrev.00/
>>
>> Testing: manual testing of failing test case, JPRT tests on Solaris
>> SPARC.
>>
>> The patch was originally contributed by Andrew Gross.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan

From tobias.hartmann at oracle.com Fri Jan 30 14:04:36 2015
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 30 Jan 2015 15:04:36 +0100
Subject: [9] RFR(XS): 8071995: compiler/codecache/jmx/InitialAndMaxUsageTest.java fails with large pages
Message-ID: <54CB8F74.6030805@oracle.com>

Hi,

please review the following patch.

https://bugs.openjdk.java.net/browse/JDK-8071995
http://cr.openjdk.java.net/~thartmann/8071995/webrev.00/

With JDK-8064940 code heaps are large page aligned if -XX:+UseLargePages is
enabled. The test 'compiler/codecache/jmx/InitialAndMaxUsageTest' fails because,
due to the alignment, the actual code heap sizes differ from the expected value.

The test should be executed with -XX:-UseLargePages.

Thanks,
Tobias

From vladimir.kozlov at oracle.com Fri Jan 30 17:22:57 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 30 Jan 2015 09:22:57 -0800
Subject: [9] RFR(XS): 8071995: compiler/codecache/jmx/InitialAndMaxUsageTest.java fails with large pages
In-Reply-To: <54CB8F74.6030805@oracle.com>
References: <54CB8F74.6030805@oracle.com>
Message-ID: <54CBBDF1.5090705@oracle.com>

Okay.

Thanks,
Vladimir

On 1/30/15 6:04 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch.
>
> https://bugs.openjdk.java.net/browse/JDK-8071995
> http://cr.openjdk.java.net/~thartmann/8071995/webrev.00/
>
> With JDK-8064940 code heaps are large page aligned if -XX:+UseLargePages is
> enabled.
The test 'compiler/codecache/jmx/InitialAndMaxUsageTest' fails because,
> due to the alignment, the actual code heap sizes differ from the expected value.
>
> The test should be executed with -XX:-UseLargePages.
>
> Thanks,
> Tobias
>

From pavel.chistyakov at oracle.com Fri Jan 30 17:37:06 2015
From: pavel.chistyakov at oracle.com (Pavel Chistyakov)
Date: Fri, 30 Jan 2015 20:37:06 +0300
Subject: RFR(XXS): 8068003: compiler/whitebox/DeoptimizeFramesTest.java fails: compilation 48 can't be available
Message-ID: <54CBC142.6020703@oracle.com>

Hi all,

please take a look at a very small change for JDK-8068003.

Problem: compiler/whitebox/DeoptimizeFramesTest fails frequently with 'compilation 48 can't be available'

Solution:
Some investigation shows that -XX:+DeoptimizeALot, used in nightlies, produces such results. The WB::deoptimizeFrames function tries to deoptimize frames (and our test method), but the VM deoptimizeAll operation forced by DeoptimizeALot has already done this, and we get an unexpected method state.
Disabling DeoptimizeALot in the test helps to prevent failures.

Testing: manual locally and on remote failing machine

webrev: http://cr.openjdk.java.net/~pchistyakov/8068003/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8068003

------------------
Thanks,
Pavel

From vladimir.kozlov at oracle.com Fri Jan 30 17:50:04 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 30 Jan 2015 09:50:04 -0800
Subject: RFR(XXS): 8068003: compiler/whitebox/DeoptimizeFramesTest.java fails: compilation 48 can't be available
In-Reply-To: <54CBC142.6020703@oracle.com>
References: <54CBC142.6020703@oracle.com>
Message-ID: <54CBC44C.5060705@oracle.com>

Looks good.
Thanks,
Vladimir

On 1/30/15 9:37 AM, Pavel Chistyakov wrote:
> Hi all,
>
> please take a look at a very small change for JDK-8068003.
>
> Problem: compiler/whitebox/DeoptimizeFramesTest fails frequently with 'compilation 48 can't be available'
>
> Solution:
> Some investigation shows that -XX:+DeoptimizeALot, used in nightlies, produces such results. The WB::deoptimizeFrames function
> tries to deoptimize frames (and our test method), but the VM deoptimizeAll operation forced by DeoptimizeALot has already done
> this, and we get an unexpected method state.
> Disabling DeoptimizeALot in the test helps to prevent failures.
>
> Testing: manual locally and on remote failing machine
>
> webrev: http://cr.openjdk.java.net/~pchistyakov/8068003/webrev.00/
> JBS: https://bugs.openjdk.java.net/browse/JDK-8068003
>
> ------------------
> Thanks,
> Pavel

From pavel.chistyakov at oracle.com Fri Jan 30 17:42:38 2015
From: pavel.chistyakov at oracle.com (Pavel Chistyakov)
Date: Fri, 30 Jan 2015 20:42:38 +0300
Subject: RFR(XXS): 8068003: compiler/whitebox/DeoptimizeFramesTest.java fails: compilation 48 can't be available
In-Reply-To: <54CBC44C.5060705@oracle.com>
References: <54CBC142.6020703@oracle.com> <54CBC44C.5060705@oracle.com>
Message-ID: <54CBC28E.8080409@oracle.com>

Vladimir,

thank you for the review!

-------------
Regards,
Pavel

On 30.01.2015 20:50, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 1/30/15 9:37 AM, Pavel Chistyakov wrote:
>> Hi all,
>>
>> please take a look at a very small change for JDK-8068003.
>>
>>
>> Problem: compiler/whitebox/DeoptimizeFramesTest fails frequently with
>> 'compilation 48 can't be available'
>>
>> Solution:
>> Some investigation shows that -XX:+DeoptimizeALot, used in nightlies,
>> produces such results. The WB::deoptimizeFrames function
>> tries to deoptimize frames (and our test method), but the VM deoptimizeAll
>> operation forced by DeoptimizeALot has already done
>> this, and we get an unexpected method state.
>> Disabling DeoptimizeALot in the test helps to prevent failures.
>>
>> Testing: manual locally and on remote failing machine
>>
>> webrev: http://cr.openjdk.java.net/~pchistyakov/8068003/webrev.00/
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8068003
>>
>> ------------------
>> Thanks,
>> Pavel