From roland.westrelin at sun.com Wed Nov 4 09:03:18 2009
From: roland.westrelin at sun.com (roland.westrelin at sun.com)
Date: Wed, 04 Nov 2009 17:03:18 +0000
Subject: hg: jdk7/hotspot-comp/hotspot: 6769124: various 64-bit fixes for c1
Message-ID: <20091104170338.0E50C4144C@hg.openjdk.java.net>

Changeset: 323bd24c6520
Author:    roland
Date:      2009-11-02 11:17 +0100
URL:       http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/323bd24c6520

6769124: various 64-bit fixes for c1
Reviewed-by: never

! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp
! src/cpu/sparc/vm/c1_LIRGenerator_sparc.cpp
! src/cpu/x86/vm/assembler_x86.cpp
! src/cpu/x86/vm/assembler_x86.hpp
! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp
! src/cpu/x86/vm/c1_LIRGenerator_x86.cpp
! src/cpu/x86/vm/vm_version_x86.cpp
! src/share/vm/c1/c1_GraphBuilder.cpp
! src/share/vm/c1/c1_LIRGenerator.cpp
! src/share/vm/c1/c1_LinearScan.cpp
! src/share/vm/runtime/arguments.cpp
+ test/compiler/6769124/TestArrayCopy6769124.java
+ test/compiler/6769124/TestDeoptInt6769124.java
+ test/compiler/6769124/TestUnalignedLoad6769124.java

From Vladimir.Kozlov at Sun.COM Wed Nov 4 10:43:17 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 04 Nov 2009 10:43:17 -0800
Subject: Request reviews (S): 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
Message-ID: <4AF1CB45.1000601@sun.com>

http://cr.openjdk.java.net/~kvn/6896370/webrev.00

Fixed 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"

Problem:
It is compressed oops related.
LoadN node is not marked as shared since the method
Matcher::find_shared() misses the case of address phi
which has AddP nodes as input (after split through phi).
As a result the special code for DecodeN in address
expressions is not executed.

Solution:
Move DecodeN code outside the memory nodes only code.
I also noticed that several new memory nodes are missing
from the switch's cases in find_shared().
Instead of adding them I replaced cases with common code for
stores and loads at the end of the switch.

Reviewed by:

Fix verified (y/n): y, test

Other testing:
JPRT, CTW

From Vladimir.Kozlov at Sun.COM Wed Nov 4 10:51:46 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 04 Nov 2009 10:51:46 -0800
Subject: Request reviews (S): 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
Message-ID: <4AF1CD42.8040107@sun.com>

http://cr.openjdk.java.net/~kvn/6896352/webrev.00

Fixed 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155

Problem:
Alias type for LoadUS(ConP char[]) node from the test
was not defined during Parse before EA since
C->get_alias_index(phase->type(address)) was not called.
But EA expects all alias types to be defined before it starts.

Solution:
Always call C->get_alias_index(phase->type(address)) in
MemNode::Ideal_common() which is called by all memory nodes.
I added "volatile" to expressions to avoid C++ removal
since the result is not used.

Reviewed by:

Fix verified (y/n): y, test

Other testing:
JPRT, CTW

From Thomas.Rodriguez at Sun.COM Wed Nov 4 11:45:22 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Wed, 04 Nov 2009 11:45:22 -0800
Subject: Request reviews (S): 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
In-Reply-To: <4AF1CD42.8040107@sun.com>
References: <4AF1CD42.8040107@sun.com>
Message-ID: <6EFE8668-2768-4DEA-BEA4-1EBFF97B23C1@sun.com>

I don't think adding volatile could have the effect you are hoping for
and I also don't think it's needed even if it did. get_alias_index has
a side effect over in find_alias_type so it's impossible for it to be
optimized away even if the return value isn't used. Also if the new
call in Ideal_common is doing its job then do you really need the call
in escape.cpp? I don't see how splitting an AddP could create a new
alias type if one already existed for that AddP.
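
As an aside to the point about side effects, here is a minimal C++ sketch (not HotSpot code) of why a call that is kept only for its side effect does not need `volatile`. `AliasTable`, `ensure_alias_defined`, and the string-keyed map are hypothetical stand-ins for the compiler's alias bookkeeping; only the name `get_alias_index` is borrowed from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the compiler's alias table; not HotSpot code.
class AliasTable {
 public:
  // Returns the index for `type`, registering it on first use.
  // The registration is a side effect that later code can observe.
  int get_alias_index(const std::string& type) {
    std::map<std::string, int>::iterator it = indices_.find(type);
    if (it != indices_.end()) return it->second;
    int idx = static_cast<int>(indices_.size()) + 1;
    indices_[type] = idx;
    return idx;
  }

  bool is_defined(const std::string& type) const {
    return indices_.count(type) != 0;
  }

  std::size_t size() const { return indices_.size(); }

 private:
  std::map<std::string, int> indices_;
};

// Same shape as the call discussed above: invoked only for the side effect.
// The optimizer may drop the unused return value, but it may not drop the
// insertion, because other code can observe it afterwards.
void ensure_alias_defined(AliasTable& table, const std::string& type) {
  (void)table.get_alias_index(type);  // result intentionally unused
}
```

Under these assumptions, calling `ensure_alias_defined(table, "char[]")` leaves the entry visible to any later lookup, which is the property the review comment relies on; the `(void)` cast only documents that the result is unused.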

tom

On Nov 4, 2009, at 10:51 AM, Vladimir Kozlov wrote:

>
> http://cr.openjdk.java.net/~kvn/6896352/webrev.00
>
> Fixed 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
>
> Problem:
> Alias type for LoadUS(ConP char[]) node from the test
> was not defined during Parse before EA since
> C->get_alias_index(phase->type(address)) was not called.
> But EA expects all alias types to be defined before it starts.
>
> Solution:
> Always call C->get_alias_index(phase->type(address)) in
> MemNode::Ideal_common() which is called by all memory nodes.
> I added "volatile" to expressions to avoid C++ removal
> since the result is not used.
>
> Reviewed by:
>
> Fix verified (y/n): y, test
>
> Other testing:
> JPRT, CTW
>

From Thomas.Rodriguez at Sun.COM Wed Nov 4 12:04:14 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Wed, 04 Nov 2009 12:04:14 -0800
Subject: Request reviews (S): 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
In-Reply-To: <4AF1CB45.1000601@sun.com>
References: <4AF1CB45.1000601@sun.com>
Message-ID: <071E9681-F617-42FD-9A70-59E324382766@sun.com>

On Nov 4, 2009, at 10:43 AM, Vladimir Kozlov wrote:

>
> http://cr.openjdk.java.net/~kvn/6896370/webrev.00
>
> Fixed 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating
> node that's already been matched"
>
> Problem:
> It is compressed oops related.
> LoadN node is not marked as shared since the method
> Matcher::find_shared() misses the case of address phi
> which has AddP nodes as input (after split through phi).
> As a result the special code for DecodeN in address
> expressions is not executed.
>
> Solution:
> Move DecodeN code outside the memory nodes only code.
> I also noticed that several new memory nodes are missing
> from the switch's cases in find_shared(). Instead of
> adding them I replaced cases with common code for
> stores and loads at the end of the switch.

I like this. Which ones were missing? There's also an oddity that
!is_Store() && is_Mem() != is_Load() so you're now treating LoadStore
nodes as loads and mem_ops and they weren't previously. Was that
intentional? Calling set_shared on LoadStoreNodes is probably benign
but triggering the clone_shift_expressions logic for them probably
isn't. Most cas style instructions don't support full address modes so
any cloning would be useless.

tom

>
> Reviewed by:
>
> Fix verified (y/n): y, test
>
> Other testing:
> JPRT, CTW
>

From Vladimir.Kozlov at Sun.COM Wed Nov 4 12:10:09 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 04 Nov 2009 12:10:09 -0800
Subject: Request reviews (S): 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
In-Reply-To: <6EFE8668-2768-4DEA-BEA4-1EBFF97B23C1@sun.com>
References: <4AF1CD42.8040107@sun.com> <6EFE8668-2768-4DEA-BEA4-1EBFF97B23C1@sun.com>
Message-ID: <4AF1DFA1.10804@sun.com>

Yes, it works currently without volatile because of side effect
but I did not feel comfortable with the code as it looks.
Maybe I should just add the comment that C++ will not remove
the call since it has side effect.

The code in escape.cpp creates new alias type with instance id
which is different from original general type created now in
Ideal_common (and in different places before).
When we process allocation candidates for scalar replacement
we cast their type (type of CheckCastPP node) to instance id:

  const TypeOopPtr *t = igvn->type(n)->isa_oopptr();
  if (t == NULL)
    continue; // not a TypeInstPtr
  tinst = t->cast_to_exactness(true)->is_oopptr()->cast_to_instance_id(ni);

Then we pass the allocation (CheckCastPP node) node as base to
split_AddP() and replace AddP node's type with new one:

  const TypeOopPtr *base_t = igvn->type(base)->isa_oopptr();
  ...
  const TypeOopPtr *tinst = base_t->add_offset(t->offset())->is_oopptr();
  // Do NOT remove the next line: ensure a new alias index is allocated
  // for the instance type
  int alias_idx = _compile->get_alias_index(tinst);
  igvn->set_type(addp, tinst);

This is the core for scalar replacement and how we separate memory
slices for non-escaping objects which we intend to eliminate.

Vladimir

Tom Rodriguez wrote:
> I don't think adding volatile could have the effect you are hoping for
> and I also don't think it's needed even if it did. get_alias_index has
> a side effect over in find_alias_type so it's impossible for it to be
> optimized away even if the return value isn't used. Also if the new
> call in Ideal_common is doing its job then do you really need the call
> in escape.cpp? I don't see how splitting an AddP could create a new
> alias type if one already existed for that AddP.
>
> tom
>
> On Nov 4, 2009, at 10:51 AM, Vladimir Kozlov wrote:
>
>>
>> http://cr.openjdk.java.net/~kvn/6896352/webrev.00
>>
>> Fixed 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
>>
>> Problem:
>> Alias type for LoadUS(ConP char[]) node from the test
>> was not defined during Parse before EA since
>> C->get_alias_index(phase->type(address)) was not called.
>> But EA expects all alias types to be defined before it starts.
>>
>> Solution:
>> Always call C->get_alias_index(phase->type(address)) in
>> MemNode::Ideal_common() which is called by all memory nodes.
>> I added "volatile" to expressions to avoid C++ removal
>> since the result is not used.
>>
>> Reviewed by:
>>
>> Fix verified (y/n): y, test
>>
>> Other testing:
>> JPRT, CTW
>>
>

From Thomas.Rodriguez at Sun.COM Wed Nov 4 12:26:21 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Wed, 04 Nov 2009 12:26:21 -0800
Subject: Request reviews (S): 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
In-Reply-To: <4AF1DFA1.10804@sun.com>
References: <4AF1CD42.8040107@sun.com> <6EFE8668-2768-4DEA-BEA4-1EBFF97B23C1@sun.com> <4AF1DFA1.10804@sun.com>
Message-ID: <83B2F293-17EA-4D57-86B4-00AF4E6E8551@Sun.COM>

On Nov 4, 2009, at 12:10 PM, Vladimir Kozlov wrote:

> Yes, it works currently without volatile because of side effect
> but I did not feel comfortable with the code as it looks.
> Maybe I should just add the comment that C++ will not remove
> the call since it has side effect.

It seems ok without the volatile to me but add a comment if you like.

>
> The code in escape.cpp creates new alias type with instance id
> which is different from original general type created now in
> Ideal_common (and in different places before).

I'd forgotten about the instance_id. That makes sense then.

tom

> When we process allocation candidates for scalar replacement
> we cast their type (type of CheckCastPP node) to instance id:
>
>   const TypeOopPtr *t = igvn->type(n)->isa_oopptr();
>   if (t == NULL)
>     continue; // not a TypeInstPtr
>   tinst = t->cast_to_exactness(true)->is_oopptr()->cast_to_instance_id(ni);
>
> Then we pass the allocation (CheckCastPP node) node as base to
> split_AddP() and replace AddP node's type with new one:
>
>   const TypeOopPtr *base_t = igvn->type(base)->isa_oopptr();
>   ...
>   const TypeOopPtr *tinst = base_t->add_offset(t->offset())->is_oopptr();
>   // Do NOT remove the next line: ensure a new alias index is allocated
>   // for the instance type
>   int alias_idx = _compile->get_alias_index(tinst);
>   igvn->set_type(addp, tinst);
>
> This is the core for scalar replacement and how we separate memory
> slices for non-escaping objects which we intend to eliminate.
>
> Vladimir
>
> Tom Rodriguez wrote:
>> I don't think adding volatile could have the effect you are hoping
>> for and I also don't think it's needed even if it did.
>> get_alias_index has a side effect over in find_alias_type so it's
>> impossible for it to be optimized away even if the return value
>> isn't used. Also if the new call in Ideal_common is doing its job
>> then do you really need the call in escape.cpp? I don't see how
>> splitting an AddP could create a new alias type if one already
>> existed for that AddP.
>> tom
>> On Nov 4, 2009, at 10:51 AM, Vladimir Kozlov wrote:
>>>
>>> http://cr.openjdk.java.net/~kvn/6896352/webrev.00
>>>
>>> Fixed 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
>>>
>>> Problem:
>>> Alias type for LoadUS(ConP char[]) node from the test
>>> was not defined during Parse before EA since
>>> C->get_alias_index(phase->type(address)) was not called.
>>> But EA expects all alias types to be defined before it starts.
>>>
>>> Solution:
>>> Always call C->get_alias_index(phase->type(address)) in
>>> MemNode::Ideal_common() which is called by all memory nodes.
>>> I added "volatile" to expressions to avoid C++ removal
>>> since the result is not used.
>>>
>>> Reviewed by:
>>>
>>> Fix verified (y/n): y, test
>>>
>>> Other testing:
>>> JPRT, CTW
>>>

From Vladimir.Kozlov at Sun.COM Wed Nov 4 12:45:40 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 04 Nov 2009 12:45:40 -0800
Subject: Request reviews (S): 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
In-Reply-To: <071E9681-F617-42FD-9A70-59E324382766@sun.com>
References: <4AF1CB45.1000601@sun.com> <071E9681-F617-42FD-9A70-59E324382766@sun.com>
Message-ID: <4AF1E7F4.8060806@sun.com>

Tom Rodriguez wrote:
>
> On Nov 4, 2009, at 10:43 AM, Vladimir Kozlov wrote:
>
>>
>> http://cr.openjdk.java.net/~kvn/6896370/webrev.00
>>
>> Fixed 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating
>> node that's already been matched"
>>
> I like this. Which ones were missing? There's also an oddity that

LoadUB, LoadUI2L, LoadPLocked, LoadLLocked and all LoadStore nodes.

> !is_Store() && is_Mem() != is_Load() so you're now treating LoadStore
> nodes as loads and mem_ops and they weren't previously. Was that
> intentional? Calling set_shared on LoadStoreNodes is probably benign

Yes, I did it intentionally since all Store[P|I|L]Conditional and
CompareAndSwap nodes have general memory with all address modes.

> but triggering the clone_shift_expressions logic for them probably
> isn't. Most cas style instructions don't support full address modes so
> any cloning would be useless.

I disagree, according to x86 documents cas uses general memory:

  CMPXCHG r/m32,r32 - Compare EAX with r/m32. If equal, ZF is set and r32 is
                      loaded into r/m32.
                      Else, clear ZF and load r/m32 into EAX

Vladimir

>
> tom
>
>>
>> Reviewed by:
>>
>> Fix verified (y/n): y, test
>>
>> Other testing:
>> JPRT, CTW
>>
>

From Vladimir.Kozlov at Sun.COM Wed Nov 4 12:46:59 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 04 Nov 2009 12:46:59 -0800
Subject: Request reviews (S): 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
In-Reply-To: <83B2F293-17EA-4D57-86B4-00AF4E6E8551@Sun.COM>
References: <4AF1CD42.8040107@sun.com> <6EFE8668-2768-4DEA-BEA4-1EBFF97B23C1@sun.com> <4AF1DFA1.10804@sun.com> <83B2F293-17EA-4D57-86B4-00AF4E6E8551@Sun.COM>
Message-ID: <4AF1E843.9050505@sun.com>

Thanks, Tom

Vladimir

Tom Rodriguez wrote:
>
> On Nov 4, 2009, at 12:10 PM, Vladimir Kozlov wrote:
>
>> Yes, it works currently without volatile because of side effect
>> but I did not feel comfortable with the code as it looks.
>> Maybe I should just add the comment that C++ will not remove
>> the call since it has side effect.
>
> It seems ok without the volatile to me but add a comment if you like.
>
>>
>> The code in escape.cpp creates new alias type with instance id
>> which is different from original general type created now in
>> Ideal_common (and in different places before).
>
> I'd forgotten about the instance_id. That makes sense then.
>
> tom
>
>> When we process allocation candidates for scalar replacement
>> we cast their type (type of CheckCastPP node) to instance id:
>>
>>   const TypeOopPtr *t = igvn->type(n)->isa_oopptr();
>>   if (t == NULL)
>>     continue; // not a TypeInstPtr
>>   tinst = t->cast_to_exactness(true)->is_oopptr()->cast_to_instance_id(ni);
>>
>> Then we pass the allocation (CheckCastPP node) node as base to
>> split_AddP() and replace AddP node's type with new one:
>>
>>   const TypeOopPtr *base_t = igvn->type(base)->isa_oopptr();
>>   ...
>>   const TypeOopPtr *tinst = base_t->add_offset(t->offset())->is_oopptr();
>>   // Do NOT remove the next line: ensure a new alias index is allocated
>>   // for the instance type
>>   int alias_idx = _compile->get_alias_index(tinst);
>>   igvn->set_type(addp, tinst);
>>
>> This is the core for scalar replacement and how we separate memory
>> slices for non-escaping objects which we intend to eliminate.
>>
>> Vladimir
>>
>> Tom Rodriguez wrote:
>>> I don't think adding volatile could have the effect you are hoping
>>> for and I also don't think it's needed even if it did.
>>> get_alias_index has a side effect over in find_alias_type so it's
>>> impossible for it to be optimized away even if the return value isn't
>>> used. Also if the new call in Ideal_common is doing its job then do
>>> you really need the call in escape.cpp? I don't see how splitting an
>>> AddP could create a new alias type if one already existed for that AddP.
>>> tom
>>> On Nov 4, 2009, at 10:51 AM, Vladimir Kozlov wrote:
>>>>
>>>> http://cr.openjdk.java.net/~kvn/6896352/webrev.00
>>>>
>>>> Fixed 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
>>>>
>>>> Problem:
>>>> Alias type for LoadUS(ConP char[]) node from the test
>>>> was not defined during Parse before EA since
>>>> C->get_alias_index(phase->type(address)) was not called.
>>>> But EA expects all alias types to be defined before it starts.
>>>>
>>>> Solution:
>>>> Always call C->get_alias_index(phase->type(address)) in
>>>> MemNode::Ideal_common() which is called by all memory nodes.
>>>> I added "volatile" to expressions to avoid C++ removal
>>>> since the result is not used.
>>>>
>>>> Reviewed by:
>>>>
>>>> Fix verified (y/n): y, test
>>>>
>>>> Other testing:
>>>> JPRT, CTW
>>>>
>

From Thomas.Rodriguez at Sun.COM Wed Nov 4 13:35:29 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Wed, 04 Nov 2009 13:35:29 -0800
Subject: Request reviews (S): 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
In-Reply-To: <4AF1E7F4.8060806@sun.com>
References: <4AF1CB45.1000601@sun.com> <071E9681-F617-42FD-9A70-59E324382766@sun.com> <4AF1E7F4.8060806@sun.com>
Message-ID: <2DD3777B-DAEA-4CD5-8915-3595583EBFFB@sun.com>

Sorry, for some reason I thought it didn't. Looks good then.

Tom

On Nov 4, 2009, at 12:45 PM, Vladimir Kozlov wrote:

>
>
> Tom Rodriguez wrote:
>> On Nov 4, 2009, at 10:43 AM, Vladimir Kozlov wrote:
>>>
>>> http://cr.openjdk.java.net/~kvn/6896370/webrev.00
>>>
>>> Fixed 6896370: CTW fails share/vm/opto/matcher.cpp:1475
>>> "duplicating node that's already been matched"
>>>
>> I like this. Which ones were missing? There's also an oddity that
>
> LoadUB, LoadUI2L, LoadPLocked, LoadLLocked and all LoadStore nodes.
>
>> !is_Store() && is_Mem() != is_Load() so you're now treating
>> LoadStore nodes as loads and mem_ops and they weren't previously.
>> Was that intentional? Calling set_shared on LoadStoreNodes is
>> probably benign
>
> Yes, I did it intentionally since all Store[P|I|L]Conditional and
> CompareAndSwap nodes have general memory with all address modes.
>
>> but triggering the clone_shift_expressions logic for them probably
>> isn't. Most cas style instructions don't support full address
>> modes so any cloning would be useless.
>
> I disagree, according to x86 documents cas uses general memory:
>
>   CMPXCHG r/m32,r32 - Compare EAX with r/m32. If equal, ZF is set and r32 is
>                       loaded into r/m32.
>                       Else, clear ZF and load r/m32 into EAX
>
> Vladimir
>
>> tom
>>>
>>> Reviewed by:
>>>
>>> Fix verified (y/n): y, test
>>>
>>> Other testing:
>>> JPRT, CTW
>>>

From Thomas.Rodriguez at Sun.COM Wed Nov 4 16:13:26 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Wed, 04 Nov 2009 16:13:26 -0800
Subject: rough ideal overview
Message-ID: <906B9397-5CEE-4157-A7CE-6DF2A0D7700C@Sun.COM>

While trolling through some old mail I found a rough overview of Ideal
I wrote a few years back. I cleaned it up very slightly and dropped it
on the wiki in case anyone is interested.

http://wikis.sun.com/display/HotSpotInternals/Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation

tom

From vladimir.kozlov at sun.com Wed Nov 4 16:51:30 2009
From: vladimir.kozlov at sun.com (vladimir.kozlov at sun.com)
Date: Thu, 05 Nov 2009 00:51:30 +0000
Subject: hg: jdk7/hotspot-comp/hotspot: 6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
Message-ID: <20091105005148.B7748414E0@hg.openjdk.java.net>

Changeset: 09572fede9d1
Author:    kvn
Date:      2009-11-04 14:16 -0800
URL:       http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/09572fede9d1

6896370: CTW fails share/vm/opto/matcher.cpp:1475 "duplicating node that's already been matched"
Summary: Move DecodeN code outside the memory nodes only code.
Reviewed-by: never

! src/share/vm/opto/matcher.cpp

From vladimir.kozlov at sun.com Wed Nov 4 19:39:37 2009
From: vladimir.kozlov at sun.com (vladimir.kozlov at sun.com)
Date: Thu, 05 Nov 2009 03:39:37 +0000
Subject: hg: jdk7/hotspot-comp/hotspot: 6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
Message-ID: <20091105033947.6D6324151D@hg.openjdk.java.net>

Changeset: dcdcc8c16e20
Author:    kvn
Date:      2009-11-04 14:43 -0800
URL:       http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/dcdcc8c16e20

6896352: CTW fails hotspot/src/share/vm/opto/escape.cpp:1155
Summary: Always call C->get_alias_index(phase->type(address)) during parsing.
Reviewed-by: never ! src/share/vm/opto/escape.cpp ! src/share/vm/opto/memnode.cpp From john.coomes at sun.com Fri Nov 6 11:02:37 2009 From: john.coomes at sun.com (john.coomes at sun.com) Date: Fri, 06 Nov 2009 19:02:37 +0000 Subject: hg: jdk7/hotspot-comp: Added tag jdk7-b75 for changeset d1516b9f2395 Message-ID: <20091106190237.C48F9417D7@hg.openjdk.java.net> Changeset: 2bad7eac71b3 Author: mikejwre Date: 2009-10-30 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/rev/2bad7eac71b3 Added tag jdk7-b75 for changeset d1516b9f2395 ! .hgtags From john.coomes at sun.com Fri Nov 6 11:03:02 2009 From: john.coomes at sun.com (john.coomes at sun.com) Date: Fri, 06 Nov 2009 19:03:02 +0000 Subject: hg: jdk7/hotspot-comp/corba: Added tag jdk7-b75 for changeset 0fb137085952 Message-ID: <20091106190306.E4136417D8@hg.openjdk.java.net> Changeset: c8a56aff861b Author: mikejwre Date: 2009-10-30 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/corba/rev/c8a56aff861b Added tag jdk7-b75 for changeset 0fb137085952 ! .hgtags From john.coomes at sun.com Fri Nov 6 11:07:00 2009 From: john.coomes at sun.com (john.coomes at sun.com) Date: Fri, 06 Nov 2009 19:07:00 +0000 Subject: hg: jdk7/hotspot-comp/jaxp: Added tag jdk7-b75 for changeset 555fb78ee4ce Message-ID: <20091106190706.C39CC417DA@hg.openjdk.java.net> Changeset: cb7bd40f5031 Author: mikejwre Date: 2009-10-30 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/jaxp/rev/cb7bd40f5031 Added tag jdk7-b75 for changeset 555fb78ee4ce ! 
.hgtags From john.coomes at sun.com Fri Nov 6 11:07:34 2009 From: john.coomes at sun.com (john.coomes at sun.com) Date: Fri, 06 Nov 2009 19:07:34 +0000 Subject: hg: jdk7/hotspot-comp/jaxws: Added tag jdk7-b75 for changeset fcf2b8b5d606 Message-ID: <20091106190745.56669417DB@hg.openjdk.java.net> Changeset: 27c05c2ad35f Author: mikejwre Date: 2009-10-30 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/jaxws/rev/27c05c2ad35f Added tag jdk7-b75 for changeset fcf2b8b5d606 ! .hgtags From john.coomes at sun.com Fri Nov 6 11:12:05 2009 From: john.coomes at sun.com (john.coomes at sun.com) Date: Fri, 06 Nov 2009 19:12:05 +0000 Subject: hg: jdk7/hotspot-comp/langtools: 8 new changesets Message-ID: <20091106191232.27AB3417DF@hg.openjdk.java.net> Changeset: e526e39579ae Author: jjg Date: 2009-10-13 14:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/e526e39579ae 6887895: CONSTANT_Class_info getBaseName does not handle arrays of primitives correctly Reviewed-by: ksrini ! src/share/classes/com/sun/tools/classfile/ConstantPool.java + test/tools/javap/classfile/T6887895.java Changeset: 8a4543b30586 Author: jjg Date: 2009-10-13 15:26 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/8a4543b30586 6891079: Compiler allows invalid binary literals 0b and oBL Reviewed-by: darcy ! src/share/classes/com/sun/tools/javac/parser/Scanner.java ! src/share/classes/com/sun/tools/javac/resources/compiler.properties + test/tools/javac/literals/T6891079.java + test/tools/javac/literals/T6891079.out Changeset: 86b773b7cb40 Author: jjg Date: 2009-10-14 15:41 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/86b773b7cb40 6838467: JSR199 FileObjects don't obey general contract of equals. Reviewed-by: darcy ! src/share/classes/com/sun/tools/javac/file/BaseFileObject.java ! src/share/classes/com/sun/tools/javac/file/JavacFileManager.java ! src/share/classes/com/sun/tools/javac/file/RegularFileObject.java ! 
src/share/classes/com/sun/tools/javac/file/SymbolArchive.java ! src/share/classes/com/sun/tools/javac/file/ZipArchive.java ! src/share/classes/com/sun/tools/javac/file/ZipFileIndex.java ! src/share/classes/com/sun/tools/javac/file/ZipFileIndexArchive.java ! src/share/classes/com/sun/tools/javac/jvm/ClassReader.java ! test/tools/javac/api/6440528/T6440528.java + test/tools/javac/api/T6838467.java Changeset: b8936a7930fe Author: darcy Date: 2009-10-14 18:56 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/b8936a7930fe 6558804: Specification for Elements.getDocComment(Element e) should be clarified Reviewed-by: jjg ! src/share/classes/javax/lang/model/util/Elements.java Changeset: d1e62f78c48b Author: tbell Date: 2009-10-15 22:48 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/d1e62f78c48b Merge Changeset: 6ba399eff2cb Author: jjg Date: 2009-10-16 12:56 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/6ba399eff2cb 6888367: classfile library parses signature attributes incorrectly Reviewed-by: ksrini ! src/share/classes/com/sun/tools/classfile/Signature.java ! src/share/classes/com/sun/tools/classfile/Type.java ! src/share/classes/com/sun/tools/javap/ClassWriter.java ! src/share/classes/com/sun/tools/javap/LocalVariableTypeTableWriter.java + test/tools/javap/classfile/6888367/T6888367.java Changeset: 2485f5641ed0 Author: jjg Date: 2009-10-19 13:38 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/2485f5641ed0 6889255: javac MethodSymbol throws NPE if ClassReader does not read parameter names correctly Reviewed-by: darcy ! src/share/classes/com/sun/tools/javac/code/Symbol.java ! src/share/classes/com/sun/tools/javac/jvm/ClassReader.java + test/tools/javac/6889255/T6889255.java Changeset: c8083dc525b6 Author: mikejwre Date: 2009-10-30 10:55 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/langtools/rev/c8083dc525b6 Added tag jdk7-b75 for changeset 2485f5641ed0 ! 
.hgtags

From Thomas.Rodriguez at Sun.COM Mon Nov 9 15:46:57 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Mon, 09 Nov 2009 15:46:57 -0800
Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns
Message-ID: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM>

http://cr.openjdk.java.net/~never/6892658/

From Thomas.Rodriguez at Sun.COM Mon Nov 9 15:57:31 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Mon, 09 Nov 2009 15:57:31 -0800
Subject: review (XS) for 6892079: live value must not be garbage failure after fix for 6854812
Message-ID: <20DC8F32-10AD-4463-A6E6-703BD41AC1CE@Sun.COM>

http://cr.openjdk.java.net/~never/6892079

From Vladimir.Kozlov at Sun.COM Mon Nov 9 17:00:14 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Mon, 09 Nov 2009 17:00:14 -0800
Subject: review (XS) for 6892079: live value must not be garbage failure after fix for 6854812
In-Reply-To: <20DC8F32-10AD-4463-A6E6-703BD41AC1CE@Sun.COM>
References: <20DC8F32-10AD-4463-A6E6-703BD41AC1CE@Sun.COM>
Message-ID: <4AF8BB1E.8070803@sun.com>

Looks good.

Vladimir

Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/6892079

From andreas.kohn at fredhopper.com Tue Nov 10 05:55:18 2009
From: andreas.kohn at fredhopper.com (Andreas Kohn)
Date: Tue, 10 Nov 2009 14:55:18 +0100
Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns
In-Reply-To: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM>
References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM>
Message-ID: <1257861318.4141.24.camel@tiamaria.ams.fredhopper.com>

On Mon, 2009-11-09 at 15:46 -0800, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/6892658/

Hi,

more a question than a review: should match_F_Y() not require
synchronized?

 306 inline bool match_F_Y(jshort flags) {
 307   const int req = 0;
                     = JVM_ACC_SYNCHRONIZED; ?
 308   const int neg = JVM_ACC_STATIC;
 309   return (flags & (req | neg)) == req;
 310 }

I'm just a casual reader, so I might be missing something here.

Regards,
--
Andreas

--
Never attribute to malice that which can be adequately explained by
stupidity. -- Hanlon's Razor

From Christian.Thalinger at Sun.COM Tue Nov 10 11:42:27 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Tue, 10 Nov 2009 20:42:27 +0100
Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns
In-Reply-To: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM>
References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM>
Message-ID: <1257882147.998.1.camel@macbook>

On Mon, 2009-11-09 at 15:46 -0800, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/6892658/

src/share/vm/memory/universe.hpp:

+   static oop _the_MIN_VALUE_string; // A cache of "" as a Java string

The min-value is missing in the comment.

src/share/vm/opto/callGenerator.cpp:

  void WarmCallInfo::make_hot() {

Why is this method now Unimplemented()?

src/share/vm/opto/callnode.cpp:

+   // The resproj may not exist because the result couuld be ignored
+   // and the exception object may not be exist if an exception handler

Typo.

src/share/vm/opto/compile.cpp:

+   // Separate projections were use for the exception path which

"were used"?

src/share/vm/opto/phase.hpp:

    LIVE,               // Dragon-book LIVE range problem
+   StringOpts,
    Interference_Graph, // Building the IFG

A comment would be nice.

src/share/vm/utilities/growableArray.hpp:

+   /* inserts the given element before the element at index i */

A C comment?

I will look at src/share/vm/opto/stringopts.cpp tomorrow.
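
Editorial sketch of the match_F_Y() question raised in Andreas Kohn's message above; it is not taken from the webrev. The two function names are hypothetical, and the local `ACC_*` constants stand in for HotSpot's `JVM_ACC_*` macros, using the access-flag values from the class-file format. It compares the predicate as posted (`req = 0`) with the suggested `req = JVM_ACC_SYNCHRONIZED`.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for JVM_ACC_STATIC, JVM_ACC_SYNCHRONIZED, JVM_ACC_NATIVE,
// with the values defined by the JVM class-file format.
const int ACC_STATIC       = 0x0008;
const int ACC_SYNCHRONIZED = 0x0020;
const int ACC_NATIVE       = 0x0100;

// As posted: no bit is required, so every non-static method matches,
// synchronized or not.
inline bool match_F_Y_as_posted(int16_t flags) {
  const int req = 0;
  const int neg = ACC_STATIC;
  return (flags & (req | neg)) == req;
}

// As suggested: the synchronized bit must be set and the static bit clear.
// The native bit is not examined at all.
inline bool match_F_Y_suggested(int16_t flags) {
  const int req = ACC_SYNCHRONIZED;
  const int neg = ACC_STATIC;
  return (flags & (req | neg)) == req;
}
```

With these definitions a plain instance method (flags 0) passes the posted predicate but fails the suggested one, while a synchronized instance method passes both. Masking with `req | neg` and comparing against `req` checks "all required bits set, all negated bits clear" in a single comparison.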

-- Christian

From Thomas.Rodriguez at Sun.COM Tue Nov 10 13:11:57 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Tue, 10 Nov 2009 13:11:57 -0800
Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns
In-Reply-To: <1257861318.4141.24.camel@tiamaria.ams.fredhopper.com>
References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <1257861318.4141.24.camel@tiamaria.ams.fredhopper.com>
Message-ID: <48E85157-7EF7-4CCC-A8FF-62918F273E73@Sun.COM>

No, there's definitely something wrong. It worked to match what I
wanted it to, so I didn't look that closely, but it will certainly match
a broader range of things than it should. I would expect the match
functions to be disjoint from each other but both F_R and F_S ignore
the JVM_ACC_NATIVE bit so F_RN matches F_R and F_SN matches F_S though
the reverse isn't true. Anyway, thanks for catching this. I've fixed
match_F_Y as you indicated and updated the flags enum to more clearly
report the indeterminate nature of the native bit in some of the types.

    F_none = 0,
    F_R,    // !static ?native !synchronized (R="regular")
    F_S,    //  static ?native !synchronized
    F_Y,    // !static ?native  synchronized
    F_RN,   // !static  native !synchronized
    F_SN,   //  static  native !synchronized
    F_RNY   // !static  native  synchronized

I may file a separate bug to add testing of JVM_ACC_NATIVE to the ones
which are currently ignoring it.

tom

On Nov 10, 2009, at 5:55 AM, Andreas Kohn wrote:

> On Mon, 2009-11-09 at 15:46 -0800, Tom Rodriguez wrote:
>> http://cr.openjdk.java.net/~never/6892658/
> Hi,
>
> more a question than a review: should match_F_Y() not require
> synchronized?
>
>  306 inline bool match_F_Y(jshort flags) {
>  307   const int req = 0;
>                      = JVM_ACC_SYNCHRONIZED; ?
>  308   const int neg = JVM_ACC_STATIC;
>  309   return (flags & (req | neg)) == req;
>  310 }
>
> I'm just a casual reader, so I might be missing something here.
> > Regards, > -- > Andreas > > -- > Never attribute to malice that which can be adequately explained by > stupidity. -- Hanlon's Razor From Thomas.Rodriguez at Sun.COM Tue Nov 10 13:15:19 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Tue, 10 Nov 2009 13:15:19 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <1257882147.998.1.camel@macbook> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <1257882147.998.1.camel@macbook> Message-ID: <20D7A9DD-F273-4EB7-8DD2-5A3FEF1D0324@Sun.COM> On Nov 10, 2009, at 11:42 AM, Christian Thalinger wrote: > On Mon, 2009-11-09 at 15:46 -0800, Tom Rodriguez wrote: >> http://cr.openjdk.java.net/~never/6892658/ > > src/share/vm/memory/universe.hpp: > > + static oop _the_MIN_VALUE_string; // A cache of "" as a Java string > > The min-value is missing in the comment. Fixed. > src/share/vm/opto/callGenerator.cpp: > > void WarmCallInfo::make_hot() { > > Why is this method now Unimplemented()? make_hot is a non-working implementation of late inlining and long term I think all that code should be deleted and rebuilt if we intend to move more generally to a late inlining strategy. I just went ahead and deleted that for now. I could leave it behind if you like. > > > src/share/vm/opto/callnode.cpp: > > + // The resproj may not exist because the result couuld be ignored > + // and the exception object may not be exist if an exception handler > > Typo. Fixed. > > > src/share/vm/opto/compile.cpp: > > + // Separate projections were use for the exception path which > > "were used"? Fixed. > > > src/share/vm/opto/phase.hpp: > > LIVE, // Dragon-book LIVE range problem > + StringOpts, > Interference_Graph, // Building the IFG > > A comment would be nice. Fixed. > src/share/vm/utilities/growableArray.hpp: > > + /* inserts the given element before the element at index i */ > > A C comment? 
It started as a copy paste from a context that couldn't use C++ style comments and I never fixed it. I've corrected it. tom > > I will look at src/share/vm/opto/stringopts.cpp tomorrow. > > -- Christian > From Christian.Thalinger at Sun.COM Tue Nov 10 13:24:34 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Tue, 10 Nov 2009 22:24:34 +0100 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <20D7A9DD-F273-4EB7-8DD2-5A3FEF1D0324@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <1257882147.998.1.camel@macbook> <20D7A9DD-F273-4EB7-8DD2-5A3FEF1D0324@Sun.COM> Message-ID: <1257888274.998.3.camel@macbook> On Tue, 2009-11-10 at 13:15 -0800, Tom Rodriguez wrote: > > src/share/vm/opto/callGenerator.cpp: > > > > void WarmCallInfo::make_hot() { > > > > Why is this method now Unimplemented()? > > make_hot is a non-working implementation of late inlining and long > term I think all that code should be deleted and rebuilt if we intend > to move more generally to a late inlining strategy. I just went ahead > and deleted that for now. I could leave it behind if you like. No, I just wanted to have a context here. Thanks. -- Christian From Vladimir.Kozlov at Sun.COM Tue Nov 10 14:11:21 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Tue, 10 Nov 2009 14:11:21 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> Message-ID: <4AF9E509.9030108@sun.com> Tom, here is review of all files but 2 new, I am looking on them next the_MIN_VALUE_string may be should be the_MIN_INT_VALUE_string universe.hpp static oop _the_MIN_VALUE_string; // A cache of "" as a Java string ^ "-2147483648" c2_globals.hpp I thought we will use experimental for OptimizeStringConcat. callGenerator.cpp missing _call_node(NULL) in DirectCallGenerator() constructor. 
bool _separate_io_proj; <- add field's descriptor by copying the comment from LateInlineCallGenerator::generate() do_late_inline() add checks to not inline when call_node()->in(0) == NULL || call_node()->in(0)->is_top() + for (uint i1 = 0; i1 < call->req(); i1++) { ^ size callnode.cpp You copied code from PhaseMacroExpand::extract_call_projections(). Can you use your new method in macro.cpp also? At least for HS17 changes when you will have more time. compile.cpp In gvn_replace_by() why there is no initial_gvn()->hash_insert(use)? Instead of OptimizeStringConcat && has_stringbuilder() and OptimizeStringConcat checks may be we can use only has_stringbuilder() (or different name for query method) and check OptimizeStringConcat in parseHelper.cpp where has_stringbuilder is set? doCall.cpp I saw cases when append methods were not inlined because they were already compiled into "big" compiled method. You call for_late_inline() after ok_to_inline(), so may be we should relax that condition for OptimizeStringConcat case. Should you check that safepoint has > jvms->argoff() inputs?: Node* receiver = jvms->map()->in(jvms->argoff() + 1); macro.cpp Can you print array only when it is array?: + log->head("eliminate_allocation %s type='%d'", + alloc->is_AllocateArray() ? "array" : "", log->identify(tklass->klass())); the same with "lock":"unlock" Vladimir Tom Rodriguez wrote: > http://cr.openjdk.java.net/~never/6892658/ From Vladimir.Kozlov at Sun.COM Tue Nov 10 18:13:24 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Tue, 10 Nov 2009 18:13:24 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> Message-ID: <4AFA1DC4.6060302@sun.com> Tom Rodriguez wrote: > http://cr.openjdk.java.net/~never/6892658/ stringopts.cpp It would be nice to have all fields description in StringConcat class. 
In the change description you said that new String(SB.toString()) is also processed, but the code is under #if 0 (merge_add()). Change PrintOptimizeStringConcat to a "notproduct" flag since it is used only under #ifndef PRODUCT. In StringConcat::merge() _stringopt could be used instead of other->_stringopts in new StringConcat(other->_stringopts, _end) Also _begin instead of other->_begin in result->set_allocation(other->_begin) It would allow merging "other" several times; see my comment below about coalescing concats. In StringConcat::eliminate_initialize() it would be nice to have an assert that the initialize node doesn't have raw stores: assert(init->req() <= InitializeNode::RawStores,""); In build_candidate() you may use recv->uncast() instead of: 366 if (recv->Opcode() == Op_CastPP) { 367 recv = recv->in(1); 368 } In the same code you skip Proj, so you may fail AllocateNode::Ideal_allocation() since it expects Proj or CheckCastPP nodes. Just check that recv->is_Allocate(). There are several places where you do the next check; maybe you can factor it into a separate function: method->holder() == C->env()->StringBuilder_klass() || method->holder() == C->env()->StringBuffer_klass() Maybe also verify has_stringbuilder() in PhaseStringOpts(). I don't understand the next break code. 562 c = 0; 563 break; You have 3 nested loops, so the next break will return to the second loop, not the first, so "sc" will not be updated and o==0 will be skipped. Why? Also, this coalesce code will not work if "other" is used by several sc/arguments, since you removed it from the list after the first match and merge. For example: String s0 = new SB().append(1)...toString(); String s1 = new SB().append(s0).append(s0).toString(); I would keep it and always replace "c" with the merged one (you need to modify StringConcat::merge() as I pointed out above). The "o" will be removed automatically if there are no other uses. I will look at copy_string() and related methods tomorrow.
Vladimir From Thomas.Rodriguez at Sun.COM Tue Nov 10 22:31:16 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Tue, 10 Nov 2009 22:31:16 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AF9E509.9030108@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AF9E509.9030108@sun.com> Message-ID: <3AD5BDF3-4E2D-4098-8207-BAA8A4C41319@Sun.COM> On Nov 10, 2009, at 2:11 PM, Vladimir Kozlov wrote: > Tom, > > here is review of all files but 2 new, I am looking on them next > > the_MIN_VALUE_string may be should be the_MIN_INT_VALUE_string I was trying to mimic Integer.MIN_VALUE. If you want int in there then how about the_min_jint_string? > universe.hpp > static oop _the_MIN_VALUE_string; // A cache of "" as a Java string > ^ "-2147483648" Fixed. > c2_globals.hpp > I thought we will use experimental for OptimizeStringConcat. I guess we could. I don't really like experimental much but I'm not completely against it. > callGenerator.cpp > missing _call_node(NULL) in DirectCallGenerator() constructor. > > bool _separate_io_proj; <- add field's descriptor by copying the comment > from LateInlineCallGenerator::generate() Ok. > > do_late_inline() > add checks to not inline when > call_node()->in(0) == NULL || call_node()->in(0)->is_top() Ok > > + for (uint i1 = 0; i1 < call->req(); i1++) { > ^ size Ok. > callnode.cpp > You copied code from PhaseMacroExpand::extract_call_projections(). > Can you use your new method in macro.cpp also? At least for HS17 changes > when you will have more time. I'll make that change later. > compile.cpp > In gvn_replace_by() why there is no initial_gvn()->hash_insert(use)? It was an oversight. I've added it. The code should roughly mirror Node::subsume_node. 
> Instead of OptimizeStringConcat && has_stringbuilder() and OptimizeStringConcat > checks may be we can use only has_stringbuilder() (or different name for query method) > and check OptimizeStringConcat in parseHelper.cpp where has_stringbuilder is set? I'd be ok with that. > > doCall.cpp > I saw cases when append methods were not inlined because they were > already compiled into "big" compiled method. You call for_late_inline() > after ok_to_inline(), so may be we should relax that condition for > OptimizeStringConcat case. The code doesn't care whether the interesting methods are inlined or not. If the methods would have been inlined then we register a late inline for them. If they wouldn't have been inlined then we tell the DirectCallGenerator to use separate io projs so that we can properly find all the edges if we need to replace it. The late inlining logic is mainly about giving us the same result if we don't end up performing the optimization. > Should you check that safepoint has > jvms->argoff() inputs?: > Node* receiver = jvms->map()->in(jvms->argoff() + 1); I guess I could, but if there aren't enough then the call itself is malformed and we'll die later, won't we? Do you think I should? > > macro.cpp > Can you print array only when it is array?: > + log->head("eliminate_allocation %s type='%d'", > + alloc->is_AllocateArray() ? "array" : "", log->identify(tklass->klass())); > > the same with "lock":"unlock" I could but that's not very good xml. I think I'll just leave that out since arrayness should be discerned from the klass. I'd prefer to leave it as is for lock though.
tom > > Vladimir > > Tom Rodriguez wrote: >> http://cr.openjdk.java.net/~never/6892658/ From changpeng.fang at sun.com Tue Nov 10 23:33:50 2009 From: changpeng.fang at sun.com (changpeng.fang at sun.com) Date: Wed, 11 Nov 2009 07:33:50 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 12 new changesets Message-ID: <20091111073415.D4B8D41EFB@hg.openjdk.java.net> Changeset: fc06cd9b42c7 Author: tonyp Date: 2009-10-23 14:34 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/fc06cd9b42c7 6886024: G1: assert(recent_avg_pause_time_ratio() < 1.00,"All GC?") Summary: the assert is incorrect and can fire incorrectly due to floating point inaccuracy. Reviewed-by: apetrusenko, ysr, jcoomes ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp Changeset: 6270f80a7331 Author: tonyp Date: 2009-09-30 14:50 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/6270f80a7331 6890137: G1: revamp reachable object dump Summary: Revamp the reachable object dump debugging facility. Reviewed-by: jmasa, apetrusenko ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1_globals.hpp Changeset: fa2f65ebeb08 Author: apetrusenko Date: 2009-10-27 02:42 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/fa2f65ebeb08 6870843: G1: G1 GC memory leak Summary: The fix addresses two memory leaks in G1 code: (1) _evac_failure_scan_stack - a resource object allocated on the C heap was not freed; (2) RSHashTable were linked into deleted list which was only cleared at full GC. Reviewed-by: tonyp, iveresov ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/sparsePRT.cpp ! 
src/share/vm/gc_implementation/g1/sparsePRT.hpp Changeset: 72a6752ac432 Author: ysr Date: 2009-10-28 11:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/72a6752ac432 6818264: Heap dumper unexpectedly adds .hprof suffix Summary: Restore old behaviour wrt HeapDumpPath; first dump goes to , th dump goes to ., with default value of the same as before. Reviewed-by: alanb, jcoomes, tonyp ! src/share/vm/services/heapDumper.cpp Changeset: beb8f45ee9f0 Author: johnc Date: 2009-10-29 09:42 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/beb8f45ee9f0 6889740: G1: OpenDS fails with "unhandled exception in compiled code" Summary: Incorrect code was being generated for the store operation in the null case of the aastore bytecode template. The bad code was generated by the store_heap_oop routine which takes a Register as its second argument. Passing NULL_WORD (0) as the second argument causes the value to be converted to Register(0), which is rax. Thus the generated store was "mov (dst), $rax" instead of "mov (dst), $0x0". Changed calls to store_heap_oop that pass NULL_WORD as the second argument to a new routine store_heap_oop_null. Reviewed-by: kvn, twisti ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/templateTable_x86_64.cpp Changeset: 29adffcb6a61 Author: tonyp Date: 2009-10-30 13:31 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/29adffcb6a61 Merge Changeset: 26f1542097f1 Author: ysr Date: 2009-11-03 16:43 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/26f1542097f1 6801625: CDS: HeapDump tests crash with internal error in compactingPermGenGen.cpp Summary: Allow iteration over the shared spaces when using CDS, repealing previous proscription. Deferred further required CDS-related cleanups of perm gen to CR 6897789. Reviewed-by: phh, jmasa ! src/share/vm/memory/compactingPermGenGen.cpp ! src/share/vm/memory/compactingPermGenGen.hpp ! 
src/share/vm/memory/generation.cpp Changeset: bc1144adedfb Author: mikejwre Date: 2009-10-30 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/bc1144adedfb Added tag jdk7-b75 for changeset d8dd291a362a ! .hgtags Changeset: a6280c71758e Author: trims Date: 2009-11-05 15:44 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/a6280c71758e Merge Changeset: 50c16f09ddf5 Author: trims Date: 2009-11-05 15:58 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/50c16f09ddf5 6898707: Bump the HS17 build number to 05 Summary: Update the HS17 build number to 05 Reviewed-by: jcoomes ! make/hotspot_version Changeset: 9174bb32e934 Author: trims Date: 2009-11-06 00:41 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/9174bb32e934 Merge Changeset: 2f1ec89b9995 Author: cfang Date: 2009-11-10 17:00 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/2f1ec89b9995 Merge ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp From Vladimir.Kozlov at Sun.COM Wed Nov 11 08:36:44 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Wed, 11 Nov 2009 08:36:44 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <3AD5BDF3-4E2D-4098-8207-BAA8A4C41319@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AF9E509.9030108@sun.com> <3AD5BDF3-4E2D-4098-8207-BAA8A4C41319@Sun.COM> Message-ID: <4AFAE81C.7060406@sun.com> Tom Rodriguez wrote: >> >> the_MIN_VALUE_string may be should be the_MIN_INT_VALUE_string > > I was trying to mimic Integer.MIN_VALUE. If you want int in there then how about the_min_jint_string? > Agree with the_min_jint_string. > >> c2_globals.hpp >> I thought we will use experimental for OptimizeStringConcat. > > I guess we could. I don't really like experimental much but I'm not completely against it. > I would prefer experimental. 
> >> doCall.cpp >> I saw cases when append methods were not inlined because they were >> already compiled into "big" compiled method. You call for_late_inline() >> after ok_to_inline(), so may be we should relax that condition for >> OptimizeStringConcat case. > > The code doesn't care whether the interesting methods are inlined or not. If the methods would have been inlined then we register a late inline for them. If they wouldn't have been inlined then we tell the DirectCallGenerator to use separate io projs so that we can properly find all the edges if we need to replace it. THe late inlining logic is mainly about giving us the same result if we don't end up performing the optimization. > You are right. The code delays inlining, not trying to inline. >> Should you check that safepoint has > jvms->argoff() inputs?: >> Node* receiver = jvms->map()->in(jvms->argoff() + 1); > > I guess I could but if there aren't enough then the call itself is malformed and we'll die later won't we? Do you think i should? > Leave it as it is. >> macro.cpp >> Can you print array only when it is array?: >> + log->head("eliminate_allocation %s type='%d'", >> + alloc->is_AllocateArray() ? "array" : "", log->identify(tklass->klass())); >> >> the same with "lock":"unlock" > > I could but that's not very good xml. I think I'll just leave that out since arrayness should be discerned from the klass. I'd prefer to leave it as it for lock though. > Leave it as it is since it is better for XML. 
Thanks, Vladimir > tom > >> Vladimir >> >> Tom Rodriguez wrote: >>> http://cr.openjdk.java.net/~never/6892658/ > From Thomas.Rodriguez at Sun.COM Wed Nov 11 08:55:18 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 08:55:18 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AFA1DC4.6060302@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFA1DC4.6060302@sun.com> Message-ID: On Nov 10, 2009, at 6:13 PM, Vladimir Kozlov wrote: > > Tom Rodriguez wrote: >> http://cr.openjdk.java.net/~never/6892658/ > > stringopts.cpp > > It would be nice to have all fields description in StringConcat class. Added. > In the changes description you said that new String(SB.toString()) > is processed also but the code is under #if 0 (merge_add()). I decided there were some possible correctness issues with the logic proving that it wasn't used anywhere else so I pulled it out. I'll restore it later. > > Change PrintOptimizeStringConcat to "notproduct" flag since > it is used only under #ifndef PRODUCT. Ok. > In StringConcat::merge() > _stringopt could be used instead of other->_stringopts > in new StringConcat(other->_stringopts, _end) Sure. > Also _begin instead of other->_begin > in result->set_allocation(other->_begin) _begin must be the earliest JVMState of the pattern, and other->_begin has to be earlier than _begin, otherwise they couldn't be merged, so I can't just swap them around. > It will allow to merge "other" several times see my comment bellow > about coalesce concats. I'm certainly not going to incorporate any extensions in it at this point and I'm a little dubious on the utility of handling that pattern as well. > > In StringConcat::eliminate_initialize() would be nice > to have assert that initialize node doesn't have raw stores: > assert(init->req() <= InitializeNode::RawStores,""); Ok.
> In build_candidate() you may use recv->uncast() instead of: > > 366 if (recv->Opcode() == Op_CastPP) { > 367 recv = recv->in(1); > 368 } Ok. > At the same code you skip Proj so you may fail AllocateNode::Ideal_allocation() > since it expects Proj or CheckCastPP nodes. Just check that recv->is_Allocate(). Ok. > > There are several places where you do next check, > may be you can factor it in a separate function: > > method->holder() == C->env()->StringBuilder_klass() || > method->holder() == C->env()->StringBuffer_klass() I'm not sure factoring it out would be better. > May be also verify has_stringbuilder() in PhaseStringOpts(). Why? > > I don't understand next break code. > > 562 c = 0; > 563 break; > > You have 3 nested loops so the next break will return to the > second loop, not first, so "sc" will not be updated and > o==0 will be skipped. Why? I'd intended to restart the search at the beginning, but you're right that it's not restarting the way I want. I've switched to a goto to the head of the first loop since we don't have labeled breaks. > Also this coalesce code will not work if "other" is used by > several sc/arguments since you removed it from the list after > first match and merge. For example: > > String s0 = new SB().append(1)...toString(); > String s1 = new SB().append(s0).append(s0).toString(); > > I would keep it and always replace "c" with merged > (you need to modify StringConcat::merge() as I pointed above). > The "o" will be removed automatically if there are no other uses. I don't want to support that. I don't think that's an interesting pattern. It would also require rewriting the management of the control and trap lists and I don't want to get into that. > > I will look on copy_string() and related methods tomorrow. Thanks.
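The restart issue discussed above can be illustrated with a small standalone sketch. This is not the code in stringopts.cpp: Concat and coalesce are hypothetical stand-ins, with a "concat" modeled as an id plus the ids it takes as arguments. It only shows the control-flow point, namely that a plain break leaves just the innermost of three nested loops, so a goto to a label ahead of the outermost loop is one way to restart the whole scan after the list has been modified.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a concat: an id plus the ids of its arguments.
struct Concat {
  int id;
  std::vector<int> args;
};

// Inline any concat that appears as an argument of another concat and
// drop it from the list. Every merge changes the list, so the scan
// restarts from the first element afterwards.
void coalesce(std::vector<Concat>& concats) {
restart:
  for (std::size_t c = 0; c < concats.size(); c++) {
    for (std::size_t a = 0; a < concats[c].args.size(); a++) {
      for (std::size_t o = 0; o < concats.size(); o++) {
        if (o == c || concats[o].id != concats[c].args[a]) continue;
        // Splice the other concat's arguments in place of the reference.
        std::vector<int> inner = concats[o].args;
        std::vector<int>& args = concats[c].args;
        args.erase(args.begin() + static_cast<std::ptrdiff_t>(a));
        args.insert(args.begin() + static_cast<std::ptrdiff_t>(a),
                    inner.begin(), inner.end());
        concats.erase(concats.begin() + static_cast<std::ptrdiff_t>(o));
        // A plain `break` here would only leave the innermost loop.
        goto restart;
      }
    }
  }
}
```

Each merge removes one element from the list, so the restarts are bounded by the initial number of concats.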
tom > > Vladimir From Thomas.Rodriguez at Sun.COM Wed Nov 11 09:11:51 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 09:11:51 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AFAE81C.7060406@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AF9E509.9030108@sun.com> <3AD5BDF3-4E2D-4098-8207-BAA8A4C41319@Sun.COM> <4AFAE81C.7060406@sun.com> Message-ID: I've updated the webrev with all these changes. tom On Nov 11, 2009, at 8:36 AM, Vladimir Kozlov wrote: > Tom Rodriguez wrote: >>> >>> the_MIN_VALUE_string may be should be the_MIN_INT_VALUE_string >> I was trying to mimic Integer.MIN_VALUE. If you want int in there then how about the_min_jint_string? > > Agree with the_min_jint_string. > >>> c2_globals.hpp >>> I thought we will use experimental for OptimizeStringConcat. >> I guess we could. I don't really like experimental much but I'm not completely against it. > > I would prefer experimental. > >>> doCall.cpp >>> I saw cases when append methods were not inlined because they were >>> already compiled into "big" compiled method. You call for_late_inline() >>> after ok_to_inline(), so may be we should relax that condition for >>> OptimizeStringConcat case. >> The code doesn't care whether the interesting methods are inlined or not. If the methods would have been inlined then we register a late inline for them. If they wouldn't have been inlined then we tell the DirectCallGenerator to use separate io projs so that we can properly find all the edges if we need to replace it. THe late inlining logic is mainly about giving us the same result if we don't end up performing the optimization. > > You are right. The code delays inlining, not trying to inline. 
> >>> Should you check that safepoint has > jvms->argoff() inputs?: >>> Node* receiver = jvms->map()->in(jvms->argoff() + 1); >> I guess I could but if there aren't enough then the call itself is malformed and we'll die later won't we? Do you think i should? > > Leave it as it is. > >>> macro.cpp >>> Can you print array only when it is array?: >>> + log->head("eliminate_allocation %s type='%d'", >>> + alloc->is_AllocateArray() ? "array" : "", log->identify(tklass->klass())); >>> >>> the same with "lock":"unlock" >> I could but that's not very good xml. I think I'll just leave that out since arrayness should be discerned from the klass. I'd prefer to leave it as it for lock though. > > Leave it as it is since it is better for XML. > > Thanks, > Vladimir > >> tom >>> Vladimir >>> >>> Tom Rodriguez wrote: >>>> http://cr.openjdk.java.net/~never/6892658/ From Thomas.Rodriguez at Sun.COM Wed Nov 11 10:06:17 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 10:06:17 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AF9E509.9030108@sun.com> <3AD5BDF3-4E2D-4098-8207-BAA8A4C41319@Sun.COM> <4AFAE81C.7060406@sun.com> Message-ID: <76F2BA51-1592-4BCF-9D21-297463CE2ECD@sun.com> Note that using experimental required changes to several globals file because the macro didn't support experimental as is. tom On Nov 11, 2009, at 9:11 AM, Tom Rodriguez wrote: > I've updated the webrev with all these changes. > > tom > > On Nov 11, 2009, at 8:36 AM, Vladimir Kozlov wrote: > >> Tom Rodriguez wrote: >>>> >>>> the_MIN_VALUE_string may be should be the_MIN_INT_VALUE_string >>> I was trying to mimic Integer.MIN_VALUE. If you want int in there then how about the_min_jint_string? >> >> Agree with the_min_jint_string. >> >>>> c2_globals.hpp >>>> I thought we will use experimental for OptimizeStringConcat. >>> I guess we could. 
I don't really like experimental much but I'm not completely against it. >> >> I would prefer experimental. >> >>>> doCall.cpp >>>> I saw cases when append methods were not inlined because they were >>>> already compiled into "big" compiled method. You call for_late_inline() >>>> after ok_to_inline(), so may be we should relax that condition for >>>> OptimizeStringConcat case. >>> The code doesn't care whether the interesting methods are inlined or not. If the methods would have been inlined then we register a late inline for them. If they wouldn't have been inlined then we tell the DirectCallGenerator to use separate io projs so that we can properly find all the edges if we need to replace it. THe late inlining logic is mainly about giving us the same result if we don't end up performing the optimization. >> >> You are right. The code delays inlining, not trying to inline. >> >>>> Should you check that safepoint has > jvms->argoff() inputs?: >>>> Node* receiver = jvms->map()->in(jvms->argoff() + 1); >>> I guess I could but if there aren't enough then the call itself is malformed and we'll die later won't we? Do you think i should? >> >> Leave it as it is. >> >>>> macro.cpp >>>> Can you print array only when it is array?: >>>> + log->head("eliminate_allocation %s type='%d'", >>>> + alloc->is_AllocateArray() ? "array" : "", log->identify(tklass->klass())); >>>> >>>> the same with "lock":"unlock" >>> I could but that's not very good xml. I think I'll just leave that out since arrayness should be discerned from the klass. I'd prefer to leave it as it for lock though. >> >> Leave it as it is since it is better for XML. 
>> >> Thanks, >> Vladimir >> >>> tom >>>> Vladimir >>>> >>>> Tom Rodriguez wrote: >>>>> http://cr.openjdk.java.net/~never/6892658/ > From Vladimir.Kozlov at Sun.COM Wed Nov 11 10:36:13 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Wed, 11 Nov 2009 10:36:13 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFA1DC4.6060302@sun.com> Message-ID: <4AFB041D.20803@sun.com> Tom Rodriguez wrote: > >> Also _begin instead of other->_begin >> in result->set_allocation(other->_begin) > > _begin is must be the earliest JVMState of the pattern and other->_begin has to be earlier than _begin otherwise the couldn't be merged so I can't just swap them around. > Then I don't get how you optimize next code: SB.append((new SB()).append(s).toString()).toString() and I don't see any checks that other->_begin dominates _begin. > >> There are several places where you do next check, >> may be you can factor it in a separate function: >> >> method->holder() == C->env()->StringBuilder_klass() || >> method->holder() == C->env()->StringBuffer_klass() > > I'm not sure factoring it out would be better. OK. > >> May be also verify has_stringbuilder() in PhaseStringOpts(). > > Why? OK, I see that caller code of PhaseStringOpts() has has_stringbuilder() >> Also this coalesce code will not work if "other" is used by >> several sc/arguments since you removed it from the list after >> first match and merge. For example: >> >> String s0 = new SB().append(1)...toString(); >> String s1 = new SB().append(s0).append(s0).toString(); >> >> I would keep it and always replace "c" with merged >> (you need to modify StringConcat::merge() as I pointed above). >> The "o" will be removed automatically if there are no other uses. > > I don't want to support that. I don't think that's an interesting pattern. 
It would also require rewriting the management of the control and trap lists and I don't want to get into that. > OK. Vladimir >> I will look on copy_string() and related methods tomorrow. > > Thanks. > > tom > >> Vladimir > From Thomas.Rodriguez at Sun.COM Wed Nov 11 11:08:20 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 11:08:20 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AFB041D.20803@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFA1DC4.6060302@sun.com> <4AFB041D.20803@sun.com> Message-ID: <5388A8D6-810E-4607-A8DC-09879F2917E6@Sun.COM> On Nov 11, 2009, at 10:36 AM, Vladimir Kozlov wrote: > > > Tom Rodriguez wrote: >>> Also _begin instead of other->_begin >>> in result->set_allocation(other->_begin) >> _begin is must be the earliest JVMState of the pattern and other->_begin has to be earlier than _begin otherwise the couldn't be merged so I can't just swap them around. > > Then I don't get how you optimize next code: > > SB.append((new SB()).append(s).toString()).toString() It won't handle that as it's currently constructed but it handles String s = new SB().append().append().toString(); String s2 = new SB().append().append(s).toString(); which is a case we actually care about. Handling the case you illustrate would require extending the logic in build_candidate quite a bit I think. I think there are more complex SB pattern that we might like to get but this is currently targeting basic ones. We can add more later. > and I don't see any checks that other->_begin dominates _begin. It's by construction. Each string concat is a linear piece of control flow from the toString back to the allocation with nothing unknown in between. We identify a stacking opportunity by detecting that one StringConcat is an argument to another. Then we merge them together and verify that they still form a closed graph. 
That will only be true if they form another linear sequence so other->_begin must dominate _begin. tom > >>> There are several places where you do next check, >>> may be you can factor it in a separate function: >>> >>> method->holder() == C->env()->StringBuilder_klass() || >>> method->holder() == C->env()->StringBuffer_klass() >> I'm not sure factoring it out would be better. > > OK. > >>> May be also verify has_stringbuilder() in PhaseStringOpts(). >> Why? > > OK, I see that caller code of PhaseStringOpts() has has_stringbuilder() > >>> Also this coalesce code will not work if "other" is used by >>> several sc/arguments since you removed it from the list after >>> first match and merge. For example: >>> >>> String s0 = new SB().append(1)...toString(); >>> String s1 = new SB().append(s0).append(s0).toString(); >>> >>> I would keep it and always replace "c" with merged >>> (you need to modify StringConcat::merge() as I pointed above). >>> The "o" will be removed automatically if there are no other uses. >> I don't want to support that. I don't think that's an interesting pattern. It would also require rewriting the management of the control and trap lists and I don't want to get into that. > > OK. > > Vladimir > >>> I will look on copy_string() and related methods tomorrow. >> Thanks. >> tom >>> Vladimir From Vladimir.Kozlov at Sun.COM Wed Nov 11 12:08:29 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Wed, 11 Nov 2009 12:08:29 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> Message-ID: <4AFB19BD.8060806@sun.com> Final part. You use kit.gvn(). instead of _gvn-> in several places. Also I think you can use __ instead of kit. for intcon, makecon, loads and others and leave it for control, memory operations only. You use fetch_static_field() only to read Integer.sizeTable. 
Does it need to be so generalized? But you can keep it as it is. Can you separate inlined comments from code by empty lines in int_stringSize()? In int_stringSize(), I think, TypeAryPtr::INTS memory should be used instead of TypeAryPtr::CHARS (for final_mem) and int_adr_idx (needs to add it) instead of char_adr_idx. In int_getChars() should the sign store to have IfTrue(iff) control?: 1124 Node* st = __ store_to_memory(kit.control(), kit.array_element_address(char_array, m1, T_CHAR), 1125 sign, T_CHAR, char_adr_idx); 1126 1127 final_merge->init_req(1, __ IfTrue(iff)); copy_string(), so you not supporting byte array strings for now? Why 6 is limit for constant strings? Add some comments. Vladimir Tom Rodriguez wrote: > http://cr.openjdk.java.net/~never/6892658/ From Vladimir.Kozlov at Sun.COM Wed Nov 11 12:27:31 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Wed, 11 Nov 2009 12:27:31 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <5388A8D6-810E-4607-A8DC-09879F2917E6@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFA1DC4.6060302@sun.com> <4AFB041D.20803@sun.com> <5388A8D6-810E-4607-A8DC-09879F2917E6@Sun.COM> Message-ID: <4AFB1E33.5080107@sun.com> Tom Rodriguez wrote: > On Nov 11, 2009, at 10:36 AM, Vladimir Kozlov wrote: > >> >> Tom Rodriguez wrote: >>>> Also _begin instead of other->_begin >>>> in result->set_allocation(other->_begin) >>> _begin is must be the earliest JVMState of the pattern and other->_begin has to be earlier than _begin otherwise the couldn't be merged so I can't just swap them around. >> Then I don't get how you optimize next code: >> >> SB.append((new SB()).append(s).toString()).toString() > > It won't handle that as it's currently constructed but it handles > > String s = new SB().append().append().toString(); > String s2 = new SB().append().append(s).toString(); > > which is a case we actually care about. 
Handling the case you illustrate would require extending the logic in build_candidate quite a bit I think. I think there are more complex SB pattern that we might like to get but this is currently targeting basic ones. We can add more later. > My case could be also frequent since javac will generate SB for the next case: SB.append("size="+x).toString() But, I agree, you don't need to implement it now. >> and I don't see any checks that other->_begin dominates _begin. > > It's by construction. Each string concat is a linear piece of control flow from the toString back to the allocation with nothing unknown in between. We identify a stacking opportunity by detecting that one StringConcat is an argument to another. Then we merge them together and verify that they still form a closed graph. That will only be true if they form another linear sequence so other->_begin must dominate _begin. > I see, the next check will fail for my case. And verification code also. 447 } else if (cnode->method()->holder() == m->holder() && 448 cnode->method()->name() == ciSymbol::append_name() && OK. Thanks, Vladimir > tom > >>>> There are several places where you do next check, >>>> may be you can factor it in a separate function: >>>> >>>> method->holder() == C->env()->StringBuilder_klass() || >>>> method->holder() == C->env()->StringBuffer_klass() >>> I'm not sure factoring it out would be better. >> OK. >> >>>> May be also verify has_stringbuilder() in PhaseStringOpts(). >>> Why? >> OK, I see that caller code of PhaseStringOpts() has has_stringbuilder() >> >>>> Also this coalesce code will not work if "other" is used by >>>> several sc/arguments since you removed it from the list after >>>> first match and merge. For example: >>>> >>>> String s0 = new SB().append(1)...toString(); >>>> String s1 = new SB().append(s0).append(s0).toString(); >>>> >>>> I would keep it and always replace "c" with merged >>>> (you need to modify StringConcat::merge() as I pointed above). 
>>>> The "o" will be removed automatically if there are no other uses. >>> I don't want to support that. I don't think that's an interesting pattern. It would also require rewriting the management of the control and trap lists and I don't want to get into that. >> OK. >> >> Vladimir >> >>>> I will look on copy_string() and related methods tomorrow. >>> Thanks. >>> tom >>>> Vladimir > From Thomas.Rodriguez at Sun.COM Wed Nov 11 13:41:30 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 13:41:30 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AFB19BD.8060806@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFB19BD.8060806@sun.com> Message-ID: <27F56C21-C0E0-4843-B475-AAF9B3A333AC@Sun.COM> On Nov 11, 2009, at 12:08 PM, Vladimir Kozlov wrote: > Final part. > > You use kit.gvn(). instead of _gvn-> in several places. > Also I think you can use __ instead of kit. for intcon, makecon, loads > and others and leave it for control, memory operations only. > > You use fetch_static_field() only to read Integer.sizeTable. Does it need > to be so generalized? But you can keep it as it is. Originally I was going to read some other fields so I needed something general. It's based on do_get_xxx and I don't see any reason to simplify it. I could move it over into GraphKit. > > Can you separate inlined comments from code by empty lines in int_stringSize()? Ok. > In int_stringSize(), I think, TypeAryPtr::INTS memory should be used instead of > TypeAryPtr::CHARS (for final_mem) and int_adr_idx (needs to add it) instead > of char_adr_idx. Actually there are no stores so it's not needed at all. I'd added some debugging code that did a runtime call and needed the phi but I think I can remove it completely now. > > In int_getChars() should the sign store to have IfTrue(iff) control?: Ah yes. It should. The store in the loop above has a similar problem. 
> > 1124 Node* st = __ store_to_memory(kit.control(), kit.array_element_address(char_array, m1, T_CHAR), > 1125 sign, T_CHAR, char_adr_idx); > 1126 > 1127 final_merge->init_req(1, __ IfTrue(iff)); > > > copy_string(), so you not supporting byte array strings for now? We don't have byte strings. > Why 6 is limit for constant strings? Add some comments. Ok. It's just a number. 6 seems like an ok code space vs. speed tradeoff. tom > > > Vladimir > > Tom Rodriguez wrote: >> http://cr.openjdk.java.net/~never/6892658/ From Vladimir.Kozlov at Sun.COM Wed Nov 11 13:59:42 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Wed, 11 Nov 2009 13:59:42 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <27F56C21-C0E0-4843-B475-AAF9B3A333AC@Sun.COM> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFB19BD.8060806@sun.com> <27F56C21-C0E0-4843-B475-AAF9B3A333AC@Sun.COM> Message-ID: <4AFB33CE.5060403@sun.com> Tom Rodriguez wrote: >> You use fetch_static_field() only to read Integer.sizeTable. Does it need >> to be so generalized? But you can keep it as it is. > > Originally I was going to read some other fields so I needed something general. It's based on do_get_xxx and I don't see any reason to simplify it. I could move it over into GraphKit. > OK. >> In int_stringSize(), I think, TypeAryPtr::INTS memory should be used instead of >> TypeAryPtr::CHARS (for final_mem) and int_adr_idx (needs to add it) instead >> of char_adr_idx. > > Actually there are no stores so it's not needed at all. I'd added some debugging code that did a runtime call and needed the phi but I think I can remove it completely now. > OK. >> Why 6 is limit for constant strings? Add some comments. > > Ok. It's just a number. 6 seems like an ok code space vs. speed tradeoff. May be we should have it as flag or definition to be more visible? 
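As an illustration of the trade-off being discussed, here is a minimal standalone C++ sketch. It is not the code under review: `max_unrolled_copy_length`, `CopyPlan`, `plan_constant_copy` and the inclusive comparison are all assumptions made up for this example. It only shows why a small named limit is a code space versus speed choice: unrolling costs one store per character, while a loop costs a fixed amount of code.

```cpp
#include <cstddef>

// Hypothetical named limit standing in for the bare "6" discussed above.
const std::size_t max_unrolled_copy_length = 6;

// What a code generator might decide to emit for copying a constant string.
struct CopyPlan {
  bool unrolled;            // true: one store per character, no loop
  std::size_t store_count;  // number of store instructions emitted
};

// Short constants get straight-line stores (faster, but the emitted code
// grows with the string length); longer ones use a single store in a loop.
// Whether the real comparison is inclusive is an assumption of this sketch.
inline CopyPlan plan_constant_copy(std::size_t length) {
  CopyPlan plan;
  plan.unrolled = length <= max_unrolled_copy_length;
  plan.store_count = plan.unrolled ? length : 1;
  return plan;
}
```

Under these assumptions a 6-character constant is planned as six straight-line stores, while a 7-character one falls back to the loop.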
Thanks, Vladimir > > tom > >> >> Vladimir >> >> Tom Rodriguez wrote: >>> http://cr.openjdk.java.net/~never/6892658/ > From Thomas.Rodriguez at Sun.COM Wed Nov 11 14:03:21 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Wed, 11 Nov 2009 14:03:21 -0800 Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns In-Reply-To: <4AFB33CE.5060403@sun.com> References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFB19BD.8060806@sun.com> <27F56C21-C0E0-4843-B475-AAF9B3A333AC@Sun.COM> <4AFB33CE.5060403@sun.com> Message-ID: On Nov 11, 2009, at 1:59 PM, Vladimir Kozlov wrote: > Tom Rodriguez wrote: >>> You use fetch_static_field() only to read Integer.sizeTable. Does it need >>> to be so generalized? But you can keep it as it is. >> Originally I was going to read some other fields so I needed something general. It's based on do_get_xxx and I don't see any reason to simplify it. I could move it over into GraphKit. > > OK. > >>> In int_stringSize(), I think, TypeAryPtr::INTS memory should be used instead of >>> TypeAryPtr::CHARS (for final_mem) and int_adr_idx (needs to add it) instead >>> of char_adr_idx. >> Actually there are no stores so it's not needed at all. I'd added some debugging code that did a runtime call and needed the phi but I think I can remove it completely now. > > OK. > >>> Why 6 is limit for constant strings? Add some comments. >> Ok. It's just a number. 6 seems like an ok code space vs. speed tradeoff. > > May be we should have it as flag or definition to be more visible? A flag seems excessive. It's not like this is a critical tunable. I can move it into an enum in PhaseStringOpts if you like. 
    enum {
      // max length of constant string copy unrolling in copy_string
      unroll_string_copy_length = 6
    };

tom

>
> Thanks,
> Vladimir
>
>> tom
>>>
>>> Vladimir
>>>
>>> Tom Rodriguez wrote:
>>>> http://cr.openjdk.java.net/~never/6892658/

From Vladimir.Kozlov at Sun.COM Wed Nov 11 14:13:39 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 11 Nov 2009 14:13:39 -0800
Subject: review (M) for 6892658: C2 should optimize some stringbuilder patterns
In-Reply-To:
References: <8251832F-62D6-4BA6-B29C-AA01FD78FBCB@Sun.COM> <4AFB19BD.8060806@sun.com> <27F56C21-C0E0-4843-B475-AAF9B3A333AC@Sun.COM> <4AFB33CE.5060403@sun.com>
Message-ID: <4AFB3713.1020704@sun.com>

Yes, enum is fine.

Vladimir

Tom Rodriguez wrote:
>>>> Why 6 is limit for constant strings? Add some comments.
>>> Ok. It's just a number. 6 seems like an ok code space vs. speed tradeoff.
>> May be we should have it as flag or definition to be more visible?
>
> A flag seems excessive. It's not like this is a critical tunable. I can move it into an enum in PhaseStringOpts if you like.
>
> enum {
>   // max length of constant string copy unrolling in copy_string
>   unroll_string_copy_length = 6
> };
>
> tom
>
>>>>
>>>> Tom Rodriguez wrote:
>>>>> http://cr.openjdk.java.net/~never/6892658/
>

From Christian.Thalinger at Sun.COM Thu Nov 12 01:49:58 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Thu, 12 Nov 2009 10:49:58 +0100
Subject: review (XS) for 6892079: live value must not be garbage failure after fix for 6854812
In-Reply-To: <20DC8F32-10AD-4463-A6E6-703BD41AC1CE@Sun.COM>
References: <20DC8F32-10AD-4463-A6E6-703BD41AC1CE@Sun.COM>
Message-ID: <1258019398.861.39.camel@macbook>

On Mon, 2009-11-09 at 15:57 -0800, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/6892079

Looks good.
-- Christian From thomas.rodriguez at sun.com Thu Nov 12 03:03:22 2009 From: thomas.rodriguez at sun.com (thomas.rodriguez at sun.com) Date: Thu, 12 Nov 2009 11:03:22 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 6892079: live value must not be garbage failure after fix for 6854812 Message-ID: <20091112110337.D3A414145E@hg.openjdk.java.net> Changeset: 87b2fdd4bf98 Author: never Date: 2009-11-11 23:39 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/87b2fdd4bf98 6892079: live value must not be garbage failure after fix for 6854812 Reviewed-by: kvn ! src/share/vm/opto/parse1.cpp From Ulf.Zibis at gmx.de Tue Nov 17 12:14:39 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 17 Nov 2009 21:14:39 +0100 Subject: How to inspect hotspot compiler results? Message-ID: <4B03042F.8010709@gmx.de> Hi all, I remember, there was some information about hidden options to force hotspot compiler to output the compiled code on interesting classes. Can anybody give me a suitable link for start? Maybe additional hints. Thanks, -Ulf From Christian.Thalinger at Sun.COM Tue Nov 17 12:49:51 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Tue, 17 Nov 2009 21:49:51 +0100 Subject: How to inspect hotspot compiler results? In-Reply-To: <4B03042F.8010709@gmx.de> References: <4B03042F.8010709@gmx.de> Message-ID: <1258490991.10587.104.camel@macbook> On Tue, 2009-11-17 at 21:14 +0100, Ulf Zibis wrote: > Hi all, > > I remember, there was some information about hidden options to force > hotspot compiler to output the compiled code on interesting classes. > > Can anybody give me a suitable link for start? Maybe additional hints. How about today's thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2009-November/002342.html -- Christian From Ulf.Zibis at gmx.de Tue Nov 17 13:39:48 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 17 Nov 2009 22:39:48 +0100 Subject: How to inspect hotspot compiler results? 
In-Reply-To: <1258490991.10587.104.camel@macbook> References: <4B03042F.8010709@gmx.de> <1258490991.10587.104.camel@macbook> Message-ID: <4B031824.2020008@gmx.de> Am 17.11.2009 21:49, Christian Thalinger schrieb: > On Tue, 2009-11-17 at 21:14 +0100, Ulf Zibis wrote: > >> Hi all, >> >> I remember, there was some information about hidden options to force >> hotspot compiler to output the compiled code on interesting classes. >> >> Can anybody give me a suitable link for start? Maybe additional hints. >> > > How about today's thread: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2009-November/002342.html > Wow, it seems to be Christmas eve today. Thanks Christian. Sorry for not opening all list items. -Ulf From Ulf.Zibis at gmx.de Tue Nov 17 15:09:04 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Wed, 18 Nov 2009 00:09:04 +0100 Subject: How to inspect hotspot compiler results? hsdis binaries for Windows? In-Reply-To: <1258490991.10587.104.camel@macbook> References: <4B03042F.8010709@gmx.de> <1258490991.10587.104.camel@macbook> Message-ID: <4B032D10.3090401@gmx.de> Am 17.11.2009 21:49, Christian Thalinger schrieb: > On Tue, 2009-11-17 at 21:14 +0100, Ulf Zibis wrote: > >> Hi all, >> >> I remember, there was some information about hidden options to force >> hotspot compiler to output the compiled code on interesting classes. >> >> Can anybody give me a suitable link for start? Maybe additional hints. >> > > How about today's thread: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2009-November/002342.html > According to http://wikis.sun.com/display/HotSpotInternals/PrintAssembly, the hsdis plugin seems to be only available for MacOS and Solaris. Is there any chance to get such a plugin for Windows or at least for Linux? @ John: On your wiki page you mention hsdis DLL and libjvm.so in same line. How should I understand this? DLL normally is for Windows, but *.so for solaris. Any help would be greatly appreciated, thank you in advance. 
-Ulf

From dennisbyrne at apache.org Tue Nov 17 15:16:05 2009
From: dennisbyrne at apache.org (Dennis Byrne)
Date: Tue, 17 Nov 2009 17:16:05 -0600
Subject: How to inspect hotspot compiler results? hsdis binaries for Windows?
In-Reply-To: <4B032D10.3090401@gmx.de>
References: <4B03042F.8010709@gmx.de> <1258490991.10587.104.camel@macbook> <4B032D10.3090401@gmx.de>
Message-ID: <446564320911171516y425c0e2bgbad3ce7845fc9540@mail.gmail.com>

I could build it on Ubuntu, didn't try on Windows. Just an FYI, the
README says to use the BINTUILS variable, but the T and the U should
be swapped.

Dennis Byrne

On Tue, Nov 17, 2009 at 5:09 PM, Ulf Zibis wrote:
> On 17.11.2009 21:49, Christian Thalinger wrote:
>>
>> On Tue, 2009-11-17 at 21:14 +0100, Ulf Zibis wrote:
>>
>>>
>>> Hi all,
>>>
>>> I remember, there was some information about hidden options to force
>>> hotspot compiler to output the compiled code on interesting classes.
>>>
>>> Can anybody give me a suitable link for start? Maybe additional hints.
>>>
>>
>> How about today's thread:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2009-November/002342.html
>>
>
> According to http://wikis.sun.com/display/HotSpotInternals/PrintAssembly,
> the hsdis plugin seems to be only available for MacOS and Solaris.
> Is there any chance to get such a plugin for Windows or at least for Linux?
>
> @ John: On your wiki page you mention hsdis DLL and libjvm.so in same line.
> How should I understand this? DLL normally is for Windows, but *.so for
> solaris.
>
> Any help would be greatly appreciated, thank you in advance.
>
> -Ulf
>
>
>
>

--
Dennis Byrne

From Ulf.Zibis at gmx.de Tue Nov 17 15:28:36 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Wed, 18 Nov 2009 00:28:36 +0100
Subject: How to inspect hotspot compiler results? hsdis binaries for Windows?
In-Reply-To: <446564320911171516y425c0e2bgbad3ce7845fc9540@mail.gmail.com> References: <4B03042F.8010709@gmx.de> <1258490991.10587.104.camel@macbook> <4B032D10.3090401@gmx.de> <446564320911171516y425c0e2bgbad3ce7845fc9540@mail.gmail.com> Message-ID: <4B0331A4.7090908@gmx.de> Denis, much thanks. First I'll wait for any help for Windows. Next week I maybe have time to install Ubuntu. Maybe you could provide me your compiled binary, to save me from building my own. -Ulf Am 18.11.2009 00:16, Dennis Byrne schrieb: > I could build it on Ubuntu, didn't try on Windows. Just an FYI, the > README says to use the BINTUILS variable, but the T and the U should > be swapped. > > Dennis Byrne > > On Tue, Nov 17, 2009 at 5:09 PM, Ulf Zibis wrote: > >> Am 17.11.2009 21:49, Christian Thalinger schrieb: >> >>> On Tue, 2009-11-17 at 21:14 +0100, Ulf Zibis wrote: >>> >>> >>>> Hi all, >>>> >>>> I remember, there was some information about hidden options to force >>>> hotspot compiler to output the compiled code on interesting classes. >>>> >>>> Can anybody give me a suitable link for start? Maybe additional hints. >>>> >>>> >>> How about today's thread: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2009-November/002342.html >>> >>> >> According to http://wikis.sun.com/display/HotSpotInternals/PrintAssembly, >> the hsdis plugin seems to be only available for MacOS and Solaris. >> Is there any chance to get such a plugin for Windows or at least for Linux? >> >> @ John: On your wiki page you mention hsdis DLL and libjvm.so in same line. >> How should I understand this? DLL normally is for Windows, but *.so for >> solaris. >> >> Any help would be greatly appreciated, thank you in advance. 
>>
>> -Ulf
>>
>>
>>
>>
>>
>>
>
>
>
>

From Vladimir.Kozlov at Sun.COM Wed Nov 18 09:02:08 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 18 Nov 2009 09:02:08 -0800
Subject: Request reviews (S): 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required")
Message-ID: <4B042890.5020309@sun.com>

http://cr.openjdk.java.net/~kvn/6902036/webrev.00

Fixed 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required")

Problem:
AddPNode::Ideal() may replace AddP node with CastX2P node for a raw memory reference. EA code does not check such case.

Solution:
Check and ignore raw memory operations in EA.

Reviewed by:

Fix verified (y/n): y, test

Other testing: JPRT

From Vladimir.Kozlov at Sun.COM Wed Nov 18 09:58:56 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 18 Nov 2009 09:58:56 -0800
Subject: Request reviews (L): 6895383: JCK test throws NPE for method compiled with Escape Analysis
Message-ID: <4B0435E0.6020206@sun.com>

http://cr.openjdk.java.net/~kvn/6895383/webrev.04

Fixed 6895383: JCK test throws NPE for method compiled with Escape Analysis

Problem:
EA misses checks for MemBar nodes when looking for the following MergeMem nodes. As a result it did not add a memory slice for the volatile store (followed by several MemBar nodes) in the MergeMem of the uncommon trap. So when the method is deoptimized and the eliminated object is reallocated, NULL is stored to the corresponding field.

Solution:
Collect all MergeMem nodes at the beginning of EA when it walks through the Ideal graph, instead of searching for MergeMem nodes by going down through memory during memory splitting for non-escaping objects. Also check for general MemBar nodes instead of only Initialize nodes during memory splitting.

I also did the following optimizations/fixes:

1. Check for SafePointScalarObject nodes to avoid printing empty lines in PrintOptoAssembly output since they don't have corresponding mach nodes.

2. Eliminate volatile MemBar nodes for stores into scalar replaced objects. I noticed that we still generate membars even if the object is eliminated. For that I added a precedence edge to the corresponding Store node.

3. The check for ClearArray node is missing when searching for a better memory edge during EA, memory optimization and object scalar replacement. We can't bypass it since it is part of object initialization (zeroing). I will try to factor it into a separate function before the push.

4. Add missing checks for string intrinsic nodes to adjust the escape state (non scalar replaceable) of the char[] arrays they use.

5. Move the code that searches for objects which are not scalar replaceable to a separate method verify_escape_state() since the code became too large. Add a simple control flow check to find the initializing store for oop fields.

Reviewed by:

Fix verified (y/n): y, test

Other testing: JPRT, CTW (32 and 64bit)

From Ulf.Zibis at gmx.de Thu Nov 19 01:59:31 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 19 Nov 2009 10:59:31 +0100
Subject: -XX:-PrintCompilation doesn't output
Message-ID: <4B051703.50309@gmx.de>

I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but got nothing than:

VM option '-PrintCompilation'

Any ideas ?

-Ulf

From gbenson at redhat.com Thu Nov 19 02:07:59 2009
From: gbenson at redhat.com (Gary Benson)
Date: Thu, 19 Nov 2009 10:07:59 +0000
Subject: -XX:-PrintCompilation doesn't output
In-Reply-To: <4B051703.50309@gmx.de>
References: <4B051703.50309@gmx.de>
Message-ID: <20091119100759.GA7222@redhat.com>

Ulf Zibis wrote:
> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but
> got nothing than:
>
> VM option '-PrintCompilation'
>
> Any ideas ?
-XX:+PrintCompilation Cheers, Gary -- http://gbenson.net/ From Christian.Thalinger at Sun.COM Thu Nov 19 02:29:09 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Thu, 19 Nov 2009 11:29:09 +0100 Subject: Request reviews (S): 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required") In-Reply-To: <4B042890.5020309@sun.com> References: <4B042890.5020309@sun.com> Message-ID: <1258626549.1712.0.camel@macbook> On Wed, 2009-11-18 at 09:02 -0800, Vladimir Kozlov wrote: > http://cr.openjdk.java.net/~kvn/6902036/webrev.00 > > Fixed 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required") > > Problem: > AddPNode::Ideal() may replace AddP node with CastX2P node > for a raw memory reference. EA code does not check such case. > > Solution: > Check and ignore raw memory operations in EA. Looks good to me. At least I can tell that WorldWind works with that fix. -- Christian From gbenson at redhat.com Thu Nov 19 05:15:16 2009 From: gbenson at redhat.com (Gary Benson) Date: Thu, 19 Nov 2009 13:15:16 +0000 Subject: Review Request: 6896043: Zero fixes Message-ID: <20091119131516.GC7222@redhat.com> Hi all, This last week I've spent some time going through all the patches in IcedTea and picking out the ones that affect Zero. This webrev contains those, along with a fix for a build failure and a little bit of new code that's required for the latest Shark. http://cr.openjdk.java.net/~gbenson/zero-update-01-hs/ I've bundled them into one webrev, since each change is small, but if separate webrevs would be easier or better for you then let me know and I'll split it accordingly. The changes are as follows: hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp: Add code to update the invocation counter for non-synchronized native methods. This causes Shark to generate wrappers for hot methods, instead of using the interpreter entry. hotspot/src/cpu/zero/vm/entry_zero.hpp: Inline a couple of methods. 
This saves a frame every time the interpreter calls a new Java method, and makes stack traces look nicer in gdb. hotspot/src/cpu/zero/vm/frame_zero.hpp, hotspot/src/cpu/zero/vm/frame_zero.cpp: Zero has the same various frame::sender_for_${type}_frame methods as the architecture-specific ports, but in Zero all frames are handled identically except for entry frames. This change replaces most of the sender_for_${type}_frame with sender_for_entry_frame and sender_for_nonentry_frame, simplifying the marshalling. Adding compiled native methods in Shark was going to introduce another type of frame and so yet another sender_for_${type}_frame method and I decided enough was enough :) hotspot/src/cpu/zero/vm/globals_zero.hpp: Fix a build breakage introduced by 6887571. hotspot/src/cpu/zero/vm/sharedRuntime_zero.cpp: Implemented SharedRuntime::generate_native_wrapper using Shark. hotspot/src/cpu/zero/vm/sharkFrame_zero.hpp: Updated a friend declaration to match the latest version of Shark. hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp: Fixed a bug where the invocation counter would be incremented twice for backedges when not using Shark. Fixed a bug where SharedRuntime::OSR_migration_begin was incorrectly invoked using CALL_VM. Removed an old workaround that was causing build failures on IA64. hotspot/src/share/vm/prims/jni.cpp: Added an assertion to ensure Atomic::xchg and Atomic::xchg_ptr are working correctly. On most Zero platforms these are implemented using a GCC intrinsic which *should* work as we expect, but is allowed to only work for the values of 0 and 1 if that's all the platform can manage. hotspot/src/share/vm/prims/jvmtiManageCapabilities.cpp: The C++ interpreter doesn't currently support JVMTI pop frame or force early return messages. This changes the capabilities the VM reports to reflect that. hotspot/src/share/vm/runtime/os.hpp: S390 rounds the addresses of segmentation faults to the nearest page. 
This changes os::is_memory_serialize_page to cope with that. Cheers, Gary -- http://gbenson.net/ From Christian.Thalinger at Sun.COM Thu Nov 19 06:07:29 2009 From: Christian.Thalinger at Sun.COM (Christian.Thalinger at Sun.COM) Date: Thu, 19 Nov 2009 14:07:29 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 6902000: use ShouldNotReachHere() for btos/ctos/stos in TemplateInterpreterGenerator::set_short_entry_points Message-ID: <20091119140737.A869541FC3@hg.openjdk.java.net> Changeset: b18963243361 Author: twisti Date: 2009-11-19 03:41 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/b18963243361 6902000: use ShouldNotReachHere() for btos/ctos/stos in TemplateInterpreterGenerator::set_short_entry_points Summary: set_entry_point is only ever used with the tos states of bytecode templates in templateTable.cpp and none of those use the subword tos states like btos, ctos and stos. Reviewed-by: kvn ! src/share/vm/interpreter/templateInterpreter.cpp From Vladimir.Kozlov at Sun.COM Thu Nov 19 10:38:34 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Thu, 19 Nov 2009 10:38:34 -0800 Subject: Request reviews (S): 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required") In-Reply-To: <1258626549.1712.0.camel@macbook> References: <4B042890.5020309@sun.com> <1258626549.1712.0.camel@macbook> Message-ID: <4B0590AA.3070207@sun.com> Thanks Vladimir Christian Thalinger wrote: > On Wed, 2009-11-18 at 09:02 -0800, Vladimir Kozlov wrote: >> http://cr.openjdk.java.net/~kvn/6902036/webrev.00 >> >> Fixed 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required") >> >> Problem: >> AddPNode::Ideal() may replace AddP node with CastX2P node >> for a raw memory reference. EA code does not check such case. >> >> Solution: >> Check and ignore raw memory operations in EA. > > Looks good to me. At least I can tell that WorldWind works with that > fix. 
> > -- Christian > From Ulf.Zibis at gmx.de Thu Nov 19 16:01:08 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Fri, 20 Nov 2009 01:01:08 +0100 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <4B051703.50309@gmx.de> References: <4B051703.50309@gmx.de> Message-ID: <4B05DC44.8000303@gmx.de> I found out the reason, I should use "-XX:+PrintCompilation" Don't know, why '-' is stated here: http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions -Ulf Am 19.11.2009 10:59, Ulf Zibis schrieb: > I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but > got nothing than: > > VM option '-PrintCompilation' > > Any ideas ? > > -Ulf > > > From Vladimir.Kozlov at Sun.COM Thu Nov 19 16:03:43 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Thu, 19 Nov 2009 16:03:43 -0800 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <4B05DC44.8000303@gmx.de> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> Message-ID: <4B05DCDF.6060003@sun.com> Because by default it is off: "Option and Default Value" Vladimir Ulf Zibis wrote: > I found out the reason, I should use "-XX:+PrintCompilation" > > Don't know, why '-' is stated here: > http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions > > > -Ulf > > Am 19.11.2009 10:59, Ulf Zibis schrieb: >> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but >> got nothing than: >> >> VM option '-PrintCompilation' >> >> Any ideas ? >> >> -Ulf >> >> >> > From Y.S.Ramakrishna at Sun.COM Thu Nov 19 16:08:45 2009 From: Y.S.Ramakrishna at Sun.COM (Y. 
Srinivas Ramakrishna) Date: Thu, 19 Nov 2009 16:08:45 -0800 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <4B05DC44.8000303@gmx.de> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> Message-ID: <4B05DE0D.7000104@Sun.COM> On 11/19/09 16:01, Ulf Zibis wrote: > I found out the reason, I should use "-XX:+PrintCompilation" > > Don't know, why '-' is stated here: > http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions The "-" (read "minus") specifies the default status of this option (i.e. off, disabled). (For other options where you see a "+", the default setting is "on". I admit this can be a somewhat confusing documentation convention, at least at first glance, and takes some getting used to.) -- ramki > > > -Ulf > > Am 19.11.2009 10:59, Ulf Zibis schrieb: >> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but >> got nothing than: >> >> VM option '-PrintCompilation' >> >> Any ideas ? >> >> -Ulf >> >> >> > From Y.S.Ramakrishna at Sun.COM Thu Nov 19 16:39:00 2009 From: Y.S.Ramakrishna at Sun.COM (Y. Srinivas Ramakrishna) Date: Thu, 19 Nov 2009 16:39:00 -0800 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> <4B05DE0D.7000104@Sun.COM> <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> Message-ID: <4B05E524.3080508@Sun.COM> On 11/19/09 16:25, David Schlosnagle wrote: > I wanted to pass on the following link as I know it has helped me in > the past find the definitions for some of the more hidden HotSpot JVM > options and which version they were introduced. If I get a chance I > could take a stab at grepping out the new options for JDK 7. Thanks for the link, i believe a rather popular one, as yr favourite search engine will reveal. (Thank you Joe, hereby bcc'd, for compiling & maintaining this!) 
An added plus would be to specify for each of the "version" columns, also a default setting/value may be? (Joe?) > > http://blogs.sun.com/watt/resource/jvm-options-list.html > > - Dave -- ramki > > > On Thu, Nov 19, 2009 at 7:08 PM, Y. Srinivas Ramakrishna > wrote: >> On 11/19/09 16:01, Ulf Zibis wrote: >>> I found out the reason, I should use "-XX:+PrintCompilation" >>> >>> Don't know, why '-' is stated here: >>> http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions >> >> The "-" (read "minus") specifies the default status of this option (i.e. >> off, disabled). >> (For other options where you see a "+", the default setting is "on". I admit >> this can be a somewhat confusing documentation convention, at least at first >> glance, >> and takes some getting used to.) >> >> -- ramki >> >>> >>> -Ulf >>> >>> Am 19.11.2009 10:59, Ulf Zibis schrieb: >>>> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but got >>>> nothing than: >>>> >>>> VM option '-PrintCompilation' >>>> >>>> Any ideas ? >>>> >>>> -Ulf >>>> >>>> >>>> >> From Y.S.Ramakrishna at Sun.COM Thu Nov 19 16:43:15 2009 From: Y.S.Ramakrishna at Sun.COM (Y. Srinivas Ramakrishna) Date: Thu, 19 Nov 2009 16:43:15 -0800 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <4B05E524.3080508@Sun.COM> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> <4B05DE0D.7000104@Sun.COM> <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> <4B05E524.3080508@Sun.COM> Message-ID: <4B05E623.2070802@Sun.COM> On 11/19/09 16:39, Y. Srinivas Ramakrishna wrote: > On 11/19/09 16:25, David Schlosnagle wrote: >> I wanted to pass on the following link as I know it has helped me in >> the past find the definitions for some of the more hidden HotSpot JVM >> options and which version they were introduced. If I get a chance I >> could take a stab at grepping out the new options for JDK 7. 
> > Thanks for the link, i believe a rather popular one, as > yr favourite search engine will reveal. (Thank you Joe, hereby bcc'd, > for compiling & maintaining this!) An added plus would be to specify > for each of the "version" columns, also a default setting/value > may be? (Joe?) On second thoughts, perhaps not. Given the multiplicity of update releases and such, that level of detail might make the table too busy and perhaps confusing on that account ... -- ramki > >> >> http://blogs.sun.com/watt/resource/jvm-options-list.html >> >> - Dave > > -- ramki > >> >> >> On Thu, Nov 19, 2009 at 7:08 PM, Y. Srinivas Ramakrishna >> wrote: >>> On 11/19/09 16:01, Ulf Zibis wrote: >>>> I found out the reason, I should use "-XX:+PrintCompilation" >>>> >>>> Don't know, why '-' is stated here: >>>> http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions >>>> >>> >>> The "-" (read "minus") specifies the default status of this option (i.e. >>> off, disabled). >>> (For other options where you see a "+", the default setting is "on". >>> I admit >>> this can be a somewhat confusing documentation convention, at least >>> at first >>> glance, >>> and takes some getting used to.) >>> >>> -- ramki >>> >>>> >>>> -Ulf >>>> >>>> Am 19.11.2009 10:59, Ulf Zibis schrieb: >>>>> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, >>>>> but got >>>>> nothing than: >>>>> >>>>> VM option '-PrintCompilation' >>>>> >>>>> Any ideas ? 
>>>>> >>>>> -Ulf >>>>> >>>>> >>>>> >>> > > From vijay at kandysoftwareinc.com Thu Nov 19 16:45:51 2009 From: vijay at kandysoftwareinc.com (Vijay Kandy) Date: Thu, 19 Nov 2009 17:45:51 -0700 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> <4B05DE0D.7000104@Sun.COM> <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> Message-ID: I use this page which has a lot of undocumented -XX options, even the unstable ones. Perhaps it'll be of some help. http://www.md.pp.ru/~eu/jdk6options.html Regards, Vijay From Ulf.Zibis at gmx.de Thu Nov 19 17:27:04 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Fri, 20 Nov 2009 02:27:04 +0100 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> <4B05DE0D.7000104@Sun.COM> <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> Message-ID: <4B05F068.1060506@gmx.de> Very much thanks for sharing this. A long missed documentation! -Ulf Am 20.11.2009 01:25, David Schlosnagle schrieb: > I wanted to pass on the following link as I know it has helped me in > the past find the definitions for some of the more hidden HotSpot JVM > options and which version they were introduced. If I get a chance I > could take a stab at grepping out the new options for JDK 7. 
> > http://blogs.sun.com/watt/resource/jvm-options-list.html > > - Dave > > From Ulf.Zibis at gmx.de Thu Nov 19 17:52:07 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Fri, 20 Nov 2009 02:52:07 +0100 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <20091119100759.GA7222@redhat.com> References: <4B051703.50309@gmx.de> <20091119100759.GA7222@redhat.com> Message-ID: <4B05F647.1040807@gmx.de> Am 19.11.2009 11:07, Gary Benson schrieb: > Ulf Zibis wrote: > >> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but >> got nothing than: >> >> VM option '-PrintCompilation' >> >> Any ideas ? >> > > -XX:+PrintCompilation > To avoid confusion about my post before: This post was blocked by my provider's spam filter -Ulf From vladimir.kozlov at sun.com Thu Nov 19 17:59:38 2009 From: vladimir.kozlov at sun.com (vladimir.kozlov at sun.com) Date: Fri, 20 Nov 2009 01:59:38 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(), "AddP required") Message-ID: <20091120015943.940FA4141B@hg.openjdk.java.net> Changeset: 7ef1d2e14917 Author: kvn Date: 2009-11-19 14:32 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/7ef1d2e14917 6902036: WorldWind asserts on escape.cpp:1153: assert(addr->is_AddP(),"AddP required") Summary: Remove the assert. Reviewed-by: twisti ! src/share/vm/opto/escape.cpp From schlosna at gmail.com Thu Nov 19 16:25:56 2009 From: schlosna at gmail.com (David Schlosnagle) Date: Thu, 19 Nov 2009 19:25:56 -0500 Subject: -XX:-PrintCompilation doesn't output In-Reply-To: <4B05DE0D.7000104@Sun.COM> References: <4B051703.50309@gmx.de> <4B05DC44.8000303@gmx.de> <4B05DE0D.7000104@Sun.COM> Message-ID: <9146341c0911191625k4fb76404x4c51aa1a3e7a0d20@mail.gmail.com> I wanted to pass on the following link as I know it has helped me in the past find the definitions for some of the more hidden HotSpot JVM options and which version they were introduced. 
If I get a chance I could take a stab at grepping out the new options for JDK 7.

http://blogs.sun.com/watt/resource/jvm-options-list.html

- Dave

On Thu, Nov 19, 2009 at 7:08 PM, Y. Srinivas Ramakrishna wrote:
> On 11/19/09 16:01, Ulf Zibis wrote:
>>
>> I found out the reason, I should use "-XX:+PrintCompilation"
>>
>> Don't know, why '-' is stated here:
>> http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions
>
>
> The "-" (read "minus") specifies the default status of this option (i.e.
> off, disabled).
> (For other options where you see a "+", the default setting is "on". I admit
> this can be a somewhat confusing documentation convention, at least at first
> glance,
> and takes some getting used to.)
>
> -- ramki
>
>>
>>
>> -Ulf
>>
>> On 19.11.2009 10:59, Ulf Zibis wrote:
>>>
>>> I've set "-XX:-PrintCompilation" on fastdebug-build jdk1.7.0_b76, but got
>>> nothing than:
>>>
>>> VM option '-PrintCompilation'
>>>
>>> Any ideas ?
>>>
>>> -Ulf
>>>
>>>
>>>
>>
>
>

From Vladimir.Kozlov at Sun.COM  Fri Nov 20 14:04:20 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Fri, 20 Nov 2009 14:04:20 -0800
Subject: Request reviews (M): 6896727: nsk/logging/LoggingPermission/LoggingPermission/logperm002 fails with G1, EscapeAnalisys
Message-ID: <4B071264.8000906@sun.com>

http://cr.openjdk.java.net/~kvn/6896727/webrev.06

Fixed 6896727: nsk/logging/LoggingPermission/LoggingPermission/logperm002 fails with G1, EscapeAnalisys

Problem:
EA incorrectly allows one memory store to be bypassed by another
without updating its users. As a result, a node's memory references
could be missing on some paths.

Solution:
When updating the input memory edge for a store, move its memory
users to the corresponding memory slices.
I also added several asserts to verify correctness of memory splitting during EA.
Reviewed by: Fix verified (y/n): y, test Other testing: JPRT,CTW From Vladimir.Kozlov at Sun.COM Fri Nov 20 14:51:52 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Fri, 20 Nov 2009 14:51:52 -0800 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091119131516.GC7222@redhat.com> References: <20091119131516.GC7222@redhat.com> Message-ID: <4B071D88.2020100@sun.com> Gary, > hotspot/src/cpu/zero/vm/entry_zero.hpp: To be clear. You did it to inline methods in debug VM version, right? They will be optimized for optimized builds anyway. > hotspot/src/cpu/zero/vm/frame_zero.hpp, > hotspot/src/cpu/zero/vm/frame_zero.cpp: Can you add at least an assert into sender_for_nonentry_frame() to verify expected frame types? > > hotspot/src/share/vm/runtime/os.hpp: Can you explain why your changes is not the same as the comment says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0 Thanks, Vladimir Gary Benson wrote: > Hi all, > > This last week I've spent some time going through all the patches > in IcedTea and picking out the ones that affect Zero. This webrev > contains those, along with a fix for a build failure and a little > bit of new code that's required for the latest Shark. > > http://cr.openjdk.java.net/~gbenson/zero-update-01-hs/ > > I've bundled them into one webrev, since each change is small, but > if separate webrevs would be easier or better for you then let me > know and I'll split it accordingly. > > The changes are as follows: > > hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp: > Add code to update the invocation counter for non-synchronized > native methods. This causes Shark to generate wrappers for > hot methods, instead of using the interpreter entry. > > hotspot/src/cpu/zero/vm/entry_zero.hpp: > Inline a couple of methods. This saves a frame every time the > interpreter calls a new Java method, and makes stack traces look > nicer in gdb. 
> > hotspot/src/cpu/zero/vm/frame_zero.hpp, > hotspot/src/cpu/zero/vm/frame_zero.cpp: > Zero has the same various frame::sender_for_${type}_frame methods > as the architecture-specific ports, but in Zero all frames are > handled identically except for entry frames. This change replaces > most of the sender_for_${type}_frame with sender_for_entry_frame > and sender_for_nonentry_frame, simplifying the marshalling. Adding > compiled native methods in Shark was going to introduce another > type of frame and so yet another sender_for_${type}_frame method > and I decided enough was enough :) > > hotspot/src/cpu/zero/vm/globals_zero.hpp: > Fix a build breakage introduced by 6887571. > > hotspot/src/cpu/zero/vm/sharedRuntime_zero.cpp: > Implemented SharedRuntime::generate_native_wrapper using Shark. > > hotspot/src/cpu/zero/vm/sharkFrame_zero.hpp: > Updated a friend declaration to match the latest version of Shark. > > hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp: > Fixed a bug where the invocation counter would be incremented > twice for backedges when not using Shark. Fixed a bug where > SharedRuntime::OSR_migration_begin was incorrectly invoked > using CALL_VM. Removed an old workaround that was causing > build failures on IA64. > > hotspot/src/share/vm/prims/jni.cpp: > Added an assertion to ensure Atomic::xchg and Atomic::xchg_ptr > are working correctly. On most Zero platforms these are > implemented using a GCC intrinsic which *should* work as we > expect, but is allowed to only work for the values of 0 and 1 > if that's all the platform can manage. > > hotspot/src/share/vm/prims/jvmtiManageCapabilities.cpp: > The C++ interpreter doesn't currently support JVMTI pop frame > or force early return messages. This changes the capabilities > the VM reports to reflect that. > > hotspot/src/share/vm/runtime/os.hpp: > S390 rounds the addresses of segmentation faults to the nearest > page. This changes os::is_memory_serialize_page to cope with > that. 
> > Cheers, > Gary > From Ulf.Zibis at gmx.de Sat Nov 21 05:54:35 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Sat, 21 Nov 2009 14:54:35 +0100 Subject: Multiple copies of same code Message-ID: <4B07F11B.5060804@gmx.de> In output of PrintAssembly I frequently see : ... ... # more than 10 recurrences ... 726 B108: # B114 <- B10 Freq: 9.99898e-006 726 # exception oop is in EAX; no code emitted 726 MOV ECX,EAX 728 JMP,s B114 728 72a B109: # B114 <- B9 Freq: 9.99918e-006 72a # exception oop is in EAX; no code emitted 72a MOV ECX,EAX 72c JMP,s B114 72c 72e B110: # B114 <- B6 Freq: 9.99938e-006 72e # exception oop is in EAX; no code emitted 72e MOV ECX,EAX 730 JMP,s B114 730 732 B111: # B114 <- B4 Freq: 9.99959e-006 732 # exception oop is in EAX; no code emitted 732 MOV ECX,EAX 734 JMP,s B114 734 736 B112: # B114 <- B3 Freq: 9.99979e-006 736 # exception oop is in EAX; no code emitted 736 MOV ECX,EAX 738 JMP,s B114 738 73a B113: # B114 <- B2 Freq: 9.99999e-006 73a # exception oop is in EAX; no code emitted 73a MOV ECX,EAX 73a 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005 Wouldn't it be better to have : ... ... # more than 10 recurrences ... 
73a   B108: # B114 <- B10  Freq: 9.99898e-006
73a   B109: # B114 <- B9  Freq: 9.99918e-006
73a   B110: # B114 <- B6  Freq: 9.99938e-006
73a   B111: # B114 <- B4  Freq: 9.99959e-006
73a   B112: # B114 <- B3  Freq: 9.99979e-006
73a   B113: # B114 <- B2  Freq: 9.99999e-006
73a       # exception oop is in EAX; no code emitted
73a       MOV    ECX,EAX
73a
73c   B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005

From Ulf.Zibis at gmx.de  Sat Nov 21 15:49:23 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Sun, 22 Nov 2009 00:49:23 +0100
Subject: Inline threshold relative to frequency
Message-ID: <4B087C83.20307@gmx.de>

Hi,

wouldn't it make sense if the inline threshold for a method were
relative to the frequency of its usage?

See the method below. It has 189 bytes of byte code, so it is "too big"
under the default inline threshold.
As it is called very frequently, performance should increase
"dramatically" if it could be inlined, as the pushing of the numerous
parameters to the stack could be saved.

-Ulf


    CoderResult decode(byte b1, byte b2, int p, char[] da, final int[] dp, int dl) { // package private for ISO2022_CN
        assert b1Max - b1Min >= 0 && b2Max - b2Min >= 0; // important if values are of type byte
        if (b1 < b1Min || b1 > b1Max || b2 < b2Min || b2 > b2Max)
            return p == 0 ? CoderResult.malformedForLength(1) : CoderResult.malformedForLength(4);
        int index = (b1 - b1Min) * dbSegSize + b2 - b2Min;
        char c = b2c[p].charAt(index);
        if (c == UNMAPPABLE_DECODING)
            return p == 0 ? CoderResult.unmappableForLength(2) : CoderResult.unmappableForLength(4);
        if ((b2cIsSupp[index] & (1 << p)) == 0) { // BMP character
            if (dp[0] == dl)
                return CoderResult.OVERFLOW;
            da[dp[0]++] = c;
        } else { // surrogate character
            if (dp[0] > dl - 2)
                return CoderResult.OVERFLOW;
            da[dp[0]++] = Character.highSurrogate(0x20000 + c);
            da[dp[0]++] = Character.lowSurrogate(0x20000 + c);
            // dp[0] += Character.toChars(0x20000 + c, da, dp); // too slow
        }
        return null;
    }

From rasbold at google.com  Sun Nov 22 08:59:15 2009
From: rasbold at google.com (Chuck Rasbold)
Date: Sun, 22 Nov 2009 08:59:15 -0800
Subject: Multiple copies of same code
In-Reply-To: <4B07F11B.5060804@gmx.de>
References: <4B07F11B.5060804@gmx.de>
Message-ID: <4149a0430911220859k2daad86crb87f6b81ab05eadc@mail.gmail.com>

Sure. It would be great to merge redundant code paths. But I don't
think the cost/benefit ratio is worth it. In the case you cite, there
would be a saving of 4 bytes per path removed, and those paths are
projected to be executed very infrequently.

In a JIT, you have to spend your compilation budget wisely. It's not
that it can't be done. There are just better places to spend time.

On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis wrote:

> In output of PrintAssembly I frequently see :
>
> ...
> ... # more than 10 recurrences
> ...
> 726 B108: # B114 <- B10 Freq: 9.99898e-006 > 726 # exception oop is in EAX; no code emitted > 726 MOV ECX,EAX > 728 JMP,s B114 > 728 > 72a B109: # B114 <- B9 Freq: 9.99918e-006 > 72a # exception oop is in EAX; no code emitted > 72a MOV ECX,EAX > 72c JMP,s B114 > 72c > 72e B110: # B114 <- B6 Freq: 9.99938e-006 > 72e # exception oop is in EAX; no code emitted > 72e MOV ECX,EAX > 730 JMP,s B114 > 730 > 732 B111: # B114 <- B4 Freq: 9.99959e-006 > 732 # exception oop is in EAX; no code emitted > 732 MOV ECX,EAX > 734 JMP,s B114 > 734 > 736 B112: # B114 <- B3 Freq: 9.99979e-006 > 736 # exception oop is in EAX; no code emitted > 736 MOV ECX,EAX > 738 JMP,s B114 > 738 > 73a B113: # B114 <- B2 Freq: 9.99999e-006 > 73a # exception oop is in EAX; no code emitted > 73a MOV ECX,EAX > 73a > 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 > B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 > B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005 > > > Wouldn't it be better to have : > > ... > ... # more than 10 recurrences > ... > 73a B108: # B114 <- B10 Freq: 9.99898e-006 > 73a B109: # B114 <- B9 Freq: 9.99918e-006 > 73a B110: # B114 <- B6 Freq: 9.99938e-006 > 73a B111: # B114 <- B4 Freq: 9.99959e-006 > 73a B112: # B114 <- B3 Freq: 9.99979e-006 > 73a B113: # B114 <- B2 Freq: 9.99999e-006 > 73a # exception oop is in EAX; no code emitted > 73a MOV ECX,EAX > 73a > 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 > B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 > B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005 > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20091122/7847a5e2/attachment.html From gbenson at redhat.com Mon Nov 23 02:18:23 2009 From: gbenson at redhat.com (Gary Benson) Date: Mon, 23 Nov 2009 10:18:23 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <4B071D88.2020100@sun.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> Message-ID: <20091123101823.GA3377@redhat.com> Hi Vladimir, Vladimir Kozlov wrote: > > hotspot/src/cpu/zero/vm/entry_zero.hpp: > > To be clear. You did it to inline methods in debug VM version, > right? They will be optimized for optimized builds anyway. No, I did it to inline them for the optimized build. I'd assumed that methods defined in header files like that would be inlined, but they don't seem to be. Maybe this is a peculiarity of the IcedTea build, which is where I noticed the problem. If it is, I can remove it from this webrev. > > hotspot/src/cpu/zero/vm/frame_zero.hpp, > > hotspot/src/cpu/zero/vm/frame_zero.cpp: > > Can you add at least an assert into sender_for_nonentry_frame() to > verify expected frame types? Ok. > > > > hotspot/src/share/vm/runtime/os.hpp: > > Can you explain why your changes is not the same as the comment > says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0 Basically because I didn't know if I'd need to make changes anywhere else, and I didn't want to break the other platforms. Should I change it to what the comment says? Cheers, Gary -- http://gbenson.net/ From aph at redhat.com Mon Nov 23 02:34:49 2009 From: aph at redhat.com (Andrew Haley) Date: Mon, 23 Nov 2009 10:34:49 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091123101823.GA3377@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> Message-ID: <4B0A6549.30502@redhat.com> Gary Benson wrote: > Vladimir Kozlov wrote: >>> hotspot/src/cpu/zero/vm/entry_zero.hpp: >> To be clear. 
You did it to inline methods in debug VM version,
>> right? They will be optimized for optimized builds anyway.
>
> No, I did it to inline them for the optimized build.  I'd assumed
> that methods defined in header files like that would be inlined,
> but they don't seem to be.

If you really want these functions to be inline, you have to specify
it.  Having said that, if you define a member function within a class
definition, then it is inline.

FWIW, I don't think that it's even legal C++ to define a non-inline
member function more than once in a program, so this change is
required for correctness.

Andrew.

From Christian.Thalinger at Sun.COM  Mon Nov 23 02:38:50 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Mon, 23 Nov 2009 11:38:50 +0100
Subject: Inline threshold relative to frequency
In-Reply-To: <4B087C83.20307@gmx.de>
References: <4B087C83.20307@gmx.de>
Message-ID: <1258972730.1712.69.camel@macbook>

On Sun, 2009-11-22 at 00:49 +0100, Ulf Zibis wrote:
> Hi,
>
> wouldn't it make sense, if the inline threshold for a method would be
> relative to the frequency of it's usage?
>
> See the method below. It has 189 bytes of byte code, so it "too big"
> under default inline threshold.
> As it is called very frequent, performance should increase
> "dramatically", if it could be inlined, as the pushing of the numerous
> parameters to stack could be saved.

Well, it "could" be a good idea, but it does not necessarily increase
performance "dramatically".  It's not easy to find good inlining
heuristics that work in all cases.

Inlining a rather big method, like yours, has a number of side effects
on the caller, most importantly:

a) it becomes bigger -> cache effects
b) register pressure might increase -> worse register allocation (but
   could be the other way around)

Also note that not all architectures use the stack for passing call
arguments.  Even x86_64 has enough argument registers for this
particular method.
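
To make the trade-off concrete, here is a toy sketch of a frequency-scaled size limit of the kind asked about in the quoted question. `InlinePolicy`, `size_limit_for`, `should_inline` and every number used with them are hypothetical, chosen only for illustration; this is not how HotSpot's inliner is implemented. The hard cap stands in for the code-size and register-pressure costs listed above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical knobs for illustration only; these are not HotSpot flags.
struct InlinePolicy {
    int base_size_limit;   // bytecode-size limit for a cold call site
    int max_size_limit;    // hard cap, however hot the call site is
    double hot_frequency;  // call frequency at which the cap is reached
};

// Largest callee size (in bytecode bytes) this toy policy would inline
// at a call site executed with the given frequency. The limit grows
// linearly with frequency and is clamped to [base, max].
inline int size_limit_for(const InlinePolicy& p, double call_frequency) {
    assert(p.hot_frequency > 0.0);
    assert(p.max_size_limit >= p.base_size_limit);
    double scale = call_frequency / p.hot_frequency;
    scale = std::max(0.0, std::min(1.0, scale));
    return p.base_size_limit +
           static_cast<int>((p.max_size_limit - p.base_size_limit) * scale);
}

// True if a callee of the given size fits under the frequency-scaled limit.
inline bool should_inline(const InlinePolicy& p, int callee_size,
                          double call_frequency) {
    return callee_size <= size_limit_for(p, call_frequency);
}
```

With a made-up policy of `{40, 200, 100.0}`, a 189-byte callee is rejected at a cold call site and accepted only once the call frequency approaches `hot_frequency`; anything above the cap is never inlined, however hot it is.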
-- Christian From Christian.Thalinger at Sun.COM Mon Nov 23 03:01:01 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Mon, 23 Nov 2009 12:01:01 +0100 Subject: Compiled call version seems to be slower In-Reply-To: <4B085F54.6000207@gmx.de> References: <4B085F54.6000207@gmx.de> Message-ID: <1258974061.1712.72.camel@macbook> On Sat, 2009-11-21 at 22:44 +0100, Ulf Zibis wrote: > In my attached example -XX:+PrintAssembly outputs 2 versions for > sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decodeArrayLoop. > > The 1st one seems for call from compiled calling method, the 2nd for > call from byte code interpreter. > Looking closer to the -XX:+PrintAssembly output, 1st version seems to > be > slower, i.e. has more instructions in the loop code. > > Additionally I'm wondering why the finally block is copy-and-pasted > for > each separate return. > Is that as disired ? I just had a quick glance at the code and it seems the second one has a better register allocation, for whatever reason. And maybe (would need to look closer) basic block ordering is different. -- Christian PS: It would be much easier when you would attach two files and give some hints (e.g. addresses) where the code is you're talking about. From Vladimir.Kozlov at Sun.COM Mon Nov 23 10:06:26 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Mon, 23 Nov 2009 10:06:26 -0800 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091123101823.GA3377@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> Message-ID: <4B0ACF22.3050107@sun.com> Gary Benson wrote: > Hi Vladimir, > > Vladimir Kozlov wrote: >>> hotspot/src/cpu/zero/vm/entry_zero.hpp: >> To be clear. You did it to inline methods in debug VM version, >> right? They will be optimized for optimized builds anyway. > > No, I did it to inline them for the optimized build. 
I'd assumed
> that methods defined in header files like that would be inlined,
> but they don't seem to be.  Maybe this is a peculiarity of the
> IcedTea build, which is where I noticed the problem.  If it is,
> I can remove it from this webrev.

As Andrew said, functions defined inside a class definition should be
inlined by default. If that did not happen, then adding "inline" will
not help. I was mostly surprised that you added "inline" to a method
which just returns a field value. But if it really helped (you verified
the code) then leave it.

>>> hotspot/src/share/vm/runtime/os.hpp:
>> Can you explain why your changes is not the same as the comment
>> says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0
>
> Basically because I didn't know if I'd need to make changes anywhere
> else, and I didn't want to break the other platforms.  Should I change
> it to what the comment says?

I am concerned about the correctness of your code: page sizes could be
different. I would prefer your code to be similar to the one in
is_poll_address().

Vladimir

>
> Cheers,
> Gary
>

From Thomas.Rodriguez at Sun.COM  Mon Nov 23 12:05:15 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Mon, 23 Nov 2009 12:05:15 -0800
Subject: Multiple copies of same code
In-Reply-To: <4B07F11B.5060804@gmx.de>
References: <4B07F11B.5060804@gmx.de>
Message-ID:

As Chuck said, this particular case is unlikely to matter at all.  It
only affects code size and is only used in the exception path, which is
mostly dominated by exception lookup if you use this path.  It would be
nice to clean it up but it's not a high priority.

tom

On Nov 21, 2009, at 5:54 AM, Ulf Zibis wrote:

> In output of PrintAssembly I frequently see :
>
> ...
> ... # more than 10 recurrences
> ...
> 726 B108: # B114 <- B10 Freq: 9.99898e-006 > 726 # exception oop is in EAX; no code emitted > 726 MOV ECX,EAX > 728 JMP,s B114 > 728 > 72a B109: # B114 <- B9 Freq: 9.99918e-006 > 72a # exception oop is in EAX; no code emitted > 72a MOV ECX,EAX > 72c JMP,s B114 > 72c > 72e B110: # B114 <- B6 Freq: 9.99938e-006 > 72e # exception oop is in EAX; no code emitted > 72e MOV ECX,EAX > 730 JMP,s B114 > 730 > 732 B111: # B114 <- B4 Freq: 9.99959e-006 > 732 # exception oop is in EAX; no code emitted > 732 MOV ECX,EAX > 734 JMP,s B114 > 734 > 736 B112: # B114 <- B3 Freq: 9.99979e-006 > 736 # exception oop is in EAX; no code emitted > 736 MOV ECX,EAX > 738 JMP,s B114 > 738 > 73a B113: # B114 <- B2 Freq: 9.99999e-006 > 73a # exception oop is in EAX; no code emitted > 73a MOV ECX,EAX > 73a > 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005 > > > Wouldn't it be better to have : > > ... > ... # more than 10 recurrences > ... 
> 73a B108: # B114 <- B10 Freq: 9.99898e-006 > 73a B109: # B114 <- B9 Freq: 9.99918e-006 > 73a B110: # B114 <- B6 Freq: 9.99938e-006 > 73a B111: # B114 <- B4 Freq: 9.99959e-006 > 73a B112: # B114 <- B3 Freq: 9.99979e-006 > 73a B113: # B114 <- B2 Freq: 9.99999e-006 > 73a # exception oop is in EAX; no code emitted > 73a MOV ECX,EAX > 73a > 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005 > > From gbenson at redhat.com Tue Nov 24 02:18:04 2009 From: gbenson at redhat.com (Gary Benson) Date: Tue, 24 Nov 2009 10:18:04 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <4B0A6549.30502@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0A6549.30502@redhat.com> Message-ID: <20091124101803.GB3403@redhat.com> Andrew Haley wrote: > If you really want these functions to be inline, you have to specify > it. Having said that, if you define a member function within a class > definition, then it is inline. I never thought to check that the inline statement was actually doing anything, but apparently it isn't: ... 
#31 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110
#32 0x00007ffff70d4aff in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54
#33 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28
#34 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110
#35 0x00007ffff72d3173 in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54
#36 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28
#37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0,
    parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67
...

First things first: I'll remove the addition of the inline statements
from the webrev.  Secondly, Andrew, do you have any idea why these
functions aren't being inlined?  I'd expect the backtrace to look like
this:

...
#35 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110
#36 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110
#37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0,
    parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67
...

The files are being compiled with these options:

g++ -DLINUX -D_GNU_SOURCE -DCC_INTERP -DZERO -DAMD64 -DZERO_LIBARCH=\"amd64\" -DPRODUCT -I.
-I../generated/adfiles -I../generated/jvmtifiles -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/asm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/c1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/ci -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/classfile -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/code -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/compiler -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/g1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parNew -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/shared -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parallelScavenge -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_interface -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/interpreter -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/memory -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/oops -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/prims -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/services -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/shark -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/utilities -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/cpu/zero/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os/linux/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os_cpu/linux_zero/vm -I../generated -DHOTSPOT_RELEASE_VERSION="\"14.0-b16\"" -DHOTSPOT_BUILD_TARGET="\"product\"" -DHOTSPOT_BUILD_USER="\"gary\"" -DHOTSPOT_LIB_ARCH=\"amd64\" -DJRE_RELEASE_VERSION="\"1.6.0_0-b17\"" 
-DHOTSPOT_VM_DISTRO="\"OpenJDK\"" -DSHARK -I/usr/lib64/libffi-3.0.5/include -I/home/gary/work/llvm/llvm-2.6/include -I/home/gary/work/llvm/llvm-2.6/include -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fPIC -Woverloaded-virtual -DSHARK_LLVM_VERSION=26 -fpic -fno-rtti -fno-exceptions -D_REENTRANT -fcheck-new -m64 -pipe -g -O3 -fno-strict-aliasing -DVM_LITTLE_ENDIAN -D_LP64=1 -Werror -Wpointer-arith -Wsign-compare -c -o vframe_hp.o /home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime/vframe_hp.cpp

Cheers,
Gary

--
http://gbenson.net/

From gbenson at redhat.com  Tue Nov 24 02:27:04 2009
From: gbenson at redhat.com (Gary Benson)
Date: Tue, 24 Nov 2009 10:27:04 +0000
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <4B0ACF22.3050107@sun.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com>
Message-ID: <20091124102703.GC3403@redhat.com>

Vladimir Kozlov wrote:
> Gary Benson wrote:
> > Vladimir Kozlov wrote:
> > > hotspot/src/share/vm/runtime/os.hpp:
> > > Can you explain why your changes is not the same as the comment
> > > says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0
> >
> > Basically because I didn't know if I'd need to make changes
> > anywhere else, and I didn't want to break the other platforms.
> > Should I change it to what the comment says?
>
> I am concern about correctness of your code - page sizes could be
> different. I would prefer if your code will be similar to one in
> is_poll_address().

So something like this:

  static bool is_memory_serialize_page(JavaThread *thread, address addr) {
    if (UseMembar) return false;
    if (thread == NULL) return false;
    return addr >= _mem_serialize_page &&
           addr < (_mem_serialize_page + os::vm_page_size());
  }

The "if (thread == NULL) return false;" would no longer be necessary,
so it could either be retained to preserve the old behaviour or not.
Which would you prefer?
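
For comparison, a minimal standalone sketch of how the XOR-and-mask test from the comment relates to the range test above. The helper names `in_page_by_mask` and `in_page_by_range` and the use of plain integers for addresses are assumptions made for illustration; none of this is HotSpot code. Assuming the page base is aligned to a power-of-two page size, both forms accept exactly the same addresses; with a base that is not aligned to the page size used in the mask they can disagree, which seems to be the correctness concern quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Mask form: true when addr and page agree in every bit above the page
// offset. For an unsigned power-of-two page_size, ~(page_size - 1) is the
// same bit pattern as -page_size in the comment's formula.
inline bool in_page_by_mask(std::uintptr_t page, std::uintptr_t addr,
                            std::uintptr_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return ((page ^ addr) & ~(page_size - 1)) == 0;
}

// Range form: true when addr lies in [page, page + page_size).
// Does not rely on page being aligned.
inline bool in_page_by_range(std::uintptr_t page, std::uintptr_t addr,
                             std::uintptr_t page_size) {
    return addr >= page && addr < page + page_size;
}
```

For an aligned base such as 0x10000 with a 0x1000-byte page the two helpers agree on every address. For a hypothetical base of 0x10800 the mask form accepts 0x10000, which lies below the base, while the range form rejects it.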
Cheers, Gary -- http://gbenson.net/ From aph at redhat.com Tue Nov 24 02:38:57 2009 From: aph at redhat.com (Andrew Haley) Date: Tue, 24 Nov 2009 10:38:57 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091124101803.GB3403@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0A6549.30502@redhat.com> <20091124101803.GB3403@redhat.com> Message-ID: <4B0BB7C1.2070308@redhat.com> Gary Benson wrote: > Andrew Haley wrote: >> If you really want these functions to be inline, you have to specify >> it. Having said that, if you define a member function within a class >> definition, then it is inline. > > I never thought to check that the inline statement was actually doing > anything, but apparently it isn't: It won't: functions defined within a class definition are inline anyway. > First things first: I'll remove the addition of the inline statements > from the webrev. Secondly, Andrew, do you have any idea why these > functions aren't being inlined? Are you sure they aren't being inlined? I'd have to look at the code to be sure. This stack trace doesn't prove it. Not immediately. I'd experiment with __attribute__((always_inline)) and see what happens. If gcc actually can't inline for some reason, it'll produce a diagnostic. Andrew. From Vladimir.Kozlov at Sun.COM Tue Nov 24 10:24:31 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Tue, 24 Nov 2009 10:24:31 -0800 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091124101803.GB3403@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0A6549.30502@redhat.com> <20091124101803.GB3403@redhat.com> Message-ID: <4B0C24DF.6000409@sun.com> Your options have -g debug option but you build -DPRODUCT. Vladimir Gary Benson wrote: > Andrew Haley wrote: >> If you really want these functions to be inline, you have to specify >> it. 
Having said that, if you define a member function within a class >> definition, then it is inline. > > I never thought to check that the inline statement was actually doing > anything, but apparently it isn't: > > ... > #31 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > #32 0x00007ffff70d4aff in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54 > #33 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28 > #34 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > #35 0x00007ffff72d3173 in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54 > #36 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28 > #37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0, > parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67 > ... > > First things first: I'll remove the addition of the inline statements > from the webrev. Secondly, Andrew, do you have any idea why these > functions aren't being inlined? I'd expect the backtrace to look like > this: > > ... > #35 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > #36 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > #37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0, > parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67 > ... > > The files are being compiled with these options: > > g++ -DLINUX -D_GNU_SOURCE -DCC_INTERP -DZERO -DAMD64 -DZERO_LIBARCH=\"amd64\" -DPRODUCT -I. 
-I../generated/adfiles -I../generated/jvmtifiles -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/asm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/c1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/ci -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/classfile -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/code -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/compiler -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/g1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parNew -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/shared -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parallelScavenge -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_interface -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/interpreter -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/memory -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/oops -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/prims -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/services -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/shark -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/utilities -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/cpu/zero/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os/linux/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os_cpu/linux_zero/vm -I../generated -DHOTSPOT_RELEASE_VERSION="\"14.0-b16\"" -DHOTSPOT_BUILD_TARGET="\"product\"" -DHOTSPOT_BUILD_USER="\"gary\"" -DHOTSPOT_LIB_ARCH=\"amd64\" -DJRE_RELEASE_VERSION="\"1.6.0_0-b17\""
-DHOTSPOT_VM_DISTRO="\"OpenJDK\"" -DSHARK -I/usr/lib64/libffi-3.0.5/include -I/home/gary/work/llvm/llvm-2.6/include -I/home/gary/work/llvm/llvm-2.6/include -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fPIC -Woverloaded-virtual -DSHARK_LLVM_VERSION=26 -fpic -fno-rtti -fno-exceptions -D_REENTRANT -fcheck-new -m64 -pipe -g -O3 -fno-strict-aliasing -DVM_LITTLE_ENDIAN -D_LP64=1 -Werror -Wpointer-arith -Wsign-compare -c -o vframe_hp.o /home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime/vframe_hp.cpp > > Cheers, > Gary > From Vladimir.Kozlov at Sun.COM Tue Nov 24 10:28:49 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Tue, 24 Nov 2009 10:28:49 -0800 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091124102703.GC3403@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> Message-ID: <4B0C25E1.4080804@sun.com> I would keep thread check. Vladimir Gary Benson wrote: > Vladimir Kozlov wrote: >> Gary Benson wrote: >>> Vladimir Kozlov wrote: >>>> hotspot/src/share/vm/runtime/os.hpp: >>>> Can you explain why your changes is not the same as the comment >>>> says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0 >>> Basically because I didn't know if I'd need to make changes >>> anywhere else, and I didn't want to break the other platforms. >>> Should I change it to what the comment says? >> I am concern about correctness of your code - page sizes could be >> different. I would prefer if your code will be similar to one in >> is_poll_address(). 
> > So something like this: > > static bool is_memory_serialize_page(JavaThread *thread, address addr) { > if (UseMembar) return false; > if (thread == NULL) return false; > return addr >= _mem_serialize_page && addr < (_mem_serialize_page + os::vm_page_size()); > } > > The "if (thread == NULL) return false;" would no longer be necessary, > so it could either be retained to preserve the old behaviour or not. > Which would you prefer? > > Cheers, > Gary > From francis.rangel at gmail.com Tue Nov 24 11:20:09 2009 From: francis.rangel at gmail.com (Francis Rangel) Date: Tue, 24 Nov 2009 17:20:09 -0200 Subject: Question about the PrintMethodStatistics report Message-ID: Hi there! I'm new here and I want to introduce myself. My name is Francis Rangel and I'm a computing student. My final paper subject is the inline optimization of the openJDK JVM. Well, what I need to know is why the PrintMethodStatistics report changes the amount of each kind of method using diferent compilers. For example, when the programa run with the -server option, the static methods are 100. Then, when the -client compiler is used, the static methods come do 90. If the program wasn't modified, why does this numbers change? Is that because the devirtualization applied on virtual methods? Thanks for your time. -- Regards. Francis Rangel From Ulf.Zibis at gmx.de Tue Nov 24 13:56:27 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 24 Nov 2009 22:56:27 +0100 Subject: Multiple copies of same code In-Reply-To: <4149a0430911220859k2daad86crb87f6b81ab05eadc@mail.gmail.com> References: <4B07F11B.5060804@gmx.de> <4149a0430911220859k2daad86crb87f6b81ab05eadc@mail.gmail.com> Message-ID: <4B0C568B.10600@gmx.de> I think, it's not only the code size that matters, but too the performance lack from all these jumps. In the method code below, you see a 2-line finally block. Looking at the compile result, I can see, that this block is repeated 6 times and consumes 1/3 of the whole assembly code for this method. 
Additionally, there are plenty of range-check and null-check blocks which also seem to be copy-and-pasted, so I guess removing the redundant blocks from this example would roughly halve the code size.

On the other hand, the 1-length int[] dp could be optimized to a normal int field, and pushing the 6 parameters to the stack could be avoided, if the method decode() were inlined. It isn't, because of the inline threshold, which sadly isn't frequency-related. Inlining would additionally increase performance.

    private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        int sl = sp + src.remaining();

        char[] da = dst.array();
        int [] dp = new int[1];
        dp[0] = dst.arrayOffset() + dst.position();
        int dl = dp[0] + dst.remaining();
        try {
            while (sp < sl) {
                CoderResult result;
                byte byte1 = sa[sp];
                if (byte1 >= 0) {               // ASCII      G0
                    if (dp[0] == dl)
                        return CoderResult.OVERFLOW;
                    da[dp[0]++] = (char)(byte1 & 0xff);
                    sp++;
                } else if (byte1 != SS2) {      // Codeset 1  G1
                    if (sp + 1 == sl)
                        break;
                    result = decode(byte1, sa[sp+1], 0, da, dp, dl);
                    if (result != null)
                        return result;
                    sp += 2;
                } else {                        // Codeset 2  G2
                    if (sp + 4 > sl)
                        break;
                    int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
                    if (cnsPlane < 0)
                        return CoderResult.malformedForLength(2);
                    result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
                    if (result != null)
                        return result;
                    sp += 4;
                }
            }
            return CoderResult.UNDERFLOW;
        } finally {
            src.position(sp - src.arrayOffset());
            dst.position(dp[0] - dst.arrayOffset());
        }
    }

-Ulf

On 22.11.2009 17:59, Chuck Rasbold wrote:
> Sure. It would be great to merge redundant code paths. But I don't
> think the cost/benefit ratio is worth it.
>
> In the case you cite, there would be a savings of 4 bytes per path
> removed, which are projected to be very infrequent. In a JIT, you
> have to spend your compilation budget wisely.
>
> It's not that it can't be done. There are just better places to spend time.
>
> On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis wrote:
>
>     In output of PrintAssembly I frequently see :
>
>     ...
>     ...   # more than 10 recurrences
>     ...
>     726   B108: #        B114 <- B10  Freq: 9.99898e-006
>     726           # exception oop is in EAX; no code emitted
>     726           MOV    ECX,EAX
>     728           JMP,s  B114
>     728
>     72a   B109: #        B114 <- B9  Freq: 9.99918e-006
>     72a           # exception oop is in EAX; no code emitted
>     72a           MOV    ECX,EAX
>     72c           JMP,s  B114
>     72c
>     72e   B110: #        B114 <- B6  Freq: 9.99938e-006
>     72e           # exception oop is in EAX; no code emitted
>     72e           MOV    ECX,EAX
>     730           JMP,s  B114
>     730
>     732   B111: #        B114 <- B4  Freq: 9.99959e-006
>     732           # exception oop is in EAX; no code emitted
>     732           MOV    ECX,EAX
>     734           JMP,s  B114
>     734
>     736   B112: #        B114 <- B3  Freq: 9.99979e-006
>     736           # exception oop is in EAX; no code emitted
>     736           MOV    ECX,EAX
>     738           JMP,s  B114
>     738
>     73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>     73a           # exception oop is in EAX; no code emitted
>     73a           MOV    ECX,EAX
>     73a
>     73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108
>     B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81
>     B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>
>
>     Wouldn't it be better to have :
>
>     ...
>     ...   # more than 10 recurrences
>     ...
>     73a   B108: #        B114 <- B10  Freq: 9.99898e-006
>     73a   B109: #        B114 <- B9  Freq: 9.99918e-006
>     73a   B110: #        B114 <- B6  Freq: 9.99938e-006
>     73a   B111: #        B114 <- B4  Freq: 9.99959e-006
>     73a   B112: #        B114 <- B3  Freq: 9.99979e-006
>     73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>     73a           # exception oop is in EAX; no code emitted
>     73a           MOV    ECX,EAX
>     73a
>     73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108
>     B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81
>     B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20091124/ebb0e377/attachment.html From Ulf.Zibis at gmx.de Tue Nov 24 14:33:01 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 24 Nov 2009 23:33:01 +0100 Subject: Inline threshold relative to frequency In-Reply-To: <1258972730.1712.69.camel@macbook> References: <4B087C83.20307@gmx.de> <1258972730.1712.69.camel@macbook> Message-ID: <4B0C5F1D.1030307@gmx.de> Am 23.11.2009 11:38, Christian Thalinger schrieb: > On Sun, 2009-11-22 at 00:49 +0100, Ulf Zibis wrote: > >> Hi, >> >> wouldn't it make sense, if the inline threshold for a method would be >> relative to the frequency of it's usage? >> >> See the method below. It has 189 bytes of byte code, so it "too big" >> under default inline threshold. >> As it is called very frequent, performance should increase >> "dramatically", if it could be inlined, as the pushing of the numerous >> parameters to stack could be saved. >> > > Well, it "could" be a good idea but it does not necessarily increase > "dramatically" performance. It's not easy to find good inlining > heuristics that work in all cases. > > Inlining a rather big method, like yours, has a number side-effects to > the caller, most importantly: > > a) it becomes bigger -> cache effects > In my code example the regarding method is only called from 2 places, so the additional memory would not count so much here, and on the other hand the by-stack passing of 6 parameter arguments could be saved, so the amount of method parameters should be too valued for such a dynamic-threshold. In reference to my other thread "Multiple copies of same code" removing the 6 copies of the finally block would save more memory/cache. > b) register pressure might increase -> worse register allocation (but > could be the other way around) > > Also note that not all architectures use the stack for passing call > arguments. Even x86_64 has enough argument registers for this > particular method. 
> Does that mean, that all the MOV EBP,[ESP + #72] MOV [ESP + #4],EBP pairs would be optimized to register usage in a following optimization step, I can't see by PrintAssembly? -Ulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20091124/4e5a9975/attachment.html From changpeng.fang at sun.com Tue Nov 24 17:44:15 2009 From: changpeng.fang at sun.com (changpeng.fang at sun.com) Date: Wed, 25 Nov 2009 01:44:15 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 7 new changesets Message-ID: <20091125014435.9D11A41BFC@hg.openjdk.java.net> Changeset: 473cce303f13 Author: phh Date: 2009-10-28 16:25 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/473cce303f13 6887571: Increase default heap config sizes Summary: Apply modification of existing server heap size ergo to all collectors except CMS. Reviewed-by: jmasa, ysr, xlu ! src/cpu/sparc/vm/c1_globals_sparc.hpp ! src/cpu/sparc/vm/c2_globals_sparc.hpp ! src/cpu/sparc/vm/globals_sparc.hpp ! src/cpu/x86/vm/c1_globals_x86.hpp ! src/cpu/x86/vm/c2_globals_x86.hpp ! src/cpu/x86/vm/globals_x86.hpp ! src/cpu/zero/vm/globals_zero.hpp ! src/os_cpu/linux_x86/vm/globals_linux_x86.hpp ! src/os_cpu/solaris_x86/vm/globals_solaris_x86.hpp ! src/os_cpu/windows_x86/vm/globals_windows_x86.hpp ! src/share/vm/gc_implementation/parallelScavenge/psGCAdaptivePolicyCounters.cpp ! src/share/vm/memory/collectorPolicy.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/arguments.hpp ! src/share/vm/runtime/globals.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/globals_extension.hpp ! 
src/share/vm/services/management.cpp Changeset: c4ecde2f6b3c Author: xlu Date: 2009-10-30 17:24 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/c4ecde2f6b3c Merge Changeset: 97b36138b494 Author: kamg Date: 2009-11-06 15:04 -0500 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/97b36138b494 Merge Changeset: ba7ea42fc66e Author: phh Date: 2009-11-04 16:49 -0500 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/ba7ea42fc66e 6898160: Need serviceability support for new vm argument type 'uint64_t' Summary: Add serviceability support for uint64_t. Flags of unknown type assert in debug builds and are ignored in product builds. Reviewed-by: never, xlu, mchung, dcubed ! src/share/vm/runtime/globals.cpp ! src/share/vm/services/attachListener.cpp ! src/share/vm/services/management.cpp Changeset: db0d21039f34 Author: kamg Date: 2009-11-06 16:05 -0500 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/db0d21039f34 Merge Changeset: fb4c00faa9da Author: kamg Date: 2009-11-11 09:13 -0500 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/fb4c00faa9da Merge ! src/share/vm/runtime/arguments.cpp Changeset: de44705e6b33 Author: cfang Date: 2009-11-24 11:49 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/de44705e6b33 Merge From ian.rogers at manchester.ac.uk Tue Nov 24 22:32:26 2009 From: ian.rogers at manchester.ac.uk (Ian Rogers) Date: Tue, 24 Nov 2009 22:32:26 -0800 Subject: Multiple copies of same code In-Reply-To: <4B0C568B.10600@gmx.de> References: <4B07F11B.5060804@gmx.de> <4149a0430911220859k2daad86crb87f6b81ab05eadc@mail.gmail.com> <4B0C568B.10600@gmx.de> Message-ID: Hi Ulf, I don't know if it useful but 2 years ago I had a go at optimizing GNU Classpath's NIO charset implementation, in particular for byte charsets like ASCII [1]. The approach I wanted was for small methods that would inline easily and final fields that could be chased through to avoid runtime indirections. 
In MRP [2] (source is Eclipse Public License) I kick the compiler to inline some of the core routines further [3].

Regards,
Ian Rogers (now at Azul Systems in Mountain View)

[1] http://cvs.savannah.gnu.org/viewvc/classpath/gnu/java/nio/charset/?root=classpath
[2] http://mrp.codehaus.org/
[3] http://git.codehaus.org/gitweb.cgi?p=mrp.git;a=blob;f=tools/asm-tasks/src/org/jikesrvm/tools/asm/AnnotationAdder.java;hb=HEAD

2009/11/24 Ulf Zibis :
> I think, it's not only the code size that matters, but too the performance
> lack from all these jumps.
>
> In the method code below, you see a 2-line finally block. Looking at the
> compile result, I can see, that this block is repeated 6 times and consumes
> 1/3 of the whole assembly code for this method. Additionally, there are
> plenty of range-check and null-check block which too seem to be
> copy-and-pasted, so I guess, removing the redundant blocks from this example
> would make the code half-sized.
>
> On the other hand, the 1-length int [] dp could be optimized to a normal int
> field and pushing the 6 parameters to stack could be saved, if method
> decode() would be inlined, but isn't because of inline threshold, which
> sadly isn't frequency-related. This would additionally increase the
> performance.
>
>
>     private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
>
>         byte[] sa = src.array();
>         int sp = src.arrayOffset() + src.position();
>         int sl = sp + src.remaining();
>
>         char[] da = dst.array();
>         int [] dp = new int[1];
>         dp[0] = dst.arrayOffset() + dst.position();
>         int dl = dp[0] + dst.remaining();
>         try {
>             while (sp < sl) {
>                 CoderResult result;
>                 byte byte1 = sa[sp];
>                 if (byte1 >= 0) {               // ASCII      G0
>                     if (dp[0] == dl)
>                         return CoderResult.OVERFLOW;
>                     da[dp[0]++] = (char)(byte1 & 0xff);
>                     sp++;
>                 } else if (byte1 != SS2) {      // Codeset 1  G1
>                     if (sp + 1 == sl)
>                         break;
>                     result = decode(byte1, sa[sp+1], 0, da, dp, dl);
>                     if (result != null)
>                         return result;
>                     sp += 2;
>                 } else {                        // Codeset 2  G2
>                     if (sp + 4 > sl)
>                         break;
>                     int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
>                     if (cnsPlane < 0)
>                         return CoderResult.malformedForLength(2);
>                     result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
>                     if (result != null)
>                         return result;
>                     sp += 4;
>                 }
>             }
>             return CoderResult.UNDERFLOW;
>         } finally {
>             src.position(sp - src.arrayOffset());
>             dst.position(dp[0] - dst.arrayOffset());
>         }
>     }
>
>
> -Ulf
>
>
> On 22.11.2009 17:59, Chuck Rasbold wrote:
>
> Sure. It would be great to merge redundant code paths. But I don't
> think the cost/benefit ratio is worth it.
> In the case you cite, there would be a savings of 4 bytes per path
> removed, which are projected to be very infrequent. In a JIT, you
> have to spend your compilation budget wisely.
> It's not that it can't be done. There are just better places to spend time.
> On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis wrote:
>>
>> In output of PrintAssembly I frequently see :
>>
>> ...
>> ...   # more than 10 recurrences
>> ...
>> 726   B108: #        B114 <- B10  Freq: 9.99898e-006
>> 726           # exception oop is in EAX; no code emitted
>> 726           MOV    ECX,EAX
>> 728           JMP,s  B114
>> 728
>> 72a   B109: #        B114 <- B9  Freq: 9.99918e-006
>> 72a           # exception oop is in EAX; no code emitted
>> 72a           MOV    ECX,EAX
>> 72c           JMP,s  B114
>> 72c
>> 72e   B110: #        B114 <- B6  Freq: 9.99938e-006
>> 72e           # exception oop is in EAX; no code emitted
>> 72e           MOV    ECX,EAX
>> 730           JMP,s  B114
>> 730
>> 732   B111: #        B114 <- B4  Freq: 9.99959e-006
>> 732           # exception oop is in EAX; no code emitted
>> 732           MOV    ECX,EAX
>> 734           JMP,s  B114
>> 734
>> 736   B112: #        B114 <- B3  Freq: 9.99979e-006
>> 736           # exception oop is in EAX; no code emitted
>> 736           MOV    ECX,EAX
>> 738           JMP,s  B114
>> 738
>> 73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>> 73a           # exception oop is in EAX; no code emitted
>> 73a           MOV    ECX,EAX
>> 73a
>> 73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102
>> B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105
>> B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>>
>>
>> Wouldn't it be better to have :
>>
>> ...
>> ...   # more than 10 recurrences
>> ...
>> 73a   B108: #        B114 <- B10  Freq: 9.99898e-006
>> 73a   B109: #        B114 <- B9  Freq: 9.99918e-006
>> 73a   B110: #        B114 <- B6  Freq: 9.99938e-006
>> 73a   B111: #        B114 <- B4  Freq: 9.99959e-006
>> 73a   B112: #        B114 <- B3  Freq: 9.99979e-006
>> 73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>> 73a           # exception oop is in EAX; no code emitted
>> 73a           MOV    ECX,EAX
>> 73a
>> 73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102
>> B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105
>> B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>>
>>
>
>

From Christian.Thalinger at Sun.COM Wed Nov 25 01:47:29 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Wed, 25 Nov 2009 10:47:29 +0100
Subject: Question about the PrintMethodStatistics report
In-Reply-To:
References:
Message-ID: <1259142449.11635.28.camel@macbook>

On Tue, 2009-11-24 at 17:20 -0200, Francis Rangel wrote:
> Hi there!
>
> I'm new here and I want to introduce myself. My name is Francis Rangel
> and I'm a computing student. My final paper subject is the inline
> optimization of the openJDK JVM.
> Well, what I need to know is why the PrintMethodStatistics report
> changes the amount of each kind of method using diferent compilers.
> For example, when the programa run with the -server option, the static
> methods are 100. Then, when the -client compiler is used, the static
> methods come do 90.
> If the program wasn't modified, why does this numbers change?
> Is that because the devirtualization applied on virtual methods?

It does not show different numbers for me. One possibility could be
that your environment is different (e.g. LANG) and so different classes
are loaded (given you're not using the same terminal to run the
program).

What program is this? You can try -XX:+PrintSystemDictionaryAtExit and
compare the output of a C1 and C2 run.

-- Christian

From gbenson at redhat.com Wed Nov 25 02:28:30 2009
From: gbenson at redhat.com (Gary Benson)
Date: Wed, 25 Nov 2009 10:28:30 +0000
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <4B0C24DF.6000409@sun.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0A6549.30502@redhat.com> <20091124101803.GB3403@redhat.com> <4B0C24DF.6000409@sun.com>
Message-ID: <20091125102830.GA3446@redhat.com>

That's an IcedTea patch.
On Fedora, everything is built -g -O2, and the debuginfo is stripped out and stored into a separate package. The header files, etc, are likewise separated into another package, and the end result is, for example, zlib, zlib-devel, and zlib-debuginfo. If you just want to use zlib then all you install is zlib. If you want to build an application against it then you need to install zlib-devel, and if you want to debug an application that uses zlib then you install zlib-debuginfo. Cheers, Gary Vladimir Kozlov wrote: > Your options have -g debug option but you build -DPRODUCT. > > Vladimir > > Gary Benson wrote: > > Andrew Haley wrote: > > > If you really want these functions to be inline, you have to > > > specify it. Having said that, if you define a member function > > > within a class definition, then it is inline. > > > > I never thought to check that the inline statement was actually > > doing anything, but apparently it isn't: > > > > ... > > #31 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > > #32 0x00007ffff70d4aff in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54 > > #33 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28 > > #34 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > > #35 0x00007ffff72d3173 in ZeroEntry::invoke () at hotspot/src/cpu/zero/vm/entry_zero.hpp:54 > > #36 Interpreter::invoke_method () at hotspot/src/cpu/zero/vm/interpreter_zero.hpp:28 > > #37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0, > > parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67 > > ... > > > > First things first: I'll remove the addition of the inline statements > > from the webrev. 
Secondly, Andrew, do you have any idea why these > > functions aren't being inlined? I'd expect the backtrace to look like > > this: > > ... > > #35 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > > #36 CppInterpreter::main_loop (recurse=, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp:110 > > #37 StubGenerator::call_stub (call_wrapper=, result=0x7ffff6d30f78, result_type=T_INT, method=0x7ffff02f8e98, entry_point=0x7ffff41371a0, > > parameters=, parameter_words=1, __the_thread__=0x617d80) at hotspot/src/cpu/zero/vm/stubGenerator_zero.cpp:67 > > ... > > > > The files are being compiled with these options: > > > > g++ -DLINUX -D_GNU_SOURCE -DCC_INTERP -DZERO -DAMD64 -DZERO_LIBARCH=\"amd64\" -DPRODUCT -I. -I../generated/adfiles -I../generated/jvmtifiles -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/asm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/c1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/ci -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/classfile -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/code -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/compiler -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/g1 -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parNew -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/shared -I/home/gary/work/icedte > a6/openjdk-ecj/hotspot/src/share/vm/gc_implementation/parallelScavenge -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/gc_interface -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/interpreter -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/memory 
-I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/oops -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/prims -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/services -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/shark -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/utilities -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/cpu/zero/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os/linux/vm -I/home/gary/work/icedtea6/openjdk-ecj/hotspot/src/os_cpu/linux_zero/vm -I../generated -DHOTSPOT_RELEASE_VERSION="\"14.0-b16\"" -DHOTSPOT_BUILD_TARGET="\"product\"" -DHOTSPOT_BU > ILD_USER="\"gary\"" -DHOTSPOT_LIB_ARCH=\"amd64\" -DJRE_RELEASE_VERSION="\"1.6.0_0-b17\"" -DHOTSPOT_VM_DISTRO="\"OpenJDK\"" -DSHARK -I/usr/lib64/libffi-3.0.5/include -I/home/gary/work/llvm/llvm-2.6/include -I/home/gary/work/llvm/llvm-2.6/include -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fPIC -Woverloaded-virtual -DSHARK_LLVM_VERSION=26 -fpic -fno-rtti -fno-exceptions -D_REENTRANT -fcheck-new -m64 -pipe -g -O3 -fno-strict-aliasing -DVM_LITTLE_ENDIAN -D_LP64=1 -Werror -Wpointer-arith -Wsign-compare -c -o vframe_hp.o /home/gary/work/icedtea6/openjdk-ecj/hotspot/src/share/vm/runtime/vframe_hp.cpp > > > > Cheers, > > Gary -- http://gbenson.net/ From francis.rangel at gmail.com Wed Nov 25 02:44:19 2009 From: francis.rangel at gmail.com (Francis Rangel) Date: Wed, 25 Nov 2009 08:44:19 -0200 Subject: Question about the PrintMethodStatistics report In-Reply-To: <1259142449.11635.28.camel@macbook> References: <1259142449.11635.28.camel@macbook> Message-ID: The programs are benchmarks. I'm using some programs of Dacapo, Java Grande Forum and Shootout benchmarks to do some tests with the inline optimization. 
If one class is loaded and it has an implementation of a method named
"A", and there's just this one implementation loaded at the moment, this
method will be a static method, right? Then, if the garbage collector
cleans up this class and another class with a different implementation
of the "A" method is loaded, would the method still be static?

I'm asking because I was wondering whether these changes occurred when
the garbage collector cleaned up some implementations and the method
became static. Then, in another execution, the garbage collector didn't
clean up the implementations and the method was considered virtual. Do
you think this might be happening?

Regards.

2009/11/25 Christian Thalinger :
> On Tue, 2009-11-24 at 17:20 -0200, Francis Rangel wrote:
>> Hi there!
>>
>> I'm new here and I want to introduce myself. My name is Francis Rangel
>> and I'm a computing student. My final paper subject is the inline
>> optimization of the openJDK JVM.
>> Well, what I need to know is why the PrintMethodStatistics report
>> changes the amount of each kind of method using diferent compilers.
>> For example, when the programa run with the -server option, the static
>> methods are 100. Then, when the -client compiler is used, the static
>> methods come do 90.
>> If the program wasn't modified, why does this numbers change?
>> Is that because the devirtualization applied on virtual methods?
>
> It does not show different numbers for me. One possibility could be
> that your environment is different (e.g. LANG) and so different classes
> are loaded (given you're not using the same terminal to run the
> program).
>
> What program is this? You can try -XX:+PrintSystemDictionaryAtExit and
> compare the output of a C1 and C2 run.
>
> -- Christian
>
>

-- 
Att.
Francis Rangel

From Christian.Thalinger at Sun.COM Wed Nov 25 03:19:44 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Wed, 25 Nov 2009 12:19:44 +0100
Subject: Question about the PrintMethodStatistics report
In-Reply-To:
References: <1259142449.11635.28.camel@macbook>
Message-ID: <1259147984.894.8.camel@macbook>

On Wed, 2009-11-25 at 08:44 -0200, Francis Rangel wrote:
> The programs are benchmarks. I'm using some programs of Dacapo, Java
> Grande Forum and Shootout benchmarks to do some tests with the inline
> optimization.

I also used DaCapo.

> If one class is loaded and it has a implementation of a method named
> "A", and there's just this implementation loaded at the moment, this
> method will be a static method, right? Then, after the garbage
> collector cleaned this class, another class with other implementation
> of "A" method is loaded, the method would still be static?

No, they don't become static. But the compiler can treat them specially
if it can prove that the call is currently monomorphic and will make the
assumption invalid if the class hierarchy changes.

> I'm asking because I was wondering that this changes occured when the
> garbage collector cleaned some implementations and the method become
> static. Then, in other execution, the garbage collector didn't clean
> the implementations and the method was considered virtual. Do you
> think this might happening?

The output of PrintMethodStatistics are static statistics about the
methods in the loaded classes.

-- Christian

From francis.rangel at gmail.com Wed Nov 25 03:32:56 2009
From: francis.rangel at gmail.com (Francis Rangel)
Date: Wed, 25 Nov 2009 09:32:56 -0200
Subject: Question about the PrintMethodStatistics report
In-Reply-To: <1259147984.894.8.camel@macbook>
References: <1259142449.11635.28.camel@macbook> <1259147984.894.8.camel@macbook>
Message-ID:

Right. Well, I have to change my final paper then. I'll try the PrintSystemDictionaryAtExit.
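As a rough illustration of Christian's point above (a sketch only; the class and method names are hypothetical and not taken from any of the benchmarks): whether a method is static or virtual is fixed by its declaration, so loading another implementation changes which code a call site can reach, not the kind of the method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class MonomorphicSketch {
    // Hypothetical classes, named only for this illustration.
    static class Shape {
        int a() { return 1; }
    }

    static class Square extends Shape {
        @Override
        int a() { return 2; }
    }

    // A virtual call site: which a() runs depends on the receiver's class.
    static int callA(Shape s) {
        return s.a();
    }

    // Reports whether the named no-argument method is declared static.
    // This reads the declaration only; it says nothing about JIT decisions.
    static boolean isDeclaredStatic(Class<?> c, String name) {
        try {
            Method m = c.getDeclaredMethod(name);
            return Modifier.isStatic(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The declaration is non-static before any subclass is used...
        System.out.println(isDeclaredStatic(Shape.class, "a"));
        System.out.println(callA(new Shape()));
        // ...and stays non-static after a second implementation is in use.
        System.out.println(callA(new Square()));
        System.out.println(isDeclaredStatic(Shape.class, "a"));
    }
}
```

Whether the compiler treats the call in callA as monomorphic is a separate, run-time decision that this sketch does not observe.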
Then I can attach the two outputs in the next email.

Thanks for your help.

Regards.

2009/11/25 Christian Thalinger :
> On Wed, 2009-11-25 at 08:44 -0200, Francis Rangel wrote:
>> The programs are benchmarks. I'm using some programs of Dacapo, Java
>> Grande Forum and Shootout benchmarks to do some tests with the inline
>> optimization.
>
> I also used DaCapo.
>
>> If one class is loaded and it has a implementation of a method named
>> "A", and there's just this implementation loaded at the moment, this
>> method will be a static method, right? Then, after the garbage
>> collector cleaned this class, another class with other implementation
>> of "A" method is loaded, the method would still be static?
>
> No, they don't become static. But the compiler can treat them specially
> if it can prove that the call is currently monomorphic and will make the
> assumption invalid if the class hierarchy changes.
>
>> I'm asking because I was wondering that this changes occured when the
>> garbage collector cleaned some implementations and the method become
>> static. Then, in other execution, the garbage collector didn't clean
>> the implementations and the method was considered virtual. Do you
>> think this might happening?
>
> The output of PrintMethodStatistics are static statistics about the
> methods in the loaded classes.
>
> -- Christian
>
>

--
Att.
Francis Rangel

From gbenson at redhat.com Wed Nov 25 03:38:02 2009
From: gbenson at redhat.com (Gary Benson)
Date: Wed, 25 Nov 2009 11:38:02 +0000
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <4B0BB7C1.2070308@redhat.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0A6549.30502@redhat.com> <20091124101803.GB3403@redhat.com> <4B0BB7C1.2070308@redhat.com>
Message-ID: <20091125113802.GE3446@redhat.com>

Andrew Haley wrote:
> Are you sure they aren't being inlined? I'd have to look at the
> code to be sure. This stack trace doesn't prove it.
So I checked the code and everything is inlined, even without the extra inline statements. See attached. I didn't know gdb could track what function you were in with such granularity! Cheers, Gary -- http://gbenson.net/ -------------- next part -------------- Dump of assembler code for function _ZN14CppInterpreter9main_loopEiP6Thread: 0x7ffff70d4990 <_ZN14CppInterpreter9main_loopEiP6Thread+0>: push %r14 0x7ffff70d4992 <_ZN14CppInterpreter9main_loopEiP6Thread+2>: test %edi,%edi 0x7ffff70d4994 <_ZN14CppInterpreter9main_loopEiP6Thread+4>: push %r13 0x7ffff70d4996 <_ZN14CppInterpreter9main_loopEiP6Thread+6>: push %r12 0x7ffff70d4998 <_ZN14CppInterpreter9main_loopEiP6Thread+8>: push %rbp 0x7ffff70d4999 <_ZN14CppInterpreter9main_loopEiP6Thread+9>: mov %rsi,%rbp 0x7ffff70d499c <_ZN14CppInterpreter9main_loopEiP6Thread+12>: push %rbx 0x7ffff70d499d <_ZN14CppInterpreter9main_loopEiP6Thread+13>: jne 0x7ffff70d4b4b <_ZN14CppInterpreter9main_loopEiP6Thread+443> 0x7ffff70d49a3 <_ZNK16InterpreterFrame17interpreter_stateEv+0>: mov 0x340(%rbp),%r14 0x7ffff70d49aa <_ZN14CppInterpreter9main_loopEiP6Thread+26>: mov %rbp,%rdi 0x7ffff70d49ad <_ZN14CppInterpreter9main_loopEiP6Thread+29>: callq 0x7ffff70d4170 <_ZN14CppInterpreter23stack_overflow_imminentEP10JavaThread> 0x7ffff70d49b2 <_ZN14CppInterpreter9main_loopEiP6Thread+34>: test %al,%al 0x7ffff70d49b4 <_ZNK16InterpreterFrame17interpreter_stateEv+17>: lea -0x90(%r14),%rbx 0x7ffff70d49bb <_ZN14CppInterpreter9main_loopEiP6Thread+43>: jne 0x7ffff70d4b5d <_ZN10JavaThread19set_last_Java_frameEv> 0x7ffff70d49c1 <_ZN14CppInterpreter9main_loopEiP6Thread+49>: mov 0xccb2d8(%rip),%r13 # 0x7ffff7d9fca0 0x7ffff70d49c8 <_ZN14CppInterpreter9main_loopEiP6Thread+56>: nopl 0x0(%rax,%rax,1) 0x7ffff70d49d0 <_ZN10JavaThread19set_last_Java_frameEv+0>: mov 0x340(%rbp),%rax 0x7ffff70d49d7 <_ZN15JavaFrameAnchor16set_last_Java_spEPl+0>: mov %rax,0x198(%rbp) 0x7ffff70d49de <_ZN14CppInterpreter9main_loopEiP6Thread+78>: cmpb $0x0,0x0(%r13) 0x7ffff70d49e3 
<_ZN14CppInterpreter9main_loopEiP6Thread+83>: je 0x7ffff70d4a50 <_ZN14CppInterpreter9main_loopEiP6Thread+192> 0x7ffff70d49e5 <_ZN14CppInterpreter9main_loopEiP6Thread+85>: mov %rbx,%rdi 0x7ffff70d49e8 <_ZN14CppInterpreter9main_loopEiP6Thread+88>: callq 0x7ffff70355d0 <_ZN19BytecodeInterpreter13runWithChecksEPS_> 0x7ffff70d49ed <_ZN19BytecodeInterpreter6methodEv+0>: mov 0x20(%rbx),%r12 0x7ffff70d49f1 <_ZN15JavaFrameAnchor16set_last_Java_spEPl+0>: movq $0x0,0x198(%rbp) 0x7ffff70d49fc <_ZN19BytecodeInterpreter3msgEv+0>: mov 0x38(%rbx),%eax 0x7ffff70d49ff <_ZN14CppInterpreter9main_loopEiP6Thread+111>: cmp $0x8,%eax 0x7ffff70d4a02 <_ZN14CppInterpreter9main_loopEiP6Thread+114>: je 0x7ffff70d4ae0 <_ZN9ZeroStack6set_spEPl> 0x7ffff70d4a08 <_ZN14CppInterpreter9main_loopEiP6Thread+120>: cmp $0xa,%eax 0x7ffff70d4a0b <_ZN14CppInterpreter9main_loopEiP6Thread+123>: je 0x7ffff70d4a68 <_ZN14CppInterpreter9main_loopEiP6Thread+216> 0x7ffff70d4a0d <_ZN14CppInterpreter9main_loopEiP6Thread+125>: cmp $0x9,%eax 0x7ffff70d4a10 <_ZN14CppInterpreter9main_loopEiP6Thread+128>: je 0x7ffff70d4bfc <_ZN14CppInterpreter9main_loopEiP6Thread+620> 0x7ffff70d4a16 <_ZN14CppInterpreter9main_loopEiP6Thread+134>: cmp $0xb,%eax 0x7ffff70d4a19 <_ZN14CppInterpreter9main_loopEiP6Thread+137>: nopl 0x0(%rax) 0x7ffff70d4a20 <_ZN14CppInterpreter9main_loopEiP6Thread+144>: je 0x7ffff70d4c24 <_ZN14CppInterpreter9main_loopEiP6Thread+660> 0x7ffff70d4a26 <_ZN14CppInterpreter9main_loopEiP6Thread+150>: cmp $0xd,%eax 0x7ffff70d4a29 <_ZN14CppInterpreter9main_loopEiP6Thread+153>: nopl 0x0(%rax) 0x7ffff70d4a30 <_ZN14CppInterpreter9main_loopEiP6Thread+160>: je 0x7ffff70d4c2e <_ZN9ZeroStack6set_spEPl> 0x7ffff70d4a36 <_ZN14CppInterpreter9main_loopEiP6Thread+166>: lea 0x8274cb(%rip),%rdi # 0x7ffff78fbf08 0x7ffff70d4a3d <_ZN14CppInterpreter9main_loopEiP6Thread+173>: mov $0xab,%esi 0x7ffff70d4a42 <_ZN14CppInterpreter9main_loopEiP6Thread+178>: callq 0x7ffff70d5220 <_Z28report_should_not_reach_herePKci> 0x7ffff70d4a47 
<_ZN14CppInterpreter9main_loopEiP6Thread+183>: callq 0x7ffff72500a0 0x7ffff70d4a4c <_ZN14CppInterpreter9main_loopEiP6Thread+188>: jmp 0x7ffff70d49d0 <_ZN10JavaThread19set_last_Java_frameEv> 0x7ffff70d4a4e <_ZN14CppInterpreter9main_loopEiP6Thread+190>: xchg %ax,%ax 0x7ffff70d4a50 <_ZN14CppInterpreter9main_loopEiP6Thread+192>: mov %rbx,%rdi 0x7ffff70d4a53 <_ZN14CppInterpreter9main_loopEiP6Thread+195>: nopl 0x0(%rax,%rax,1) 0x7ffff70d4a58 <_ZN14CppInterpreter9main_loopEiP6Thread+200>: callq 0x7ffff702aff0 <_ZN19BytecodeInterpreter3runEPS_> 0x7ffff70d4a5d <_ZN14CppInterpreter9main_loopEiP6Thread+205>: nopl (%rax) 0x7ffff70d4a60 <_ZN14CppInterpreter9main_loopEiP6Thread+208>: jmp 0x7ffff70d49ed <_ZN19BytecodeInterpreter6methodEv> 0x7ffff70d4a62 <_ZN14CppInterpreter9main_loopEiP6Thread+210>: nopw 0x0(%rax,%rax,1) 0x7ffff70d4a68 <_ZN14CppInterpreter9main_loopEiP6Thread+216>: mov 0x338(%rbp),%rax 0x7ffff70d4a6f <_ZN14CppInterpreter9main_loopEiP6Thread+223>: sub 0x328(%rbp),%rax 0x7ffff70d4a76 <_ZN14CppInterpreter9main_loopEiP6Thread+230>: sar $0x3,%rax 0x7ffff70d4a7a <_ZN14CppInterpreter9main_loopEiP6Thread+234>: sub $0x1,%eax 0x7ffff70d4a7d <_ZN14CppInterpreter9main_loopEiP6Thread+237>: jle 0x7ffff70d4b30 <_ZN14CppInterpreter9main_loopEiP6Thread+416> 0x7ffff70d4a83 <_ZN9ZeroStack5allocEm+0>: subq $0x10,0x338(%rbp) 0x7ffff70d4a8b <_ZN14CppInterpreter9main_loopEiP6Thread+251>: mov 0x30(%rbx),%rax 0x7ffff70d4a8f <_ZN14CppInterpreter9main_loopEiP6Thread+255>: lea 0x8(%rax),%rdx 0x7ffff70d4a93 <_ZN14CppInterpreter9main_loopEiP6Thread+259>: cmp 0x68(%rbx),%rdx 0x7ffff70d4a97 <_ZN14CppInterpreter9main_loopEiP6Thread+263>: jae 0x7ffff70d4ab1 <_ZN14CppInterpreter9main_loopEiP6Thread+289> 0x7ffff70d4a99 <_ZN14CppInterpreter9main_loopEiP6Thread+265>: nopl 0x0(%rax) 0x7ffff70d4aa0 <_ZN14CppInterpreter9main_loopEiP6Thread+272>: mov (%rdx),%rax 0x7ffff70d4aa3 <_ZN14CppInterpreter9main_loopEiP6Thread+275>: mov %rax,-0x10(%rdx) 0x7ffff70d4aa7 <_ZN14CppInterpreter9main_loopEiP6Thread+279>: 
add $0x8,%rdx 0x7ffff70d4aab <_ZN14CppInterpreter9main_loopEiP6Thread+283>: cmp 0x68(%rbx),%rdx 0x7ffff70d4aaf <_ZN14CppInterpreter9main_loopEiP6Thread+287>: jb 0x7ffff70d4aa0 <_ZN14CppInterpreter9main_loopEiP6Thread+272> 0x7ffff70d4ab1 <_ZN14CppInterpreter9main_loopEiP6Thread+289>: mov 0x68(%rbx),%rdx 0x7ffff70d4ab5 <_ZN19BytecodeInterpreter15set_stack_limitEPl+0>: subq $0x10,0x70(%rbx) 0x7ffff70d4aba <_ZN19BytecodeInterpreter9set_stackEPl+0>: subq $0x10,0x30(%rbx) 0x7ffff70d4abf <_ZN14CppInterpreter9main_loopEiP6Thread+303>: lea -0x10(%rdx),%rax 0x7ffff70d4ac3 <_ZN19BytecodeInterpreter14set_stack_baseEPl+0>: mov %rax,0x68(%rbx) 0x7ffff70d4ac7 <_ZN15BasicObjectLock7set_objEP7oopDesc+0>: movq $0x0,-0x8(%rdx) 0x7ffff70d4acf <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+0>: movl $0x6,0x38(%rbx) 0x7ffff70d4ad6 <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+7>: jmpq 0x7ffff70d49d0 <_ZN10JavaThread19set_last_Java_frameEv> 0x7ffff70d4adb <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+12>: nopl 0x0(%rax,%rax,1) 0x7ffff70d4ae0 <_ZN9ZeroStack6set_spEPl+0>: mov 0x30(%rbx),%rax 0x7ffff70d4ae4 <_ZN19BytecodeInterpreter6calleeEv+0>: mov 0x40(%rbx),%rdi 0x7ffff70d4ae8 <_ZNK9ZeroEntry6invokeEP13methodOopDescP6Thread+0>: mov %rbp,%rdx 0x7ffff70d4aeb <_ZN9ZeroStack6set_spEPl+11>: add $0x8,%rax 0x7ffff70d4aef <_ZN9ZeroStack6set_spEPl+15>: mov %rax,0x338(%rbp) 0x7ffff70d4af6 <_ZN11Interpreter13invoke_methodEP13methodOopDescPhP6Thread+14>: mov 0x48(%rbx),%rax 0x7ffff70d4afa <_ZNK9ZeroEntry6invokeEP13methodOopDescP6Thread+18>: mov %rax,%rsi 0x7ffff70d4afd <_ZNK9ZeroEntry6invokeEP13methodOopDescP6Thread+21>: callq *(%rax) 0x7ffff70d4aff <_ZN19BytecodeInterpreter9set_stackEPl+0>: mov 0x338(%rbp),%rax 0x7ffff70d4b06 <_ZN19BytecodeInterpreter9set_stackEPl+7>: sub $0x8,%rax 0x7ffff70d4b0a <_ZN19BytecodeInterpreter9set_stackEPl+11>: mov %rax,0x30(%rbx) 0x7ffff70d4b0e <_ZN9ZeroStack6set_spEPl+0>: mov 0x70(%rbx),%rax 0x7ffff70d4b12 <_ZN9ZeroStack6set_spEPl+4>: add $0x8,%rax 0x7ffff70d4b16 
<_ZN9ZeroStack6set_spEPl+8>: mov %rax,0x338(%rbp) 0x7ffff70d4b1d <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+0>: movl $0x3,0x38(%rbx) 0x7ffff70d4b24 <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+7>: jmpq 0x7ffff70d49d0 <_ZN10JavaThread19set_last_Java_frameEv> 0x7ffff70d4b29 <_ZN19BytecodeInterpreter7set_msgENS_8messagesE+12>: nopl 0x0(%rax) 0x7ffff70d4b30 <_ZN14CppInterpreter9main_loopEiP6Thread+416>: lea 0x8273d1(%rip),%rdi # 0x7ffff78fbf08 0x7ffff70d4b37 <_ZN14CppInterpreter9main_loopEiP6Thread+423>: mov $0x7f,%esi 0x7ffff70d4b3c <_ZN14CppInterpreter9main_loopEiP6Thread+428>: callq 0x7ffff70d51c0 <_Z20report_unimplementedPKci> 0x7ffff70d4b41 <_ZN14CppInterpreter9main_loopEiP6Thread+433>: callq 0x7ffff72500a0 0x7ffff70d4b46 <_ZN14CppInterpreter9main_loopEiP6Thread+438>: jmpq 0x7ffff70d4a83 <_ZN9ZeroStack5allocEm> 0x7ffff70d4b4b <_ZN14CppInterpreter9main_loopEiP6Thread+443>: sub $0x1,%edi 0x7ffff70d4b4e <_ZN14CppInterpreter9main_loopEiP6Thread+446>: xchg %ax,%ax 0x7ffff70d4b50 <_ZN14CppInterpreter9main_loopEiP6Thread+448>: callq 0x7ffff70d4990 <_ZN14CppInterpreter9main_loopEiP6Thread> 0x7ffff70d4b55 <_ZN14CppInterpreter9main_loopEiP6Thread+453>: nopl (%rax) 0x7ffff70d4b58 <_ZN14CppInterpreter9main_loopEiP6Thread+456>: jmpq 0x7ffff70d49a3 <_ZNK16InterpreterFrame17interpreter_stateEv> 0x7ffff70d4b5d <_ZN10JavaThread19set_last_Java_frameEv+0>: mov 0x340(%rbp),%rax 0x7ffff70d4b64 <_ZN14CppInterpreter9main_loopEiP6Thread+468>: mov %rbp,%rdi 0x7ffff70d4b67 <_ZN15JavaFrameAnchor16set_last_Java_spEPl+0>: mov %rax,0x198(%rbp) 0x7ffff70d4b6e <_ZN14CppInterpreter9main_loopEiP6Thread+478>: callq 0x7ffff715fbf0 <_ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread> 0x7ffff70d4b73 <_ZN15JavaFrameAnchor16set_last_Java_spEPl+0>: movq $0x0,0x198(%rbp) 0x7ffff70d4b7e <_ZN19BytecodeInterpreter6methodEv+0>: mov -0x70(%r14),%r12 0x7ffff70d4b82 <_ZN19BytecodeInterpreter6methodEv+4>: xor %r8d,%r8d 0x7ffff70d4b85 <_ZN19BytecodeInterpreter6methodEv+7>: xor %ecx,%ecx 
0x7ffff70d4b87 <_ZN9ZeroStack6set_spEPl+0>: mov 0x340(%rbp),%rdx 0x7ffff70d4b8e <_ZN9ZeroStack6set_spEPl+7>: lea 0x8(%rdx),%rax 0x7ffff70d4b92 <_ZN9ZeroStack6set_spEPl+11>: mov %rax,0x338(%rbp) 0x7ffff70d4b99 <_ZN10JavaThread14pop_zero_frameEv+18>: mov (%rdx),%rax 0x7ffff70d4b9c <_ZN10JavaThread14pop_zero_frameEv+21>: mov %rax,0x340(%rbp) 0x7ffff70d4ba3 <_ZN9ZeroStack6set_spEPl+0>: movzwl 0x3c(%r12),%eax 0x7ffff70d4ba9 <_ZN9ZeroStack6set_spEPl+6>: shl $0x3,%rax 0x7ffff70d4bad <_ZN9ZeroStack6set_spEPl+10>: add %rax,0x338(%rbp) 0x7ffff70d4bb4 <_ZN14CppInterpreter9main_loopEiP6Thread+548>: test %ecx,%ecx 0x7ffff70d4bb6 <_ZN14CppInterpreter9main_loopEiP6Thread+550>: jle 0x7ffff70d4bf3 <_ZN14CppInterpreter9main_loopEiP6Thread+611> 0x7ffff70d4bb8 <_ZN14CppInterpreter9main_loopEiP6Thread+552>: lea -0x1(%rcx),%eax 0x7ffff70d4bbb <_ZN14CppInterpreter9main_loopEiP6Thread+555>: xor %esi,%esi 0x7ffff70d4bbd <_ZN14CppInterpreter9main_loopEiP6Thread+557>: lea 0x8(,%rax,8),%rdi 0x7ffff70d4bc5 <_ZN14CppInterpreter9main_loopEiP6Thread+565>: neg %rdi 0x7ffff70d4bc8 <_ZN14CppInterpreter9main_loopEiP6Thread+568>: nopl 0x0(%rax,%rax,1) 0x7ffff70d4bd0 <_ZN9ZeroStack4pushEl+0>: mov 0x338(%rbp),%rdx 0x7ffff70d4bd7 <_ZN14CppInterpreter9main_loopEiP6Thread+583>: mov (%r8,%rsi,1),%rcx 0x7ffff70d4bdb <_ZN9ZeroStack4pushEl+11>: sub $0x8,%rsi 0x7ffff70d4bdf <_ZN14CppInterpreter9main_loopEiP6Thread+591>: cmp %rdi,%rsi 0x7ffff70d4be2 <_ZN9ZeroStack4pushEl+18>: lea -0x8(%rdx),%rax 0x7ffff70d4be6 <_ZN9ZeroStack4pushEl+22>: mov %rax,0x338(%rbp) 0x7ffff70d4bed <_ZN9ZeroStack4pushEl+29>: mov %rcx,-0x8(%rdx) 0x7ffff70d4bf1 <_ZN14CppInterpreter9main_loopEiP6Thread+609>: jne 0x7ffff70d4bd0 <_ZN9ZeroStack4pushEl> 0x7ffff70d4bf3 <_ZN14CppInterpreter9main_loopEiP6Thread+611>: pop %rbx 0x7ffff70d4bf4 <_ZN14CppInterpreter9main_loopEiP6Thread+612>: pop %rbp 0x7ffff70d4bf5 <_ZN14CppInterpreter9main_loopEiP6Thread+613>: pop %r12 0x7ffff70d4bf7 <_ZN14CppInterpreter9main_loopEiP6Thread+615>: pop %r13 
0x7ffff70d4bf9 <_ZN14CppInterpreter9main_loopEiP6Thread+617>: pop %r14 0x7ffff70d4bfb <_ZN14CppInterpreter9main_loopEiP6Thread+619>: retq 0x7ffff70d4bfc <_ZN14CppInterpreter9main_loopEiP6Thread+620>: mov %r12,%rdi 0x7ffff70d4bff <_ZN14CppInterpreter9main_loopEiP6Thread+623>: callq 0x7ffff7230570 <_ZNK13methodOopDesc11result_typeEv> 0x7ffff70d4c04 <_ZN14CppInterpreter9main_loopEiP6Thread+628>: mov 0xccb38d(%rip),%rdx # 0x7ffff7d9ff98 0x7ffff70d4c0b <_ZN14CppInterpreter9main_loopEiP6Thread+635>: cltq 0x7ffff70d4c0d <_ZN14CppInterpreter9main_loopEiP6Thread+637>: mov (%rdx,%rax,4),%ecx 0x7ffff70d4c10 <_ZN14CppInterpreter9main_loopEiP6Thread+640>: movslq %ecx,%rax 0x7ffff70d4c13 <_ZN14CppInterpreter9main_loopEiP6Thread+643>: lea 0x0(,%rax,8),%r8 0x7ffff70d4c1b <_ZN14CppInterpreter9main_loopEiP6Thread+651>: add -0x60(%r14),%r8 0x7ffff70d4c1f <_ZN14CppInterpreter9main_loopEiP6Thread+655>: jmpq 0x7ffff70d4b87 <_ZN9ZeroStack6set_spEPl> 0x7ffff70d4c24 <_ZN14CppInterpreter9main_loopEiP6Thread+660>: xor %r8d,%r8d 0x7ffff70d4c27 <_ZN14CppInterpreter9main_loopEiP6Thread+663>: xor %ecx,%ecx 0x7ffff70d4c29 <_ZN14CppInterpreter9main_loopEiP6Thread+665>: jmpq 0x7ffff70d4b87 <_ZN9ZeroStack6set_spEPl> 0x7ffff70d4c2e <_ZN9ZeroStack6set_spEPl+0>: mov 0x340(%rbp),%rdx 0x7ffff70d4c35 <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+0>: mov %rbp,%rcx 0x7ffff70d4c38 <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+3>: mov %r12,%rdi 0x7ffff70d4c3b <_ZN9ZeroStack6set_spEPl+13>: lea 0x8(%rdx),%rax 0x7ffff70d4c3f <_ZN9ZeroStack6set_spEPl+17>: mov %rax,0x338(%rbp) 0x7ffff70d4c46 <_ZN10JavaThread14pop_zero_frameEv+24>: mov (%rdx),%rax 0x7ffff70d4c49 <_ZN10JavaThread14pop_zero_frameEv+27>: mov %rax,0x340(%rbp) 0x7ffff70d4c50 <_ZN9ZeroStack6set_spEPl+0>: movzwl 0x3e(%r12),%edx 0x7ffff70d4c56 <_ZN9ZeroStack6set_spEPl+6>: movzwl 0x3c(%r12),%eax 0x7ffff70d4c5c <_ZN9ZeroStack6set_spEPl+12>: sub %edx,%eax 0x7ffff70d4c5e <_ZN9ZeroStack6set_spEPl+14>: cltq 0x7ffff70d4c60 
<_ZN9ZeroStack6set_spEPl+16>: shl $0x3,%rax 0x7ffff70d4c64 <_ZN9ZeroStack6set_spEPl+20>: add %rax,0x338(%rbp) 0x7ffff70d4c6b <_ZN14CppInterpreter9main_loopEiP6Thread+731>: pop %rbx 0x7ffff70d4c6c <_ZN14CppInterpreter9main_loopEiP6Thread+732>: pop %rbp 0x7ffff70d4c6d <_ZN11Interpreter10invoke_osrEP13methodOopDescPhS2_P6Thread+56>: mov -0x48(%r14),%rax 0x7ffff70d4c71 <_ZN14CppInterpreter9main_loopEiP6Thread+737>: pop %r12 0x7ffff70d4c73 <_ZN14CppInterpreter9main_loopEiP6Thread+739>: pop %r13 0x7ffff70d4c75 <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+64>: mov -0x50(%r14),%rsi 0x7ffff70d4c79 <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+68>: mov %rax,%rdx 0x7ffff70d4c7c <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+71>: mov (%rax),%r11 0x7ffff70d4c7f <_ZN14CppInterpreter9main_loopEiP6Thread+751>: pop %r14 0x7ffff70d4c81 <_ZNK9ZeroEntry10invoke_osrEP13methodOopDescPhP6Thread+76>: jmpq *%r11 End of assembler dump. From gbenson at redhat.com Wed Nov 25 05:15:07 2009 From: gbenson at redhat.com (Gary Benson) Date: Wed, 25 Nov 2009 13:15:07 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <4B0C25E1.4080804@sun.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> Message-ID: <20091125131507.GF3446@redhat.com> Cool. I've rolled the changes we've discussed into this webrev: http://cr.openjdk.java.net/~gbenson/zero-update-02-hs/ Let me know what you think. Cheers, Gary Vladimir Kozlov wrote: > I would keep thread check. 
>
> Vladimir
>
> Gary Benson wrote:
> > Vladimir Kozlov wrote:
> > > Gary Benson wrote:
> > > > Vladimir Kozlov wrote:
> > > > > hotspot/src/share/vm/runtime/os.hpp:
> > > > > Can you explain why your changes is not the same as the comment
> > > > > says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0
> > > >
> > > > Basically because I didn't know if I'd need to make changes
> > > > anywhere else, and I didn't want to break the other platforms.
> > > > Should I change it to what the comment says?
> > >
> > > I am concern about correctness of your code - page sizes could
> > > be different. I would prefer if your code will be similar to
> > > one in is_poll_address().
> >
> > So something like this:
> >
> > static bool is_memory_serialize_page(JavaThread *thread, address addr) {
> >   if (UseMembar) return false;
> >   if (thread == NULL) return false;
> >   return addr >= _mem_serialize_page && addr < (_mem_serialize_page + os::vm_page_size());
> > }
> >
> > The "if (thread == NULL) return false;" would no longer be
> > necessary, so it could either be retained to preserve the old
> > behaviour or not. Which would you prefer?
> >
> > Cheers,
> > Gary

--
http://gbenson.net/

From Vladimir.Kozlov at Sun.COM Wed Nov 25 10:07:16 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Wed, 25 Nov 2009 10:07:16 -0800
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <20091125131507.GF3446@redhat.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com>
Message-ID: <4B0D7254.5080207@sun.com>

Looks good.

Vladimir

Gary Benson wrote:
> Cool. I've rolled the changes we've discussed into this webrev:
>
> http://cr.openjdk.java.net/~gbenson/zero-update-02-hs/
>
> Let me know what you think.
>
> Cheers,
> Gary
>
> Vladimir Kozlov wrote:
>> I would keep thread check.
>>
>> Vladimir
>>
>> Gary Benson wrote:
>>> Vladimir Kozlov wrote:
>>>> Gary Benson wrote:
>>>>> Vladimir Kozlov wrote:
>>>>>> hotspot/src/share/vm/runtime/os.hpp:
>>>>>> Can you explain why your changes is not the same as the comment
>>>>>> says?: ((_mem_serialize_page ^ addr) & -pagesize) == 0
>>>>> Basically because I didn't know if I'd need to make changes
>>>>> anywhere else, and I didn't want to break the other platforms.
>>>>> Should I change it to what the comment says?
>>>> I am concern about correctness of your code - page sizes could
>>>> be different. I would prefer if your code will be similar to
>>>> one in is_poll_address().
>>> So something like this:
>>>
>>> static bool is_memory_serialize_page(JavaThread *thread, address addr) {
>>>   if (UseMembar) return false;
>>>   if (thread == NULL) return false;
>>>   return addr >= _mem_serialize_page && addr < (_mem_serialize_page + os::vm_page_size());
>>> }
>>>
>>> The "if (thread == NULL) return false;" would no longer be
>>> necessary, so it could either be retained to preserve the old
>>> behaviour or not. Which would you prefer?
>>>
>>> Cheers,
>>> Gary
>

From Ulf.Zibis at gmx.de Wed Nov 25 15:12:19 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 26 Nov 2009 00:12:19 +0100
Subject: Compiled call version seems to be slower
In-Reply-To: <1258974061.1712.72.camel@macbook>
References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook>
Message-ID: <4B0DB9D3.6050404@gmx.de>

Am 23.11.2009 12:01, Christian Thalinger schrieb:
> On Sat, 2009-11-21 at 22:44 +0100, Ulf Zibis wrote:
>
>> In my attached example -XX:+PrintAssembly outputs 2 versions for
>> sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decodeArrayLoop.
>>
>> The 1st one seems for call from compiled calling method, the 2nd for
>> call from byte code interpreter.
>> Looking closer to the -XX:+PrintAssembly output, 1st version seems to
>> be
>> slower, i.e. has more instructions in the loop code.
>>
>> Additionally I'm wondering why the finally block is copy-and-pasted
>> for
>> each separate return.
>> Is that as disired ?
>>
>
> I just had a quick glance at the code and it seems the second one has a
> better register allocation, for whatever reason. And maybe (would need
> to look closer) basic block ordering is different.
>

Additional question:
What does that mean in exact, that there are 2 little different
compiler outputs for the same method?
Which of the 2 would actually run?

> -- Christian
>
> PS: It would be much easier when you would attach two files and give
> some hints (e.g. addresses) where the code is you're talking about.
>

I'm not sure if I understand right. I had attached the java file and the PrintAssembly output, but got:
(but anyway, I CC:ed you so you should got the 2 files.)

Your mail to 'hotspot-compiler-dev' with the subject

    Compiled call version seems to be slower

Is being held until the list moderator can review it for approval.

The reason it is being held:

    Message body is too big: 314879 bytes with a limit of 100 KB

-Ulf

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20091126/dabde37d/attachment.html

From Christian.Thalinger at Sun.COM Thu Nov 26 00:27:31 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Thu, 26 Nov 2009 09:27:31 +0100
Subject: Compiled call version seems to be slower
In-Reply-To: <4B0DB9D3.6050404@gmx.de>
References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de>
Message-ID: <1259224051.875.3.camel@macbook>

On Thu, 2009-11-26 at 00:12 +0100, Ulf Zibis wrote:
> Am 23.11.2009 12:01, Christian Thalinger schrieb:
> > On Sat, 2009-11-21 at 22:44 +0100, Ulf Zibis wrote:
> >
> > > In my attached example -XX:+PrintAssembly outputs 2 versions for
> > > sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decodeArrayLoop.
> > >
> > > The 1st one seems for call from compiled calling method, the 2nd for
> > > call from byte code interpreter.
> > > Looking closer to the -XX:+PrintAssembly output, 1st version seems to
> > > be
> > > slower, i.e. has more instructions in the loop code.
> > >
> > > Additionally I'm wondering why the finally block is copy-and-pasted
> > > for
> > > each separate return.
> > > Is that as disired ?
> > >
> >
> > I just had a quick glance at the code and it seems the second one has a
> > better register allocation, for whatever reason. And maybe (would need
> > to look closer) basic block ordering is different.
> >
>
> Additional question:
> What does that mean in exact, that there are 2 little different
> compiler outputs for the same method?
> Which of the 2 would actually run?

Ahh, sorry, I misunderstood. The two disassemblies are from the same
run, I thought these are two different runs.

Well, the first version seem to have been invalidated, for whatever
reason, and got recompiled. A -XX:+PrintCompilation would probably tell
us. And it's very likely that a second compile is different than the
first one (more profiling data, different inlining, ...).

-- Christian

From Christian.Thalinger at Sun.COM Thu Nov 26 01:35:33 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Thu, 26 Nov 2009 10:35:33 +0100
Subject: Inline threshold relative to frequency
In-Reply-To: <4B0C5F1D.1030307@gmx.de>
References: <4B087C83.20307@gmx.de> <1258972730.1712.69.camel@macbook> <4B0C5F1D.1030307@gmx.de>
Message-ID: <1259228133.875.8.camel@macbook>

On Tue, 2009-11-24 at 23:33 +0100, Ulf Zibis wrote:
> In my code example the regarding method is only called from 2 places,
> so the additional memory would not count so much here, and on the
> other hand the by-stack passing of 6 parameter arguments could be
> saved, so the amount of method parameters should be too valued for
> such a dynamic-threshold.

Maybe it should be.
>
> In reference to my other thread "Multiple copies of same code"
> removing the 6 copies of the finally block would save more
> memory/cache.
>
> > b) register pressure might increase -> worse register allocation (but
> > could be the other way around)
> >
> > Also note that not all architectures use the stack for passing call
> > arguments. Even x86_64 has enough argument registers for this
> > particular method.
> >
>
> Does that mean, that all the
>     MOV EBP,[ESP + #72]
>     MOV [ESP + #4],EBP
> pairs would be optimized to register usage in a following optimization
> step, I can't see by PrintAssembly?

No. The above code is very likely a register spill and happens because
the architecture does not have enough registers to hold all live values
for the currently compiled method.

-- Christian

From Ulf.Zibis at gmx.de Thu Nov 26 06:55:14 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 26 Nov 2009 15:55:14 +0100
Subject: Compiled call version seems to be slower
In-Reply-To: <1259224051.875.3.camel@macbook>
References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook>
Message-ID: <4B0E96D2.1050508@gmx.de>

Am 26.11.2009 09:27, Christian Thalinger schrieb:
> On Thu, 2009-11-26 at 00:12 +0100, Ulf Zibis wrote:
>
>>
>>>> Additionally I'm wondering why the finally block is copy-and-pasted
>>>> for
>>>> each separate return.
>>>> Is that as disired ?
>>>>

Sorry about asking once more. Would it be so hard to avoid the 6-times
redundancy of the finally block, or are there other reasons?

>> Additional question:
>> What does that mean in exact, that there are 2 little different
>> compiler outputs for the same method?
>> Which of the 2 would actually run?
>>
>
> Ahh, sorry, I misunderstood. The two disassemblies are from the same
> run, I thought these are two different runs.
>
> Well, the first version seem to have been invalidated, for whatever
> reason, and got recompiled.
> A -XX:+PrintCompilation would probably tell
> us. And it's very likely that a second compile is different than the
> first one (more profiling data, different inlining, ...).
>

Much thanks, so my worry about, that the "slower" one would come to account, is obsolete.

-Ulf

From Ulf.Zibis at gmx.de Thu Nov 26 08:29:14 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Thu, 26 Nov 2009 17:29:14 +0100
Subject: +PrintInlining falsly? says: never executed
Message-ID: <4B0EACDA.3070801@gmx.de>

Having:
VM option 'MaxInlineSize=250'
VM option '+PrintCompilation'
VM option '+PrintInlining'

I get:
    @ 180 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode never executed
and
static sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder.decode(BBI[C[II)Ljava/nio/charset/CoderResult;
  interpreter_invocation_count: 10001
  invocation_counter: 5001
  backedge_counter: 1

Why PrintInlining says: "never executed" ?

Having only:
VM option '+PrintInlining'

I get:
    @ 181 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode too big

For the method size refer:
{method}
 - klass: {other class}
 - method holder: 'sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder'
 - constants: 0x085562d4{constant pool}
 - access: 0x81000000
 - name: 'decode'
 - signature: '(BBI[C[II)Ljava/nio/charset/CoderResult;'
 - max stack: 6
 - max locals: 9
 - size of params: 7
 - method size: 20
 - vtable index: 17
 - i2i entry: 0x00acb6a0
 - adapter: 0x03d431f0
 - compiled entry 0x00b91070
 - code size: 189
 - code start: 0x1425d730
 - code end (excl): 0x1425d7ed
 - method data: 0x142c5e08
 - checked ex length: 0
 - linenumber start: 0x1425d7ed
 - localvar length: 9
 - localvar start: 0x1425d802

-Ulf

From gbenson at redhat.com Fri Nov 27 01:58:57 2009
From: gbenson at redhat.com (Gary Benson)
Date: Fri, 27 Nov 2009 09:58:57 +0000
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <4B0D7254.5080207@sun.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com>
<4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com> <4B0D7254.5080207@sun.com>
Message-ID: <20091127095857.GA3323@redhat.com>

Does that mean it's approved? ;)

Cheers,
Gary

Vladimir Kozlov wrote:
> Looks good.
>
> Vladimir
>
> Gary Benson wrote:
> > Cool. I've rolled the changes we've discussed into this webrev:
> >
> > http://cr.openjdk.java.net/~gbenson/zero-update-02-hs/
> >
> > Let me know what you think.
> >
> > Cheers,
> > Gary

--
http://gbenson.net/

From francis.rangel at gmail.com Fri Nov 27 02:32:45 2009
From: francis.rangel at gmail.com (Francis Rangel)
Date: Fri, 27 Nov 2009 08:32:45 -0200
Subject: Performance counter
Message-ID:

Hi there!

I was trying to use the papiex and tauex to collect some performance counters, like cache misses, total cycles and so on. But none of them worked with my JVM. Is there some performance counter that you guys use?
I need this to test the inline optimization. And I'm using Ubuntu 8.04.

Thanks for your time.

Regards.

--
Att.
Francis Rangel

From Christian.Thalinger at Sun.COM Fri Nov 27 04:36:18 2009
From: Christian.Thalinger at Sun.COM (Christian Thalinger)
Date: Fri, 27 Nov 2009 13:36:18 +0100
Subject: Review Request: 6896043: Zero fixes
In-Reply-To: <20091127095857.GA3323@redhat.com>
References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com> <4B0D7254.5080207@sun.com> <20091127095857.GA3323@redhat.com>
Message-ID: <1259325378.875.110.camel@macbook>

On Fri, 2009-11-27 at 09:58 +0000, Gary Benson wrote:
> Does that mean it's approved? ;)

Yes :-) Vladimir and Tom are on vacation this week.
-- Christian From Christian.Thalinger at Sun.COM Fri Nov 27 04:44:42 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Fri, 27 Nov 2009 13:44:42 +0100 Subject: Compiled call version seems to be slower In-Reply-To: <4B0E96D2.1050508@gmx.de> References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> Message-ID: <1259325882.875.113.camel@macbook> On Thu, 2009-11-26 at 15:55 +0100, Ulf Zibis wrote: > Am 26.11.2009 09:27, Christian Thalinger schrieb: > > On Thu, 2009-11-26 at 00:12 +0100, Ulf Zibis wrote: > > > >> > >>>> Additionally I'm wondering why the finally block is copy-and-pasted > >>>> for > >>>> each separate return. > >>>> Is that as disired ? > >>>> > > Sorry about asking once more. Would it be so hard to avoid the 6-times > redundancy of the finally block, or are there other reasons? Well, I guess no, but everyone is busy with more important stuff. But hey, it's open source :-) -- Christian From gbenson at redhat.com Fri Nov 27 06:14:57 2009 From: gbenson at redhat.com (Gary Benson) Date: Fri, 27 Nov 2009 14:14:57 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <1259325378.875.110.camel@macbook> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com> <4B0D7254.5080207@sun.com> <20091127095857.GA3323@redhat.com> <1259325378.875.110.camel@macbook> Message-ID: <20091127141457.GB3323@redhat.com> Christian Thalinger wrote: > On Fri, 2009-11-27 at 09:58 +0000, Gary Benson wrote: > > Does that mean it's approved? ;) > > Yes :-) Vladimir and Tom are on vacation this week. -- Christian Ah, cool :) Do I need to wait until they get back for the push? 
Cheers, Gary -- http://gbenson.net/ From Christian.Thalinger at Sun.COM Fri Nov 27 06:38:09 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Fri, 27 Nov 2009 15:38:09 +0100 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <20091127141457.GB3323@redhat.com> References: <20091119131516.GC7222@redhat.com> <4B071D88.2020100@sun.com> <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com> <4B0D7254.5080207@sun.com> <20091127095857.GA3323@redhat.com> <1259325378.875.110.camel@macbook> <20091127141457.GB3323@redhat.com> Message-ID: <1259332689.875.118.camel@macbook> On Fri, 2009-11-27 at 14:14 +0000, Gary Benson wrote: > Christian Thalinger wrote: > > On Fri, 2009-11-27 at 09:58 +0000, Gary Benson wrote: > > > Does that mean it's approved? ;) > > > > Yes :-) Vladimir and Tom are on vacation this week. -- Christian > > Ah, cool :) Do I need to wait until they get back for the push? No. Should I do the push for you? -- Christian From gbenson at redhat.com Fri Nov 27 07:34:22 2009 From: gbenson at redhat.com (Gary Benson) Date: Fri, 27 Nov 2009 15:34:22 +0000 Subject: Review Request: 6896043: Zero fixes In-Reply-To: <1259332689.875.118.camel@macbook> References: <20091123101823.GA3377@redhat.com> <4B0ACF22.3050107@sun.com> <20091124102703.GC3403@redhat.com> <4B0C25E1.4080804@sun.com> <20091125131507.GF3446@redhat.com> <4B0D7254.5080207@sun.com> <20091127095857.GA3323@redhat.com> <1259325378.875.110.camel@macbook> <20091127141457.GB3323@redhat.com> <1259332689.875.118.camel@macbook> Message-ID: <20091127153422.GC3323@redhat.com> Christian Thalinger wrote: > On Fri, 2009-11-27 at 14:14 +0000, Gary Benson wrote: > > Christian Thalinger wrote: > > > On Fri, 2009-11-27 at 09:58 +0000, Gary Benson wrote: > > > > Does that mean it's approved? ;) > > > > > > Yes :-) Vladimir and Tom are on vacation this week. 
-- Christian > > > > Ah, cool :) Do I need to wait until they get back for the push? > > No. Should I do the push for you? Yes please! Cheers, Gary -- http://gbenson.net/ From Christian.Thalinger at Sun.COM Fri Nov 27 10:17:18 2009 From: Christian.Thalinger at Sun.COM (Christian.Thalinger at Sun.COM) Date: Fri, 27 Nov 2009 18:17:18 +0000 Subject: hg: jdk7/hotspot-comp/hotspot: 6896043: first round of zero fixes Message-ID: <20091127181726.9345B4131A@hg.openjdk.java.net> Changeset: 8e7adf982378 Author: twisti Date: 2009-11-27 07:56 -0800 URL: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/8e7adf982378 6896043: first round of zero fixes Reviewed-by: kvn Contributed-by: Gary Benson ! src/cpu/zero/vm/cppInterpreter_zero.cpp ! src/cpu/zero/vm/frame_zero.cpp ! src/cpu/zero/vm/frame_zero.hpp ! src/cpu/zero/vm/globals_zero.hpp ! src/cpu/zero/vm/sharedRuntime_zero.cpp ! src/cpu/zero/vm/sharkFrame_zero.hpp ! src/share/vm/interpreter/bytecodeInterpreter.cpp ! src/share/vm/prims/jni.cpp ! src/share/vm/prims/jvmtiManageCapabilities.cpp ! src/share/vm/runtime/os.hpp From Vladimir.Kozlov at Sun.COM Mon Nov 30 08:50:51 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Mon, 30 Nov 2009 08:50:51 -0800 Subject: Request for reviews (XL): 6829187: compiler optimizations required for JSR 292 In-Reply-To: <1256756457.3638.21.camel@macbook> References: <1256756457.3638.21.camel@macbook> Message-ID: <4B13F7EB.4030204@sun.com> Christian, I have next your 4 review requests. For which you still need a review? Vladimir > Here is the first part of the method handle walker: > > http://cr.openjdk.java.net/~twisti/6894206/webrev.01/ > The bytecode adapter generation part for the compilers is another > webrev, which will be posted later. 
> > This is the JSR 292 C2 compiler support: > http://cr.openjdk.java.net/~twisti/6829187/webrev.01/ > > This patch is one of the first ones: > It depends on a mlvm patch called meth.walker.patch, which may be sent > out tomorrow (or the day after). > I fixed a small bug. Here is the updated webrev: > http://cr.openjdk.java.net/~twisti/6893081/webrev.02/ > > And this one adds C2 inlining support: > http://cr.openjdk.java.net/~twisti/6893268/webrev.01/ From Christian.Thalinger at Sun.COM Mon Nov 30 08:58:19 2009 From: Christian.Thalinger at Sun.COM (Christian Thalinger) Date: Mon, 30 Nov 2009 17:58:19 +0100 Subject: Request for reviews (XL): 6829187: compiler optimizations required for JSR 292 In-Reply-To: <4B13F7EB.4030204@sun.com> References: <1256756457.3638.21.camel@macbook> <4B13F7EB.4030204@sun.com> Message-ID: <1259600299.22671.38.camel@macbook> On Mon, 2009-11-30 at 08:50 -0800, Vladimir Kozlov wrote: > Christian, > > I have next your 4 review requests. > For which you still need a review? For all of them :-/ -- Christian From Thomas.Rodriguez at Sun.COM Mon Nov 30 10:10:04 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Mon, 30 Nov 2009 10:10:04 -0800 Subject: Compiled call version seems to be slower In-Reply-To: <1259325882.875.113.camel@macbook> References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> <1259325882.875.113.camel@macbook> Message-ID: >>>> >>>>>> Additionally I'm wondering why the finally block is copy-and-pasted >>>>>> for >>>>>> each separate return. >>>>>> Is that as disired ? >>>>>> >> >> Sorry about asking once more. Would it be so hard to avoid the 6-times >> redundancy of the finally block, or are there other reasons? > > Well, I guess no, but everyone is busy with more important stuff. But > hey, it's open source :-) The 6 copies of the finally block are there in the bytecodes. It's not something hotspot is creating. 
A finally is executed on every return path so a copy of that code is needed at every return. Conceivably javac could merge all the return paths through a single return with a single copy of the code but it doesn't do that. You could reshape your code to look like that if you wanted to avoid multiple copies. tom > > -- Christian > From Thomas.Rodriguez at Sun.COM Mon Nov 30 10:17:13 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Mon, 30 Nov 2009 10:17:13 -0800 Subject: +PrintInlining falsly? says: never executed In-Reply-To: <4B0EACDA.3070801@gmx.de> References: <4B0EACDA.3070801@gmx.de> Message-ID: <6DE76E38-BC2D-46E3-97F1-093EE69256F9@sun.com> On Nov 26, 2009, at 8:29 AM, Ulf Zibis wrote: > Having: > VM option 'MaxInlineSize=250' > VM option '+PrintCompilation' > VM option '+PrintInlining' > > I get: > @ 180 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode never executed > and > static sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder.decode(BBI[C[II)Ljava/nio/charset/CoderResult; > interpreter_invocation_count: 10001 > invocation_counter: 5001 > backedge_counter: 1 Where did this output come from? Was it printed at the time it was being checked for inlining? The "never executed" logic is in src/share/vm/opto/bytecodeInfo.cpp and it's simply checking that the invocation counter is non-zero. Are you saying that it's actually non-zero but we see it as zero? > > Why PrintInlining says: "never executed" ? > > > > Having only: > VM option '+PrintInlining' > > I get: > @ 181 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode too big method size: below is the size of the method object. code size: is the size of the bytecodes. 
tom

>
>
> For the method size refer:
> {method}
>  - klass: {other class}
>  - method holder: 'sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder'
>  - constants: 0x085562d4{constant pool}
>
>  - access: 0x81000000 - name: 'decode'
>  - signature: '(BBI[C[II)Ljava/nio/charset/CoderResult;'
>  - max stack: 6
>  - max locals: 9
>  - size of params: 7
>  - method size: 20
>  - vtable index: 17
>  - i2i entry: 0x00acb6a0
>
>  - adapter: 0x03d431f0
>  - compiled entry 0x00b91070
>  - code size: 189
>  - code start: 0x1425d730
>  - code end (excl): 0x1425d7ed
>  - method data: 0x142c5e08
>  - checked ex length: 0
>  - linenumber start: 0x1425d7ed
>  - localvar length: 9
>  - localvar start: 0x1425d802
>
>
> -Ulf
>
>

From Thomas.Rodriguez at Sun.COM Mon Nov 30 11:09:36 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Mon, 30 Nov 2009 11:09:36 -0800
Subject: Multiple copies of same code
In-Reply-To: <4B0C568B.10600@gmx.de>
References: <4B07F11B.5060804@gmx.de> <4149a0430911220859k2daad86crb87f6b81ab05eadc@mail.gmail.com> <4B0C568B.10600@gmx.de>
Message-ID:

On Nov 24, 2009, at 1:56 PM, Ulf Zibis wrote:

> I think, it's not only the code size that matters, but too the performance lack from all these jumps.

I wasn't suggesting that duplication of code is always irrelevant, just that in the particular case of the exception entry points it would be impossible to measure an improvement from their elimination because the overall path is so expensive that a short jump wouldn't be noticeable.

tom

>
> In the method code below, you see a 2-line finally block. Looking at the compile result, I can see, that this block is repeated 6 times and consumes 1/3 of the whole assembly code for this method. Additionally, there are plenty of range-check and null-check block which too seem to be copy-and-pasted, so I guess, removing the redundant blocks from this example would make the code half-sized.
>
> On the other hand, the 1-length int [] dp could be optimized to a normal int field and pushing the 6 parameters to stack could be saved, if method decode() would be inlined, but isn't because of inline threshold, which sadly isn't frequency-related. This would additionally increase the performance.
>
>
>    private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
>
>        byte[] sa = src.array();
>        int sp = src.arrayOffset() + src.position();
>        int sl = sp + src.remaining();
>
>        char[] da = dst.array();
>        int [] dp = new int[1];
>        dp[0] = dst.arrayOffset() + dst.position();
>        int dl = dp[0] + dst.remaining();
>        try {
>            while (sp < sl) {
>                CoderResult result;
>                byte byte1 = sa[sp];
>                if (byte1 >= 0) { // ASCII G0
>                    if (dp[0] == dl)
>                        return CoderResult.OVERFLOW;
>                    da[dp[0]++] = (char)(byte1 & 0xff);
>                    sp++;
>                } else if (byte1 != SS2) { // Codeset 1 G1
>                    if (sp + 1 == sl)
>                        break;
>                    result = decode(byte1, sa[sp+1], 0, da, dp, dl);
>                    if (result != null)
>                        return result;
>                    sp += 2;
>                } else { // Codeset 2 G2
>                    if (sp + 4 > sl)
>                        break;
>                    int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
>                    if (cnsPlane < 0)
>                        return CoderResult.malformedForLength(2);
>                    result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
>                    if (result != null)
>                        return result;
>                    sp += 4;
>                }
>            }
>            return CoderResult.UNDERFLOW;
>        } finally {
>            src.position(sp - src.arrayOffset());
>            dst.position(dp[0] - dst.arrayOffset());
>        }
>    }
>
>
> -Ulf
>
>
> On 22.11.2009 17:59, Chuck Rasbold wrote:
>> Sure. It would be great to merge redundant code paths. But I don't
>> think the cost/benefit ratio is worth it.
>>
>> In the case you cite, there would be a savings of 4 bytes per path
>> removed, which are projected to be very infrequent. In a JIT, you
>> have to spend your compilation budget wisely.
>>
>> It's not that it can't be done. There are just better places to spend time.
>>
>> On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis wrote:
>> In output of PrintAssembly I frequently see :
>>
>> ...
>> ...
       # more than 10 recurrences
>> ...
>> 726   B108: # B114 <- B10  Freq: 9.99898e-006
>> 726   # exception oop is in EAX; no code emitted
>> 726   MOV    ECX,EAX
>> 728   JMP,s  B114
>> 728
>> 72a   B109: # B114 <- B9  Freq: 9.99918e-006
>> 72a   # exception oop is in EAX; no code emitted
>> 72a   MOV    ECX,EAX
>> 72c   JMP,s  B114
>> 72c
>> 72e   B110: # B114 <- B6  Freq: 9.99938e-006
>> 72e   # exception oop is in EAX; no code emitted
>> 72e   MOV    ECX,EAX
>> 730   JMP,s  B114
>> 730
>> 732   B111: # B114 <- B4  Freq: 9.99959e-006
>> 732   # exception oop is in EAX; no code emitted
>> 732   MOV    ECX,EAX
>> 734   JMP,s  B114
>> 734
>> 736   B112: # B114 <- B3  Freq: 9.99979e-006
>> 736   # exception oop is in EAX; no code emitted
>> 736   MOV    ECX,EAX
>> 738   JMP,s  B114
>> 738
>> 73a   B113: # B114 <- B2  Freq: 9.99999e-006
>> 73a   # exception oop is in EAX; no code emitted
>> 73a   MOV    ECX,EAX
>> 73a
>> 73c   B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>>
>>
>> Wouldn't it be better to have :
>>
>> ...
>> ...   # more than 10 recurrences
>> ...
>> 73a   B108: # B114 <- B10  Freq: 9.99898e-006
>> 73a   B109: # B114 <- B9  Freq: 9.99918e-006
>> 73a   B110: # B114 <- B6  Freq: 9.99938e-006
>> 73a   B111: # B114 <- B4  Freq: 9.99959e-006
>> 73a   B112: # B114 <- B3  Freq: 9.99979e-006
>> 73a   B113: # B114 <- B2  Freq: 9.99999e-006
>> 73a   # exception oop is in EAX; no code emitted
>> 73a   MOV    ECX,EAX
>> 73a
>> 73c   B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>>
>>
>>

From Vladimir.Kozlov at Sun.COM Mon Nov 30 12:35:49 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Mon, 30 Nov 2009 12:35:49 -0800
Subject: Request for reviews (XL): 6894206: JVM needs a way to traverse method handle structures
In-Reply-To: <1256228380.833.22.camel@macbook>
References: <1256228380.833.22.camel@macbook>
Message-ID: <4B142CA5.20104@sun.com>

vmSymbols.cpp

I think you should define and assert check log2_FLAG_LIMIT as we do for log2_SID_LIMIT.
Also log2_SID_LIMIT == 10 and with log2_FLAG_LIMIT == 4 you leave only 6 bits
(63, 31 without sign) for klass id. Will it be enough?
You should add assert to check it in vmSymbols::initialize().

Your asserts should use max values, something like this:

!   assert(((ID4(1021,1022,1023,15) >> shift) & mask) == 1021, "");

which will fail because of the above (you have to use 31 instead of 1021).

Also last methods are not for printouts only so you need to modify the comment.
Or method_for() is called only for debug output? Then you should keep #ifndef PRODUCT.

And you miss #undef VM_INTRINSIC_INFO

src/share/vm/utilities/growableArray.hpp

Remove your change and use Tom's insert_before() method he added for
6892658: C2 should optimize some stringbuilder
(when he pushes the change into HS17).

src/share/vm/prims/methodHandleWalk.hpp

Add assert in make_stack_value() to verify that TokenType and BasicType
values fit into 4 bits.
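
[Editor's illustration] The kind of range check requested above might look roughly like the following Java sketch. The class name, the method names and the bit layout are hypothetical; the real make_stack_value() is C++ and its layout is not shown in this thread. The only thing taken from the review is the requirement that each of the two values fits into 4 bits.

```java
// Hypothetical illustration, not HotSpot code: pack two small values into one
// int and reject anything that would not fit into its 4-bit field.
final class NibblePackSketch {

    static final int FIELD_BITS = 4;
    static final int FIELD_MASK = (1 << FIELD_BITS) - 1; // 0xF

    // Packs tokenType into bits 4..7 and basicType into bits 0..3.
    static int pack(int tokenType, int basicType) {
        if ((tokenType & ~FIELD_MASK) != 0)
            throw new IllegalArgumentException("tokenType does not fit into 4 bits: " + tokenType);
        if ((basicType & ~FIELD_MASK) != 0)
            throw new IllegalArgumentException("basicType does not fit into 4 bits: " + basicType);
        return (tokenType << FIELD_BITS) | basicType;
    }

    static int tokenTypeOf(int packed) {
        return (packed >>> FIELD_BITS) & FIELD_MASK;
    }

    static int basicTypeOf(int packed) {
        return packed & FIELD_MASK;
    }

    public static void main(String[] args) {
        int packed = pack(15, 9);
        System.out.println(tokenTypeOf(packed) + " " + basicTypeOf(packed));
    }
}
```

Without such a check, an oversized value would silently spill into the neighbouring field, which is presumably the failure mode the review comment wants an assert to catch.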

src/share/vm/prims/methodHandleWalk.cpp

In compute_bound_arg_type() missing check:

  if (arg_slot >= m->size_of_parameters())  return T_VOID;

 107   if (!m->is_static()) {
 108     cur_slot -= type2size[T_OBJECT];
 109     if (cur_slot == arg_slot)
 110       arg_type = T_OBJECT;
           ^ return T_OBJECT;
 111   }

 115     if (cur_slot < arg_slot) {
                      ^ <=
 116       if (cur_slot == arg_slot)
 117         arg_type = bt;
 118       break;
 119     }

MethodHandleCompiler::make_invoke()

 567     Unimplemented();   <<<< ????????????????

argument_count_slow() is used only in assert.
Maybe add #ifdef ASSERT around it.

Vladimir

Christian Thalinger wrote:
> Here is the first part of the method handle walker:
>
> http://cr.openjdk.java.net/~twisti/6894206/webrev.01/
>
> The bytecode adapter generation part for the compilers is another
> webrev, which will be posted later.
>
> -- Christian
>

From Ulf.Zibis at gmx.de Mon Nov 30 15:32:16 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 01 Dec 2009 00:32:16 +0100
Subject: Compiled call version seems to be slower
In-Reply-To:
References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> <1259325882.875.113.camel@macbook>
Message-ID: <4B145600.3000008@gmx.de>

On 30.11.2009 19:10, Tom Rodriguez wrote:
>>>>>>> Additionally I'm wondering why the finally block is copy-and-pasted
>>>>>>> for
>>>>>>> each separate return.
>>>>>>> Is that as disired ?
>>>>>>>
>>>>>>>
>>> Sorry about asking once more. Would it be so hard to avoid the 6-times
>>> redundancy of the finally block, or are there other reasons?
>>>
>> Well, I guess no, but everyone is busy with more important stuff. But
>> hey, it's open source :-)
>>
>
> The 6 copies of the finally block are there in the bytecodes.

Yes, you are right.

> It's not something hotspot is creating. A finally is executed on every return path so a copy of that code is needed at every return.
Conceivably javac could merge all the return paths through a single return with a single copy of the code but it doesn't do that. So I should file a bug against javac ? > You could reshape your code to look like that if you wanted to avoid multiple copies. > Hm, any idea how to do that ? -Ulf From Ulf.Zibis at gmx.de Mon Nov 30 15:48:59 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 01 Dec 2009 00:48:59 +0100 Subject: +PrintInlining falsly? says: never executed In-Reply-To: <6DE76E38-BC2D-46E3-97F1-093EE69256F9@sun.com> References: <4B0EACDA.3070801@gmx.de> <6DE76E38-BC2D-46E3-97F1-093EE69256F9@sun.com> Message-ID: <4B1459EB.4030104@gmx.de> Tom, thanks for looking at this. Am 30.11.2009 19:17, Tom Rodriguez schrieb: > On Nov 26, 2009, at 8:29 AM, Ulf Zibis wrote: > > >> Having: >> VM option 'MaxInlineSize=250' >> VM option '+PrintCompilation' >> VM option '+PrintInlining' >> >> I get: >> @ 180 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode never executed >> and >> static sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder.decode(BBI[C[II)Ljava/nio/charset/CoderResult; >> interpreter_invocation_count: 10001 >> invocation_counter: 5001 >> backedge_counter: 1 >> > > Where did this output come from? Was it printed at the time it was being checked for inlining? I comes from -XX:PrintAssemblyOptions. Yes, it was printed at same time. The complete set of options was: -XX:MaxInlineSize=250 \ -XX:+PrintCompilation \ -XX:+PrintInlining \ -XX:+UnlockDiagnosticVMOptions -Xbatch \ -XX:PrintAssemblyOptions=hsdis-print-bytes \ -XX:CompileCommand=print,sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decodeArrayLoop \ -XX:CompileCommand=print,sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode BTW: what is the difference to -XX:FreqInlineSize=250 ? I didn't get any effect. > The "never executed" logic is in src/share/vm/opto/bytecodeInfo.cpp and it's simply checking that the invocation counter is non-zero. 
Are you saying that it's actually non-zero but we see it as zero? > > >> Why PrintInlining says: "never executed" ? >> >> >> >> Having only: >> VM option '+PrintInlining' >> >> I get: >> @ 181 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode too big >> > > method size: below is the size of the method object. code size: is the size of the bytecodes. > Thanks, yes, I know. That's why I set MaxInlineSize above 189. -Ulf > tom > > >> For the method size refer: >> {method} >> - klass: {other class} >> - method holder: 'sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder' >> - constants: 0x085562d4{constant pool} >> >> - access: 0x81000000 - name: 'decode' >> - signature: '(BBI[C[II)Ljava/nio/charset/CoderResult;' >> - max stack: 6 >> - max locals: 9 >> - size of params: 7 >> - method size: 20 >> - vtable index: 17 >> - i2i entry: 0x00acb6a0 >> >> - adapter: 0x03d431f0 >> - compiled entry 0x00b91070 >> - code size: 189 >> - code start: 0x1425d730 >> - code end (excl): 0x1425d7ed >> - method data: 0x142c5e08 >> - checked ex length: 0 >> - linenumber start: 0x1425d7ed >> - localvar length: 9 >> - localvar start: 0x1425d802 >> >> >> -Ulf >> >> >> > > > From rasbold at google.com Mon Nov 30 15:53:19 2009 From: rasbold at google.com (Chuck Rasbold) Date: Mon, 30 Nov 2009 15:53:19 -0800 Subject: Compiled call version seems to be slower In-Reply-To: <4B145600.3000008@gmx.de> References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> <1259325882.875.113.camel@macbook> <4B145600.3000008@gmx.de> Message-ID: <4149a0430911301553g4f3c24b7o712ef0071349ff55@mail.gmail.com> On Mon, Nov 30, 2009 at 3:32 PM, Ulf Zibis wrote: > Am 30.11.2009 19:10, Tom Rodriguez schrieb: > > Additionally I'm wondering why the finally block is copy-and-pasted >>>>>>>> for each separate return. >>>>>>>> Is that as disired ? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> Sorry about asking once more. 
Would it be so hard to avoid the >>>> 6-times redundancy of the finally block, or are there other reasons? >>>> >>>> >>> Well, I guess no, but everyone is busy with more important stuff. But >>> hey, it's open source :-) >>> >>> >> >> The 6 copies of the finally block are there in the bytecodes. >> > > Yes, you are right. > > > It's not something hotspot is creating. A finally is executed on every >> return path so a copy of that code is needed at every return. Conceivably >> javac could merge all the return paths through a single return with a single >> copy of the code but it doesn't do that. >> > > So I should file a bug against javac ? I think javac does this by design in order to eliminate jsr/ret bytecodes. Others could speak with more authority about javac. However, I'm sure the the implementors would not call it a bug. > > > You could reshape your code to look like that if you wanted to avoid >> multiple copies. >> >> > > Hm, any idea how to do that ? > Get rid of the try/finally. Is this semantically equivalent to your code? 

    private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        int sl = sp + src.remaining();

        char[] da = dst.array();
        int [] dp = new int[1];
        dp[0] = dst.arrayOffset() + dst.position();
        int dl = dp[0] + dst.remaining();
        CoderResult result = CoderResult.UNDERFLOW;
        while (sp < sl) {
            byte byte1 = sa[sp];
            if (byte1 >= 0) { // ASCII G0
                if (dp[0] == dl) {
                    result = CoderResult.OVERFLOW;
                    break;
                }
                da[dp[0]++] = (char)(byte1 & 0xff);
                sp++;
            } else if (byte1 != SS2) { // Codeset 1 G1
                if (sp + 1 == sl) {
                    break;
                }
                result = decode(byte1, sa[sp+1], 0, da, dp, dl);
                if (result != null)
                    break;
                sp += 2;
            } else { // Codeset 2 G2
                if (sp + 4 > sl)
                    break;
                int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
                if (cnsPlane < 0) {
                    result = CoderResult.malformedForLength(2);
                    break;
                }
                result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
                if (result != null)
                    break;
                sp += 4;
            }
        }
        // decode() returns null on success, so a null result here means the
        // input ran out, which the try/finally version reports as UNDERFLOW
        if (result == null)
            result = CoderResult.UNDERFLOW;
        src.position(sp - src.arrayOffset());
        dst.position(dp[0] - dst.arrayOffset());
        return result;
    }

> -Ulf
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20091130/9a7339c7/attachment-0001.html

From Thomas.Rodriguez at Sun.COM Mon Nov 30 16:01:09 2009
From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez)
Date: Mon, 30 Nov 2009 16:01:09 -0800
Subject: Compiled call version seems to be slower
In-Reply-To: <4B145600.3000008@gmx.de>
References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> <1259325882.875.113.camel@macbook> <4B145600.3000008@gmx.de>
Message-ID: <7261756B-4273-4471-952F-0D948EC305A3@Sun.COM>

On Nov 30, 2009, at 3:32 PM, Ulf Zibis wrote:

> On 30.11.2009 19:10, Tom Rodriguez wrote:
>>>>>>>> Additionally I'm wondering why the finally block is copy-and-pasted
>>>>>>>> for each separate return.
>>>>>>>> Is that as disired ?
>>>>>>>>
>>>>>>>>
>>>> Sorry about asking once more. Would it be so hard to avoid the 6-times redundancy of the finally block, or are there other reasons?
>>>>
>>> Well, I guess no, but everyone is busy with more important stuff. But
>>> hey, it's open source :-)
>>>
>>
>> The 6 copies of the finally block are there in the bytecodes.
>
> Yes, you are right.
>
>> It's not something hotspot is creating. A finally is executed on every return path so a copy of that code is needed at every return. Conceivably javac could merge all the return paths through a single return with a single copy of the code but it doesn't do that.
>
> So I should file a bug against javac ?

You could. There's no reason not to. Maybe from their perspective it makes mapping back to the original source harder since mapping the return bytecode back to the original wouldn't work the same. Anyway, I doubt it's a near term path to happiness.

>
>> You could reshape your code to look like that if you wanted to avoid multiple copies.
>>
>
> Hm, any idea how to do that ?

You need to capture all possible return values into a local variable and return that in one place. You'll probably need to use a break to get out of the loop. So a structure something like this:

int result = -1;
while (foo) {
  if (cond) {
    result = 1;
    break;
  }
  if (cond2) {
    result = 2;
    break;
  }
}
return result;

it's ugly so make sure it's worth it. There are probably other ways of structuring it too.
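
[Editor's illustration] The two shapes discussed above can be put side by side in a small, self-contained Java sketch. Everything in it (the class name, the scan methods, the CLEANUPS list that stands in for the position() updates) is hypothetical and not taken from the decoder in this thread. One difference is worth keeping in mind: the single-exit version skips its cleanup if the loop throws, whereas finally runs in that case too.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the pattern under discussion, not code from this thread.
final class SingleExitSketch {

    // Records each cleanup run; stands in for the work done in the finally block.
    static final List<String> CLEANUPS = new ArrayList<>();

    // Several returns inside try/finally: the cleanup has to run on every exit path.
    static int scanWithFinally(int[] data) {
        int i = 0;
        try {
            while (i < data.length) {
                if (data[i] < 0)
                    return 1;
                if (data[i] > 100)
                    return 2;
                i++;
            }
            return 0;
        } finally {
            CLEANUPS.add("finally at index " + i);
        }
    }

    // Single exit: results are captured in a local, break leaves the loop,
    // and the cleanup appears exactly once in the source.
    static int scanSingleExit(int[] data) {
        int i = 0;
        int result = 0;
        while (i < data.length) {
            if (data[i] < 0) {
                result = 1;
                break;
            }
            if (data[i] > 100) {
                result = 2;
                break;
            }
            i++;
        }
        CLEANUPS.add("single exit at index " + i);
        return result;
    }

    public static void main(String[] args) {
        int[][] inputs = { {1, 2, 3}, {1, -5, 3}, {1, 200}, {} };
        for (int[] in : inputs) {
            System.out.println(scanWithFinally(in) + " " + scanSingleExit(in));
        }
        System.out.println(CLEANUPS.size() + " cleanups recorded");
    }
}
```

Compiling this and running javap -c on the class is one way to compare how often the cleanup call shows up in the bytecode of each method.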
tom > > -Ulf > > From Vladimir.Kozlov at Sun.COM Mon Nov 30 16:01:53 2009 From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov) Date: Mon, 30 Nov 2009 16:01:53 -0800 Subject: Request for reviews (L): 6893081: method handle & invokedynamic code needs additional cleanup (post 6815692, 6858164) In-Reply-To: <1256755697.3638.18.camel@macbook> References: <1256069437.820.123.camel@macbook> <1256755697.3638.18.camel@macbook> Message-ID: <4B145CF1.9070101@sun.com> Christian Thalinger wrote: > On Tue, 2009-10-20 at 22:10 +0200, Christian Thalinger wrote: >> This patch is one of the first ones: >> >> http://cr.openjdk.java.net/~twisti/6893081/webrev.01/ > > I fixed a small bug. Here is the updated webrev: > > http://cr.openjdk.java.net/~twisti/6893081/webrev.02/ > > -- Christian > src/cpu/x86/vm/methodHandles_x86.cpp Need comment update, 16+8+8+8 == 40 678 // original 32-bit vmdata word must be of this form: 679 // | MBZ:16 | signBitCount:8 | srcDstTypes:8 | conversionOp:8 | Use short jmpb() : 688 __ jmp(done); src/cpu/x86/vm/sharedRuntime_x86_64.cpp + // we be generated. ^ will src/cpu/x86/vm/templateInterpreter_x86_<32|64>.cpp Do you have a bug for this problem to not forget about it? Vladimir From Thomas.Rodriguez at Sun.COM Mon Nov 30 16:12:43 2009 From: Thomas.Rodriguez at Sun.COM (Tom Rodriguez) Date: Mon, 30 Nov 2009 16:12:43 -0800 Subject: +PrintInlining falsly? says: never executed In-Reply-To: <4B1459EB.4030104@gmx.de> References: <4B0EACDA.3070801@gmx.de> <6DE76E38-BC2D-46E3-97F1-093EE69256F9@sun.com> <4B1459EB.4030104@gmx.de> Message-ID: <8C19049F-1EED-43EB-9C8A-3AC9377AF68D@Sun.COM> >>> I get: >>> @ 180 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode never executed >>> and >>> static sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder.decode(BBI[C[II)Ljava/nio/charset/CoderResult; >>> interpreter_invocation_count: 10001 >>> invocation_counter: 5001 >>> backedge_counter: 1 >>> >> >> Where did this output come from? 
>> Was it printed at the time it was being checked for inlining?
>
> I comes from -XX:PrintAssemblyOptions. Yes, it was printed at same time.

I don't see how PrintAssemblyOptions could produce that output. I think those counts come from the CompileCommand=print option below and those are printed at the end of the run. So I'm guessing that at the point the compile was occurring decode actually hadn't been called. This can sometimes result from inlining. What was the whole inline tree? I think you'd have to look into the debugger to make sure.

>
> The complete set of options was:
> -XX:MaxInlineSize=250 \
> -XX:+PrintCompilation \
> -XX:+PrintInlining \
> -XX:+UnlockDiagnosticVMOptions -Xbatch \
> -XX:PrintAssemblyOptions=hsdis-print-bytes \
> -XX:CompileCommand=print,sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decodeArrayLoop \
> -XX:CompileCommand=print,sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode
>
> BTW: what is the difference to -XX:FreqInlineSize=250 ? I didn't get any effect.

FreqInlineSize applies when the call site looks frequent using either InlineFrequencyRatio or InlineFrequencyCount. Again check out bytecodeInfo.cpp and search for freq_inline_size.

tom

>
>
>> The "never executed" logic is in src/share/vm/opto/bytecodeInfo.cpp and it's simply checking that the invocation counter is non-zero. Are you saying that it's actually non-zero but we see it as zero?
>>
>>
>>> Why PrintInlining says: "never executed" ?
>>>
>>>
>>>
>>> Having only:
>>> VM option '+PrintInlining'
>>>
>>> I get:
>>> @ 181 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode too big
>>>
>>
>> method size: below is the size of the method object. code size: is the size of the bytecodes.
>>
>
> Thanks, yes, I know. That's why I set MaxInlineSize above 189.
> > -Ulf > >> tom >> >> >>> For the method size refer: >>> {method} >>> - klass: {other class} >>> - method holder: 'sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder' >>> - constants: 0x085562d4{constant pool} >>> >>> - access: 0x81000000 - name: 'decode' >>> - signature: '(BBI[C[II)Ljava/nio/charset/CoderResult;' >>> - max stack: 6 >>> - max locals: 9 >>> - size of params: 7 >>> - method size: 20 >>> - vtable index: 17 >>> - i2i entry: 0x00acb6a0 >>> >>> - adapter: 0x03d431f0 >>> - compiled entry 0x00b91070 >>> - code size: 189 >>> - code start: 0x1425d730 >>> - code end (excl): 0x1425d7ed >>> - method data: 0x142c5e08 >>> - checked ex length: 0 >>> - linenumber start: 0x1425d7ed >>> - localvar length: 9 >>> - localvar start: 0x1425d802 >>> >>> >>> -Ulf >>> >>> >>> >> >> >> > From Ulf.Zibis at gmx.de Mon Nov 30 16:26:24 2009 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Tue, 01 Dec 2009 01:26:24 +0100 Subject: Compiled call version seems to be slower In-Reply-To: <7261756B-4273-4471-952F-0D948EC305A3@Sun.COM> References: <4B085F54.6000207@gmx.de> <1258974061.1712.72.camel@macbook> <4B0DB9D3.6050404@gmx.de> <1259224051.875.3.camel@macbook> <4B0E96D2.1050508@gmx.de> <1259325882.875.113.camel@macbook> <4B145600.3000008@gmx.de> <7261756B-4273-4471-952F-0D948EC305A3@Sun.COM> Message-ID: <4B1462B0.7070101@gmx.de> Am 01.12.2009 01:01, Tom Rodriguez schrieb: > On Nov 30, 2009, at 3:32 PM, Ulf Zibis wrote: > > >>> You could reshape your code to look like that if you wanted to avoid multiple copies. >>> >>> >> Hm, any idea how to do that ? >> > > You need to capture all possible return values into a local variable and return that in one place. You'll probably need to use a break to get out of the loop. So a structure something like this: > > int result = -1; > while (foo) { > if (cond) { > result = 1; > break; > } > if (cond2) { > result = 2; > break; > } > } > return result; > > it's ugly so make sure it's worth it. There are probably other ways of structuring it too. 
>

Yes, same as Chuck proposed, that's a possible solution.

I guess you agree that this javac behavior makes the try-finally-without-catch construct something to avoid, but it is widely used and often recommended in Java books.

-Ulf

From Ulf.Zibis at gmx.de Mon Nov 30 17:00:46 2009
From: Ulf.Zibis at gmx.de (Ulf Zibis)
Date: Tue, 01 Dec 2009 02:00:46 +0100
Subject: +PrintInlining falsly? says: never executed
In-Reply-To: <8C19049F-1EED-43EB-9C8A-3AC9377AF68D@Sun.COM>
References: <4B0EACDA.3070801@gmx.de> <6DE76E38-BC2D-46E3-97F1-093EE69256F9@sun.com> <4B1459EB.4030104@gmx.de> <8C19049F-1EED-43EB-9C8A-3AC9377AF68D@Sun.COM>
Message-ID: <4B146ABE.7010101@gmx.de>

On 01.12.2009 01:12, Tom Rodriguez wrote:
>>>> I get:
>>>> @ 180 sun.nio.cs.ext.EUC_TW_C_d_b_codeToBuffer4$Decoder::decode never executed
>>>> and
>>>> static sun/nio/cs/ext/EUC_TW_C_d_b_codeToBuffer4$Decoder.decode(BBI[C[II)Ljava/nio/charset/CoderResult;
>>>> interpreter_invocation_count: 10001
>>>> invocation_counter: 5001
>>>> backedge_counter: 1
>>>>
>>>>
>>> Where did this output come from? Was it printed at the time it was being checked for inlining?
>>>
>> It comes from -XX:PrintAssemblyOptions. Yes, it was printed at the same time.
>>
>
> I don't see how PrintAssemblyOptions could produce that output. I think those counts come from the CompileCommand=print option below, and those are printed at the end of the run. So I'm guessing that at the point the compile was occurring decode actually hadn't been called.

I suspect this happened here, because the decode() method should have been executed as many times as the JIT threshold for compiling the frequent branches of decodeArrayLoop().

> That can sometimes result from inlining. What was the whole inline tree? I think you'd have to look into the debugger to make sure.
>

You can find my sources here (in the test tree):
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/branches/j7_EUC_TW/?rev=833

The complete hsdis outputs are in the log folder.

-Ulf

From Vladimir.Kozlov at Sun.COM Mon Nov 30 17:16:47 2009
From: Vladimir.Kozlov at Sun.COM (Vladimir Kozlov)
Date: Mon, 30 Nov 2009 17:16:47 -0800
Subject: Request for reviews (XL): 6829187: compiler optimizations required for JSR 292
In-Reply-To: <1256756457.3638.21.camel@macbook>
References: <1256756457.3638.21.camel@macbook>
Message-ID: <4B146E7F.5090800@sun.com>

Christian Thalinger wrote:
> This is the JSR 292 C2 compiler support:
>
> http://cr.openjdk.java.net/~twisti/6829187/webrev.01/
>
> -- Christian
>

src/cpu/x86/vm/frame_x86.inline.hpp

Did you miss the following check, or was that intentional?

  && last_sp >= sp()

src/cpu/x86/vm/x86_{32|64}.ad

Instead of using "/MethodHandle" in format %{%} I would fix MachCallStaticJavaNode::dump_spec().

Why can't you add the effect KILL ebp to CallStaticJavaHandle()? If you do, do you still need the idealreg2mhdebugmask[] masks?

src/share/vm/ci/ciCPCache.hpp.html

  27 // The class represents and entry in the constant pool cache.
                             ^ an ???

Should the class be called ciCPCacheEntry?

Vladimir