From tom.deneau at amd.com Thu Nov 7 13:22:41 2013
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 7 Nov 2013 21:22:41 +0000
Subject: non-foreign-call tlab refill from hsail
In-Reply-To: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com>
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com>
Message-ID:

Doug --

Trying to see if I understand how these pieces fit together.

NewObjectSnippets.allocateInstance makes a call to NewInstanceStubCall.call if the current tlab does not have enough room.

NewInstanceStubCall.call looks up the ForeignCallLinkage and finds that it is not a simple foreign call to a specific foreign call address (its address is 0) but instead has a stub associated with it. I think this association came from the call to

    link(new NewInstanceStub(providers, target, registerStubCall(NEW_INSTANCE, REEXECUTABLE, NOT_LEAF, ANY_LOCATION)));

in HotSpotHostForeignCallsProvider.java

So when we try to finalizeAddress for the ForeignCallLinkage we end up compiling this stub.

The stub is a SnippetStub implemented with the snippet called "newInstance" in NewInstanceStub.java and tries to get a new tlab using CAS operations. If this stub cannot get a new tlab it makes a "real" foreign call using

    newInstanceC(NEW_INSTANCE_C, thread(), hub);

which ends up going to the graalRuntime::new_instance

-----Original Message-----
From: Doug Simon [mailto:doug.simon at oracle.com]
Sent: Tuesday, October 22, 2013 4:42 AM
To: Deneau, Tom
Cc: graal-dev at openjdk.java.net; sumatra-dev at openjdk.java.net
Subject: Re: non-foreign-call tlab refill from hsail

On Oct 22, 2013, at 12:18 AM, "Deneau, Tom" wrote:

> We are experimenting with object (and array) allocation from an HSA device (using graal for the HSAIL codegen). Where we are now:
>
> * the hsa workitems are using TLABs from "donor threads" who exist
>   just to supply TLABs and don't do any allocation themselves.
>
> * To reduce the number of donor threads required, a TLAB can be
>   used by more than one workitem, in which case the workitems use
>   HSAIL atomic_add instructions to bump the tlab top pointer.
>
> * the HSAIL backend has its own fastpath allocation snippets
>   which generate the HSAIL atomic_add instructions and which
>   override the snippets in NewObjectSnippets.java
>
> Some junit tests have been written and pass which allocate objects, or arrays of primitives, or arrays of objects.
>
> All the above only works for the fastpath case, i.e., if there is indeed enough space in the donor TLAB. We realize there are other cases:
>
> a) not enough space in current TLAB but ability to allocate a new TLAB.
>
> b) not able to allocate a new TLAB, GC required.
>
> For only case a) above, we would like to experiment with grabbing the new TLAB from HSAIL without making a "foreign call" to the VM. From the hotspot code, I assume the logic required is what one sees in
>
> mutableSpace::cas_allocate(size_t size) at least for the non-G1 case.

When the NewInstanceStub fails to allocate a new TLAB, it calls out to GraalRuntime::new_instance (in graalRuntime.cpp).

> Some of this non-foreign-call allocation logic appears to exist in the Snippet called NewInstanceStub.newInstance (as opposed to the NewObjectSnippets.allocateInstance snippet, which is what we are currently overriding). The comments for this snippet say
>
> "Re-attempts allocation after an initial TLAB allocation failed or
> was skipped (e.g., due to -XX:-UseTLAB)."
>
> Is this NewInstanceStub.newInstance snippet actually used anywhere in the current graal framework?

Yes, you can see a call to NewInstanceStubCall in NewObjectSnippets.allocateInstance.

> Is this a starting point we could use to get a non-foreign-call TLAB refill working?

Yes. Note that this call *is* a foreign call (see the javadoc for ForeignCallDescriptor).
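The allocation tiers described in this message (atomic-add bump in a shared TLAB, CAS-based TLAB refill, then a call into the VM) can be pictured with a small toy model. This is only a sketch under stated assumptions: it is not Graal or HotSpot code, addresses are plain longs, and the names TlabSketch, Eden, SharedTlab and Allocator are hypothetical. It runs single-threaded; replacing a TLAB that several workitems share at the same time would need extra coordination that is not shown.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the allocation tiers; hypothetical names, not Graal/HotSpot code. */
public class TlabSketch {

    static final long FAILED = -1;

    /** Shared heap region: chunks are carved out with a CAS retry loop. */
    static final class Eden {
        private final AtomicLong top;
        private final long end;

        Eden(long start, long end) {
            this.top = new AtomicLong(start);
            this.end = end;
        }

        /** Returns the start of a chunk of the given size, or FAILED if it does not fit. */
        long casAllocate(long size) {
            while (true) {
                long oldTop = top.get();
                long newTop = oldTop + size;
                if (newTop > end) {
                    return FAILED;
                }
                if (top.compareAndSet(oldTop, newTop)) {
                    return oldTop;
                }
                // Another allocator moved top in the meantime: retry.
            }
        }
    }

    /** A TLAB that may be shared: its top is bumped with an atomic add. */
    static final class SharedTlab {
        private final AtomicLong top;
        private final long end;

        SharedTlab(long start, long size) {
            this.top = new AtomicLong(start);
            this.end = start + size;
        }

        /** An overshooting add leaves top past end, so later bumps fail as well. */
        long bump(long size) {
            long oldTop = top.getAndAdd(size);
            return oldTop + size <= end ? oldTop : FAILED;
        }
    }

    static final class Allocator {
        private final Eden eden;
        private final long tlabSize;
        private SharedTlab tlab;

        Allocator(Eden eden, long tlabSize) {
            long start = eden.casAllocate(tlabSize);
            if (start == FAILED) {
                throw new IllegalArgumentException("initial TLAB does not fit");
            }
            this.eden = eden;
            this.tlabSize = tlabSize;
            this.tlab = new SharedTlab(start, tlabSize);
        }

        long allocate(long size) {
            long addr = tlab.bump(size);             // fast path: current TLAB
            if (addr != FAILED) {
                return addr;
            }
            long start = eden.casAllocate(tlabSize); // medium path: grab a new TLAB
            if (start == FAILED) {
                return FAILED;                       // slow path: a real VM would be called here
            }
            tlab = new SharedTlab(start, tlabSize);
            return tlab.bump(size);
        }
    }

    public static void main(String[] args) {
        Allocator a = new Allocator(new Eden(0, 64), 32);
        System.out.println(a.allocate(24)); // 0  (fast path)
        System.out.println(a.allocate(24)); // 32 (TLAB refilled via CAS)
        System.out.println(a.allocate(24)); // -1 (region exhausted)
    }
}
```

Keeping the refill in its own method mirrors the split discussed in the thread: only the bump is something one would want at every allocation site.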
-Doug

From doug.simon at oracle.com Thu Nov 7 14:03:37 2013
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 7 Nov 2013 23:03:37 +0100
Subject: non-foreign-call tlab refill from hsail
In-Reply-To:
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com>
Message-ID:

That is a very correct summary of the way it works!

From tom.deneau at amd.com Thu Nov 7 14:13:49 2013
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 7 Nov 2013 22:13:49 +0000
Subject: non-foreign-call tlab refill from hsail
In-Reply-To:
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com>
Message-ID:

So I was trying to understand why the NewInstanceStub.newInstance stub code was not just included in the original NewObjectSnippets.allocateInstance snippet.

-- Tom

From doug.simon at oracle.com Thu Nov 7 14:15:12 2013
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 7 Nov 2013 23:15:12 +0100
Subject: non-foreign-call tlab refill from hsail
In-Reply-To:
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com>
Message-ID: <36423EAD-CDB9-4F3B-BDD4-E210C5EE3FB5@oracle.com>

Because it is slow (well, medium) path code that we don't want to inline at every allocation site.

From tom.deneau at amd.com Thu Nov 7 14:21:40 2013
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 7 Nov 2013 22:21:40 +0000
Subject: non-foreign-call tlab refill from hsail
In-Reply-To: <36423EAD-CDB9-4F3B-BDD4-E210C5EE3FB5@oracle.com>
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com> <36423EAD-CDB9-4F3B-BDD4-E210C5EE3FB5@oracle.com>
Message-ID:

Are snippets required to inline all their calls? Or alternatively is there no way to annotate that a method should not be inlined?

-- Tom

From doug.simon at oracle.com Thu Nov 7 14:38:14 2013
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 7 Nov 2013 23:38:14 +0100
Subject: non-foreign-call tlab refill from hsail
In-Reply-To:
References: <33F60C70-33F8-40F6-AE82-DD817293856C@oracle.com> <36423EAD-CDB9-4F3B-BDD4-E210C5EE3FB5@oracle.com>
Message-ID:

On Nov 7, 2013, at 11:21 PM, Deneau, Tom wrote:

> Are snippets required to inline all their calls?

Generally speaking, yes.

> Or alternatively is there no way to annotate that a method should not be inlined?

You can use the Snippet.SnippetInliningPolicy class to control inlining during snippet preparation.

-Doug

From jjfumero at gmail.com Thu Nov 14 07:54:09 2013
From: jjfumero at gmail.com (Juan José Fumero Alfonso)
Date: Thu, 14 Nov 2013 15:54:09 +0000
Subject: Inlining in Graal
Message-ID:

Hi,

I am working on inlining with Graal in my backend. Using the log information:

    -XX:+BootstrapGraal -G:Log=InliningDecisions -XX:+PrintCompilation

I get this:

    [thread:1] scope:
    [thread:1] scope: Inlining
    [thread:1] scope: Inlining.Inlining
    [thread:1] scope: Inlining.Inlining.InliningDecisions
    inlining MapEngineOpenCL.kernelComputation@11: exact com.edinburgh.parallel.map.MapEngineOpenCL.checkInline(int):int: trivial (relevance=1.000000, probability=1.000000, bonus=1.000000, nodes=6)

With this code:

    // Testing
    public void kernelComputation(int[] input, int[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = checkInline(input[i]);
        }
    }

    private int checkInline(int a) {
        return 10 + a;
    }

So in this case the inlining is working and the result of my program is also correct. But what I want to do is the following:

    public void kernelComputation(T[] input, T[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = mInterface.f(input[i]);
        }
    }

And the method f is defined dynamically by the user:

    Integer[] result = Map.apply(data, new MapInterface() {
        @Override
        public Integer f(Integer data) {
            return data * data;
        }
    }).executeOpenCL();

What I want is to inline the f method, which is defined dynamically by the user.

    [thread:1] scope:
    [thread:1] scope: Inlining
    [thread:1] scope: Inlining.InliningDecisions
    not inlining MapEngineOpenCL.kernelComputation@14: com.edinburgh.parallel.map.MapInterface.f(Object):Object (0 bytes): no type profile exists

I do not understand why "no type profile exists". Any idea about this? I am doing the inlining as Graal shows in the tests (com.oracle.graal.compiler.test.inlining).

Thank you very much
Juanjo

From jjfumero at gmail.com Fri Nov 15 03:14:54 2013
From: jjfumero at gmail.com (Juan José Fumero Alfonso)
Date: Fri, 15 Nov 2013 11:14:54 +0000
Subject: Inlining in Graal
In-Reply-To: <5285FE5D0200000600105E13@gwia.im.jku.at>
References: <5285FE5D0200000600105E13@gwia.im.jku.at>
Message-ID:

Hi Lukas,

I do not understand your second approach. At what point is the inlining applied if I replace it with a constant?

Is it possible to force the inlining? I mean, a way to disable the heuristic for a while and perform the inlining.

Thanks
Juanjo

2013/11/15 Lukas Stadler

> Hi Juan,
>
> in your working example, the checkInline call is a private method that
> can statically be bound to a specific target, without the help of
> profiling information. Thus, it can be inlined.
> The non-working example, however, contains a virtual/interface call,
> which means that the compiler needs a hint from somewhere as to which
> method could be the target of this call.
> The inlining heuristics in Graal assume that all important code has been
> running in the interpreter for a while (i.e. thousands of times).
> I guess that you are compiling code that has never been executed in the
> interpreter?
>
> One solution would be to warm up the method in the interpreter. However,
> the profiling information accumulates, so that the call site will become
> polymorphic and again stop inlining at some point.
> In your case, the closure that does the computation is known at the time
> of compilation (I guess). You could replace the parameter with a
> constant during your compilation process, which should allow the
> compiler to devirtualize the call.
> Is that a strategy that could work for you?
>
> - Lukas
>
> Thank you very much
> Juanjo

From Lukas.Stadler at jku.at Fri Nov 15 01:58:37 2013
From: Lukas.Stadler at jku.at (Lukas Stadler)
Date: Fri, 15 Nov 2013 10:58:37 +0100
Subject: Inlining in Graal
In-Reply-To:
References:
Message-ID: <5285FE5D0200000600105E13@gwia.im.jku.at>

Hi Juan,

in your working example, the checkInline call is a private method that can statically be bound to a specific target, without the help of profiling information. Thus, it can be inlined.

The non-working example, however, contains a virtual/interface call, which means that the compiler needs a hint from somewhere as to which method could be the target of this call. The inlining heuristics in Graal assume that all important code has been running in the interpreter for a while (i.e. thousands of times). I guess that you are compiling code that has never been executed in the interpreter?

One solution would be to warm up the method in the interpreter. However, the profiling information accumulates, so that the call site will become polymorphic and again stop inlining at some point.

In your case, the closure that does the computation is known at the time of compilation (I guess). You could replace the parameter with a constant during your compilation process, which should allow the compiler to devirtualize the call. Is that a strategy that could work for you?

- Lukas

>>> Juan José Fumero Alfonso 11/14/13 4:55 PM >>>
Hi,
I am working on Inlining with Graal in my backend.
Using the Log information:

-XX:+BootstrapGraal -G:Log=InliningDecisions -XX:+PrintCompilation

I get this:

[thread:1] scope:
[thread:1] scope: Inlining
[thread:1] scope: Inlining.Inlining
[thread:1] scope: Inlining.Inlining.InliningDecisions

inlining MapEngineOpenCL.kernelComputation at 11: exact com.edinburgh.parallel.map.MapEngineOpenCL.checkInline(int):int: trivial (relevance=1.000000, probability=1.000000, bonus=1.000000, nodes=6)

With this code:

    // Testing
    public void kernelComputation(int[] input, int[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = checkInline(input[i]);
        }
    }

    private int checkInline(int a) {
        return 10 + a;
    }

So in this case the inlining is working and also the result in my program is correct. But what I want to do is the following:

    public void kernelComputation(T[] input, T[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = mInterface.f(input[i]);
        }
    }

And the method f is defined dynamically by the user:

    Integer[] result = Map.apply(data, new MapInterface() {
        @Override
        public Integer f(Integer data) {
            return data * data;
        }
    }).executeOpenCL();

What I want to do is the inlining of the f method which is defined dynamically by the user.

[thread:1] scope:
[thread:1] scope: Inlining
[thread:1] scope: Inlining.InliningDecisions
not inlining MapEngineOpenCL.kernelComputation at 14: com.edinburgh.parallel.map.MapInterface.f(Object):Object (0 bytes): no type profile exists

I do not understand why "no type profile exists". Any idea about this? I am doing the inlining as Graal shows in the tests (com.oracle.graal.compiler.test.inlining).

Thank you very much
Juanjo
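
The difference Lukas describes between the two call sites can be illustrated in plain Java. This is only a minimal sketch and does not use any Graal API: the class name DevirtualizationSketch, the generic signature of MapInterface, and the helper hasConcreteTarget are assumptions made for the example. It uses reflection to show that looking up f through the interface type alone yields an abstract method with no body to inline, whereas looking it up through the exact class of a known receiver object yields a single concrete target, which is the information a compiler gains once the parameter is treated as a constant.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class DevirtualizationSketch {

    /** Hypothetical stand-in for the MapInterface in the thread; the type parameters are assumed. */
    public interface MapInterface<T, R> {
        R f(T data);
    }

    /** A user-defined closure, like the one passed to Map.apply in the thread. */
    public static final MapInterface<Integer, Integer> SQUARE =
            new MapInterface<Integer, Integer>() {
                @Override
                public Integer f(Integer data) {
                    return data * data;
                }
            };

    /**
     * Returns true if resolving f(Object) against the given type finds a
     * method with a body, i.e. something that could be inlined.
     */
    public static boolean hasConcreteTarget(Class<?> receiverType) {
        try {
            Method m = receiverType.getMethod("f", Object.class);
            return !Modifier.isAbstract(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Only the static (interface) type is known: f is abstract here.
        System.out.println("interface type only: "
                + hasConcreteTarget(MapInterface.class)); // false

        // The receiver object itself is known: its exact class has one f.
        System.out.println("known receiver:      "
                + hasConcreteTarget(SQUARE.getClass())); // true
    }
}
```

In a real compilation the exact receiver type would come from the compiler's own representation of the constant, not from reflection; the sketch only shows why knowing the receiver object removes the need for a type profile.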