From dmitry.samersoff at oracle.com  Sat Nov  1 10:13:44 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Sat, 01 Nov 2014 13:13:44 +0300
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <5453F9F4.20309@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com>
Message-ID: <5454B258.1080104@oracle.com>

Serguei,

Thank you for good finding. This approach looks much better for me.

The fix looks good.

Is it necessary to release vmDeathLock locks at
eventHandler.c:1244 before call

EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?

-Dmitry

On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote:
>
> It is 3-rd round of review for:
>   https://bugs.openjdk.java.net/browse/JDK-6988950
>
> New webrev:
>   http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/
>
>
> Summary
>
>   For failing scenario, please, refer to the 1-st round RFR below.
>
>   I've found what is missed in the jdwp agent shutdown and decided to
>   switch from a workaround to a real fix.
>
>   The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
>   The agent debugLoop_run() has a guard against the VM shutdown:
>
>         } else if (gdata->vmDead &&
>                    ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
>             /* Protect the VM from calls while dead.
>              * VirtualMachine cmdSet quietly ignores some cmds
>              * after VM death, so, it sends it's own errors.
>              */
>             outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>
>
>   However, the guard above does not help much if the VM_DEATH event
>   happens in the middle of a command execution.
>   There is a lack of synchronization here.
>
>   The fix introduces new lock (vmDeathLock) which does not allow to
>   execute the commands and the VM_DEATH event callback concurrently.
>   It should work well for any function that is used in implementation of
>   the JDWP_COMMAND_SET(VirtualMachine) .
>
>
> Testing:
>   Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>
>
> Thanks,
> Serguei
>
>
> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote:
>> The updated webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
>>
>> The changes are:
>>   - added a comment recommended by Staffan
>>   - removed the ignore_wrong_phase() call from function classSignature()
>>
>> The classSignature() function is called in 16 places.
>> Most of them do not tolerate the NULL in place of returned signature
>> and will crash.
>> I'm not comfortable to fix all the occurrences now and suggest to
>> return to this issue after gaining experience with more failure cases
>> that are still expected.
>> The failure with the classSignature() involved was observed only once
>> in the nightly and should be extremely rare reproducible.
>> I'll file a placeholder bug if necessary.
>>
>> Thanks,
>> Serguei
>>
>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/
>>>
>>>
>>> Summary:
>>>
>>>   The failing scenario:
>>>     The debugger and the debuggee are well aware a VM shutdown has
>>>     been started in the target process.
>>>     The debugger at this point is not expected to send any commands
>>>     to the JDWP agent.
>>>     However, the JDI layer (debugger side) and the jdwp agent
>>>     (debuggee side) are not in sync with the consumer layers.
>>>
>>>     One reason is because the test debugger does not invoke the JDI
>>>     method VirtualMachine.dispose().
>>>     Another reason is that the Debugger and the debuggee processes
>>>     are uneasy to sync in general.
>>>
>>>     As a result the following steps are possible:
>>>       - The test debugger sends a 'quit' command to the test debuggee
>>>       - The debuggee is normally exiting
>>>       - The jdwp backend reports (over the jdwp protocol) an
>>>         anonymous class unload event
>>>       - The JDI InternalEventHandler thread handles the
>>>         ClassUnloadEvent event
>>>       - The InternalEventHandler wants to uncache the matching
>>>         reference type.
>>>         If there is more than one class with the same host class
>>>         signature, it can't distinguish them,
>>>         and so, deletes all references and re-retrieves them again
>>>         (see tracing below):
>>>           MY_TRACE: JDI:
>>>           VirtualMachineImpl.retrieveClassesBySignature:
>>>           sig=Ljava/lang/invoke/LambdaForm$DMH;
>>>       - The jdwp backend debugLoop_run() gets the command from JDI
>>>         and calls the functions
>>>         classesForSignature() and classStatus() recursively.
>>>       - The classStatus() makes a call to the JVMTI GetClassStatus()
>>>         and gets the JVMTI_ERROR_WRONG_PHASE
>>>       - As a result the jdwp backend reports the JVMTI error to the
>>>         JDI, and so, the test fails
>>>
>>>     For details, see the analysis in bug report closed as a dup of
>>>     the bug 6988950:
>>>     https://bugs.openjdk.java.net/browse/JDK-8024865
>>>
>>>     Some similar cases can be found in the two bug reports (6988950
>>>     and 8024865) describing this issue.
>>>
>>>   The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error
>>>   as it is normal at the VM shutdown.
>>>   The original jdwp backend implementation had a similar approach
>>>   for the raw monitor functions.
>>>   They use the ignore_vm_death() to workaround the
>>>   JVMTI_ERROR_WRONG_PHASE errors.
>>>   For reference, please, see the file: src/share/back/util.c
>>>
>>>
>>> Testing:
>>>   Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>
>

-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

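The synchronization described in the 3-rd round summary, together with the question about releasing the lock on the error path, can be illustrated with a small stand-alone sketch. This is not the code from the webrev: the struct, the function names and the use of a pthread mutex are all assumptions made for illustration (the real agent presumably uses a JVMTI raw monitor and has a separate rule for the VirtualMachine command set). It only shows the idea that the vmDead check and the command body run under the same lock that the VM_DEATH callback takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the agent's global data; not the real gdata. */
typedef struct {
    pthread_mutex_t vm_death_lock;  /* plays the role of vmDeathLock */
    bool vm_dead;                   /* plays the role of gdata->vmDead */
    int commands_run;               /* counts commands that reached the "VM" */
} agent_state;

enum { CMD_OK = 0, CMD_VM_DEAD = 1 };

static void agent_init(agent_state *s) {
    assert(s != NULL);
    pthread_mutex_init(&s->vm_death_lock, NULL);
    s->vm_dead = false;
    s->commands_run = 0;
}

/* Command path: the flag check and the command body share one critical
 * section, so VM death cannot slip in between the check and the calls. */
static int run_command(agent_state *s) {
    int rc;
    pthread_mutex_lock(&s->vm_death_lock);
    if (s->vm_dead) {
        rc = CMD_VM_DEAD;           /* refuse instead of calling into a dead VM */
    } else {
        s->commands_run++;          /* stands in for the JVMTI calls */
        rc = CMD_OK;
    }
    pthread_mutex_unlock(&s->vm_death_lock);
    return rc;
}

/* VM_DEATH path: waits for any in-flight command, then marks the VM dead.
 * clear_failed simulates the "Can't clear event callbacks" error case;
 * the lock is released before bailing out, as suggested in the review. */
static int on_vm_death(agent_state *s, bool clear_failed) {
    pthread_mutex_lock(&s->vm_death_lock);
    if (clear_failed) {
        pthread_mutex_unlock(&s->vm_death_lock);
        return -1;
    }
    s->vm_dead = true;
    pthread_mutex_unlock(&s->vm_death_lock);
    return 0;
}
```

A caller would initialize one agent_state, run commands from the command-loop thread and invoke on_vm_death() from the event thread. Whether vmDead is set before or after the callbacks are cleared in the real cbVMDeath() is not stated in the thread, so the ordering here is only a guess.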
From david.holmes at oracle.com  Mon Nov  3 04:50:09 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 14:50:09 +1000
Subject: Review request: JDK-8062556: Add jdk tests for JDK-8058322 and JDK-8058313
In-Reply-To: <54539625.7070302@oracle.com>
References: <5452CCA3.5040001@oracle.com> <5452ECF5.7020607@oracle.com> <54539625.7070302@oracle.com>
Message-ID: <54570981.2090601@oracle.com>

Hi Eric,

webrevs still broken for some reason.

On 1/11/2014 12:01 AM, Eric McCorkle wrote:
> I went through and added comments in the binary data indicating where
> the MethodParameters attributes are, and a breakdown of their contents.
> I went ahead and did this for all the bad class files, not just the new
> ones.
>
> There is a larger picture here: there's an outstanding task I filed
> around the time these tests were written to find a better way for
> langtools to run jtreg tests that involve bad class files.
> Unfortunately, doing that is rather difficult, as you can see. The only
> real way to do it is to generate a class file, convert it to signed
> bytes (you can't even use hex; you get an unsigned/signed byte
> conversion problem), then modify the data by hand. The intent is to
> replace this with a better method at some point.

OK. New comments an improvement.

Please give the new test the correct initial copyright year of 2014. I
know updates to the year are handled automatically (eventually) but we
should at least have things correct to start with.

Thanks,
David

> On 10/30/14 21:59, David Holmes wrote:
>> Hi Eric,
>>
>> On 31/10/2014 9:41 AM, Eric McCorkle wrote:
>>> Hello,
>>>
>>> Please review this patch which adds tests to the JDK test suite for two
>>> reflection bugs that require hotspot changes (JDK-8058322 and
>>> JDK-8058313)
>>>
>>> The webrev is here:
>>> http://cr.openjdk.java.net/~emc/8062556/
>>
>> I second Brian's comment re the source of the bad classes.
>>
>> Your webrev is broken btw - no top-level html files.
>>
>> The new test needs a copyright year of 2014 not 2013.
>>
>> Thanks,
>> David
>>

From david.holmes at oracle.com  Mon Nov  3 04:58:16 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 14:58:16 +1000
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <5454B258.1080104@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com>
Message-ID: <54570B68.3060806@oracle.com>

On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
> Serguei,
>
> Thank you for good finding. This approach looks much better for me.
>
> The fix looks good.
>
> Is it necessary to release vmDeathLock locks at
> eventHandler.c:1244 before call
>
> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?

I agree this looks necessary, or at least more clean (if things are
failing we really don't know what is happening).

More generally I'm concerned about whether any of the code paths taken
while holding the new lock can result in deadlock - in particular with
regard to the resumeLock ?

David

> -Dmitry

From serguei.spitsyn at oracle.com  Mon Nov  3 06:16:03 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 02 Nov 2014 22:16:03 -0800
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <5454B258.1080104@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com>
Message-ID: <54571DA3.9090906@oracle.com>

On 11/1/14 3:13 AM, Dmitry Samersoff wrote:
> Serguei,
>
> Thank you for good finding. This approach looks much better for me.
>
> The fix looks good.
>
> Is it necessary to release vmDeathLock locks at
> eventHandler.c:1244 before call
>
> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?

Nice catch.
Yes, it is better to release the lock in that case.

Thanks, Dmitry!
Serguei

> -Dmitry

From serguei.spitsyn at oracle.com  Mon Nov  3 07:07:15 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 02 Nov 2014 23:07:15 -0800
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <54570B68.3060806@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com>
Message-ID: <545729A3.7090301@oracle.com>

On 11/2/14 8:58 PM, David Holmes wrote:
> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
>> Serguei,
>>
>> Thank you for good finding. This approach looks much better for me.
>>
>> The fix looks good.
>>
>> Is it necessary to release vmDeathLock locks at
>> eventHandler.c:1244 before call
>>
>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
>
> I agree this looks necessary, or at least more clean (if things are
> failing we really don't know what is happening).

Agreed (replied to Dmitry).

>
> More generally I'm concerned about whether any of the code paths taken
> while holding the new lock can result in deadlock - in particular with
> regard to the resumeLock ?

The cbVMDeath() function never holds both vmDeathLock and resumeLock at
the same time, so there is no chance for a deadlock that involves both
these locks.

Two more locks used in the cbVMDeath() are the callbackBlock and
callbackLock.
These two locks look completely unrelated to the debugLoop_run().

The debugLoop_run() function also uses the cmdQueueLock.
The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at the
same time.

So that I do not see any potential to introduce new deadlock with the
vmDeathLock.

However, it is still easy to overlook something here.
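
The reasoning above relies on one property: a thread holding vmDeathLock never holds another agent lock at the same time, so no cycle of threads waiting on each other's locks can form through it. As a rough illustration only, here is one way such a claim could be checked in a debug build. The tracked_lock wrapper, its function names and the pthread mutex are hypothetical and not part of the jdwp agent; the sketch simply refuses to take a second tracked lock while the calling thread already holds one.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Per-thread count of tracked locks currently held (C11 thread-local). */
static _Thread_local int held_count = 0;

/* Hypothetical wrapper; the real agent uses its own monitor helpers. */
typedef struct {
    pthread_mutex_t m;
} tracked_lock;

static void tracked_init(tracked_lock *l) {
    assert(l != NULL);
    pthread_mutex_init(&l->m, NULL);
}

/* Returns 0 on success. Returns -1 without blocking if the calling thread
 * already holds a tracked lock, which is exactly the hold-and-wait
 * situation the analysis says must never happen. */
static int tracked_enter(tracked_lock *l) {
    if (held_count != 0) {
        return -1;
    }
    pthread_mutex_lock(&l->m);
    held_count++;
    return 0;
}

static void tracked_exit(tracked_lock *l) {
    held_count--;
    pthread_mutex_unlock(&l->m);
}
```

With two such locks standing in for vmDeathLock and resumeLock, an attempt to enter the second while the first is held returns -1, which a debug build could turn into an assertion failure. This only detects nesting on the paths that are actually exercised, so it complements the code inspection rather than replacing it.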
Please, let me know if you see any danger. Thanks, Serguei > > David > >> -Dmitry >> >> >> >> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>> >>> It is 3-rd round of review for: >>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>> >>> New webrev: >>> >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>> >>> >>> >>> Summary >>> >>> For failing scenario, please, refer to the 1-st round RFR below. >>> >>> I've found what is missed in the jdwp agent shutdown and decided to >>> switch from a workaround to a real fix. >>> >>> The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1. >>> The agent debugLoop_run() has a guard against the VM shutdown: >>> >>> 165 } else if (gdata->vmDead && >>> 166 ((cmd->cmdSet) != >>> JDWP_COMMAND_SET(VirtualMachine))) { >>> 167 /* Protect the VM from calls while dead. >>> 168 * VirtualMachine cmdSet quietly ignores some >>> cmds >>> 169 * after VM death, so, it sends it's own errors. >>> 170 */ >>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); >>> >>> >>> However, the guard above does not help much if the VM_DEATH event >>> happens in the middle of a command execution. >>> There is a lack of synchronization here. >>> >>> The fix introduces new lock (vmDeathLock) which does not allow to >>> execute the commands >>> and the VM_DEATH event callback concurrently. >>> It should work well for any function that is used in >>> implementation of >>> the JDWP_COMMAND_SET(VirtualMachine) . 
>>> >>> >>> Testing: >>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>> The updated webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>> >>>> >>>> >>>> The changes are: >>>> - added a comment recommended by Staffan >>>> - removed the ignore_wrong_phase() call from function >>>> classSignature() >>>> >>>> The classSignature() function is called in 16 places. >>>> Most of them do not tolerate the NULL in place of returned signature >>>> and will crash. >>>> I'm not comfortable to fix all the occurrences now and suggest to >>>> return to this >>>> issue after gaining experience with more failure cases that are still >>>> expected. >>>> The failure with the classSignature() involved was observed only once >>>> in the nightly >>>> and should be extremely rare reproducible. >>>> I'll file a placeholder bug if necessary. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>> Please, review the fix for: >>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>> >>>>> >>>>> Open webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>> >>>>> >>>>> >>>>> >>>>> Summary: >>>>> >>>>> The failing scenario: >>>>> The debugger and the debuggee are well aware a VM shutdown has >>>>> been started in the target process. >>>>> The debugger at this point is not expected to send any commands >>>>> to the JDWP agent. >>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>> (debuggee side) >>>>> are not in sync with the consumer layers. >>>>> >>>>> One reason is because the test debugger does not invoke the JDI >>>>> method VirtualMachine.dispose(). >>>>> Another reason is that the Debugger and the debuggee processes >>>>> are uneasy to sync in general. 
>>>>> >>>>> As a result the following steps are possible: >>>>> - The test debugger sends a 'quit' command to the test >>>>> debuggee >>>>> - The debuggee is normally exiting >>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>> anonymous class unload event >>>>> - The JDI InternalEventHandler thread handles the >>>>> ClassUnloadEvent event >>>>> - The InternalEventHandler wants to uncache the matching >>>>> reference type. >>>>> If there is more than one class with the same host class >>>>> signature, it can't distinguish them, >>>>> and so, deletes all references and re-retrieves them again >>>>> (see tracing below): >>>>> MY_TRACE: JDI: >>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>> - The jdwp backend debugLoop_run() gets the command from JDI >>>>> and calls the functions >>>>> classesForSignature() and classStatus() recursively. >>>>> - The classStatus() makes a call to the JVMTI >>>>> GetClassStatus() >>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>> - As a result the jdwp backend reports the JVMTI error to the >>>>> JDI, and so, the test fails >>>>> >>>>> For details, see the analysis in bug report closed as a dup of >>>>> the bug 6988950: >>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>> >>>>> Some similar cases can be found in the two bug reports (6988950 >>>>> and 8024865) describing this issue. >>>>> >>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >>>>> as it is normal at the VM shutdown. >>>>> The original jdwp backend implementation had a similar approach >>>>> for the raw monitor functions. >>>>> Threy use the ignore_vm_death() to workaround the >>>>> JVMTI_ERROR_WRONG_PHASE errors. 
>>>>> For reference, please, see the file: src/share/back/util.c >>>>> >>>>> >>>>> Testing: >>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>> tests >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>> >>> >> >> From serguei.spitsyn at oracle.com Mon Nov 3 07:13:16 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 02 Nov 2014 23:13:16 -0800 Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545729A3.7090301@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> Message-ID: <54572B0C.4000100@oracle.com> David, I forgot to thank you for reviewing! Thanks, Serguei On 11/2/14 11:07 PM, serguei.spitsyn at oracle.com wrote: > On 11/2/14 8:58 PM, David Holmes wrote: >> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>> Serguei, >>> >>> Thank you for good finding. This approach looks much better for me. >>> >>> The fix looks good. >>> >>> Is it necessary to release vmDeathLock locks at >>> eventHandler.c:1244 before call >>> >>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >> >> I agree this looks necessary, or at least more clean (if things are >> failing we really don't know what is happening). > > Agreed (replied to Dmitry). > >> >> More generally I'm concerned about whether any of the code paths >> taken while holding the new lock can result in deadlock - in >> particular with regard to the resumeLock ? > > The cbVMDeath() function never holds both vmDeathLock and resumeLock > at the same time, > so there is no chance for a deadlock that involves both these locks. > > Two more locks used in the cbVMDeath() are the callbackBlock and > callbackLock. > These two locks look completely unrelated to the debugLoop_run(). > > The debugLoop_run() function also uses the cmdQueueLock. 
> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at > the same time. > > So that I do not see any potential to introduce new deadlock with the > vmDeathLock. > > However, it is still easy to overlook something here. > Please, let me know if you see any danger. > > Thanks, > Serguei > >> >> David >> >>> -Dmitry >>> >>> >>> >>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>> >>>> It is 3-rd round of review for: >>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>> >>>> New webrev: >>>> >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>> >>>> >>>> >>>> Summary >>>> >>>> For failing scenario, please, refer to the 1-st round RFR below. >>>> >>>> I've found what is missed in the jdwp agent shutdown and decided to >>>> switch from a workaround to a real fix. >>>> >>>> The agent VM_DEATH callback sets the gdata field: gdata->vmDead >>>> = 1. >>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>> >>>> 165 } else if (gdata->vmDead && >>>> 166 ((cmd->cmdSet) != >>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>> 167 /* Protect the VM from calls while dead. >>>> 168 * VirtualMachine cmdSet quietly ignores some >>>> cmds >>>> 169 * after VM death, so, it sends it's own errors. >>>> 170 */ >>>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); >>>> >>>> >>>> However, the guard above does not help much if the VM_DEATH event >>>> happens in the middle of a command execution. >>>> There is a lack of synchronization here. >>>> >>>> The fix introduces new lock (vmDeathLock) which does not allow to >>>> execute the commands >>>> and the VM_DEATH event callback concurrently. >>>> It should work well for any function that is used in >>>> implementation of >>>> the JDWP_COMMAND_SET(VirtualMachine) . 
>>>> >>>> >>>> Testing: >>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>> The updated webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>> >>>>> >>>>> >>>>> The changes are: >>>>> - added a comment recommended by Staffan >>>>> - removed the ignore_wrong_phase() call from function >>>>> classSignature() >>>>> >>>>> The classSignature() function is called in 16 places. >>>>> Most of them do not tolerate the NULL in place of returned signature >>>>> and will crash. >>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>> return to this >>>>> issue after gaining experience with more failure cases that are still >>>>> expected. >>>>> The failure with the classSignature() involved was observed only once >>>>> in the nightly >>>>> and should be extremely rare reproducible. >>>>> I'll file a placeholder bug if necessary. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Please, review the fix for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>> >>>>>> >>>>>> Open webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> The failing scenario: >>>>>> The debugger and the debuggee are well aware a VM shutdown has >>>>>> been started in the target process. >>>>>> The debugger at this point is not expected to send any >>>>>> commands >>>>>> to the JDWP agent. >>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>> (debuggee side) >>>>>> are not in sync with the consumer layers. >>>>>> >>>>>> One reason is because the test debugger does not invoke the >>>>>> JDI >>>>>> method VirtualMachine.dispose(). 
>>>>>> Another reason is that the Debugger and the debuggee processes >>>>>> are uneasy to sync in general. >>>>>> >>>>>> As a result the following steps are possible: >>>>>> - The test debugger sends a 'quit' command to the test >>>>>> debuggee >>>>>> - The debuggee is normally exiting >>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>> anonymous class unload event >>>>>> - The JDI InternalEventHandler thread handles the >>>>>> ClassUnloadEvent event >>>>>> - The InternalEventHandler wants to uncache the matching >>>>>> reference type. >>>>>> If there is more than one class with the same host class >>>>>> signature, it can't distinguish them, >>>>>> and so, deletes all references and re-retrieves them again >>>>>> (see tracing below): >>>>>> MY_TRACE: JDI: >>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>> - The jdwp backend debugLoop_run() gets the command from JDI >>>>>> and calls the functions >>>>>> classesForSignature() and classStatus() recursively. >>>>>> - The classStatus() makes a call to the JVMTI >>>>>> GetClassStatus() >>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>> - As a result the jdwp backend reports the JVMTI error to >>>>>> the >>>>>> JDI, and so, the test fails >>>>>> >>>>>> For details, see the analysis in bug report closed as a dup of >>>>>> the bug 6988950: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>> >>>>>> Some similar cases can be found in the two bug reports >>>>>> (6988950 >>>>>> and 8024865) describing this issue. >>>>>> >>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >>>>>> as it is normal at the VM shutdown. >>>>>> The original jdwp backend implementation had a similar >>>>>> approach >>>>>> for the raw monitor functions. >>>>>> Threy use the ignore_vm_death() to workaround the >>>>>> JVMTI_ERROR_WRONG_PHASE errors. 
>>>>>> For reference, please, see the file: src/share/back/util.c
>>>>>>
>>>>>>
>>>>>> Testing:
>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi
>>>>>> tests
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>
>>>>
>>>
>>>
>

From tobias.hartmann at oracle.com Mon Nov 3 11:09:09 2014
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 03 Nov 2014 12:09:09 +0100
Subject: [8u40] Backport requests for 8061817 and 8062169
Message-ID: <54576255.2020000@oracle.com>

Hi,

please review the following backport requests for 8u40.

(1) 8061817: Whitebox.deoptimizeMethod() does not deoptimize all OSR versions of
method
https://bugs.openjdk.java.net/browse/JDK-8061817
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/d30c8335bb6f

(2) 8062169: Multiple OSR compilations issued for same bci
https://bugs.openjdk.java.net/browse/JDK-8062169
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/8edc39841abe

The changes were pushed last week. Nightly testing showed no problems. The
changes apply cleanly to 8u40.

Thanks,
Tobias

From mikael.gerdin at oracle.com Mon Nov 3 13:01:10 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Mon, 03 Nov 2014 14:01:10 +0100
Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions
Message-ID: <54577C96.5030503@oracle.com>

Hi all,

Please review this attempt at fixing the OrderAccess functions on
Linux x86 with GCC.

While working on another bug I recently discovered that g++ was
reordering stores across a call to OrderAccess::storestore on Linux x86.

The G1 code attempts to do an ordered publishing of two values:
    _saved_mark_word = _top;
    OrderAccess::storestore();
    _gc_time_stamp = curr_gc_time_stamp;

The types involved are
    HeapWord* _top, _saved_mark_word;
    volatile unsigned _gc_time_stamp;

The incorrect behavior seems to have started when JDK-6973570 was
fixed in JDK 7.
Below, _top is at offset 0x58, _saved_mark_word at 0x18 and
_gc_time_stamp at 0x138, %rbx is "this".

/net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so:

  3d9f4d:  39 d0                 cmp    %edx,%eax
  3d9f4f:  73 1c                 jae    3d9f6d
  3d9f51:  48 8b 43 58           mov    0x58(%rbx),%rax
  3d9f55:  48 89 43 18           mov    %rax,0x18(%rbx)
  3d9f59:  48 8b 05 40 f9 70 00  mov    0x70f940(%rip),%rax   # ae98a0 <_DYNAMIC+0x12f8>
  3d9f60:  48 c7 00 00 00 00 00  movq   $0x0,(%rax)
  3d9f67:  89 93 38 01 00 00     mov    %edx,0x138(%rbx)

/net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so

  3da05d:  39 d0                 cmp    %edx,%eax
  3da05f:  73 15                 jae    3da076
  3da061:  48 8b 43 58           mov    0x58(%rbx),%rax
  3da065:  c7 45 f4 00 00 00 00  movl   $0x0,-0xc(%rbp)
  3da06c:  89 93 38 01 00 00     mov    %edx,0x138(%rbx)
  3da072:  48 89 43 18           mov    %rax,0x18(%rbx)

In b109, the same build in which JDK-6973570 was integrated, the store
of %rax to 0x18(%rbx) has been ordered after the store of %edx to
0x138(%rbx).

My suggestion to fix this is to extend all the OrderAccess::release*
variants on x86 with a:
    __asm__ volatile ("" : : : "memory");
to attempt to prevent GCC from reordering any memory accesses across
those function calls.

I've verified that this solves the issue in the assembly with our
current JDK 9 build platform compilers.
I've also verified that this particular piece of code is compiled
correctly on our other x86 platforms: Solaris, Windows and OS X.

Webrev:
http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/
Bug:
https://bugs.openjdk.java.net/browse/JDK-8061964
Testing:
JPRT, inspecting generated assembly for the function
G1OffsetTableContigSpace::record_top_and_timestamp (as the method is
currently named).
Suggestions for further testing are greatly appreciated.
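
For illustration, a minimal sketch of the kind of compiler barrier
described above. It assumes GCC-style inline assembly (GCC or Clang);
the struct, field and function names are hypothetical stand-ins and
are not taken from the HotSpot sources or from the webrev.

```cpp
#include <cstdint>

// Hypothetical stand-in for the G1 space fields discussed above.
struct Space {
    std::uintptr_t top;               // plays the role of _top
    std::uintptr_t saved_mark_word;   // plays the role of _saved_mark_word
    volatile unsigned gc_time_stamp;  // plays the role of _gc_time_stamp
};

// Compiler-only barrier: emits no instruction, but the "memory" clobber
// tells the compiler not to move memory accesses across this point.
static inline void compiler_barrier() {
    __asm__ volatile ("" : : : "memory");
}

// Publish saved_mark_word before gc_time_stamp. For ordinary stores x86
// keeps store order in hardware, so only the compiler needs restraining.
inline void record_top_and_timestamp(Space* s, unsigned curr_gc_time_stamp) {
    s->saved_mark_word = s->top;
    compiler_barrier();
    s->gc_time_stamp = curr_gc_time_stamp;
}
```

The barrier has no runtime effect of its own, so the store order can
only be confirmed by inspecting the generated assembly (for example
with objdump -d), as was done for b108 and b109 above.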
Thanks Mikael From eric.mccorkle at oracle.com Mon Nov 3 15:42:40 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 03 Nov 2014 10:42:40 -0500 Subject: Review request: JDK-8062556: Add jdk tests for JDK-8058322 and JDK-8058313 In-Reply-To: <54570981.2090601@oracle.com> References: <5452CCA3.5040001@oracle.com> <5452ECF5.7020607@oracle.com> <54539625.7070302@oracle.com> <54570981.2090601@oracle.com> Message-ID: <5457A270.9010009@oracle.com> I have been having issues with webrev, which I reported earlier. Webrev reports a syntax error when I try to use it, and curiously, it fails to produce top-level files in this case (and this case only, as evidenced by my other webrevs). Unfortunately, there's nothing I can do about the missing top-level files; however, you can still look at the individual files just fine. On 11/02/14 23:50, David Holmes wrote: > Hi Erik, > > webrevs still broken for some reason. > On 1/11/2014 12:01 AM, Eric McCorkle wrote: >> I went through and added comments in the binary data indicating where >> the MethodParameters attributes are, and a breakdown of their contents. >> I went ahead and did this for all the bad class files, not just the new >> ones. >> >> There is a larger picture here: there's an outstanding task I filed >> around the time these tests were written to find a better way for >> langtools to run jtreg tests that involve bad class files. >> Unfortunately, doing that is rather difficult, as you can see. The only >> real way to do it is to generate a class file, convert it to signed >> bytes (you can't even use hex; you get an unsigned/signed byte >> conversion problem), then modify the data by hand. The intent is to >> replace this with a better method at some point. > > OK. New comments an improvement. > > Please give the new test the correct initial copyright year of 2014. I > know updates to the year are handled automatically (eventually) but we > should at least have things correct to start with. 
> > Thanks, > David > >> On 10/30/14 21:59, David Holmes wrote: >>> Hi Erik, >>> >>> On 31/10/2014 9:41 AM, Eric McCorkle wrote: >>>> Hello, >>>> >>>> Please review this patch which adds tests to the JDK test suite for two >>>> reflection bugs that require hotspot changes (JDK-8058322 and >>>> JDK-8058313) >>>> >>>> The webrev is here: >>>> http://cr.openjdk.java.net/~emc/8062556/ >>> >>> I second Brian's comment re the source of the bad classes. >>> >>> Your webrev is broken btw - no top-level html files. >>> >>> The new test needs a copyright year of 2014 not 2013. >>> >>> Thanks, >>> David >>> From vladimir.kozlov at oracle.com Mon Nov 3 17:36:09 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 03 Nov 2014 09:36:09 -0800 Subject: [8u40] Backport requests for 8061817 and 8062169 In-Reply-To: <54576255.2020000@oracle.com> References: <54576255.2020000@oracle.com> Message-ID: <5457BD09.4000606@oracle.com> Looks good. Thanks, Vladimir On 11/3/14 3:09 AM, Tobias Hartmann wrote: > Hi, > > please review the following backport requests for 8u40. > > (1) 8061817: Whitebox.deoptimizeMethod() does not deoptimize all OSR versions of > method > https://bugs.openjdk.java.net/browse/JDK-8061817 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/d30c8335bb6f > > (2) 8062169: Multiple OSR compilations issued for same bci > https://bugs.openjdk.java.net/browse/JDK-8062169 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/8edc39841abe > > The changes were pushed last week. Nightly testing showed no problems. The > changes apply cleanly to 8u40. > > Thanks, > Tobias > > From daniel.daugherty at oracle.com Mon Nov 3 18:41:33 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty)
Date: Mon, 03 Nov 2014 11:41:33 -0700
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <5453F9F4.20309@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com>
Message-ID: <5457CC5D.3010300@oracle.com>

On 10/31/14 3:07 PM, serguei.spitsyn at oracle.com wrote:
>
> It is 3-rd round of review for:
> https://bugs.openjdk.java.net/browse/JDK-6988950
>
> New webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/

Thumbs up on the code. I have comment suggestions below...

src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
    line 149: debugMonitorEnter(vmDeathLock);
        Perhaps this comment would help:
        /*
         * We grab the vmDeathLock here to prevent the cbVMDeath()
         * event handler from tearing things down while we're
         * asynchronously processing a command.
         */

src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c
    line 71: jrawMonitorID vmDeathLock;
        A nice blurb about the new global lock's protocol would be
        good here. Something like:

        /*
         * Coordinates the cbVMDeath() event handler and the
         * debugLoop_run() thread.
         */

    line 1236: debugMonitorEnter(vmDeathLock);
        Perhaps this comment would help:
        /*
         * We grab the vmDeathLock here to prevent the debugLoop_run()
         * thread from asynchronously dispatching another command.
         */

    This block caught my eye:

    1295     /*
    1296      * The VM will die soon after the completion of this callback - we
    1297      * may need to do a final synchronization with the command loop to
    1298      * avoid the VM terminating with replying to the final (resume)
    1299      * command.
    1300      */
    1301     debugLoop_sync();

    The above comment implies that debugLoop_sync() does something to
    coordinate this code (cbVMDeath()) with the debugLoop code, but
    clearly something is incomplete.
It's entirely possible that this debugLoop_sync() is solving a different problem that happens after the debugLoop has left its command processing loop and realizes that the VM is shutting down. Don't know for sure. I haven't been in this code for quite a while... Dan > > > Summary > > For failing scenario, please, refer to the 1-st round RFR below. > > I've found what is missed in the jdwp agent shutdown and decided to > switch from a workaround to a real fix. > > The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1. > The agent debugLoop_run() has a guard against the VM shutdown: > 165 } else if (gdata->vmDead && > 166 ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) { > 167 /* Protect the VM from calls while dead. > 168 * VirtualMachine cmdSet quietly ignores some cmds > 169 * after VM death, so, it sends it's own errors. > 170 */ > 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); > > However, the guard above does not help much if the VM_DEATH event > happens in the middle of a command execution. > There is a lack of synchronization here. > > The fix introduces new lock (vmDeathLock) which does not allow to > execute the commands > and the VM_DEATH event callback concurrently. > It should work well for any function that is used in implementation > of the JDWP_COMMAND_SET(VirtualMachine) . > > > Testing: > Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests > > > Thanks, > Serguei > > > On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >> The updated webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >> >> >> The changes are: >> - added a comment recommended by Staffan >> - removed the ignore_wrong_phase() call from function classSignature() >> >> The classSignature() function is called in 16 places. >> Most of them do not tolerate the NULL in place of returned signature >> and will crash. 
>> I'm not comfortable to fix all the occurrences now and suggest to >> return to this >> issue after gaining experience with more failure cases that are still >> expected. >> The failure with the classSignature() involved was observed only once >> in the nightly >> and should be extremely rare reproducible. >> I'll file a placeholder bug if necessary. >> >> Thanks, >> Serguei >> >> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>> >>> >>> >>> Summary: >>> >>> The failing scenario: >>> The debugger and the debuggee are well aware a VM shutdown has >>> been started in the target process. >>> The debugger at this point is not expected to send any commands >>> to the JDWP agent. >>> However, the JDI layer (debugger side) and the jdwp agent >>> (debuggee side) >>> are not in sync with the consumer layers. >>> >>> One reason is because the test debugger does not invoke the JDI >>> method VirtualMachine.dispose(). >>> Another reason is that the Debugger and the debuggee processes >>> are uneasy to sync in general. >>> >>> As a result the following steps are possible: >>> - The test debugger sends a 'quit' command to the test debuggee >>> - The debuggee is normally exiting >>> - The jdwp backend reports (over the jdwp protocol) an >>> anonymous class unload event >>> - The JDI InternalEventHandler thread handles the >>> ClassUnloadEvent event >>> - The InternalEventHandler wants to uncache the matching >>> reference type. 
>>> If there is more than one class with the same host class >>> signature, it can't distinguish them, >>> and so, deletes all references and re-retrieves them again >>> (see tracing below): >>> MY_TRACE: JDI: >>> VirtualMachineImpl.retrieveClassesBySignature: >>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>> - The jdwp backend debugLoop_run() gets the command from JDI >>> and calls the functions >>> classesForSignature() and classStatus() recursively. >>> - The classStatus() makes a call to the JVMTI >>> GetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE >>> - As a result the jdwp backend reports the JVMTI error to the >>> JDI, and so, the test fails >>> >>> For details, see the analysis in bug report closed as a dup of >>> the bug 6988950: >>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>> >>> Some similar cases can be found in the two bug reports (6988950 >>> and 8024865) describing this issue. >>> >>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >>> as it is normal at the VM shutdown. >>> The original jdwp backend implementation had a similar approach >>> for the raw monitor functions. >>> Threy use the ignore_vm_death() to workaround the >>> JVMTI_ERROR_WRONG_PHASE errors. >>> For reference, please, see the file: src/share/back/util.c >>> >>> >>> Testing: >>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>> >>> >>> Thanks, >>> Serguei >>> >> > From daniel.daugherty at oracle.com Mon Nov 3 19:00:09 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 03 Nov 2014 12:00:09 -0700 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <54577C96.5030503@oracle.com> References: <54577C96.5030503@oracle.com> Message-ID: <5457D0B9.10506@oracle.com> > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp No comments. Thanks for being persistent and chasing this crazy bug to ground. 

I have one administrative request. Can you please add a complete set
of "gdb session notes" to the bug report where you show the good code
in JDK7-B108 and the bad code in JDK7-B109? Please do the same for
your baseline bits and your fixed bits. Every time we consider a GCC
upgrade, I would like it to be easy for someone to verify that this
code sequence is not messed up. It should also make it easy for other
folks to adapt your sequence to investigate their own possibly broken
code sequences...

As for testing, I think our usual battery of JPRT, PIT, and promotion
testing will have to do. Personally, once your fix is in I'm planning
to drop the bits onto my Linux DevOps machine and see if the following
still repros:

    8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex
            assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
            monitor is invalid

Again, thanks for fixing this!

Dan


On 11/3/14 6:01 AM, Mikael Gerdin wrote:
> Hi all,
>
> Please review this attempt at fixing the OrderAccess functions on
> Linux x86 with GCC.
>
> While working on another bug I recently discovered that g++ was
> reordering stores across a call to OrderAccess::storestore on Linux x86.
>
> The G1 code attempts to do an ordered publishing of two values:
> _saved_mark_word = _top;
> OrderAccess::storestore();
> _gc_time_stamp = curr_gc_time_stamp;
>
> The types involved are
> HeapWord* _top, _saved_mark_word;
> volatile unsigned _gc_time_stamp;
>
> The incorrect behavior seems to have started when JDK-6973570 was
> fixed in JDK 7.
> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and
> _gc_time_stamp at 0x138, %rbx is "this".
> > /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: > > 3d9f4d: 39 d0 cmp %edx,%eax > 3d9f4f: 73 1c jae 3d9f6d > > 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax > 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) > 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # > ae98a0 <_DYNAMIC+0x12f8> > 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) > 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > > /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so > > 3da05d: 39 d0 cmp %edx,%eax > 3da05f: 73 15 jae 3da076 > > 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax > 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) > 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > 3da072: 48 89 43 18 mov %rax,0x18(%rbx) > > In b109 the store of %rax to 0x18(%rbx) has been ordered after the > store of %edx to 0x138(%rbx) in the same build as JDK-6973570 was > integrated. > > My suggestion to fix this is to extend all the OrderAccess::release* > variants on x86 with a: > __asm__ volatile ("" : : : "memory"); > to attempt to prevent GCC from reordering any memory accesses across > those function calls. > > I've verified that this solves the issue in the assembly with our > current JDK 9 build platform compilers. > I've also verified that this particular piece of code is compiled > correctly on our other x86 platforms: Solaris, Windows and OS X. > > Webrev: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061964 > Testing: > JPRT, inspecting generated assembly for the function > G1OffsetTableContigSpace::record_top_and_timestamp (as the method is > currently named). > Suggestions of further testing is greatly appreciated. 
>
> Thanks
> Mikael

From serguei.spitsyn at oracle.com Mon Nov 3 19:35:38 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 03 Nov 2014 11:35:38 -0800
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <5457CC5D.3010300@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5457CC5D.3010300@oracle.com>
Message-ID: <5457D90A.1040608@oracle.com>

On 11/3/14 10:41 AM, Daniel D. Daugherty wrote:
> On 10/31/14 3:07 PM, serguei.spitsyn at oracle.com wrote:
>>
>> It is 3-rd round of review for:
>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>
>> New webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/
>
> Thumbs up on the code.

Thanks, Dan!

> I have comment suggestions below...
>
> src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>     line 149: debugMonitorEnter(vmDeathLock);
>         Perhaps this comment would help:
>         /*
>          * We grab the vmDeathLock here to prevent the cbVMDeath()
>          * event handler from tearing things down while we're
>          * asynchronously processing a command.
>          */

Done

>
> src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c
>     line 71: jrawMonitorID vmDeathLock;
>         A nice blurb about the new global lock's protocol would be
>         good here. Something like:
>
>         /*
>          * Coordinates the cbVMDeath() event handler and the
>          * debugLoop_run() thread.
>          */
>

Done

>     line 1236: debugMonitorEnter(vmDeathLock);
>         Perhaps this comment would help:
>         /*
>          * We grab the vmDeathLock here to prevent the debugLoop_run()
>          * thread from asynchronously dispatching another command.
>          */
>

Done

I also think, it'd be enough to narrow the scope of synchronization
around the event_callback() call:

    /*
     * Coordinates the cbVMDeath() event handler and the
     * debugLoop_run() thread.
     */
    debugMonitorEnter(vmDeathLock);

    /* Only now should we actually process the VM death event */
    (void)memset(&info,0,sizeof(info));
    info.ei = EI_VM_DEATH;
    event_callback(env, &info);

    debugMonitorExit(vmDeathLock);

>     This block caught my eye:
>
>     1295     /*
>     1296      * The VM will die soon after the completion of this
> callback - we
>     1297      * may need to do a final synchronization with the
> command loop to
>     1298      * avoid the VM terminating with replying to the final
> (resume)
>     1299      * command.
>     1300      */
>     1301     debugLoop_sync();
>
>     The above comment implies that debugLoop_sync() does something to
>     coordinate this code (cbVMDeath()) with the debugLoop code, but
>     clearly something is incomplete. It's entirely possible that this
>     debugLoop_sync() is solving a different problem that happens after
>     the debugLoop has left its command processing loop and realizes
>     that the VM is shutting down.

This block caught my eye too.
I agree, it looks incomplete.
The cbVMDeath() callback waits until a resume command finishes if it has
been started.
Not sure how useful it is.

Thanks!
Serguei

>
> Don't know for sure. I haven't been in this code for quite a while...
>
> Dan
>
>
>>
>>
>> Summary
>>
>> For failing scenario, please, refer to the 1-st round RFR below.
>>
>> I've found what is missed in the jdwp agent shutdown and decided to
>> switch from a workaround to a real fix.
>>
>> The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
>> The agent debugLoop_run() has a guard against the VM shutdown:
>> 165 } else if (gdata->vmDead &&
>> 166 ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
>> 167 /* Protect the VM from calls while dead.
>> 168 * VirtualMachine cmdSet quietly ignores some cmds
>> 169 * after VM death, so, it sends it's own errors.
>> 170 */
>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>>
>> However, the guard above does not help much if the VM_DEATH event
>> happens in the middle of a command execution.
>> There is a lack of synchronization here. >> >> The fix introduces new lock (vmDeathLock) which does not allow to >> execute the commands >> and the VM_DEATH event callback concurrently. >> It should work well for any function that is used in implementation >> of the JDWP_COMMAND_SET(VirtualMachine) . >> >> >> Testing: >> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >> >> >> Thanks, >> Serguei >> >> >> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>> The updated webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>> >>> >>> The changes are: >>> - added a comment recommended by Staffan >>> - removed the ignore_wrong_phase() call from function >>> classSignature() >>> >>> The classSignature() function is called in 16 places. >>> Most of them do not tolerate the NULL in place of returned signature >>> and will crash. >>> I'm not comfortable to fix all the occurrences now and suggest to >>> return to this >>> issue after gaining experience with more failure cases that are >>> still expected. >>> The failure with the classSignature() involved was observed only >>> once in the nightly >>> and should be extremely rare reproducible. >>> I'll file a placeholder bug if necessary. >>> >>> Thanks, >>> Serguei >>> >>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>> >>>> >>>> >>>> Summary: >>>> >>>> The failing scenario: >>>> The debugger and the debuggee are well aware a VM shutdown has >>>> been started in the target process. >>>> The debugger at this point is not expected to send any >>>> commands to the JDWP agent. >>>> However, the JDI layer (debugger side) and the jdwp agent >>>> (debuggee side) >>>> are not in sync with the consumer layers. 
>>>> >>>> One reason is because the test debugger does not invoke the >>>> JDI method VirtualMachine.dispose(). >>>> Another reason is that the Debugger and the debuggee processes >>>> are uneasy to sync in general. >>>> >>>> As a result the following steps are possible: >>>> - The test debugger sends a 'quit' command to the test debuggee >>>> - The debuggee is normally exiting >>>> - The jdwp backend reports (over the jdwp protocol) an >>>> anonymous class unload event >>>> - The JDI InternalEventHandler thread handles the >>>> ClassUnloadEvent event >>>> - The InternalEventHandler wants to uncache the matching >>>> reference type. >>>> If there is more than one class with the same host class >>>> signature, it can't distinguish them, >>>> and so, deletes all references and re-retrieves them again >>>> (see tracing below): >>>> MY_TRACE: JDI: >>>> VirtualMachineImpl.retrieveClassesBySignature: >>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>> - The jdwp backend debugLoop_run() gets the command from JDI >>>> and calls the functions >>>> classesForSignature() and classStatus() recursively. >>>> - The classStatus() makes a call to the JVMTI >>>> GetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE >>>> - As a result the jdwp backend reports the JVMTI error to >>>> the JDI, and so, the test fails >>>> >>>> For details, see the analysis in bug report closed as a dup of >>>> the bug 6988950: >>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>> >>>> Some similar cases can be found in the two bug reports >>>> (6988950 and 8024865) describing this issue. >>>> >>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >>>> as it is normal at the VM shutdown. >>>> The original jdwp backend implementation had a similar >>>> approach for the raw monitor functions. >>>> Threy use the ignore_vm_death() to workaround the >>>> JVMTI_ERROR_WRONG_PHASE errors. 
>>>> For reference, please, see the file: src/share/back/util.c >>>> >>>> >>>> Testing: >>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>> >> > From dean.long at oracle.com Mon Nov 3 20:28:13 2014 From: dean.long at oracle.com (Dean Long) Date: Mon, 03 Nov 2014 12:28:13 -0800 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <54577C96.5030503@oracle.com> References: <54577C96.5030503@oracle.com> Message-ID: <5457E55D.8060304@oracle.com> Do we need a compiler barrier for OrderAccess::*acquire as well? dl On 11/3/2014 5:01 AM, Mikael Gerdin wrote: > Hi all, > > Please review this attempt at fixing the OrderAccess functions on > Linux x86 with GCC. > > While working on another bug I recently discovered that g++ was > reordering stores across a call to OrderAccess::storestore on Linux x86. > > The G1 code attempts to do an ordered publishing of two values: > _saved_mark_word = _top; > OrderAccess::storestore(); > _gc_time_stamp = curr_gc_time_stamp; > > The types involved are > HeapWord* _top, _saved_mark_word; > volatile unsigned _gc_time_stamp; > > The incorrect behavior seems to have started when JDK-6973570 was > fixed in JDK 7. > Below, _top is at offset 0x58, _saved_mark_word at 0x18 and > _gc_time_stamp at 0x138, %rbx is "this". 
> > /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: > > 3d9f4d: 39 d0 cmp %edx,%eax > 3d9f4f: 73 1c jae 3d9f6d > > 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax > 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) > 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # > ae98a0 <_DYNAMIC+0x12f8> > 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) > 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > > /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so > > 3da05d: 39 d0 cmp %edx,%eax > 3da05f: 73 15 jae 3da076 > > 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax > 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) > 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > 3da072: 48 89 43 18 mov %rax,0x18(%rbx) > > In b109 the store of %rax to 0x18(%rbx) has been ordered after the > store of %edx to 0x138(%rbx) in the same build as JDK-6973570 was > integrated. > > My suggestion to fix this is to extend all the OrderAccess::release* > variants on x86 with a: > __asm__ volatile ("" : : : "memory"); > to attempt to prevent GCC from reordering any memory accesses across > those function calls. > > I've verified that this solves the issue in the assembly with our > current JDK 9 build platform compilers. > I've also verified that this particular piece of code is compiled > correctly on our other x86 platforms: Solaris, Windows and OS X. > > Webrev: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061964 > Testing: > JPRT, inspecting generated assembly for the function > G1OffsetTableContigSpace::record_top_and_timestamp (as the method is > currently named). > Suggestions of further testing is greatly appreciated. 
> > Thanks > Mikael From eric.mccorkle at oracle.com Mon Nov 3 21:35:44 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 03 Nov 2014 16:35:44 -0500 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <5452CC7F.1090809@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> Message-ID: <5457F530.2070907@oracle.com> Please review this issue so that it can go in along with 8058322. Thanks. On 10/30/14 19:40, Eric McCorkle wrote: > Thank you for the pointers. I have applied your changes and refreshed > the webrev. > > http://cr.openjdk.java.net/~emc/8058313/ > > Also, I have posted the test for this and another patch here: > http://cr.openjdk.java.net/~emc/8062556/ > > On 10/30/14 13:51, Jiangli Zhou wrote: >> Hi Eric, >> >> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>> Hi Eric, >>>> >>>> I wonder if we could specialize this particular case and avoid changing >>>> the parsing code. How about setting the _has_method_parameters flag in >>>> the ConstMethod when encounter such MethodParameter, and changing >>>> JVM_GetMethodParameters() to return non-NULL value for such case when >>>> _has_method_parameters is true but method_parameters_length is 0. Would >>>> that work? >>> Which parser are you talking about? The inline tables parser, or the >>> class file parser. The class file parser has to change, because it was >>> previously ignoring MethodParameters attributes with parameter_count 0. >> >> It's the class parsing changes that I was referring to, mostly relate to >> the initialization and checking against method_parameters_length. It's a >> bit awkward to include the 0 case but also skipping it in the loop. 
For >> example, the following code in classFileParser.cpp changed ">" to ">=" >> in the if check, but has no real effect and is not need. >> >> 2486 // Copy method parameters >> 2487 if (method_parameters_length >= 0) { >> 2488 MethodParametersElement* elem = >> m->constMethod()->method_parameters_start(); >> 2489 for (int i = 0; i < method_parameters_length; i++) { >> 2490 elem[i].name_cp_index = >> Bytes::get_Java_u2(method_parameters_data); >> 2491 method_parameters_data += 2; >> 2492 elem[i].flags = Bytes::get_Java_u2(method_parameters_data); >> 2493 method_parameters_data += 2; >> 2494 } >> 2495 } >> >> >>> >>> I don't think your proposal will work. The inline tables' offsets are >>> all dependent on what inline tables are actually present. If >>> _has_method_parameters is set, then the inline tables code expects the >>> last u2 of the inline tables to be a u2 indicating the number of method >>> parameters entries, preceeded by the array of method parameters data. >>> If _has_method_parameters is false, then it expects that there is no >>> method parameters information at all (including no length field). If >>> you were to set _has_method_parameters, but not store any information in >>> the inline table, then it would cause errors for all the rest of the >>> inline tables. >> >> Thank you for reminding me of the complexity of the inlined table >> calculation in the ConstMethod. My proposal would require tweaks in that >> area to correctly compute the table sizes. As it's easy to introduce >> bugs in that area, it's not worth to change the table calculation code >> for this purpose. I agree my proposal is not a better choice in this case. >> >>> What I do for the parameter_count = 0 case is just store >>> a 0 u2 for zero-length method parameters information, and no data. 
All >>> the existing inline tables code works fine with this case, so there >>> aren't any serious changes to the inline tables code (other than >>> allowing method parameters information to be stored when the array is >>> length 0). But you have to make some change to the inline table code, >>> otherwise the information won't be stored. >> >> Ok. Could you please add comments to the change in constMethod.cpp to >> explain above? >> >> In jvm.cpp, since -1 represents no method parameter now. Maybe checking >> against explicity and add comments for the 0-length case. >> >> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject >> method)) >> { >> ... >> // No method parameter >> if (num_params == -1) { >> return (jobjectArray)NULL; >> } >> >> /* handle the rest here */ >> // make sure all the symbols are properly formatted >> for (int i = 0; i < num_params; i++) { >> ... >> } >> >> Thanks, >> Jiangli >> >>> >>>> Thanks, >>>> Jiangli >>>> >>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>> Hello, >>>>> >>>>> Please review this fix for parameter reflection which addresses hotspot >>>>> falsely ignoring zero-length MethodParameter attributes. The JVMS >>>>> allows a MethodParameters attribute with parameter_count = 0, and the >>>>> parameter reflection spec states that a MalformedParametersException >>>>> should be thrown if parameter_count does not match the number of real >>>>> parameters to a method. Hotspot currently ignores MethodParameters >>>>> attributes with parameter_count = 0; however, in a case where a (bad) >>>>> MethodParameters attribute has parameter_count = 0, but the method >>>>> has a >>>>> nonzero number of real parameters, hotspot will return null from >>>>> JVM_GetMethodParameters, the result being that a >>>>> MalformedParametersException is not thrown (rather, the reflection API >>>>> acts like there is no MethodParameters attribute). 
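A minimal sketch of the distinction this thread relies on, assuming (as in the jvm.cpp suggestion quoted above) that -1 stands for "no MethodParameters attribute" and 0 stands for an attribute that is present but empty. The names ParamCheck and check_method_parameters are hypothetical and only illustrate the decision; they are not HotSpot or JDK code, and the thread does not show where the real check lives.

```cpp
#include <cassert>

// Assumed sentinel: no MethodParameters attribute was recorded for the method.
const int NO_METHOD_PARAMETERS = -1;

enum class ParamCheck { NoAttribute, Ok, Malformed };

// recorded_count:   parameter_count stored for the method, or -1 if absent.
// real_param_count: number of parameters in the method descriptor.
ParamCheck check_method_parameters(int recorded_count, int real_param_count) {
  if (recorded_count == NO_METHOD_PARAMETERS) {
    // Nothing recorded: reflection behaves as if there is no attribute.
    return ParamCheck::NoAttribute;
  }
  if (recorded_count != real_param_count) {
    // Includes the case under review: count 0 on a method with parameters.
    return ParamCheck::Malformed;
  }
  return ParamCheck::Ok;
}

// Small self-check of the three cases.
void check_method_parameters_selfcheck() {
  assert(check_method_parameters(-1, 2) == ParamCheck::NoAttribute);
  assert(check_method_parameters(0, 2) == ParamCheck::Malformed);
  assert(check_method_parameters(0, 0) == ParamCheck::Ok);
}
```

If 0 were used for both "absent" and "present but empty", the Malformed case could not be told apart from NoAttribute, which is the behaviour the bug report describes.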
>>>>> >>>>> This patch causes hotspot to record the fact that a zero-length >>>>> MethodParameters attribute does exist, causing the exception to be >>>>> thrown when it should be. >>>>> >>>>> The bug is here: >>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>> >>>>> The webrev is here: >>>>> http://cr.openjdk.java.net/~emc/8058313/ >> From kim.barrett at oracle.com Mon Nov 3 22:35:04 2014 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 3 Nov 2014 17:35:04 -0500 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> Message-ID: This is a review of the so-called "simple fix", e.g. this: http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.03.macro I haven't reviewed the inline assembly code very carefully. The gcc versions (linux, bsd) look correct to me (subject to comments below), but my knowledge of both x86 assembly and gcc inline assembly are somewhat rusty and should be considered suspect for this purpose. I only did a superficial review of the inline assembly code for Windows, as I'm not familiar with the syntax for that at all. ------------------------------------------------------------------------------ src/cpu/x86/vm/stubGenerator_x86_64.cpp 597 // Support for jbyte atomic::atomic_cmpxchg( [...] Shouldn't that be atomic_cmpxchg_byte, e.g. needs "_byte" suffix? That would be consistent with the comment for the later generate_atomic_cmpxchg_long(). But maybe that comment isn't right? [There isn't an Atomic::atomic_cmpxchg[_long](), only Atomic::cmpxchg() overloads. And since the argument types are explicit in these comments...] [legacy issue, not in changed code] I think the comment for generate_atomic_cmpxchg_long() is wrong in the return value; shouldn't it be returning a jlong? Probably a C-Y bug. 
------------------------------------------------------------------------------ src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) Why is the new byte version using "q" for exchange_value, where the existing int and long versions use "r"? [There might be a good reason, and this is just my rusty assembler skills showing.] src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp Similarly here, line 96. src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp Similarly here, line 231. ------------------------------------------------------------------------------ src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp src/os_cpu/windows_x86/vm/os_windows_x86.hpp The Windows port seems to only support specialized cmpxchgb when defined(AMD64), while the BSD/Linux variants don't have that restriction. Why this inconsistency? Or am I missing something, which seems entirely possible in this tangle. ------------------------------------------------------------------------------ src/os_cpu/solaris_x86/vm/solaris_x86_32.il 79 // Support fori jbyte Atomic::cmpxchg(jbyte exchange_value, "fori" => "for" ------------------------------------------------------------------------------ VM_HAS_SPECIALIZED_BYTE_CMPXCHG I think I would have called this VM_HAS_SPECIALIZED_CMPXCHG_BYTE, for consistency with the names of the helper primitives, but that might just be me. That way a case-insensitive search for "cmpxchg_byte" finds both this macro and those helpers. From erik.osterlund at lnu.se Tue Nov 4 00:21:01 2014 From: erik.osterlund at lnu.se (Erik Österlund) Date: Tue, 4 Nov 2014 00:21:01 +0000 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> Message-ID: <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> Hi Kim, Thanks a lot for taking a look at this!
:) On 03 Nov 2014, at 23:35, Kim Barrett wrote: > This is a review of the so-called "simple fix", e.g. this: > http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.03.macro > > I haven't reviewed the inline assembly code very carefully. The gcc > versions (linux, bsd) look correct to me (subject to comments below), > but my knowledge of both x86 assembly and gcc inline assembly are > somewhat rusty and should be considered suspect for this purpose. I > only did a superficial review of the inline assembly code for Windows, > as I'm not familiar with the syntax for that at all. > > ------------------------------------------------------------------------------ > > src/cpu/x86/vm/stubGenerator_x86_64.cpp > 597 // Support for jbyte atomic::atomic_cmpxchg( [...] > > Shouldn't that be atomic_cmpxchg_byte, e.g. needs "_byte" suffix? > That would be consistent with the comment for the later > generate_atomic_cmpxchg_long(). But maybe that comment isn't right? > [There isn't an Atomic::atomic_cmpxchg[_long](), only > Atomic::cmpxchg() overloads. And since the argument types are > explicit in these comments...] > I agree - would say the other comment for jlong CAS is a bit off - there is obviously no Atomic::cmpxchg_long. Maybe it was once called that in ancient times and the comment outlived its expiry date? Fixed it anyway... > [legacy issue, not in changed code] > I think the comment for generate_atomic_cmpxchg_long() is wrong in the > return value; shouldn't it be returning a jlong? Probably a C-Y bug. No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct. 
> > ------------------------------------------------------------------------------ > > src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp > 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) > > Why is the new byte version using "q" for exchange_value, where the > existing int and long versions use "r"? [There might be a good > reason, and this is just my rusty assembler skills showing.] > src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp > Similarly here, line 96. > > src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp > Similarly here, line 231. > With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r"). The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically. The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :) > ------------------------------------------------------------------------------ > > src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp > src/os_cpu/windows_x86/vm/os_windows_x86.hpp > > The windows port seems to only support specialized cmpxchgb when > defined(AMD64), while the BSD/Linux variants don't have that > restriction. Why this inconsistency? Or am I missing something, > which seems entirely possible in this tangle. If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub. Then there is another MSVC assembly variant for #ifndef AMD64. This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things. > > ------------------------------------------------------------------------------ > > src/os_cpu/solaris_x86/vm/solaris_x86_32.il > 79 // Support fori jbyte Atomic::cmpxchg(jbyte exchange_value, > "fori" => "for" Fixed. 
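The register constraints explained above can be sketched as follows. This is a hedged illustration and not the code from the webrev: cmpxchg_byte is a hypothetical helper, the MP-conditional lock prefix used in HotSpot is replaced by an unconditional lock, and the non-x86 branch falls back to a GCC/Clang builtin so that the sketch still compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only.
// Atomically: if (*dest == compare_value) *dest = exchange_value.
// Returns the value that was in *dest before the operation.
static inline int8_t cmpxchg_byte(int8_t exchange_value,
                                  volatile int8_t* dest,
                                  int8_t compare_value) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  int8_t old_value;
  __asm__ volatile ("lock cmpxchgb %1,(%3)"
                    : "=a" (old_value)       // previous value comes back in al
                    : "q" (exchange_value),  // needs a byte-addressable register
                      "a" (compare_value),   // cmpxchgb compares against al
                      "r" (dest)             // any general register for the address
                    : "cc", "memory");
  return old_value;
#else
  // Fallback via the GCC/Clang atomic builtin (assumes such a compiler).
  int8_t expected = compare_value;
  __atomic_compare_exchange_n(dest, &expected, exchange_value, false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;  // holds the old value on both success and failure
#endif
}

// Small self-check: one successful and one failing exchange.
static void cmpxchg_byte_selfcheck() {
  volatile int8_t v = 5;
  assert(cmpxchg_byte(9, &v, 5) == 5 && v == 9);  // matched, so 9 is stored
  assert(cmpxchg_byte(1, &v, 5) == 9 && v == 9);  // no match, v unchanged
}
```

The only difference from an int-sized variant in this sketch is the "q" constraint on the new value; "a" is required for the compare value by the instruction itself.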
> ------------------------------------------------------------------------------ > > VM_HAS_SPECIALIZED_BYTE_CMPXCHG > I think I would have called this VM_HAS_SPECIALIZED_CMPXCHG_BYTE, for > consistency with the names of the helper primitives, but that might > just be me. That way a case-insensitive search for "cmpxchg_byte" > finds both this macro and those helpers. Okay sure, fixed. Do you want a new webrev? (just polished comments and renamed the #define as per request) Thanks again Kim for looking through these changes! :) /Erik From tobias.hartmann at oracle.com Tue Nov 4 06:39:45 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 04 Nov 2014 07:39:45 +0100 Subject: [8u40] Backport requests for 8061817 and 8062169 In-Reply-To: <5457BD09.4000606@oracle.com> References: <54576255.2020000@oracle.com> <5457BD09.4000606@oracle.com> Message-ID: <545874B1.1030906@oracle.com> Thanks, Vladimir. Best, Tobias On 03.11.2014 18:36, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 11/3/14 3:09 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following backport requests for 8u40. >> >> (1) 8061817: Whitebox.deoptimizeMethod() does not deoptimize all OSR versions of >> method >> https://bugs.openjdk.java.net/browse/JDK-8061817 >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/d30c8335bb6f >> >> (2) 8062169: Multiple OSR compilations issued for same bci >> https://bugs.openjdk.java.net/browse/JDK-8062169 >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/8edc39841abe >> >> The changes were pushed last week. Nightly testing showed no problems. The >> changes apply cleanly to 8u40. 
>> >> Thanks, >> Tobias >> >> From mikael.gerdin at oracle.com Tue Nov 4 07:51:05 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 04 Nov 2014 08:51:05 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <5457D0B9.10506@oracle.com> References: <54577C96.5030503@oracle.com> <5457D0B9.10506@oracle.com> Message-ID: <54588569.1090402@oracle.com> Dan, On 2014-11-03 20:00, Daniel D. Daugherty wrote: > > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ > > src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp > No comments. > > Thanks for being persistent and chasing this crazy bug to ground. > > I have one administrative request. Can you please add a complete > set of "gdb session notes" to the bug report where you show the > good code in JDK7-B108 and the bad code in JDK7-B109? Please do > the same for your baseline bits and your fixed bits. Will do. > > Every time we consider a GCC upgrade, I would like it to be easy > for someone to verify that this code sequence is not messed up. > It should also make it easy for other folks to adapt your sequence > to investigate their own possibly broken code sequences... > > As for testing, I think our usual battery of JPRT, PIT, and promotion > testing will have to do. Yep, it may be a good idea to keep in mind that if "impossible" conditions seem to have occurred in a core file then it may be a good idea to inspect the generated assembly instruction sequences to make sure that no incorrect reordering has occurred. > > Personally, once your fix is in I'm planning to drop the bits onto > my Linux DevOps machine and see if the following still repos: > > 8047212 runtime/ParallelClassLoading/bootstrap/random/inner-complex > assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed: > monitor is invalid > > Again, thanks for fixing this! Thanks for reviewing, Dan. 
/Mikael > > Dan > > > On 11/3/14 6:01 AM, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on >> Linux x86 with GCC. >> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. >> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was >> fixed in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the >> store of %edx to 0x138(%rbx) in the same build as JDK-6973570 was >> integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. 
>> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. >> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. >> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. >> >> Thanks >> Mikael > From mikael.gerdin at oracle.com Tue Nov 4 07:52:25 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 04 Nov 2014 08:52:25 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <5457E55D.8060304@oracle.com> References: <54577C96.5030503@oracle.com> <5457E55D.8060304@oracle.com> Message-ID: <545885B9.1030203@oracle.com> Dean, On 2014-11-03 21:28, Dean Long wrote: > Do we need a compiler barrier for OrderAccess::*acquire as well? As far as I understand, __asm__ volatile ("movq 0(%%rsp), %0" : "=r" (local_dummy) : : "memory"); should work as a memory barrier since the last parameter informs the compiler that the assembly sequence clobbers memory and should invalidate all assumptions about memory after this point. /Mikael > > dl > > On 11/3/2014 5:01 AM, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on >> Linux x86 with GCC. >> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. 
>> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was >> fixed in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the >> store of %edx to 0x138(%rbx) in the same build as JDK-6973570 was >> integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. >> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. >> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. 
>> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. >> >> Thanks >> Mikael > From goetz.lindenmaier at sap.com Tue Nov 4 09:34:31 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 4 Nov 2014 09:34:31 +0000 Subject: RFR (L): 8062370: Various minor code improvements Message-ID: <4295855A5C1DE049A61835A1887419CC2CF23EDB@DEWDFEMB12A.global.corp.sap> Hi, could anybody have a look at this change, please? I think it contains a lot of fixes that improve the code quality. Thanks and best regards, Goetz. From: Lindenmaier, Goetz Sent: Thursday, 30 October 2014 09:28 To: hotspot-dev at openjdk.java.net Subject: RFR (L): 8062370: Various minor code improvements Hi, this change contains a series of minor code improvements we made to fulfil our internal quality requirements. We would like to share these with OpenJDK. Please review and test this change. I need a sponsor, please. http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8062370 We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, of course, the ppc platforms. Some details: CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64, which is what the CONST64 macro uses. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. We add some missing memory frees and some closing of files. jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms.
See java.cpp. Best regards, Goetz From igor.veresov at oracle.com Tue Nov 4 09:38:49 2014 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 3 Nov 2014 23:38:49 -1000 Subject: [8u40] RFR(M) 8041984: CompilerThread seems to occupy all CPU in a very rare situation In-Reply-To: <545145F5.5010207@oracle.com> References: <545145F5.5010207@oracle.com> Message-ID: Good. igor On Oct 29, 2014, at 9:54 AM, Vladimir Kozlov wrote: > Backport request. Changes were pushed into jdk9 last week. Nightlies are fine. Changes apply cleanly to 8u sources. > > https://bugs.openjdk.java.net/browse/JDK-8041984 > http://cr.openjdk.java.net/~kvn/8041984/webrev.01/ > > Review thread: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2014-October/015855.html > > jdk9 changeset: > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/b9576378eaad > > Thanks, > Vladimir From david.holmes at oracle.com Tue Nov 4 11:15:54 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 04 Nov 2014 21:15:54 +1000 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <54577C96.5030503@oracle.com> References: <54577C96.5030503@oracle.com> Message-ID: <5458B56A.10801@oracle.com> Hi Mikael, Thanks for fixing this. Given that x86 is a TSO system, my understanding from the previous discussion in 6973570 is that once you have the compiler barrier, the volatile write to the dummy variable in storestore() is no longer needed - its only purpose was to introduce the compiler barrier (which it failed to do). Secondly, do we observe the same bug with the release_store operations or are we just being conservative? I would not expect to need the compiler barrier prior to the Atomic::store, but I would assume this is completely harmless and performance neutral, so we should have it for completeness regardless.
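The compiler barrier under discussion can be sketched as below. This is an illustration under stated assumptions, not the patch from the webrev: Space is a hypothetical stand-in for the G1 class in the report, release_store here is a simplified local helper rather than OrderAccess::release_store, and the sketch assumes GCC or Clang on a TSO target such as x86, where stores are not reordered by the hardware and only the compiler has to be restrained.

```cpp
#include <cassert>
#include <cstdint>

// Compiler-only barrier: emits no instruction, but the "memory" clobber tells
// GCC/Clang that memory may change here, so it must not move loads or stores
// across this point.
static inline void compiler_barrier() {
  __asm__ volatile ("" : : : "memory");
}

// Hypothetical stand-in for the fields named in the bug report.
struct Space {
  uintptr_t top;
  uintptr_t saved_mark_word;
  volatile unsigned gc_time_stamp;
};

// Simplified release-style store: earlier stores are kept before this one
// at the compiler level (assumes hardware with TSO ordering).
static inline void release_store(volatile unsigned* p, unsigned v) {
  compiler_barrier();
  *p = v;
}

// Ordered publish: saved_mark_word must be written before the time stamp.
void record_top_and_timestamp(Space* s, unsigned curr_gc_time_stamp) {
  s->saved_mark_word = s->top;
  release_store(&s->gc_time_stamp, curr_gc_time_stamp);
}

// Single-threaded sanity check of the resulting values only.
void record_top_and_timestamp_selfcheck() {
  Space s = {0x58, 0, 0};
  record_top_and_timestamp(&s, 7);
  assert(s.saved_mark_word == 0x58);
  assert(s.gc_time_stamp == 7);
}
```

A test like this cannot show the ordering itself; as in the report, the store order has to be confirmed by inspecting the generated assembly for the publishing function.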
Finally, you answered Dean regarding the simple acquire() function but the load_acquire variants all rely on volatile semantics - which we've seen not to provide what we expected. So perhaps the load_acquire functions also need a compile_barrier inserted ? Thanks, David On 3/11/2014 11:01 PM, Mikael Gerdin wrote: > Hi all, > > Please review this attempt at fixing the OrderAccess functions on Linux > x86 with GCC. > > While working on another bug I recently discovered that g++ was > reordering stores across a call to OrderAccess::storestore on Linux x86. > > The G1 code attempts to do an ordered publishing of two values: > _saved_mark_word = _top; > OrderAccess::storestore(); > _gc_time_stamp = curr_gc_time_stamp; > > The types involved are > HeapWord* _top, _saved_mark_word; > volatile unsigned _gc_time_stamp; > > The incorrect behavior seems to have started when JDK-6973570 was fixed > in JDK 7. > Below, _top is at offset 0x58, _saved_mark_word at 0x18 and > _gc_time_stamp at 0x138, %rbx is "this". > > /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: > > 3d9f4d: 39 d0 cmp %edx,%eax > 3d9f4f: 73 1c jae 3d9f6d > > 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax > 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) > 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # > ae98a0 <_DYNAMIC+0x12f8> > 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) > 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > > /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so > > 3da05d: 39 d0 cmp %edx,%eax > 3da05f: 73 15 jae 3da076 > > 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax > 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) > 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > 3da072: 48 89 43 18 mov %rax,0x18(%rbx) > > In b109 the store of %rax to 0x18(%rbx) has been ordered after the store > of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. 
> > My suggestion to fix this is to extend all the OrderAccess::release* > variants on x86 with a: > __asm__ volatile ("" : : : "memory"); > to attempt to prevent GCC from reordering any memory accesses across > those function calls. > > I've verified that this solves the issue in the assembly with our > current JDK 9 build platform compilers. > I've also verified that this particular piece of code is compiled > correctly on our other x86 platforms: Solaris, Windows and OS X. > > Webrev: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061964 > Testing: > JPRT, inspecting generated assembly for the function > G1OffsetTableContigSpace::record_top_and_timestamp (as the method is > currently named). > Suggestions of further testing is greatly appreciated. > > Thanks > Mikael From mikael.gerdin at oracle.com Tue Nov 4 12:35:15 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 04 Nov 2014 13:35:15 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <5458B56A.10801@oracle.com> References: <54577C96.5030503@oracle.com> <5458B56A.10801@oracle.com> Message-ID: <5458C803.9040802@oracle.com> Hi David, On 2014-11-04 12:15, David Holmes wrote: > Hi Mikael, > > Thanks for fixing this. > > Given x86 is a TSO system my understanding from the previous discussion > in 6973570 is that once you have the compiler barrier the volatile write > to the dummy variable in storestore() is no longer needed - it's only > purpose was to introduce the compiler barrier (which it failed to do). I felt that removing the dummy store would be more risky since I have no good way of verifying that removing the volatile write doesn't break ordering for some other caller of storestore(). > > Secondly, do we observe the same bug with the release_store operations > or are we just being conservative? 
I would not expect to need the > compiler barrier prior to the Atomic::store, but I would assume this is > completely harmless and performance neutral, so we should have it for > completeness regardless. I tried changing the problematic code to do a release_store of _gc_time_stamp and saw the same reordering problem, that's the reason for adding compiler_barrier to the release_store variants. Any suggestion on where to put compiler_barrier in order to use it from atomic_linux_x86.inline.hpp? > > Finally, you answered Dean regarding the simple acquire() function but > the load_acquire variants all rely on volatile semantics - which we've > seen not to provide what we expected. So perhaps the load_acquire > functions also need a compile_barrier inserted ? Perhaps. I haven't observed any problems with the acquire functions so I can't determine if they need the compiler barrier or not. It may be a good idea to add the compiler barrier to those functions as well, what are the other reviewers' opinions on this? Thanks for the review David. /Mikael > > Thanks, > David > > On 3/11/2014 11:01 PM, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on Linux >> x86 with GCC. >> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. >> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was fixed >> in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". 
>> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. >> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. >> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. >> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. 
>> >> Thanks >> Mikael From albert.noll at oracle.com Tue Nov 4 13:01:05 2014 From: albert.noll at oracle.com (Albert Noll) Date: Tue, 04 Nov 2014 14:01:05 +0100 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA Message-ID: <5458CE11.305@oracle.com> Hi, could I get reviews for this small patch? Bug: https://bugs.openjdk.java.net/browse/JDK-8062735 Problem: The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not add this new type to SA. Solution: Add type to SA. Testing: Failing test cases. Webrev: http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ Many thanks, Albert From bertrand.delsart at oracle.com Tue Nov 4 13:06:31 2014 From: bertrand.delsart at oracle.com (Bertrand Delsart) Date: Tue, 04 Nov 2014 14:06:31 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <5458C803.9040802@oracle.com> References: <54577C96.5030503@oracle.com> <5458B56A.10801@oracle.com> <5458C803.9040802@oracle.com> Message-ID: <5458CF57.6070306@oracle.com> Hi Mikael, On 04/11/2014 13:35, Mikael Gerdin wrote: > Hi David, > > On 2014-11-04 12:15, David Holmes wrote: >> Hi Mikael, >> >> Thanks for fixing this. >> >> Given x86 is a TSO system my understanding from the previous discussion >> in 6973570 is that once you have the compiler barrier the volatile write >> to the dummy variable in storestore() is no longer needed - it's only >> purpose was to introduce the compiler barrier (which it failed to do). > > I felt that removing the dummy store would be more risky since I have no > good way of verifying that removing the volatile write doesn't break > ordering for some other caller of storestore(). I agree with David that the dummy volatile store is no longer needed. It should IMHO be removed to clean the code and avoid confusing future readers. >> Secondly, do we observe the same bug with the release_store operations >> or are we just being conservative? 
I would not expect to need the
>> compiler barrier prior to the Atomic::store, but I would assume this is
>> completely harmless and performance neutral, so we should have it for
>> completeness regardless.
>
> I tried changing the problematic code to do a release_store of
> _gc_time_stamp and saw the same reordering problem, that's the reason
> for adding compiler_barrier to the release_store variants.

I think David was just referring to the jlong versions, which use the
Atomic class. However, since compiler_barrier() is performance neutral
when redundant, it is IMHO better to keep it. That avoids depending on
how Atomic::store is specified and implemented for jlong.

> Any suggestion on where to put compiler_barrier in order to use it from
> atomic_linux_x86.inline.hpp?
>
>>
>> Finally, you answered Dean regarding the simple acquire() function but
>> the load_acquire variants all rely on volatile semantics - which we've
>> seen not to provide what we expected. So perhaps the load_acquire
>> functions also need a compile_barrier inserted ?
>
> Perhaps. I haven't observed any problems with the acquire functions so I
> can't determine if they need the compiler barrier or not.
>
> It may be a good idea to add the compiler barrier to those functions as
> well, what are the other reviewers' opinions on this?

If the barrier is redundant (e.g. because the volatile load already acts
as a compiler barrier), then it is performance neutral. Hence, IMHO, it
is better to play it safe and add the compiler_barrier in load_acquire
(after the loads).

Regards,

Bertrand

>
> Thanks for the review David.
> /Mikael
>
>
>>
>> Thanks,
>> David
>>
>> On 3/11/2014 11:01 PM, Mikael Gerdin wrote:
>>> Hi all,
>>>
>>> Please review this attempt at fixing the OrderAccess functions on Linux
>>> x86 with GCC.
>>>
>>> While working on another bug I recently discovered that g++ was
>>> reordering stores across a call to OrderAccess::storestore on Linux x86.
>>> >>> The G1 code attempts to do an ordered publishing of two values: >>> _saved_mark_word = _top; >>> OrderAccess::storestore(); >>> _gc_time_stamp = curr_gc_time_stamp; >>> >>> The types involved are >>> HeapWord* _top, _saved_mark_word; >>> volatile unsigned _gc_time_stamp; >>> >>> The incorrect behavior seems to have started when JDK-6973570 was fixed >>> in JDK 7. >>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >>> _gc_time_stamp at 0x138, %rbx is "this". >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>> >>> >>> >>> 3d9f4d: 39 d0 cmp %edx,%eax >>> 3d9f4f: 73 1c jae 3d9f6d >>> >>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>> ae98a0 <_DYNAMIC+0x12f8> >>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>> >>> >>> >>> 3da05d: 39 d0 cmp %edx,%eax >>> 3da05f: 73 15 jae 3da076 >>> >>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>> >>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >>> >>> My suggestion to fix this is to extend all the OrderAccess::release* >>> variants on x86 with a: >>> __asm__ volatile ("" : : : "memory"); >>> to attempt to prevent GCC from reordering any memory accesses across >>> those function calls. >>> >>> I've verified that this solves the issue in the assembly with our >>> current JDK 9 build platform compilers. >>> I've also verified that this particular piece of code is compiled >>> correctly on our other x86 platforms: Solaris, Windows and OS X. 
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8061964
>>> Testing:
>>> JPRT, inspecting generated assembly for the function
>>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is
>>> currently named).
>>> Suggestions of further testing is greatly appreciated.
>>>
>>> Thanks
>>> Mikael

--
Bertrand Delsart, Grenoble Engineering Center
Oracle, 180 av. de l'Europe, ZIRST de Montbonnot
38334 Saint Ismier, FRANCE
bertrand.delsart at oracle.com
Phone : +33 4 76 18 81 23

From erik.osterlund at lnu.se Tue Nov 4 13:21:25 2014
From: erik.osterlund at lnu.se (Erik Österlund)
Date: Tue, 4 Nov 2014 13:21:25 +0000
Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions
In-Reply-To: <5458C803.9040802@oracle.com>
References: <54577C96.5030503@oracle.com> <5458B56A.10801@oracle.com> <5458C803.9040802@oracle.com>
Message-ID:

Hi guys,

Small comment on this discussion.

On 04 Nov 2014, at 13:35, Mikael Gerdin wrote:

> Hi David,
>
> On 2014-11-04 12:15, David Holmes wrote:
>> Hi Mikael,
>>
>> Thanks for fixing this.
>>
>> Given x86 is a TSO system my understanding from the previous discussion
>> in 6973570 is that once you have the compiler barrier the volatile write
>> to the dummy variable in storestore() is no longer needed - it's only
>> purpose was to introduce the compiler barrier (which it failed to do).
>
> I felt that removing the dummy store would be more risky since I have no good way of verifying that removing the volatile write doesn't break ordering for some other caller of storestore().
>
>>
>> Secondly, do we observe the same bug with the release_store operations
>> or are we just being conservative? I would not expect to need the
>> compiler barrier prior to the Atomic::store, but I would assume this is
>> completely harmless and performance neutral, so we should have it for
>> completeness regardless.
>
> I tried changing the problematic code to do a release_store of _gc_time_stamp and saw the same reordering problem, that's the reason for adding compiler_barrier to the release_store variants.
>
> Any suggestion on where to put compiler_barrier in order to use it from atomic_linux_x86.inline.hpp?
>
>>
>> Finally, you answered Dean regarding the simple acquire() function but
>> the load_acquire variants all rely on volatile semantics - which we've
>> seen not to provide what we expected. So perhaps the load_acquire
>> functions also need a compile_barrier inserted ?
>
> Perhaps. I haven't observed any problems with the acquire functions so I can't determine if they need the compiler barrier or not.
>
> It may be a good idea to add the compiler barrier to those functions as well, what are the other reviewers' opinions on this?

The issue is that volatiles in the C++ standard prevent reordering of one volatile access w.r.t. another volatile access, but not w.r.t. non-volatile accesses.
Both acquire and release semantics, however, must constrain the reordering of non-volatile accesses too. Therefore, the missing compiler barrier for acquire may not have shown up as a problem in the code yet, but I'm convinced that leaving it out is hazardous.
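The point above, that a volatile access on its own does not pin down the neighbouring non-volatile accesses, can be sketched roughly as follows. This is only an illustration and not the HotSpot OrderAccess code: the names compiler_barrier, release_store and load_acquire are borrowed from this thread, but their bodies here, the variables payload and ready, and the functions writer, reader and publish_and_read are hypothetical. The sketch assumes GCC or Clang (for the inline-asm syntax) on x86, where the hardware already preserves store-store and load-load order, so only the compiler needs to be restrained.

```cpp
// Sketch only: hypothetical stand-ins, not the HotSpot OrderAccess implementation.
// Assumes GCC/Clang inline asm and x86 (TSO), where a compiler barrier alone is
// enough to obtain release/acquire ordering.

// Emits no instruction; the "memory" clobber tells the compiler that it must not
// move memory accesses across this point.
static inline void compiler_barrier() {
  __asm__ volatile ("" : : : "memory");
}

static int payload = 0;         // plays the role of x_1 (non-volatile)
static volatile int ready = 0;  // plays the role of x_2 (volatile)

static inline void release_store(volatile int* p, int v) {
  compiler_barrier();  // earlier accesses stay above the store
  *p = v;
}

static inline int load_acquire(const volatile int* p) {
  int v = *p;
  compiler_barrier();  // later accesses stay below the load
  return v;
}

// Writer side: write the data, then publish the flag.
void writer() {
  payload = 42;
  release_store(&ready, 1);
}

// Reader side: wait for the flag, then read the data.
int reader() {
  while (load_acquire(&ready) == 0) { /* spin */ }
  return payload;
}

// Single-threaded smoke check of the plumbing; it does not exercise reordering.
int publish_and_read() {
  writer();
  return reader();
}
```

In portable C++11 code the same pairing would normally be expressed with std::atomic and memory_order_release / memory_order_acquire instead of hand-written barriers.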
Example of a pair of release_store synchronizing with a load_acquire:

release_store in T1:
  non-volatile write x_1
  release <-- now enforced properly with compiler barrier
  volatile write x_2

load_acquire in T2:
  volatile load x_2
  acquire <--- not currently enforced properly!
  non-volatile load x_1

Now by fixing the compiler barrier of release store, the non-volatile write to x_1 is properly kept above the volatile write to x_2.
However, by not having a compiler barrier for the load_acquire, there is nothing AFAIK that prevents the volatile load x_2 from reordering with the non-volatile load x_1, since volatile accesses are only constrained in order with respect to other volatile accesses.

My 50 cents...

/Erik

>
> Thanks for the review David.
> /Mikael
>
>
>>
>> Thanks,
>> David
>>
>> On 3/11/2014 11:01 PM, Mikael Gerdin wrote:
>>> Hi all,
>>>
>>> Please review this attempt at fixing the OrderAccess functions on Linux
>>> x86 with GCC.
>>>
>>> While working on another bug I recently discovered that g++ was
>>> reordering stores across a call to OrderAccess::storestore on Linux x86.
>>>
>>> The G1 code attempts to do an ordered publishing of two values:
>>> _saved_mark_word = _top;
>>> OrderAccess::storestore();
>>> _gc_time_stamp = curr_gc_time_stamp;
>>>
>>> The types involved are
>>> HeapWord* _top, _saved_mark_word;
>>> volatile unsigned _gc_time_stamp;
>>>
>>> The incorrect behavior seems to have started when JDK-6973570 was fixed
>>> in JDK 7.
>>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and
>>> _gc_time_stamp at 0x138, %rbx is "this".
>>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>> >>> >>> 3d9f4d: 39 d0 cmp %edx,%eax >>> 3d9f4f: 73 1c jae 3d9f6d >>> >>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>> ae98a0 <_DYNAMIC+0x12f8> >>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>> >>> >>> 3da05d: 39 d0 cmp %edx,%eax >>> 3da05f: 73 15 jae 3da076 >>> >>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>> >>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >>> >>> My suggestion to fix this is to extend all the OrderAccess::release* >>> variants on x86 with a: >>> __asm__ volatile ("" : : : "memory"); >>> to attempt to prevent GCC from reordering any memory accesses across >>> those function calls. >>> >>> I've verified that this solves the issue in the assembly with our >>> current JDK 9 build platform compilers. >>> I've also verified that this particular piece of code is compiled >>> correctly on our other x86 platforms: Solaris, Windows and OS X. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8061964 >>> Testing: >>> JPRT, inspecting generated assembly for the function >>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >>> currently named). >>> Suggestions of further testing is greatly appreciated. 
>>>
>>> Thanks
>>> Mikael

From mikael.gerdin at oracle.com Tue Nov 4 14:09:34 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Tue, 04 Nov 2014 15:09:34 +0100
Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions
In-Reply-To:
References: <54577C96.5030503@oracle.com> <5458B56A.10801@oracle.com> <5458C803.9040802@oracle.com>
Message-ID: <5458DE1E.5050108@oracle.com>

Hi Erik,

On 2014-11-04 14:21, Erik Österlund wrote:
> Hi guys,
>
> Small comment on this discussion.
>
> On 04 Nov 2014, at 13:35, Mikael Gerdin wrote:
>
>> Hi David,
>>
>> On 2014-11-04 12:15, David Holmes wrote:
>>> Hi Mikael,
>>>
>>> Thanks for fixing this.
>>>
>>> Given x86 is a TSO system my understanding from the previous discussion
>>> in 6973570 is that once you have the compiler barrier the volatile write
>>> to the dummy variable in storestore() is no longer needed - it's only
>>> purpose was to introduce the compiler barrier (which it failed to do).
>>
>> I felt that removing the dummy store would be more risky since I have no good way of verifying that removing the volatile write doesn't break ordering for some other caller of storestore().
>>
>>>
>>> Secondly, do we observe the same bug with the release_store operations
>>> or are we just being conservative? I would not expect to need the
>>> compiler barrier prior to the Atomic::store, but I would assume this is
>>> completely harmless and performance neutral, so we should have it for
>>> completeness regardless.
>>
>> I tried changing the problematic code to do a release_store of _gc_time_stamp and saw the same reordering problem, that's the reason for adding compiler_barrier to the release_store variants.
>>
>> Any suggestion on where to put compiler_barrier in order to use it from atomic_linux_x86.inline.hpp?
>> >>> >>> Finally, you answered Dean regarding the simple acquire() function but >>> the load_acquire variants all rely on volatile semantics - which we've >>> seen not to provide what we expected. So perhaps the load_acquire >>> functions also need a compile_barrier inserted ? >> >> Perhaps. I haven't observed any problems with the acquire functions so I can't determine if they need the compiler barrier or not. >> >> It may be a good idea to add the compiler barrier to those functions as well, what are the other reviewers' opinions on this? > > The issue is that volatiles in the C++ standard prevent reordering of one volatile access w.r.t. another volatile access, but not other non-volatile accesses. > Both acquire and release semantics however should consider reordering of non-volatile accesses to. Therefore, not issuing a compiler barrier for acquire may not be shown as an issue in code yet, but I'm convinced it's hazardous not to. > > Example of a pair of release_store synchronizing with a load_acquire: > > release_store in T1: > non-volatile write x_1 > release <-- now enforced properly with compiler barrier > volatile write x_2 > > load_acquire in T2: > volatile load x_2 > acquire <--- not currently enforced properly! > non-volatile load x_1 > > Now by fixing the compiler barrier of release store, the non-volatile write to x_1 is properly kept above the volatile x_2. > However, by not having a compiler barrier for the load_acquire, there is nothing AFAIK that prevents the volatile load x_2 from reordering with the non-volatile load x_1, since volatile accesses are only constrained in order with respect to other volatile accesses. > > My 50 cents... Thanks for your feedback Erik, I think I've been convinced to add the compiler barrier to the load_acquire variants as well. I'll prepare an updated webrev. /Mikael > > /Erik > >> >> Thanks for the review David. 
>> /Mikael >> >> >>> >>> Thanks, >>> David >>> >>> On 3/11/2014 11:01 PM, Mikael Gerdin wrote: >>>> Hi all, >>>> >>>> Please review this attempt at fixing the OrderAccess functions on Linux >>>> x86 with GCC. >>>> >>>> While working on another bug I recently discovered that g++ was >>>> reordering stores across a call to OrderAccess::storestore on Linux x86. >>>> >>>> The G1 code attempts to do an ordered publishing of two values: >>>> _saved_mark_word = _top; >>>> OrderAccess::storestore(); >>>> _gc_time_stamp = curr_gc_time_stamp; >>>> >>>> The types involved are >>>> HeapWord* _top, _saved_mark_word; >>>> volatile unsigned _gc_time_stamp; >>>> >>>> The incorrect behavior seems to have started when JDK-6973570 was fixed >>>> in JDK 7. >>>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >>>> _gc_time_stamp at 0x138, %rbx is "this". >>>> >>>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>>> >>>> >>>> 3d9f4d: 39 d0 cmp %edx,%eax >>>> 3d9f4f: 73 1c jae 3d9f6d >>>> >>>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>>> ae98a0 <_DYNAMIC+0x12f8> >>>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>>> >>>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>>> >>>> >>>> 3da05d: 39 d0 cmp %edx,%eax >>>> 3da05f: 73 15 jae 3da076 >>>> >>>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>>> >>>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. 
>>>>
>>>> My suggestion to fix this is to extend all the OrderAccess::release*
>>>> variants on x86 with a:
>>>> __asm__ volatile ("" : : : "memory");
>>>> to attempt to prevent GCC from reordering any memory accesses across
>>>> those function calls.
>>>>
>>>> I've verified that this solves the issue in the assembly with our
>>>> current JDK 9 build platform compilers.
>>>> I've also verified that this particular piece of code is compiled
>>>> correctly on our other x86 platforms: Solaris, Windows and OS X.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8061964
>>>> Testing:
>>>> JPRT, inspecting generated assembly for the function
>>>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is
>>>> currently named).
>>>> Suggestions of further testing is greatly appreciated.
>>>>
>>>> Thanks
>>>> Mikael
>

From coleen.phillimore at oracle.com Tue Nov 4 15:13:20 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 04 Nov 2014 10:13:20 -0500
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF23EDB@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC2CF23EDB@DEWDFEMB12A.global.corp.sap>
Message-ID: <5458ED10.2010405@oracle.com>

I agree that it's an improvement. I started to look at it and I will sponsor it. Do you have the new version of 'webrev' with next navigation? That would make this easier.

thanks,
Coleen

On 11/04/2014 04:34 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> could anybody have a look at this change, please?
> I think it contains a lot of fixes useful to improve the code quality.
>
> Thanks and best regards,
> Goetz.
>
> From: Lindenmaier, Goetz
> Sent: Thursday, October 30, 2014 09:28
> To: hotspot-dev at openjdk.java.net
> Subject: RFR (L): 8062370: Various minor code improvements
>
> Hi,
>
> this change contains a row of minor code improvements we did to fulfil
> our internal quality requirements. We would like to share these with
> openJDK.
>
> Please review and test this change. I please need a sponsor.
> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8062370
>
> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and,
> of course, the ppc platforms.
>
>
> Some details:
>
> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary.
>
> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long.
>
> We add some missing memory frees and some closing of files.
>
> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp.
>
> Best regards,
>
> Goetz
>
>
>
>

From coleen.phillimore at oracle.com Tue Nov 4 15:28:57 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 04 Nov 2014 10:28:57 -0500
Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA
In-Reply-To: <5458CE11.305@oracle.com>
References: <5458CE11.305@oracle.com>
Message-ID: <5458F0B9.9050005@oracle.com>

This looks good to me.
Coleen

On 11/04/2014 08:01 AM, Albert Noll wrote:
> Hi,
>
> could I get reviews for this small patch?
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8062735
>
> Problem:
> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not
> add this new type to SA.
>
> Solution:
> Add type to SA.
>
> Testing:
> Failing test cases.
>
> Webrev:
> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/
>
> Many thanks,
> Albert
>

From goetz.lindenmaier at sap.com Tue Nov 4 16:55:04 2014
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 4 Nov 2014 16:55:04 +0000
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To: <5458ED10.2010405@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CF23EDB@DEWDFEMB12A.global.corp.sap> <5458ED10.2010405@oracle.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CF241B6@DEWDFEMB12A.global.corp.sap>

Hi Coleen,

thanks for your support!
I updated the webrev with a version generated with the new tool. No code changes.

Best regards,
Goetz.

-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Coleen Phillimore
Sent: Tuesday, November 4, 2014 16:13
To: hotspot-dev at openjdk.java.net
Subject: Re: RFR (L): 8062370: Various minor code improvements

I agree that it's an improvement. I started to look at it and I will sponsor it. Do you have the new version of 'webrev' with next navigation? That would make this easier.

thanks,
Coleen

On 11/04/2014 04:34 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> could anybody have a look at this change, please?
> I think it contains a lot of fixes useful to improve the code quality.
>
> Thanks and best regards,
> Goetz.
>
> From: Lindenmaier, Goetz
> Sent: Thursday, October 30, 2014 09:28
> To: hotspot-dev at openjdk.java.net
> Subject: RFR (L): 8062370: Various minor code improvements
>
> Hi,
>
> this change contains a row of minor code improvements we did to fulfil
> our internal quality requirements. We would like to share these with
> openJDK.
>
> Please review and test this change. I please need a sponsor.
> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8062370
>
> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and,
> of course, the ppc platforms.
> > > Some details: > > CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. > > We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. > > We add some missing memory frees and some closing of files. > > jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. > > Best regards, > > Goetz > > > > From andreas.eriksson at oracle.com Tue Nov 4 16:57:22 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Tue, 04 Nov 2014 17:57:22 +0100 Subject: FYI: Jdk8 backport, Was: Re: RFR(M): 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <54479AEA.8040308@oracle.com> References: <54368BC1.3000905@oracle.com> <54390854.1090109@oracle.com> <543BA4DB.3000104@oracle.com> <543DFF76.80605@oracle.com> <543E590A.4020304@oracle.com> <543E712D.90900@oracle.com> <543E77CA.3090806@oracle.com> <543E7B08.8040305@oracle.com> <543FA9AD.2070404@oracle.com> <5445686F.4090103@oracle.com> <5446B9FA.4080201@oracle.com> <5446C254.7040801@oracle.com> <54479AEA.8040308@oracle.com> Message-ID: <54590572.2030302@oracle.com> Hi, Just wanted to let the list know that I'm about to backport this change to jdk8. Regards, Andreas On 2014-10-22 13:54, Andreas Eriksson wrote: > Thanks Serguei! > > Regards, > Andreas > > On 2014-10-21 22:30, serguei.spitsyn at oracle.com wrote: >> Hi Andreas, >> >> Very nice, thank you for the refactoring! >> Thumbs up. >> >> Thanks, >> Serguei >> >> >> On 10/21/14 12:54 PM, Andreas Eriksson wrote: >>> Hi Serguei, >>> >>> I split up the method into several, and made the verification before >>> and after retransform share logic. 
>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8057043/webrev.02/ >>> >>> Regards, >>> Andreas >>> >>> On 2014-10-20 21:54, serguei.spitsyn at oracle.com wrote: >>>> Hi Andreas, >>>> >>>> Sorry for the delay. >>>> >>>> On 10/16/14 4:19 AM, Andreas Eriksson wrote: >>>>> >>>>> On 2014-10-15 15:47, Daniel D. Daugherty wrote: >>>>>> On 10/15/14 7:34 AM, Coleen Phillimore wrote: >>>>>>> >>>>>>> There are lots of other rewrite_cp_refs_in* function calls. >>>>>>> Please indent your function like them, not differently. >>>>>> >>>>>> The above implies that my answer below was made without sufficient >>>>>> context... my apologies for that. >>>>>> >>>>>> The general rule is to follow the existing style in the file so >>>>>> if there are rewrite_cp_refs_in* function calls in the file, then >>>>>> please follow that style. Unless, of course, you want to fix all >>>>>> of them to follow the HotSpot style guideline: >>>>>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>>>>> >>>>>> > Use good taste to break lines and align corresponding tokens >>>>>> > on adjacent lines. >>>>>> >>>>>> but that may cause Coleen some heartburn :-) >>>>> >>>>> I fixed the calls to follow the already existing indent style. >>>>> I have also made changes to the test, which I hope Joel can take a >>>>> look at. >>>>> >>>>> New webrev: >>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.01/ >>>> >>>> The fix looks good. >>>> >>>> A couple of comments about the test. >>>> >>>> The method testTransformAndVerify() is too big. >>>> At least, it looks like there are some ways to refactor it to make >>>> calls to smaller methods. 
>>>> >>>> There are two directions of doing it: >>>> - make a smaller method out of each block: >>>> 217-236, 238-260, 262-276, 311-329, 331-351, 353-367 >>>> - some of the lines sequences looks very typical: >>>> 221 at = c.getDeclaredField("typeAnnotatedArray").getAnnotatedType(); >>>> 222 arrayTA1 = at.getAnnotations()[0]; >>>> 223 verifyTestAnnSite(arrayTA1, "array1"); >>>> 224 >>>> 225 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType(); >>>> 226 arrayTA2 = at.getAnnotations()[0]; >>>> 227 verifyTestAnnSite(arrayTA2, "array2"); >>>> 228 >>>> 229 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType(); >>>> 230 arrayTA3 = at.getAnnotations()[0]; >>>> 231 verifyTestAnnSite(arrayTA3, "array3"); >>>> 232 >>>> 233 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType(); >>>> 234 arrayTA4 = at.getAnnotations()[0]; >>>> 235 verifyTestAnnSite(arrayTA4, "array4"); >>>> But I leave it up to you. >>>> >>>> Another step to improve the readability is to add a short comment >>>> for each block of code saying what is done there. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> >>>>> Thanks, >>>>> Andreas >>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Coleen >>>>>>> >>>>>>> On 10/15/14, 9:05 AM, Daniel D. Daugherty wrote: >>>>>>>> On 10/15/14 5:22 AM, Andreas Eriksson wrote: >>>>>>>>> Thanks Serguei. >>>>>>>>> >>>>>>>>> I have a question about the if-blocks that had the wrong indent: >>>>>>>>> >>>>>>>>> 2335 if >>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>> >>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>> >>>>>>>>> How should I indent them? >>>>>>>> >>>>>>>> Trying again without the line numbers... 
>>>>>>>> >>>>>>>> if >>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>> >>>>>>>> byte_i, "method_info", >>>>>>>> THREAD)) { >>>>>>>> >>>>>>>> Just in case, TB messes with the spacing again, the "byte_i" >>>>>>>> line and >>>>>>>> "THREAD" lines are aligned under "method_type_annotations". >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> /Andreas >>>>>>>>> >>>>>>>>> On 2014-10-15 07:00, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Andreas, >>>>>>>>>> >>>>>>>>>> Sorry I did not reply on this early. >>>>>>>>>> I assumed, it is a thumbs up from me. >>>>>>>>>> Just wanted make it clean now. :) >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> On 10/13/14 3:09 AM, Andreas Eriksson wrote: >>>>>>>>>>> Hi Serguei, thanks for looking at this! >>>>>>>>>>> >>>>>>>>>>> I'll make sure to fix the style problems. >>>>>>>>>>> For the symbolic names / #defines, please see my answer to >>>>>>>>>>> Coleen. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Andreas >>>>>>>>>>> >>>>>>>>>>> On 2014-10-11 12:37, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thank you for fixing this issue! >>>>>>>>>>>> The fix looks nice, I do not see any logical issues. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Only minor comments... >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> src/share/vm/prims/jvmtiRedefineClasses.cpp >>>>>>>>>>>> >>>>>>>>>>>> 2281 } // end rewrite_cp_refs_in_class_type_annotations( >>>>>>>>>>>> 2315 } // end rewrite_cp_refs_in_fields_type_annotations( >>>>>>>>>>>> 2345 } // end rewrite_cp_refs_in_methods_type_annotations() >>>>>>>>>>>> 2397 } // end rewrite_cp_refs_in_type_annotations_typeArray >>>>>>>>>>>> 2443 } // end rewrite_cp_refs_in_type_annotation_struct >>>>>>>>>>>> 2785 } // end skip_type_annotation_target >>>>>>>>>>>> 2844 } // end skip_type_annotation_type_path >>>>>>>>>>>> >>>>>>>>>>>> The ')' is missed at 2281, 2315. 
>>>>>>>>>>>> The 2397-2844 are inconsistent with the 2345 and other >>>>>>>>>>>> function-end comments in the file. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2335 if >>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>> >>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>> . . . >>>>>>>>>>>> 2378 if >>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>> >>>>>>>>>>>> 2379 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>> . . . >>>>>>>>>>>> 2427 if >>>>>>>>>>>> (!skip_type_annotation_target(type_annotations_typeArray, >>>>>>>>>>>> 2428 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>> 2429 return false; >>>>>>>>>>>> 2430 } >>>>>>>>>>>> 2431 >>>>>>>>>>>> 2432 if >>>>>>>>>>>> (!skip_type_annotation_type_path(type_annotations_typeArray, >>>>>>>>>>>> 2433 byte_i_ref, THREAD)) { >>>>>>>>>>>> 2434 return false; >>>>>>>>>>>> 2435 } >>>>>>>>>>>> 2436 >>>>>>>>>>>> 2437 if >>>>>>>>>>>> (!rewrite_cp_refs_in_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>> >>>>>>>>>>>> 2438 byte_i_ref, THREAD)) { >>>>>>>>>>>> 2439 return false; >>>>>>>>>>>> Wrong indent at 2336, 2379, etc. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I also concur with Coleen that it would be good to define >>>>>>>>>>>> and use >>>>>>>>>>>> symbolic names for the hexa-decimal constants used in the fix. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> test/runtime/RedefineTests/RedefineAnnotations.java >>>>>>>>>>>> >>>>>>>>>>>> Java indent must be 4, not 2. >>>>>>>>>>>> >>>>>>>>>>>> 253 @TestAnn(site="returnTypeAnnotation") Class >>>>>>>>>>>> typeAnnotatedMethod(@TestAnn(site="formalParameterTypeAnnotation") >>>>>>>>>>>> TypeAnnotatedTestClass arg) >>>>>>>>>>>> >>>>>>>>>>>> The line is too long. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 143 } >>>>>>>>>>>> 144 public static void main(String argv[]) { >>>>>>>>>>>> . . . 
>>>>>>>>>>>> 209 } >>>>>>>>>>>> 210 private static void checkAnnotations(AnnotatedType >>>>>>>>>>>> p) { >>>>>>>>>>>> 211 checkAnnotations(p.getAnnotations()); >>>>>>>>>>>> 212 } >>>>>>>>>>>> 213 private static void >>>>>>>>>>>> checkAnnotations(AnnotatedType[] annoTypes) { >>>>>>>>>>>> 214 for (AnnotatedType p : annoTypes) >>>>>>>>>>>> checkAnnotations(p.getAnnotations()); >>>>>>>>>>>> 215 } >>>>>>>>>>>> 216 private static void >>>>>>>>>>>> checkAnnotations(Class c) { >>>>>>>>>>>> . . . >>>>>>>>>>>> 257 } >>>>>>>>>>>> 258 public void run() {} >>>>>>>>>>>> >>>>>>>>>>>> Adding empty lines between method definitions would >>>>>>>>>>>> improve readability. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 10/9/14 6:21 AM, Andreas Eriksson wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> Please review this patch to RedefineClasses to allow type >>>>>>>>>>>>> annotations to be preserved. >>>>>>>>>>>>> >>>>>>>>>>>>> Summary: >>>>>>>>>>>>> During redefine / retransform class the constant pool >>>>>>>>>>>>> indexes can change. >>>>>>>>>>>>> Since annotations have indexes into the constant pool >>>>>>>>>>>>> these indexes need to be rewritten. >>>>>>>>>>>>> This is already done for regular annotations, but not for >>>>>>>>>>>>> type annotations. >>>>>>>>>>>>> This patch adds code to add this rewriting for the type >>>>>>>>>>>>> annotations as well. >>>>>>>>>>>>> The patch also contains minor changes to >>>>>>>>>>>>> ClassFileReconstituter, to make sure that type annotations >>>>>>>>>>>>> are preserved during a redefine / retransform class >>>>>>>>>>>>> operation. >>>>>>>>>>>>> It also has a test that uses asm to change constant pool >>>>>>>>>>>>> indexes through a retransform, and then verifies that type >>>>>>>>>>>>> annotations are preserved. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Detail: >>>>>>>>>>>>> A type annotation struct consists of some target >>>>>>>>>>>>> information and a type path, followed by a regular >>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>> Constant pool indexes are only present in the regular >>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>> The added code skips over the type annotation specific >>>>>>>>>>>>> parts, then calls previously existing code to rewrite >>>>>>>>>>>>> constant pool indexes in the regular annotation struct. >>>>>>>>>>>>> Please see the Java SE 8 Ed. VM Spec. section 4.7.20 for >>>>>>>>>>>>> more info about the type annotation struct. >>>>>>>>>>>>> >>>>>>>>>>>>> JPRT with the new test passes without failures on all >>>>>>>>>>>>> platforms. >>>>>>>>>>>>> >>>>>>>>>>>>> Webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.00/ >>>>>>>>>>>>> >>>>>>>>>>>>> Bug: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8057043 >>>>>>>>>>>>> >>>>>>>>>>>>> Regards >>>>>>>>>>>>> Andreas >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From andreas.eriksson at oracle.com Tue Nov 4 17:04:42 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Tue, 04 Nov 2014 18:04:42 +0100 Subject: FYI: Jdk8 backport, Was: Re: RFR(M): 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <54590572.2030302@oracle.com> References: <54368BC1.3000905@oracle.com> <54390854.1090109@oracle.com> <543BA4DB.3000104@oracle.com> <543DFF76.80605@oracle.com> <543E590A.4020304@oracle.com> <543E712D.90900@oracle.com> <543E77CA.3090806@oracle.com> <543E7B08.8040305@oracle.com> <543FA9AD.2070404@oracle.com> <5445686F.4090103@oracle.com> <5446B9FA.4080201@oracle.com> <5446C254.7040801@oracle.com> <54479AEA.8040308@oracle.com> <54590572.2030302@oracle.com> Message-ID: <5459072A.9050506@oracle.com> Or do I need to send out a real review for the backport? I'm not sure what the process is here. 
Thanks, Andreas On 2014-11-04 17:57, Andreas Eriksson wrote: > Hi, > > Just wanted to let the list know that I'm about to backport this > change to jdk8. > > Regards, > Andreas > > On 2014-10-22 13:54, Andreas Eriksson wrote: >> Thanks Serguei! >> >> Regards, >> Andreas >> >> On 2014-10-21 22:30, serguei.spitsyn at oracle.com wrote: >>> Hi Andreas, >>> >>> Very nice, thank you for the refactoring! >>> Thumbs up. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 10/21/14 12:54 PM, Andreas Eriksson wrote: >>>> Hi Serguei, >>>> >>>> I split up the method into several, and made the verification >>>> before and after retransform share logic. >>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8057043/webrev.02/ >>>> >>>> Regards, >>>> Andreas >>>> >>>> On 2014-10-20 21:54, serguei.spitsyn at oracle.com wrote: >>>>> Hi Andreas, >>>>> >>>>> Sorry for the delay. >>>>> >>>>> On 10/16/14 4:19 AM, Andreas Eriksson wrote: >>>>>> >>>>>> On 2014-10-15 15:47, Daniel D. Daugherty wrote: >>>>>>> On 10/15/14 7:34 AM, Coleen Phillimore wrote: >>>>>>>> >>>>>>>> There are lots of other rewrite_cp_refs_in* function calls. >>>>>>>> Please indent your function like them, not differently. >>>>>>> >>>>>>> The above implies that my answer below was made without sufficient >>>>>>> context... my apologies for that. >>>>>>> >>>>>>> The general rule is to follow the existing style in the file so >>>>>>> if there are rewrite_cp_refs_in* function calls in the file, then >>>>>>> please follow that style. Unless, of course, you want to fix all >>>>>>> of them to follow the HotSpot style guideline: >>>>>>> >>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>>>>>> >>>>>>> > Use good taste to break lines and align corresponding tokens >>>>>>> > on adjacent lines. >>>>>>> >>>>>>> but that may cause Coleen some heartburn :-) >>>>>> >>>>>> I fixed the calls to follow the already existing indent style. >>>>>> I have also made changes to the test, which I hope Joel can take >>>>>> a look at. 
>>>>>> >>>>>> New webrev: >>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.01/ >>>>> >>>>> The fix looks good. >>>>> >>>>> A couple of comments about the test. >>>>> >>>>> The method testTransformAndVerify() is too big. >>>>> At least, it looks like there are some ways to refactor it to make >>>>> calls to smaller methods. >>>>> >>>>> There are two directions of doing it: >>>>> - make a smaller method out of each block: >>>>> 217-236, 238-260, 262-276, 311-329, 331-351, 353-367 >>>>> - some of the lines sequences looks very typical: >>>>> 221 at = >>>>> c.getDeclaredField("typeAnnotatedArray").getAnnotatedType(); >>>>> 222 arrayTA1 = at.getAnnotations()[0]; >>>>> 223 verifyTestAnnSite(arrayTA1, "array1"); >>>>> 224 >>>>> 225 at = ((AnnotatedArrayType) >>>>> at).getAnnotatedGenericComponentType(); >>>>> 226 arrayTA2 = at.getAnnotations()[0]; >>>>> 227 verifyTestAnnSite(arrayTA2, "array2"); >>>>> 228 >>>>> 229 at = ((AnnotatedArrayType) >>>>> at).getAnnotatedGenericComponentType(); >>>>> 230 arrayTA3 = at.getAnnotations()[0]; >>>>> 231 verifyTestAnnSite(arrayTA3, "array3"); >>>>> 232 >>>>> 233 at = ((AnnotatedArrayType) >>>>> at).getAnnotatedGenericComponentType(); >>>>> 234 arrayTA4 = at.getAnnotations()[0]; >>>>> 235 verifyTestAnnSite(arrayTA4, "array4"); >>>>> But I leave it up to you. >>>>> >>>>> Another step to improve the readability is to add a short comment >>>>> for each block of code saying what is done there. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Andreas >>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Coleen >>>>>>>> >>>>>>>> On 10/15/14, 9:05 AM, Daniel D. Daugherty wrote: >>>>>>>>> On 10/15/14 5:22 AM, Andreas Eriksson wrote: >>>>>>>>>> Thanks Serguei. 
>>>>>>>>>> >>>>>>>>>> I have a question about the if-blocks that had the wrong indent: >>>>>>>>>> >>>>>>>>>> 2335 if >>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>> >>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>> >>>>>>>>>> How should I indent them? >>>>>>>>> >>>>>>>>> Trying again without the line numbers... >>>>>>>>> >>>>>>>>> if >>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>> >>>>>>>>> byte_i, "method_info", >>>>>>>>> THREAD)) { >>>>>>>>> >>>>>>>>> Just in case, TB messes with the spacing again, the "byte_i" >>>>>>>>> line and >>>>>>>>> "THREAD" lines are aligned under "method_type_annotations". >>>>>>>>> >>>>>>>>> Dan >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> /Andreas >>>>>>>>>> >>>>>>>>>> On 2014-10-15 07:00, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Andreas, >>>>>>>>>>> >>>>>>>>>>> Sorry I did not reply on this early. >>>>>>>>>>> I assumed, it is a thumbs up from me. >>>>>>>>>>> Just wanted make it clean now. :) >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> On 10/13/14 3:09 AM, Andreas Eriksson wrote: >>>>>>>>>>>> Hi Serguei, thanks for looking at this! >>>>>>>>>>>> >>>>>>>>>>>> I'll make sure to fix the style problems. >>>>>>>>>>>> For the symbolic names / #defines, please see my answer to >>>>>>>>>>>> Coleen. >>>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> Andreas >>>>>>>>>>>> >>>>>>>>>>>> On 2014-10-11 12:37, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for fixing this issue! >>>>>>>>>>>>> The fix looks nice, I do not see any logical issues. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Only minor comments... 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> src/share/vm/prims/jvmtiRedefineClasses.cpp >>>>>>>>>>>>> >>>>>>>>>>>>> 2281 } // end rewrite_cp_refs_in_class_type_annotations( >>>>>>>>>>>>> 2315 } // end rewrite_cp_refs_in_fields_type_annotations( >>>>>>>>>>>>> 2345 } // end rewrite_cp_refs_in_methods_type_annotations() >>>>>>>>>>>>> 2397 } // end rewrite_cp_refs_in_type_annotations_typeArray >>>>>>>>>>>>> 2443 } // end rewrite_cp_refs_in_type_annotation_struct >>>>>>>>>>>>> 2785 } // end skip_type_annotation_target >>>>>>>>>>>>> 2844 } // end skip_type_annotation_type_path >>>>>>>>>>>>> >>>>>>>>>>>>> The ')' is missed at 2281, 2315. >>>>>>>>>>>>> The 2397-2844 are inconsistent with the 2345 and other >>>>>>>>>>>>> function-end comments in the file. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2335 if >>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>>> >>>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>>> . . . >>>>>>>>>>>>> 2378 if >>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>> >>>>>>>>>>>>> 2379 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>> . . . >>>>>>>>>>>>> 2427 if >>>>>>>>>>>>> (!skip_type_annotation_target(type_annotations_typeArray, >>>>>>>>>>>>> 2428 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>> 2429 return false; >>>>>>>>>>>>> 2430 } >>>>>>>>>>>>> 2431 >>>>>>>>>>>>> 2432 if >>>>>>>>>>>>> (!skip_type_annotation_type_path(type_annotations_typeArray, >>>>>>>>>>>>> 2433 byte_i_ref, THREAD)) { >>>>>>>>>>>>> 2434 return false; >>>>>>>>>>>>> 2435 } >>>>>>>>>>>>> 2436 >>>>>>>>>>>>> 2437 if >>>>>>>>>>>>> (!rewrite_cp_refs_in_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>> >>>>>>>>>>>>> 2438 byte_i_ref, THREAD)) { >>>>>>>>>>>>> 2439 return false; >>>>>>>>>>>>> Wrong indent at 2336, 2379, etc. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I also concur with Coleen that it would be good to define >>>>>>>>>>>>> and use >>>>>>>>>>>>> symbolic names for the hexa-decimal constants used in the >>>>>>>>>>>>> fix. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> test/runtime/RedefineTests/RedefineAnnotations.java >>>>>>>>>>>>> >>>>>>>>>>>>> Java indent must be 4, not 2. >>>>>>>>>>>>> >>>>>>>>>>>>> 253 @TestAnn(site="returnTypeAnnotation") Class >>>>>>>>>>>>> typeAnnotatedMethod(@TestAnn(site="formalParameterTypeAnnotation") >>>>>>>>>>>>> TypeAnnotatedTestClass arg) >>>>>>>>>>>>> >>>>>>>>>>>>> The line is too long. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 143 } >>>>>>>>>>>>> 144 public static void main(String argv[]) { >>>>>>>>>>>>> . . . >>>>>>>>>>>>> 209 } >>>>>>>>>>>>> 210 private static void checkAnnotations(AnnotatedType >>>>>>>>>>>>> p) { >>>>>>>>>>>>> 211 checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>> 212 } >>>>>>>>>>>>> 213 private static void >>>>>>>>>>>>> checkAnnotations(AnnotatedType[] annoTypes) { >>>>>>>>>>>>> 214 for (AnnotatedType p : annoTypes) >>>>>>>>>>>>> checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>> 215 } >>>>>>>>>>>>> 216 private static void >>>>>>>>>>>>> checkAnnotations(Class c) { >>>>>>>>>>>>> . . . >>>>>>>>>>>>> 257 } >>>>>>>>>>>>> 258 public void run() {} >>>>>>>>>>>>> >>>>>>>>>>>>> Adding empty lines between method definitions would >>>>>>>>>>>>> improve readability. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/9/14 6:21 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please review this patch to RedefineClasses to allow type >>>>>>>>>>>>>> annotations to be preserved. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Summary: >>>>>>>>>>>>>> During redefine / retransform class the constant pool >>>>>>>>>>>>>> indexes can change. 
>>>>>>>>>>>>>> Since annotations have indexes into the constant pool >>>>>>>>>>>>>> these indexes need to be rewritten. >>>>>>>>>>>>>> This is already done for regular annotations, but not for >>>>>>>>>>>>>> type annotations. >>>>>>>>>>>>>> This patch adds code to add this rewriting for the type >>>>>>>>>>>>>> annotations as well. >>>>>>>>>>>>>> The patch also contains minor changes to >>>>>>>>>>>>>> ClassFileReconstituter, to make sure that type >>>>>>>>>>>>>> annotations are preserved during a redefine / retransform >>>>>>>>>>>>>> class operation. >>>>>>>>>>>>>> It also has a test that uses asm to change constant pool >>>>>>>>>>>>>> indexes through a retransform, and then verifies that >>>>>>>>>>>>>> type annotations are preserved. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Detail: >>>>>>>>>>>>>> A type annotation struct consists of some target >>>>>>>>>>>>>> information and a type path, followed by a regular >>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>> Constant pool indexes are only present in the regular >>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>> The added code skips over the type annotation specific >>>>>>>>>>>>>> parts, then calls previously existing code to rewrite >>>>>>>>>>>>>> constant pool indexes in the regular annotation struct. >>>>>>>>>>>>>> Please see the Java SE 8 Ed. VM Spec. section 4.7.20 for >>>>>>>>>>>>>> more info about the type annotation struct. >>>>>>>>>>>>>> >>>>>>>>>>>>>> JPRT with the new test passes without failures on all >>>>>>>>>>>>>> platforms. 
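The "Detail" paragraph quoted above can be sketched in Java. This is only an illustration of the skipping step, not the HotSpot code under review: the class and method names (TypeAnnotationSkipper, fixedTargetInfoSize, annotationStructOffset) are hypothetical, and the byte sizes are taken from the target_info and type_path layouts in JVMS section 4.7.20. It computes where the regular annotation struct starts inside one type_annotation entry, which is the point from which the pre-existing constant pool rewriting would take over.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not the HotSpot implementation.
final class TypeAnnotationSkipper {

    // Size in bytes of target_info for a target_type (JVMS 4.7.20),
    // or -1 when the size depends on a table length (localvar_target).
    static int fixedTargetInfoSize(int targetType) {
        switch (targetType) {
            case 0x00: case 0x01:                       // type_parameter_target
            case 0x16:                                  // formal_parameter_target
                return 1;
            case 0x10:                                  // supertype_target
            case 0x11: case 0x12:                       // type_parameter_bound_target
            case 0x17:                                  // throws_target
            case 0x42:                                  // catch_target
            case 0x43: case 0x44: case 0x45: case 0x46: // offset_target
                return 2;
            case 0x13: case 0x14: case 0x15:            // empty_target
                return 0;
            case 0x40: case 0x41:                       // localvar_target
                return -1;
            case 0x47: case 0x48: case 0x49:
            case 0x4A: case 0x4B:                       // type_argument_target
                return 3;
            default:
                throw new IllegalArgumentException(
                        "unknown target_type 0x" + Integer.toHexString(targetType));
        }
    }

    // Returns the offset of the regular annotation struct that follows
    // target_type, target_info and type_path, starting at 'start'.
    static int annotationStructOffset(byte[] data, int start) {
        ByteBuffer buf = ByteBuffer.wrap(data);   // class files are big-endian
        buf.position(start);

        int targetType = buf.get() & 0xFF;
        int size = fixedTargetInfoSize(targetType);
        if (size < 0) {
            // localvar_target: u2 table_length, then entries of three u2 values
            int tableLength = buf.getShort() & 0xFFFF;
            size = tableLength * 6;
        }
        buf.position(buf.position() + size);

        // type_path: u1 path_length, then (u1 kind, u1 argument index) pairs
        int pathLength = buf.get() & 0xFF;
        buf.position(buf.position() + pathLength * 2);

        return buf.position();
    }
}
```

For example, an entry that starts with the bytes 0x16, 0x00, 0x00 (formal parameter 0, empty type path) has its annotation struct at offset 3. No constant pool index is touched while skipping, which matches the statement that such indexes occur only in the regular annotation struct.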
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.00/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Bug: >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8057043 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards >>>>>>>>>>>>>> Andreas >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From vladimir.kozlov at oracle.com Tue Nov 4 18:13:05 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 04 Nov 2014 10:13:05 -0800 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <5458CE11.305@oracle.com> References: <5458CE11.305@oracle.com> Message-ID: <54591731.9070404@oracle.com> Looks good. Thanks, Vladimir On 11/4/14 5:01 AM, Albert Noll wrote: > Hi, > > could I get reviews for this small patch? > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8062735 > > Problem: > The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not add this new type to SA. > > Solution: > Add type to SA. > > Testing: > Failing test cases. > > Webrev: > http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ > > Many thanks, > Albert > From kim.barrett at oracle.com Tue Nov 4 18:15:07 2014 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 4 Nov 2014 13:15:07 -0500 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> Message-ID: <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com> On Nov 3, 2014, at 7:21 PM, Erik Österlund wrote: > >> [legacy issue, not in changed code] >> I think the comment for generate_atomic_cmpxchg_long() is wrong in the >> return value; shouldn't it be returning a jlong? Probably a C-Y bug. > > No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e.
it returns the address of the generated stub rather than executing a CAS - hence the return type is correct. The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is, 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value, should be "// Support for jlong ..." >> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) >> >> Why is the new byte version using "q" for exchange_value, where the >> existing int and long versions use "r"? [There might be a good >> reason, and this is just my rusty assembler skills showing.] > > With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r"). Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping. > The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically. At least I got that part. > The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :) Indeed. >> ------------------------------------------------------------------------------ >> >> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >> src/os_cpu/windows_x86/vm/os_windows_x86.hpp >> >> The windows port seems to only support specialized cmpxchgb when >> defined(AMD64), while the BSD/Linux variants don't have that >> restriction. Why this inconsistency? Or am I missing something, >> which seems entirely possible in this tangle. > > If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub.
> Then there is another MSVC assembly variant for #ifndef AMD64. > This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things. Oops, you are correct. > Do you want a new webrev? (just polished comments and renamed the #define as per request) I don't think I need one, but others might want a closer to final version. From vladimir.kozlov at oracle.com Tue Nov 4 18:49:31 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 04 Nov 2014 10:49:31 -0800 Subject: [8u40] RFR(M) 8041984: CompilerThread seems to occupy all CPU in a very rare situation In-Reply-To: References: <545145F5.5010207@oracle.com> Message-ID: <54591FBB.8020301@oracle.com> Thank you, Igor Vladimir On 11/4/14 1:38 AM, Igor Veresov wrote: > Good. > > igor > > On Oct 29, 2014, at 9:54 AM, Vladimir Kozlov wrote: > >> Backport request. Changes were pushed into jdk9 last week. Nightlies are fine. Changes are applied cleanly to 8u sources. >> >> https://bugs.openjdk.java.net/browse/JDK-8041984 >> http://cr.openjdk.java.net/~kvn/8041984/webrev.01/ >> >> Review thread: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2014-October/015855.html >> >> jdk9 changeset: >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/b9576378eaad >> >> Thanks, >> Vladimir > From serguei.spitsyn at oracle.com Tue Nov 4 22:32:56 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Nov 2014 14:32:56 -0800 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <5458CE11.305@oracle.com> References: <5458CE11.305@oracle.com> Message-ID: <54595418.6060300@oracle.com> Hi Albert, The fix looks good. Thanks, Serguei On 11/4/14 5:01 AM, Albert Noll wrote: > Hi, > > could I get reviews for this small patch? > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8062735 > > Problem: > The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not > add this new type to SA. > > Solution: > Add type to SA.
> > Testing: > Failing test cases. > > Webrev: > http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ > > Many thanks, > Albert > From serguei.spitsyn at oracle.com Tue Nov 4 23:37:52 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 04 Nov 2014 15:37:52 -0800 Subject: FYI: Jdk8 backport, Was: Re: RFR(M): 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <5459072A.9050506@oracle.com> References: <54368BC1.3000905@oracle.com> <54390854.1090109@oracle.com> <543BA4DB.3000104@oracle.com> <543DFF76.80605@oracle.com> <543E590A.4020304@oracle.com> <543E712D.90900@oracle.com> <543E77CA.3090806@oracle.com> <543E7B08.8040305@oracle.com> <543FA9AD.2070404@oracle.com> <5445686F.4090103@oracle.com> <5446B9FA.4080201@oracle.com> <5446C254.7040801@oracle.com> <54479AEA.8040308@oracle.com> <54590572.2030302@oracle.com> <5459072A.9050506@oracle.com> Message-ID: <54596350.2030609@oracle.com> Hi Andreas, If the port is straightforward then referring to the jdk 9 reviewers should be enough. You still need to get an approval. Thanks, Serguei On 11/4/14 9:04 AM, Andreas Eriksson wrote: > Or do I need to send out a real review for the backport? > I'm not sure what the process is here. > > Thanks, > Andreas > > On 2014-11-04 17:57, Andreas Eriksson wrote: >> Hi, >> >> Just wanted to let the list know that I'm about to backport this >> change to jdk8. >> >> Regards, >> Andreas >> >> On 2014-10-22 13:54, Andreas Eriksson wrote: >>> Thanks Serguei! >>> >>> Regards, >>> Andreas >>> >>> On 2014-10-21 22:30, serguei.spitsyn at oracle.com wrote: >>>> Hi Andreas, >>>> >>>> Very nice, thank you for the refactoring! >>>> Thumbs up. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 10/21/14 12:54 PM, Andreas Eriksson wrote: >>>>> Hi Serguei, >>>>> >>>>> I split up the method into several, and made the verification >>>>> before and after retransform share logic. 
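The refactoring mentioned in the quoted message above (splitting the test method and sharing the verification logic) addresses the repeated three-line blocks that the review pointed out for the nested array case. A minimal Java sketch of one way to fold such repetition into a loop is shown here; it is not the code from the webrev. The annotation SiteAnn, the class ArraySample, and the helper ArrayAnnotationWalker are hypothetical stand-ins, and the field declaration is an assumption about what an annotated multi-dimensional array might look like.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.AnnotatedArrayType;
import java.lang.reflect.AnnotatedType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the test's TestAnn annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE_USE)
@interface SiteAnn {
    String site();
}

// Hypothetical class with a type-annotated three-dimensional array field.
class ArraySample {
    @SiteAnn(site = "array4") boolean
        @SiteAnn(site = "array1") []
        @SiteAnn(site = "array2") []
        @SiteAnn(site = "array3") [] typeAnnotatedArray;
}

final class ArrayAnnotationWalker {

    // Walks from the outermost array type down to the element type and
    // records the site of the SiteAnn found at each level (null if absent).
    static List<String> collectSites(AnnotatedType at) {
        List<String> sites = new ArrayList<>();
        while (true) {
            SiteAnn ann = at.getAnnotation(SiteAnn.class);
            sites.add(ann == null ? null : ann.site());
            if (!(at instanceof AnnotatedArrayType)) {
                break;   // reached the element type
            }
            at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType();
        }
        return sites;
    }

    // Convenience entry point for the sample field above.
    static List<String> sitesOfSampleField() {
        try {
            return collectSites(ArraySample.class
                    .getDeclaredField("typeAnnotatedArray")
                    .getAnnotatedType());
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the walk returns a plain list, the same call can be made before and after a retransform and the two results compared, which is one way the "share logic" idea could be realized.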
>>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8057043/webrev.02/ >>>>> >>>>> Regards, >>>>> Andreas >>>>> >>>>> On 2014-10-20 21:54, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Andreas, >>>>>> >>>>>> Sorry for the delay. >>>>>> >>>>>> On 10/16/14 4:19 AM, Andreas Eriksson wrote: >>>>>>> >>>>>>> On 2014-10-15 15:47, Daniel D. Daugherty wrote: >>>>>>>> On 10/15/14 7:34 AM, Coleen Phillimore wrote: >>>>>>>>> >>>>>>>>> There are lots of other rewrite_cp_refs_in* function calls. >>>>>>>>> Please indent your function like them, not differently. >>>>>>>> >>>>>>>> The above implies that my answer below was made without sufficient >>>>>>>> context... my apologies for that. >>>>>>>> >>>>>>>> The general rule is to follow the existing style in the file so >>>>>>>> if there are rewrite_cp_refs_in* function calls in the file, then >>>>>>>> please follow that style. Unless, of course, you want to fix all >>>>>>>> of them to follow the HotSpot style guideline: >>>>>>>> >>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>>>>>>> >>>>>>>> > Use good taste to break lines and align corresponding tokens >>>>>>>> > on adjacent lines. >>>>>>>> >>>>>>>> but that may cause Coleen some heartburn :-) >>>>>>> >>>>>>> I fixed the calls to follow the already existing indent style. >>>>>>> I have also made changes to the test, which I hope Joel can take >>>>>>> a look at. >>>>>>> >>>>>>> New webrev: >>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.01/ >>>>>> >>>>>> The fix looks good. >>>>>> >>>>>> A couple of comments about the test. >>>>>> >>>>>> The method testTransformAndVerify() is too big. >>>>>> At least, it looks like there are some ways to refactor it to >>>>>> make calls to smaller methods. 
>>>>>> >>>>>> There are two directions of doing it: >>>>>> - make a smaller method out of each block: >>>>>> 217-236, 238-260, 262-276, 311-329, 331-351, 353-367 >>>>>> - some of the lines sequences looks very typical: >>>>>> 221 at = >>>>>> c.getDeclaredField("typeAnnotatedArray").getAnnotatedType(); >>>>>> 222 arrayTA1 = at.getAnnotations()[0]; >>>>>> 223 verifyTestAnnSite(arrayTA1, "array1"); >>>>>> 224 >>>>>> 225 at = ((AnnotatedArrayType) >>>>>> at).getAnnotatedGenericComponentType(); >>>>>> 226 arrayTA2 = at.getAnnotations()[0]; >>>>>> 227 verifyTestAnnSite(arrayTA2, "array2"); >>>>>> 228 >>>>>> 229 at = ((AnnotatedArrayType) >>>>>> at).getAnnotatedGenericComponentType(); >>>>>> 230 arrayTA3 = at.getAnnotations()[0]; >>>>>> 231 verifyTestAnnSite(arrayTA3, "array3"); >>>>>> 232 >>>>>> 233 at = ((AnnotatedArrayType) >>>>>> at).getAnnotatedGenericComponentType(); >>>>>> 234 arrayTA4 = at.getAnnotations()[0]; >>>>>> 235 verifyTestAnnSite(arrayTA4, "array4"); >>>>>> But I leave it up to you. >>>>>> >>>>>> Another step to improve the readability is to add a short comment >>>>>> for each block of code saying what is done there. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Andreas >>>>>>> >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>> On 10/15/14, 9:05 AM, Daniel D. Daugherty wrote: >>>>>>>>>> On 10/15/14 5:22 AM, Andreas Eriksson wrote: >>>>>>>>>>> Thanks Serguei. >>>>>>>>>>> >>>>>>>>>>> I have a question about the if-blocks that had the wrong >>>>>>>>>>> indent: >>>>>>>>>>> >>>>>>>>>>> 2335 if >>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>> >>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>> >>>>>>>>>>> How should I indent them? >>>>>>>>>> >>>>>>>>>> Trying again without the line numbers... 
>>>>>>>>>> >>>>>>>>>> if >>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>> >>>>>>>>>> byte_i, "method_info", >>>>>>>>>> THREAD)) { >>>>>>>>>> >>>>>>>>>> Just in case, TB messes with the spacing again, the "byte_i" >>>>>>>>>> line and >>>>>>>>>> "THREAD" lines are aligned under "method_type_annotations". >>>>>>>>>> >>>>>>>>>> Dan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> /Andreas >>>>>>>>>>> >>>>>>>>>>> On 2014-10-15 07:00, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>> >>>>>>>>>>>> Sorry I did not reply on this early. >>>>>>>>>>>> I assumed, it is a thumbs up from me. >>>>>>>>>>>> Just wanted make it clean now. :) >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> On 10/13/14 3:09 AM, Andreas Eriksson wrote: >>>>>>>>>>>>> Hi Serguei, thanks for looking at this! >>>>>>>>>>>>> >>>>>>>>>>>>> I'll make sure to fix the style problems. >>>>>>>>>>>>> For the symbolic names / #defines, please see my answer to >>>>>>>>>>>>> Coleen. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> Andreas >>>>>>>>>>>>> >>>>>>>>>>>>> On 2014-10-11 12:37, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for fixing this issue! >>>>>>>>>>>>>> The fix looks nice, I do not see any logical issues. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Only minor comments... 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> src/share/vm/prims/jvmtiRedefineClasses.cpp >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2281 } // end rewrite_cp_refs_in_class_type_annotations( >>>>>>>>>>>>>> 2315 } // end rewrite_cp_refs_in_fields_type_annotations( >>>>>>>>>>>>>> 2345 } // end rewrite_cp_refs_in_methods_type_annotations() >>>>>>>>>>>>>> 2397 } // end rewrite_cp_refs_in_type_annotations_typeArray >>>>>>>>>>>>>> 2443 } // end rewrite_cp_refs_in_type_annotation_struct >>>>>>>>>>>>>> 2785 } // end skip_type_annotation_target >>>>>>>>>>>>>> 2844 } // end skip_type_annotation_type_path >>>>>>>>>>>>>> >>>>>>>>>>>>>> The ')' is missed at 2281, 2315. >>>>>>>>>>>>>> The 2397-2844 are inconsistent with the 2345 and other >>>>>>>>>>>>>> function-end comments in the file. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2335 if >>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>>>> . . . >>>>>>>>>>>>>> 2378 if >>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2379 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>> . . . >>>>>>>>>>>>>> 2427 if >>>>>>>>>>>>>> (!skip_type_annotation_target(type_annotations_typeArray, >>>>>>>>>>>>>> 2428 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>> 2429 return false; >>>>>>>>>>>>>> 2430 } >>>>>>>>>>>>>> 2431 >>>>>>>>>>>>>> 2432 if >>>>>>>>>>>>>> (!skip_type_annotation_type_path(type_annotations_typeArray, >>>>>>>>>>>>>> 2433 byte_i_ref, THREAD)) { >>>>>>>>>>>>>> 2434 return false; >>>>>>>>>>>>>> 2435 } >>>>>>>>>>>>>> 2436 >>>>>>>>>>>>>> 2437 if >>>>>>>>>>>>>> (!rewrite_cp_refs_in_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2438 byte_i_ref, THREAD)) { >>>>>>>>>>>>>> 2439 return false; >>>>>>>>>>>>>> Wrong indent at 2336, 2379, etc. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I also concur with Coleen that it would be good to define >>>>>>>>>>>>>> and use >>>>>>>>>>>>>> symbolic names for the hexa-decimal constants used in the >>>>>>>>>>>>>> fix. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> test/runtime/RedefineTests/RedefineAnnotations.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> Java indent must be 4, not 2. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 253 @TestAnn(site="returnTypeAnnotation") Class >>>>>>>>>>>>>> typeAnnotatedMethod(@TestAnn(site="formalParameterTypeAnnotation") >>>>>>>>>>>>>> TypeAnnotatedTestClass arg) >>>>>>>>>>>>>> >>>>>>>>>>>>>> The line is too long. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 143 } >>>>>>>>>>>>>> 144 public static void main(String argv[]) { >>>>>>>>>>>>>> . . . >>>>>>>>>>>>>> 209 } >>>>>>>>>>>>>> 210 private static void >>>>>>>>>>>>>> checkAnnotations(AnnotatedType p) { >>>>>>>>>>>>>> 211 checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>> 212 } >>>>>>>>>>>>>> 213 private static void >>>>>>>>>>>>>> checkAnnotations(AnnotatedType[] annoTypes) { >>>>>>>>>>>>>> 214 for (AnnotatedType p : annoTypes) >>>>>>>>>>>>>> checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>> 215 } >>>>>>>>>>>>>> 216 private static void >>>>>>>>>>>>>> checkAnnotations(Class c) { >>>>>>>>>>>>>> . . . >>>>>>>>>>>>>> 257 } >>>>>>>>>>>>>> 258 public void run() {} >>>>>>>>>>>>>> >>>>>>>>>>>>>> Adding empty lines between method definitions would >>>>>>>>>>>>>> improve readability. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/9/14 6:21 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this patch to RedefineClasses to allow >>>>>>>>>>>>>>> type annotations to be preserved. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Summary: >>>>>>>>>>>>>>> During redefine / retransform class the constant pool >>>>>>>>>>>>>>> indexes can change. 
>>>>>>>>>>>>>>> Since annotations have indexes into the constant pool >>>>>>>>>>>>>>> these indexes need to be rewritten. >>>>>>>>>>>>>>> This is already done for regular annotations, but not >>>>>>>>>>>>>>> for type annotations. >>>>>>>>>>>>>>> This patch adds code to add this rewriting for the type >>>>>>>>>>>>>>> annotations as well. >>>>>>>>>>>>>>> The patch also contains minor changes to >>>>>>>>>>>>>>> ClassFileReconstituter, to make sure that type >>>>>>>>>>>>>>> annotations are preserved during a redefine / >>>>>>>>>>>>>>> retransform class operation. >>>>>>>>>>>>>>> It also has a test that uses asm to change constant pool >>>>>>>>>>>>>>> indexes through a retransform, and then verifies that >>>>>>>>>>>>>>> type annotations are preserved. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Detail: >>>>>>>>>>>>>>> A type annotation struct consists of some target >>>>>>>>>>>>>>> information and a type path, followed by a regular >>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>> Constant pool indexes are only present in the regular >>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>> The added code skips over the type annotation specific >>>>>>>>>>>>>>> parts, then calls previously existing code to rewrite >>>>>>>>>>>>>>> constant pool indexes in the regular annotation struct. >>>>>>>>>>>>>>> Please see the Java SE 8 Ed. VM Spec. section 4.7.20 for >>>>>>>>>>>>>>> more info about the type annotation struct. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> JPRT with the new test passes without failures on all >>>>>>>>>>>>>>> platforms. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Bug: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8057043 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards >>>>>>>>>>>>>>> Andreas >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From david.holmes at oracle.com Wed Nov 5 00:09:38 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 05 Nov 2014 10:09:38 +1000 Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545729A3.7090301@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> Message-ID: <54596AC2.6050502@oracle.com> Hi Serguei, On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: > On 11/2/14 8:58 PM, David Holmes wrote: >> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>> Serguei, >>> >>> Thank you for good finding. This approach looks much better for me. >>> >>> The fix looks good. >>> >>> Is it necessary to release vmDeathLock locks at >>> eventHandler.c:1244 before call >>> >>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >> >> I agree this looks necessary, or at least more clean (if things are >> failing we really don't know what is happening). > > Agreed (replied to Dmitry). > >> >> More generally I'm concerned about whether any of the code paths taken >> while holding the new lock can result in deadlock - in particular with >> regard to the resumeLock ? > > The cbVMDeath() function never holds both vmDeathLock and resumeLock at > the same time, > so there is no chance for a deadlock that involves both these locks. > > Two more locks used in the cbVMDeath() are the callbackBlock and > callbackLock. > These two locks look completely unrelated to the debugLoop_run(). 
> > The debugLoop_run() function also uses the cmdQueueLock. > The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at the > same time. > > So that I do not see any potential to introduce new deadlock with the > vmDeathLock. > > However, it is still easy to overlook something here. > Please, let me know if you see any danger. I was mainly concerned about what might happen in the call chain for threadControl_resumeAll() (it certainly sounds like it might need to use a resumeLock :) ). I see direct use of the threadLock and indirectly the eventHandler lock; but there are further call paths I did not explore. Wish there was an easy way to determine the transitive closure of all locks used from a given call. Thanks, David > Thanks, > Serguei > >> >> David >> >>> -Dmitry >>> >>> >>> >>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>> >>>> It is 3-rd round of review for: >>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>> >>>> New webrev: >>>> >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>> >>>> >>>> >>>> Summary >>>> >>>> For failing scenario, please, refer to the 1-st round RFR below. >>>> >>>> I've found what is missed in the jdwp agent shutdown and decided to >>>> switch from a workaround to a real fix. >>>> >>>> The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1. >>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>> >>>> 165 } else if (gdata->vmDead && >>>> 166 ((cmd->cmdSet) != >>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>> 167 /* Protect the VM from calls while dead. >>>> 168 * VirtualMachine cmdSet quietly ignores some >>>> cmds >>>> 169 * after VM death, so, it sends it's own errors. >>>> 170 */ >>>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); >>>> >>>> >>>> However, the guard above does not help much if the VM_DEATH event >>>> happens in the middle of a command execution. >>>> There is a lack of synchronization here. 
>>>> >>>> The fix introduces new lock (vmDeathLock) which does not allow to >>>> execute the commands >>>> and the VM_DEATH event callback concurrently. >>>> It should work well for any function that is used in >>>> implementation of >>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>> >>>> >>>> Testing: >>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>> The updated webrev: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>> >>>>> >>>>> >>>>> The changes are: >>>>> - added a comment recommended by Staffan >>>>> - removed the ignore_wrong_phase() call from function >>>>> classSignature() >>>>> >>>>> The classSignature() function is called in 16 places. >>>>> Most of them do not tolerate the NULL in place of returned signature >>>>> and will crash. >>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>> return to this >>>>> issue after gaining experience with more failure cases that are still >>>>> expected. >>>>> The failure with the classSignature() involved was observed only once >>>>> in the nightly >>>>> and should be extremely rare reproducible. >>>>> I'll file a placeholder bug if necessary. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Please, review the fix for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>> >>>>>> >>>>>> Open webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> The failing scenario: >>>>>> The debugger and the debuggee are well aware a VM shutdown has >>>>>> been started in the target process. >>>>>> The debugger at this point is not expected to send any commands >>>>>> to the JDWP agent. 
>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>> (debuggee side) >>>>>> are not in sync with the consumer layers. >>>>>> >>>>>> One reason is because the test debugger does not invoke the JDI >>>>>> method VirtualMachine.dispose(). >>>>>> Another reason is that the Debugger and the debuggee processes >>>>>> are uneasy to sync in general. >>>>>> >>>>>> As a result the following steps are possible: >>>>>> - The test debugger sends a 'quit' command to the test >>>>>> debuggee >>>>>> - The debuggee is normally exiting >>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>> anonymous class unload event >>>>>> - The JDI InternalEventHandler thread handles the >>>>>> ClassUnloadEvent event >>>>>> - The InternalEventHandler wants to uncache the matching >>>>>> reference type. >>>>>> If there is more than one class with the same host class >>>>>> signature, it can't distinguish them, >>>>>> and so, deletes all references and re-retrieves them again >>>>>> (see tracing below): >>>>>> MY_TRACE: JDI: >>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>> - The jdwp backend debugLoop_run() gets the command from JDI >>>>>> and calls the functions >>>>>> classesForSignature() and classStatus() recursively. >>>>>> - The classStatus() makes a call to the JVMTI >>>>>> GetClassStatus() >>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>> - As a result the jdwp backend reports the JVMTI error to the >>>>>> JDI, and so, the test fails >>>>>> >>>>>> For details, see the analysis in bug report closed as a dup of >>>>>> the bug 6988950: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>> >>>>>> Some similar cases can be found in the two bug reports (6988950 >>>>>> and 8024865) describing this issue. >>>>>> >>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >>>>>> as it is normal at the VM shutdown. 
>>>>>> The original jdwp backend implementation had a similar approach >>>>>> for the raw monitor functions. >>>>>> They use the ignore_vm_death() to work around the >>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>> For reference, please, see the file: src/share/back/util.c >>>>>> >>>>>> >>>>>> Testing: >>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>> tests >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>> >>>> >>> >>> > From david.holmes at oracle.com Wed Nov 5 00:15:36 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 05 Nov 2014 10:15:36 +1000 Subject: FYI: Jdk8 backport, Was: Re: RFR(M): 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <54596350.2030609@oracle.com> References: <54368BC1.3000905@oracle.com> <54390854.1090109@oracle.com> <543BA4DB.3000104@oracle.com> <543DFF76.80605@oracle.com> <543E590A.4020304@oracle.com> <543E712D.90900@oracle.com> <543E77CA.3090806@oracle.com> <543E7B08.8040305@oracle.com> <543FA9AD.2070404@oracle.com> <5445686F.4090103@oracle.com> <5446B9FA.4080201@oracle.com> <5446C254.7040801@oracle.com> <54479AEA.8040308@oracle.com> <54590572.2030302@oracle.com> <5459072A.9050506@oracle.com> <54596350.2030609@oracle.com> Message-ID: <54596C28.40900@oracle.com> On 5/11/2014 9:37 AM, serguei.spitsyn at oracle.com wrote: > Hi Andreas, > > If the port is straightforward then referring to the jdk 9 reviewers > should be enough. > You still need to get an approval. You don't need an approval in the sense of requesting approval from jdk8u-dev at ojn. You need a RFR of the backport on the hotspot lists before pushing to jdk8u/hs-dev. Alejandro will then request a bulk approval (on jdk8u-dev at ojn) when hs-dev syncs up. Andreas: similar to approval requests the RFR should cover whether the changesets were imported directly without change, or whether modifications were needed. For the former include a link to the 9 changesets; for the latter a link to the 8u webrev.
David > Thanks, > Serguei > > On 11/4/14 9:04 AM, Andreas Eriksson wrote: >> Or do I need to send out a real review for the backport? >> I'm not sure what the process is here. >> >> Thanks, >> Andreas >> >> On 2014-11-04 17:57, Andreas Eriksson wrote: >>> Hi, >>> >>> Just wanted to let the list know that I'm about to backport this >>> change to jdk8. >>> >>> Regards, >>> Andreas >>> >>> On 2014-10-22 13:54, Andreas Eriksson wrote: >>>> Thanks Serguei! >>>> >>>> Regards, >>>> Andreas >>>> >>>> On 2014-10-21 22:30, serguei.spitsyn at oracle.com wrote: >>>>> Hi Andreas, >>>>> >>>>> Very nice, thank you for the refactoring! >>>>> Thumbs up. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 10/21/14 12:54 PM, Andreas Eriksson wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> I split up the method into several, and made the verification >>>>>> before and after retransform share logic. >>>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8057043/webrev.02/ >>>>>> >>>>>> Regards, >>>>>> Andreas >>>>>> >>>>>> On 2014-10-20 21:54, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Andreas, >>>>>>> >>>>>>> Sorry for the delay. >>>>>>> >>>>>>> On 10/16/14 4:19 AM, Andreas Eriksson wrote: >>>>>>>> >>>>>>>> On 2014-10-15 15:47, Daniel D. Daugherty wrote: >>>>>>>>> On 10/15/14 7:34 AM, Coleen Phillimore wrote: >>>>>>>>>> >>>>>>>>>> There are lots of other rewrite_cp_refs_in* function calls. >>>>>>>>>> Please indent your function like them, not differently. >>>>>>>>> >>>>>>>>> The above implies that my answer below was made without sufficient >>>>>>>>> context... my apologies for that. >>>>>>>>> >>>>>>>>> The general rule is to follow the existing style in the file so >>>>>>>>> if there are rewrite_cp_refs_in* function calls in the file, then >>>>>>>>> please follow that style. 
Unless, of course, you want to fix all >>>>>>>>> of them to follow the HotSpot style guideline: >>>>>>>>> >>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide >>>>>>>>> >>>>>>>>> > Use good taste to break lines and align corresponding tokens >>>>>>>>> > on adjacent lines. >>>>>>>>> >>>>>>>>> but that may cause Coleen some heartburn :-) >>>>>>>> >>>>>>>> I fixed the calls to follow the already existing indent style. >>>>>>>> I have also made changes to the test, which I hope Joel can take >>>>>>>> a look at. >>>>>>>> >>>>>>>> New webrev: >>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.01/ >>>>>>> >>>>>>> The fix looks good. >>>>>>> >>>>>>> A couple of comments about the test. >>>>>>> >>>>>>> The method testTransformAndVerify() is too big. >>>>>>> At least, it looks like there are some ways to refactor it to >>>>>>> make calls to smaller methods. >>>>>>> >>>>>>> There are two directions of doing it: >>>>>>> - make a smaller method out of each block: >>>>>>> 217-236, 238-260, 262-276, 311-329, 331-351, 353-367 >>>>>>> - some of the lines sequences looks very typical: >>>>>>> 221 at = >>>>>>> c.getDeclaredField("typeAnnotatedArray").getAnnotatedType(); >>>>>>> 222 arrayTA1 = at.getAnnotations()[0]; >>>>>>> 223 verifyTestAnnSite(arrayTA1, "array1"); >>>>>>> 224 >>>>>>> 225 at = ((AnnotatedArrayType) >>>>>>> at).getAnnotatedGenericComponentType(); >>>>>>> 226 arrayTA2 = at.getAnnotations()[0]; >>>>>>> 227 verifyTestAnnSite(arrayTA2, "array2"); >>>>>>> 228 >>>>>>> 229 at = ((AnnotatedArrayType) >>>>>>> at).getAnnotatedGenericComponentType(); >>>>>>> 230 arrayTA3 = at.getAnnotations()[0]; >>>>>>> 231 verifyTestAnnSite(arrayTA3, "array3"); >>>>>>> 232 >>>>>>> 233 at = ((AnnotatedArrayType) >>>>>>> at).getAnnotatedGenericComponentType(); >>>>>>> 234 arrayTA4 = at.getAnnotations()[0]; >>>>>>> 235 verifyTestAnnSite(arrayTA4, "array4"); >>>>>>> But I leave it up to you. 
>>>>>>> >>>>>>> Another step to improve the readability is to add a short comment >>>>>>> for each block of code saying what is done there. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Andreas >>>>>>>> >>>>>>>>> >>>>>>>>> Dan >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>> On 10/15/14, 9:05 AM, Daniel D. Daugherty wrote: >>>>>>>>>>> On 10/15/14 5:22 AM, Andreas Eriksson wrote: >>>>>>>>>>>> Thanks Serguei. >>>>>>>>>>>> >>>>>>>>>>>> I have a question about the if-blocks that had the wrong >>>>>>>>>>>> indent: >>>>>>>>>>>> >>>>>>>>>>>> 2335 if >>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>> >>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>> >>>>>>>>>>>> How should I indent them? >>>>>>>>>>> >>>>>>>>>>> Trying again without the line numbers... >>>>>>>>>>> >>>>>>>>>>> if >>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>> >>>>>>>>>>> byte_i, "method_info", >>>>>>>>>>> THREAD)) { >>>>>>>>>>> >>>>>>>>>>> Just in case, TB messes with the spacing again, the "byte_i" >>>>>>>>>>> line and >>>>>>>>>>> "THREAD" lines are aligned under "method_type_annotations". >>>>>>>>>>> >>>>>>>>>>> Dan >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> /Andreas >>>>>>>>>>>> >>>>>>>>>>>> On 2014-10-15 07:00, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry I did not reply on this early. >>>>>>>>>>>>> I assumed, it is a thumbs up from me. >>>>>>>>>>>>> Just wanted make it clean now. :) >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/13/14 3:09 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>> Hi Serguei, thanks for looking at this! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'll make sure to fix the style problems. >>>>>>>>>>>>>> For the symbolic names / #defines, please see my answer to >>>>>>>>>>>>>> Coleen. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Andreas >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2014-10-11 12:37, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for fixing this issue! >>>>>>>>>>>>>>> The fix looks nice, I do not see any logical issues. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Only minor comments... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> src/share/vm/prims/jvmtiRedefineClasses.cpp >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2281 } // end rewrite_cp_refs_in_class_type_annotations( >>>>>>>>>>>>>>> 2315 } // end rewrite_cp_refs_in_fields_type_annotations( >>>>>>>>>>>>>>> 2345 } // end rewrite_cp_refs_in_methods_type_annotations() >>>>>>>>>>>>>>> 2397 } // end rewrite_cp_refs_in_type_annotations_typeArray >>>>>>>>>>>>>>> 2443 } // end rewrite_cp_refs_in_type_annotation_struct >>>>>>>>>>>>>>> 2785 } // end skip_type_annotation_target >>>>>>>>>>>>>>> 2844 } // end skip_type_annotation_type_path >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The ')' is missed at 2281, 2315. >>>>>>>>>>>>>>> The 2397-2844 are inconsistent with the 2345 and other >>>>>>>>>>>>>>> function-end comments in the file. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2335 if >>>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>>>>> . . . >>>>>>>>>>>>>>> 2378 if >>>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2379 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>>> . . . 
>>>>>>>>>>>>>>> 2427 if >>>>>>>>>>>>>>> (!skip_type_annotation_target(type_annotations_typeArray, >>>>>>>>>>>>>>> 2428 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>>> 2429 return false; >>>>>>>>>>>>>>> 2430 } >>>>>>>>>>>>>>> 2431 >>>>>>>>>>>>>>> 2432 if >>>>>>>>>>>>>>> (!skip_type_annotation_type_path(type_annotations_typeArray, >>>>>>>>>>>>>>> 2433 byte_i_ref, THREAD)) { >>>>>>>>>>>>>>> 2434 return false; >>>>>>>>>>>>>>> 2435 } >>>>>>>>>>>>>>> 2436 >>>>>>>>>>>>>>> 2437 if >>>>>>>>>>>>>>> (!rewrite_cp_refs_in_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2438 byte_i_ref, THREAD)) { >>>>>>>>>>>>>>> 2439 return false; >>>>>>>>>>>>>>> Wrong indent at 2336, 2379, etc. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I also concur with Coleen that it would be good to define >>>>>>>>>>>>>>> and use >>>>>>>>>>>>>>> symbolic names for the hexa-decimal constants used in the >>>>>>>>>>>>>>> fix. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> test/runtime/RedefineTests/RedefineAnnotations.java >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Java indent must be 4, not 2. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 253 @TestAnn(site="returnTypeAnnotation") Class >>>>>>>>>>>>>>> typeAnnotatedMethod(@TestAnn(site="formalParameterTypeAnnotation") >>>>>>>>>>>>>>> TypeAnnotatedTestClass arg) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The line is too long. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 143 } >>>>>>>>>>>>>>> 144 public static void main(String argv[]) { >>>>>>>>>>>>>>> . . . 
>>>>>>>>>>>>>>> 209 } >>>>>>>>>>>>>>> 210 private static void >>>>>>>>>>>>>>> checkAnnotations(AnnotatedType p) { >>>>>>>>>>>>>>> 211 checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>>> 212 } >>>>>>>>>>>>>>> 213 private static void >>>>>>>>>>>>>>> checkAnnotations(AnnotatedType[] annoTypes) { >>>>>>>>>>>>>>> 214 for (AnnotatedType p : annoTypes) >>>>>>>>>>>>>>> checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>>> 215 } >>>>>>>>>>>>>>> 216 private static void >>>>>>>>>>>>>>> checkAnnotations(Class c) { >>>>>>>>>>>>>>> . . . >>>>>>>>>>>>>>> 257 } >>>>>>>>>>>>>>> 258 public void run() {} >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Adding empty lines between method definitions would >>>>>>>>>>>>>>> improve readability. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 10/9/14 6:21 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please review this patch to RedefineClasses to allow >>>>>>>>>>>>>>>> type annotations to be preserved. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Summary: >>>>>>>>>>>>>>>> During redefine / retransform class the constant pool >>>>>>>>>>>>>>>> indexes can change. >>>>>>>>>>>>>>>> Since annotations have indexes into the constant pool >>>>>>>>>>>>>>>> these indexes need to be rewritten. >>>>>>>>>>>>>>>> This is already done for regular annotations, but not >>>>>>>>>>>>>>>> for type annotations. >>>>>>>>>>>>>>>> This patch adds code to add this rewriting for the type >>>>>>>>>>>>>>>> annotations as well. >>>>>>>>>>>>>>>> The patch also contains minor changes to >>>>>>>>>>>>>>>> ClassFileReconstituter, to make sure that type >>>>>>>>>>>>>>>> annotations are preserved during a redefine / >>>>>>>>>>>>>>>> retransform class operation. >>>>>>>>>>>>>>>> It also has a test that uses asm to change constant pool >>>>>>>>>>>>>>>> indexes through a retransform, and then verifies that >>>>>>>>>>>>>>>> type annotations are preserved. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Detail: >>>>>>>>>>>>>>>> A type annotation struct consists of some target >>>>>>>>>>>>>>>> information and a type path, followed by a regular >>>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>>> Constant pool indexes are only present in the regular >>>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>>> The added code skips over the type annotation specific >>>>>>>>>>>>>>>> parts, then calls previously existing code to rewrite >>>>>>>>>>>>>>>> constant pool indexes in the regular annotation struct. >>>>>>>>>>>>>>>> Please see the Java SE 8 Ed. VM Spec. section 4.7.20 for >>>>>>>>>>>>>>>> more info about the type annotation struct. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> JPRT with the new test passes without failures on all >>>>>>>>>>>>>>>> platforms. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Bug: >>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8057043 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regards >>>>>>>>>>>>>>>> Andreas >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From jiangli.zhou at oracle.com Wed Nov 5 01:35:56 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 04 Nov 2014 17:35:56 -0800 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <5457F530.2070907@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> Message-ID: <54597EFC.2070509@oracle.com> Hi Eric, I have a few more comments: In ClassFileParser::parse_method(), should 'real_length' be int instead of u2? In JVM_GetMethodParameters(), can you add an assert to make sure the num_params is -1 when it's less than 0? 
Also, it's probably more conventional to use (num_params < 0) instead of (0 > num_params). Thanks, Jiangli On 11/03/2014 01:35 PM, Eric McCorkle wrote: > Please review this issue so that it can go in along with 8058322. Thanks. > > On 10/30/14 19:40, Eric McCorkle wrote: >> Thank you for the pointers. I have applied your changes and refreshed >> the webrev. >> >> http://cr.openjdk.java.net/~emc/8058313/ >> >> Also, I have posted the test for this and another patch here: >> http://cr.openjdk.java.net/~emc/8062556/ >> >> On 10/30/14 13:51, Jiangli Zhou wrote: >>> Hi Eric, >>> >>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>> Hi Eric, >>>>> >>>>> I wonder if we could specialize this particular case and avoid changing >>>>> the parsing code. How about setting the _has_method_parameters flag in >>>>> the ConstMethod when encountering such a MethodParameters attribute, and changing >>>>> JVM_GetMethodParameters() to return a non-NULL value for such a case when >>>>> _has_method_parameters is true but method_parameters_length is 0. Would >>>>> that work? >>>> Which parser are you talking about? The inline tables parser, or the >>>> class file parser? The class file parser has to change, because it was >>>> previously ignoring MethodParameters attributes with parameter_count 0. >>> It's the class parsing changes that I was referring to, which mostly relate to >>> the initialization and checking against method_parameters_length. It's a >>> bit awkward to include the 0 case but also skip it in the loop. For >>> example, the following code in classFileParser.cpp changed ">" to ">=" >>> in the if check, but has no real effect and is not needed.
>>> >>> 2486 // Copy method parameters >>> 2487 if (method_parameters_length >= 0) { >>> 2488 MethodParametersElement* elem = >>> m->constMethod()->method_parameters_start(); >>> 2489 for (int i = 0; i < method_parameters_length; i++) { >>> 2490 elem[i].name_cp_index = >>> Bytes::get_Java_u2(method_parameters_data); >>> 2491 method_parameters_data += 2; >>> 2492 elem[i].flags = Bytes::get_Java_u2(method_parameters_data); >>> 2493 method_parameters_data += 2; >>> 2494 } >>> 2495 } >>> >>> >>>> I don't think your proposal will work. The inline tables' offsets are >>>> all dependent on what inline tables are actually present. If >>>> _has_method_parameters is set, then the inline tables code expects the >>>> last u2 of the inline tables to be a u2 indicating the number of method >>>> parameters entries, preceded by the array of method parameters data. >>>> If _has_method_parameters is false, then it expects that there is no >>>> method parameters information at all (including no length field). If >>>> you were to set _has_method_parameters, but not store any information in >>>> the inline table, then it would cause errors for all the rest of the >>>> inline tables. >>> Thank you for reminding me of the complexity of the inlined table >>> calculation in the ConstMethod. My proposal would require tweaks in that >>> area to correctly compute the table sizes. As it's easy to introduce >>> bugs in that area, it's not worth changing the table calculation code >>> for this purpose. I agree my proposal is not a better choice in this case. >>> >>>> What I do for the parameter_count = 0 case is just store >>>> a 0 u2 for zero-length method parameters information, and no data. All >>>> the existing inline tables code works fine with this case, so there >>>> aren't any serious changes to the inline tables code (other than >>>> allowing method parameters information to be stored when the array is >>>> length 0).
But you have to make some change to the inline table code, >>>> otherwise the information won't be stored. >>> Ok. Could you please add comments to the change in constMethod.cpp to >>> explain the above? >>> >>> In jvm.cpp, since -1 now represents no method parameters, maybe check >>> against it explicitly and add comments for the 0-length case. >>> >>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject >>> method)) >>> { >>> ... >>> // No method parameter >>> if (num_params == -1) { >>> return (jobjectArray)NULL; >>> } >>> >>> /* handle the rest here */ >>> // make sure all the symbols are properly formatted >>> for (int i = 0; i < num_params; i++) { >>> ... >>> } >>> >>> Thanks, >>> Jiangli >>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>> Hello, >>>>>> >>>>>> Please review this fix for parameter reflection which addresses hotspot >>>>>> falsely ignoring zero-length MethodParameters attributes. The JVMS >>>>>> allows a MethodParameters attribute with parameter_count = 0, and the >>>>>> parameter reflection spec states that a MalformedParametersException >>>>>> should be thrown if parameter_count does not match the number of real >>>>>> parameters to a method. Hotspot currently ignores MethodParameters >>>>>> attributes with parameter_count = 0; however, in a case where a (bad) >>>>>> MethodParameters attribute has parameter_count = 0, but the method >>>>>> has a >>>>>> nonzero number of real parameters, hotspot will return null from >>>>>> JVM_GetMethodParameters, the result being that a >>>>>> MalformedParametersException is not thrown (rather, the reflection API >>>>>> acts like there is no MethodParameters attribute). >>>>>> >>>>>> This patch causes hotspot to record the fact that a zero-length >>>>>> MethodParameters attribute does exist, causing the exception to be >>>>>> thrown when it should be.
>>>>>> >>>>>> The bug is here: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>> >>>>>> The webrev is here: >>>>>> http://cr.openjdk.java.net/~emc/8058313/ From david.holmes at oracle.com Wed Nov 5 01:48:36 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 05 Nov 2014 11:48:36 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> Message-ID: <545981F4.3050006@oracle.com> Hi Goetz, The only issue I see is in: src/share/vm/runtime/globals.cpp where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. Thanks, David On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: > Hi, > > this change contains a row of minor code improvements we did to fulfil > our internal quality requirements. We would like to share these with > openJDK. > > Please review and test this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8062370 > > We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, > of course, the ppc platforms. > > > Some details: > > CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. > > We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. > > We add some missing memory frees and some closing of files. > > jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. 
> > Best regards, > > Goetz > > > > From goetz.lindenmaier at sap.com Wed Nov 5 08:16:04 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 5 Nov 2014 08:16:04 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545981F4.3050006@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> Hi David, thanks for looking at the change! I fixed the issue in a new webrev: http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Wednesday, 5 November 2014 02:49 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Subject: Re: RFR (L): 8062370: Various minor code improvements Hi Goetz, The only issue I see is in: src/share/vm/runtime/globals.cpp where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. Thanks, David On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: > Hi, > > this change contains a row of minor code improvements we did to fulfil > our internal quality requirements. We would like to share these with > openJDK. > > Please review and test this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8062370 > > We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, > of course, the ppc platforms. > > > Some details: > > CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. > > We add some more strncpy uses. Also, we fix strncpy on windows.
There, strncpy does not write a \0 into the last byte if the copied string is too long. > > We add some missing memory frees and some closing of files. > > jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. > > Best regards, > > Goetz > > > > From stefan.karlsson at oracle.com Wed Nov 5 08:03:58 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 09:03:58 +0100 Subject: [8u40] RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse In-Reply-To: <542D2EC1.5030303@oracle.com> References: <542D2EC1.5030303@oracle.com> Message-ID: <5459D9EE.1030202@oracle.com> Hi all, Please, review the backport of this fix to 8u40: http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/ I've attached the .rej files from applying the JDK 9 patch to JDK 8u40. Reasons for the .rej files from the failed applied patch hunks: metadataOnStackMark.cpp - the has_redefined_a_class parameter in MetadataOnStackMark(bool has_redefined_a_class) is only present in JDK 9. metadataOnStackMark.hpp - the same as above classLoaderData.cpp - has_redefined_a_class parameter and JDK-8040237 isn't backported concurrentMark.cpp - JDK-8027450 isn't backported thanks, StefanK On 2014-10-02 12:53, Stefan Karlsson wrote: > Hi all, > > (The following patch changes HotSpot code in areas concerning GC, RT, > and Compiler. So, it would be good to get reviews from all three teams.) > > Please, review this patch to optimize and parallelize the CodeCache > part of MetadataOnStackMark. > > G1 performance measurements showed longer than expected remark times > on an application using a lot of nmethods and Metadata. The cause for > this performance regression was the call to > CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in > MetadataOnStackMark. This code path is only taken when class > redefinition is used.
Class redefinition is typically used in > monitoring and diagnostic frameworks. > > With this patch we now: > 1) Filter out duplicate Metadata* entries instead of storing a > Metadata* per visited metadata reference. > 2) Collect the set of Metadata* entries in parallel. The code > piggy-backs on the parallel CodeCache cleaning task. > > http://cr.openjdk.java.net/~stefank/8056240/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8056240 > > Functional testing: > JPRT, Kitchensink, parallel_class_unloading, unit tests > > Performance testing: > CRM Fuse - where the regression was found > > The patch changes HotSpot code in areas concerning GC, RT, Compiler, > and Serviceability. It would be good to get some reviews from the > other teams, and not only from the GC team. > > thanks, > StefanK From stefan.karlsson at oracle.com Wed Nov 5 08:48:32 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 09:48:32 +0100 Subject: [8u40] RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse In-Reply-To: <5459D9EE.1030202@oracle.com> References: <542D2EC1.5030303@oracle.com> <5459D9EE.1030202@oracle.com> Message-ID: <5459E460.7080608@oracle.com> On 2014-11-05 09:03, Stefan Karlsson wrote: > Hi all, > > Please, review the backport of this fix to 8u40: > http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/ > > I've attached the .rej files from applying the JDK 9 patch to JDK 8u40. The attachments were removed. The files can be found at: http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/rej/ thanks, StefanK > > Reasons for the .rej files from the failed applied patch hunks: > metadataOnStackMark.cpp - the has_redefined_a_class parameter in > MetadataOnStackMark(bool has_redefined_a_class) is only present in JDK 9. 
> metadataOnStackMark.hpp - the same as above > classLoaderData.cpp - has_redefined_a_class parameter and JDK-8040237 > isn't backported > concurrentMark.cpp - JDK-8027450 isn't backported > > thanks, > StefanK > > On 2014-10-02 12:53, Stefan Karlsson wrote: >> Hi all, >> >> (The following patch changes HotSpot code in areas concerning GC, RT, >> and Compiler. So, it would be good to get reviews from all three teams.) >> >> Please, review this patch to optimize and parallelize the CodeCache >> part of MetadaOnStackMark. >> >> G1 performance measurements showed longer than expected remark times >> on an application using a lot of nmethods and Metadata. The cause for >> this performance regression was the call to >> CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in >> MetadataOnStackMark. This code path is only taken when class >> redefinition is used. Class redefinition is typically used in >> monitoring and diagnostic frameworks. >> >> With this patch we now: >> 1) Filter out duplicate Metadata* entries instead of storing a >> Metadata* per visited metadata reference. >> 2) Collect the set of Metadata* entries in parallel. The code >> piggy-backs on the parallel CodeCache cleaning task. >> >> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8056240 >> >> Functional testing: >> JPRT, Kitchensink, parallel_class_unloading, unit tests >> >> Performance testing: >> CRM Fuse - where the regression was found >> >> The patch changes HotSpot code in areas concerning GC, RT, Compiler, >> and Serviceability. It would be good to get some reviews from the >> other teams, and not only from the GC team. 
>> >> thanks, >> StefanK > From albert.noll at oracle.com Wed Nov 5 09:12:32 2014 From: albert.noll at oracle.com (Albert Noll) Date: Wed, 05 Nov 2014 10:12:32 +0100 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <54595418.6060300@oracle.com> References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> Message-ID: <5459EA00.9070109@oracle.com> Coleen, Vladimir, Serguei, thanks for reviewing this. Best, Albert On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote: > Hi Albert, > > The fix looks good. > > Thanks, > Serguei > > > On 11/4/14 5:01 AM, Albert Noll wrote: >> Hi, >> >> could I get reviews for this small patch? >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8062735 >> >> Problem: >> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not >> add this new type to SA. >> >> Solution: >> Add type to SA. >> >> Testing: >> Failing test cases. >> >> Webrev: >> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ >> >> Many thanks, >> Albert >> > From mikael.gerdin at oracle.com Wed Nov 5 09:16:10 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 05 Nov 2014 10:16:10 +0100 Subject: [8u40] RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse In-Reply-To: <5459E460.7080608@oracle.com> References: <542D2EC1.5030303@oracle.com> <5459D9EE.1030202@oracle.com> <5459E460.7080608@oracle.com> Message-ID: <5459EADA.3000107@oracle.com> Hi Stefan, On 2014-11-05 09:48, Stefan Karlsson wrote: > > On 2014-11-05 09:03, Stefan Karlsson wrote: >> Hi all, >> >> Please, review the backport of this fix to 8u40: >> http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/ >> >> I've attached the .rej files from applying the JDK 9 patch to JDK 8u40. > > The attachments were removed. 
The files can be found at: > http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/rej/ Thanks for posting the reject files, it made reviewing the backport much easier! Looks good. /Mikael > > thanks, > StefanK > >> >> Reasons for the .rej files from the failed applied patch hunks: >> metadataOnStackMark.cpp - the has_redefined_a_class parameter in >> MetadataOnStackMark(bool has_redefined_a_class) is only present in JDK 9. >> metadataOnStackMark.hpp - the same as above >> classLoaderData.cpp - has_redefined_a_class parameter and JDK-8040237 >> isn't backported >> concurrentMark.cpp - JDK-8027450 isn't backported >> >> thanks, >> StefanK >> >> On 2014-10-02 12:53, Stefan Karlsson wrote: >>> Hi all, >>> >>> (The following patch changes HotSpot code in areas concerning GC, RT, >>> and Compiler. So, it would be good to get reviews from all three teams.) >>> >>> Please, review this patch to optimize and parallelize the CodeCache >>> part of MetadaOnStackMark. >>> >>> G1 performance measurements showed longer than expected remark times >>> on an application using a lot of nmethods and Metadata. The cause for >>> this performance regression was the call to >>> CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in >>> MetadataOnStackMark. This code path is only taken when class >>> redefinition is used. Class redefinition is typically used in >>> monitoring and diagnostic frameworks. >>> >>> With this patch we now: >>> 1) Filter out duplicate Metadata* entries instead of storing a >>> Metadata* per visited metadata reference. >>> 2) Collect the set of Metadata* entries in parallel. The code >>> piggy-backs on the parallel CodeCache cleaning task. 
>>> >>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8056240 >>> >>> Functional testing: >>> JPRT, Kitchensink, parallel_class_unloading, unit tests >>> >>> Performance testing: >>> CRM Fuse - where the regression was found >>> >>> The patch changes HotSpot code in areas concerning GC, RT, Compiler, >>> and Serviceability. It would be good to get some reviews from the >>> other teams, and not only from the GC team. >>> >>> thanks, >>> StefanK >> > From stefan.karlsson at oracle.com Wed Nov 5 09:04:22 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 10:04:22 +0100 Subject: [8u40] RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse In-Reply-To: <5459EADA.3000107@oracle.com> References: <542D2EC1.5030303@oracle.com> <5459D9EE.1030202@oracle.com> <5459E460.7080608@oracle.com> <5459EADA.3000107@oracle.com> Message-ID: <5459E816.3000103@oracle.com> On 2014-11-05 10:16, Mikael Gerdin wrote: > Hi Stefan, > > On 2014-11-05 09:48, Stefan Karlsson wrote: >> >> On 2014-11-05 09:03, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review the backport of this fix to 8u40: >>> http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/ >>> >>> I've attached the .rej files from applying the JDK 9 patch to JDK 8u40. >> >> The attachments were removed. The files can be found at: >> http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/rej/ >> > > Thanks for posting the reject files, it made reviewing the backport > much easier! > > Looks good. Thanks! StefanK > > /Mikael > >> >> thanks, >> StefanK >> >>> >>> Reasons for the .rej files from the failed applied patch hunks: >>> metadataOnStackMark.cpp - the has_redefined_a_class parameter in >>> MetadataOnStackMark(bool has_redefined_a_class) is only present in >>> JDK 9. 
>>> metadataOnStackMark.hpp - the same as above >>> classLoaderData.cpp - has_redefined_a_class parameter and JDK-8040237 >>> isn't backported >>> concurrentMark.cpp - JDK-8027450 isn't backported >>> >>> thanks, >>> StefanK >>> >>> On 2014-10-02 12:53, Stefan Karlsson wrote: >>>> Hi all, >>>> >>>> (The following patch changes HotSpot code in areas concerning GC, RT, >>>> and Compiler. So, it would be good to get reviews from all three >>>> teams.) >>>> >>>> Please, review this patch to optimize and parallelize the CodeCache >>>> part of MetadaOnStackMark. >>>> >>>> G1 performance measurements showed longer than expected remark times >>>> on an application using a lot of nmethods and Metadata. The cause for >>>> this performance regression was the call to >>>> CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in >>>> MetadataOnStackMark. This code path is only taken when class >>>> redefinition is used. Class redefinition is typically used in >>>> monitoring and diagnostic frameworks. >>>> >>>> With this patch we now: >>>> 1) Filter out duplicate Metadata* entries instead of storing a >>>> Metadata* per visited metadata reference. >>>> 2) Collect the set of Metadata* entries in parallel. The code >>>> piggy-backs on the parallel CodeCache cleaning task. >>>> >>>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8056240 >>>> >>>> Functional testing: >>>> JPRT, Kitchensink, parallel_class_unloading, unit tests >>>> >>>> Performance testing: >>>> CRM Fuse - where the regression was found >>>> >>>> The patch changes HotSpot code in areas concerning GC, RT, Compiler, >>>> and Serviceability. It would be good to get some reviews from the >>>> other teams, and not only from the GC team. 
>>>>
>>>> thanks,
>>>> StefanK
>>>
>>

From albert.noll at oracle.com Wed Nov 5 09:47:59 2014
From: albert.noll at oracle.com (Albert Noll)
Date: Wed, 05 Nov 2014 10:47:59 +0100
Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA
In-Reply-To: <5459EA00.9070109@oracle.com>
References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com>
Message-ID: <5459F24F.4090700@oracle.com>

Hi,

I forgot to add CodeCacheSweeperThread.java. Could you please look at it again?
http://cr.openjdk.java.net/~anoll/8062735/webrev.01/

Thanks,
Albert

On 11/05/2014 10:12 AM, Albert Noll wrote:
> Coleen, Vladimir, Serguei, thanks for reviewing this.
>
> Best,
> Albert
>
> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Albert,
>>
>> The fix looks good.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/4/14 5:01 AM, Albert Noll wrote:
>>> Hi,
>>>
>>> could I get reviews for this small patch?
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8062735
>>>
>>> Problem:
>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did
>>> not add this new type to SA.
>>>
>>> Solution:
>>> Add type to SA.
>>>
>>> Testing:
>>> Failing test cases.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/
>>>
>>> Many thanks,
>>> Albert
>>>
>>
>

From serguei.spitsyn at oracle.com Wed Nov 5 10:19:30 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 05 Nov 2014 02:19:30 -0800
Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA
In-Reply-To: <5459F24F.4090700@oracle.com>
References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com> <5459F24F.4090700@oracle.com>
Message-ID: <5459F9B2.5020703@oracle.com>

Reviewed.

Thanks,
Serguei

On 11/5/14 1:47 AM, Albert Noll wrote:
> Hi,
>
> I forgot to add CodeCacheSweeperThread.java. Could you please look at
> it again?
> http://cr.openjdk.java.net/~anoll/8062735/webrev.01/
>
> Thanks,
> Albert
>
> On 11/05/2014 10:12 AM, Albert Noll wrote:
>> Coleen, Vladimir, Serguei, thanks for reviewing this.
>>
>> Best,
>> Albert
>>
>> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Albert,
>>>
>>> The fix looks good.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/4/14 5:01 AM, Albert Noll wrote:
>>>> Hi,
>>>>
>>>> could I get reviews for this small patch?
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8062735
>>>>
>>>> Problem:
>>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did
>>>> not add this new type to SA.
>>>>
>>>> Solution:
>>>> Add type to SA.
>>>>
>>>> Testing:
>>>> Failing test cases.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/
>>>>
>>>> Many thanks,
>>>> Albert
>>>>
>>>
>>
>

From albert.noll at oracle.com Wed Nov 5 10:20:44 2014
From: albert.noll at oracle.com (Albert Noll)
Date: Wed, 05 Nov 2014 11:20:44 +0100
Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA
In-Reply-To: <5459F9B2.5020703@oracle.com>
References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com> <5459F24F.4090700@oracle.com> <5459F9B2.5020703@oracle.com>
Message-ID: <5459F9FC.9010008@oracle.com>

Thank you, Serguei.

Best,
Albert

On 11/05/2014 11:19 AM, serguei.spitsyn at oracle.com wrote:
> Reviewed.
>
> Thanks,
> Serguei
>
>
> On 11/5/14 1:47 AM, Albert Noll wrote:
>> Hi,
>>
>> I forgot to add CodeCacheSweeperThread.java. Could you please look at
>> it again?
>> http://cr.openjdk.java.net/~anoll/8062735/webrev.01/
>>
>> Thanks,
>> Albert
>>
>> On 11/05/2014 10:12 AM, Albert Noll wrote:
>>> Coleen, Vladimir, Serguei, thanks for reviewing this.
>>>
>>> Best,
>>> Albert
>>>
>>> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Albert,
>>>>
>>>> The fix looks good.
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/4/14 5:01 AM, Albert Noll wrote: >>>>> Hi, >>>>> >>>>> could I get reviews for this small patch? >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8062735 >>>>> >>>>> Problem: >>>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did >>>>> not add this new type to SA. >>>>> >>>>> Solution: >>>>> Add type to SA. >>>>> >>>>> Testing: >>>>> Failing test cases. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ >>>>> >>>>> Many thanks, >>>>> Albert >>>>> >>>> >>> >> > From andreas.eriksson at oracle.com Wed Nov 5 10:27:30 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Wed, 05 Nov 2014 11:27:30 +0100 Subject: FYI: Jdk8 backport, Was: Re: RFR(M): 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <54596C28.40900@oracle.com> References: <54368BC1.3000905@oracle.com> <54390854.1090109@oracle.com> <543BA4DB.3000104@oracle.com> <543DFF76.80605@oracle.com> <543E590A.4020304@oracle.com> <543E712D.90900@oracle.com> <543E77CA.3090806@oracle.com> <543E7B08.8040305@oracle.com> <543FA9AD.2070404@oracle.com> <5445686F.4090103@oracle.com> <5446B9FA.4080201@oracle.com> <5446C254.7040801@oracle.com> <54479AEA.8040308@oracle.com> <54590572.2030302@oracle.com> <5459072A.9050506@oracle.com> <54596350.2030609@oracle.com> <54596C28.40900@oracle.com> Message-ID: <5459FB92.8060000@oracle.com> On 2014-11-05 01:15, David Holmes wrote: > On 5/11/2014 9:37 AM, serguei.spitsyn at oracle.com wrote: >> Hi Andreas, >> >> If the port is straightforward then referring to the jdk 9 reviewers >> should be enough. >> You still need to get an approval. > > You don't need an approval in sense of requesting approval from > jdk8u-dev at ojn. You need a RFR of the backport on the hotspot lists > before pushing to jdk8u/hs-dev. Alejandro will then request a bulk > approval (on jdk8u-dev at ojn) when hs-dev syncs up. 
> > Andreas: similar to approval requests the RFR should cover whether the > changesets were imported directly without change, or whether > modifications were needed. For the former include a link to the 9 > changesets; for the latter a link to the 8u webrev. Alright, thanks. /Andreas > > David > >> Thanks, >> Serguei >> >> On 11/4/14 9:04 AM, Andreas Eriksson wrote: >>> Or do I need to send out a real review for the backport? >>> I'm not sure what the process is here. >>> >>> Thanks, >>> Andreas >>> >>> On 2014-11-04 17:57, Andreas Eriksson wrote: >>>> Hi, >>>> >>>> Just wanted to let the list know that I'm about to backport this >>>> change to jdk8. >>>> >>>> Regards, >>>> Andreas >>>> >>>> On 2014-10-22 13:54, Andreas Eriksson wrote: >>>>> Thanks Serguei! >>>>> >>>>> Regards, >>>>> Andreas >>>>> >>>>> On 2014-10-21 22:30, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Andreas, >>>>>> >>>>>> Very nice, thank you for the refactoring! >>>>>> Thumbs up. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 10/21/14 12:54 PM, Andreas Eriksson wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> I split up the method into several, and made the verification >>>>>>> before and after retransform share logic. >>>>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8057043/webrev.02/ >>>>>>> >>>>>>> Regards, >>>>>>> Andreas >>>>>>> >>>>>>> On 2014-10-20 21:54, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Andreas, >>>>>>>> >>>>>>>> Sorry for the delay. >>>>>>>> >>>>>>>> On 10/16/14 4:19 AM, Andreas Eriksson wrote: >>>>>>>>> >>>>>>>>> On 2014-10-15 15:47, Daniel D. Daugherty wrote: >>>>>>>>>> On 10/15/14 7:34 AM, Coleen Phillimore wrote: >>>>>>>>>>> >>>>>>>>>>> There are lots of other rewrite_cp_refs_in* function calls. >>>>>>>>>>> Please indent your function like them, not differently. >>>>>>>>>> >>>>>>>>>> The above implies that my answer below was made without >>>>>>>>>> sufficient >>>>>>>>>> context... my apologies for that. 
>>>>>>>>>>
>>>>>>>>>> The general rule is to follow the existing style in the file so
>>>>>>>>>> if there are rewrite_cp_refs_in* function calls in the file, then
>>>>>>>>>> please follow that style. Unless, of course, you want to fix all
>>>>>>>>>> of them to follow the HotSpot style guideline:
>>>>>>>>>>
>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>>>>>>>>>
>>>>>>>>>> > Use good taste to break lines and align corresponding tokens
>>>>>>>>>> > on adjacent lines.
>>>>>>>>>>
>>>>>>>>>> but that may cause Coleen some heartburn :-)
>>>>>>>>>
>>>>>>>>> I fixed the calls to follow the already existing indent style.
>>>>>>>>> I have also made changes to the test, which I hope Joel can take
>>>>>>>>> a look at.
>>>>>>>>>
>>>>>>>>> New webrev:
>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.01/
>>>>>>>>
>>>>>>>> The fix looks good.
>>>>>>>>
>>>>>>>> A couple of comments about the test.
>>>>>>>>
>>>>>>>> The method testTransformAndVerify() is too big.
>>>>>>>> At least, it looks like there are some ways to refactor it to
>>>>>>>> make calls to smaller methods.
>>>>>>>>
>>>>>>>> There are two directions of doing it:
>>>>>>>> - make a smaller method out of each block:
>>>>>>>> 217-236, 238-260, 262-276, 311-329, 331-351, 353-367
>>>>>>>> - some of the line sequences look very typical:
>>>>>>>> 221 at = c.getDeclaredField("typeAnnotatedArray").getAnnotatedType();
>>>>>>>> 222 arrayTA1 = at.getAnnotations()[0];
>>>>>>>> 223 verifyTestAnnSite(arrayTA1, "array1");
>>>>>>>> 224
>>>>>>>> 225 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType();
>>>>>>>> 226 arrayTA2 = at.getAnnotations()[0];
>>>>>>>> 227 verifyTestAnnSite(arrayTA2, "array2");
>>>>>>>> 228
>>>>>>>> 229 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType();
>>>>>>>> 230 arrayTA3 = at.getAnnotations()[0];
>>>>>>>> 231 verifyTestAnnSite(arrayTA3, "array3");
>>>>>>>> 232
>>>>>>>> 233 at = ((AnnotatedArrayType) at).getAnnotatedGenericComponentType();
>>>>>>>> 234 arrayTA4 = at.getAnnotations()[0];
>>>>>>>> 235 verifyTestAnnSite(arrayTA4, "array4");
>>>>>>>> But I leave it up to you.
>>>>>>>>
>>>>>>>> Another step to improve the readability is to add a short comment
>>>>>>>> for each block of code saying what is done there.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Coleen
>>>>>>>>>>>
>>>>>>>>>>> On 10/15/14, 9:05 AM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> On 10/15/14 5:22 AM, Andreas Eriksson wrote:
>>>>>>>>>>>>> Thanks Serguei.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a question about the if-blocks that had the wrong indent:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2335 if (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations,
>>>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>> How should I indent them?
>>>>>>>>>>>>
>>>>>>>>>>>> Trying again without the line numbers...
>>>>>>>>>>>> >>>>>>>>>>>> if >>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> byte_i, "method_info", >>>>>>>>>>>> THREAD)) { >>>>>>>>>>>> >>>>>>>>>>>> Just in case, TB messes with the spacing again, the "byte_i" >>>>>>>>>>>> line and >>>>>>>>>>>> "THREAD" lines are aligned under "method_type_annotations". >>>>>>>>>>>> >>>>>>>>>>>> Dan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> /Andreas >>>>>>>>>>>>> >>>>>>>>>>>>> On 2014-10-15 07:00, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sorry I did not reply on this early. >>>>>>>>>>>>>> I assumed, it is a thumbs up from me. >>>>>>>>>>>>>> Just wanted make it clean now. :) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/13/14 3:09 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>>> Hi Serguei, thanks for looking at this! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'll make sure to fix the style problems. >>>>>>>>>>>>>>> For the symbolic names / #defines, please see my answer to >>>>>>>>>>>>>>> Coleen. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>> Andreas >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2014-10-11 12:37, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>>>> Hi Andreas, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for fixing this issue! >>>>>>>>>>>>>>>> The fix looks nice, I do not see any logical issues. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Only minor comments... 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> src/share/vm/prims/jvmtiRedefineClasses.cpp >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2281 } // end rewrite_cp_refs_in_class_type_annotations( >>>>>>>>>>>>>>>> 2315 } // end rewrite_cp_refs_in_fields_type_annotations( >>>>>>>>>>>>>>>> 2345 } // end >>>>>>>>>>>>>>>> rewrite_cp_refs_in_methods_type_annotations() >>>>>>>>>>>>>>>> 2397 } // end >>>>>>>>>>>>>>>> rewrite_cp_refs_in_type_annotations_typeArray >>>>>>>>>>>>>>>> 2443 } // end rewrite_cp_refs_in_type_annotation_struct >>>>>>>>>>>>>>>> 2785 } // end skip_type_annotation_target >>>>>>>>>>>>>>>> 2844 } // end skip_type_annotation_type_path >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The ')' is missed at 2281, 2315. >>>>>>>>>>>>>>>> The 2397-2844 are inconsistent with the 2345 and other >>>>>>>>>>>>>>>> function-end comments in the file. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2335 if >>>>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotations_typeArray(method_type_annotations, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2336 byte_i, "method_info", THREAD)) { >>>>>>>>>>>>>>>> . . . >>>>>>>>>>>>>>>> 2378 if >>>>>>>>>>>>>>>> (!rewrite_cp_refs_in_type_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2379 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>>>> . . . 
>>>>>>>>>>>>>>>> 2427 if >>>>>>>>>>>>>>>> (!skip_type_annotation_target(type_annotations_typeArray, >>>>>>>>>>>>>>>> 2428 byte_i_ref, location_mesg, THREAD)) { >>>>>>>>>>>>>>>> 2429 return false; >>>>>>>>>>>>>>>> 2430 } >>>>>>>>>>>>>>>> 2431 >>>>>>>>>>>>>>>> 2432 if >>>>>>>>>>>>>>>> (!skip_type_annotation_type_path(type_annotations_typeArray, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2433 byte_i_ref, THREAD)) { >>>>>>>>>>>>>>>> 2434 return false; >>>>>>>>>>>>>>>> 2435 } >>>>>>>>>>>>>>>> 2436 >>>>>>>>>>>>>>>> 2437 if >>>>>>>>>>>>>>>> (!rewrite_cp_refs_in_annotation_struct(type_annotations_typeArray, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2438 byte_i_ref, THREAD)) { >>>>>>>>>>>>>>>> 2439 return false; >>>>>>>>>>>>>>>> Wrong indent at 2336, 2379, etc. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I also concur with Coleen that it would be good to define >>>>>>>>>>>>>>>> and use >>>>>>>>>>>>>>>> symbolic names for the hexa-decimal constants used in the >>>>>>>>>>>>>>>> fix. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/runtime/RedefineTests/RedefineAnnotations.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Java indent must be 4, not 2. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 253 @TestAnn(site="returnTypeAnnotation") Class >>>>>>>>>>>>>>>> typeAnnotatedMethod(@TestAnn(site="formalParameterTypeAnnotation") >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> TypeAnnotatedTestClass arg) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The line is too long. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 143 } >>>>>>>>>>>>>>>> 144 public static void main(String argv[]) { >>>>>>>>>>>>>>>> . . . 
>>>>>>>>>>>>>>>> 209 } >>>>>>>>>>>>>>>> 210 private static void >>>>>>>>>>>>>>>> checkAnnotations(AnnotatedType p) { >>>>>>>>>>>>>>>> 211 checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>>>> 212 } >>>>>>>>>>>>>>>> 213 private static void >>>>>>>>>>>>>>>> checkAnnotations(AnnotatedType[] annoTypes) { >>>>>>>>>>>>>>>> 214 for (AnnotatedType p : annoTypes) >>>>>>>>>>>>>>>> checkAnnotations(p.getAnnotations()); >>>>>>>>>>>>>>>> 215 } >>>>>>>>>>>>>>>> 216 private static void >>>>>>>>>>>>>>>> checkAnnotations(Class c) { >>>>>>>>>>>>>>>> . . . >>>>>>>>>>>>>>>> 257 } >>>>>>>>>>>>>>>> 258 public void run() {} >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Adding empty lines between method definitions would >>>>>>>>>>>>>>>> improve readability. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 10/9/14 6:21 AM, Andreas Eriksson wrote: >>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please review this patch to RedefineClasses to allow >>>>>>>>>>>>>>>>> type annotations to be preserved. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Summary: >>>>>>>>>>>>>>>>> During redefine / retransform class the constant pool >>>>>>>>>>>>>>>>> indexes can change. >>>>>>>>>>>>>>>>> Since annotations have indexes into the constant pool >>>>>>>>>>>>>>>>> these indexes need to be rewritten. >>>>>>>>>>>>>>>>> This is already done for regular annotations, but not >>>>>>>>>>>>>>>>> for type annotations. >>>>>>>>>>>>>>>>> This patch adds code to add this rewriting for the type >>>>>>>>>>>>>>>>> annotations as well. >>>>>>>>>>>>>>>>> The patch also contains minor changes to >>>>>>>>>>>>>>>>> ClassFileReconstituter, to make sure that type >>>>>>>>>>>>>>>>> annotations are preserved during a redefine / >>>>>>>>>>>>>>>>> retransform class operation. 
>>>>>>>>>>>>>>>>> It also has a test that uses asm to change constant pool >>>>>>>>>>>>>>>>> indexes through a retransform, and then verifies that >>>>>>>>>>>>>>>>> type annotations are preserved. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Detail: >>>>>>>>>>>>>>>>> A type annotation struct consists of some target >>>>>>>>>>>>>>>>> information and a type path, followed by a regular >>>>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>>>> Constant pool indexes are only present in the regular >>>>>>>>>>>>>>>>> annotation struct. >>>>>>>>>>>>>>>>> The added code skips over the type annotation specific >>>>>>>>>>>>>>>>> parts, then calls previously existing code to rewrite >>>>>>>>>>>>>>>>> constant pool indexes in the regular annotation struct. >>>>>>>>>>>>>>>>> Please see the Java SE 8 Ed. VM Spec. section 4.7.20 for >>>>>>>>>>>>>>>>> more info about the type annotation struct. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> JPRT with the new test passes without failures on all >>>>>>>>>>>>>>>>> platforms. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~aeriksso/8057043/webrev.00/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Bug: >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8057043 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Regards >>>>>>>>>>>>>>>>> Andreas >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> From serguei.spitsyn at oracle.com Wed Nov 5 10:27:34 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 05 Nov 2014 02:27:34 -0800 Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <54596AC2.6050502@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> Message-ID: <5459FB96.9020404@oracle.com> Hi 
David,

Thank you for the concerns!
Testing showed several tests failing with deadlocks.
Scenarios are similar to those you describe.

Trying to understand the details.

Thanks,
Serguei

On 11/4/14 4:09 PM, David Holmes wrote:
> Hi Serguei,
>
> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote:
>> On 11/2/14 8:58 PM, David Holmes wrote:
>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
>>>> Serguei,
>>>>
>>>> Thank you for good finding. This approach looks much better for me.
>>>>
>>>> The fix looks good.
>>>>
>>>> Is it necessary to release vmDeathLock locks at
>>>> eventHandler.c:1244 before call
>>>>
>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
>>>
>>> I agree this looks necessary, or at least more clean (if things are
>>> failing we really don't know what is happening).
>>
>> Agreed (replied to Dmitry).
>>
>>>
>>> More generally I'm concerned about whether any of the code paths taken
>>> while holding the new lock can result in deadlock - in particular with
>>> regard to the resumeLock ?
>>
>> The cbVMDeath() function never holds both vmDeathLock and resumeLock at
>> the same time,
>> so there is no chance for a deadlock that involves both these locks.
>>
>> Two more locks used in the cbVMDeath() are the callbackBlock and
>> callbackLock.
>> These two locks look completely unrelated to the debugLoop_run().
>>
>> The debugLoop_run() function also uses the cmdQueueLock.
>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at the
>> same time.
>>
>> So that I do not see any potential to introduce new deadlock with the
>> vmDeathLock.
>>
>> However, it is still easy to overlook something here.
>> Please, let me know if you see any danger.
>
> I was mainly concerned about what might happen in the call chain for
> threadControl_resumeAll() (it certainly sounds like it might need to
> use a resumeLock :) ).
I see direct use of the threadLock and > indirectly the eventHandler lock; but there are further call paths I > did not explore. Wish there was an easy way to determine the > transitive closure of all locks used from a given call. > > Thanks, > David > >> Thanks, >> Serguei >> >>> >>> David >>> >>>> -Dmitry >>>> >>>> >>>> >>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>> >>>>> It is 3-rd round of review for: >>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>> >>>>> New webrev: >>>>> >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>> >>>>> >>>>> >>>>> >>>>> Summary >>>>> >>>>> For failing scenario, please, refer to the 1-st round RFR below. >>>>> >>>>> I've found what is missed in the jdwp agent shutdown and >>>>> decided to >>>>> switch from a workaround to a real fix. >>>>> >>>>> The agent VM_DEATH callback sets the gdata field: gdata->vmDead >>>>> = 1. >>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>> >>>>> 165 } else if (gdata->vmDead && >>>>> 166 ((cmd->cmdSet) != >>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>> 167 /* Protect the VM from calls while dead. >>>>> 168 * VirtualMachine cmdSet quietly ignores some >>>>> cmds >>>>> 169 * after VM death, so, it sends it's own >>>>> errors. >>>>> 170 */ >>>>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); >>>>> >>>>> >>>>> However, the guard above does not help much if the VM_DEATH event >>>>> happens in the middle of a command execution. >>>>> There is a lack of synchronization here. >>>>> >>>>> The fix introduces new lock (vmDeathLock) which does not allow to >>>>> execute the commands >>>>> and the VM_DEATH event callback concurrently. >>>>> It should work well for any function that is used in >>>>> implementation of >>>>> the JDWP_COMMAND_SET(VirtualMachine) . 
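A minimal sketch of the synchronization described in the quoted summary above, not the actual jdwp agent code: one lock serializes command execution and the VM_DEATH callback, and the vmDead check is made while holding that lock. The names (vm_death_lock, run_command, on_vm_death, echo_handler) are hypothetical, a pthread mutex is assumed as a stand-in for whatever monitor the agent really uses, and the special handling of the VirtualMachine command set is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for agent state; not taken from the jdwp sources. */
static pthread_mutex_t vm_death_lock = PTHREAD_MUTEX_INITIALIZER;
static bool vm_dead = false;

enum reply { REPLY_OK = 0, REPLY_VM_DEAD = 1 };

/* Command path: the flag check and the handler run under the same lock,
 * so VM death cannot be signalled halfway through a command. */
static enum reply run_command(int (*handler)(void *), void *arg, int *result)
{
    enum reply r;

    pthread_mutex_lock(&vm_death_lock);
    if (vm_dead) {
        r = REPLY_VM_DEAD;          /* handler is not called after VM death */
    } else {
        *result = handler(arg);
        r = REPLY_OK;
    }
    pthread_mutex_unlock(&vm_death_lock);
    return r;
}

/* VM_DEATH callback path: blocks until any in-flight command has finished,
 * then sets the flag that later commands will observe. */
static void on_vm_death(void)
{
    pthread_mutex_lock(&vm_death_lock);
    vm_dead = true;
    pthread_mutex_unlock(&vm_death_lock);
}

/* Trivial example handler: returns the int it is given. */
static int echo_handler(void *arg)
{
    return *(int *)arg;
}
```

Because the sketch holds the lock across the whole handler call, any handler that takes another lock which the death callback also needs can deadlock; that is the kind of hazard the deadlock questions in this thread are about.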
>>>>> >>>>> >>>>> Testing: >>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>> tests >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>> The updated webrev: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> The changes are: >>>>>> - added a comment recommended by Staffan >>>>>> - removed the ignore_wrong_phase() call from function >>>>>> classSignature() >>>>>> >>>>>> The classSignature() function is called in 16 places. >>>>>> Most of them do not tolerate the NULL in place of returned signature >>>>>> and will crash. >>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>> return to this >>>>>> issue after gaining experience with more failure cases that are >>>>>> still >>>>>> expected. >>>>>> The failure with the classSignature() involved was observed only >>>>>> once >>>>>> in the nightly >>>>>> and should be extremely rare reproducible. >>>>>> I'll file a placeholder bug if necessary. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Please, review the fix for: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>> >>>>>>> >>>>>>> Open webrev: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> >>>>>>> The failing scenario: >>>>>>> The debugger and the debuggee are well aware a VM shutdown >>>>>>> has >>>>>>> been started in the target process. >>>>>>> The debugger at this point is not expected to send any >>>>>>> commands >>>>>>> to the JDWP agent. >>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>> (debuggee side) >>>>>>> are not in sync with the consumer layers. 
>>>>>>> >>>>>>> One reason is because the test debugger does not invoke >>>>>>> the JDI >>>>>>> method VirtualMachine.dispose(). >>>>>>> Another reason is that the Debugger and the debuggee >>>>>>> processes >>>>>>> are uneasy to sync in general. >>>>>>> >>>>>>> As a result the following steps are possible: >>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>> debuggee >>>>>>> - The debuggee is normally exiting >>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>> anonymous class unload event >>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>> ClassUnloadEvent event >>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>> reference type. >>>>>>> If there is more than one class with the same host class >>>>>>> signature, it can't distinguish them, >>>>>>> and so, deletes all references and re-retrieves them >>>>>>> again >>>>>>> (see tracing below): >>>>>>> MY_TRACE: JDI: >>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>> - The jdwp backend debugLoop_run() gets the command from >>>>>>> JDI >>>>>>> and calls the functions >>>>>>> classesForSignature() and classStatus() recursively. >>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>> GetClassStatus() >>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>> - As a result the jdwp backend reports the JVMTI error >>>>>>> to the >>>>>>> JDI, and so, the test fails >>>>>>> >>>>>>> For details, see the analysis in bug report closed as a >>>>>>> dup of >>>>>>> the bug 6988950: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>> >>>>>>> Some similar cases can be found in the two bug reports >>>>>>> (6988950 >>>>>>> and 8024865) describing this issue. >>>>>>> >>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE >>>>>>> error >>>>>>> as it is normal at the VM shutdown. 
>>>>>>> The original jdwp backend implementation had a similar >>>>>>> approach >>>>>>> for the raw monitor functions. >>>>>>> Threy use the ignore_vm_death() to workaround the >>>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>>> For reference, please, see the file: src/share/back/util.c >>>>>>> >>>>>>> >>>>>>> Testing: >>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>> tests >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>> >>>>> >>>> >>>> >> From coleen.phillimore at oracle.com Wed Nov 5 12:27:48 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 05 Nov 2014 07:27:48 -0500 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <5459F24F.4090700@oracle.com> References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com> <5459F24F.4090700@oracle.com> Message-ID: <545A17C4.5030304@oracle.com> This is good. Coleen On 11/5/14, 4:47 AM, Albert Noll wrote: > Hi, > > I forgot to add CodeCacheSweeperThread.java. Could you please look it > it again? > http://cr.openjdk.java.net/~anoll/8062735/webrev.01/ > > Thanks, > Albert > > On 11/05/2014 10:12 AM, Albert Noll wrote: >> Coleen, Vladimir, Serguei, thanks for reviewing this. >> >> Best, >> Albert >> >> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Albert, >>> >>> The fix looks good. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/4/14 5:01 AM, Albert Noll wrote: >>>> Hi, >>>> >>>> could I get reviews for this small patch? >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8062735 >>>> >>>> Problem: >>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did >>>> not add this new type to SA. >>>> >>>> Solution: >>>> Add type to SA. >>>> >>>> Testing: >>>> Failing test cases. 
>>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ >>>> >>>> Many thanks, >>>> Albert >>>> >>> >> > From stefan.karlsson at oracle.com Wed Nov 5 12:22:26 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 13:22:26 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning Message-ID: <545A1682.5030406@oracle.com> Hi all, I propose that we turn on the -Wreturn-type warning when compiling HotSpot with GCC. This will help us catch missing return statements earlier in the development cycle. http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8062808 thanks, StefanK From andreas.eriksson at oracle.com Wed Nov 5 12:49:17 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Wed, 05 Nov 2014 13:49:17 +0100 Subject: [8u-hs-dev] Backport RFR: 8057043: Type annotations not retained during class redefine / retransform Message-ID: <545A1CCD.5080008@oracle.com> Hi, This backport of JDK-8057043 imported cleanly from the jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/06de05da6f2b Thanks, Andreas From mikael.gerdin at oracle.com Wed Nov 5 13:59:56 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 05 Nov 2014 14:59:56 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <54577C96.5030503@oracle.com> References: <54577C96.5030503@oracle.com> Message-ID: <545A2D5C.5090808@oracle.com> Hi all, I have an updated webrev with the following changes: * compiler_barrier() calls added after the loads to all load_acquire-variants. * the dummy store in release() is removed. Full webrev at: http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ Incremental webrev at: http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ Thanks for all the feedback so far! 
/Mikael On 2014-11-03 14:01, Mikael Gerdin wrote: > Hi all, > > Please review this attempt at fixing the OrderAccess functions on Linux > x86 with GCC. > > While working on another bug I recently discovered that g++ was > reordering stores across a call to OrderAccess::storestore on Linux x86. > > The G1 code attempts to do an ordered publishing of two values: > _saved_mark_word = _top; > OrderAccess::storestore(); > _gc_time_stamp = curr_gc_time_stamp; > > The types involved are > HeapWord* _top, _saved_mark_word; > volatile unsigned _gc_time_stamp; > > The incorrect behavior seems to have started when JDK-6973570 was fixed > in JDK 7. > Below, _top is at offset 0x58, _saved_mark_word at 0x18 and > _gc_time_stamp at 0x138, %rbx is "this". > > /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: > > 3d9f4d: 39 d0 cmp %edx,%eax > 3d9f4f: 73 1c jae 3d9f6d > > 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax > 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) > 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # > ae98a0 <_DYNAMIC+0x12f8> > 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) > 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > > /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so > > 3da05d: 39 d0 cmp %edx,%eax > 3da05f: 73 15 jae 3da076 > > 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax > 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) > 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) > 3da072: 48 89 43 18 mov %rax,0x18(%rbx) > > In b109 the store of %rax to 0x18(%rbx) has been ordered after the store > of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. > > My suggestion to fix this is to extend all the OrderAccess::release* > variants on x86 with a: > __asm__ volatile ("" : : : "memory"); > to attempt to prevent GCC from reordering any memory accesses across > those function calls. 
> > I've verified that this solves the issue in the assembly with our > current JDK 9 build platform compilers. > I've also verified that this particular piece of code is compiled > correctly on our other x86 platforms: Solaris, Windows and OS X. > > Webrev: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061964 > Testing: > JPRT, inspecting generated assembly for the function > G1OffsetTableContigSpace::record_top_and_timestamp (as the method is > currently named). > Suggestions of further testing is greatly appreciated. > > Thanks > Mikael From bertrand.delsart at oracle.com Wed Nov 5 14:34:57 2014 From: bertrand.delsart at oracle.com (Bertrand Delsart) Date: Wed, 05 Nov 2014 15:34:57 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545A2D5C.5090808@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> Message-ID: <545A3591.4050209@oracle.com> Looks good. Bertrand (not a Reviewer). On 05/11/2014 14:59, Mikael Gerdin wrote: > Hi all, > > I have an updated webrev with the following changes: > > * compiler_barrier() calls added after the loads to all > load_acquire-variants. > * the dummy store in release() is removed. > > Full webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ > > Incremental webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ > > Thanks for all the feedback so far! > > /Mikael > > On 2014-11-03 14:01, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on Linux >> x86 with GCC. >> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. 
>> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was fixed >> in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. >> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. >> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. 
>> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. >> >> Thanks >> Mikael -- Bertrand Delsart, Grenoble Engineering Center Oracle, 180 av. de l'Europe, ZIRST de Montbonnot 38334 Saint Ismier, FRANCE bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From mikael.gerdin at oracle.com Wed Nov 5 14:55:04 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 05 Nov 2014 15:55:04 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A1682.5030406@oracle.com> References: <545A1682.5030406@oracle.com> Message-ID: <545A3A48.4070809@oracle.com> Hi Stefan, On 2014-11-05 13:22, Stefan Karlsson wrote: > Hi all, > > I propose that we turn on the -Wreturn-type warning when compiling > HotSpot with GCC. > > This will help us catch missing return statements earlier in the > development cycle. This is an excellent step in the right direction, more than once I've been surprised by strange behavior only to find out that I forgot to add a return statement to my functions. 
> > http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ Looks good to me, the CHECK to THREAD change is good even in itself I think, since it's pointless to do return foo(CHECK); since CHECK will expand to code after an unconditional return statement. /Mikael > https://bugs.openjdk.java.net/browse/JDK-8062808 > > thanks, > StefanK From stefan.karlsson at oracle.com Wed Nov 5 15:00:24 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 16:00:24 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A3A48.4070809@oracle.com> References: <545A1682.5030406@oracle.com> <545A3A48.4070809@oracle.com> Message-ID: <545A3B88.6050905@oracle.com> On 2014-11-05 15:55, Mikael Gerdin wrote: > Hi Stefan, > > On 2014-11-05 13:22, Stefan Karlsson wrote: >> Hi all, >> >> I propose that we turn on the -Wreturn-type warning when compiling >> HotSpot with GCC. >> >> This will help us catch missing return statements earlier in the >> development cycle. > > This is an excellent step in the right direction, more than once I've > been surprised by strange behavior only to find out that I forgot to > add a return statement to my functions. > >> >> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ > > Looks good to me, the CHECK to THREAD change is good even in itself I > think, since it's pointless to do > return foo(CHECK); > since CHECK will expand to code after an unconditional return statement. Thanks, Mikael. StefanK > > /Mikael > >> https://bugs.openjdk.java.net/browse/JDK-8062808 >> >> thanks, >> StefanK From george.triantafillou at oracle.com Wed Nov 5 14:59:35 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 05 Nov 2014 09:59:35 -0500 Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms Message-ID: <545A3B57.6000609@oracle.com> Please review this fix for JDK-8061969. 
The test was modified to verify the fix for JDK-8058251, but since another intermittent issue was uncovered during testing (JDK-8062870), this test will retain the "@ignore" jtreg tag until the new issue is fixed. Additionally, the "@requires" tag was used to ensure that the test only runs on 32-bit platforms. Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/ The fix was tested locally on Linux with jtreg and the JPRT hotspot testset. -George From christian.tornqvist at oracle.com Wed Nov 5 15:06:45 2014 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Wed, 5 Nov 2014 10:06:45 -0500 Subject: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <545A3B57.6000609@oracle.com> References: <545A3B57.6000609@oracle.com> Message-ID: <015401cff90a$1da62290$58f267b0$@oracle.com> Hi George, Some comments: src/share/vm/prims/whitebox.cpp L362, you should assert on hash_size rather than calling hash_buckets() again. test/runtime/NMT/MallocSiteHashOverflow.java L41 and L52 are both getting the WhiteBox instance, I'd remove L41 and move MAX_HASH_SIZE into main instead. Otherwise this looks good, thanks for doing this. Thanks, Christian -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of George Triantafillou Sent: Wednesday, November 5, 2014 10:00 AM To: hotspot-dev at openjdk.java.net Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms Please review this fix for JDK-8061969. The test was modified to verify the fix for JDK-8058251, but since another intermittent issue was uncovered during testing (JDK-8062870), this test will retain the "@ignore" jtreg tag until the new issue is fixed. Additionally, the "@requires" tag was used to ensure that the test only runs on 32-bit platforms. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/ The fix was tested locally on Linux with jtreg and the JPRT hotspot testset. -George From daniel.daugherty at oracle.com Wed Nov 5 15:21:09 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 05 Nov 2014 08:21:09 -0700 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545A2D5C.5090808@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> Message-ID: <545A4065.6090605@oracle.com> On 11/5/14 6:59 AM, Mikael Gerdin wrote: > Hi all, > > I have an updated webrev with the following changes: > > * compiler_barrier() calls added after the loads to all > load_acquire-variants. > * the dummy store in release() is removed. > > Full webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp No comments. Thumbs up. Dan > > Incremental webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ > > Thanks for all the feedback so far! > > /Mikael > > On 2014-11-03 14:01, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on Linux >> x86 with GCC. >> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. >> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was fixed >> in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". 
>> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. >> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. >> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. >> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. 
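[Editorial sketch] The compiler-only barrier proposed in the quoted message can be illustrated roughly as follows. This is a minimal sketch, not the code from the webrev: `SpaceSketch`, `void*` standing in for `HeapWord*`, and `example_usage` are assumptions made for illustration, and the inline-asm syntax is GCC/Clang specific. It only restrains the compiler; it is sufficient for a store-store ordering here on the assumption that x86 hardware does not reorder ordinary stores with other stores.

```cpp
#include <cassert>

// Compiler-only barrier (GCC/Clang syntax). It emits no instruction, but the
// "memory" clobber tells the compiler that memory may be read or written
// here, so it must not move loads or stores across this point.
static inline void compiler_barrier() {
  __asm__ volatile ("" : : : "memory");
}

// Hypothetical stand-in for the G1 space discussed in the thread.
struct SpaceSketch {
  void*             _top;              // stand-in for HeapWord*
  void*             _saved_mark_word;  // stand-in for HeapWord*
  volatile unsigned _gc_time_stamp;

  // Publish _saved_mark_word first, then the time stamp.
  void record_top_and_timestamp(unsigned curr_gc_time_stamp) {
    _saved_mark_word = _top;
    compiler_barrier();  // compiler must emit the store above before the one below
    _gc_time_stamp = curr_gc_time_stamp;
  }
};

// Small usage check: both fields hold the published values afterwards.
void example_usage() {
  int heap_word = 0;
  SpaceSketch s = { &heap_word, nullptr, 0u };
  s.record_top_and_timestamp(7u);
  assert(s._saved_mark_word == &heap_word);
  assert(s._gc_time_stamp == 7u);
}
```

The check above only confirms the final values; the ordering itself has to be verified by inspecting the generated assembly, as the original message does.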
>> >> Thanks >> Mikael From george.triantafillou at oracle.com Wed Nov 5 15:25:51 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 05 Nov 2014 10:25:51 -0500 Subject: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <015401cff90a$1da62290$58f267b0$@oracle.com> References: <545A3B57.6000609@oracle.com> <015401cff90a$1da62290$58f267b0$@oracle.com> Message-ID: <545A417F.6080401@oracle.com> Thanks Christian. Updated webrev is here: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/ -George On 11/5/2014 10:06 AM, Christian Tornqvist wrote: > Hi George, > > Some comments: > > src/share/vm/prims/whitebox.cpp > > L362, you should assert on hash_size rather than calling hash_buckets() again. > > test/runtime/NMT/MallocSiteHashOverflow.java > > L41 and L52 are both getting the WhiteBox instance, I'd remove L41 and move MAX_HASH_SIZE into main instead. > > > Otherwise this looks good, thanks for doing this. > > Thanks, > Christian > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of George Triantafillou > Sent: Wednesday, November 5, 2014 10:00 AM > To: hotspot-dev at openjdk.java.net > Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms > > Please review this fix for JDK-8061969. The test was modified to verify the fix for JDK-8058251, but since another intermittent issue was uncovered during testing (JDK-8062870), this test will retain the "@ignore" jtreg tag until the new issue is fixed. Additionally, the "@requires" tag was used to ensure that the test only runs on 32-bit platforms. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 > Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/ > > > The fix was tested locally on Linux with jtreg and the JPRT hotspot testset. 
> > -George > From thomas.schatzl at oracle.com Wed Nov 5 16:05:59 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 05 Nov 2014 17:05:59 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A1682.5030406@oracle.com> References: <545A1682.5030406@oracle.com> Message-ID: <1415203559.2912.16.camel@oracle.com> Hi, On Wed, 2014-11-05 at 13:22 +0100, Stefan Karlsson wrote: > Hi all, > > I propose that we turn on the -Wreturn-type warning when compiling > HotSpot with GCC. > > This will help us catch missing return statements earlier in the > development cycle. > > http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8062808 looks good to me. Thanks, Thomas From vladimir.kozlov at oracle.com Wed Nov 5 16:28:22 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 05 Nov 2014 08:28:22 -0800 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <5459F24F.4090700@oracle.com> References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com> <5459F24F.4090700@oracle.com> Message-ID: <545A5026.2070607@oracle.com> Okay. Thanks, Vladimir On 11/5/14 1:47 AM, Albert Noll wrote: > Hi, > > I forgot to add CodeCacheSweeperThread.java. Could you please look it it again? > http://cr.openjdk.java.net/~anoll/8062735/webrev.01/ > > Thanks, > Albert > > On 11/05/2014 10:12 AM, Albert Noll wrote: >> Coleen, Vladimir, Serguei, thanks for reviewing this. >> >> Best, >> Albert >> >> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Albert, >>> >>> The fix looks good. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/4/14 5:01 AM, Albert Noll wrote: >>>> Hi, >>>> >>>> could I get reviews for this small patch? >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8062735 >>>> >>>> Problem: >>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did not add this new type to SA. 
>>>> >>>> Solution: >>>> Add type to SA. >>>> >>>> Testing: >>>> Failing test cases. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ >>>> >>>> Many thanks, >>>> Albert >>>> >>> >> > From andreas.eriksson at oracle.com Wed Nov 5 16:58:43 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Wed, 05 Nov 2014 17:58:43 +0100 Subject: "/src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot use long to initialize instanceKlassHandle. Message-ID: <545A5743.1010307@oracle.com> Hi all, I'm backporting JDK-8020675 to jdk7, but debug builds on solaris fails with the following error: ".../src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot use long to initialize instanceKlassHandle. This is in method ClassLoader::load_classfile which returns an instanceKlassHandle. The changed code adds a CHECK_NULL: - stream = e->open_stream(name); + stream = e->open_stream(name, CHECK_NULL); The CHECK_NULL macro expands to stream = e->open_stream(name, THREAD); if (HAS_PENDING_EXCEPTION) return NULL; (0); NULL is of type long on solaris_x64 (or int on solaris_i586, which has the same problem), and the solaris debug build fails with the above error because we try to return NULL i.e. 0L as an instanceKlassHandle. This is only a problem on jdk7 solaris debug builds, does anyone know why it fails in this particular configuration? It is not a problem on jdk8, even though the same code exists there. Returning instanceKlassHandle() works as expected, would this be the correct way to go since that is a NULL handle? Any help is appreciated. Regards, Andreas From coleen.phillimore at oracle.com Wed Nov 5 17:02:41 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 05 Nov 2014 12:02:41 -0500 Subject: "/src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot use long to initialize instanceKlassHandle. 
In-Reply-To: <545A5743.1010307@oracle.com> References: <545A5743.1010307@oracle.com> Message-ID: <545A5831.8050506@oracle.com> Yes, return CHECK_(instanceKlassHandle()) instead of CHECK_NULL for jdk7. Handles were rewritten since InstanceKlass is no longer an oop. Coleen On 11/5/14, 11:58 AM, Andreas Eriksson wrote: > Hi all, > > I'm backporting JDK-8020675 > to jdk7, but debug > builds on solaris fails with the following error: > > ".../src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot > use long to initialize instanceKlassHandle. > > This is in method ClassLoader::load_classfile which returns an > instanceKlassHandle. > The changed code adds a CHECK_NULL: > > - stream = e->open_stream(name); > + stream = e->open_stream(name, CHECK_NULL); > > The CHECK_NULL macro expands to > > stream = e->open_stream(name, THREAD); if (HAS_PENDING_EXCEPTION) > return NULL; (0); > > NULL is of type long on solaris_x64 (or int on solaris_i586, which has > the same problem), > and the solaris debug build fails with the above error because we try > to return NULL i.e. 0L as an instanceKlassHandle. > > This is only a problem on jdk7 solaris debug builds, does anyone know > why it fails in this particular configuration? > It is not a problem on jdk8, even though the same code exists there. > > Returning instanceKlassHandle() works as expected, would this be the > correct way to go since that is a NULL handle? > > Any help is appreciated. > > Regards, > Andreas From coleen.phillimore at oracle.com Wed Nov 5 17:50:02 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 05 Nov 2014 12:50:02 -0500 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <1415203559.2912.16.camel@oracle.com> References: <545A1682.5030406@oracle.com> <1415203559.2912.16.camel@oracle.com> Message-ID: <545A634A.3090201@oracle.com> This looks good to me also. I'm surprised there weren't more CHECK in return statements and other changes needed. 
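[Editorial sketch] The CHECK-style macros discussed in these two threads can be pictured with the simplified model below. Everything here is a hypothetical stand-in (`Thread`, `KlassHandleSketch`, `open_stream`, `load_classfile`, and the macro definitions are not HotSpot's real ones); it only mirrors the expansion quoted in the thread. It shows why the result passed to `CHECK_(...)` must be convertible to the function's return type, and why a default-constructed handle works where a plain `NULL` may not.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for HotSpot's Thread and exception macros.
struct Thread { bool has_pending_exception; };
#define THREAD thread
#define HAS_PENDING_EXCEPTION (THREAD->has_pending_exception)
// Closes the call, tests for a pending exception, and leaves "(void)(0" open
// so that the caller's own ");" completes the statement.
#define CHECK_(result) THREAD); if (HAS_PENDING_EXCEPTION) return result; (void)(0

// A handle type with no implicit conversion from an integer or pointer
// constant, so "return NULL;" would not compile in a function returning it.
class KlassHandleSketch {
  const void* _klass;
 public:
  KlassHandleSketch() : _klass(nullptr) {}
  explicit KlassHandleSketch(const void* k) : _klass(k) {}
  bool is_null() const { return _klass == nullptr; }
};

static const int some_klass = 0;

// Sets the pending-exception flag when no name is given.
const void* open_stream(const char* name, Thread* thread) {
  if (name == nullptr) {
    thread->has_pending_exception = true;
    return nullptr;
  }
  return &some_klass;
}

KlassHandleSketch load_classfile(const char* name, Thread* thread) {
  // The next line expands to:
  //   const void* stream = open_stream(name, thread);
  //   if (thread->has_pending_exception) return KlassHandleSketch(); (void)(0);
  const void* stream = open_stream(name, CHECK_(KlassHandleSketch()));
  return KlassHandleSketch(stream);
}

// Small usage check covering the normal and the exceptional path.
void example_usage() {
  Thread ok = { false };
  assert(!load_classfile("Foo", &ok).is_null());
  Thread bad = { false };
  assert(load_classfile(nullptr, &bad).is_null());
  assert(bad.has_pending_exception);
}
```

The same expansion explains the remark about `return foo(CHECK);` in the -Wreturn-type thread: the exception test would land after an unconditional `return` and could never run.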
Coleen On 11/5/14, 11:05 AM, Thomas Schatzl wrote: > Hi, > > On Wed, 2014-11-05 at 13:22 +0100, Stefan Karlsson wrote: >> Hi all, >> >> I propose that we turn on the -Wreturn-type warning when compiling >> HotSpot with GCC. >> >> This will help us catch missing return statements earlier in the >> development cycle. >> >> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8062808 > looks good to me. > > Thanks, > Thomas > From coleen.phillimore at oracle.com Wed Nov 5 18:00:32 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 05 Nov 2014 13:00:32 -0500 Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <545A3B57.6000609@oracle.com> References: <545A3B57.6000609@oracle.com> Message-ID: <545A65C0.401@oracle.com> George, Why aren't lines 74-77 outside the for loop in this test? Coleen On 11/5/14, 9:59 AM, George Triantafillou wrote: > Please review this fix for JDK-8061969. The test was modified to > verify the fix for JDK-8058251, but since another intermittent issue > was uncovered during testing (JDK-8062870), this test will retain the > "@ignore" jtreg tag until the new issue is fixed. Additionally, the > "@requires" tag was used to ensure that the test only runs on 32-bit > platforms. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 > Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/ > > > The fix was tested locally on Linux with jtreg and the JPRT hotspot > testset. 
>
> -George

From kim.barrett at oracle.com Wed Nov 5 18:34:45 2014
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 5 Nov 2014 13:34:45 -0500
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A1682.5030406@oracle.com>
References: <545A1682.5030406@oracle.com>
Message-ID: <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com>

On Nov 5, 2014, at 7:22 AM, Stefan Karlsson wrote:
>
> Hi all,
>
> I propose that we turn on the -Wreturn-type warning when compiling HotSpot with GCC.
>
> This will help us catch missing return statements earlier in the development cycle.
>
> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8062808

I've only skimmed this and not really reviewed, but I really dislike insertion of purportedly unreachable "return" statements to silence compiler warnings. I have a similar dislike for "if (check for bad case) { non-returning error processing } /* no else */ ..." to disable such warnings by avoiding an apparent terminating control flow w/o return at the end of the error processing. There's got to be a better way?

I know that gcc/clang/MS all have annotation mechanisms to mark code unreachable or functions as no-return. I'd rather see something like that added and used before we turn on -Wreturn-type, rather than littering / contorting code to avoid that warning. And compilers may generate better code sometimes when such are used.

And I say that having just last week wasted half a day debugging what turned out to be a small refactoring that left a code path without a return.
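[Editorial sketch] The no-return annotation mentioned above can be illustrated with a minimal example. It assumes a C++11 compiler and uses the standard `[[noreturn]]` attribute; `fatal_unreachable`, `select_value`, and `example_usage` are hypothetical names, not HotSpot code. With the annotation, GCC's -Wreturn-type has no reason to flag the `default` path, so no artificial `return` is needed after the error call.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an error routine such as ShouldNotReachHere().
// [[noreturn]] is the standard C++11 spelling; GCC and Clang also accept
// __attribute__((noreturn)), and MSVC accepts __declspec(noreturn).
[[noreturn]] void fatal_unreachable(const char* where) {
  std::fprintf(stderr, "should not reach here: %s\n", where);
  std::abort();
}

// Every path either returns a value or ends in a no-return call, so the
// function does not need a dummy return statement at the end.
int select_value(int v, int x, int y) {
  switch (v) {
    case 1:  return x;
    case 2:  return y;
    default: fatal_unreachable("select_value");
  }
}

// Small usage check for the two valid selector values.
void example_usage() {
  assert(select_value(1, 10, 20) == 10);
  assert(select_value(2, 10, 20) == 20);
}
```

As the reply below points out, this only helps on compilers that provide such an annotation.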
From stefan.karlsson at oracle.com Wed Nov 5 19:56:40 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 05 Nov 2014 20:56:40 +0100
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com>
Message-ID: <545A80F8.9010200@oracle.com>

On 2014-11-05 19:34, Kim Barrett wrote:
> On Nov 5, 2014, at 7:22 AM, Stefan Karlsson wrote:
>> Hi all,
>>
>> I propose that we turn on the -Wreturn-type warning when compiling HotSpot with GCC.
>>
>> This will help us catch missing return statements earlier in the development cycle.
>>
>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8062808
> I've only skimmed this and not really reviewed, but I really dislike insertion of purportedly unreachable "return" statements to silence compiler warnings. I have a similar dislike for "if (check for bad case) { non-returning error processing } /* no else */ ..." to disable such warnings by avoiding an apparent terminating control flow w/o return at the end of the error processing. There's got to be a better way?

I understand, but that's what you'll find if you look at the shared code. I only added a few more places, where our other compilers didn't complain about this or the code wasn't compiled with those compilers.

With that said, I'm all for cleaning this up, but it's a pretty large undertaking that I don't think should prevent the usage of -Wreturn-type.

>
> I know that gcc/clang/MS all have annotation mechanisms to mark code unreachable or functions as no-return.

My first implementation of this used noreturn annotations on gcc/clang/MS, but since we need to support other compilers that don't have this annotation, we can't really take advantage of the annotation to fix this problem throughout our code base.
We would still have all these constructs that you've pointed out, except for the two I added. > I?d rather see something like that added and used before we turn on -Wreturn-type, rather than littering / contorting code to avoid that warning. I don't agree. The code is already littered with these kind of constructs, so that I add a couple of similar returns shouldn't be a show-stopper for this warning, IMHO. > And compilers may generate better code sometimes when such are used. A benefit of using the noreturn annotation would be to be able to turn on -Wuninitialized and get away with constructs like this: int a; switch (v) { case 1: a = x; break; case 2: a = y; break; default: ShouldNotReachHere(); } use(a); I have a patch were I started testing this, but there are a number of places were the compiler doesn't manage to infer that all paths taken will initialize the variable and we have to change the code to make it easier for the compiler to understand it. This will probably help with the readability of the code, so that is probably not something negative. > > And I say that having just last week wasted half a day debugging what turned out to be a small refactoring that left a code path without a return. I've asked around and I've heard of a hand-full of HotSpot developers that has been bitten by this. :) thanks for looking at this, StefanK > > From george.triantafillou at oracle.com Wed Nov 5 19:57:46 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 05 Nov 2014 14:57:46 -0500 Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <545A65C0.401@oracle.com> References: <545A3B57.6000609@oracle.com> <545A65C0.401@oracle.com> Message-ID: <545A813A.4060105@oracle.com> Hi Coleen, I tried that, but couldn't reliably reproduce the "assert(_count > 0) failed: Negative counter" described in JDK-8062870. 
-George On 11/5/2014 1:00 PM, Coleen Phillimore wrote: > > George, > > Why aren't lines 74-77 outside the for loop in this test? > > Coleen > > On 11/5/14, 9:59 AM, George Triantafillou wrote: >> Please review this fix for JDK-8061969. The test was modified to >> verify the fix for JDK-8058251, but since another intermittent issue >> was uncovered during testing (JDK-8062870), this test will retain the >> "@ignore" jtreg tag until the new issue is fixed. Additionally, the >> "@requires" tag was used to ensure that the test only runs on 32-bit >> platforms. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 >> Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/ >> >> >> The fix was tested locally on Linux with jtreg and the JPRT hotspot >> testset. >> >> -George > From stefan.karlsson at oracle.com Wed Nov 5 20:12:43 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 21:12:43 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <1415203559.2912.16.camel@oracle.com> References: <545A1682.5030406@oracle.com> <1415203559.2912.16.camel@oracle.com> Message-ID: <545A84BB.5060904@oracle.com> On 2014-11-05 17:05, Thomas Schatzl wrote: > Hi, > > On Wed, 2014-11-05 at 13:22 +0100, Stefan Karlsson wrote: >> Hi all, >> >> I propose that we turn on the -Wreturn-type warning when compiling >> HotSpot with GCC. >> >> This will help us catch missing return statements earlier in the >> development cycle. >> >> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8062808 > looks good to me. Thanks. 
StefanK > > Thanks, > Thomas > From stefan.karlsson at oracle.com Wed Nov 5 20:13:08 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 21:13:08 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A634A.3090201@oracle.com> References: <545A1682.5030406@oracle.com> <1415203559.2912.16.camel@oracle.com> <545A634A.3090201@oracle.com> Message-ID: <545A84D4.10602@oracle.com> On 2014-11-05 18:50, Coleen Phillimore wrote: > > This looks good to me also. I'm surprised there weren't more CHECK in > return statements and other changes needed. Thanks. StefanK > > Coleen > > On 11/5/14, 11:05 AM, Thomas Schatzl wrote: >> Hi, >> >> On Wed, 2014-11-05 at 13:22 +0100, Stefan Karlsson wrote: >>> Hi all, >>> >>> I propose that we turn on the -Wreturn-type warning when compiling >>> HotSpot with GCC. >>> >>> This will help us catch missing return statements earlier in the >>> development cycle. >>> >>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8062808 >> looks good to me. >> >> Thanks, >> Thomas >> > From john.r.rose at oracle.com Wed Nov 5 20:20:27 2014 From: john.r.rose at oracle.com (John Rose) Date: Wed, 5 Nov 2014 12:20:27 -0800 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A80F8.9010200@oracle.com> References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> Message-ID: This is a good gain in robustness. Like Coleen, I'm agreeably surprised at the relatively small number of fixes needed. On Nov 5, 2014, at 11:56 AM, Stefan Karlsson wrote: > ... 
> A benefit of using the noreturn annotation would be to be able to turn on -Wuninitialized and get away with constructs like this:
>   int a;
>   switch (v) {
>     case 1: a = x; break;
>     case 2: a = y; break;
>     default: ShouldNotReachHere();
>   }
>   use(a);
>
> I have a patch where I started testing this, but there are a number of places where the compiler doesn't manage to infer that all paths taken will initialize the variable and we have to change the code to make it easier for the compiler to understand it. This will probably help with the readability of the code, so that is probably not something negative.

I think this is a reasonable goal, worth a follow-up bug.

One concern: The noreturn annotation might differ in effect from compiler to compiler, making it difficult to have one (simple) source base that pleases all the compilers.

Here's a suggestion: Define a macro along these lines:

  #define AFTER_NORETURN(code) {ShouldNotReachHere();code;}

Use it for your two "Mute compiler" lines. For compilers which have noreturn, define it this way:

  #define AFTER_NORETURN(code) /*nothing*/

Thanks for pursuing these cleanups.

-- John

From coleen.phillimore at oracle.com Wed Nov 5 20:26:09 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 15:26:09 -0500
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A80F8.9010200@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com>
Message-ID: <545A87E1.2020404@oracle.com>

On 11/5/14, 2:56 PM, Stefan Karlsson wrote:
> On 2014-11-05 19:34, Kim Barrett wrote:
>> On Nov 5, 2014, at 7:22 AM, Stefan Karlsson
>> wrote:
>>> Hi all,
>>>
>>> I propose that we turn on the -Wreturn-type warning when compiling
>>> HotSpot with GCC.
>>>
>>> This will help us catch missing return statements earlier in the
>>> development cycle.
>>>
>>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8062808
>> I've only skimmed this and not really reviewed, but I really dislike
>> insertion of purportedly unreachable "return" statements to silence
>> compiler warnings. I have a similar dislike for "if (check for bad
>> case) { non-returning error processing } /* no else */ ..." to disable
>> such warnings by avoiding an apparent terminating control flow w/o
>> return at the end of the error processing. There's got to be a
>> better way?
>
> I understand, but that's what you'll find if you look at the shared
> code. I only added a few more places, where our other compilers didn't
> complain about this or the code wasn't compiled with those compilers.
> With that said, I'm all for cleaning this up, but it's a pretty large
> undertaking that I don't think should prevent the usage of -Wreturn-type.
>
>>
>> I know that gcc/clang/MS all have annotation mechanisms to mark code
>> unreachable or functions as no-return.
>
> My first implementation of this used noreturn annotations on
> gcc/clang/MS, but since we need to support other compilers that don't
> have this annotation we can't really take advantage of the annotation
> to fix this problem throughout our code base. We would still have all
> these constructs that you've pointed out, except for the two I added.

I don't think a "return 0;" line after ShouldNotReachHere(); (which you understand only by knowing the contents of this call, i.e. that it doesn't return) is worse than some #pragma doesnt-return macro. The latter is more noise and distraction to me.

I agree with Stefan. I don't think this should stop this change. This change is an improvement.

Coleen

>
>> I'd rather see something like that added and used before we turn on
>> -Wreturn-type, rather than littering / contorting code to avoid that
>> warning.
>
> I don't agree. The code is already littered with these kinds of
> constructs, so my adding a couple of similar returns shouldn't be a
> show-stopper for this warning, IMHO.
>
>> And compilers may generate better code sometimes when such are used.
>
> A benefit of using the noreturn annotation would be to be able to turn
> on -Wuninitialized and get away with constructs like this:
>   int a;
>   switch (v) {
>     case 1: a = x; break;
>     case 2: a = y; break;
>     default: ShouldNotReachHere();
>   }
>   use(a);
>
> I have a patch where I started testing this, but there are a number of
> places where the compiler doesn't manage to infer that all paths taken
> will initialize the variable and we have to change the code to make it
> easier for the compiler to understand it. This will probably help with
> the readability of the code, so that is probably not something negative.
>
>>
>> And I say that having just last week wasted half a day debugging what
>> turned out to be a small refactoring that left a code path without a
>> return.
>
> I've asked around and I've heard of a handful of HotSpot developers
> who have been bitten by this. :)
>
> thanks for looking at this,
> StefanK
>
>>
>>
>

From coleen.phillimore at oracle.com Wed Nov 5 20:27:38 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 15:27:38 -0500
Subject: RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms
In-Reply-To: <545A813A.4060105@oracle.com>
References: <545A3B57.6000609@oracle.com> <545A65C0.401@oracle.com> <545A813A.4060105@oracle.com>
Message-ID: <545A883A.2050502@oracle.com>

On 11/5/14, 2:57 PM, George Triantafillou wrote:
> Hi Coleen,
>
> I tried that, but couldn't reliably reproduce the "assert(_count > 0)
> failed: Negative counter" described in JDK-8062870.
>

Really? Okay, leave it then. Change is fine as-is then.
Thanks,
Coleen

>
> -George
>
>
> On 11/5/2014 1:00 PM, Coleen Phillimore wrote:
>>
>> George,
>>
>> Why aren't lines 74-77 outside the for loop in this test?
>>
>> Coleen
>>
>> On 11/5/14, 9:59 AM, George Triantafillou wrote:
>>> Please review this fix for JDK-8061969. The test was modified to
>>> verify the fix for JDK-8058251, but since another intermittent issue
>>> was uncovered during testing (JDK-8062870), this test will retain
>>> the "@ignore" jtreg tag until the new issue is fixed. Additionally,
>>> the "@requires" tag was used to ensure that the test only runs on
>>> 32-bit platforms.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8061969
>>> Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev/
>>>
>>>
>>> The fix was tested locally on Linux with jtreg and the JPRT hotspot
>>> testset.
>>>
>>> -George
>>
>

From stefan.karlsson at oracle.com Wed Nov 5 20:45:17 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 05 Nov 2014 21:45:17 +0100
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A87E1.2020404@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <545A87E1.2020404@oracle.com>
Message-ID: <545A8C5D.2090107@oracle.com>

On 2014-11-05 21:26, Coleen Phillimore wrote:
>
>
> On 11/5/14, 2:56 PM, Stefan Karlsson wrote:
>> On 2014-11-05 19:34, Kim Barrett wrote:
>>> On Nov 5, 2014, at 7:22 AM, Stefan Karlsson
>>> wrote:
>>>> Hi all,
>>>>
>>>> I propose that we turn on the -Wreturn-type warning when compiling
>>>> HotSpot with GCC.
>>>>
>>>> This will help us catch missing return statements earlier in the
>>>> development cycle.
>>>>
>>>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/
>>>> https://bugs.openjdk.java.net/browse/JDK-8062808
>>> I've only skimmed this and not really reviewed, but I really dislike
>>> insertion of purportedly unreachable "return" statements to silence
>>> compiler warnings. I have a similar dislike for "if (check for bad
>>> case) { non-returning error processing } /* no else */ ..." to disable
>>> such warnings by avoiding an apparent terminating control flow w/o
>>> return at the end of the error processing. There's got to be a
>>> better way?
>>
>> I understand, but that's what you'll find if you look at the shared
>> code. I only added a few more places, where our other compilers
>> didn't complain about this or the code wasn't compiled with those
>> compilers. With that said, I'm all for cleaning this up, but it's a
>> pretty large undertaking that I don't think should prevent the usage
>> of -Wreturn-type.
>>
>>>
>>> I know that gcc/clang/MS all have annotation mechanisms to mark code
>>> unreachable or functions as no-return.
>>
>> My first implementation of this used noreturn annotations on
>> gcc/clang/MS, but since we need to support other compilers that don't
>> have this annotation we can't really take advantage of the annotation
>> to fix this problem throughout our code base. We would still have all
>> these constructs that you've pointed out, except for the two I added.
>
> I don't think a "return 0;" line after ShouldNotReachHere(); (which
> you understand only by knowing the contents of this call, i.e. that it
> doesn't return) is worse than some #pragma doesnt-return macro. The
> latter is more noise and distraction to me.

Actually, the noreturn experiment/patch hides it within ShouldNotReachHere(); so you wouldn't get any line noise.

This is done by doing something like this:

  // globalDefinitions.hpp
  WITH_NORETURN_ATTRIBUTE(void noreturn_function());

  // globalDefinitions.cpp
  void noreturn_function() {
    // Make sure that this function never returns.
    while (true) {
      os::naked_short_sleep(10);
    }
  }

  // globalDefinitions_gcc.hpp
  #define WITH_NORETURN_ATTRIBUTE(function) function __attribute__((noreturn))

  // globalDefinitions_
  #define ShouldNotReachHere()                          \
    do {                                                \
      report_should_not_reach_here(__FILE__, __LINE__); \
      BREAKPOINT; noreturn_function();                  \
    } while (0)

thanks,
StefanK

> I agree with Stefan. I don't think this should stop this change. This
> change is an improvement.
>
> Coleen
>
>>
>>> I'd rather see something like that added and used before we turn
>>> on -Wreturn-type, rather than littering / contorting code to avoid
>>> that warning.
>>
>> I don't agree. The code is already littered with these kinds of
>> constructs, so my adding a couple of similar returns shouldn't be a
>> show-stopper for this warning, IMHO.
>>
>>> And compilers may generate better code sometimes when such are used.
>>
>> A benefit of using the noreturn annotation would be to be able to
>> turn on -Wuninitialized and get away with constructs like this:
>>   int a;
>>   switch (v) {
>>     case 1: a = x; break;
>>     case 2: a = y; break;
>>     default: ShouldNotReachHere();
>>   }
>>   use(a);
>>
>> I have a patch where I started testing this, but there are a number of
>> places where the compiler doesn't manage to infer that all paths taken
>> will initialize the variable and we have to change the code to make
>> it easier for the compiler to understand it. This will probably help
>> with the readability of the code, so that is probably not something
>> negative.
>>
>>>
>>> And I say that having just last week wasted half a day debugging
>>> what turned out to be a small refactoring that left a code path
>>> without a return.
>>
>> I've asked around and I've heard of a handful of HotSpot developers
>> who have been bitten by this.
:) >> >> thanks for looking at this, >> StefanK >> >>> >>> >> > From stefan.karlsson at oracle.com Wed Nov 5 20:47:09 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 21:47:09 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> Message-ID: <545A8CCD.10802@oracle.com> On 2014-11-05 21:20, John Rose wrote: > This is a good gain in robustness. Like Coleen, I'm agreeably > surprised at the relatively small number of fixes needed. > > On Nov 5, 2014, at 11:56 AM, Stefan Karlsson > > wrote: > >> ... >> A benefit of using the noreturn annotation would be to be able to >> turn on -Wuninitialized and get away with constructs like this: >> int a; >> switch (v) { >> case 1: a = x; break; >> case 2: a = y; break; >> default: ShouldNotReachHere(); >> } >> use(a); >> >> I have a patch were I started testing this, but there are a number of >> places were the compiler doesn't manage to infer that all paths taken >> will initialize the variable and we have to change the code to make >> it easier for the compiler to understand it. This will probably help >> with the readability of the code, so that is probably not something >> negative. > > I think this is a reasonable goal, worth a follow-up bug. > > One concern: The noreturn annotation might differ in effect from > compiler to compiler, making it difficult to have one (simple) source > base that pleases all the compilers. > > Here's a suggestion: Define a macro along these lines: > #define AFTER_NORETURN(code) {ShouldNotReachHere();code;} > > Use it for your two "Mute compiler" lines. For compilers which have > noreturn, define it this way: > #define AFTER_NORETURN(code) /*nothing*/ Thanks for the tip. > > Thanks for pursuing these cleanups. Thanks. StefanK > > ? 
John
>

From paul.hohensee at gmail.com Wed Nov 5 21:10:28 2014
From: paul.hohensee at gmail.com (Paul Hohensee)
Date: Wed, 5 Nov 2014 16:10:28 -0500
Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms
In-Reply-To: <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com>
References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com>
Message-ID:

I don't need a new webrev either, so afaic you're good to go.

Thanks,
Paul

On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett wrote:
> On Nov 3, 2014, at 7:21 PM, Erik Österlund wrote:
> >
> >> [legacy issue, not in changed code]
> >> I think the comment for generate_atomic_cmpxchg_long() is wrong in the
> >> return value; shouldn't it be returning a jlong? Probably a C-Y bug.
> >
> > No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct.
>
> The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is,
>
> 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value,
>
> should be "// Support for jlong ..."
>
> >> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
> >> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
> >>
> >> Why is the new byte version using "q" for exchange_value, where the
> >> existing int and long versions use "r"? [There might be a good
> >> reason, and this is just my rusty assembler skills showing.]
> >
> > With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r").
>
> Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping.
>
> > The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically.
>
> At least I got that part.
>
> > The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :)
>
> Indeed.
>
> >> ------------------------------------------------------------------------------
> >>
> >> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
> >> src/os_cpu/windows_x86/vm/os_windows_x86.hpp
> >>
> >> The windows port seems to only support specialized cmpxchgb when
> >> defined(AMD64), while the BSD/Linux variants don't have that
> >> restriction. Why this inconsistency? Or am I missing something,
> >> which seems entirely possible in this tangle.
> >
> > If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub.
> > Then there is another MSVC assembly variant for #ifndef AMD64.
> > This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things.
>
> Oops, you are correct.
>
> > Do you want a new webrev? (just polished comments and renamed the #define as per request)
>
> I don't think I need one, but others might want a closer to final version.
>
>

From coleen.phillimore at oracle.com Wed Nov 5 21:22:02 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 16:22:02 -0500
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A8C5D.2090107@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <545A87E1.2020404@oracle.com> <545A8C5D.2090107@oracle.com>
Message-ID: <545A94FA.607@oracle.com>

Question below... not to belabor the point.
On 11/5/14, 3:45 PM, Stefan Karlsson wrote: > On 2014-11-05 21:26, Coleen Phillimore wrote: >> >> >> On 11/5/14, 2:56 PM, Stefan Karlsson wrote: >>> On 2014-11-05 19:34, Kim Barrett wrote: >>>> On Nov 5, 2014, at 7:22 AM, Stefan Karlsson >>>> wrote: >>>>> Hi all, >>>>> >>>>> I propose that we turn on the -Wreturn-type warning when compiling >>>>> HotSpot with GCC. >>>>> >>>>> This will help us catch missing return statements earlier in the >>>>> development cycle. >>>>> >>>>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8062808 >>>> I?ve only skimmed this and not really reviewed, but I really >>>> dislike insertion of purportedly unreachable ?return? statements to >>>> silence compiler warnings. I have a similar dislike for ?if (check >>>> for bad case) { non-returning error processing } /* no else */ ?? >>>> to disable such warnings by avoid an apparent terminating control >>>> flow w/o return at the end of the error processing. There?s got to >>>> be a better way? >>> >>> I understand, but that's what you'll find if you look at the shared >>> code. I only added a few more places, where our other compilers >>> didn't complain about this or the code wasn't compiled with those >>> compilers. With that said, I'm all for cleaning this up, but it's a >>> pretty large undertaking that I don't think should prevent the usage >>> of -Wreturn-type. >>> >>>> >>>> I know that gcc/clang/MS all have annotation mechanism to mark code >>>> unreachable or functions as no-return. >>> >>> My first implementation of thisused notreturn annotations on >>> gcc/clang/MS, but since we need to support other compilers that >>> don't have this annotation we can't really take advantage of the >>> annotation to fix this problem throughout our code base. We would >>> still have all these constructs that you've pointed out, except for >>> the two I added. 
>> >> I don't think a "return 0;" line after ShouldNotReachHere(); which >> only knowing the contents of this call, ie that it doesn't return, is >> worse than some #pragma doesnt-return macro. The latter is more >> noise and distraction to me. > > Actually, the noreturn experiment/patch hides it within > ShouldNotReachHere(); so you wouldn't get any line noise. > > This is done by doing something like this: > > // globalDefinitions.hpp > WITH_NORETURN_ATTRIBUTE(void noreturn_function()); > > // globalDefinitions.cpp > void noreturn_function() { > // Make sure that this function never returns. > while (true) { > os::naked_short_sleep(10); > } > } > > // globalDefinitions_gcc.hpp > #define WITH_NORETURN_ATTRIBUTE(function) function > ___attribute((noreturn)) > > // globalDefinitions_ > #define ShouldNotReachHere() \ > do { \ > report_should_not_reach_here(__FILE__, __LINE__); \ > BREAKPOINT; noreturn_function(); \ > } while (0) > Rather than have all this, why not have report_should_not_reach_here() have the NORETURN_ATTRIBUTE. In any case, hidden in ShouldNotReachHere is good. Coleen > thanks, > StefanK >> I agree with Stefan. I don't think this should stop this change. >> This change is an improvement. >> >> Coleen >> >>> >>>> I?d rather see something like that added and used before we turn >>>> on -Wreturn-type, rather than littering / contorting code to avoid >>>> that warning. >>> >>> I don't agree. The code is already littered with these kind of >>> constructs, so that I add a couple of similar returns shouldn't be a >>> show-stopper for this warning, IMHO. >>> >>>> And compilers may generate better code sometimes when such are used. 
>>> >>> A benefit of using the noreturn annotation would be to be able to >>> turn on -Wuninitialized and get away with constructs like this: >>> int a; >>> switch (v) { >>> case 1: a = x; break; >>> case 2: a = y; break; >>> default: ShouldNotReachHere(); >>> } >>> use(a); >>> >>> I have a patch were I started testing this, but there are a number >>> of places were the compiler doesn't manage to infer that all paths >>> taken will initialize the variable and we have to change the code to >>> make it easier for the compiler to understand it. This will probably >>> help with the readability of the code, so that is probably not >>> something negative. >>> >>>> >>>> And I say that having just last week wasted half a day debugging >>>> what turned out to be a small refactoring that left a code path >>>> without a return. >>> >>> I've asked around and I've heard of a hand-full of HotSpot >>> developers that has been bitten by this. :) >>> >>> thanks for looking at this, >>> StefanK >>> >>>> >>>> >>> >> > From stefan.karlsson at oracle.com Wed Nov 5 21:35:26 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 05 Nov 2014 22:35:26 +0100 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A94FA.607@oracle.com> References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <545A87E1.2020404@oracle.com> <545A8C5D.2090107@oracle.com> <545A94FA.607@oracle.com> Message-ID: <545A981E.7060502@oracle.com> On 2014-11-05 22:22, Coleen Phillimore wrote: > > Question below... not to belabor the point. > > On 11/5/14, 3:45 PM, Stefan Karlsson wrote: >> On 2014-11-05 21:26, Coleen Phillimore wrote: >>> >>> >>> On 11/5/14, 2:56 PM, Stefan Karlsson wrote: >>>> On 2014-11-05 19:34, Kim Barrett wrote: >>>>> On Nov 5, 2014, at 7:22 AM, Stefan Karlsson >>>>> wrote: >>>>>> Hi all, >>>>>> >>>>>> I propose that we turn on the -Wreturn-type warning when >>>>>> compiling HotSpot with GCC. 
>>>>>> >>>>>> This will help us catch missing return statements earlier in the >>>>>> development cycle. >>>>>> >>>>>> http://cr.openjdk.java.net/~stefank/8062808/webrev.01/ >>>>>> https://bugs.openjdk.java.net/browse/JDK-8062808 >>>>> I?ve only skimmed this and not really reviewed, but I really >>>>> dislike insertion of purportedly unreachable ?return? statements >>>>> to silence compiler warnings. I have a similar dislike for ?if >>>>> (check for bad case) { non-returning error processing } /* no else >>>>> */ ?? to disable such warnings by avoid an apparent terminating >>>>> control flow w/o return at the end of the error processing. >>>>> There?s got to be a better way? >>>> >>>> I understand, but that's what you'll find if you look at the shared >>>> code. I only added a few more places, where our other compilers >>>> didn't complain about this or the code wasn't compiled with those >>>> compilers. With that said, I'm all for cleaning this up, but it's a >>>> pretty large undertaking that I don't think should prevent the >>>> usage of -Wreturn-type. >>>> >>>>> >>>>> I know that gcc/clang/MS all have annotation mechanism to mark >>>>> code unreachable or functions as no-return. >>>> >>>> My first implementation of thisused notreturn annotations on >>>> gcc/clang/MS, but since we need to support other compilers that >>>> don't have this annotation we can't really take advantage of the >>>> annotation to fix this problem throughout our code base. We would >>>> still have all these constructs that you've pointed out, except for >>>> the two I added. >>> >>> I don't think a "return 0;" line after ShouldNotReachHere(); which >>> only knowing the contents of this call, ie that it doesn't return, >>> is worse than some #pragma doesnt-return macro. The latter is more >>> noise and distraction to me. >> >> Actually, the noreturn experiment/patch hides it within >> ShouldNotReachHere(); so you wouldn't get any line noise. 
>> >> This is done by doing something like this: >> >> // globalDefinitions.hpp >> WITH_NORETURN_ATTRIBUTE(void noreturn_function()); >> >> // globalDefinitions.cpp >> void noreturn_function() { >> // Make sure that this function never returns. >> while (true) { >> os::naked_short_sleep(10); >> } >> } >> >> // globalDefinitions_gcc.hpp >> #define WITH_NORETURN_ATTRIBUTE(function) function >> ___attribute((noreturn)) >> >> // globalDefinitions_ >> #define ShouldNotReachHere() \ >> do { \ >> report_should_not_reach_here(__FILE__, __LINE__); \ >> BREAKPOINT; noreturn_function(); \ >> } while (0) >> > Rather than have all this, why not have report_should_not_reach_here() > have the NORETURN_ATTRIBUTE. 1) report_should_not_reach_here is used by assert and guarantee and we have infrastructure to ignore/skip those by specifying a filename and a line number. 2) We would never reach the BREAKPOINT code. > > In any case, hidden in ShouldNotReachHere is good. Great. thanks, StefanK > > Coleen > >> thanks, >> StefanK >>> I agree with Stefan. I don't think this should stop this change. >>> This change is an improvement. >>> >>> Coleen >>> >>>> >>>>> I?d rather see something like that added and used before we turn >>>>> on -Wreturn-type, rather than littering / contorting code to avoid >>>>> that warning. >>>> >>>> I don't agree. The code is already littered with these kind of >>>> constructs, so that I add a couple of similar returns shouldn't be >>>> a show-stopper for this warning, IMHO. >>>> >>>>> And compilers may generate better code sometimes when such are >>>>> used. 
>>>> >>>> A benefit of using the noreturn annotation would be to be able to >>>> turn on -Wuninitialized and get away with constructs like this: >>>> int a; >>>> switch (v) { >>>> case 1: a = x; break; >>>> case 2: a = y; break; >>>> default: ShouldNotReachHere(); >>>> } >>>> use(a); >>>> >>>> I have a patch were I started testing this, but there are a number >>>> of places were the compiler doesn't manage to infer that all paths >>>> taken will initialize the variable and we have to change the code >>>> to make it easier for the compiler to understand it. This will >>>> probably help with the readability of the code, so that is probably >>>> not something negative. >>>> >>>>> >>>>> And I say that having just last week wasted half a day debugging >>>>> what turned out to be a small refactoring that left a code path >>>>> without a return. >>>> >>>> I've asked around and I've heard of a hand-full of HotSpot >>>> developers that has been bitten by this. :) >>>> >>>> thanks for looking at this, >>>> StefanK >>>> >>>>> >>>>> >>>> >>> >> > From kim.barrett at oracle.com Wed Nov 5 21:35:13 2014 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 5 Nov 2014 16:35:13 -0500 Subject: RFR: 8062808: Turn on the -Wreturn-type warning In-Reply-To: <545A94FA.607@oracle.com> References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <545A87E1.2020404@oracle.com> <545A8C5D.2090107@oracle.com> <545A94FA.607@oracle.com> Message-ID: <63468AB6-6A42-4579-9BC8-9C38496A16C2@oracle.com> On Nov 5, 2014, at 4:22 PM, Coleen Phillimore wrote: > > Rather than have all this, why not have report_should_not_reach_here() have the NORETURN_ATTRIBUTE. I was recently wondering about that myself, in response to that recently wasted half day. 
It turns out that report_should_not_reach_here() - and anything else built on report_vm_error() - cannot presently be noreturn, because report_vm_error() is not noreturn; it starts with:

  if (Debugging || error_is_suppressed(file, line)) return;

report_vm_out_of_memory() similarly starts with

  if (Debugging) return;

Calls to these report functions typically seem to be followed by "BREAKPOINT;". I don't really understand the design here, e.g. why not move that into the place where the returns are located in the report functions? I expect there's a reason, but I haven't puzzled it out.

From kim.barrett at oracle.com Wed Nov 5 21:40:27 2014
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 5 Nov 2014 16:40:27 -0500
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A80F8.9010200@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com>
Message-ID: <10D033B6-C77D-463F-B239-CB09477B53A0@oracle.com>

On Nov 5, 2014, at 2:56 PM, Stefan Karlsson wrote:
>
> I understand, but that's what you'll find if you look at the shared code. I only added a few more places, where our other compilers didn't complain about this or the code wasn't compiled with those compilers. With that said, I'm all for cleaning this up, but it's a pretty large undertaking that I don't think should prevent the usage of -Wreturn-type.

Ugh. I suppose that's true, in which case I agree that adding a couple more isn't doing much harm, and turning on the warning is a significant benefit.

>> I know that gcc/clang/MS all have annotation mechanism to mark code unreachable or functions as no-return.
>
> My first implementation of this used noreturn annotations on gcc/clang/MS, but since we need to support other compilers that don't have this annotation we can't really take advantage of the annotation to fix this problem throughout our code base.
We would still have all these constructs that you've pointed out, except for the two I added.

Do compilers without such annotations also issue such warnings?

> I've asked around and I've heard of a handful of HotSpot developers that have been bitten by this. :)

I'm used to building with -Wall -Werror (from my previous job, and something I pushed for), and am discovering that I've become more reliant on the compiler telling me I'm doing something stupid than I had realized.

From stefan.karlsson at oracle.com Wed Nov 5 21:48:15 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 05 Nov 2014 22:48:15 +0100
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <10D033B6-C77D-463F-B239-CB09477B53A0@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <10D033B6-C77D-463F-B239-CB09477B53A0@oracle.com>
Message-ID: <545A9B1F.7010706@oracle.com>

On 2014-11-05 22:40, Kim Barrett wrote:
> On Nov 5, 2014, at 2:56 PM, Stefan Karlsson wrote:
>> I understand, but that's what you'll find if you look at the shared code. I only added a few more places, where our other compilers didn't complain about this or the code wasn't compiled with those compilers. With that said, I'm all for cleaning this up, but it's a pretty large undertaking that I don't think should prevent the usage of -Wreturn-type.
> Ugh. I suppose that's true, in which case I agree that adding a couple more isn't doing much harm, and turning on the warning is a significant benefit.
>
>>> I know that gcc/clang/MS all have annotation mechanism to mark code unreachable or functions as no-return.
>> My first implementation of this used noreturn annotations on gcc/clang/MS, but since we need to support other compilers that don't have this annotation we can't really take advantage of the annotation to fix this problem throughout our code base.
We would still have all these constructs that you've pointed out, except for the two I added.
> Do compilers without such annotations also issue such warnings?

I couldn't find one for the Sun Studio compiler. Maybe someone else knows?

>
>> I've asked around and I've heard of a handful of HotSpot developers that have been bitten by this. :)
> I'm used to building with -Wall -Werror (from my previous job, and something I pushed for), and am discovering that I've become more reliant on the compiler telling me I'm doing something stupid than I had realized.
>

Yes.

StefanK

From thomas.schatzl at oracle.com Wed Nov 5 21:53:06 2014
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 05 Nov 2014 22:53:06 +0100
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <545A9B1F.7010706@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <10D033B6-C77D-463F-B239-CB09477B53A0@oracle.com> <545A9B1F.7010706@oracle.com>
Message-ID: <1415224386.3193.4.camel@oracle.com>

Hi,

On Wed, 2014-11-05 at 22:48 +0100, Stefan Karlsson wrote:
> On 2014-11-05 22:40, Kim Barrett wrote:
> > On Nov 5, 2014, at 2:56 PM, Stefan Karlsson wrote:
> >> I understand, but that's what you'll find if you look at the shared code. I only added a few more places, where our other compilers didn't complain about this or the code wasn't compiled with those compilers. With that said, I'm all for cleaning this up, but it's a pretty large undertaking that I don't think should prevent the usage of -Wreturn-type.
> > Ugh. I suppose that's true, in which case I agree that adding a couple more isn't doing much harm, and turning on the warning is a significant benefit.
> >
> >>> I know that gcc/clang/MS all have annotation mechanism to mark code unreachable or functions as no-return.
> >> My first implementation of this used noreturn annotations on gcc/clang/MS, but since we need to support other compilers that don't have this annotation we can't really take advantage of the annotation to fix this problem throughout our code base. We would still have all these constructs that you've pointed out, except for the two I added.
> > Do compilers without such annotations also issue such warnings?
>
> I couldn't find one for the Sun Studio compiler. Maybe someone else knows?
>

At least since Sun Studio 11 there is the "#pragma does_not_return".

http://docs.oracle.com/cd/E19422-01/819-3690-10/819-3690-10.pdf

and later versions (12+) even add a "noreturn" attribute for compatibility.

http://docs.oracle.com/cd/E24457_01/html/E21991/gljol.html

Thanks,
Thomas

From stefan.karlsson at oracle.com Wed Nov 5 21:59:31 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 05 Nov 2014 22:59:31 +0100
Subject: RFR: 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <1415224386.3193.4.camel@oracle.com>
References: <545A1682.5030406@oracle.com> <085EFDE1-CB8C-42FF-8115-8290631DB4F2@oracle.com> <545A80F8.9010200@oracle.com> <10D033B6-C77D-463F-B239-CB09477B53A0@oracle.com> <545A9B1F.7010706@oracle.com> <1415224386.3193.4.camel@oracle.com>
Message-ID: <545A9DC3.2070707@oracle.com>

On 2014-11-05 22:53, Thomas Schatzl wrote:
> Hi,
>
> On Wed, 2014-11-05 at 22:48 +0100, Stefan Karlsson wrote:
>> On 2014-11-05 22:40, Kim Barrett wrote:
>>> On Nov 5, 2014, at 2:56 PM, Stefan Karlsson wrote:
>>>> I understand, but that's what you'll find if you look at the shared code. I only added a few more places, where our other compilers didn't complain about this or the code wasn't compiled with those compilers. With that said, I'm all for cleaning this up, but it's a pretty large undertaking that I don't think should prevent the usage of -Wreturn-type.
>>> Ugh.
I suppose that's true, in which case I agree that adding a couple more isn't doing much harm, and turning on the warning is a significant benefit.
>>>
>>>>> I know that gcc/clang/MS all have annotation mechanism to mark code unreachable or functions as no-return.
>>>> My first implementation of this used noreturn annotations on gcc/clang/MS, but since we need to support other compilers that don't have this annotation we can't really take advantage of the annotation to fix this problem throughout our code base. We would still have all these constructs that you've pointed out, except for the two I added.
>>> Do compilers without such annotations also issue such warnings?
>> I couldn't find one for the Sun Studio compiler. Maybe someone else knows?
> At least since Sun Studio 11 there is the "#pragma does_not_return".
>
> http://docs.oracle.com/cd/E19422-01/819-3690-10/819-3690-10.pdf
>
> and later versions (12+) even add a "noreturn" attribute for compatibility.
>
> http://docs.oracle.com/cd/E24457_01/html/E21991/gljol.html

Great!

thanks,
StefanK

>
> Thanks,
> Thomas
>

From vladimir.kozlov at oracle.com Wed Nov 5 22:05:10 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 05 Nov 2014 14:05:10 -0800
Subject: [8u40] backport RFR(XS) 8059780: SPECjvm2008-MPEG performance regressions on x64 platforms
Message-ID: <545A9F16.8050107@oracle.com>

Backport request. Changes were pushed into jdk9 2 days ago. Nightlies are fine. Changes are applied cleanly to 8u sources.
http://cr.openjdk.java.net/~kvn/8059780/webrev/
https://bugs.openjdk.java.net/browse/JDK-8059780
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/b8bcacc8ccca

Thanks,
Vladimir

From erik.osterlund at lnu.se Wed Nov 5 22:08:31 2014
From: erik.osterlund at lnu.se (Erik Österlund)
Date: Wed, 5 Nov 2014 22:08:31 +0000
Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms
In-Reply-To:
References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com>
Message-ID:

Okay, thanks a lot for the reviews Paul and Kim. :)
Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go.

Thanks,
/Erik

On 05 Nov 2014, at 22:10, Paul Hohensee wrote:

I don't need a new webrev either, so afaic you're good to go.

Thanks,
Paul

On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett wrote:
On Nov 3, 2014, at 7:21 PM, Erik Österlund wrote:
>
>> [legacy issue, not in changed code]
>> I think the comment for generate_atomic_cmpxchg_long() is wrong in the
>> return value; shouldn't it be returning a jlong? Probably a C-Y bug.
>
> No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct.

The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is,

623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value,

should be "// Support for jlong ..."

>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
>>
>> Why is the new byte version using "q" for exchange_value, where the
>> existing int and long versions use "r"?
[There might be a good
>> reason, and this is just my rusty assembler skills showing.]
>
> With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r").

Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping.

> The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically.

At least I got that part.

> The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :)

Indeed.

>> ------------------------------------------------------------------------------
>>
>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>> src/os_cpu/windows_x86/vm/os_windows_x86.hpp
>>
>> The windows port seems to only support specialized cmpxchgb when
>> defined(AMD64), while the BSD/Linux variants don't have that
>> restriction. Why this inconsistency? Or am I missing something,
>> which seems entirely possible in this tangle.
>
> If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub.
> Then there is another MSVC assembly variant for #ifndef AMD64.
> This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things.

Oops, you are correct.

> Do you want a new webrev? (just polished comments and renamed the #define as per request)

I don't think I need one, but others might want a closer to final version.
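
For readers following the "q" versus "r" discussion above, here is a minimal sketch of a byte-wide compare-and-exchange in GCC-style inline assembly. It is not the code from the webrev: jbyte_sketch and cmpxchg_byte_sketch() are hypothetical names, the lock prefix is applied unconditionally (the quoted constraint line passes an extra mp operand that this sketch omits), and the non-x86 branch using the __atomic_compare_exchange_n builtin is an assumption added only so the sketch builds elsewhere with GCC or clang. As far as I recall, the GCC documentation describes "q" as a, b, c or d in 32-bit mode and any integer register in 64-bit mode.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for HotSpot's jbyte (a signed 8-bit integer).
typedef int8_t jbyte_sketch;

// Atomically: if (*dest == compare_value) *dest = exchange_value.
// Returns the value that was in *dest before the operation.
inline jbyte_sketch cmpxchg_byte_sketch(jbyte_sketch exchange_value,
                                        volatile jbyte_sketch* dest,
                                        jbyte_sketch compare_value) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  // "a" pins compare_value to al, which cmpxchgb reads implicitly and
  // where it leaves the old value. "q" keeps exchange_value in a
  // register that has a byte-sized name, so %1 prints as e.g. %bl.
  __asm__ volatile ("lock; cmpxchgb %1,(%3)"
                    : "=a" (exchange_value)
                    : "q" (exchange_value), "a" (compare_value), "r" (dest)
                    : "cc", "memory");
  return exchange_value;
#elif defined(__GNUC__)
  // Assumed fallback for other targets: the compiler builtin updates
  // 'expected' with the current value when the comparison fails.
  jbyte_sketch expected = compare_value;
  __atomic_compare_exchange_n(dest, &expected, exchange_value, false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
#else
#error "this sketch assumes GCC or clang"
#endif
}
```

A caller can tell whether the exchange happened by comparing the returned value with compare_value: they are equal only on success.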
From serguei.spitsyn at oracle.com Wed Nov 5 22:22:49 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 05 Nov 2014 14:22:49 -0800
Subject: [8u-hs-dev] Backport RFR: 8057043: Type annotations not retained during class redefine / retransform
In-Reply-To: <545A1CCD.5080008@oracle.com>
References: <545A1CCD.5080008@oracle.com>
Message-ID: <545AA339.8020607@oracle.com>

Hi Andreas,

It is good.

Thanks,
Serguei

On 11/5/14 4:49 AM, Andreas Eriksson wrote:
> Hi,
>
> This backport of JDK-8057043
> imported cleanly
> from the jdk9 changeset:
> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/06de05da6f2b
>
> Thanks,
> Andreas

From kim.barrett at oracle.com Wed Nov 5 23:12:55 2014
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 5 Nov 2014 18:12:55 -0500
Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms
In-Reply-To:
References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com>
Message-ID:

On Nov 5, 2014, at 5:08 PM, Erik Österlund wrote:
>
> Okay, thanks a lot for the reviews Paul and Kim. :)
> Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go.

Yes, looks good.

From david.holmes at oracle.com Thu Nov 6 01:22:27 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 11:22:27 +1000
Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms
In-Reply-To:
References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com>
Message-ID: <545ACD53.3050108@oracle.com>

I'd like to see a final webrev please! I've lost track of this a bit.

Thanks,
David

On 6/11/2014 8:08 AM, Erik Österlund wrote:
> Okay, thanks a lot for the reviews Paul and Kim. :)
> Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go.
>
> Thanks,
>
> /Erik
>
> On 05 Nov 2014, at 22:10, Paul Hohensee wrote:
>
> I don't need a new webrev either, so afaic you're good to go.
>
> Thanks,
>
> Paul
>
>
> On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett wrote:
> On Nov 3, 2014, at 7:21 PM, Erik Österlund wrote:
>>
>>> [legacy issue, not in changed code]
>>> I think the comment for generate_atomic_cmpxchg_long() is wrong in the
>>> return value; shouldn't it be returning a jlong? Probably a C-Y bug.
>>
>> No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct.
>
> The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is,
>
> 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value,
>
> should be "// Support for jlong ..."
>
>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>>> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
>>>
>>> Why is the new byte version using "q" for exchange_value, where the
>>> existing int and long versions use "r"? [There might be a good
>>> reason, and this is just my rusty assembler skills showing.]
>>
>> With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r").
>
> Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping.
>
>> The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically.
>
> At least I got that part.
>
>> The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :)
>
> Indeed.
> >>> ------------------------------------------------------------------------------ >>> >>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>> src/os_cpu/windows_x86/vm/os_windows_x86.hpp >>> >>> The windows port seems to only support specialized cmpxchgb when >>> defined(AMD64), while the BSD/Linux variants don't have that >>> restriction. Why this inconsistency? Or am I missing something, >>> which seems entirely possible in this tangle. >> >> If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub. >> Then there is another MSVC assembly variant for #ifndef AMD64. >> This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things. > > Oops, you are correct. > >> Do you want a new webrev? (just polished comments and renamed the #define as per request) > > I don?t think I need one, but others might want a closer to final version. > > > From david.holmes at oracle.com Thu Nov 6 02:10:26 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 06 Nov 2014 12:10:26 +1000 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545A2D5C.5090808@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> Message-ID: <545AD892.2050708@oracle.com> On 5/11/2014 11:59 PM, Mikael Gerdin wrote: > Hi all, > > I have an updated webrev with the following changes: > > * compiler_barrier() calls added after the loads to all > load_acquire-variants. > * the dummy store in release() is removed. Looks good to me. Thanks, David > Full webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ > > Incremental webrev at: > http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ > > Thanks for all the feedback so far! > > /Mikael > > On 2014-11-03 14:01, Mikael Gerdin wrote: >> Hi all, >> >> Please review this attempt at fixing the OrderAccess functions on Linux >> x86 with GCC. 
>> >> While working on another bug I recently discovered that g++ was >> reordering stores across a call to OrderAccess::storestore on Linux x86. >> >> The G1 code attempts to do an ordered publishing of two values: >> _saved_mark_word = _top; >> OrderAccess::storestore(); >> _gc_time_stamp = curr_gc_time_stamp; >> >> The types involved are >> HeapWord* _top, _saved_mark_word; >> volatile unsigned _gc_time_stamp; >> >> The incorrect behavior seems to have started when JDK-6973570 was fixed >> in JDK 7. >> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >> _gc_time_stamp at 0x138, %rbx is "this". >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >> >> >> 3d9f4d: 39 d0 cmp %edx,%eax >> 3d9f4f: 73 1c jae 3d9f6d >> >> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >> ae98a0 <_DYNAMIC+0x12f8> >> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> >> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >> >> >> 3da05d: 39 d0 cmp %edx,%eax >> 3da05f: 73 15 jae 3da076 >> >> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >> >> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. >> >> My suggestion to fix this is to extend all the OrderAccess::release* >> variants on x86 with a: >> __asm__ volatile ("" : : : "memory"); >> to attempt to prevent GCC from reordering any memory accesses across >> those function calls. >> >> I've verified that this solves the issue in the assembly with our >> current JDK 9 build platform compilers. 
>> I've also verified that this particular piece of code is compiled >> correctly on our other x86 platforms: Solaris, Windows and OS X. >> >> Webrev: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8061964 >> Testing: >> JPRT, inspecting generated assembly for the function >> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >> currently named). >> Suggestions of further testing is greatly appreciated. >> >> Thanks >> Mikael From goetz.lindenmaier at sap.com Thu Nov 6 08:04:44 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 6 Nov 2014 08:04:44 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <5458ED10.2010405@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF23EDB@DEWDFEMB12A.global.corp.sap> <5458ED10.2010405@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF24829@DEWDFEMB12A.global.corp.sap> Coleen, thanks for helping and pushing the change! Best regards, Goetz. -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Coleen Phillimore Sent: Dienstag, 4. November 2014 16:13 To: hotspot-dev at openjdk.java.net Subject: Re: RFR (L): 8062370: Various minor code improvements I agree that it's an improvement. I started to look at it and I will sponsor it. Do you have the new version of 'webrev' with next navigation? That would make this easier. thanks, Coleen On 11/04/2014 04:34 AM, Lindenmaier, Goetz wrote: > Hi, > > could anybody have a look at this change, please? > I think it contains a lot of fixes useful to improve the code quality. > > Thanks and best regards, > Goetz. > > From: Lindenmaier, Goetz > Sent: Donnerstag, 30. Oktober 2014 09:28 > To: hotspot-dev at openjdk.java.net > Subject: RFR (L): 8062370: Various minor code improvements > > Hi, > > this change contains a row of minor code improvements we did to fulfil > our internal quality requirements. 
We would like to share these with > openJDK. > > Please review and test this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8062370 > > We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, > of course, the ppc platforms. > > > Some details: > > CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. > > We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. > > We add some missing memory frees and some closing of files. > > jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. > > Best regards, > > Goetz > > > > From mikael.gerdin at oracle.com Thu Nov 6 08:33:22 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 06 Nov 2014 09:33:22 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545AD892.2050708@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> <545AD892.2050708@oracle.com> Message-ID: <545B3252.2070908@oracle.com> On 2014-11-06 03:10, David Holmes wrote: > On 5/11/2014 11:59 PM, Mikael Gerdin wrote: >> Hi all, >> >> I have an updated webrev with the following changes: >> >> * compiler_barrier() calls added after the loads to all >> load_acquire-variants. >> * the dummy store in release() is removed. > > Looks good to me. Thanks for the review, David. /Mikael > > Thanks, > David > >> Full webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ >> >> Incremental webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ >> >> Thanks for all the feedback so far! 
>> >> /Mikael >> >> On 2014-11-03 14:01, Mikael Gerdin wrote: >>> Hi all, >>> >>> Please review this attempt at fixing the OrderAccess functions on Linux >>> x86 with GCC. >>> >>> While working on another bug I recently discovered that g++ was >>> reordering stores across a call to OrderAccess::storestore on Linux x86. >>> >>> The G1 code attempts to do an ordered publishing of two values: >>> _saved_mark_word = _top; >>> OrderAccess::storestore(); >>> _gc_time_stamp = curr_gc_time_stamp; >>> >>> The types involved are >>> HeapWord* _top, _saved_mark_word; >>> volatile unsigned _gc_time_stamp; >>> >>> The incorrect behavior seems to have started when JDK-6973570 was fixed >>> in JDK 7. >>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >>> _gc_time_stamp at 0x138, %rbx is "this". >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>> >>> >>> >>> 3d9f4d: 39 d0 cmp %edx,%eax >>> 3d9f4f: 73 1c jae 3d9f6d >>> >>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>> ae98a0 <_DYNAMIC+0x12f8> >>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>> >>> >>> >>> 3da05d: 39 d0 cmp %edx,%eax >>> 3da05f: 73 15 jae 3da076 >>> >>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>> >>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. 
>>> >>> My suggestion to fix this is to extend all the OrderAccess::release* >>> variants on x86 with a: >>> __asm__ volatile ("" : : : "memory"); >>> to attempt to prevent GCC from reordering any memory accesses across >>> those function calls. >>> >>> I've verified that this solves the issue in the assembly with our >>> current JDK 9 build platform compilers. >>> I've also verified that this particular piece of code is compiled >>> correctly on our other x86 platforms: Solaris, Windows and OS X. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8061964 >>> Testing: >>> JPRT, inspecting generated assembly for the function >>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >>> currently named). >>> Suggestions of further testing is greatly appreciated. >>> >>> Thanks >>> Mikael From mikael.gerdin at oracle.com Thu Nov 6 08:33:36 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 06 Nov 2014 09:33:36 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545A4065.6090605@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> <545A4065.6090605@oracle.com> Message-ID: <545B3260.6020101@oracle.com> On 2014-11-05 16:21, Daniel D. Daugherty wrote: > On 11/5/14 6:59 AM, Mikael Gerdin wrote: >> Hi all, >> >> I have an updated webrev with the following changes: >> >> * compiler_barrier() calls added after the loads to all >> load_acquire-variants. >> * the dummy store in release() is removed. >> >> Full webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ > > src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp > No comments. > > > Thumbs up. Thanks for the review, Dan. /Mikael > > Dan > > > >> >> Incremental webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ >> >> Thanks for all the feedback so far! 
>> >> /Mikael >> >> On 2014-11-03 14:01, Mikael Gerdin wrote: >>> Hi all, >>> >>> Please review this attempt at fixing the OrderAccess functions on Linux >>> x86 with GCC. >>> >>> While working on another bug I recently discovered that g++ was >>> reordering stores across a call to OrderAccess::storestore on Linux x86. >>> >>> The G1 code attempts to do an ordered publishing of two values: >>> _saved_mark_word = _top; >>> OrderAccess::storestore(); >>> _gc_time_stamp = curr_gc_time_stamp; >>> >>> The types involved are >>> HeapWord* _top, _saved_mark_word; >>> volatile unsigned _gc_time_stamp; >>> >>> The incorrect behavior seems to have started when JDK-6973570 was fixed >>> in JDK 7. >>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >>> _gc_time_stamp at 0x138, %rbx is "this". >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>> >>> >>> 3d9f4d: 39 d0 cmp %edx,%eax >>> 3d9f4f: 73 1c jae 3d9f6d >>> >>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>> ae98a0 <_DYNAMIC+0x12f8> >>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>> >>> >>> 3da05d: 39 d0 cmp %edx,%eax >>> 3da05f: 73 15 jae 3da076 >>> >>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>> >>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. 
>>> >>> My suggestion to fix this is to extend all the OrderAccess::release* >>> variants on x86 with a: >>> __asm__ volatile ("" : : : "memory"); >>> to attempt to prevent GCC from reordering any memory accesses across >>> those function calls. >>> >>> I've verified that this solves the issue in the assembly with our >>> current JDK 9 build platform compilers. >>> I've also verified that this particular piece of code is compiled >>> correctly on our other x86 platforms: Solaris, Windows and OS X. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8061964 >>> Testing: >>> JPRT, inspecting generated assembly for the function >>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >>> currently named). >>> Suggestions of further testing is greatly appreciated. >>> >>> Thanks >>> Mikael > From mikael.gerdin at oracle.com Thu Nov 6 08:33:51 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 06 Nov 2014 09:33:51 +0100 Subject: RFR: JDK-8061964: Insufficient compiler barriers for GCC in OrderAccess functions In-Reply-To: <545A3591.4050209@oracle.com> References: <54577C96.5030503@oracle.com> <545A2D5C.5090808@oracle.com> <545A3591.4050209@oracle.com> Message-ID: <545B326F.9050304@oracle.com> On 2014-11-05 15:34, Bertrand Delsart wrote: > Looks good. Thanks for the review, Bertrand. /Mikael > > Bertrand (not a Reviewer). > > On 05/11/2014 14:59, Mikael Gerdin wrote: >> Hi all, >> >> I have an updated webrev with the following changes: >> >> * compiler_barrier() calls added after the loads to all >> load_acquire-variants. >> * the dummy store in release() is removed. >> >> Full webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.1/ >> >> Incremental webrev at: >> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0_to_1/ >> >> Thanks for all the feedback so far! 
>> >> /Mikael >> >> On 2014-11-03 14:01, Mikael Gerdin wrote: >>> Hi all, >>> >>> Please review this attempt at fixing the OrderAccess functions on Linux >>> x86 with GCC. >>> >>> While working on another bug I recently discovered that g++ was >>> reordering stores across a call to OrderAccess::storestore on Linux x86. >>> >>> The G1 code attempts to do an ordered publishing of two values: >>> _saved_mark_word = _top; >>> OrderAccess::storestore(); >>> _gc_time_stamp = curr_gc_time_stamp; >>> >>> The types involved are >>> HeapWord* _top, _saved_mark_word; >>> volatile unsigned _gc_time_stamp; >>> >>> The incorrect behavior seems to have started when JDK-6973570 was fixed >>> in JDK 7. >>> Below, _top is at offset 0x58, _saved_mark_word at 0x18 and >>> _gc_time_stamp at 0x138, %rbx is "this". >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b108/binaries/linux-x64/jre/lib/amd64/server/libjvm.so: >>> >>> >>> >>> 3d9f4d: 39 d0 cmp %edx,%eax >>> 3d9f4f: 73 1c jae 3d9f6d >>> >>> 3d9f51: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3d9f55: 48 89 43 18 mov %rax,0x18(%rbx) >>> 3d9f59: 48 8b 05 40 f9 70 00 mov 0x70f940(%rip),%rax # >>> ae98a0 <_DYNAMIC+0x12f8> >>> 3d9f60: 48 c7 00 00 00 00 00 movq $0x0,(%rax) >>> 3d9f67: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> >>> /net/jre/onestop/jdk/1.7.0/promoted/all//b109/binaries/linux-x64/jre/lib/amd64/server/libjvm.so >>> >>> >>> >>> 3da05d: 39 d0 cmp %edx,%eax >>> 3da05f: 73 15 jae 3da076 >>> >>> 3da061: 48 8b 43 58 mov 0x58(%rbx),%rax >>> 3da065: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp) >>> 3da06c: 89 93 38 01 00 00 mov %edx,0x138(%rbx) >>> 3da072: 48 89 43 18 mov %rax,0x18(%rbx) >>> >>> In b109 the store of %rax to 0x18(%rbx) has been ordered after the store >>> of %edx to 0x138(%rbx) in the same build as JDK-6973570 was integrated. 
>>> >>> My suggestion to fix this is to extend all the OrderAccess::release* >>> variants on x86 with a: >>> __asm__ volatile ("" : : : "memory"); >>> to attempt to prevent GCC from reordering any memory accesses across >>> those function calls. >>> >>> I've verified that this solves the issue in the assembly with our >>> current JDK 9 build platform compilers. >>> I've also verified that this particular piece of code is compiled >>> correctly on our other x86 platforms: Solaris, Windows and OS X. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mgerdin/8061964/webrev.0/ >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8061964 >>> Testing: >>> JPRT, inspecting generated assembly for the function >>> G1OffsetTableContigSpace::record_top_and_timestamp (as the method is >>> currently named). >>> Suggestions of further testing is greatly appreciated. >>> >>> Thanks >>> Mikael > > From albert.noll at oracle.com Thu Nov 6 08:53:31 2014 From: albert.noll at oracle.com (Albert Noll) Date: Thu, 06 Nov 2014 09:53:31 +0100 Subject: [9] RFR(S): 8062735: CodeCacheSweeperThread missing from SA In-Reply-To: <545A5026.2070607@oracle.com> References: <5458CE11.305@oracle.com> <54595418.6060300@oracle.com> <5459EA00.9070109@oracle.com> <5459F24F.4090700@oracle.com> <545A5026.2070607@oracle.com> Message-ID: <545B370B.2060705@oracle.com> Serguei, Coleen, Vladimir, thanks again. Best, Albert On 11/05/2014 05:28 PM, Vladimir Kozlov wrote: > Okay. > > Thanks, > Vladimir > > On 11/5/14 1:47 AM, Albert Noll wrote: >> Hi, >> >> I forgot to add CodeCacheSweeperThread.java. Could you please look it >> it again? >> http://cr.openjdk.java.net/~anoll/8062735/webrev.01/ >> >> Thanks, >> Albert >> >> On 11/05/2014 10:12 AM, Albert Noll wrote: >>> Coleen, Vladimir, Serguei, thanks for reviewing this. >>> >>> Best, >>> Albert >>> >>> On 11/04/2014 11:32 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Albert, >>>> >>>> The fix looks good. 
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/4/14 5:01 AM, Albert Noll wrote: >>>>> Hi, >>>>> >>>>> could I get reviews for this small patch? >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8062735 >>>>> >>>>> Problem: >>>>> The fix for JDK-8046809 added the CodeCacheSweeperThread, but did >>>>> not add this new type to SA. >>>>> >>>>> Solution: >>>>> Add type to SA. >>>>> >>>>> Testing: >>>>> Failing test cases. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~anoll/8062735/webrev.00/ >>>>> >>>>> Many thanks, >>>>> Albert >>>>> >>>> >>> >> From andreas.eriksson at oracle.com Thu Nov 6 09:32:38 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Thu, 06 Nov 2014 10:32:38 +0100 Subject: [8u-hs-dev] Backport RFR: 8057043: Type annotations not retained during class redefine / retransform In-Reply-To: <545AA339.8020607@oracle.com> References: <545A1CCD.5080008@oracle.com> <545AA339.8020607@oracle.com> Message-ID: <545B4036.1070808@oracle.com> Thanks Serguei! /Andreas On 2014-11-05 23:22, serguei.spitsyn at oracle.com wrote: > Hi Andreas, > > It is good. > > Thanks, > Serguei > > > On 11/5/14 4:49 AM, Andreas Eriksson wrote: >> Hi, >> >> This backport of JDK-8057043 >> imported cleanly >> from the jdk9 changeset: >> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/06de05da6f2b >> >> Thanks, >> Andreas > From roland.westrelin at oracle.com Thu Nov 6 09:50:28 2014 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Thu, 6 Nov 2014 10:50:28 +0100 Subject: [8u40] backport RFR(XS) 8059780: SPECjvm2008-MPEG performance regressions on x64 platforms In-Reply-To: <545A9F16.8050107@oracle.com> References: <545A9F16.8050107@oracle.com> Message-ID: <7176BA2A-BFE3-4312-9E2E-05E3AE0841CE@oracle.com> Looks good to me. Roland. > On Nov 5, 2014, at 11:05 PM, Vladimir Kozlov wrote: > > Backport request. Changes were pushed into jdk9 2 days ago. Nightlies are fine. Changes are applied cleanly to 8u sources. 
> > http://cr.openjdk.java.net/~kvn/8059780/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8059780 > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/b8bcacc8ccca > > Thanks, > Vladimir From andreas.eriksson at oracle.com Thu Nov 6 10:08:26 2014 From: andreas.eriksson at oracle.com (Andreas Eriksson) Date: Thu, 06 Nov 2014 11:08:26 +0100 Subject: "/src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot use long to initialize instanceKlassHandle. In-Reply-To: <545A5831.8050506@oracle.com> References: <545A5743.1010307@oracle.com> <545A5831.8050506@oracle.com> Message-ID: <545B489A.1040408@oracle.com> Ah, I see. Thanks! /Andreas On 2014-11-05 18:02, Coleen Phillimore wrote: > > Yes, return CHECK_(instanceKlassHandle()) instead of CHECK_NULL for > jdk7. Handles were rewritten since InstanceKlass is no longer an oop. > > Coleen > > On 11/5/14, 11:58 AM, Andreas Eriksson wrote: >> Hi all, >> >> I'm backporting JDK-8020675 >> to jdk7, but debug >> builds on solaris fails with the following error: >> >> ".../src/share/vm/classfile/classLoader.cpp", line 907: Error: Cannot >> use long to initialize instanceKlassHandle. >> >> This is in method ClassLoader::load_classfile which returns an >> instanceKlassHandle. >> The changed code adds a CHECK_NULL: >> >> - stream = e->open_stream(name); >> + stream = e->open_stream(name, CHECK_NULL); >> >> The CHECK_NULL macro expands to >> >> stream = e->open_stream(name, THREAD); if (HAS_PENDING_EXCEPTION) >> return NULL; (0); >> >> NULL is of type long on solaris_x64 (or int on solaris_i586, which >> has the same problem), >> and the solaris debug build fails with the above error because we try >> to return NULL i.e. 0L as an instanceKlassHandle. >> >> This is only a problem on jdk7 solaris debug builds, does anyone know >> why it fails in this particular configuration? >> It is not a problem on jdk8, even though the same code exists there. 
>> >> Returning instanceKlassHandle() works as expected, would this be the >> correct way to go since that is a NULL handle? >> >> Any help is appreciated. >> >> Regards, >> Andreas > From david.holmes at oracle.com Thu Nov 6 10:09:23 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 06 Nov 2014 20:09:23 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> Message-ID: <545B48D3.5040009@oracle.com> Hi Goetz, This change has introduced a bug: - return vsnprintf(str, count, fmt, args); + + int result = vsnprintf(str, count, fmt, args); + if ((result > 0 && (size_t)result >= count) || result == -1) { + str[count - 1] = '\0'; + result = -1; + } + + return result; some strings are getting their last character truncated on Windows. David On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: > Hi David, > > thanks for looking at the change! I fixed the issue in a new > webrev: > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ > > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Mittwoch, 5. November 2014 02:49 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > The only issue I see is in: > > src/share/vm/runtime/globals.cpp > > where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort > on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. > > Thanks, > David > > On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >> Hi, >> >> this change contains a row of minor code improvements we did to fulfil >> our internal quality requirements. We would like to share these with >> openJDK. 
>> >> Please review and test this change. I please need a sponsor. >> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8062370 >> >> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >> of course, the ppc platforms. >> >> >> Some details: >> >> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >> >> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >> >> We add some missing memory frees and some closing of files. >> >> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >> >> Best regards, >> >> Goetz >> >> >> >> From goetz.lindenmaier at sap.com Thu Nov 6 10:17:48 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 6 Nov 2014 10:17:48 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545B48D3.5040009@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> Thanks David, I'll have a look. Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 6. 
November 2014 11:09 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: Re: RFR (L): 8062370: Various minor code improvements Hi Goetz, This change has introduced a bug: - return vsnprintf(str, count, fmt, args); + + int result = vsnprintf(str, count, fmt, args); + if ((result > 0 && (size_t)result >= count) || result == -1) { + str[count - 1] = '\0'; + result = -1; + } + + return result; some strings are getting their last character truncated on Windows. David On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: > Hi David, > > thanks for looking at the change! I fixed the issue in a new > webrev: > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ > > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Mittwoch, 5. November 2014 02:49 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > The only issue I see is in: > > src/share/vm/runtime/globals.cpp > > where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort > on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. > > Thanks, > David > > On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >> Hi, >> >> this change contains a row of minor code improvements we did to fulfil >> our internal quality requirements. We would like to share these with >> openJDK. >> >> Please review and test this change. I please need a sponsor. >> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8062370 >> >> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >> of course, the ppc platforms. >> >> >> Some details: >> >> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. 
This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >> >> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >> >> We add some missing memory frees and some closing of files. >> >> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >> >> Best regards, >> >> Goetz >> >> >> >> From david.holmes at oracle.com Thu Nov 6 10:30:28 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 06 Nov 2014 20:30:28 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> Message-ID: <545B4DC4.3020709@oracle.com> On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: > Thanks David, I'll have a look. It seems that windows vsnprintf may not null-terminate the string - which I think is what your patch was trying to address. But if we have existing code that works with that then the fix is now overwriting the last character. I can't quite see how to handle this in a cross platform manner, but in the immediate term we should probably revert that part of the changeset. David > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. 
November 2014 11:09 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > This change has introduced a bug: > > - return vsnprintf(str, count, fmt, args); > + > + int result = vsnprintf(str, count, fmt, args); > + if ((result > 0 && (size_t)result >= count) || result == -1) { > + str[count - 1] = '\0'; > + result = -1; > + } > + > + return result; > > some strings are getting their last character truncated on Windows. > > David > > On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >> Hi David, >> >> thanks for looking at the change! I fixed the issue in a new >> webrev: >> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >> >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Mittwoch, 5. November 2014 02:49 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> The only issue I see is in: >> >> src/share/vm/runtime/globals.cpp >> >> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort >> on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >> >> Thanks, >> David >> >> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> this change contains a row of minor code improvements we did to fulfil >>> our internal quality requirements. We would like to share these with >>> openJDK. >>> >>> Please review and test this change. I please need a sponsor. >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>> >>> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >>> of course, the ppc platforms. >>> >>> >>> Some details: >>> >>> CONST64(0x8000000000000000) is wrong, as 0x8... 
is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>> >>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>> >>> We add some missing memory frees and some closing of files. >>> >>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >>> >>> Best regards, >>> >>> Goetz >>> >>> >>> >>> From goetz.lindenmaier at sap.com Thu Nov 6 10:43:40 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 6 Nov 2014 10:43:40 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545B4DC4.3020709@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> Hi David, yes, windows does not null terminate if there is an overflow. Obviously there are overflows, and they now see one less character. I think this should be fixed where jio_vsnprintf is called. Having non-null terminated strings isn't nice. But for now I will roll back this single change. I'll send a RFR soon. Where did you see the problem? Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 6. November 2014 11:30 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: Re: RFR (L): 8062370: Various minor code improvements On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: > Thanks David, I'll have a look. 
It seems that windows vsnprintf may not null-terminate the string - which I think is what your patch was trying to address. But if we have existing code that works with that then the fix is now overwriting the last character. I can't quite see how to handle this in a cross platform manner, but in the immediate term we should probably revert that part of the changeset. David > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:09 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > This change has introduced a bug: > > - return vsnprintf(str, count, fmt, args); > + > + int result = vsnprintf(str, count, fmt, args); > + if ((result > 0 && (size_t)result >= count) || result == -1) { > + str[count - 1] = '\0'; > + result = -1; > + } > + > + return result; > > some strings are getting their last character truncated on Windows. > > David > > On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >> Hi David, >> >> thanks for looking at the change! I fixed the issue in a new >> webrev: >> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >> >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Mittwoch, 5. November 2014 02:49 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> The only issue I see is in: >> >> src/share/vm/runtime/globals.cpp >> >> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort >> on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >> >> Thanks, >> David >> >> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> this change contains a row of minor code improvements we did to fulfil >>> our internal quality requirements. 
We would like to share these with >>> openJDK. >>> >>> Please review and test this change. I please need a sponsor. >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>> >>> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >>> of course, the ppc platforms. >>> >>> >>> Some details: >>> >>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>> >>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>> >>> We add some missing memory frees and some closing of files. >>> >>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >>> >>> Best regards, >>> >>> Goetz >>> >>> >>> >>> From david.holmes at oracle.com Thu Nov 6 11:23:00 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 06 Nov 2014 21:23:00 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> Message-ID: <545B5A14.1020508@oracle.com> On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > Hi David, > > yes, windows does not null terminate if there is an overflow. > Obviously there are overflows, and they now see one less > character. 
I think this should be fixed where jio_vsnprintf > is called. Having non-null terminated strings isn't nice. I think it depends on what you consider an overflow. If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > But for now I will roll back this single change. I'll send a RFR soon. > > Where did you see the problem? It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 Thanks, David > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> Thanks David, I'll have a look. > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if we have > existing code that works with that then the fix is now overwriting the > last character. I can't quite see how to handle this in a cross platform > manner, but in the immediate term we should probably revert that part of > the changeset. > > David > >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. 
November 2014 11:09 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> This change has introduced a bug: >> >> - return vsnprintf(str, count, fmt, args); >> + >> + int result = vsnprintf(str, count, fmt, args); >> + if ((result > 0 && (size_t)result >= count) || result == -1) { >> + str[count - 1] = '\0'; >> + result = -1; >> + } >> + >> + return result; >> >> some strings are getting their last character truncated on Windows. >> >> David >> >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> thanks for looking at the change! I fixed the issue in a new >>> webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Mittwoch, 5. November 2014 02:49 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> The only issue I see is in: >>> >>> src/share/vm/runtime/globals.cpp >>> >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort >>> on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>> >>> Thanks, >>> David >>> >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> this change contains a row of minor code improvements we did to fulfil >>>> our internal quality requirements. We would like to share these with >>>> openJDK. >>>> >>>> Please review and test this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>> >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >>>> of course, the ppc platforms. >>>> >>>> >>>> Some details: >>>> >>>> CONST64(0x8000000000000000) is wrong, as 0x8... 
is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>> >>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>> >>>> We add some missing memory frees and some closing of files. >>>> >>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >>>> >>>> Best regards, >>>> >>>> Goetz >>>> >>>> >>>> >>>> From goetz.lindenmaier at sap.com Thu Nov 6 12:48:30 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 6 Nov 2014 12:48:30 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545B5A14.1020508@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> Hi David, Well, yes, that's right. But then you can simply pass in count+1. It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated. But that both is quite special. Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf). Should I use the internal bug number for the rollback-fix? How should we proceed, as I can't fix you internal code? Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 6. 
November 2014 12:23 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: Re: RFR (L): 8062370: Various minor code improvements On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > Hi David, > > yes, windows does not null terminate if there is an overflow. > Obviously there are overflows, and they now see one less > character. I think this should be fixed where jio_vsnprintf > is called. Having non-null terminated strings isn't nice. I think it depends on what you consider an overflow. If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > But for now I will roll back this single change. I'll send a RFR soon. > > Where did you see the problem? It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 Thanks, David > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> Thanks David, I'll have a look. > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if we have > existing code that works with that then the fix is now overwriting the > last character. I can't quite see how to handle this in a cross platform > manner, but in the immediate term we should probably revert that part of > the changeset. > > David > >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. 
November 2014 11:09 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> This change has introduced a bug: >> >> - return vsnprintf(str, count, fmt, args); >> + >> + int result = vsnprintf(str, count, fmt, args); >> + if ((result > 0 && (size_t)result >= count) || result == -1) { >> + str[count - 1] = '\0'; >> + result = -1; >> + } >> + >> + return result; >> >> some strings are getting their last character truncated on Windows. >> >> David >> >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> thanks for looking at the change! I fixed the issue in a new >>> webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Mittwoch, 5. November 2014 02:49 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> The only issue I see is in: >>> >>> src/share/vm/runtime/globals.cpp >>> >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort >>> on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>> >>> Thanks, >>> David >>> >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> this change contains a row of minor code improvements we did to fulfil >>>> our internal quality requirements. We would like to share these with >>>> openJDK. >>>> >>>> Please review and test this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>> >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64 bit and, >>>> of course, the ppc platforms. >>>> >>>> >>>> Some details: >>>> >>>> CONST64(0x8000000000000000) is wrong, as 0x8... 
is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>> >>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>> >>>> We add some missing memory frees and some closing of files. >>>> >>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. >>>> >>>> Best regards, >>>> >>>> Goetz >>>> >>>> >>>> >>>> From thomas.stuefe at gmail.com Thu Nov 6 13:21:51 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Thu, 6 Nov 2014 14:21:51 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545B5A14.1020508@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> Message-ID: Hi David, Our intent was to always guarantee that the written string is zero terminated, and across platforms to always return the same value (-1) if truncation happened. The original jio_snprintf() did not have any value over plain snprintf() (apart from maybe solving the name problem with "_snprintf" on Windows). But having a dedicated wrapper function around snprintf() suggests some added value, and I always thought that value was supposed to be zero termination. If that is not intended, it would be better to use the plain snprintf instead, because at least C programmers then know what to expect. 
I also found very few cases where the return code of jio_snprintf() was actually checked and truncation handled correctly. Which would be difficult too, because the return code differed for truncation between windows and Posix. Bottomline, I think it would be better if jio_snprintf() were to always zero-terminate, guaranteed. Kind Regards, Thomas On Thu, Nov 6, 2014 at 12:23 PM, David Holmes wrote: > On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > >> Hi David, >> >> yes, windows does not null terminate if there is an overflow. >> Obviously there are overflows, and they now see one less >> character. I think this should be fixed where jio_vsnprintf >> is called. Having non-null terminated strings isn't nice. >> > > I think it depends on what you consider an overflow. If the buffer is > already null terminated and you pass in a count that covers up to the > location before the null then there is no problem - except now the logic > will introduce a second null in place of the last character. > > But for now I will roll back this single change. I'll send a RFR soon. >> >> Where did you see the problem? >> > > It was in our closed code so I can't go into details. We have a non-public > bug number: 8063089 > > Thanks, > > David > > >> Best regards, >> Goetz. >> >> >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 11:30 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> >>> Thanks David, I'll have a look. >>> >> >> It seems that windows vsnprintf may not null-terminate the string - >> which I think is what your patch was trying to address. But if we have >> existing code that works with that then the fix is now overwriting the >> last character. 
I can't quite see how to handle this in a cross platform >> manner, but in the immediate term we should probably revert that part of >> the changeset. >> >> David >> >> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Donnerstag, 6. November 2014 11:09 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Cc: Markus Gr?nlund >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> This change has introduced a bug: >>> >>> - return vsnprintf(str, count, fmt, args); >>> + >>> + int result = vsnprintf(str, count, fmt, args); >>> + if ((result > 0 && (size_t)result >= count) || result == -1) { >>> + str[count - 1] = '\0'; >>> + result = -1; >>> + } >>> + >>> + return result; >>> >>> some strings are getting their last character truncated on Windows. >>> >>> David >>> >>> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> >>>> Hi David, >>>> >>>> thanks for looking at the change! I fixed the issue in a new >>>> webrev: >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> -----Original Message----- >>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>> Sent: Mittwoch, 5. November 2014 02:49 >>>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>>> >>>> Hi Goetz, >>>> >>>> The only issue I see is in: >>>> >>>> src/share/vm/runtime/globals.cpp >>>> >>>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the "abort >>>> on OOM" semantics of NEW_C_HEAP_ARRAY you need to use >>>> os::strdup_check_oom. >>>> >>>> Thanks, >>>> David >>>> >>>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> >>>>> Hi, >>>>> >>>>> this change contains a row of minor code improvements we did to fulfil >>>>> our internal quality requirements. We would like to share these with >>>>> openJDK. 
>>>>>
>>>>> Please review and test this change. I please need a sponsor.
>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8062370
>>>>>
>>>>> We tested this on windows 64, linux x86_64, mac, solaris sparc 32+64
>>>>> bit and, of course, the ppc platforms.
>>>>>
>>>>>
>>>>> Some details:
>>>>>
>>>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus
>>>>> not representable as i64 what is used in the CONST64 macro. This change
>>>>> adapts UCONST64 to use ui64, and the usages of these macros where necessary.
>>>>>
>>>>> We add some more strncpy uses. Also, we fix strncpy on windows.
>>>>> There, strncpy does not write a \0 into the last byte if the copied string
>>>>> is too long.
>>>>>
>>>>> We add some missing memory frees and some closing of files.
>>>>>
>>>>> jio_vsnprintf() works differently on windows and linux. This change
>>>>> adapts this to show the same behaviour on all platforms. See java.cpp.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Goetz
>>>>>
>>>>>
>>>>>
>>>>>

From markus.gronlund at oracle.com Thu Nov 6 13:50:27 2014
From: markus.gronlund at oracle.com (Markus Grönlund)
Date: Thu, 6 Nov 2014 05:50:27 -0800 (PST)
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap>
Message-ID:

Hi Goetz,

Thanks for looking into this.
I think I will be able to update the internal code I am working on to accommodate your updates.

I don't know if any other code will see potential issues - only testing will tell.

So I would hold off on the rollback and I will putback my updated code - let's see if other issues appear after this - we should know after tonight's nightly testing (then we can re-evaluate the rollback).

Thanks
Markus

-----Original Message-----
From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com]
Sent: 6 November 2014 13:49
To: David Holmes; hotspot-dev at openjdk.java.net
Cc: Markus Grönlund
Subject: RE: RFR (L): 8062370: Various minor code improvements

Hi David,

Well, yes, that's right. But then you can simply pass in count+1.
It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated.
But that both is quite special.

Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf).

Should I use the internal bug number for the rollback-fix?

How should we proceed, as I can't fix your internal code?

Best regards,
Goetz.

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Thursday, 6 November 2014 12:23
To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net
Cc: Markus Grönlund
Subject: Re: RFR (L): 8062370: Various minor code improvements

On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> yes, windows does not null terminate if there is an overflow.
> Obviously there are overflows, and they now see one less character. I
> think this should be fixed where jio_vsnprintf is called. Having
> non-null terminated strings isn't nice.

I think it depends on what you consider an overflow. If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character.
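The two positions above can be made concrete with a small C++ sketch. The names checked_vsnprintf, checked_snprintf and format_hello are hypothetical stand-ins, not the HotSpot functions, and the sketch assumes a C99-conforming vsnprintf underneath. It shows a caller that relies on a pre-set terminator and passes a count covering only the bytes before it, and the same caller passing count+1 as suggested in the reply.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the patched wrapper, not the HotSpot source:
// always zero-terminate, and return -1 whenever the output did not fit.
static int checked_vsnprintf(char* str, std::size_t count, const char* fmt, va_list args) {
  assert(str != nullptr && count > 0);  // need room for at least the terminator
  int result = std::vsnprintf(str, count, fmt, args);
  if (result < 0 || static_cast<std::size_t>(result) >= count) {
    // On a runtime that leaves the buffer unterminated on overflow, this
    // store overwrites the last character that was written.
    str[count - 1] = '\0';
    result = -1;
  }
  return result;
}

static int checked_snprintf(char* str, std::size_t count, const char* fmt, ...) {
  va_list args;
  va_start(args, fmt);
  int result = checked_vsnprintf(str, count, fmt, args);
  va_end(args);
  return result;
}

// Caller pattern from the discussion: a 6-byte buffer whose last byte is
// already '\0'. Returns the text the caller ends up with.
static const char* format_hello(char (&buf)[6], bool pass_full_size) {
  buf[5] = '\0';
  // count == 5 covers only the bytes before the pre-set terminator;
  // count == 6 (count+1) includes the terminator slot.
  checked_snprintf(buf, pass_full_size ? 6 : 5, "%s", "hello");
  return buf;
}
```

With the smaller count the caller gets "hell" back and a return value of -1; with count+1 the full "hello" survives. Whether existing callers should be changed or the wrapper relaxed is exactly the open question in this thread, so the sketch takes no side.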
> But for now I will roll back this single change. I'll send a RFR soon. > > Where did you see the problem? It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 Thanks, David > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> Thanks David, I'll have a look. > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if we have > existing code that works with that then the fix is now overwriting the > last character. I can't quite see how to handle this in a cross > platform manner, but in the immediate term we should probably revert > that part of the changeset. > > David > >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 11:09 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> This change has introduced a bug: >> >> - return vsnprintf(str, count, fmt, args); >> + >> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && >> + (size_t)result >= count) || result == -1) { >> + str[count - 1] = '\0'; >> + result = -1; >> + } >> + >> + return result; >> >> some strings are getting their last character truncated on Windows. >> >> David >> >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> thanks for looking at the change! I fixed the issue in a new >>> webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>> >>> Best regards, >>> Goetz. 
>>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Mittwoch, 5. November 2014 02:49 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> The only issue I see is in: >>> >>> src/share/vm/runtime/globals.cpp >>> >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>> >>> Thanks, >>> David >>> >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> this change contains a row of minor code improvements we did to >>>> fulfil our internal quality requirements. We would like to share >>>> these with openJDK. >>>> >>>> Please review and test this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>> >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>> 32+64 bit and, of course, the ppc platforms. >>>> >>>> >>>> Some details: >>>> >>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>> >>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>> >>>> We add some missing memory frees and some closing of files. >>>> >>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. 
>>>> >>>> Best regards, >>>> >>>> Goetz >>>> >>>> >>>> >>>> From volker.simonis at gmail.com Thu Nov 6 14:35:24 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 6 Nov 2014 15:35:24 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> Message-ID: Hi Mikael, just wanted to ask what's the status of this project? I hope it was not just a JavaOne hoax :) Regards, Volker On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis wrote: > Thanks Mikael, that sounds good! > > Regards, > Volker > > > On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt > wrote: >> >> Volker, >> >> The proposal is only to change how the changes are pushed, not which forests >> changes can be pushed to. That is, we would still require hotspot changes to >> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or to the >> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be applied on >> all those (four) forests. Reasonable? >> >> Cheers, >> Mikael >> >> >> On 2014-09-12 11:38, Volker Simonis wrote: >>> >>> Hi Mikael, >>> >>> there's one more question that came to my mind: will the new rule >>> apply to all hotspot respitories (i.e. jdk9/hs-rt/hotspot, >>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ? >>> >>> Thanks, >>> Volker >>> >>> >>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>> wrote: >>>> >>>> Andrew/Volker, >>>> >>>> Thanks for the positive feedback. The goal of the proposal is to simplify >>>> pushing changes which are effectively not tested by the jprt system >>>> anyway. >>>> The proposed relaxation would not affect work on other infrastructure >>>> projects in any relevant way, but would hopefully improve all our lives >>>> significantly immediately. 
>>>> >>>> Cheers, >>>> Mikael >>>> >>>> >>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>> >>>>> Hi Mikael, >>>>> >>>>> thanks a lot for this proposal. I think this will dramatically >>>>> simplify our work to keep our ports up to date! So I fully support it. >>>>> >>>>> Nevertheless, I think this can only be a first step towards fully open >>>>> the JPRT system to developers outside Oracle. With "opening" I mean to >>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>> jobs as well as allowing porting projects to add hardware which builds >>>>> and tests the HotSpot on alternative platforms. >>>>> >>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>> doubts that this simplification will hopefully not push the >>>>> realization of a truly OPEN JPRT system even further away. >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>> wrote: >>>>>> >>>>>> All, >>>>>> >>>>>> Made up primarily of low level C++ code, the Hotspot codebase is highly >>>>>> platform dependent and also tightly coupled with the tool chains on the >>>>>> various platforms. Each platform/tool chain combination has its set of >>>>>> special quirks, and code must be implemented in a way such that it only >>>>>> relies on the common subset of syntax and functionality across all >>>>>> these >>>>>> combinations. History has taught us that even simple changes can have >>>>>> surprising results when compiled with different compilers. >>>>>> >>>>>> For more than a decade the Hotspot team has ensured a minimum quality >>>>>> level >>>>>> by requiring all pushes to be done through a build and test system >>>>>> (jprt) >>>>>> which guarantees that the code resulting from applying a set of changes >>>>>> builds on a set of core platforms and that a set of core tests pass. >>>>>> Only >>>>>> if >>>>>> all the builds and tests pass will the changes actually be pushed to >>>>>> the >>>>>> target repository. 
>>>>>> >>>>>> We believe that testing like the above, in combination with later >>>>>> stages >>>>>> of >>>>>> testing, is vital to ensuring that the quality level of the Hotspot >>>>>> code >>>>>> remains high and that developers do not run into situations where the >>>>>> latest >>>>>> version has build errors on some platforms. >>>>>> >>>>>> Recently the AIX/PPC port was added to the set of OpenJDK platforms. >>>>>> From >>>>>> a >>>>>> Hotspot perspective this new platform added a set of AIX/PPC specific >>>>>> files >>>>>> including some platform specific changes to shared code. The AIX/PPC >>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The same >>>>>> thing applies for the shark and zero versions of Hotspot. >>>>>> >>>>>> While Hotspot developers remain committed to making sure changes are >>>>>> developed in a way such that the quality level remains high across all >>>>>> platforms and variants, because of the above mentioned complexities it >>>>>> is >>>>>> inevitable that from time to time changes will be made which introduce >>>>>> issues on specific platforms or tool chains not part of the core >>>>>> testing. >>>>>> >>>>>> To allow these issues to be resolved more quickly I would like to >>>>>> propose >>>>>> a >>>>>> relaxation in the requirements on how changes to Hotspot are pushed. 
>>>>>> Specifically I would like to allow for direct pushes to the hotspot/ >>>>>> repository of files specific to the following ports/variants/tools: >>>>>> >>>>>> * AIX >>>>>> * PPC >>>>>> * Shark >>>>>> * Zero >>>>>> >>>>>> Today this translates into the following files: >>>>>> >>>>>> - src/cpu/ppc/** >>>>>> - src/cpu/zero/** >>>>>> - src/os/aix/** >>>>>> - src/os_cpu/aix_ppc/** >>>>>> - src/os_cpu/bsd_zero/** >>>>>> - src/os_cpu/linux_ppc/** >>>>>> - src/os_cpu/linux_zero/** >>>>>> >>>>>> Note that all changes are still required to go through the normal >>>>>> development and review cycle; the proposed relaxation only applies to >>>>>> how >>>>>> the changes are pushed. >>>>>> >>>>>> If at code review time a change is for some reason deemed to be risky >>>>>> and/or >>>>>> otherwise have impact on shared files the reviewer may request that the >>>>>> change to go through the regular push testing. For changes only >>>>>> touching >>>>>> the >>>>>> above set of files this expected to be rare. >>>>>> >>>>>> Please let me know what you think. >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >> From stefan.karlsson at oracle.com Thu Nov 6 17:57:43 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 06 Nov 2014 18:57:43 +0100 Subject: [8u40] RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse In-Reply-To: <5459D9EE.1030202@oracle.com> References: <542D2EC1.5030303@oracle.com> <5459D9EE.1030202@oracle.com> Message-ID: <545BB697.5080802@oracle.com> Just a heads-up: I plan to push this backport tomorrow, unless I get more comments. thanks, StefanK On 2014-11-05 09:03, Stefan Karlsson wrote: > Hi all, > > Please, review the backport of this fix to 8u40: > http://cr.openjdk.java.net/~stefank/backports/8u40/8056240/webrev.01/ > > I've attached the .rej files from applying the JDK 9 patch to JDK 8u40. 
>
> Reasons for the .rej files from the failed applied patch hunks:
> metadataOnStackMark.cpp - the has_redefined_a_class parameter in
> MetadataOnStackMark(bool has_redefined_a_class) is only present in JDK 9.
> metadataOnStackMark.hpp - the same as above
> classLoaderData.cpp - has_redefined_a_class parameter and JDK-8040237
> isn't backported
> concurrentMark.cpp - JDK-8027450 isn't backported
>
> thanks,
> StefanK
>
> On 2014-10-02 12:53, Stefan Karlsson wrote:
>> Hi all,
>>
>> (The following patch changes HotSpot code in areas concerning GC, RT,
>> and Compiler. So, it would be good to get reviews from all three teams.)
>>
>> Please, review this patch to optimize and parallelize the CodeCache
>> part of MetadataOnStackMark.
>>
>> G1 performance measurements showed longer than expected remark times
>> on an application using a lot of nmethods and Metadata. The cause for
>> this performance regression was the call to
>> CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in
>> MetadataOnStackMark. This code path is only taken when class
>> redefinition is used. Class redefinition is typically used in
>> monitoring and diagnostic frameworks.
>>
>> With this patch we now:
>> 1) Filter out duplicate Metadata* entries instead of storing a
>> Metadata* per visited metadata reference.
>> 2) Collect the set of Metadata* entries in parallel. The code
>> piggy-backs on the parallel CodeCache cleaning task.
>>
>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8056240
>>
>> Functional testing:
>> JPRT, Kitchensink, parallel_class_unloading, unit tests
>>
>> Performance testing:
>> CRM Fuse - where the regression was found
>>
>> The patch changes HotSpot code in areas concerning GC, RT, Compiler,
>> and Serviceability. It would be good to get some reviews from the
>> other teams, and not only from the GC team.
>> >> thanks, >> StefanK > From eric.mccorkle at oracle.com Thu Nov 6 18:00:10 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Thu, 06 Nov 2014 13:00:10 -0500 Subject: Review request: JDK-8062556: Add jdk tests for JDK-8058322 and JDK-8058313 In-Reply-To: <5457A270.9010009@oracle.com> References: <5452CCA3.5040001@oracle.com> <5452ECF5.7020607@oracle.com> <54539625.7070302@oracle.com> <54570981.2090601@oracle.com> <5457A270.9010009@oracle.com> Message-ID: <545BB72A.1070204@oracle.com> Are there any concerns about the tests, other than the broken webrevs? On 11/03/14 10:42, Eric McCorkle wrote: > I have been having issues with webrev, which I reported earlier. Webrev > reports a syntax error when I try to use it, and curiously, it fails to > produce top-level files in this case (and this case only, as evidenced > by my other webrevs). > > Unfortunately, there's nothing I can do about the missing top-level > files; however, you can still look at the individual files just fine. > > On 11/02/14 23:50, David Holmes wrote: >> Hi Erik, >> >> webrevs still broken for some reason. >> On 1/11/2014 12:01 AM, Eric McCorkle wrote: >>> I went through and added comments in the binary data indicating where >>> the MethodParameters attributes are, and a breakdown of their contents. >>> I went ahead and did this for all the bad class files, not just the new >>> ones. >>> >>> There is a larger picture here: there's an outstanding task I filed >>> around the time these tests were written to find a better way for >>> langtools to run jtreg tests that involve bad class files. >>> Unfortunately, doing that is rather difficult, as you can see. The only >>> real way to do it is to generate a class file, convert it to signed >>> bytes (you can't even use hex; you get an unsigned/signed byte >>> conversion problem), then modify the data by hand. The intent is to >>> replace this with a better method at some point. >> >> OK. New comments an improvement. 
>>
>> Please give the new test the correct initial copyright year of 2014. I
>> know updates to the year are handled automatically (eventually) but we
>> should at least have things correct to start with.
>>
>> Thanks,
>> David
>>
>>> On 10/30/14 21:59, David Holmes wrote:
>>>> Hi Erik,
>>>>
>>>> On 31/10/2014 9:41 AM, Eric McCorkle wrote:
>>>>> Hello,
>>>>>
>>>>> Please review this patch which adds tests to the JDK test suite for two
>>>>> reflection bugs that require hotspot changes (JDK-8058322 and
>>>>> JDK-8058313)
>>>>>
>>>>> The webrev is here:
>>>>> http://cr.openjdk.java.net/~emc/8062556/
>>>>
>>>> I second Brian's comment re the source of the bad classes.
>>>>
>>>> Your webrev is broken btw - no top-level html files.
>>>>
>>>> The new test needs a copyright year of 2014 not 2013.
>>>>
>>>> Thanks,
>>>> David
>>>>

From eric.mccorkle at oracle.com Thu Nov 6 18:35:32 2014
From: eric.mccorkle at oracle.com (Eric McCorkle)
Date: Thu, 06 Nov 2014 13:35:32 -0500
Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException
In-Reply-To: <54597EFC.2070509@oracle.com>
References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com>
Message-ID: <545BBF74.4020607@oracle.com>

On 11/04/14 20:35, Jiangli Zhou wrote:
> Hi Eric,
>
> I have a few more comments:
>
> In ClassFileParser::parse_method(), should 'real_length' be int instead
> of u2?

No. By that point, we know it's positive, and it's about to be compared to method_attribute_length, which is a u2.

> In JVM_GetMethodParameters(), can you add an assert to make sure the
> num_params is -1 when it's less than 0? Also, it's probably more
> conventional to use (num_params < 0) instead of (0 > num_params).

Applied, and webrev refreshed. Please look at it.
> > On 11/03/2014 01:35 PM, Eric McCorkle wrote: >> Please review this issue so that it can go in along with 8058322. >> Thanks. >> >> On 10/30/14 19:40, Eric McCorkle wrote: >>> Thank you for the pointers. I have applied your changes and refreshed >>> the webrev. >>> >>> http://cr.openjdk.java.net/~emc/8058313/ >>> >>> Also, I have posted the test for this and another patch here: >>> http://cr.openjdk.java.net/~emc/8062556/ >>> >>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>> Hi Eric, >>>> >>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>> Hi Eric, >>>>>> >>>>>> I wonder if we could specialize this particular case and avoid >>>>>> changing >>>>>> the parsing code. How about setting the _has_method_parameters >>>>>> flag in >>>>>> the ConstMethod when encounter such MethodParameter, and changing >>>>>> JVM_GetMethodParameters() to return non-NULL value for such case when >>>>>> _has_method_parameters is true but method_parameters_length is 0. >>>>>> Would >>>>>> that work? >>>>> Which parser are you talking about? The inline tables parser, or the >>>>> class file parser. The class file parser has to change, because it >>>>> was >>>>> previously ignoring MethodParameters attributes with >>>>> parameter_count 0. >>>> It's the class parsing changes that I was referring to, mostly >>>> relate to >>>> the initialization and checking against method_parameters_length. >>>> It's a >>>> bit awkward to include the 0 case but also skipping it in the loop. For >>>> example, the following code in classFileParser.cpp changed ">" to ">=" >>>> in the if check, but has no real effect and is not need. 
>>>>
>>>>   // Copy method parameters
>>>>   if (method_parameters_length >= 0) {
>>>>     MethodParametersElement* elem =
>>>>       m->constMethod()->method_parameters_start();
>>>>     for (int i = 0; i < method_parameters_length; i++) {
>>>>       elem[i].name_cp_index =
>>>>         Bytes::get_Java_u2(method_parameters_data);
>>>>       method_parameters_data += 2;
>>>>       elem[i].flags = Bytes::get_Java_u2(method_parameters_data);
>>>>       method_parameters_data += 2;
>>>>     }
>>>>   }
>>>>
>>>>
>>>>> I don't think your proposal will work. The inline tables' offsets are
>>>>> all dependent on what inline tables are actually present. If
>>>>> _has_method_parameters is set, then the inline tables code expects the
>>>>> last u2 of the inline tables to be a u2 indicating the number of method
>>>>> parameters entries, preceded by the array of method parameters data.
>>>>> If _has_method_parameters is false, then it expects that there is no
>>>>> method parameters information at all (including no length field). If
>>>>> you were to set _has_method_parameters, but not store any information in
>>>>> the inline table, then it would cause errors for all the rest of the
>>>>> inline tables.
>>>> Thank you for reminding me of the complexity of the inlined table
>>>> calculation in the ConstMethod. My proposal would require tweaks in that
>>>> area to correctly compute the table sizes. As it's easy to introduce
>>>> bugs in that area, it's not worth changing the table calculation code
>>>> for this purpose. I agree my proposal is not a better choice in this
>>>> case.
>>>>
>>>>> What I do for the parameter_count = 0 case is just store
>>>>> a 0 u2 for zero-length method parameters information, and no data. All
>>>>> the existing inline tables code works fine with this case, so there
>>>>> aren't any serious changes to the inline tables code (other than
>>>>> allowing method parameters information to be stored when the array is
>>>>> length 0).
But you have to make some change to the inline table code, >>>>> otherwise the information won't be stored. >>>> Ok. Could you please add comments to the change in constMethod.cpp to >>>> explain above? >>>> >>>> In jvm.cpp, since -1 represents no method parameter now. Maybe checking >>>> against explicity and add comments for the 0-length case. >>>> >>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject >>>> method)) >>>> { >>>> ... >>>> // No method parameter >>>> if (num_params == -1) { >>>> return (jobjectArray)NULL; >>>> } >>>> >>>> /* handle the rest here */ >>>> // make sure all the symbols are properly formatted >>>> for (int i = 0; i < num_params; i++) { >>>> ... >>>> } >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>> Hello, >>>>>>> >>>>>>> Please review this fix for parameter reflection which addresses >>>>>>> hotspot >>>>>>> falsely ignoring zero-length MethodParameter attributes. The JVMS >>>>>>> allows a MethodParameters attribute with parameter_count = 0, and >>>>>>> the >>>>>>> parameter reflection spec states that a MalformedParametersException >>>>>>> should be thrown if parameter_count does not match the number of >>>>>>> real >>>>>>> parameters to a method. Hotspot currently ignores MethodParameters >>>>>>> attributes with parameter_count = 0; however, in a case where a >>>>>>> (bad) >>>>>>> MethodParameters attribute has parameter_count = 0, but the method >>>>>>> has a >>>>>>> nonzero number of real parameters, hotspot will return null from >>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>> reflection API >>>>>>> acts like there is no MethodParameters attribute). >>>>>>> >>>>>>> This patch causes hotspot to record the fact that a zero-length >>>>>>> MethodParameters attribute does exist, causing the exception to be >>>>>>> thrown when it should be. 
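The distinction this review turns on, "no MethodParameters attribute" versus "an attribute with parameter_count = 0", can be modelled in a few lines of C++. This is only a simplified sketch: MethodModel, ParamLookup and lookup_method_parameters are hypothetical names that do not mirror ConstMethod or jvm.cpp, and the mismatch check stands in for whatever the reflection layer does with the returned data. It assumes, as stated in the thread, that -1 marks a missing attribute.

```cpp
#include <cassert>

// Hypothetical model, not HotSpot code.
struct MethodModel {
  int descriptor_param_count;    // parameters according to the method descriptor
  int method_parameters_length;  // -1: no MethodParameters attribute; >= 0: parameter_count
};

enum class ParamLookup { NoAttribute, Ok, Malformed };

inline ParamLookup lookup_method_parameters(const MethodModel& m) {
  if (m.method_parameters_length < 0) {
    assert(m.method_parameters_length == -1);  // the only negative value in this model
    return ParamLookup::NoAttribute;            // caller sees "no attribute present"
  }
  if (m.method_parameters_length != m.descriptor_param_count) {
    // A present attribute, including a zero-length one, that disagrees with
    // the descriptor is the case that should surface as an error.
    return ParamLookup::Malformed;
  }
  return ParamLookup::Ok;
}
```

Before the change described here, a zero-length attribute was indistinguishable from a missing one, so in this model a method with two real parameters and parameter_count = 0 would have come back as NoAttribute rather than Malformed.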
>>>>>>>
>>>>>>> The bug is here:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313
>>>>>>>
>>>>>>> The webrev is here:
>>>>>>> http://cr.openjdk.java.net/~emc/8058313/
>

From david.holmes at oracle.com Thu Nov 6 20:33:56 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 06:33:56 +1000
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To:
References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap>
Message-ID: <545BDB34.9000300@oracle.com>

Hi Markus,

On 6/11/2014 11:50 PM, Markus Grönlund wrote:
> Hi Goetz,
>
> Thanks for looking into this.
>
> I think I will be able to update the internal code I am working on to accommodate your updates.
>
> I don't know if any other code will see potential issues - only testing will tell.
>
> So I would hold off on the rollback and I will putback my updated code - let's see if other issues appear after this - we should know after tonight's nightly testing (then we can re-evaluate the rollback).

Thanks for dealing with this without needing to do a partial rollback of the current changes.

We need to do a quick audit of windows-only uses of jio_snprintf to check that other code is not also expecting the Windows vsnprintf semantics.

We will also close the pre-push testing gap that allowed this to slip through.
Thanks, David > Thanks > Markus > > > -----Original Message----- > From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] > Sent: den 6 november 2014 13:49 > To: David Holmes; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi David, > > Well, yes, that's right. But then you can simply pass in count+1. > It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated. > But that both is quite special. > > Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf). > > Should I use the internal bug number for the rollback-fix? > > How should we proceed, as I can't fix you internal code? > > Best regards, > Goetz. > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 12:23 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: >> Hi David, >> >> yes, windows does not null terminate if there is an overflow. >> Obviously there are overflows, and they now see one less character. I >> think this should be fixed where jio_vsnprintf is called. Having >> non-null terminated strings isn't nice. > > I think it depends on what you consider an overflow. If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > >> But for now I will roll back this single change. I'll send a RFR soon. >> >> Where did you see the problem? > > It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 > > Thanks, > David > >> >> Best regards, >> Goetz. 
>> >> >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Thursday, 6 November 2014 11:30 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Grönlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >>> Thanks David, I'll have a look. >> >> It seems that windows vsnprintf may not null-terminate the string - >> which I think is what your patch was trying to address. But if we have >> existing code that works with that then the fix is now overwriting the >> last character. I can't quite see how to handle this in a cross >> platform manner, but in the immediate term we should probably revert >> that part of the changeset. >> >> David >> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Thursday, 6 November 2014 11:09 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Cc: Markus Grönlund >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> This change has introduced a bug: >>>
>>> - return vsnprintf(str, count, fmt, args);
>>> +
>>> + int result = vsnprintf(str, count, fmt, args);
>>> + if ((result > 0 && (size_t)result >= count) || result == -1) {
>>> +   str[count - 1] = '\0';
>>> +   result = -1;
>>> + }
>>> +
>>> + return result;
>>> >>> some strings are getting their last character truncated on Windows. >>> >>> David >>> >>> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>>> Hi David, >>>> >>>> thanks for looking at the change! I fixed the issue in a new >>>> webrev: >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> -----Original Message----- >>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>> Sent: Wednesday, 5
November 2014 02:49 >>>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>>> >>>> Hi Goetz, >>>> >>>> The only issue I see is in: >>>> >>>> src/share/vm/runtime/globals.cpp >>>> >>>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>>> >>>> Thanks, >>>> David >>>> >>>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>>> Hi, >>>>> >>>>> this change contains a series of minor code improvements we did to >>>>> fulfil our internal quality requirements. We would like to share >>>>> these with OpenJDK. >>>>> >>>>> Please review and test this change. I also need a sponsor, please. >>>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>>> >>>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>>> 32+64 bit and, of course, the ppc platforms. >>>>> >>>>> >>>>> Some details: >>>>> >>>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64, which is what the CONST64 macro uses. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>>> >>>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>>> >>>>> We add some missing memory frees and some closing of files. >>>>> >>>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp.
>>>>> >>>>> Best regards, >>>>> >>>>> Goetz >>>>> >>>>> >>>>> >>>>> From serguei.spitsyn at oracle.com Thu Nov 6 22:27:25 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 06 Nov 2014 14:27:25 -0800 Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5459FB96.9020404@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> Message-ID: <545BF5CD.6010008@oracle.com> Hi reviewers, I'm suggesting to review a modified fix: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/ The 3-rd round fix is not right as it caused deadlocks in several tests (in nsk.jdi.testlist and jtreg com/sun/jdi). Here is a deadlock example: ----------------- lwp# 2 / thread# 2 -------------------- ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, d4400, 0, ffffffff7e357440, 100138730) + 100 ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, 100137000, 0, 1004405d0, 6e750, 0) + a4 ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, 100137000, 0, 0, 1, 20000000) + 358 ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, 100137000, 1, deab, 60000000, 100137000) + c8 ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 ffffffff7da2284c jvmtiError JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, ffffffffffffffff, 4, 9aeb0, 100137000) + 8c ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, ffffffffffefd118, ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c 
ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, 2) + 10c ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + 138 ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360 ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, ffffffff7ab3ad18, 1018, 1000) + 1d8 ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, ffffffff7e3e6b70) + 30c ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, ffffffff7e3e8b30, ffffffff7e357440, 0, 10013700) + 1bc ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, 10035de68, 0, ffffffff7e4143b0) + 860 ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) ----------------- lwp# 12 / thread# 12 -------------------- ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, d4400, 0, ffffffff7e357440, 100349930) + 100 ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 ffffffff7da22450 jvmtiError JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 
ffffffff7da56b18 void JvmtiAgentThread::call_start_function() (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + 128 ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, 3d8, 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, fffffffea5f3e048, 3d8, 1003497f8) + 3ac ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) ----------------- lwp# 13 / thread# 13 -------------------- ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, d4400, 0, ffffffff7e357440, 10034d330) + 100 ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, 10034c000) + e0 ffffffff7da2284c jvmtiError JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, 0, 10000000) + ac ffffffff7da56b18 void JvmtiAgentThread::call_start_function() (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + 128 ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, 3d8, 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, fffffffea5f3e290, 3d8, 10034cfe8) + 3ac ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) The details: - Thread #2: 
The cbVMDeath() event handler is waiting on the commandCompleteLock in enqueueCommand(). The call chain is: cbVMDeath() -> event_callback() -> reportEvents() -> eventHelper_reportEvents() -> enqueueCommand(). enqueueCommand() depends on commandLoop(), which has to call completeCommand(command) for the command being enqueued. At this point gdata->vmDead = JNI_TRUE has not been set yet.
- Thread #12: debugLoop_run() is blocked entering the vmDeathLock.
- Thread #13: commandLoop() is waiting on the blockCommandLoopLock in doBlockCommandLoop(). This is because blockCommandLoop == JNI_TRUE, which is set in needBlockCommandLoop() if the following condition is true: (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE && cmd->u.reportEventComposite.suspendPolicy == JDWP_SUSPEND_POLICY(ALL))

It seems that debugLoop_run() blocking on the vmDeathLock causes commandLoop() to wait indefinitely. cbVMDeath() cannot proceed because commandLoop() does not make progress.

The vmDeathLock critical section in the cbVMDeath() event callback seems to be overkill (unnecessary). A less intrusive synchronization is required here: wait until the current command is completed before returning to JvmtiExport::post_vm_death().

The new approach (see new webrev) is to extend the resumeLock synchronization pattern to the whole VirtualMachine command set, not only the resume command. The resumeLock is renamed to vmDeathLock to reflect the new semantics.

In general, we could consider doing the same for the rest of the JDWP command sets. But it is better to be careful and see how this change goes first.

Thanks, Serguei On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote: > Hi David, > > Thank you for the concerns! > Testing showed several tests failing with deadlocks. > Scenarios are similar to those you describe. > > Trying to understand the details.
> > Thanks, > Serguei > > On 11/4/14 4:09 PM, David Holmes wrote: >> Hi Serguei, >> >> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>> On 11/2/14 8:58 PM, David Holmes wrote: >>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>> Serguei, >>>>> >>>>> Thank you for good finding. This approach looks much better for me. >>>>> >>>>> The fix looks good. >>>>> >>>>> Is it necessary to release vmDeathLock locks at >>>>> eventHandler.c:1244 before call >>>>> >>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>> >>>> I agree this looks necessary, or at least more clean (if things are >>>> failing we really don't know what is happening). >>> >>> Agreed (replied to Dmitry). >>> >>>> >>>> More generally I'm concerned about whether any of the code paths taken >>>> while holding the new lock can result in deadlock - in particular with >>>> regard to the resumeLock ? >>> >>> The cbVMDeath() function never holds both vmDeathLock and resumeLock at >>> the same time, >>> so there is no chance for a deadlock that involves both these locks. >>> >>> Two more locks used in the cbVMDeath() are the callbackBlock and >>> callbackLock. >>> These two locks look completely unrelated to the debugLoop_run(). >>> >>> The debugLoop_run() function also uses the cmdQueueLock. >>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at >>> the >>> same time. >>> >>> So that I do not see any potential to introduce new deadlock with the >>> vmDeathLock. >>> >>> However, it is still easy to overlook something here. >>> Please, let me know if you see any danger. >> >> I was mainly concerned about what might happen in the call chain for >> threadControl_resumeAll() (it certainly sounds like it might need to >> use a resumeLock :) ). I see direct use of the threadLock and >> indirectly the eventHandler lock; but there are further call paths I >> did not explore. 
Wish there was an easy way to determine the >> transitive closure of all locks used from a given call. >> >> Thanks, >> David >> >>> Thanks, >>> Serguei >>> >>>> >>>> David >>>> >>>>> -Dmitry >>>>> >>>>> >>>>> >>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>> >>>>>> It is 3-rd round of review for: >>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>> >>>>>> New webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Summary >>>>>> >>>>>> For failing scenario, please, refer to the 1-st round RFR below. >>>>>> >>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>> decided to >>>>>> switch from a workaround to a real fix. >>>>>> >>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>> gdata->vmDead = 1. >>>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>>> >>>>>> 165 } else if (gdata->vmDead && >>>>>> 166 ((cmd->cmdSet) != >>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>> 167 /* Protect the VM from calls while dead. >>>>>> 168 * VirtualMachine cmdSet quietly ignores some >>>>>> cmds >>>>>> 169 * after VM death, so, it sends it's own >>>>>> errors. >>>>>> 170 */ >>>>>> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD)); >>>>>> >>>>>> >>>>>> However, the guard above does not help much if the VM_DEATH event >>>>>> happens in the middle of a command execution. >>>>>> There is a lack of synchronization here. >>>>>> >>>>>> The fix introduces new lock (vmDeathLock) which does not allow to >>>>>> execute the commands >>>>>> and the VM_DEATH event callback concurrently. >>>>>> It should work well for any function that is used in >>>>>> implementation of >>>>>> the JDWP_COMMAND_SET(VirtualMachine) . 
>>>>>> >>>>>> >>>>>> Testing: >>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>> tests >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> The updated webrev: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> The changes are: >>>>>>> - added a comment recommended by Staffan >>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>> classSignature() >>>>>>> >>>>>>> The classSignature() function is called in 16 places. >>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>> signature >>>>>>> and will crash. >>>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>>> return to this >>>>>>> issue after gaining experience with more failure cases that are >>>>>>> still >>>>>>> expected. >>>>>>> The failure with the classSignature() involved was observed only >>>>>>> once >>>>>>> in the nightly >>>>>>> and should be extremely rare reproducible. >>>>>>> I'll file a placeholder bug if necessary. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Please, review the fix for: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>> >>>>>>>> >>>>>>>> Open webrev: >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> >>>>>>>> The failing scenario: >>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>> shutdown has >>>>>>>> been started in the target process. >>>>>>>> The debugger at this point is not expected to send any >>>>>>>> commands >>>>>>>> to the JDWP agent. >>>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>>> (debuggee side) >>>>>>>> are not in sync with the consumer layers. 
>>>>>>>> >>>>>>>> One reason is because the test debugger does not invoke >>>>>>>> the JDI >>>>>>>> method VirtualMachine.dispose(). >>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>> processes >>>>>>>> are uneasy to sync in general. >>>>>>>> >>>>>>>> As a result the following steps are possible: >>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>> debuggee >>>>>>>> - The debuggee is normally exiting >>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>> anonymous class unload event >>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>> ClassUnloadEvent event >>>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>>> reference type. >>>>>>>> If there is more than one class with the same host class >>>>>>>> signature, it can't distinguish them, >>>>>>>> and so, deletes all references and re-retrieves them >>>>>>>> again >>>>>>>> (see tracing below): >>>>>>>> MY_TRACE: JDI: >>>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>>> - The jdwp backend debugLoop_run() gets the command >>>>>>>> from JDI >>>>>>>> and calls the functions >>>>>>>> classesForSignature() and classStatus() recursively. >>>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>>> GetClassStatus() >>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>>> - As a result the jdwp backend reports the JVMTI error >>>>>>>> to the >>>>>>>> JDI, and so, the test fails >>>>>>>> >>>>>>>> For details, see the analysis in bug report closed as a >>>>>>>> dup of >>>>>>>> the bug 6988950: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>>> >>>>>>>> Some similar cases can be found in the two bug reports >>>>>>>> (6988950 >>>>>>>> and 8024865) describing this issue. >>>>>>>> >>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE >>>>>>>> error >>>>>>>> as it is normal at the VM shutdown. 
>>>>>>>> The original jdwp backend implementation had a similar >>>>>>>> approach >>>>>>>> for the raw monitor functions. >>>>>>>> They use the ignore_vm_death() to work around the >>>>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>>>> For reference, please, see the file: src/share/back/util.c >>>>>>>> >>>>>>>> >>>>>>>> Testing: >>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>> tests >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>> > From serguei.spitsyn at oracle.com Thu Nov 6 22:56:50 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 06 Nov 2014 14:56:50 -0800 Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545BF5CD.6010008@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> Message-ID: <545BFCB2.5070802@oracle.com> I forgot to mention that all the nsk.jdi.testlist, nsk.jdwp.testlist and jtreg com/sun/jdi tests successfully passed with the new fix. No deadlocks are observed. Thanks, Serguei
From jiangli.zhou at oracle.com Thu Nov 6 23:53:31 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 06 Nov 2014 15:53:31 -0800 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545BBF74.4020607@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> Message-ID: <545C09FB.9020907@oracle.com> Eric, On 11/06/2014 10:35 AM, Eric McCorkle wrote: > On 11/04/14 20:35, Jiangli Zhou wrote: >> Hi Eric, >> >> I have a few more comments:
>>
>> In ClassFileParser::parse_method(), should 'real_length' be int instead
>> of u2?
> No. By that point, we know it's positive, and it's about to be compared
> to method_attribute_length, which is a u2.

Ok, I double checked with the spec and the VM code. The
'parameters_count' is u1, so it would not overflow using u2. So we are
fine using u2 here.

>
>> In JVM_GetMethodParameters(), can you add an assert to make sure the
>> num_params is -1 when it's less than 0? Also, it's probably more
>> conventional to use (num_params < 0) instead of (0 > num_params).
> Applied, and webrev refreshed. Please look at it.

Could you please point to the updated webrev? I don't see the update in
http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html.

Thanks,
Jiangli

>
>> On 11/03/2014 01:35 PM, Eric McCorkle wrote:
>>> Please review this issue so that it can go in along with 8058322.
>>> Thanks.
>>>
>>> On 10/30/14 19:40, Eric McCorkle wrote:
>>>> Thank you for the pointers. I have applied your changes and refreshed
>>>> the webrev.
>>>>
>>>> http://cr.openjdk.java.net/~emc/8058313/
>>>>
>>>> Also, I have posted the test for this and another patch here:
>>>> http://cr.openjdk.java.net/~emc/8062556/
>>>>
>>>> On 10/30/14 13:51, Jiangli Zhou wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote:
>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote:
>>>>>>> Hi Eric,
>>>>>>>
>>>>>>> I wonder if we could specialize this particular case and avoid
>>>>>>> changing the parsing code. How about setting the
>>>>>>> _has_method_parameters flag in the ConstMethod when encountering
>>>>>>> such a MethodParameters attribute, and changing
>>>>>>> JVM_GetMethodParameters() to return a non-NULL value for the case
>>>>>>> when _has_method_parameters is true but method_parameters_length
>>>>>>> is 0. Would that work?
>>>>>> Which parser are you talking about? The inline tables parser, or the
>>>>>> class file parser.
>>>>>> The class file parser has to change, because it was
>>>>>> previously ignoring MethodParameters attributes with
>>>>>> parameter_count 0.
>>>>> It's the class parsing changes that I was referring to, mostly related
>>>>> to the initialization and checking against method_parameters_length.
>>>>> It's a bit awkward to include the 0 case but also skip it in the loop.
>>>>> For example, the following code in classFileParser.cpp changed ">" to
>>>>> ">=" in the if check, but that has no real effect and is not needed.
>>>>>
>>>>>   // Copy method parameters
>>>>>   if (method_parameters_length >= 0) {
>>>>>     MethodParametersElement* elem =
>>>>>       m->constMethod()->method_parameters_start();
>>>>>     for (int i = 0; i < method_parameters_length; i++) {
>>>>>       elem[i].name_cp_index = Bytes::get_Java_u2(method_parameters_data);
>>>>>       method_parameters_data += 2;
>>>>>       elem[i].flags = Bytes::get_Java_u2(method_parameters_data);
>>>>>       method_parameters_data += 2;
>>>>>     }
>>>>>   }
>>>>>
>>>>>
>>>>>> I don't think your proposal will work. The inline tables' offsets are
>>>>>> all dependent on what inline tables are actually present. If
>>>>>> _has_method_parameters is set, then the inline tables code expects the
>>>>>> last u2 of the inline tables to be a u2 indicating the number of
>>>>>> method parameters entries, preceded by the array of method parameters
>>>>>> data. If _has_method_parameters is false, then it expects that there
>>>>>> is no method parameters information at all (including no length
>>>>>> field). If you were to set _has_method_parameters, but not store any
>>>>>> information in the inline table, then it would cause errors for all
>>>>>> the rest of the inline tables.
>>>>> Thank you for reminding me of the complexity of the inlined table
>>>>> calculation in the ConstMethod. My proposal would require tweaks in
>>>>> that area to correctly compute the table sizes.
>>>>> As it's easy to introduce
>>>>> bugs in that area, it's not worth changing the table calculation code
>>>>> for this purpose. I agree my proposal is not a better choice in this
>>>>> case.
>>>>>
>>>>>> What I do for the parameter_count = 0 case is just store
>>>>>> a 0 u2 for zero-length method parameters information, and no data.
>>>>>> All the existing inline tables code works fine with this case, so
>>>>>> there aren't any serious changes to the inline tables code (other than
>>>>>> allowing method parameters information to be stored when the array is
>>>>>> length 0). But you have to make some change to the inline table code,
>>>>>> otherwise the information won't be stored.
>>>>> Ok. Could you please add comments to the change in constMethod.cpp to
>>>>> explain the above?
>>>>>
>>>>> In jvm.cpp, since -1 now represents no method parameters, maybe check
>>>>> against it explicitly and add comments for the 0-length case.
>>>>>
>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject
>>>>> method))
>>>>> {
>>>>>   ...
>>>>>   // No method parameter
>>>>>   if (num_params == -1) {
>>>>>     return (jobjectArray)NULL;
>>>>>   }
>>>>>
>>>>>   /* handle the rest here */
>>>>>   // make sure all the symbols are properly formatted
>>>>>   for (int i = 0; i < num_params; i++) {
>>>>>     ...
>>>>>   }
>>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>
>>>>>>> Thanks,
>>>>>>> Jiangli
>>>>>>>
>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Please review this fix for parameter reflection which addresses
>>>>>>>> hotspot falsely ignoring zero-length MethodParameters attributes.
>>>>>>>> The JVMS allows a MethodParameters attribute with
>>>>>>>> parameter_count = 0, and the parameter reflection spec states that
>>>>>>>> a MalformedParametersException should be thrown if parameter_count
>>>>>>>> does not match the number of real parameters to a method.
>>>>>>>> Hotspot currently ignores MethodParameters
>>>>>>>> attributes with parameter_count = 0; however, in a case where a
>>>>>>>> (bad) MethodParameters attribute has parameter_count = 0, but the
>>>>>>>> method has a nonzero number of real parameters, hotspot will return
>>>>>>>> null from JVM_GetMethodParameters, the result being that a
>>>>>>>> MalformedParametersException is not thrown (rather, the reflection
>>>>>>>> API acts like there is no MethodParameters attribute).
>>>>>>>>
>>>>>>>> This patch causes hotspot to record the fact that a zero-length
>>>>>>>> MethodParameters attribute does exist, causing the exception to be
>>>>>>>> thrown when it should be.
>>>>>>>>
>>>>>>>> The bug is here:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313
>>>>>>>>
>>>>>>>> The webrev is here:
>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/

From vladimir.kozlov at oracle.com Fri Nov 7 02:56:11 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 06 Nov 2014 18:56:11 -0800
Subject: [8u40] backport RFR(XS) 8059780: SPECjvm2008-MPEG performance regressions on x64 platforms
In-Reply-To: <7176BA2A-BFE3-4312-9E2E-05E3AE0841CE@oracle.com>
References: <545A9F16.8050107@oracle.com> <7176BA2A-BFE3-4312-9E2E-05E3AE0841CE@oracle.com>
Message-ID: <545C34CB.2010709@oracle.com>

Thank you, Roland

Vladimir

On 11/6/14 1:50 AM, Roland Westrelin wrote:
> Looks good to me.
>
> Roland.
>
>> On Nov 5, 2014, at 11:05 PM, Vladimir Kozlov wrote:
>>
>> Backport request. Changes were pushed into jdk9 2 days ago. Nightlies
>> are fine. Changes apply cleanly to 8u sources.
>>
>> http://cr.openjdk.java.net/~kvn/8059780/webrev/
>>
>> https://bugs.openjdk.java.net/browse/JDK-8059780
>>
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/b8bcacc8ccca
>>
>> Thanks,
>> Vladimir
>

From david.holmes at oracle.com Fri Nov 7 05:18:24 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 15:18:24 +1000
Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <545BF5CD.6010008@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com>
Message-ID: <545C5620.7080300@oracle.com>

Hi Serguei,

I think I get the gist of this approach but I'm not an expert on the JVM
TI or JDWP event model. My main concern would be how the delay to the
completion of cbVMDeath() might impact things - specifically if it might
be a lengthy delay?

Thanks,
David

On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote:
> Hi reviewers,
>
> I'm suggesting to review a modified fix:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/
>
>
> The 3-rd round fix is not right as it caused deadlocks in several tests
> (in nsk.jdi.testlist and jtreg com/sun/jdi).
>
> Here is a deadlock example:
>
> ----------------- lwp# 2 / thread# 2 --------------------
>  ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0)
>  ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, d4400, 0, ffffffff7e357440, 100138730) + 100
>  ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, 100137000, 0, 1004405d0, 6e750, 0) + a4
>  ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, 100137000, 0, 0, 1, 20000000) + 358
>  ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, 100137000, 1, deab, 60000000, 100137000) + c8
>  ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258
>  ffffffff7da2284c jvmtiError JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, ffffffffffffffff, 4, 9aeb0, 100137000) + 8c
>  ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c
>  ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, ffffffffffefd118, ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c
>  ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, 2) + 10c
>  ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + 138
>  ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360
>  ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, ffffffff7ab3ad18, 1018, 1000) + 1d8
>  ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534
>  ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, ffffffff7e3e6b70) + 30c
>  ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100
>  ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, ffffffff7e3e8b30, ffffffff7e357440, 0, 10013700) + 1bc
>  ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, 10035de68, 0, ffffffff7e4143b0) + 860
>  ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>
> ----------------- lwp# 12 / thread# 12 --------------------
>  ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0)
>  ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, d4400, 0, ffffffff7e357440, 100349930) + 100
>  ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258
>  ffffffff7da22450 jvmtiError JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0
>  ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38
>  ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c
>  ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0
>  ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54
>  ffffffff7da56b18 void JvmtiAgentThread::call_start_function() (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + 128
>  ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, 3d8, 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90
>  ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, fffffffea5f3e048, 3d8, 1003497f8) + 3ac
>  ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4
>  ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>
> ----------------- lwp# 13 / thread# 13 --------------------
>  ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0)
>  ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, d4400, 0, ffffffff7e357440, 10034d330) + 100
>  ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100
>  ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, 10034c000) + e0
>  ffffffff7da2284c jvmtiError JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c
>  ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c
>  ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48
>  ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, 0, 10000000) + ac
>  ffffffff7da56b18 void JvmtiAgentThread::call_start_function() (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + 128
>  ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, 3d8, 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90
>  ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, fffffffea5f3e290, 3d8, 10034cfe8) + 3ac
>  ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4
>  ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>
>
> The details:
>   - Thread #2: The cbVMDeath() event handler is waiting on the
>     commandCompleteLock in the enqueueCommand().
>     The call chain is:
>       cbVMDeath() -> event_callback() -> reportEvents() ->
>       eventHelper_reportEvents() -> enqueueCommand().
>     The enqueueCommand() depends on the commandLoop() that has to call
>     completeCommand(command) for the command being enqueued.
>     This has not been set yet: gdata->vmDead = JNI_TRUE
>
>   - Thread #12: The debugLoop_run blocked on the vmDeathLock enter
>
>   - Thread #13: The commandLoop is waiting on the blockCommandLoopLock
>     in the doBlockCommandLoop().
>     It is because blockCommandLoop == JNI_TRUE which is set in the
>     needBlockCommandLoop()
>     if the following condition is true:
>         (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE &&
>          cmd->u.reportEventComposite.suspendPolicy ==
>              JDWP_SUSPEND_POLICY(ALL))
>
>
> It seems, the debugLoop_run() block on the vmDeathLock causes the
> commandLoop() to wait indefinitely.
> The cbVMDeath() can not proceed because the commandLoop() does not make
> a progress.
>
> The vmDeathLock critical section in the cbVMDeath() event callback seems
> to be an overkill (unnecessary).
> A less intrusive synchronization is required here which is to wait until
> the current command is completed
> before returning to the JvmtiExport::post_vm_death().
>
> The new approach (see new webrev) is to extend the resumeLock
> synchronization pattern
> to all VirtualMachine set of commands, not only the resume command.
> The resumeLock name is replaced with the vmDeathLock to reflect new
> semantics.
>
> In general, we could consider to do the same for the rest of the JDWP
> command sets.
> But it is better to be careful and see how this change goes first.
>
>
> Thanks,
> Serguei
>
>
> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> Thank you for the concerns!
>> Testing showed several tests failing with deadlocks.
>> Scenarios are similar to that you describe.
>>
>> Trying to understand the details.
>>
>> Thanks,
>> Serguei
>>
>> On 11/4/14 4:09 PM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 11/2/14 8:58 PM, David Holmes wrote:
>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
>>>>>> Serguei,
>>>>>>
>>>>>> Thank you for good finding. This approach looks much better for me.
>>>>>>
>>>>>> The fix looks good.
>>>>>>
>>>>>> Is it necessary to release vmDeathLock locks at
>>>>>> eventHandler.c:1244 before call
>>>>>>
>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
>>>>>
>>>>> I agree this looks necessary, or at least more clean (if things are
>>>>> failing we really don't know what is happening).
>>>>
>>>> Agreed (replied to Dmitry).
>>>>
>>>>>
>>>>> More generally I'm concerned about whether any of the code paths taken
>>>>> while holding the new lock can result in deadlock - in particular with
>>>>> regard to the resumeLock ?
>>>>
>>>> The cbVMDeath() function never holds both vmDeathLock and resumeLock at
>>>> the same time,
>>>> so there is no chance for a deadlock that involves both these locks.
>>>>
>>>> Two more locks used in the cbVMDeath() are the callbackBlock and
>>>> callbackLock.
>>>> These two locks look completely unrelated to the debugLoop_run().
>>>>
>>>> The debugLoop_run() function also uses the cmdQueueLock.
>>>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at
>>>> the same time.
>>>>
>>>> So that I do not see any potential to introduce new deadlock with the
>>>> vmDeathLock.
>>>>
>>>> However, it is still easy to overlook something here.
>>>> Please, let me know if you see any danger.
>>>
>>> I was mainly concerned about what might happen in the call chain for
>>> threadControl_resumeAll() (it certainly sounds like it might need to
>>> use a resumeLock :) ). I see direct use of the threadLock and
>>> indirectly the eventHandler lock; but there are further call paths I
>>> did not explore. Wish there was an easy way to determine the
>>> transitive closure of all locks used from a given call.
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>> -Dmitry

From david.holmes at oracle.com Fri Nov 7 06:14:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 16:14:06 +1000
Subject: Review request: JDK-8062556: Add jdk tests for JDK-8058322 and JDK-8058313
In-Reply-To: <545BB72A.1070204@oracle.com>
References: <5452CCA3.5040001@oracle.com> <5452ECF5.7020607@oracle.com> <54539625.7070302@oracle.com> <54570981.2090601@oracle.com> <5457A270.9010009@oracle.com> <545BB72A.1070204@oracle.com>
Message-ID: <545C632E.8080900@oracle.com>

I have no further comments.

David

On 7/11/2014 4:00 AM, Eric McCorkle wrote:
> Are there any concerns about the tests, other than the broken webrevs?
>
> On 11/03/14 10:42, Eric McCorkle wrote:
>> I have been having issues with webrev, which I reported earlier. Webrev
>> reports a syntax error when I try to use it, and curiously, it fails to
>> produce top-level files in this case (and this case only, as evidenced
>> by my other webrevs).
>>
>> Unfortunately, there's nothing I can do about the missing top-level
>> files; however, you can still look at the individual files just fine.
>>
>> On 11/02/14 23:50, David Holmes wrote:
>>> Hi Erik,
>>>
>>> webrevs still broken for some reason.
>>>
>>> On 1/11/2014 12:01 AM, Eric McCorkle wrote:
>>>> I went through and added comments in the binary data indicating where
>>>> the MethodParameters attributes are, and a breakdown of their contents.
>>>> I went ahead and did this for all the bad class files, not just the new
>>>> ones.
>>>>
>>>> There is a larger picture here: there's an outstanding task I filed
>>>> around the time these tests were written to find a better way for
>>>> langtools to run jtreg tests that involve bad class files.
>>>> Unfortunately, doing that is rather difficult, as you can see. The only
>>>> real way to do it is to generate a class file, convert it to signed
>>>> bytes (you can't even use hex; you get an unsigned/signed byte
>>>> conversion problem), then modify the data by hand. The intent is to
>>>> replace this with a better method at some point.
>>>
>>> OK. New comments an improvement.
>>>
>>> Please give the new test the correct initial copyright year of 2014. I
>>> know updates to the year are handled automatically (eventually) but we
>>> should at least have things correct to start with.
>>>
>>> Thanks,
>>> David
>>>
>>>> On 10/30/14 21:59, David Holmes wrote:
>>>>> Hi Erik,
>>>>>
>>>>> On 31/10/2014 9:41 AM, Eric McCorkle wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Please review this patch which adds tests to the JDK test suite for two
>>>>>> reflection bugs that require hotspot changes (JDK-8058322 and
>>>>>> JDK-8058313)
>>>>>>
>>>>>> The webrev is here:
>>>>>> http://cr.openjdk.java.net/~emc/8062556/
>>>>>
>>>>> I second Brian's comment re the source of the bad classes.
>>>>>
>>>>> Your webrev is broken btw - no top-level html files.
>>>>>
>>>>> The new test needs a copyright year of 2014 not 2013.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>

From david.holmes at oracle.com Fri Nov 7 07:13:10 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 17:13:10 +1000
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To:
References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com>
Message-ID: <545C7106.2080602@oracle.com>

Hi Thomas,

On 6/11/2014 11:21 PM, Thomas Stüfe wrote:
> Hi David,
>
> Our intent was to always guarantee that the written string is zero
> terminated, and across platforms to always return the same value (-1) if
> truncation happened.
>
> The original jio_snprintf() did not have any value over plain snprintf()
> (apart from maybe solving the name problem with "_snprintf" on Windows).
> But having a dedicated wrapper function around snprintf() suggests some
> added value, and I always thought that value was supposed to be zero
> termination. If that is not intended, it would be better to use the
> plain snprintf instead, because at least C programmers then know what to
> expect.
>
> I also found very few cases where the return code of jio_snprintf() was
> actually checked and truncation handled correctly. Which would be
> difficult too, because the return code differed for truncation between
> Windows and Posix.
>
> Bottom line, I think it would be better if jio_snprintf() were to always
> zero-terminate, guaranteed.

I don't disagree, but we needed to make sure that everyone was on the
same page before this change occurred. It is not unreasonable that
Windows developers writing windows code assumed windows semantics.
As a Reviewer I should have paid more attention to this aspect. Also if we are going to implement this behaviour then it really needs to be clearly documented - both in terms of null-termination and return value - and commented in the code (anyone knowing *NIX or Windows behaviour alone will be quite perplexed by it I think). Further, returning -1 to indicate truncation while familiar to windows programmers might be frustrating to others who would like to know what size buffer was needed - lowest common denominator I know :( And as you note most uses of this function have a "don't care" attitude to truncation - which makes it hard to spot if there may be other lurking truncation issues in the windows code. Cheers, David > Kind Regards, Thomas > > > On Thu, Nov 6, 2014 at 12:23 PM, David Holmes > wrote: > > On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > > Hi David, > > yes, windows does not null terminate if there is an overflow. > Obviously there are overflows, and they now see one less > character. I think this should be fixed where jio_vsnprintf > is called. Having non-null terminated strings isn't nice. > > > I think it depends on what you consider an overflow. If the buffer > is already null terminated and you pass in a count that covers up to > the location before the null then there is no problem - except now > the logic will introduce a second null in place of the last character. > > But for now I will roll back this single change. I'll send a > RFR soon. > > Where did you see the problem? > > > It was in our closed code so I can't go into details. We have a > non-public bug number: 8063089 > > Thanks, > > David > > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Thursday, 6.
November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > > Cc: Markus Grönlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: > > Thanks David, I'll have a look. > > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if > we have > existing code that works with that then the fix is now > overwriting the > last character. I can't quite see how to handle this in a cross > platform > manner, but in the immediate term we should probably revert that > part of > the changeset. > > David > > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Thursday, 6. November 2014 11:09 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > > Cc: Markus Grönlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > This change has introduced a bug: > > - return vsnprintf(str, count, fmt, args); > + > + int result = vsnprintf(str, count, fmt, args); > + if ((result > 0 && (size_t)result >= count) || result == > -1) { > + str[count - 1] = '\0'; > + result = -1; > + } > + > + return result; > > some strings are getting their last character truncated on > Windows. > > David > > On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: > > Hi David, > > thanks for looking at the change! I fixed the issue in > a new > webrev: > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ > > > Best regards, > Goetz. > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Wednesday, 5. November 2014 02:49 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > > Subject: Re: RFR (L): 8062370: Various minor code > improvements > > Hi Goetz, > > The only issue I see is in: > > src/share/vm/runtime/globals.cpp > > where you replaced NEW_C_HEAP_ARRAY with os::strdup.
To > keep the "abort > on OOM" semantics of NEW_C_HEAP_ARRAY you need to use > os::strdup_check_oom. > > Thanks, > David > > On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: > > Hi, > > this change contains a series of minor code > improvements we did to fulfil > our internal quality requirements. We would like to > share these with > openJDK. > > Please review and test this change. I need a > sponsor, please. > http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8062370 > > > We tested this on windows 64, linux x86_64, mac, > solaris sparc 32+64 bit and, > of course, the ppc platforms. > > > Some details: > > CONST64(0x8000000000000000) is wrong, as 0x8... is > positive, and thus not representable as i64, which is > used in the CONST64 macro. This change adapts > UCONST64 to use ui64, and the usages of these macros > where necessary. > > We add some more strncpy uses. Also, we fix strncpy > on windows. There, strncpy does not write a \0 into > the last byte if the copied string is too long. > > We add some missing memory frees and some closing of > files. > > jio_vsnprintf() works differently on windows and > linux. This change adapts this to show the same > behaviour on all platforms. See java.cpp.
> > Best regards, > > Goetz > > > > > From thomas.stuefe at gmail.com Fri Nov 7 09:08:14 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 7 Nov 2014 10:08:14 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545C7106.2080602@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <545C7106.2080602@oracle.com> Message-ID: Hi David On Fri, Nov 7, 2014 at 8:13 AM, David Holmes wrote: > Hi Thomas, > > On 6/11/2014 11:21 PM, Thomas Stüfe wrote: > >> Hi David, >> >> Our intent was to always guarantee that the written string is zero >> terminated, and across platforms to always return the same value (-1) if >> truncation happened. >> >> The original jio_snprintf() did not have any value over plain snprintf() >> (apart from maybe solving the name problem with "_snprintf" on Windows). >> But having a dedicated wrapper function around snprintf() suggests some >> added value, and I always thought that value was supposed to be zero >> termination. If that is not intended, it would be better to use the >> plain snprintf instead, because at least C programmers then know what to >> expect. >> >> I also found very few cases where the return code of jio_snprintf() was >> actually checked and truncation handled correctly. Which would be >> difficult too, because the return code differed for truncation between >> windows and Posix. >> >> Bottom line, I think it would be better if jio_snprintf() were to always >> zero-terminate, guaranteed. >> > > I don't disagree, but we needed to make sure that everyone was on the same > page before this change occurred.
It is not unreasonable that Windows > developers writing windows code assumed windows semantics. As a Reviewer I > should have paid more attention to this aspect. > > I understand this. In the SAP JVM we have regression tests for C/C++ code, similar to jprt, but at the C function level. Nothing fancy, just some big test functions which test our C APIs for regressions like this. That code is just compiled into the hotspot and can be executed with a command line switch, but gets excluded in release builds. Is there something similar for the OpenJDK? If yes, I would provide test functions for jio_snprintf. If no, would it be worth contributing? > Also if we are going to implement this behaviour then it really needs to > be clearly documented - both in terms of null-termination and return value > - and commented in the code (anyone knowing *NIX or Windows behaviour alone > will be quite perplexed by it I think). You are right, we should add comments. > Further, returning -1 to indicate truncation while familiar to windows > programmers might be frustrating to others who would like to know what size > buffer was needed - lowest common denominator I know :( > > As you said, lowest common denominator. Not easy to implement unless you implement the whole printf routine yourself (we have done this in our tracing subsystem to get unified behaviour on all platforms). I also did not find any usage like this. > And as you note most uses of this function have a "don't care" attitude to > truncation - which makes it hard to spot if there may be other lurking > truncation issues in the windows code. > Unfortunately I think this is not Windows-specific. Would it be worth asserting this case (assert if vsnprintf returns exactly count bytes)?
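For reference, a minimal sketch of the semantics discussed in this thread: always null-terminate, and return -1 on truncation regardless of platform. It assumes a C99-style vsnprintf (which returns the would-be length on truncation) but also treats a negative result, as returned by the older Windows _vsnprintf, as truncation. The names checked_vsnprintf and checked_snprintf are hypothetical; this is not the HotSpot jio_vsnprintf code.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper (illustration only, not the HotSpot implementation).
 * Returns the number of characters written, excluding the terminator,
 * or -1 if the output was truncated or an error occurred.
 * On return with count > 0, str is always null-terminated. */
int checked_vsnprintf(char *str, size_t count, const char *fmt, va_list args) {
    int result;

    if (str == NULL || count == 0) {
        return -1;              /* nowhere to put even the terminator */
    }

    result = vsnprintf(str, count, fmt, args);

    /* C99: result >= count means truncation.
     * Older Windows runtimes: a negative result means truncation. */
    if (result < 0 || (size_t)result >= count) {
        str[count - 1] = '\0';  /* overwrites the last byte of the buffer */
        return -1;
    }
    return result;
}

/* Variadic convenience form of the wrapper above. */
int checked_snprintf(char *str, size_t count, const char *fmt, ...) {
    va_list args;
    int result;

    va_start(args, fmt);
    result = checked_vsnprintf(str, count, fmt, args);
    va_end(args);
    return result;
}
```

Note that count here must be the full buffer size including room for the terminator. A caller that passes a count which deliberately excludes an already-present terminator would lose its last character, which matches the regression described earlier in the thread.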
Kind Regards, Thomas From serguei.spitsyn at oracle.com Fri Nov 7 09:48:11 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 07 Nov 2014 01:48:11 -0800 Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545C5620.7080300@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> <545C5620.7080300@oracle.com> Message-ID: <545C955B.3070209@oracle.com> Hi David, On 11/6/14 9:18 PM, David Holmes wrote: > Hi Serguei, > > I think I get the gist of this approach but I'm not an expert on the > JVM TI or JDWP event model. My main concern would be how the delay to > the completion of cbVMDeath() might impact things - specifically if it > might be a lengthy delay? 1. At the beginning the VirtualMachine commands check whether gdata->vmDead is true and, in that case, just return with the JDWP_ERROR(VM_DEAD) error or quietly. Normally, the cbVMDeath event callback needs to wait for just one command. Please see VirtualMachine.c and the following comment in debugLoop_run(): } else if (gdata->vmDead && ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) { /* Protect the VM from calls while dead. * VirtualMachine cmdSet quietly ignores some cmds * after VM death, so, it sends it's own errors. */ outStream_setError(&out, JDWP_ERROR(VM_DEAD)); } else { 2. We do not have many choices. Without synchronizing on command completion we will continue getting WRONG_PHASE errors intermittently. Another choice is to use the already reviewed ignore_wrong_phase workaround. Note that the workaround does not work for all the commands. I understand, we need to make sure nothing is broken if we choose this approach. :) 3. What delay would you consider lengthy: 1 sec, 10 sec, 1 min.?
For instance, I can add 10 sec sleep to provoke the command execution delay and see what can be broken. With 1 min sleep I see some timeouts in the jtreg com/sun/jdi tests though which is probably Ok. Thanks, Serguei > > Thanks, > David > > On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote: >> Hi reviewers, >> >> I'm suggesting to review a modified fix: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/ >> >> >> >> The 3-rd round fix is not right as it caused deadlocks in several tests >> (in nsk.jdi.testlist and jtreg com/sun/jdi). >> >> Here is a deadlock example: >> >> ----------------- lwp# 2 / thread# 2 -------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, >> d4400, 0, ffffffff7e357440, 100138730) + 100 >> ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, >> 100137000, 0, 1004405d0, 6e750, 0) + a4 >> ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, >> 100137000, 0, 0, 1, 20000000) + 358 >> ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, >> 100137000, 1, deab, 60000000, 100137000) + c8 >> ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 >> ffffffff7da2284c jvmtiError >> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, >> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c >> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c >> ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, ffffffffffefd118, >> ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c >> ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, 2) >> + 10c >> ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + >> 138 >> ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, >> ffffffff7aa23150, ffffffff7aa376a0, 
ffffffff7ab3ad18, 100441ad0) + 360 >> ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, >> ffffffff7ab3ad18, 1018, 1000) + 1d8 >> ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, >> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 >> ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, >> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, >> ffffffff7e3e6b70) + 30c >> ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, >> ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 >> ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, ffffffff7e3e8b30, >> ffffffff7e357440, 0, 10013700) + 1bc >> ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, >> 10035de68, 0, ffffffff7e4143b0) + 860 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> ----------------- lwp# 12 / thread# 12 -------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, >> d4400, 0, ffffffff7e357440, 100349930) + 100 >> ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, >> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 >> ffffffff7da22450 jvmtiError >> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, >> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 >> ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, >> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 >> ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, >> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c >> ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, >> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 >> ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, >> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 >> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + >> 128 >> 
ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, 3d8, >> 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 >> ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, >> fffffffea5f3e048, 3d8, 1003497f8) + 3ac >> ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, >> ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> ----------------- lwp# 13 / thread# 13 -------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, >> d4400, 0, ffffffff7e357440, 10034d330) + 100 >> ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) >> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 >> ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, 10034c000) >> + e0 >> ffffffff7da2284c jvmtiError >> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, >> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c >> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c >> ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, >> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 >> ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, >> 0, 10000000) + ac >> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + >> 128 >> ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, 3d8, >> 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 >> ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, >> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac >> ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, >> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> >> The details: >> - Thread #2: The cbVMDeath() 
event handler is waiting on the >> commandCompleteLock in the enqueueCommand(). >> The call chain is: >> cbVMDeath() -> event_callback() -> reportEvents() -> >> eventHelper_reportEvents() -> enqueueCommand(). >> The enqueueCommand() depends on the commandLoop() that has to call >> completeCommand(command) for the command being enqueued. >> This has not been set yet: gdata->vmDead = JNI_TRUE >> >> - Thread #12: The debugLoop_run blocked on the vmDeathLock enter >> >> - Thread #13: The commandLoop is waiting on the blockCommandLoopLock >> in the doBlockCommandLoop(). >> It is because blockCommandLoop == JNI_TRUE which is set in the >> needBlockCommandLoop() >> if the following condition is true: >> (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE && >> cmd->u.reportEventComposite.suspendPolicy == >> JDWP_SUSPEND_POLICY(ALL)) >> >> >> It seems, the debugLoop_run() block on the vmDeathLock causes the >> commandLoop() to wait indefinitely. >> The cbVMDeath() can not proceed because the commandLoop() does not make >> a progress. >> >> The vmDeathLock critical section in the cbVMDeath() event callback seems >> to be an overkill (unnecessary). >> A less intrusive synchronization is required here which is to wait until >> the current command is completed >> before returning to the JvmtiExport::post_vm_death(). >> >> The new approach (see new webrev) is to extend the resumeLock >> synchronization pattern >> to all VirtualMachine set of commands, not only the resume command. >> The resumeLock name is replaced with the vmDeathLock to reflect new >> semantics. >> >> In general, we could consider to do the same for the rest of the JDWP >> command sets. >> But it is better to be careful and see how this change goes first. >> >> >> Thanks, >> Serguei >> >> >> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> Thank you for the concerns! >>> Testing showed several tests failing with deadlocks. >>> Scenarios are similar to that you describe. 
>>> >>> Trying to understand the details. >>> >>> Thanks, >>> Serguei >>> >>> On 11/4/14 4:09 PM, David Holmes wrote: >>>> Hi Serguei, >>>> >>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 11/2/14 8:58 PM, David Holmes wrote: >>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>>>> Serguei, >>>>>>> >>>>>>> Thank you for good finding. This approach looks much better for me. >>>>>>> >>>>>>> The fix looks good. >>>>>>> >>>>>>> Is it necessary to release vmDeathLock locks at >>>>>>> eventHandler.c:1244 before call >>>>>>> >>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>>>> >>>>>> I agree this looks necessary, or at least more clean (if things are >>>>>> failing we really don't know what is happening). >>>>> >>>>> Agreed (replied to Dmitry). >>>>> >>>>>> >>>>>> More generally I'm concerned about whether any of the code paths >>>>>> taken >>>>>> while holding the new lock can result in deadlock - in particular >>>>>> with >>>>>> regard to the resumeLock ? >>>>> >>>>> The cbVMDeath() function never holds both vmDeathLock and >>>>> resumeLock at >>>>> the same time, >>>>> so there is no chance for a deadlock that involves both these locks. >>>>> >>>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>>> callbackLock. >>>>> These two locks look completely unrelated to the debugLoop_run(). >>>>> >>>>> The debugLoop_run() function also uses the cmdQueueLock. >>>>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at >>>>> the >>>>> same time. >>>>> >>>>> So that I do not see any potential to introduce new deadlock with the >>>>> vmDeathLock. >>>>> >>>>> However, it is still easy to overlook something here. >>>>> Please, let me know if you see any danger. >>>> >>>> I was mainly concerned about what might happen in the call chain for >>>> threadControl_resumeAll() (it certainly sounds like it might need to >>>> use a resumeLock :) ). 
I see direct use of the threadLock and >>>> indirectly the eventHandler lock; but there are further call paths I >>>> did not explore. Wish there was an easy way to determine the >>>> transitive closure of all locks used from a given call. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> David >>>>>> >>>>>>> -Dmitry >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>>> >>>>>>>> It is 3-rd round of review for: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>> >>>>>>>> New webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary >>>>>>>> >>>>>>>> For failing scenario, please, refer to the 1-st round RFR >>>>>>>> below. >>>>>>>> >>>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>>> decided to >>>>>>>> switch from a workaround to a real fix. >>>>>>>> >>>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>>> gdata->vmDead = 1. >>>>>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>>>>> >>>>>>>> 165 } else if (gdata->vmDead && >>>>>>>> 166 ((cmd->cmdSet) != >>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>>> 168 * VirtualMachine cmdSet quietly ignores >>>>>>>> some >>>>>>>> cmds >>>>>>>> 169 * after VM death, so, it sends it's own >>>>>>>> errors. >>>>>>>> 170 */ >>>>>>>> 171 outStream_setError(&out, >>>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>>> >>>>>>>> >>>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>>> event >>>>>>>> happens in the middle of a command execution. >>>>>>>> There is a lack of synchronization here. >>>>>>>> >>>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>>> allow to >>>>>>>> execute the commands >>>>>>>> and the VM_DEATH event callback concurrently. 
>>>>>>>> It should work well for any function that is used in >>>>>>>> implementation of >>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>>>>>> >>>>>>>> >>>>>>>> Testing: >>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>> tests >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> The updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The changes are: >>>>>>>>> - added a comment recommended by Staffan >>>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>>> classSignature() >>>>>>>>> >>>>>>>>> The classSignature() function is called in 16 places. >>>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>>> signature >>>>>>>>> and will crash. >>>>>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>>>>> return to this >>>>>>>>> issue after gaining experience with more failure cases that are >>>>>>>>> still >>>>>>>>> expected. >>>>>>>>> The failure with the classSignature() involved was observed only >>>>>>>>> once >>>>>>>>> in the nightly >>>>>>>>> and should be extremely rare reproducible. >>>>>>>>> I'll file a placeholder bug if necessary. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Please, review the fix for: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Open webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> The failing scenario: >>>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>>> shutdown has >>>>>>>>>> been started in the target process. 
>>>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>>> commands >>>>>>>>>> to the JDWP agent. >>>>>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>>>>> (debuggee side) >>>>>>>>>> are not in sync with the consumer layers. >>>>>>>>>> >>>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>>> the JDI >>>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>>> processes >>>>>>>>>> are uneasy to sync in general. >>>>>>>>>> >>>>>>>>>> As a result the following steps are possible: >>>>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>>>> debuggee >>>>>>>>>> - The debuggee is normally exiting >>>>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>>>> anonymous class unload event >>>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>>> ClassUnloadEvent event >>>>>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>>>>> reference type. >>>>>>>>>> If there is more than one class with the same host >>>>>>>>>> class >>>>>>>>>> signature, it can't distinguish them, >>>>>>>>>> and so, deletes all references and re-retrieves them >>>>>>>>>> again >>>>>>>>>> (see tracing below): >>>>>>>>>> MY_TRACE: JDI: >>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>>>>> - The jdwp backend debugLoop_run() gets the command >>>>>>>>>> from JDI >>>>>>>>>> and calls the functions >>>>>>>>>> classesForSignature() and classStatus() recursively. 
>>>>>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>>>>> GetClassStatus() >>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>>>>> - As a result the jdwp backend reports the JVMTI error >>>>>>>>>> to the >>>>>>>>>> JDI, and so, the test fails >>>>>>>>>> >>>>>>>>>> For details, see the analysis in bug report closed as a >>>>>>>>>> dup of >>>>>>>>>> the bug 6988950: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>>>>> >>>>>>>>>> Some similar cases can be found in the two bug reports >>>>>>>>>> (6988950 >>>>>>>>>> and 8024865) describing this issue. >>>>>>>>>> >>>>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE >>>>>>>>>> error >>>>>>>>>> as it is normal at the VM shutdown. >>>>>>>>>> The original jdwp backend implementation had a similar >>>>>>>>>> approach >>>>>>>>>> for the raw monitor functions. >>>>>>>>>> Threy use the ignore_vm_death() to workaround the >>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>>>>>> For reference, please, see the file: src/share/back/util.c >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>>>> tests >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>> >>> >> From david.holmes at oracle.com Fri Nov 7 09:59:53 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 07 Nov 2014 19:59:53 +1000 Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545C955B.3070209@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> <545C5620.7080300@oracle.com> <545C955B.3070209@oracle.com> Message-ID: <545C9819.5050806@oracle.com> On 7/11/2014 7:48 PM, serguei.spitsyn at oracle.com wrote: > Hi 
David, > > > On 11/6/14 9:18 PM, David Holmes wrote: >> Hi Serguei, >> >> I think I get the gist of this approach but I'm not an expert on the >> JVM TI or JDWP event model. My main concern would be how the delay to >> the completion of cbVMDeath() might impact things - specifically if it >> might be a lengthy delay? > > 1. At the beginning the VirtualMachine comands check if gdata->vmDead is > true > and in such case just return with the JDWP_ERROR(VM_DEAD) error or > quietly. > Normally, the cbVMDeath event callback needs to wait for just one > command. > > Please, see the VirtualMachine.c and the following comment in > debugLoop_run(): > > } else if (gdata->vmDead && > ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) { > /* Protect the VM from calls while dead. > * VirtualMachine cmdSet quietly ignores some cmds > * after VM death, so, it sends it's own errors. > */ > outStream_setError(&out, JDWP_ERROR(VM_DEAD)); > } else { > > > 2. We do not have many choices. > Without a sync on a command completeness we will continue getting > WRONG_PHASE errors intermittently. > Another choice is to use already reviewed ignore_wrong_phase > workaround. > Note, the workaround works Ok not for all the commands. > I understand, we need to make sure nothing is broken if we choose > this approach. :) > > 3. What delay would you consider lengthy: 1 sec, 10 sec, 1 min.? Anything that causes something unexpected to happen :) I'm just looking at the code and thinking what might go wrong. Really all we can do is try this and see. Thanks, David > For instance, I can add 10 sec sleep to provoke the command > execution delay and see what can be broken. > With 1 min sleep I see some timeouts in the jtreg com/sun/jdi tests > though which is probably Ok. 
> > Thanks, > Serguei > >> >> Thanks, >> David >> >> On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote: >>> Hi reviewers, >>> >>> I'm suggesting to review a modified fix: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/ >>> >>> >>> >>> The 3-rd round fix is not right as it caused deadlocks in several tests >>> (in nsk.jdi.testlist and jtreg com/sun/jdi). >>> >>> Here is a deadlock example: >>> >>> ----------------- lwp# 2 / thread# 2 -------------------- >>> ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) >>> ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, >>> d4400, 0, ffffffff7e357440, 100138730) + 100 >>> ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, >>> 100137000, 0, 1004405d0, 6e750, 0) + a4 >>> ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, >>> 100137000, 0, 0, 1, 20000000) + 358 >>> ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, >>> 100137000, 1, deab, 60000000, 100137000) + c8 >>> ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 >>> ffffffff7da2284c jvmtiError >>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, >>> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c >>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c >>> ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, ffffffffffefd118, >>> ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c >>> ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, 2) >>> + 10c >>> ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + >>> 138 >>> ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, >>> ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360 >>> ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, >>> ffffffff7ab3ad18, 1018, 1000) + 1d8 >>> 
ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, >>> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 >>> ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, >>> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, >>> ffffffff7e3e6b70) + 30c >>> ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, >>> ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 >>> ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, ffffffff7e3e8b30, >>> ffffffff7e357440, 0, 10013700) + 1bc >>> ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, >>> 10035de68, 0, ffffffff7e4143b0) + 860 >>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>> >>> ----------------- lwp# 12 / thread# 12 -------------------- >>> ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) >>> ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, >>> d4400, 0, ffffffff7e357440, 100349930) + 100 >>> ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, >>> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 >>> ffffffff7da22450 jvmtiError >>> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, >>> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 >>> ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, >>> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 >>> ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, >>> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c >>> ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, >>> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 >>> ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, >>> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 >>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + >>> 128 >>> ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, 3d8, >>> 1003497f8, 100349420, ffffffff5b2ff9f8, 0) 
+ 90 >>> ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, >>> fffffffea5f3e048, 3d8, 1003497f8) + 3ac >>> ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, >>> ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 >>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>> >>> ----------------- lwp# 13 / thread# 13 -------------------- >>> ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) >>> ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, >>> d4400, 0, ffffffff7e357440, 10034d330) + 100 >>> ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) >>> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 >>> ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, 10034c000) >>> + e0 >>> ffffffff7da2284c jvmtiError >>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, >>> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c >>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c >>> ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, >>> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 >>> ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, >>> 0, 10000000) + ac >>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + >>> 128 >>> ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, 3d8, >>> 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 >>> ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, >>> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac >>> ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, >>> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 >>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>> >>> >>> The details: >>> - Thread #2: The cbVMDeath() event handler is waiting on the >>> commandCompleteLock in the enqueueCommand(). 
>>> The call chain is: >>> cbVMDeath() -> event_callback() -> reportEvents() -> >>> eventHelper_reportEvents() -> enqueueCommand(). >>> The enqueueCommand() depends on the commandLoop() that has to call >>> completeCommand(command) for the command being enqueued. >>> This has not been set yet: gdata->vmDead = JNI_TRUE >>> >>> - Thread #12: The debugLoop_run blocked on the vmDeathLock enter >>> >>> - Thread #13: The commandLoop is waiting on the blockCommandLoopLock >>> in the doBlockCommandLoop(). >>> It is because blockCommandLoop == JNI_TRUE which is set in the >>> needBlockCommandLoop() >>> if the following condition is true: >>> (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE && >>> cmd->u.reportEventComposite.suspendPolicy == >>> JDWP_SUSPEND_POLICY(ALL)) >>> >>> >>> It seems, the debugLoop_run() block on the vmDeathLock causes the >>> commandLoop() to wait indefinitely. >>> The cbVMDeath() can not proceed because the commandLoop() does not make >>> a progress. >>> >>> The vmDeathLock critical section in the cbVMDeath() event callback seems >>> to be an overkill (unnecessary). >>> A less intrusive synchronization is required here which is to wait until >>> the current command is completed >>> before returning to the JvmtiExport::post_vm_death(). >>> >>> The new approach (see new webrev) is to extend the resumeLock >>> synchronization pattern >>> to all VirtualMachine set of commands, not only the resume command. >>> The resumeLock name is replaced with the vmDeathLock to reflect new >>> semantics. >>> >>> In general, we could consider to do the same for the rest of the JDWP >>> command sets. >>> But it is better to be careful and see how this change goes first. >>> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi David, >>>> >>>> Thank you for the concerns! >>>> Testing showed several tests failing with deadlocks. >>>> Scenarios are similar to that you describe. 
>>>> >>>> Trying to understand the details. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> On 11/4/14 4:09 PM, David Holmes wrote: >>>>> Hi Serguei, >>>>> >>>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>> On 11/2/14 8:58 PM, David Holmes wrote: >>>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>>>>> Serguei, >>>>>>>> >>>>>>>> Thank you for good finding. This approach looks much better for me. >>>>>>>> >>>>>>>> The fix looks good. >>>>>>>> >>>>>>>> Is it necessary to release vmDeathLock locks at >>>>>>>> eventHandler.c:1244 before call >>>>>>>> >>>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>>>>> >>>>>>> I agree this looks necessary, or at least more clean (if things are >>>>>>> failing we really don't know what is happening). >>>>>> >>>>>> Agreed (replied to Dmitry). >>>>>> >>>>>>> >>>>>>> More generally I'm concerned about whether any of the code paths >>>>>>> taken >>>>>>> while holding the new lock can result in deadlock - in particular >>>>>>> with >>>>>>> regard to the resumeLock ? >>>>>> >>>>>> The cbVMDeath() function never holds both vmDeathLock and >>>>>> resumeLock at >>>>>> the same time, >>>>>> so there is no chance for a deadlock that involves both these locks. >>>>>> >>>>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>>>> callbackLock. >>>>>> These two locks look completely unrelated to the debugLoop_run(). >>>>>> >>>>>> The debugLoop_run() function also uses the cmdQueueLock. >>>>>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock at >>>>>> the >>>>>> same time. >>>>>> >>>>>> So that I do not see any potential to introduce new deadlock with the >>>>>> vmDeathLock. >>>>>> >>>>>> However, it is still easy to overlook something here. >>>>>> Please, let me know if you see any danger. >>>>> >>>>> I was mainly concerned about what might happen in the call chain for >>>>> threadControl_resumeAll() (it certainly sounds like it might need to >>>>> use a resumeLock :) ). 
I see direct use of the threadLock and >>>>> indirectly the eventHandler lock; but there are further call paths I >>>>> did not explore. Wish there was an easy way to determine the >>>>> transitive closure of all locks used from a given call. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> >>>>>>> David >>>>>>> >>>>>>>> -Dmitry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>>>> >>>>>>>>> It is 3-rd round of review for: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>> >>>>>>>>> New webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary >>>>>>>>> >>>>>>>>> For failing scenario, please, refer to the 1-st round RFR >>>>>>>>> below. >>>>>>>>> >>>>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>>>> decided to >>>>>>>>> switch from a workaround to a real fix. >>>>>>>>> >>>>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>>>> gdata->vmDead = 1. >>>>>>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>>>>>> >>>>>>>>> 165 } else if (gdata->vmDead && >>>>>>>>> 166 ((cmd->cmdSet) != >>>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>>>> 168 * VirtualMachine cmdSet quietly ignores >>>>>>>>> some >>>>>>>>> cmds >>>>>>>>> 169 * after VM death, so, it sends it's own >>>>>>>>> errors. >>>>>>>>> 170 */ >>>>>>>>> 171 outStream_setError(&out, >>>>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>>>> >>>>>>>>> >>>>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>>>> event >>>>>>>>> happens in the middle of a command execution. >>>>>>>>> There is a lack of synchronization here. 
>>>>>>>>> >>>>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>>>> allow to >>>>>>>>> execute the commands >>>>>>>>> and the VM_DEATH event callback concurrently. >>>>>>>>> It should work well for any function that is used in >>>>>>>>> implementation of >>>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>>>>>>> >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>>> tests >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> The updated webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The changes are: >>>>>>>>>> - added a comment recommended by Staffan >>>>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>>>> classSignature() >>>>>>>>>> >>>>>>>>>> The classSignature() function is called in 16 places. >>>>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>>>> signature >>>>>>>>>> and will crash. >>>>>>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>>>>>> return to this >>>>>>>>>> issue after gaining experience with more failure cases that are >>>>>>>>>> still >>>>>>>>>> expected. >>>>>>>>>> The failure with the classSignature() involved was observed only >>>>>>>>>> once >>>>>>>>>> in the nightly >>>>>>>>>> and should be extremely rare reproducible. >>>>>>>>>> I'll file a placeholder bug if necessary. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Please, review the fix for: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Open webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Summary: >>>>>>>>>>> >>>>>>>>>>> The failing scenario: >>>>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>>>> shutdown has >>>>>>>>>>> been started in the target process. >>>>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>>>> commands >>>>>>>>>>> to the JDWP agent. >>>>>>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>>>>>> (debuggee side) >>>>>>>>>>> are not in sync with the consumer layers. >>>>>>>>>>> >>>>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>>>> the JDI >>>>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>>>> processes >>>>>>>>>>> are uneasy to sync in general. >>>>>>>>>>> >>>>>>>>>>> As a result the following steps are possible: >>>>>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>>>>> debuggee >>>>>>>>>>> - The debuggee is normally exiting >>>>>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>>>>> anonymous class unload event >>>>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>>>> ClassUnloadEvent event >>>>>>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>>>>>> reference type. 
>>>>>>>>>>> If there is more than one class with the same host
>>>>>>>>>>> class
>>>>>>>>>>> signature, it can't distinguish them,
>>>>>>>>>>> and so, deletes all references and re-retrieves them
>>>>>>>>>>> again
>>>>>>>>>>> (see tracing below):
>>>>>>>>>>> MY_TRACE: JDI:
>>>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature:
>>>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH;
>>>>>>>>>>> - The jdwp backend debugLoop_run() gets the command
>>>>>>>>>>> from JDI
>>>>>>>>>>> and calls the functions
>>>>>>>>>>> classesForSignature() and classStatus() recursively.
>>>>>>>>>>> - The classStatus() makes a call to the JVMTI
>>>>>>>>>>> GetClassStatus()
>>>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>> - As a result the jdwp backend reports the JVMTI error
>>>>>>>>>>> to the
>>>>>>>>>>> JDI, and so, the test fails
>>>>>>>>>>>
>>>>>>>>>>> For details, see the analysis in bug report closed as a
>>>>>>>>>>> dup of
>>>>>>>>>>> the bug 6988950:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865
>>>>>>>>>>>
>>>>>>>>>>> Some similar cases can be found in the two bug reports
>>>>>>>>>>> (6988950
>>>>>>>>>>> and 8024865) describing this issue.
>>>>>>>>>>>
>>>>>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>> error
>>>>>>>>>>> as it is normal at the VM shutdown.
>>>>>>>>>>> The original jdwp backend implementation had a similar
>>>>>>>>>>> approach
>>>>>>>>>>> for the raw monitor functions.
>>>>>>>>>>> They use the ignore_vm_death() to work around the
>>>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors.
>>>>>>>>>>> For reference, please, see the file: src/share/back/util.c
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi
>>>>>>>>>>> tests
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>

From serguei.spitsyn at oracle.com Fri Nov 7 10:13:11 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 07 Nov 2014 02:13:11 -0800
Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <545C9819.5050806@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> <545C5620.7080300@oracle.com> <545C955B.3070209@oracle.com> <545C9819.5050806@oracle.com>
Message-ID: <545C9B37.5020205@oracle.com>

On 11/7/14 1:59 AM, David Holmes wrote:
> On 7/11/2014 7:48 PM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>>
>> On 11/6/14 9:18 PM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> I think I get the gist of this approach but I'm not an expert on the
>>> JVM TI or JDWP event model. My main concern would be how the delay to
>>> the completion of cbVMDeath() might impact things - specifically if it
>>> might be a lengthy delay?
>>
>> 1. At the beginning the VirtualMachine commands check if gdata->vmDead is
>> true
>> and in such case just return with the JDWP_ERROR(VM_DEAD) error or
>> quietly.
>> Normally, the cbVMDeath event callback needs to wait for just one
>> command.
>>
>> Please, see the VirtualMachine.c and the following comment in
>> debugLoop_run():
>>
>> } else if (gdata->vmDead &&
>> ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
>> /* Protect the VM from calls while dead.
>> * VirtualMachine cmdSet quietly ignores some cmds
>> * after VM death, so, it sends it's own errors.
>> */
>> outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>> } else {
>>
>>
>> 2. We do not have many choices.
>> Without synchronizing on command completion we will continue getting
>> WRONG_PHASE errors intermittently.
>> Another choice is to use the already reviewed ignore_wrong_phase
>> workaround.
>> Note, the workaround does not work for all the commands.
>> I understand, we need to make sure nothing is broken if we choose
>> this approach. :)
>>
>> 3. What delay would you consider lengthy: 1 sec, 10 sec, 1 min.?
>
> Anything that causes something unexpected to happen :) I'm just
> looking at the code and thinking what might go wrong. Really all we
> can do is try this and see.

1 min sleep looks too big as it causes timeout failures of some tests.
Launched the nsk.jdi and jtreg com/sun/jdi with 10 sec sleep.
Will see the results tomorrow.

Thanks!
Serguei

>
> Thanks,
> David
>
>> For instance, I can add 10 sec sleep to provoke the command
>> execution delay and see what can be broken.
>> With 1 min sleep I see some timeouts in the jtreg com/sun/jdi tests
>> though which is probably Ok.
>>
>> Thanks,
>> Serguei
>>
>>>
>>> Thanks,
>>> David
>>>
>>> On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote:
>>>> Hi reviewers,
>>>>
>>>> I'm suggesting to review a modified fix:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/
>>>>
>>>>
>>>>
>>>>
>>>> The 3-rd round fix is not right as it caused deadlocks in several
>>>> tests
>>>> (in nsk.jdi.testlist and jtreg com/sun/jdi).
>>>> >>>> Here is a deadlock example: >>>> >>>> ----------------- lwp# 2 / thread# 2 -------------------- >>>> ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) >>>> ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, >>>> d4400, 0, ffffffff7e357440, 100138730) + 100 >>>> ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, >>>> 100137000, 0, 1004405d0, 6e750, 0) + a4 >>>> ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, >>>> 100137000, 0, 0, 1, 20000000) + 358 >>>> ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, >>>> 100137000, 1, deab, 60000000, 100137000) + c8 >>>> ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>>> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 >>>> ffffffff7da2284c jvmtiError >>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, >>>> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c >>>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>>> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c >>>> ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, >>>> ffffffffffefd118, >>>> ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c >>>> ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, >>>> 1, 2) >>>> + 10c >>>> ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + >>>> 138 >>>> ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, >>>> ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360 >>>> ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, >>>> ffffffff7ab3ad18, 1018, 1000) + 1d8 >>>> ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, >>>> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 >>>> ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, >>>> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, >>>> ffffffff7e3e6b70) + 30c >>>> ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, >>>> 
ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 >>>> ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, >>>> ffffffff7e3e8b30, >>>> ffffffff7e357440, 0, 10013700) + 1bc >>>> ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, >>>> 10035de68, 0, ffffffff7e4143b0) + 860 >>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>> >>>> ----------------- lwp# 12 / thread# 12 -------------------- >>>> ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) >>>> ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, >>>> d4400, 0, ffffffff7e357440, 100349930) + 100 >>>> ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, >>>> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 >>>> ffffffff7da22450 jvmtiError >>>> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, >>>> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 >>>> ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, >>>> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 >>>> ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, >>>> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c >>>> ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, >>>> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 >>>> ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, >>>> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 >>>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>>> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + >>>> 128 >>>> ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, >>>> 3d8, >>>> 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 >>>> ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, >>>> fffffffea5f3e048, 3d8, 1003497f8) + 3ac >>>> ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, >>>> ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 >>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>> >>>> ----------------- lwp# 13 
/ thread# 13 -------------------- >>>> ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) >>>> ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, >>>> d4400, 0, ffffffff7e357440, 10034d330) + 100 >>>> ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) >>>> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 >>>> ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>>> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, >>>> 10034c000) >>>> + e0 >>>> ffffffff7da2284c jvmtiError >>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, >>>> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c >>>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>>> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c >>>> ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, >>>> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 >>>> ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, >>>> 0, 10000000) + ac >>>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>>> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + >>>> 128 >>>> ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, >>>> 3d8, >>>> 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 >>>> ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, >>>> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac >>>> ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, >>>> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 >>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>> >>>> >>>> The details: >>>> - Thread #2: The cbVMDeath() event handler is waiting on the >>>> commandCompleteLock in the enqueueCommand(). >>>> The call chain is: >>>> cbVMDeath() -> event_callback() -> reportEvents() -> >>>> eventHelper_reportEvents() -> enqueueCommand(). 
>>>> The enqueueCommand() depends on the commandLoop() that has to >>>> call >>>> completeCommand(command) for the command being enqueued. >>>> This has not been set yet: gdata->vmDead = JNI_TRUE >>>> >>>> - Thread #12: The debugLoop_run blocked on the vmDeathLock enter >>>> >>>> - Thread #13: The commandLoop is waiting on the >>>> blockCommandLoopLock >>>> in the doBlockCommandLoop(). >>>> It is because blockCommandLoop == JNI_TRUE which is set in the >>>> needBlockCommandLoop() >>>> if the following condition is true: >>>> (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE && >>>> cmd->u.reportEventComposite.suspendPolicy == >>>> JDWP_SUSPEND_POLICY(ALL)) >>>> >>>> >>>> It seems, the debugLoop_run() block on the vmDeathLock causes the >>>> commandLoop() to wait indefinitely. >>>> The cbVMDeath() can not proceed because the commandLoop() does not >>>> make >>>> a progress. >>>> >>>> The vmDeathLock critical section in the cbVMDeath() event callback >>>> seems >>>> to be an overkill (unnecessary). >>>> A less intrusive synchronization is required here which is to wait >>>> until >>>> the current command is completed >>>> before returning to the JvmtiExport::post_vm_death(). >>>> >>>> The new approach (see new webrev) is to extend the resumeLock >>>> synchronization pattern >>>> to all VirtualMachine set of commands, not only the resume command. >>>> The resumeLock name is replaced with the vmDeathLock to reflect new >>>> semantics. >>>> >>>> In general, we could consider to do the same for the rest of the JDWP >>>> command sets. >>>> But it is better to be careful and see how this change goes first. >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote: >>>>> Hi David, >>>>> >>>>> Thank you for the concerns! >>>>> Testing showed several tests failing with deadlocks. >>>>> Scenarios are similar to that you describe. >>>>> >>>>> Trying to understand the details. 
>>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 11/4/14 4:09 PM, David Holmes wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> On 11/2/14 8:58 PM, David Holmes wrote: >>>>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>>>>>> Serguei, >>>>>>>>> >>>>>>>>> Thank you for good finding. This approach looks much better >>>>>>>>> for me. >>>>>>>>> >>>>>>>>> The fix looks good. >>>>>>>>> >>>>>>>>> Is it necessary to release vmDeathLock locks at >>>>>>>>> eventHandler.c:1244 before call >>>>>>>>> >>>>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>>>>>> >>>>>>>> I agree this looks necessary, or at least more clean (if things >>>>>>>> are >>>>>>>> failing we really don't know what is happening). >>>>>>> >>>>>>> Agreed (replied to Dmitry). >>>>>>> >>>>>>>> >>>>>>>> More generally I'm concerned about whether any of the code paths >>>>>>>> taken >>>>>>>> while holding the new lock can result in deadlock - in particular >>>>>>>> with >>>>>>>> regard to the resumeLock ? >>>>>>> >>>>>>> The cbVMDeath() function never holds both vmDeathLock and >>>>>>> resumeLock at >>>>>>> the same time, >>>>>>> so there is no chance for a deadlock that involves both these >>>>>>> locks. >>>>>>> >>>>>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>>>>> callbackLock. >>>>>>> These two locks look completely unrelated to the debugLoop_run(). >>>>>>> >>>>>>> The debugLoop_run() function also uses the cmdQueueLock. >>>>>>> The debugLoop_run() never holds both vmDeathLock and >>>>>>> cmdQueueLock at >>>>>>> the >>>>>>> same time. >>>>>>> >>>>>>> So that I do not see any potential to introduce new deadlock >>>>>>> with the >>>>>>> vmDeathLock. >>>>>>> >>>>>>> However, it is still easy to overlook something here. >>>>>>> Please, let me know if you see any danger. 
>>>>>> >>>>>> I was mainly concerned about what might happen in the call chain for >>>>>> threadControl_resumeAll() (it certainly sounds like it might need to >>>>>> use a resumeLock :) ). I see direct use of the threadLock and >>>>>> indirectly the eventHandler lock; but there are further call paths I >>>>>> did not explore. Wish there was an easy way to determine the >>>>>> transitive closure of all locks used from a given call. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>>> -Dmitry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> It is 3-rd round of review for: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>> >>>>>>>>>> New webrev: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary >>>>>>>>>> >>>>>>>>>> For failing scenario, please, refer to the 1-st round RFR >>>>>>>>>> below. >>>>>>>>>> >>>>>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>>>>> decided to >>>>>>>>>> switch from a workaround to a real fix. >>>>>>>>>> >>>>>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>>>>> gdata->vmDead = 1. >>>>>>>>>> The agent debugLoop_run() has a guard against the VM >>>>>>>>>> shutdown: >>>>>>>>>> >>>>>>>>>> 165 } else if (gdata->vmDead && >>>>>>>>>> 166 ((cmd->cmdSet) != >>>>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>>>>> 168 * VirtualMachine cmdSet quietly ignores >>>>>>>>>> some >>>>>>>>>> cmds >>>>>>>>>> 169 * after VM death, so, it sends it's own >>>>>>>>>> errors. 
>>>>>>>>>> 170 */ >>>>>>>>>> 171 outStream_setError(&out, >>>>>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>>>>> event >>>>>>>>>> happens in the middle of a command execution. >>>>>>>>>> There is a lack of synchronization here. >>>>>>>>>> >>>>>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>>>>> allow to >>>>>>>>>> execute the commands >>>>>>>>>> and the VM_DEATH event callback concurrently. >>>>>>>>>> It should work well for any function that is used in >>>>>>>>>> implementation of >>>>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>>>> tests >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> The updated webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The changes are: >>>>>>>>>>> - added a comment recommended by Staffan >>>>>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>>>>> classSignature() >>>>>>>>>>> >>>>>>>>>>> The classSignature() function is called in 16 places. >>>>>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>>>>> signature >>>>>>>>>>> and will crash. >>>>>>>>>>> I'm not comfortable to fix all the occurrences now and >>>>>>>>>>> suggest to >>>>>>>>>>> return to this >>>>>>>>>>> issue after gaining experience with more failure cases that are >>>>>>>>>>> still >>>>>>>>>>> expected. >>>>>>>>>>> The failure with the classSignature() involved was observed >>>>>>>>>>> only >>>>>>>>>>> once >>>>>>>>>>> in the nightly >>>>>>>>>>> and should be extremely rare reproducible. >>>>>>>>>>> I'll file a placeholder bug if necessary. 
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Please, review the fix for: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Open webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Summary: >>>>>>>>>>>> >>>>>>>>>>>> The failing scenario: >>>>>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>>>>> shutdown has >>>>>>>>>>>> been started in the target process. >>>>>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>>>>> commands >>>>>>>>>>>> to the JDWP agent. >>>>>>>>>>>> However, the JDI layer (debugger side) and the jdwp >>>>>>>>>>>> agent >>>>>>>>>>>> (debuggee side) >>>>>>>>>>>> are not in sync with the consumer layers. >>>>>>>>>>>> >>>>>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>>>>> the JDI >>>>>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>>>>> processes >>>>>>>>>>>> are uneasy to sync in general. >>>>>>>>>>>> >>>>>>>>>>>> As a result the following steps are possible: >>>>>>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>>>>>> debuggee >>>>>>>>>>>> - The debuggee is normally exiting >>>>>>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>>>>>> anonymous class unload event >>>>>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>>>>> ClassUnloadEvent event >>>>>>>>>>>> - The InternalEventHandler wants to uncache the >>>>>>>>>>>> matching >>>>>>>>>>>> reference type. 
>>>>>>>>>>>> If there is more than one class with the same host
>>>>>>>>>>>> class
>>>>>>>>>>>> signature, it can't distinguish them,
>>>>>>>>>>>> and so, deletes all references and re-retrieves them
>>>>>>>>>>>> again
>>>>>>>>>>>> (see tracing below):
>>>>>>>>>>>> MY_TRACE: JDI:
>>>>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature:
>>>>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH;
>>>>>>>>>>>> - The jdwp backend debugLoop_run() gets the command
>>>>>>>>>>>> from JDI
>>>>>>>>>>>> and calls the functions
>>>>>>>>>>>> classesForSignature() and classStatus() recursively.
>>>>>>>>>>>> - The classStatus() makes a call to the JVMTI
>>>>>>>>>>>> GetClassStatus()
>>>>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>>> - As a result the jdwp backend reports the JVMTI error
>>>>>>>>>>>> to the
>>>>>>>>>>>> JDI, and so, the test fails
>>>>>>>>>>>>
>>>>>>>>>>>> For details, see the analysis in bug report closed as a
>>>>>>>>>>>> dup of
>>>>>>>>>>>> the bug 6988950:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865
>>>>>>>>>>>>
>>>>>>>>>>>> Some similar cases can be found in the two bug reports
>>>>>>>>>>>> (6988950
>>>>>>>>>>>> and 8024865) describing this issue.
>>>>>>>>>>>>
>>>>>>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>>> error
>>>>>>>>>>>> as it is normal at the VM shutdown.
>>>>>>>>>>>> The original jdwp backend implementation had a similar
>>>>>>>>>>>> approach
>>>>>>>>>>>> for the raw monitor functions.
>>>>>>>>>>>> They use the ignore_vm_death() to work around the
>>>>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors.
>>>>>>>>>>>> For reference, please, see the file: >>>>>>>>>>>> src/share/back/util.c >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Testing: >>>>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG >>>>>>>>>>>> com/sun/jdi >>>>>>>>>>>> tests >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>>> >> From stefan.sarne at oracle.com Fri Nov 7 10:37:51 2014 From: stefan.sarne at oracle.com (Stefan Särne) Date: Fri, 07 Nov 2014 11:37:51 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <545C7106.2080602@oracle.com> Message-ID: <545CA0FF.9010302@oracle.com> Hi Thomas, There are 2 parts to this answer. 1. Today there is a test utility similar to what you describe called VM Internal Tests inside HotSpot. See the jtreg test that invokes it as a starting point: hotspot/test/sanity/ExecuteInternalVMTests.java 2. We are doing a proof of concept for an xunit based C++ unit test framework for the VM. The JEP is forthcoming. I recommend that you wait for the latter. Several engineers are already doing so. Best regards, /Stefan Thomas Stüfe wrote 2014-11-07 10:08: > > In the SAP JVM we have regression tests for C/C++ code, similar to jprt, > but on C function level. Nothing fancy, just some big test functions which > test our C APIs for regressions like this. That code is just compiled into > the hotspot and can be executed with a command line switch, but gets > excluded in release builds. > Is there something similar for the OpenJDK?
If yes, I would provide test > functions for jio_snprintf. If no, would it be worth contributing? > > From mikael.vidstedt at oracle.com Fri Nov 7 11:56:18 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Fri, 07 Nov 2014 12:56:18 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> Message-ID: <545CB362.60501@oracle.com> Volker, Thanks for reminding me, this totally slipped my mind. I think it's fair to say we've given this enough time for feedback, and that the feedback has all been supportive. With that in mind I consider the proposal approved and effective immediately! Cheers, Mikael On 2014-11-06 15:35, Volker Simonis wrote: > Hi Mikael, > > just wanted to ask what's the status of this project? > I hope it was not just a JavaOne hoax :) > > Regards, > Volker > > > On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis > wrote: >> Thanks Mikael, that sounds good! >> >> Regards, >> Volker >> >> >> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >> wrote: >>> Volker, >>> >>> The proposal is only to change how the changes are pushed, not which forests >>> changes can be pushed to. That is, we would still require hotspot changes to >>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or to the >>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be applied on >>> all those (four) forests. Reasonable? >>> >>> Cheers, >>> Mikael >>> >>> >>> On 2014-09-12 11:38, Volker Simonis wrote: >>>> Hi Mikael, >>>> >>>> there's one more question that came to my mind: will the new rule >>>> apply to all hotspot repositories (i.e. jdk9/hs-rt/hotspot, >>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ?
>>>> >>>> Thanks, >>>> Volker >>>> >>>> >>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>> wrote: >>>>> Andrew/Volker, >>>>> >>>>> Thanks for the positive feedback. The goal of the proposal is to simplify >>>>> pushing changes which are effectively not tested by the jprt system >>>>> anyway. >>>>> The proposed relaxation would not affect work on other infrastructure >>>>> projects in any relevant way, but would hopefully improve all our lives >>>>> significantly immediately. >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>> >>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>> Hi Mikael, >>>>>> >>>>>> thanks a lot for this proposal. I think this will dramatically >>>>>> simplify our work to keep our ports up to date! So I fully support it. >>>>>> >>>>>> Nevertheless, I think this can only be a first step towards fully open >>>>>> the JPRT system to developers outside Oracle. With "opening" I mean to >>>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>>> jobs as well as allowing porting projects to add hardware which builds >>>>>> and tests the HotSpot on alternative platforms. >>>>>> >>>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>>> doubts that this simplification will hopefully not push the >>>>>> realization of a truly OPEN JPRT system even further away. >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>> wrote: >>>>>>> All, >>>>>>> >>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is highly >>>>>>> platform dependent and also tightly coupled with the tool chains on the >>>>>>> various platforms. Each platform/tool chain combination has its set of >>>>>>> special quirks, and code must be implemented in a way such that it only >>>>>>> relies on the common subset of syntax and functionality across all >>>>>>> these >>>>>>> combinations. 
History has taught us that even simple changes can have >>>>>>> surprising results when compiled with different compilers. >>>>>>> >>>>>>> For more than a decade the Hotspot team has ensured a minimum quality >>>>>>> level >>>>>>> by requiring all pushes to be done through a build and test system >>>>>>> (jprt) >>>>>>> which guarantees that the code resulting from applying a set of changes >>>>>>> builds on a set of core platforms and that a set of core tests pass. >>>>>>> Only >>>>>>> if >>>>>>> all the builds and tests pass will the changes actually be pushed to >>>>>>> the >>>>>>> target repository. >>>>>>> >>>>>>> We believe that testing like the above, in combination with later >>>>>>> stages >>>>>>> of >>>>>>> testing, is vital to ensuring that the quality level of the Hotspot >>>>>>> code >>>>>>> remains high and that developers do not run into situations where the >>>>>>> latest >>>>>>> version has build errors on some platforms. >>>>>>> >>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK platforms. >>>>>>> From >>>>>>> a >>>>>>> Hotspot perspective this new platform added a set of AIX/PPC specific >>>>>>> files >>>>>>> including some platform specific changes to shared code. The AIX/PPC >>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The same >>>>>>> thing applies for the shark and zero versions of Hotspot. >>>>>>> >>>>>>> While Hotspot developers remain committed to making sure changes are >>>>>>> developed in a way such that the quality level remains high across all >>>>>>> platforms and variants, because of the above mentioned complexities it >>>>>>> is >>>>>>> inevitable that from time to time changes will be made which introduce >>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>> testing. >>>>>>> >>>>>>> To allow these issues to be resolved more quickly I would like to >>>>>>> propose >>>>>>> a >>>>>>> relaxation in the requirements on how changes to Hotspot are pushed. 
>>>>>>> Specifically I would like to allow for direct pushes to the hotspot/ >>>>>>> repository of files specific to the following ports/variants/tools: >>>>>>> >>>>>>> * AIX >>>>>>> * PPC >>>>>>> * Shark >>>>>>> * Zero >>>>>>> >>>>>>> Today this translates into the following files: >>>>>>> >>>>>>> - src/cpu/ppc/** >>>>>>> - src/cpu/zero/** >>>>>>> - src/os/aix/** >>>>>>> - src/os_cpu/aix_ppc/** >>>>>>> - src/os_cpu/bsd_zero/** >>>>>>> - src/os_cpu/linux_ppc/** >>>>>>> - src/os_cpu/linux_zero/** >>>>>>> >>>>>>> Note that all changes are still required to go through the normal >>>>>>> development and review cycle; the proposed relaxation only applies to >>>>>>> how >>>>>>> the changes are pushed. >>>>>>> >>>>>>> If at code review time a change is for some reason deemed to be risky >>>>>>> and/or >>>>>>> otherwise have impact on shared files the reviewer may request that the >>>>>>> change to go through the regular push testing. For changes only >>>>>>> touching >>>>>>> the >>>>>>> above set of files this expected to be rare. >>>>>>> >>>>>>> Please let me know what you think. >>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> From eric.mccorkle at oracle.com Fri Nov 7 13:07:54 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Fri, 07 Nov 2014 08:07:54 -0500 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545C09FB.9020907@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> Message-ID: <545CC42A.8030004@oracle.com> On 11/06/14 18:53, Jiangli Zhou wrote: > Could you please point to the updated webrev? I don't see the update in > http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html. 
I made a mistake uploading it. It's here: http://cr.openjdk.java.net/~emc/8058313/webrev.02/ > > Thanks, > Jiangli >> >>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>> Please review this issue so that it can go in along with 8058322. >>>> Thanks. >>>> >>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>> Thank you for the pointers. I have applied your changes and refreshed >>>>> the webrev. >>>>> >>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>> >>>>> Also, I have posted the test for this and another patch here: >>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>> >>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>> Hi Eric, >>>>>> >>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>> Hi Eric, >>>>>>>> >>>>>>>> I wonder if we could specialize this particular case and avoid >>>>>>>> changing >>>>>>>> the parsing code. How about setting the _has_method_parameters >>>>>>>> flag in >>>>>>>> the ConstMethod when we encounter such a MethodParameters attribute, and changing >>>>>>>> JVM_GetMethodParameters() to return a non-NULL value for such a case >>>>>>>> when >>>>>>>> _has_method_parameters is true but method_parameters_length is 0. >>>>>>>> Would >>>>>>>> that work? >>>>>>> Which parser are you talking about? The inline tables parser, or >>>>>>> the >>>>>>> class file parser? The class file parser has to change, because it >>>>>>> was >>>>>>> previously ignoring MethodParameters attributes with >>>>>>> parameter_count 0. >>>>>> It's the class parsing changes that I was referring to, mostly >>>>>> related to >>>>>> the initialization and checking against method_parameters_length. >>>>>> It's a >>>>>> bit awkward to include the 0 case but also skip it in the >>>>>> loop. For >>>>>> example, the following code in classFileParser.cpp changed ">" to >>>>>> ">=" >>>>>> in the if check, but has no real effect and is not needed.
>>>>>> >>>>>> 2486 // Copy method parameters >>>>>> 2487 if (method_parameters_length >= 0) { >>>>>> 2488 MethodParametersElement* elem = >>>>>> m->constMethod()->method_parameters_start(); >>>>>> 2489 for (int i = 0; i < method_parameters_length; i++) { >>>>>> 2490 elem[i].name_cp_index = >>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>> 2491 method_parameters_data += 2; >>>>>> 2492 elem[i].flags = >>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>> 2493 method_parameters_data += 2; >>>>>> 2494 } >>>>>> 2495 } >>>>>> >>>>>> >>>>>>> I don't think your proposal will work. The inline tables' >>>>>>> offsets are >>>>>>> all dependent on what inline tables are actually present. If >>>>>>> _has_method_parameters is set, then the inline tables code >>>>>>> expects the >>>>>>> last u2 of the inline tables to be a u2 indicating the number of >>>>>>> method >>>>>>> parameters entries, preceded by the array of method parameters >>>>>>> data. >>>>>>> If _has_method_parameters is false, then it expects that there is no >>>>>>> method parameters information at all (including no length >>>>>>> field). If >>>>>>> you were to set _has_method_parameters, but not store any >>>>>>> information in >>>>>>> the inline table, then it would cause errors for all the rest of the >>>>>>> inline tables. >>>>>> Thank you for reminding me of the complexity of the inlined table >>>>>> calculation in the ConstMethod. My proposal would require tweaks in >>>>>> that >>>>>> area to correctly compute the table sizes. As it's easy to introduce >>>>>> bugs in that area, it's not worth changing the table calculation >>>>>> code >>>>>> for this purpose. I agree my proposal is not a better choice in this >>>>>> case. >>>>>> >>>>>>> What I do for the parameter_count = 0 case is just store >>>>>>> a 0 u2 for zero-length method parameters information, and no data.
>>>>>>> All >>>>>>> the existing inline tables code works fine with this case, so there >>>>>>> aren't any serious changes to the inline tables code (other than >>>>>>> allowing method parameters information to be stored when the >>>>>>> array is >>>>>>> length 0). But you have to make some change to the inline table >>>>>>> code, >>>>>>> otherwise the information won't be stored. >>>>>> Ok. Could you please add comments to the change in constMethod.cpp to >>>>>> explain the above? >>>>>> >>>>>> In jvm.cpp, since -1 now represents no method parameters, maybe >>>>>> check >>>>>> against it explicitly and add comments for the 0-length case. >>>>>> >>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject >>>>>> method)) >>>>>> { >>>>>> ... >>>>>> // No method parameter >>>>>> if (num_params == -1) { >>>>>> return (jobjectArray)NULL; >>>>>> } >>>>>> >>>>>> /* handle the rest here */ >>>>>> // make sure all the symbols are properly formatted >>>>>> for (int i = 0; i < num_params; i++) { >>>>>> ... >>>>>> } >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli >>>>>>>> >>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Please review this fix for parameter reflection which addresses >>>>>>>>> hotspot >>>>>>>>> incorrectly ignoring zero-length MethodParameters attributes. The JVMS >>>>>>>>> allows a MethodParameters attribute with parameter_count = 0, and >>>>>>>>> the >>>>>>>>> parameter reflection spec states that a >>>>>>>>> MalformedParametersException >>>>>>>>> should be thrown if parameter_count does not match the number of >>>>>>>>> real >>>>>>>>> parameters to a method.
Hotspot currently ignores >>>>>>>>> MethodParameters >>>>>>>>> attributes with parameter_count = 0; however, in a case where a >>>>>>>>> (bad) >>>>>>>>> MethodParameters attribute has parameter_count = 0, but the method >>>>>>>>> has a >>>>>>>>> nonzero number of real parameters, hotspot will return null from >>>>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>>>> reflection API >>>>>>>>> acts like there is no MethodParameters attribute). >>>>>>>>> >>>>>>>>> This patch causes hotspot to record the fact that a zero-length >>>>>>>>> MethodParameters attribute does exist, causing the exception to be >>>>>>>>> thrown when it should be. >>>>>>>>> >>>>>>>>> The bug is here: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>>>>> >>>>>>>>> The webrev is here: >>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ > From volker.simonis at gmail.com Fri Nov 7 13:12:34 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 7 Nov 2014 14:12:34 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: <545CB362.60501@oracle.com> References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> Message-ID: On Fri, Nov 7, 2014 at 12:56 PM, Mikael Vidstedt wrote: > > Volker, > > Thanks for reminding me, this totally slipped my mind. > > I think it's fair to say say we've given this enough time for feedback, and > that the feedback has been all supportive. With that in mind I consider the > proposal approved and effective immediately! > OK great. So does this mean we can now push reviewed changes to the ppc/aix subdirs right away? > Cheers, > Mikael > > > On 2014-11-06 15:35, Volker Simonis wrote: >> >> Hi Mikael, >> >> just wanted to ask what's the status of this project? 
>> I hope it was not just a JavaOne hoax :) >> >> Regards, >> Volker >> >> >> On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis >> wrote: >>> >>> Thanks Mikael, that sounds good! >>> >>> Regards, >>> Volker >>> >>> >>> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >>> wrote: >>>> >>>> Volker, >>>> >>>> The proposal is only to change how the changes are pushed, not which >>>> forests >>>> changes can be pushed to. That is, we would still require hotspot >>>> changes to >>>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or to >>>> the >>>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be >>>> applied on >>>> all those (four) forests. Reasonable? >>>> >>>> Cheers, >>>> Mikael >>>> >>>> >>>> On 2014-09-12 11:38, Volker Simonis wrote: >>>>> >>>>> Hi Mikael, >>>>> >>>>> there's one more question that came to my mind: will the new rule >>>>> apply to all hotspot respitories (i.e. jdk9/hs-rt/hotspot, >>>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ? >>>>> >>>>> Thanks, >>>>> Volker >>>>> >>>>> >>>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>>> wrote: >>>>>> >>>>>> Andrew/Volker, >>>>>> >>>>>> Thanks for the positive feedback. The goal of the proposal is to >>>>>> simplify >>>>>> pushing changes which are effectively not tested by the jprt system >>>>>> anyway. >>>>>> The proposed relaxation would not affect work on other infrastructure >>>>>> projects in any relevant way, but would hopefully improve all our >>>>>> lives >>>>>> significantly immediately. >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>>>> >>>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>>> >>>>>>> Hi Mikael, >>>>>>> >>>>>>> thanks a lot for this proposal. I think this will dramatically >>>>>>> simplify our work to keep our ports up to date! So I fully support >>>>>>> it. 
>>>>>>> >>>>>>> Nevertheless, I think this can only be a first step towards fully >>>>>>> open >>>>>>> the JPRT system to developers outside Oracle. With "opening" I mean >>>>>>> to >>>>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>>>> jobs as well as allowing porting projects to add hardware which >>>>>>> builds >>>>>>> and tests the HotSpot on alternative platforms. >>>>>>> >>>>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>>>> doubts that this simplification will hopefully not push the >>>>>>> realization of a truly OPEN JPRT system even further away. >>>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>>> wrote: >>>>>>>> >>>>>>>> All, >>>>>>>> >>>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is >>>>>>>> highly >>>>>>>> platform dependent and also tightly coupled with the tool chains on >>>>>>>> the >>>>>>>> various platforms. Each platform/tool chain combination has its set >>>>>>>> of >>>>>>>> special quirks, and code must be implemented in a way such that it >>>>>>>> only >>>>>>>> relies on the common subset of syntax and functionality across all >>>>>>>> these >>>>>>>> combinations. History has taught us that even simple changes can >>>>>>>> have >>>>>>>> surprising results when compiled with different compilers. >>>>>>>> >>>>>>>> For more than a decade the Hotspot team has ensured a minimum >>>>>>>> quality >>>>>>>> level >>>>>>>> by requiring all pushes to be done through a build and test system >>>>>>>> (jprt) >>>>>>>> which guarantees that the code resulting from applying a set of >>>>>>>> changes >>>>>>>> builds on a set of core platforms and that a set of core tests pass. >>>>>>>> Only >>>>>>>> if >>>>>>>> all the builds and tests pass will the changes actually be pushed to >>>>>>>> the >>>>>>>> target repository. 
>>>>>>>> >>>>>>>> We believe that testing like the above, in combination with later >>>>>>>> stages >>>>>>>> of >>>>>>>> testing, is vital to ensuring that the quality level of the Hotspot >>>>>>>> code >>>>>>>> remains high and that developers do not run into situations where >>>>>>>> the >>>>>>>> latest >>>>>>>> version has build errors on some platforms. >>>>>>>> >>>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK platforms. >>>>>>>> From >>>>>>>> a >>>>>>>> Hotspot perspective this new platform added a set of AIX/PPC >>>>>>>> specific >>>>>>>> files >>>>>>>> including some platform specific changes to shared code. The AIX/PPC >>>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The >>>>>>>> same >>>>>>>> thing applies for the shark and zero versions of Hotspot. >>>>>>>> >>>>>>>> While Hotspot developers remain committed to making sure changes are >>>>>>>> developed in a way such that the quality level remains high across >>>>>>>> all >>>>>>>> platforms and variants, because of the above mentioned complexities >>>>>>>> it >>>>>>>> is >>>>>>>> inevitable that from time to time changes will be made which >>>>>>>> introduce >>>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>>> testing. >>>>>>>> >>>>>>>> To allow these issues to be resolved more quickly I would like to >>>>>>>> propose >>>>>>>> a >>>>>>>> relaxation in the requirements on how changes to Hotspot are pushed. 
>>>>>>>> Specifically I would like to allow for direct pushes to the hotspot/ >>>>>>>> repository of files specific to the following ports/variants/tools: >>>>>>>> >>>>>>>> * AIX >>>>>>>> * PPC >>>>>>>> * Shark >>>>>>>> * Zero >>>>>>>> >>>>>>>> Today this translates into the following files: >>>>>>>> >>>>>>>> - src/cpu/ppc/** >>>>>>>> - src/cpu/zero/** >>>>>>>> - src/os/aix/** >>>>>>>> - src/os_cpu/aix_ppc/** >>>>>>>> - src/os_cpu/bsd_zero/** >>>>>>>> - src/os_cpu/linux_ppc/** >>>>>>>> - src/os_cpu/linux_zero/** >>>>>>>> >>>>>>>> Note that all changes are still required to go through the normal >>>>>>>> development and review cycle; the proposed relaxation only applies >>>>>>>> to >>>>>>>> how >>>>>>>> the changes are pushed. >>>>>>>> >>>>>>>> If at code review time a change is for some reason deemed to be >>>>>>>> risky >>>>>>>> and/or >>>>>>>> otherwise have impact on shared files the reviewer may request that >>>>>>>> the >>>>>>>> change to go through the regular push testing. For changes only >>>>>>>> touching >>>>>>>> the >>>>>>>> above set of files this expected to be rare. >>>>>>>> >>>>>>>> Please let me know what you think. 
>>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> > From erik.osterlund at lnu.se Fri Nov 7 14:43:23 2014 From: erik.osterlund at lnu.se (Erik Österlund) Date: Fri, 7 Nov 2014 14:43:23 +0000 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: <545ACD53.3050108@oracle.com> References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com> <545ACD53.3050108@oracle.com> Message-ID: <77B724DF-8174-4E0D-B86C-7A320DEFB4A2@lnu.se> Hi David, Full webrev of the proposed push: http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04/ Incremental webrev of the proposed push: http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04.incremental/ Note that the #define is still left in this change as Paul wanted this pushed before we push my solution for getting rid of it (using templates and inheritance). Thanks, Erik On 06 Nov 2014, at 02:22, David Holmes wrote: > I'd like to see a final webrev please! I've lost track of this a bit. > > Thanks, > David > > On 6/11/2014 8:08 AM, Erik Österlund wrote: >> Okay, thanks a lot for the reviews Paul and Kim. :) >> Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go. >> >> Thanks, >> >> /Erik >> >> On 05 Nov 2014, at 22:10, Paul Hohensee wrote: >> >> I don't need a new webrev either, so afaic you're good to go. >> >> Thanks, >> >> Paul >> >> >> On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett wrote: >> On Nov 3, 2014, at 7:21 PM, Erik Österlund wrote: >>> >>>> [legacy issue, not in changed code] >>>> I think the comment for generate_atomic_cmpxchg_long() is wrong in the >>>> return value; shouldn't it be returning a jlong? Probably a C-Y bug. >>> >>> No, generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct.
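For readers following the 8058255 thread, the operation under review (a byte-wide compare-and-exchange that returns the previous value) can be sketched in portable C++. This is only a minimal illustration built on std::atomic, not HotSpot's stub generator or its inline assembly: the helper name `cmpxchg_byte` and its signature are assumptions made for this sketch, with the exchange_value-first parameter order borrowed from the comment quoted in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only (not HotSpot's Atomic::cmpxchg).
// Atomically stores exchange_value into *dest if *dest == compare_value,
// and returns the value that *dest held before the operation.
inline std::int8_t cmpxchg_byte(std::int8_t exchange_value,
                                std::atomic<std::int8_t>* dest,
                                std::int8_t compare_value) {
  assert(dest != nullptr);
  std::int8_t expected = compare_value;
  // On success, 'expected' is left unchanged (it equals the old value).
  // On failure, compare_exchange_strong writes the current value of *dest
  // into 'expected'. Either way, 'expected' ends up holding the old value.
  dest->compare_exchange_strong(expected, exchange_value);
  return expected;
}
```

A caller can tell whether the exchange happened by comparing the returned value with the compare_value it passed in.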
>> >> The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is, >> >> 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value, >> >> should be "// Support for jlong ..." >> >>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) >>>> >>>> Why is the new byte version using "q" for exchange_value, where the >>>> existing int and long versions use "r"? [There might be a good >>>> reason, and this is just my rusty assembler skills showing.] >>> >>> With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r"). >> >> Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping. >> >>> The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically. >> >> At least I got that part. >> >>> The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :) >> >> Indeed. >> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>> src/os_cpu/windows_x86/vm/os_windows_x86.hpp >>>> >>>> The windows port seems to only support specialized cmpxchgb when >>>> defined(AMD64), while the BSD/Linux variants don't have that >>>> restriction. Why this inconsistency? Or am I missing something, >>>> which seems entirely possible in this tangle. >>> >>> If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub. >>> Then there is another MSVC assembly variant for #ifndef AMD64.
>>> This is perfectly consistent with, e.g., the way the jint cmpxchg is done for windows. >> >> Oops, you are correct. >> >>> Do you want a new webrev? (just polished comments and renamed the #define as per request) >> >> I don't think I need one, but others might want a closer to final version. >> >> >> From thomas.stuefe at gmail.com Fri Nov 7 15:00:34 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 7 Nov 2014 16:00:34 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <545CA0FF.9010302@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <545C7106.2080602@oracle.com> <545CA0FF.9010302@oracle.com> Message-ID: Thank you, that sounds good. I will wait then. Kind Regards, Thomas On Nov 7, 2014 11:37 AM, "Stefan Särne" wrote: > > Hi Thomas, > > There are 2 parts to this answer. > > 1. Today there is a test utility similar to what you describe called VM > Internal Tests inside HotSpot. > See the jtreg test that invokes it as a starting point: > hotspot/test/sanity/ExecuteInternalVMTests.java > > 2. We are doing a proof of concept for an xunit based C++ unit test > framework for the VM. The JEP is forthcoming. > > I recommend that you wait for the latter. > Several engineers are already doing so. > > Best regards, > /Stefan > > Thomas Stüfe wrote 2014-11-07 10:08: > >> >> In the SAP JVM we have regression tests for C/C++ code, similar to jprt, >> but on C function level. Nothing fancy, just some big test functions which >> test our C APIs for regressions like this.
That code is just compiled into >> the hotspot and can be executed with a command line switch, but gets >> excluded in release builds. >> Is there something similar for the OpenJDK? If yes, I would provide test >> functions for jio_snprintf. If no, would it be worth contributing? >> >> >> > From max.ockner at oracle.com Fri Nov 7 16:15:47 2014 From: max.ockner at oracle.com (Max Ockner) Date: Fri, 07 Nov 2014 11:15:47 -0500 Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx In-Reply-To: <543F174F.7040204@oracle.com> References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> Message-ID: <545CF033.4010503@oracle.com> Hello all, I have made these additional changes: -Moved the assert() statements into the lock and lock_without_safepoint methods. -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired -Changed the Monitor::SafepointCheckRequired values for several locks which were locked outside of a MutexLockerEx (some were locked with MutexLocker, some were locked without any MutexLocker* ) New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ Additional testing: jtreg ./jdk/test/java/lang/invoke jtreg jfr tests Here is a list of ALL of the "sometimes" locks: "WorkGroup monitor" share/vm/utilities/workgroup.cpp "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp "CompactibleFreeListSpace._lock" share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp "freelist par lock" share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp "SR_lock" share/vm/runtime/thread.cpp The remaining "sometimes" locks can be found in share/vm/runtime/mutexLocker.cpp: ParGCRareEvent_lock Safepoint_lock Threads_lock VMOperationQueue_lock VMOperationRequest_lock Terminator_lock Heap_lock Compile_lock PeriodicTask_lock JfrStacktrace_lock I have not checked the validity of the "sometimes" locks, and I believe that this should be a different project.
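The consistency rule behind these changes can be sketched in a few lines of C++. This is a toy model under stated assumptions, not the HotSpot change itself: the class name `ToyMonitor`, the enumerators `_safepoint_check_never` and `_safepoint_check_sometimes`, and the use of an exception in place of HotSpot's assert() are all illustrative. Only `SafepointCheckRequired`, `_safepoint_check_always`, the two lock method names, and the message texts come from this review thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy model of a lock that records how it may be acquired.
// Not HotSpot's Monitor class; names other than those noted above are assumed.
class ToyMonitor {
 public:
  enum SafepointCheckRequired {
    _safepoint_check_never,      // assumed name: never acquire with a check
    _safepoint_check_sometimes,  // assumed name: either way is tolerated
    _safepoint_check_always      // always acquire with a check
  };

  ToyMonitor(const std::string& name, SafepointCheckRequired required)
      : _name(name), _safepoint_check_required(required), _locked(false) {}

  // Acquire with a safepoint check: invalid for "never" locks.
  void lock() {
    check(_safepoint_check_required != _safepoint_check_never,
          "This lock should never have a safepoint check: ");
    _locked = true;
  }

  // Acquire without a safepoint check: invalid for "always" locks.
  void lock_without_safepoint_check() {
    check(_safepoint_check_required != _safepoint_check_always,
          "This lock should always have a safepoint check: ");
    _locked = true;
  }

  void unlock() { _locked = false; }
  bool is_locked() const { return _locked; }

 private:
  // Stand-in for assert(): throws so that misuse is observable in a test.
  void check(bool ok, const char* msg) const {
    if (!ok) throw std::logic_error(msg + _name);
  }

  std::string _name;
  SafepointCheckRequired _safepoint_check_required;
  bool _locked;
};
```

Placing the check inside the lock methods, rather than in a wrapper such as MutexLockerEx, means every acquisition path is covered, including locks taken without any locker object.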
Thanks for your help! Max Ockner On 10/15/2014 8:54 PM, David Holmes wrote: > Hi Max, > > This is looking good. > > A few high-level initial comments: > > I think SafepointAllowed should be SafepointCheckNeeded > > Why are the checks in MutexLocker when the state is maintained in the > mutex itself and the mutex/monitor has lock_without_safepoint, and > wait() ? I would have expected to see the > check in the mutex/monitor methods. > > Checking consistent usage of the _no_safepoint_check_flag is good. But > another part of this is that a monitor/mutex that never checks for > safepoints should never be held when a thread blocks at a safepoint - > is there some way to easily check that? I was surprised how many locks > are actually not checking for safepoints. > > Did you find any cases where the mutex/monitor was being used > inconsistently and incorrectly? > > Did you analyse the "sometimes" cases to see if they were safe? > (Aside: just for fun check out what happens if you lock the > Threads_lock with a safepoint check and a safepoint has been requested > :) ). > > Cheers, > David > > On 16/10/2014 4:04 AM, Max Ockner wrote: >> Hi all, >> >> I am a new member of the Hotspot runtime team in Burlington, MA. >> Please review my first fix related to safepoint checking. >> >> Summary: MutexLockerEx can either acquire a lock with or without a >> safepoint check. >> In some cases, a particular lock must either safepoint check always or >> never to avoid deadlocking. >> Some other locks have semantics which allow them to avoid deadlocks >> despite having a safepoint check only some of the time. >> All locks that are OK having inconsistent safepoint checks have been >> marked. All locks that should never safepoint check and all locks that >> should always safepoint check have also been marked. >> When a MutexLockerEx acquires a lock with or without a safepoint check, >> the lock's safepointAllowed marker is checked to ensure consistent >> safepoint checking. 
>>
>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290
>>
>> Tested with:
>> jprt "-testset hotspot"
>> jtreg hotspot
>> vm.quick.testlist
>>
>> Whitebox tests:
>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test
>> expects Assert ("This lock should always have a safepoint check")
>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test
>> expects Assert ("This lock should never have a safepoint check")
>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code
>> should not assert. (Lock is properly acquired with no safepoint check)
>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code
>> should not assert. (Lock is properly acquired with safepoint check)
>>
>> Thanks,
>> Max
>>

From bertrand.delsart at oracle.com Fri Nov 7 16:45:12 2014
From: bertrand.delsart at oracle.com (Bertrand Delsart)
Date: Fri, 07 Nov 2014 17:45:12 +0100
Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx
In-Reply-To: <545CF033.4010503@oracle.com>
References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> <545CF033.4010503@oracle.com>
Message-ID: <545CF718.2020702@oracle.com>

Hi Max,

Like David, I think we should go further but this is one step in the right direction. Thanks for doing it.

Only noticed one small issue. The "or allow" part of the comment looks strange in lock_without_safepoint_check:

 void Monitor::lock_without_safepoint_check(Thread * Self) {
+  //Ensure that the Monitor does not require or allow safepoint checks.
+  assert(this->_safepoint_check_required != Monitor::_safepoint_check_always,
+         err_msg("This lock should always have a safepoint check: %s",
+                 this->name()));

Regards,

Bertrand (not a Reviewer).

On 07/11/2014 17:15, Max Ockner wrote:
> Hello all,
> I have made these additional changes:
> -Moved the assert() statements into the lock and lock_without_safepoint
> methods.
> -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired > -Changed the Monitor::SafepointCheckRequired values for several locks > which were locked outside of a MutexLockerEx (some were locked with > MutexLocker, some were locked were locked without any MutexLocker* ) > > New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ > > Additional testing: > jtreg ./jdk/test/java/lang/invoke > jtreg jfr tests > > Here is a list of ALL of the "sometimes" locks: > > "WorkGroup monitor" share/vm/utilities/workgroup.cpp > "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp > "CompactibleFreeListSpace._lock" > share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp > "freelist par lock" > share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp > "SR_lock" share/vm/runtime/thread.cpp > > The remaining "sometimes" locks can be found in > share/vm/runtime/mutexLocker.cpp: > > ParGCRareEvent_lock > Safepoint_lock > Threads_lock > VMOperationQueue_lock > VMOperationRequest_lock > Terminator_lock > Heap_lock > Compile_lock > PeriodicTask_lock > JfrStacktrace_lock > > I have not checked the validity of the "sometimes" locks, and I believe > that this should be a different project. > > Thanks for your help! > Max Ockner > On 10/15/2014 8:54 PM, David Holmes wrote: >> Hi Max, >> >> This is looking good. >> >> A few high-level initial comments: >> >> I think SafepointAllowed should be SafepointCheckNeeded >> >> Why are the checks in MutexLocker when the state is maintained in the >> mutex itself and the mutex/monitor has lock_without_safepoint, and >> wait() ? I would have expected to see the >> check in the mutex/monitor methods. >> >> Checking consistent usage of the _no_safepoint_check_flag is good. But >> another part of this is that a monitor/mutex that never checks for >> safepoints should never be held when a thread blocks at a safepoint - >> is there some way to easily check that? 
I was surprised how many locks >> are actually not checking for safepoints. >> >> Did you find any cases where the mutex/monitor was being used >> inconsistently and incorrectly? >> >> Did you analyse the "sometimes" cases to see if they were safe? >> (Aside: just for fun check out what happens if you lock the >> Threads_lock with a safepoint check and a safepoint has been requested >> :) ). >> >> Cheers, >> David >> >> On 16/10/2014 4:04 AM, Max Ockner wrote: >>> Hi all, >>> >>> I am a new member of the Hotspot runtime team in Burlington, MA. >>> Please review my first fix related to safepoint checking. >>> >>> Summary: MutexLockerEx can either acquire a lock with or without a >>> safepoint check. >>> In some cases, a particular lock must either safepoint check always or >>> never to avoid deadlocking. >>> Some other locks have semantics which allow them to avoid deadlocks >>> despite having a safepoint check only some of the time. >>> All locks that are OK having inconsistent safepoint checks have been >>> marked. All locks that should never safepoint check and all locks that >>> should always safepoint check have also been marked. >>> When a MutexLockerEx acquires a lock with or without a safepoint check, >>> the lock's safepointAllowed marker is checked to ensure consistent >>> safepoint checking. >>> >>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290 >>> >>> Tested with: >>> jprt "-testset hotspot" >>> jtreg hotspot >>> vm.quick.testlist >>> >>> Whitebox tests: >>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test >>> expects Assert ("This lock should always have a safepoint check") >>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test >>> expects Assert ("This lock should never have a safepoint check") >>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code >>> should not assert. 
(Lock is properly acquired with no safepoint check) >>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code >>> should not assert. (Lock is properly acquired with safepoint check) >>> >>> Thanks, >>> Max >>> > -- Bertrand Delsart, Grenoble Engineering Center Oracle, 180 av. de l'Europe, ZIRST de Montbonnot 38334 Saint Ismier, FRANCE bertrand.delsart at oracle.com Phone : +33 4 76 18 81 23 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From aph at redhat.com Fri Nov 7 17:21:45 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 07 Nov 2014 17:21:45 +0000 Subject: RFR: AARCH64: Top-level JDK changes Message-ID: <545CFFA9.4070107@redhat.com> The first patch: top-level build machinery changes. http://cr.openjdk.java.net/~aph/8064357-rev-1/ Andrew. From vladimir.kozlov at oracle.com Fri Nov 7 17:34:08 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 07 Nov 2014 09:34:08 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <545CFFA9.4070107@redhat.com> References: <545CFFA9.4070107@redhat.com> Message-ID: <545D0290.5080307@oracle.com> CCing to build-dev and JDK9-dev since it is top level changes. Note, it will go into staging aarch64 repo. Vladimir On 11/7/14 9:21 AM, Andrew Haley wrote: > The first patch: top-level build machinery changes. > > http://cr.openjdk.java.net/~aph/8064357-rev-1/ > > Andrew. 
>

From christian.thalinger at oracle.com Fri Nov 7 17:42:08 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 7 Nov 2014 09:42:08 -0800
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <545CFFA9.4070107@redhat.com>
References: <545CFFA9.4070107@redhat.com>
Message-ID: <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com>

> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote:
>
> The first patch: top-level build machinery changes.
>
> http://cr.openjdk.java.net/~aph/8064357-rev-1/

common/autoconf/flags.m4

+ aarch64)
+ ZERO_ARCHFLAG=""
+ ;;

Why is this required on aarch64 but not all the other architectures?

Otherwise this looks good.

>
> Andrew.

From aph at redhat.com Fri Nov 7 17:55:26 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 07 Nov 2014 17:55:26 +0000
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com>
References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com>
Message-ID: <545D078E.2090509@redhat.com>

On 11/07/2014 05:42 PM, Christian Thalinger wrote:
>
>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote:
>>
>> The first patch: top-level build machinery changes.
>>
>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>
> common/autoconf/flags.m4
>
> + aarch64)
> + ZERO_ARCHFLAG=""
> + ;;
>
> Why is this required on aarch64 but not all the other architectures?

I think it's because GCC rejects "-m64".

Andrew.

From volker.simonis at gmail.com Fri Nov 7 18:00:37 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 7 Nov 2014 19:00:37 +0100
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <545D0290.5080307@oracle.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com>
Message-ID:

The changes look good besides the ones to common/autoconf/build-aux/config.sub

When we did our initial check-in there were objections to modifying autoconf-config.guess because that one was "copied directly from the autoconf project and should not be modified". I think the same applies to config.sub as well.

On the other hand, we have an OpenJDK specific wrapper for autoconf-config.guess (i.e. config.guess) but there's no such wrapper for config.sub where we could fix up things.

So we have three possibilities:

1. make your change as suggested (which breaks the rule of not editing upstream files)
2. create a wrapper for config.sub, i.e. move config.sub to autoconf-config.sub and call it from config.sub which can contain arbitrary fix-up coding (which seems a little over-engineered IMHO)
3. pull in the new version of config.guess and config.sub from [1] which already seem to have the changes you need.

I'm all in favour of point three which would also allow us to get rid of some of the hacks which are currently in config.guess. And now, as we're still early in the jdk9 development, the risk of doing this seems minimal, but let's see what the build-dev guys say.

Regards,
Volker

[1] http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD

On Fri, Nov 7, 2014 at 6:34 PM, Vladimir Kozlov wrote:
> CCing to build-dev and JDK9-dev since it is top level changes.
>
> Note, it will go into staging aarch64 repo.
>
> Vladimir
>
>
> On 11/7/14 9:21 AM, Andrew Haley wrote:
>>
>> The first patch: top-level build machinery changes.
>> >> http://cr.openjdk.java.net/~aph/8064357-rev-1/ >> >> Andrew. >> > From daniel.daugherty at oracle.com Fri Nov 7 18:10:45 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 07 Nov 2014 11:10:45 -0700 Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <545BF5CD.6010008@oracle.com> References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> Message-ID: <545D0B25.4070602@oracle.com> On 11/6/14 3:27 PM, serguei.spitsyn at oracle.com wrote: > Hi reviewers, > > I'm suggesting to review a modified fix: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/ > src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c line 130: * that a command after VM_DEATH will be allowed to complete Perhaps "VM command" instead of "command" just to be clear. Thumbs up! And we'll want to push this changeset after a push to Main_Baseline so that it has (at least) a week to bake... Dan > > > The 3-rd round fix is not right as it caused deadlocks in several > tests (in nsk.jdi.testlist and jtreg com/sun/jdi). 
> > Here is a deadlock example: > > ----------------- lwp# 2 / thread# 2 -------------------- > ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) > ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, > d4400, 0, ffffffff7e357440, 100138730) + 100 > ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, > 100137000, 0, 1004405d0, 6e750, 0) + a4 > ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, > 100137000, 0, 0, 1, 20000000) + 358 > ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, > 100137000, 1, deab, 60000000, 100137000) + c8 > ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) > (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 > ffffffff7da2284c jvmtiError > JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, > ffffffffffffffff, 4, 9aeb0, 100137000) + 8c > ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, > ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c > ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, ffffffffffefd118, > ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c > ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, 2) > + 10c > ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + 138 > ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, > ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360 > ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, > ffffffff7ab3ad18, 1018, 1000) + 1d8 > ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, > ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 > ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, > ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, > ffffffff7e3e6b70) + 30c > ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40, > ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 > ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, > 
ffffffff7e3e8b30, ffffffff7e357440, 0, 10013700) + 1bc > ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, > 10035de68, 0, ffffffff7e4143b0) + 860 > ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) > > ----------------- lwp# 12 / thread# 12 -------------------- > ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) > ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, > d4400, 0, ffffffff7e357440, 100349930) + 100 > ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, > 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 > ffffffff7da22450 jvmtiError > JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, > 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 > ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, > ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 > ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, > ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c > ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, > ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 > ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, > ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 > ffffffff7da56b18 void JvmtiAgentThread::call_start_function() > (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + 128 > ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, > 3d8, 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 > ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, > fffffffea5f3e048, 3d8, 1003497f8) + 3ac > ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, > ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 > ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) > > ----------------- lwp# 13 / thread# 13 -------------------- > ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) > ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, > d4400, 0, ffffffff7e357440, 10034d330) + 100 > ffffffff7da5eac8 int 
JvmtiRawMonitor::SimpleWait(Thread*,long) > (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 > ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) > (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, > 10034c000) + e0 > ffffffff7da2284c jvmtiError > JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, > ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c > ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, > ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c > ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, > 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 > ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, > 0, 10000000) + ac > ffffffff7da56b18 void JvmtiAgentThread::call_start_function() > (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + 128 > ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, > 3d8, 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 > ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, > fffffffea5f3e290, 3d8, 10034cfe8) + 3ac > ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, > ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 > ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) > > > The details: > - Thread #2: The cbVMDeath() event handler is waiting on the > commandCompleteLock in the enqueueCommand(). > The call chain is: > cbVMDeath() -> event_callback() -> reportEvents() -> > eventHelper_reportEvents() -> enqueueCommand(). > The enqueueCommand() depends on the commandLoop() that has to call > completeCommand(command) for the command being enqueued. > This has not been set yet: gdata->vmDead = JNI_TRUE > > - Thread #12: The debugLoop_run blocked on the vmDeathLock enter > > - Thread #13: The commandLoop is waiting on the blockCommandLoopLock > in the doBlockCommandLoop(). 
> It is because blockCommandLoop == JNI_TRUE which is set in the
> needBlockCommandLoop()
> if the following condition is true:
> (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE &&
> cmd->u.reportEventComposite.suspendPolicy ==
> JDWP_SUSPEND_POLICY(ALL))
>
>
> It seems that debugLoop_run() blocking on the vmDeathLock causes the
> commandLoop() to wait indefinitely.
> The cbVMDeath() cannot proceed because the commandLoop() does not
> make progress.
>
> The vmDeathLock critical section in the cbVMDeath() event callback
> seems to be overkill (unnecessary).
> A less intrusive synchronization is required here, which is to wait
> until the current command is completed
> before returning to the JvmtiExport::post_vm_death().
>
> The new approach (see new webrev) is to extend the resumeLock
> synchronization pattern
> to the whole VirtualMachine set of commands, not only the resume command.
> The resumeLock name is replaced with the vmDeathLock to reflect the new
> semantics.
>
> In general, we could consider doing the same for the rest of the JDWP
> command sets.
> But it is better to be careful and see how this change goes first.
>
>
> Thanks,
> Serguei
>
>
> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> Thank you for the concerns!
>> Testing showed several tests failing with deadlocks.
>> Scenarios are similar to that you describe.
>>
>> Trying to understand the details.
>>
>> Thanks,
>> Serguei
>>
>> On 11/4/14 4:09 PM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 11/2/14 8:58 PM, David Holmes wrote:
>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
>>>>>> Serguei,
>>>>>>
>>>>>> Thank you for good finding. This approach looks much better for me.
>>>>>>
>>>>>> The fix looks good.
>>>>>>
>>>>>> Is it necessary to release vmDeathLock locks at
>>>>>> eventHandler.c:1244 before call
>>>>>>
>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
>>>>> >>>>> I agree this looks necessary, or at least more clean (if things are >>>>> failing we really don't know what is happening). >>>> >>>> Agreed (replied to Dmitry). >>>> >>>>> >>>>> More generally I'm concerned about whether any of the code paths >>>>> taken >>>>> while holding the new lock can result in deadlock - in particular >>>>> with >>>>> regard to the resumeLock ? >>>> >>>> The cbVMDeath() function never holds both vmDeathLock and >>>> resumeLock at >>>> the same time, >>>> so there is no chance for a deadlock that involves both these locks. >>>> >>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>> callbackLock. >>>> These two locks look completely unrelated to the debugLoop_run(). >>>> >>>> The debugLoop_run() function also uses the cmdQueueLock. >>>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock >>>> at the >>>> same time. >>>> >>>> So that I do not see any potential to introduce new deadlock with the >>>> vmDeathLock. >>>> >>>> However, it is still easy to overlook something here. >>>> Please, let me know if you see any danger. >>> >>> I was mainly concerned about what might happen in the call chain for >>> threadControl_resumeAll() (it certainly sounds like it might need to >>> use a resumeLock :) ). I see direct use of the threadLock and >>> indirectly the eventHandler lock; but there are further call paths I >>> did not explore. Wish there was an easy way to determine the >>> transitive closure of all locks used from a given call. 
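The wish above, finding every lock that can be taken anywhere below a given call, amounts to a reachability walk over a call graph. The following is only a sketch under stated assumptions: the graph is hand-written and hypothetical (the function and lock names entry, helper, leaf, lockA, lockB, lockC do not come from the JDWP agent), and building a real call graph for the agent is a separate problem not addressed here.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Per-function facts in a hypothetical, hand-written call graph.
struct FunctionInfo {
  std::vector<std::string> callees;  // functions called directly
  std::vector<std::string> locks;    // locks acquired directly
};

using CallGraph = std::map<std::string, FunctionInfo>;

// Depth-first walk; 'visited' guards against recursion and call cycles.
static void collect_locks(const CallGraph& graph, const std::string& fn,
                          std::set<std::string>& visited,
                          std::set<std::string>& locks) {
  if (!visited.insert(fn).second) {
    return;  // already explored
  }
  CallGraph::const_iterator it = graph.find(fn);
  if (it == graph.end()) {
    return;  // unknown function: nothing recorded for it
  }
  locks.insert(it->second.locks.begin(), it->second.locks.end());
  for (const std::string& callee : it->second.callees) {
    collect_locks(graph, callee, visited, locks);
  }
}

// Returns every lock acquired by 'root' or by anything it can reach.
std::set<std::string> locks_reachable_from(const CallGraph& graph,
                                           const std::string& root) {
  std::set<std::string> visited;
  std::set<std::string> locks;
  collect_locks(graph, root, visited, locks);
  return locks;
}

// Illustrative graph only; includes a cycle (helper calls entry).
CallGraph example_graph() {
  CallGraph g;
  g["entry"] = FunctionInfo{{"helper", "leaf"}, {"lockA"}};
  g["helper"] = FunctionInfo{{"leaf", "entry"}, {"lockB"}};
  g["leaf"] = FunctionInfo{{}, {"lockC"}};
  return g;
}
```

Calling locks_reachable_from(example_graph(), "entry") collects lockA, lockB and lockC; comparing such sets for two entry points shows which locks both paths can contend on, though not the order in which they are taken.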
>>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> David >>>>> >>>>>> -Dmitry >>>>>> >>>>>> >>>>>> >>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>> >>>>>>> It is 3-rd round of review for: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>> >>>>>>> New webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Summary >>>>>>> >>>>>>> For failing scenario, please, refer to the 1-st round RFR below. >>>>>>> >>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>> decided to >>>>>>> switch from a workaround to a real fix. >>>>>>> >>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>> gdata->vmDead = 1. >>>>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>>>> >>>>>>> 165 } else if (gdata->vmDead && >>>>>>> 166 ((cmd->cmdSet) != >>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>> 168 * VirtualMachine cmdSet quietly ignores some >>>>>>> cmds >>>>>>> 169 * after VM death, so, it sends it's own >>>>>>> errors. >>>>>>> 170 */ >>>>>>> 171 outStream_setError(&out, >>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>> >>>>>>> >>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>> event >>>>>>> happens in the middle of a command execution. >>>>>>> There is a lack of synchronization here. >>>>>>> >>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>> allow to >>>>>>> execute the commands >>>>>>> and the VM_DEATH event callback concurrently. >>>>>>> It should work well for any function that is used in >>>>>>> implementation of >>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . 
>>>>>>> >>>>>>> >>>>>>> Testing: >>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>> tests >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> The updated webrev: >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The changes are: >>>>>>>> - added a comment recommended by Staffan >>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>> classSignature() >>>>>>>> >>>>>>>> The classSignature() function is called in 16 places. >>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>> signature >>>>>>>> and will crash. >>>>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>>>> return to this >>>>>>>> issue after gaining experience with more failure cases that are >>>>>>>> still >>>>>>>> expected. >>>>>>>> The failure with the classSignature() involved was observed >>>>>>>> only once >>>>>>>> in the nightly >>>>>>>> and should be extremely rare reproducible. >>>>>>>> I'll file a placeholder bug if necessary. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Please, review the fix for: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>> >>>>>>>>> >>>>>>>>> Open webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> >>>>>>>>> The failing scenario: >>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>> shutdown has >>>>>>>>> been started in the target process. >>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>> commands >>>>>>>>> to the JDWP agent. 
>>>>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>>>> (debuggee side) >>>>>>>>> are not in sync with the consumer layers. >>>>>>>>> >>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>> the JDI >>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>> processes >>>>>>>>> are uneasy to sync in general. >>>>>>>>> >>>>>>>>> As a result the following steps are possible: >>>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>>> debuggee >>>>>>>>> - The debuggee is normally exiting >>>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>>> anonymous class unload event >>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>> ClassUnloadEvent event >>>>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>>>> reference type. >>>>>>>>> If there is more than one class with the same host >>>>>>>>> class >>>>>>>>> signature, it can't distinguish them, >>>>>>>>> and so, deletes all references and re-retrieves them >>>>>>>>> again >>>>>>>>> (see tracing below): >>>>>>>>> MY_TRACE: JDI: >>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>>>> - The jdwp backend debugLoop_run() gets the command >>>>>>>>> from JDI >>>>>>>>> and calls the functions >>>>>>>>> classesForSignature() and classStatus() recursively. 
>>>>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>>>> GetClassStatus() >>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>>>> - As a result the jdwp backend reports the JVMTI error >>>>>>>>> to the >>>>>>>>> JDI, and so, the test fails >>>>>>>>> >>>>>>>>> For details, see the analysis in bug report closed as a >>>>>>>>> dup of >>>>>>>>> the bug 6988950: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>>>> >>>>>>>>> Some similar cases can be found in the two bug reports >>>>>>>>> (6988950 >>>>>>>>> and 8024865) describing this issue. >>>>>>>>> >>>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE >>>>>>>>> error >>>>>>>>> as it is normal at the VM shutdown. >>>>>>>>> The original jdwp backend implementation had a similar >>>>>>>>> approach >>>>>>>>> for the raw monitor functions. >>>>>>>>> Threy use the ignore_vm_death() to workaround the >>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>>>>> For reference, please, see the file: src/share/back/util.c >>>>>>>>> >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>>> tests >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>> >> > From christian.thalinger at oracle.com Fri Nov 7 18:10:39 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Fri, 7 Nov 2014 10:10:39 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <545D078E.2090509@redhat.com> References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com> Message-ID: > On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote: > > On 11/07/2014 05:42 PM, Christian Thalinger wrote: >> >>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote: >>> >>> The first patch: top-level build machinery changes. 
>>>
>>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>>
>> common/autoconf/flags.m4
>>
>> + aarch64)
>> + ZERO_ARCHFLAG=""
>> + ;;
>>
>> Why is this required on aarch64 but not all the other architectures?
>
> I think it's because GCC rejects "-m64".

That's interesting. I thought -m is some kind of common flag that works on all architectures. Can someone verify this?

>
> Andrew.
>

From aph at redhat.com Fri Nov 7 18:19:31 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 07 Nov 2014 18:19:31 +0000
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To:
References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com>
Message-ID: <545D0D33.70400@redhat.com>

On 11/07/2014 06:10 PM, Christian Thalinger wrote:
>
>> On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote:
>>
>> On 11/07/2014 05:42 PM, Christian Thalinger wrote:
>>>
>>>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote:
>>>>
>>>> The first patch: top-level build machinery changes.
>>>>
>>>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>>>
>>> common/autoconf/flags.m4
>>>
>>> + aarch64)
>>> + ZERO_ARCHFLAG=""
>>> + ;;
>>>
>>> Why is this required on aarch64 but not all the other architectures?
>>
>> I think it's because GCC rejects "-m64".
>
> That's interesting. I thought -m is some kind of common
> flag that works on all architectures.

No, all the "-m" stuff is target-dependent.

> Can someone verify this?

mustang-01:~ $ gcc -m64 hello.c
gcc: error: unrecognized command line option '-m64'
mustang-01:~ $ gcc --version
gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16)

Andrew.

From christian.thalinger at oracle.com Fri Nov 7 18:21:31 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 7 Nov 2014 10:21:31 -0800
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To:
References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com>
Message-ID: <9C88630B-54EC-47A2-869D-0C2F5E507ECB@oracle.com>

> On Nov 7, 2014, at 10:10 AM, Christian Thalinger wrote:
>
>>
>> On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote:
>>
>> On 11/07/2014 05:42 PM, Christian Thalinger wrote:
>>>
>>>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote:
>>>>
>>>> The first patch: top-level build machinery changes.
>>>>
>>>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>>>
>>> common/autoconf/flags.m4
>>>
>>> + aarch64)
>>> + ZERO_ARCHFLAG=""
>>> + ;;
>>>
>>> Why is this required on aarch64 but not all the other architectures?
>>
>> I think it's because GCC rejects "-m64".
>
> That's interesting. I thought -m is some kind of common flag that works on all architectures. Can someone verify this?

This page doesn't list it (while x86, SPARC, and PowerPC pages do):

https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

I guess it's good then.

>
>>
>> Andrew.

From christian.thalinger at oracle.com Fri Nov 7 18:21:56 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 7 Nov 2014 10:21:56 -0800
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <545D0D33.70400@redhat.com>
References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com> <545D0D33.70400@redhat.com>
Message-ID: <7E34D2EE-A7D8-4B2D-BF47-627C566C8992@oracle.com>

> On Nov 7, 2014, at 10:19 AM, Andrew Haley wrote:
>
> On 11/07/2014 06:10 PM, Christian Thalinger wrote:
>>
>>> On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote:
>>>
>>> On 11/07/2014 05:42 PM, Christian Thalinger wrote:
>>>>
>>>>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote:
>>>>>
>>>>> The first patch: top-level build machinery changes.
>>>>>
>>>>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>>>>
>>>> common/autoconf/flags.m4
>>>>
>>>> + aarch64)
>>>> + ZERO_ARCHFLAG=""
>>>> + ;;
>>>>
>>>> Why is this required on aarch64 but not all the other architectures?
>>>
>>> I think it's because GCC rejects "-m64".
>>
>> That's interesting. I thought -m is some kind of common
>> flag that works on all architectures.
>
> No, all the "-m" stuff is target-dependent.
>
>> Can someone verify this?
>
> mustang-01:~ $ gcc -m64 hello.c
> gcc: error: unrecognized command line option '-m64'
> mustang-01:~ $ gcc --version
> gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16)

Thanks :-)

>
> Andrew.
From jiangli.zhou at oracle.com Fri Nov 7 18:33:47 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 07 Nov 2014 10:33:47 -0800 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545CC42A.8030004@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> Message-ID: <545D108B.5070504@oracle.com> Hi Eric, Looks okay. You also need a capital R reviewer for the change. Thanks, Jiangli On 11/07/2014 05:07 AM, Eric McCorkle wrote: > On 11/06/14 18:53, Jiangli Zhou wrote: > >> Could you please point to the updated webrev? I don't see the update in >> http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html. > I made a mistake uploading it. It's here: > http://cr.openjdk.java.net/~emc/8058313/webrev.02/ > >> Thanks, >> Jiangli >>>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>>> Please review this issue so that it can go in along with 8058322. >>>>> Thanks. >>>>> >>>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>>> Thank you for the pointers. I have applied your changes and refreshed >>>>>> the webrev. >>>>>> >>>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>>> >>>>>> Also, I have posted the test for this and another patch here: >>>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>>> >>>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>>> Hi Eric, >>>>>>> >>>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>>> Hi Eric, >>>>>>>>> >>>>>>>>> I wonder if we could specialize this particular case and avoid >>>>>>>>> changing >>>>>>>>> the parsing code. 
How about setting the _has_method_parameters >>>>>>>>> flag in >>>>>>>>> the ConstMethod when encounter such MethodParameter, and changing >>>>>>>>> JVM_GetMethodParameters() to return non-NULL value for such case >>>>>>>>> when >>>>>>>>> _has_method_parameters is true but method_parameters_length is 0. >>>>>>>>> Would >>>>>>>>> that work? >>>>>>>> Which parser are you talking about? The inline tables parser, or >>>>>>>> the >>>>>>>> class file parser. The class file parser has to change, because it >>>>>>>> was >>>>>>>> previously ignoring MethodParameters attributes with >>>>>>>> parameter_count 0. >>>>>>> It's the class parsing changes that I was referring to, mostly >>>>>>> relate to >>>>>>> the initialization and checking against method_parameters_length. >>>>>>> It's a >>>>>>> bit awkward to include the 0 case but also skipping it in the >>>>>>> loop. For >>>>>>> example, the following code in classFileParser.cpp changed ">" to >>>>>>> ">=" >>>>>>> in the if check, but has no real effect and is not need. >>>>>>> >>>>>>> 2486 // Copy method parameters >>>>>>> 2487 if (method_parameters_length >= 0) { >>>>>>> 2488 MethodParametersElement* elem = >>>>>>> m->constMethod()->method_parameters_start(); >>>>>>> 2489 for (int i = 0; i < method_parameters_length; i++) { >>>>>>> 2490 elem[i].name_cp_index = >>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>> 2491 method_parameters_data += 2; >>>>>>> 2492 elem[i].flags = >>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>> 2493 method_parameters_data += 2; >>>>>>> 2494 } >>>>>>> 2495 } >>>>>>> >>>>>>> >>>>>>>> I don't think your proposal will work. The inline tables' >>>>>>>> offsets are >>>>>>>> all dependent on what inline tables are actually present. 
If >>>>>>>> _has_method_parameters is set, then the inline tables code >>>>>>>> expects the >>>>>>>> last u2 of the inline tables to be a u2 indicating the number of >>>>>>>> method >>>>>>>> parameters entries, preceeded by the array of method parameters >>>>>>>> data. >>>>>>>> If _has_method_parameters is false, then it expects that there is no >>>>>>>> method parameters information at all (including no length >>>>>>>> field). If >>>>>>>> you were to set _has_method_parameters, but not store any >>>>>>>> information in >>>>>>>> the inline table, then it would cause errors for all the rest of the >>>>>>>> inline tables. >>>>>>> Thank you for reminding me of the complexity of the inlined table >>>>>>> calculation in the ConstMethod. My proposal would require tweaks in >>>>>>> that >>>>>>> area to correctly compute the table sizes. As it's easy to introduce >>>>>>> bugs in that area, it's not worth to change the table calculation >>>>>>> code >>>>>>> for this purpose. I agree my proposal is not a better choice in this >>>>>>> case. >>>>>>> >>>>>>>> What I do for the parameter_count = 0 case is just store >>>>>>>> a 0 u2 for zero-length method parameters information, and no data. >>>>>>>> All >>>>>>>> the existing inline tables code works fine with this case, so there >>>>>>>> aren't any serious changes to the inline tables code (other than >>>>>>>> allowing method parameters information to be stored when the >>>>>>>> array is >>>>>>>> length 0). But you have to make some change to the inline table >>>>>>>> code, >>>>>>>> otherwise the information won't be stored. >>>>>>> Ok. Could you please add comments to the change in constMethod.cpp to >>>>>>> explain above? >>>>>>> >>>>>>> In jvm.cpp, since -1 represents no method parameter now. Maybe >>>>>>> checking >>>>>>> against explicity and add comments for the 0-length case. >>>>>>> >>>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject >>>>>>> method)) >>>>>>> { >>>>>>> ... 
>>>>>>> // No method parameter >>>>>>> if (num_params == -1) { >>>>>>> return (jobjectArray)NULL; >>>>>>> } >>>>>>> >>>>>>> /* handle the rest here */ >>>>>>> // make sure all the symbols are properly formatted >>>>>>> for (int i = 0; i < num_params; i++) { >>>>>>> ... >>>>>>> } >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> Please review this fix for parameter reflection which addresses >>>>>>>>>> hotspot >>>>>>>>>> falsely ignoring zero-length MethodParameter attributes. The JVMS >>>>>>>>>> allows a MethodParameters attribute with parameter_count = 0, and >>>>>>>>>> the >>>>>>>>>> parameter reflection spec states that a >>>>>>>>>> MalformedParametersException >>>>>>>>>> should be thrown if parameter_count does not match the number of >>>>>>>>>> real >>>>>>>>>> parameters to a method. Hotspot currently ignores >>>>>>>>>> MethodParameters >>>>>>>>>> attributes with parameter_count = 0; however, in a case where a >>>>>>>>>> (bad) >>>>>>>>>> MethodParameters attribute has parameter_count = 0, but the method >>>>>>>>>> has a >>>>>>>>>> nonzero number of real parameters, hotspot will return null from >>>>>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>>>>> reflection API >>>>>>>>>> acts like there is no MethodParameters attribute). >>>>>>>>>> >>>>>>>>>> This patch causes hotspot to record the fact that a zero-length >>>>>>>>>> MethodParameters attribute does exist, causing the exception to be >>>>>>>>>> thrown when it should be. 
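[Editor's sketch] The distinction discussed above, where -1 means "no MethodParameters attribute" and 0 means "an attribute is present with parameter_count = 0", can be illustrated with a small stand-alone C function. This is only a sketch under stated assumptions: the enum, the function name, and the idea of performing the comparison in one place are hypothetical and are not taken from HotSpot or the reflection library; the thread only says that the mismatch should end in a MalformedParametersException.

```c
#include <assert.h>

/* Hypothetical result codes for this sketch; not HotSpot or JDK names. */
enum check_result {
    NO_ATTRIBUTE,     /* no MethodParameters attribute was recorded */
    PARAMS_OK,        /* recorded count matches the real parameter count */
    PARAMS_MALFORMED  /* mismatch: the case that should raise the exception */
};

/*
 * recorded_count: -1 when the method has no MethodParameters attribute,
 *                 otherwise the attribute's parameter_count (0 is valid).
 * real_count:     number of real parameters of the method.
 */
static enum check_result check_method_parameters(int recorded_count,
                                                 int real_count)
{
    assert(real_count >= 0);
    if (recorded_count == -1) {
        return NO_ATTRIBUTE;
    }
    return recorded_count == real_count ? PARAMS_OK : PARAMS_MALFORMED;
}
```

With this encoding, a zero-length attribute on a method that has real parameters is reported as malformed instead of being treated as absent, which is the behaviour the patch description asks for.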
>>>>>>>>>> >>>>>>>>>> The bug is here: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>>>>>> >>>>>>>>>> The webrev is here: >>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ From aph at redhat.com Fri Nov 7 18:53:03 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 07 Nov 2014 18:53:03 +0000 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> Message-ID: <545D150F.0@redhat.com> On 11/07/2014 06:00 PM, Volker Simonis wrote: > 3. pull in the new version of config.guess and config.sub from [1] > which already seem to have the changes you need. > > I'm all in favour of point three which would also allow us to get rid > of some of the hacks which are currently in config.guess. And now, as > we're still early in the jdk9 development the risk of doing this seems > minimal, but let's see what the build-dev guy say? So am I. build-dev people, do you want me to import config.guess from upstream? I can create a new issue. Andrew. From hws689 at gmail.com Fri Nov 7 19:11:52 2014 From: hws689 at gmail.com (Hilton Wichwski Silva) Date: Fri, 7 Nov 2014 17:11:52 -0200 Subject: No subject Message-ID: From max.ockner at oracle.com Fri Nov 7 19:13:29 2014 From: max.ockner at oracle.com (Max Ockner) Date: Fri, 07 Nov 2014 14:13:29 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. Message-ID: <545D19D9.3040400@oracle.com> ID: 8060449 webrev: http://cr.openjdk.java.net/~coleenp/8060449/ Summary: A "newly obsolete" command line option is one which is no longer supported, but still is acknowledged. There is a list of these in arguments.cpp. It used to be that only a fixed number of characters were checked when comparing a given command line option to the list of obsolete flags (strncmp was used, where the number of characters to check is equal to the length of the flag name from the table.) 
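[Editor's sketch] The prefix-only comparison described here can be reproduced in a few lines of C. This is a minimal illustration, not the code in arguments.cpp: the flag names, the table, and both function names are hypothetical. The second function adds the length comparison that the summary goes on to describe as the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical table for this sketch; the real list lives in arguments.cpp. */
static const char *const obsolete_flags[] = { "OldFlagA", "OldFlagB" };
#define OBSOLETE_COUNT (sizeof(obsolete_flags) / sizeof(obsolete_flags[0]))

/* Prefix-only check: compares just strlen(table entry) characters. */
static bool is_obsolete_prefix_only(const char *flag)
{
    assert(flag != NULL);
    for (size_t i = 0; i < OBSOLETE_COUNT; i++) {
        size_t len = strlen(obsolete_flags[i]);
        if (strncmp(flag, obsolete_flags[i], len) == 0) {
            return true;   /* also true for "OldFlagA<anything>" */
        }
    }
    return false;
}

/* Length-aware check: the two lengths must agree as well. */
static bool is_obsolete_exact(const char *flag)
{
    assert(flag != NULL);
    size_t flag_len = strlen(flag);
    for (size_t i = 0; i < OBSOLETE_COUNT; i++) {
        size_t len = strlen(obsolete_flags[i]);
        if (flag_len == len && strncmp(flag, obsolete_flags[i], len) == 0) {
            return true;
        }
    }
    return false;
}
```

In this sketch a string such as "OldFlagAgarbage" passes the prefix-only check but is rejected by the length-aware one.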
As a result, an arbitrary string appended to the end of an obsolete flag went unnoticed. This is fixed by also comparing the length of the given flag with the length of each flag in the obsolete flags table.

When a misspelled flag is fuzzy-matched to an obsolete flag, an appropriate warning is given to save the user a few keystrokes:
(1) unrecognized option [bad option].
(2) Did you mean [option]?
(3) [option] is obsolete as of [version]

A new test for this feature checks for the presence of all three components of the above error message.

Tested with: vm.quick.testlist hotspot jtreg tests jprt

Thanks for your help!
Max Ockner

From serguei.spitsyn at oracle.com Fri Nov 7 19:26:16 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 07 Nov 2014 11:26:16 -0800
Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <545D0B25.4070602@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> <545D0B25.4070602@oracle.com>
Message-ID: <545D1CD8.3020304@oracle.com>

On 11/7/14 10:10 AM, Daniel D. Daugherty wrote:
> On 11/6/14 3:27 PM, serguei.spitsyn at oracle.com wrote:
>> Hi reviewers,
>>
>> I'm suggesting to review a modified fix:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/
>>
>
> src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
> line 130: * that a command after VM_DEATH will be allowed to complete
> Perhaps "VM command" instead of "command" just to be clear.

Agreed - done

>
> Thumbs up! And we'll want to push this changeset after a push
> to Main_Baseline so that it has (at least) a week to bake...

Ok. Thanks a lot!
Serguei > > Dan > > >> >> >> The 3-rd round fix is not right as it caused deadlocks in several >> tests (in nsk.jdi.testlist and jtreg com/sun/jdi). >> >> Here is a deadlock example: >> >> ----------------- lwp# 2 / thread# 2 -------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, >> d4400, 0, ffffffff7e357440, 100138730) + 100 >> ffffffff7dc3151c int Monitor::IWait(Thread*,long) (ffffffff7e3c5b98, >> 100137000, 0, 1004405d0, 6e750, 0) + a4 >> ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, >> 100137000, 0, 0, 1, 20000000) + 358 >> ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, >> 100137000, 1, deab, 60000000, 100137000) + c8 >> ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 258 >> ffffffff7da2284c jvmtiError >> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, >> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c >> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c >> ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, >> ffffffffffefd118, ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c >> ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 1, >> 2) + 10c >> ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + 138 >> ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, >> ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 360 >> ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, >> ffffffff7ab3ad18, 1018, 1000) + 1d8 >> ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 100137000, >> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 >> ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, >> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, >> ffffffff7e3e6b70) + 30c >> ffffffff7de72128 
bool Threads::destroy_vm() (100137000, 100110a40, >> ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 >> ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, >> ffffffff7e3e8b30, ffffffff7e357440, 0, 10013700) + 1bc >> ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, >> 10035de68, 0, ffffffff7e4143b0) + 860 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> ----------------- lwp# 12 / thread# 12 -------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, >> d4400, 0, ffffffff7e357440, 100349930) + 100 >> ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) (10034a070, >> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 >> ffffffff7da22450 jvmtiError >> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, >> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 >> ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, >> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 >> ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, >> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c >> ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, >> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 >> ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, >> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 >> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + >> 128 >> ffffffff7de6a678 void JavaThread::thread_main_inner() (100348800, >> 3d8, 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 >> ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, >> fffffffea5f3e048, 3d8, 1003497f8) + 3ac >> ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, >> ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> ----------------- lwp# 13 / thread# 13 
-------------------- >> ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) >> ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, >> d4400, 0, ffffffff7e357440, 10034d330) + 100 >> ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) >> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 >> ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, >> 10034c000) + e0 >> ffffffff7da2284c jvmtiError >> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, >> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c >> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c >> ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, >> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 >> ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, ffffffff7ab3ad18, >> 0, 10000000) + ac >> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + >> 128 >> ffffffff7de6a678 void JavaThread::thread_main_inner() (10034c000, >> 3d8, 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 >> ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, >> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac >> ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, >> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 >> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >> >> >> The details: >> - Thread #2: The cbVMDeath() event handler is waiting on the >> commandCompleteLock in the enqueueCommand(). >> The call chain is: >> cbVMDeath() -> event_callback() -> reportEvents() -> >> eventHelper_reportEvents() -> enqueueCommand(). >> The enqueueCommand() depends on the commandLoop() that has to call >> completeCommand(command) for the command being enqueued. 
>> This has not been set yet: gdata->vmDead = JNI_TRUE >> >> - Thread #12: The debugLoop_run blocked on the vmDeathLock enter >> >> - Thread #13: The commandLoop is waiting on the >> blockCommandLoopLock in the doBlockCommandLoop(). >> It is because blockCommandLoop == JNI_TRUE which is set in the >> needBlockCommandLoop() >> if the following condition is true: >> (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE && >> cmd->u.reportEventComposite.suspendPolicy == >> JDWP_SUSPEND_POLICY(ALL)) >> >> >> It seems, the debugLoop_run() block on the vmDeathLock causes the >> commandLoop() to wait indefinitely. >> The cbVMDeath() can not proceed because the commandLoop() does not >> make a progress. >> >> The vmDeathLock critical section in the cbVMDeath() event callback >> seems to be an overkill (unnecessary). >> A less intrusive synchronization is required here which is to wait >> until the current command is completed >> before returning to the JvmtiExport::post_vm_death(). >> >> The new approach (see new webrev) is to extend the resumeLock >> synchronization pattern >> to all VirtualMachine set of commands, not only the resume command. >> The resumeLock name is replaced with the vmDeathLock to reflect new >> semantics. >> >> In general, we could consider to do the same for the rest of the JDWP >> command sets. >> But it is better to be careful and see how this change goes first. >> >> >> Thanks, >> Serguei >> >> >> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> Thank you for the concerns! >>> Testing showed several tests failing with deadlocks. >>> Scenarios are similar to that you describe. >>> >>> Trying to understand the details. 
>>> >>> Thanks, >>> Serguei >>> >>> On 11/4/14 4:09 PM, David Holmes wrote: >>>> Hi Serguei, >>>> >>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 11/2/14 8:58 PM, David Holmes wrote: >>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>>>> Serguei, >>>>>>> >>>>>>> Thank you for good finding. This approach looks much better for me. >>>>>>> >>>>>>> The fix looks good. >>>>>>> >>>>>>> Is it necessary to release vmDeathLock locks at >>>>>>> eventHandler.c:1244 before call >>>>>>> >>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>>>> >>>>>> I agree this looks necessary, or at least more clean (if things are >>>>>> failing we really don't know what is happening). >>>>> >>>>> Agreed (replied to Dmitry). >>>>> >>>>>> >>>>>> More generally I'm concerned about whether any of the code paths >>>>>> taken >>>>>> while holding the new lock can result in deadlock - in particular >>>>>> with >>>>>> regard to the resumeLock ? >>>>> >>>>> The cbVMDeath() function never holds both vmDeathLock and >>>>> resumeLock at >>>>> the same time, >>>>> so there is no chance for a deadlock that involves both these locks. >>>>> >>>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>>> callbackLock. >>>>> These two locks look completely unrelated to the debugLoop_run(). >>>>> >>>>> The debugLoop_run() function also uses the cmdQueueLock. >>>>> The debugLoop_run() never holds both vmDeathLock and cmdQueueLock >>>>> at the >>>>> same time. >>>>> >>>>> So that I do not see any potential to introduce new deadlock with the >>>>> vmDeathLock. >>>>> >>>>> However, it is still easy to overlook something here. >>>>> Please, let me know if you see any danger. >>>> >>>> I was mainly concerned about what might happen in the call chain >>>> for threadControl_resumeAll() (it certainly sounds like it might >>>> need to use a resumeLock :) ). 
I see direct use of the threadLock >>>> and indirectly the eventHandler lock; but there are further call >>>> paths I did not explore. Wish there was an easy way to determine >>>> the transitive closure of all locks used from a given call. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> David >>>>>> >>>>>>> -Dmitry >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>>> >>>>>>>> It is 3-rd round of review for: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>> >>>>>>>> New webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary >>>>>>>> >>>>>>>> For failing scenario, please, refer to the 1-st round RFR >>>>>>>> below. >>>>>>>> >>>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>>> decided to >>>>>>>> switch from a workaround to a real fix. >>>>>>>> >>>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>>> gdata->vmDead = 1. >>>>>>>> The agent debugLoop_run() has a guard against the VM shutdown: >>>>>>>> >>>>>>>> 165 } else if (gdata->vmDead && >>>>>>>> 166 ((cmd->cmdSet) != >>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>>> 168 * VirtualMachine cmdSet quietly ignores >>>>>>>> some >>>>>>>> cmds >>>>>>>> 169 * after VM death, so, it sends it's own >>>>>>>> errors. >>>>>>>> 170 */ >>>>>>>> 171 outStream_setError(&out, >>>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>>> >>>>>>>> >>>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>>> event >>>>>>>> happens in the middle of a command execution. >>>>>>>> There is a lack of synchronization here. >>>>>>>> >>>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>>> allow to >>>>>>>> execute the commands >>>>>>>> and the VM_DEATH event callback concurrently. 
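[Editor's sketch] The synchronization idea quoted here, where the "is the VM dead?" check and the command body run under one lock that the VM_DEATH callback also takes, can be sketched with POSIX threads. Everything below is hypothetical: the names vm_death_lock, vm_dead, run_command and on_vm_death are stand-ins invented for the illustration, and the agent itself uses JVMTI raw monitors rather than pthread mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the agent's state; not the libjdwp definitions. */
static pthread_mutex_t vm_death_lock = PTHREAD_MUTEX_INITIALIZER;
static bool vm_dead = false;

enum cmd_status { CMD_DONE, CMD_VM_DEAD };

typedef void (*command_fn)(void *arg);

/* Command side: the dead check and the command body form one critical
 * section, so VM death cannot be signalled in the middle of a command. */
static enum cmd_status run_command(command_fn fn, void *arg)
{
    enum cmd_status status = CMD_VM_DEAD;
    assert(fn != NULL);
    pthread_mutex_lock(&vm_death_lock);
    if (!vm_dead) {
        fn(arg);
        status = CMD_DONE;
    }
    pthread_mutex_unlock(&vm_death_lock);
    return status;
}

/* Callback side: waits for any in-flight command, then marks the VM dead. */
static void on_vm_death(void)
{
    pthread_mutex_lock(&vm_death_lock);
    vm_dead = true;
    pthread_mutex_unlock(&vm_death_lock);
}

/* Trivial command used to observe whether a command body actually ran. */
static void count_call(void *arg)
{
    ++*(int *)arg;
}
```

A command issued before on_vm_death() runs to completion; one issued afterwards is refused without executing. Elsewhere in this thread the round-three change is reported to have deadlocked in testing and is replaced in round four, so the sketch illustrates the idea only, not the final fix.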
>>>>>>>> It should work well for any function that is used in >>>>>>>> implementation of >>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>>>>>> >>>>>>>> >>>>>>>> Testing: >>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG >>>>>>>> com/sun/jdi tests >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> The updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The changes are: >>>>>>>>> - added a comment recommended by Staffan >>>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>>> classSignature() >>>>>>>>> >>>>>>>>> The classSignature() function is called in 16 places. >>>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>>> signature >>>>>>>>> and will crash. >>>>>>>>> I'm not comfortable to fix all the occurrences now and suggest to >>>>>>>>> return to this >>>>>>>>> issue after gaining experience with more failure cases that >>>>>>>>> are still >>>>>>>>> expected. >>>>>>>>> The failure with the classSignature() involved was observed >>>>>>>>> only once >>>>>>>>> in the nightly >>>>>>>>> and should be extremely rare reproducible. >>>>>>>>> I'll file a placeholder bug if necessary. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Please, review the fix for: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Open webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> The failing scenario: >>>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>>> shutdown has >>>>>>>>>> been started in the target process. 
>>>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>>> commands >>>>>>>>>> to the JDWP agent. >>>>>>>>>> However, the JDI layer (debugger side) and the jdwp agent >>>>>>>>>> (debuggee side) >>>>>>>>>> are not in sync with the consumer layers. >>>>>>>>>> >>>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>>> the JDI >>>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>>> processes >>>>>>>>>> are uneasy to sync in general. >>>>>>>>>> >>>>>>>>>> As a result the following steps are possible: >>>>>>>>>> - The test debugger sends a 'quit' command to the test >>>>>>>>>> debuggee >>>>>>>>>> - The debuggee is normally exiting >>>>>>>>>> - The jdwp backend reports (over the jdwp protocol) an >>>>>>>>>> anonymous class unload event >>>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>>> ClassUnloadEvent event >>>>>>>>>> - The InternalEventHandler wants to uncache the matching >>>>>>>>>> reference type. >>>>>>>>>> If there is more than one class with the same host >>>>>>>>>> class >>>>>>>>>> signature, it can't distinguish them, >>>>>>>>>> and so, deletes all references and re-retrieves >>>>>>>>>> them again >>>>>>>>>> (see tracing below): >>>>>>>>>> MY_TRACE: JDI: >>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>>>>> - The jdwp backend debugLoop_run() gets the command >>>>>>>>>> from JDI >>>>>>>>>> and calls the functions >>>>>>>>>> classesForSignature() and classStatus() recursively. 
>>>>>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>>>>> GetClassStatus() >>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>>>>> - As a result the jdwp backend reports the JVMTI >>>>>>>>>> error to the >>>>>>>>>> JDI, and so, the test fails >>>>>>>>>> >>>>>>>>>> For details, see the analysis in bug report closed as a >>>>>>>>>> dup of >>>>>>>>>> the bug 6988950: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>>>>> >>>>>>>>>> Some similar cases can be found in the two bug reports >>>>>>>>>> (6988950 >>>>>>>>>> and 8024865) describing this issue. >>>>>>>>>> >>>>>>>>>> The fix is to skip reporting the >>>>>>>>>> JVMTI_ERROR_WRONG_PHASE error >>>>>>>>>> as it is normal at the VM shutdown. >>>>>>>>>> The original jdwp backend implementation had a similar >>>>>>>>>> approach >>>>>>>>>> for the raw monitor functions. >>>>>>>>>> Threy use the ignore_vm_death() to workaround the >>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors. >>>>>>>>>> For reference, please, see the file: src/share/back/util.c >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi >>>>>>>>>> tests >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>> >>> >> > From hws689 at gmail.com Fri Nov 7 19:28:10 2014 From: hws689 at gmail.com (Hilton Wichwski Silva) Date: Fri, 7 Nov 2014 17:28:10 -0200 Subject: No subject Message-ID: From eric.mccorkle at oracle.com Fri Nov 7 19:40:16 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Fri, 07 Nov 2014 14:40:16 -0500 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545D108B.5070504@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> 
<54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> <545D108B.5070504@oracle.com> Message-ID: <545D2020.50801@oracle.com> Thanks, Are you a capital R? If not, wouldn't that mean I need two? On 11/07/14 13:33, Jiangli Zhou wrote: > Hi Eric, > > Looks okay. You also need a capital R reviewer for the change. > > Thanks, > Jiangli > > On 11/07/2014 05:07 AM, Eric McCorkle wrote: >> On 11/06/14 18:53, Jiangli Zhou wrote: >> >>> Could you please point to the updated webrev? I don't see the update in >>> http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html. >>> >> I made a mistake uploading it. It's here: >> http://cr.openjdk.java.net/~emc/8058313/webrev.02/ >> >>> Thanks, >>> Jiangli >>>>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>>>> Please review this issue so that it can go in along with 8058322. >>>>>> Thanks. >>>>>> >>>>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>>>> Thank you for the pointers. I have applied your changes and >>>>>>> refreshed >>>>>>> the webrev. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>>>> >>>>>>> Also, I have posted the test for this and another patch here: >>>>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>>>> >>>>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>>>> Hi Eric, >>>>>>>> >>>>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>>>> Hi Eric, >>>>>>>>>> >>>>>>>>>> I wonder if we could specialize this particular case and avoid >>>>>>>>>> changing >>>>>>>>>> the parsing code. How about setting the _has_method_parameters >>>>>>>>>> flag in >>>>>>>>>> the ConstMethod when encounter such MethodParameter, and changing >>>>>>>>>> JVM_GetMethodParameters() to return non-NULL value for such case >>>>>>>>>> when >>>>>>>>>> _has_method_parameters is true but method_parameters_length is 0. >>>>>>>>>> Would >>>>>>>>>> that work? 
>>>>>>>>> Which parser are you talking about? The inline tables parser, or the
>>>>>>>>> class file parser? The class file parser has to change, because it was
>>>>>>>>> previously ignoring MethodParameters attributes with parameter_count 0.
>>>>>>>> It's the class parsing changes that I was referring to, which mostly relate to
>>>>>>>> the initialization and checking against method_parameters_length. It's a
>>>>>>>> bit awkward to include the 0 case but also skip it in the loop. For
>>>>>>>> example, the following code in classFileParser.cpp changed ">" to ">="
>>>>>>>> in the if check, but that has no real effect and is not needed.
>>>>>>>>
>>>>>>>>   // Copy method parameters
>>>>>>>>   if (method_parameters_length >= 0) {
>>>>>>>>     MethodParametersElement* elem =
>>>>>>>>       m->constMethod()->method_parameters_start();
>>>>>>>>     for (int i = 0; i < method_parameters_length; i++) {
>>>>>>>>       elem[i].name_cp_index =
>>>>>>>>         Bytes::get_Java_u2(method_parameters_data);
>>>>>>>>       method_parameters_data += 2;
>>>>>>>>       elem[i].flags =
>>>>>>>>         Bytes::get_Java_u2(method_parameters_data);
>>>>>>>>       method_parameters_data += 2;
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>
>>>>>>>>> I don't think your proposal will work. The inline tables' offsets are
>>>>>>>>> all dependent on what inline tables are actually present. If
>>>>>>>>> _has_method_parameters is set, then the inline tables code expects the
>>>>>>>>> last u2 of the inline tables to be a u2 indicating the number of method
>>>>>>>>> parameters entries, preceded by the array of method parameters data.
>>>>>>>>> If _has_method_parameters is false, then it expects that there is no
>>>>>>>>> method parameters information at all (including no length field). If
>>>>>>>>> you were to set _has_method_parameters, but not store any information in
>>>>>>>>> the inline table, then it would cause errors for all the rest of the
>>>>>>>>> inline tables.
>>>>>>>> Thank you for reminding me of the complexity of the inlined table
>>>>>>>> calculation in the ConstMethod. My proposal would require tweaks in that
>>>>>>>> area to correctly compute the table sizes. As it's easy to introduce
>>>>>>>> bugs in that area, it's not worth changing the table calculation code
>>>>>>>> for this purpose. I agree my proposal is not a better choice in this
>>>>>>>> case.
>>>>>>>>
>>>>>>>>> What I do for the parameter_count = 0 case is just store
>>>>>>>>> a 0 u2 for zero-length method parameters information, and no data. All
>>>>>>>>> the existing inline tables code works fine with this case, so there
>>>>>>>>> aren't any serious changes to the inline tables code (other than
>>>>>>>>> allowing method parameters information to be stored when the array is
>>>>>>>>> length 0). But you have to make some change to the inline table code,
>>>>>>>>> otherwise the information won't be stored.
>>>>>>>> Ok. Could you please add comments to the change in constMethod.cpp to
>>>>>>>> explain the above?
>>>>>>>>
>>>>>>>> In jvm.cpp, since -1 now represents no method parameters, maybe check
>>>>>>>> against it explicitly and add comments for the 0-length case.
>>>>>>>>
>>>>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, jobject method))
>>>>>>>> {
>>>>>>>>   ...
>>>>>>>>   // No method parameter
>>>>>>>>   if (num_params == -1) {
>>>>>>>>     return (jobjectArray)NULL;
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   /* handle the rest here */
>>>>>>>>   // make sure all the symbols are properly formatted
>>>>>>>>   for (int i = 0; i < num_params; i++) {
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jiangli
>>>>>>>>>>
>>>>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Please review this fix for parameter reflection which addresses hotspot
>>>>>>>>>>> incorrectly ignoring zero-length MethodParameters attributes. The JVMS
>>>>>>>>>>> allows a MethodParameters attribute with parameter_count = 0, and the
>>>>>>>>>>> parameter reflection spec states that a MalformedParametersException
>>>>>>>>>>> should be thrown if parameter_count does not match the number of real
>>>>>>>>>>> parameters to a method. Hotspot currently ignores MethodParameters
>>>>>>>>>>> attributes with parameter_count = 0; however, in a case where a (bad)
>>>>>>>>>>> MethodParameters attribute has parameter_count = 0, but the method has a
>>>>>>>>>>> nonzero number of real parameters, hotspot will return null from
>>>>>>>>>>> JVM_GetMethodParameters, the result being that a
>>>>>>>>>>> MalformedParametersException is not thrown (rather, the reflection API
>>>>>>>>>>> acts like there is no MethodParameters attribute).
>>>>>>>>>>>
>>>>>>>>>>> This patch causes hotspot to record the fact that a zero-length
>>>>>>>>>>> MethodParameters attribute does exist, causing the exception to be
>>>>>>>>>>> thrown when it should be.
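The distinction discussed above, between "no MethodParameters attribute" and "an attribute with parameter_count = 0", can be illustrated with a small sketch. This is not HotSpot code. It assumes the convention mentioned in the thread (-1 means the attribute is absent), and the names `ParamsStatus` and `check_parameters` are hypothetical, standing in for the comparison the reflection side performs before deciding whether to throw MalformedParametersException.

```c
#include <assert.h>

/* Hypothetical result codes; the real reflection API throws
 * MalformedParametersException instead of returning a code. */
typedef enum {
    PARAMS_ABSENT,    /* no MethodParameters attribute at all */
    PARAMS_OK,        /* attribute present, count matches the descriptor */
    PARAMS_MALFORMED  /* attribute present, count does not match */
} ParamsStatus;

/* method_parameters_length: -1 means "no attribute" (assumed convention),
 * 0 or more is the parameter_count recorded from the class file.
 * real_param_count: number of parameters in the method descriptor. */
static ParamsStatus check_parameters(int method_parameters_length,
                                     int real_param_count) {
    assert(method_parameters_length >= -1);
    assert(real_param_count >= 0);
    if (method_parameters_length == -1) {
        return PARAMS_ABSENT;
    }
    if (method_parameters_length != real_param_count) {
        return PARAMS_MALFORMED;
    }
    return PARAMS_OK;
}
```

With a 0 stored for a zero-length attribute, a method that really takes two parameters is reported as malformed rather than being treated as if it had no attribute, which is the behavior the patch is after.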
>>>>>>>>>>>
>>>>>>>>>>> The bug is here:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313
>>>>>>>>>>>
>>>>>>>>>>> The webrev is here:
>>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/
>

From serguei.spitsyn at oracle.com Fri Nov 7 19:41:01 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 07 Nov 2014 11:41:01 -0800
Subject: 4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <545C9B37.5020205@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <5453F9F4.20309@oracle.com> <5454B258.1080104@oracle.com> <54570B68.3060806@oracle.com> <545729A3.7090301@oracle.com> <54596AC2.6050502@oracle.com> <5459FB96.9020404@oracle.com> <545BF5CD.6010008@oracle.com> <545C5620.7080300@oracle.com> <545C955B.3070209@oracle.com> <545C9819.5050806@oracle.com> <545C9B37.5020205@oracle.com>
Message-ID: <545D204D.3060805@oracle.com>

On 11/7/14 2:13 AM, serguei.spitsyn at oracle.com wrote:
> On 11/7/14 1:59 AM, David Holmes wrote:
>> On 7/11/2014 7:48 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>>
>>> On 11/6/14 9:18 PM, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> I think I get the gist of this approach but I'm not an expert on the
>>>> JVM TI or JDWP event model. My main concern would be how the delay to
>>>> the completion of cbVMDeath() might impact things - specifically if it
>>>> might be a lengthy delay?
>>>
>>> 1. At the beginning the VirtualMachine commands check if gdata->vmDead is true
>>>    and in such case just return with the JDWP_ERROR(VM_DEAD) error or quietly.
>>>    Normally, the cbVMDeath event callback needs to wait for just one command.
>>>
>>>    Please, see the VirtualMachine.c and the following comment in debugLoop_run():
>>>
>>>        } else if (gdata->vmDead &&
>>>                   ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
>>>            /* Protect the VM from calls while dead.
>>>             * VirtualMachine cmdSet quietly ignores some cmds
>>>             * after VM death, so, it sends it's own errors.
>>>             */
>>>            outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>>>        } else {
>>>
>>>
>>> 2. We do not have many choices.
>>>    Without a sync on command completion we will continue getting
>>>    WRONG_PHASE errors intermittently.
>>>    Another choice is to use the already reviewed ignore_wrong_phase workaround.
>>>    Note, the workaround does not work for all the commands.
>>>    I understand, we need to make sure nothing is broken if we choose
>>>    this approach. :)
>>>
>>> 3. What delay would you consider lengthy: 1 sec, 10 sec, 1 min.?
>>
>> Anything that causes something unexpected to happen :) I'm just
>> looking at the code and thinking what might go wrong. Really all we
>> can do is try this and see.
>
> 1 min sleep looks too big as it causes timeout failures of some tests.
> Launched the nsk.jdi and jtreg com/sun/jdi with 10 sec sleep.
> Will see the results tomorrow.

The nsk.jdi and the jtreg com/sun/jdi tests passed with the 10 sec sleep.

Thanks,
Serguei

>
> Thanks!
> Serguei
>
>>
>> Thanks,
>> David
>>
>>> For instance, I can add 10 sec sleep to provoke the command
>>> execution delay and see what can be broken.
>>> With 1 min sleep I see some timeouts in the jtreg com/sun/jdi tests
>>> though which is probably Ok.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi reviewers,
>>>>>
>>>>> I'm suggesting to review a modified fix:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/
>>>>>
>>>>>
>>>>> The 3-rd round fix is not right as it caused deadlocks in several tests
>>>>> (in nsk.jdi.testlist and jtreg com/sun/jdi).
>>>>> >>>>> Here is a deadlock example: >>>>> >>>>> ----------------- lwp# 2 / thread# 2 -------------------- >>>>> ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0) >>>>> ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788, >>>>> d4400, 0, ffffffff7e357440, 100138730) + 100 >>>>> ffffffff7dc3151c int Monitor::IWait(Thread*,long) >>>>> (ffffffff7e3c5b98, >>>>> 100137000, 0, 1004405d0, 6e750, 0) + a4 >>>>> ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0, >>>>> 100137000, 0, 0, 1, 20000000) + 358 >>>>> ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0, >>>>> 100137000, 1, deab, 60000000, 100137000) + c8 >>>>> ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>>>> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + >>>>> 258 >>>>> ffffffff7da2284c jvmtiError >>>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0, >>>>> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c >>>>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>>>> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c >>>>> ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, >>>>> ffffffffffefd118, >>>>> ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c >>>>> ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, >>>>> 1, 2) >>>>> + 10c >>>>> ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) + >>>>> 138 >>>>> ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88, >>>>> ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + >>>>> 360 >>>>> ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48, >>>>> ffffffff7ab3ad18, 1018, 1000) + 1d8 >>>>> ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, >>>>> 100137000, >>>>> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534 >>>>> ffffffff7d7bb104 void before_exit(JavaThread*) (100137000, >>>>> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440, >>>>> ffffffff7e3e6b70) + 30c >>>>> ffffffff7de72128 
bool Threads::destroy_vm() (100137000, 100110a40, >>>>> ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100 >>>>> ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, >>>>> ffffffff7e3e8b30, >>>>> ffffffff7e357440, 0, 10013700) + 1bc >>>>> ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790, >>>>> 10035de68, 0, ffffffff7e4143b0) + 860 >>>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>>> >>>>> ----------------- lwp# 12 / thread# 12 -------------------- >>>>> ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0) >>>>> ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788, >>>>> d4400, 0, ffffffff7e357440, 100349930) + 100 >>>>> ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) >>>>> (10034a070, >>>>> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258 >>>>> ffffffff7da22450 jvmtiError >>>>> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00, >>>>> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0 >>>>> ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00, >>>>> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38 >>>>> ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000, >>>>> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c >>>>> ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000, >>>>> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0 >>>>> ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000, >>>>> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54 >>>>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>>>> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) + >>>>> 128 >>>>> ffffffff7de6a678 void JavaThread::thread_main_inner() >>>>> (100348800, 3d8, >>>>> 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90 >>>>> ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c, >>>>> fffffffea5f3e048, 3d8, 1003497f8) + 3ac >>>>> ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904, >>>>> ffffffff7e16ff31, ffffffff7e357440, 
4797) + 2e4 >>>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>>> >>>>> ----------------- lwp# 13 / thread# 13 -------------------- >>>>> ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0) >>>>> ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788, >>>>> d4400, 0, ffffffff7e357440, 10034d330) + 100 >>>>> ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long) >>>>> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100 >>>>> ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*) >>>>> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, >>>>> 10034c000) >>>>> + e0 >>>>> ffffffff7da2284c jvmtiError >>>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0, >>>>> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c >>>>> ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00, >>>>> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c >>>>> ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18, >>>>> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48 >>>>> ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, >>>>> ffffffff7ab3ad18, >>>>> 0, 10000000) + ac >>>>> ffffffff7da56b18 void JvmtiAgentThread::call_start_function() >>>>> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) + >>>>> 128 >>>>> ffffffff7de6a678 void JavaThread::thread_main_inner() >>>>> (10034c000, 3d8, >>>>> 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90 >>>>> ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d, >>>>> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac >>>>> ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904, >>>>> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4 >>>>> ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0) >>>>> >>>>> >>>>> The details: >>>>> - Thread #2: The cbVMDeath() event handler is waiting on the >>>>> commandCompleteLock in the enqueueCommand(). 
>>>>>   The call chain is:
>>>>>     cbVMDeath() -> event_callback() -> reportEvents() ->
>>>>>     eventHelper_reportEvents() -> enqueueCommand().
>>>>>   The enqueueCommand() depends on the commandLoop(), which has to call
>>>>>   completeCommand(command) for the command being enqueued.
>>>>>   This has not been set yet: gdata->vmDead = JNI_TRUE
>>>>>
>>>>> - Thread #12: The debugLoop_run() is blocked entering the vmDeathLock.
>>>>>
>>>>> - Thread #13: The commandLoop() is waiting on the blockCommandLoopLock
>>>>>   in the doBlockCommandLoop().
>>>>>   This is because blockCommandLoop == JNI_TRUE, which is set in
>>>>>   needBlockCommandLoop() if the following condition is true:
>>>>>     (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE &&
>>>>>      cmd->u.reportEventComposite.suspendPolicy == JDWP_SUSPEND_POLICY(ALL))
>>>>>
>>>>>
>>>>> It seems that the debugLoop_run() blocking on the vmDeathLock causes the
>>>>> commandLoop() to wait indefinitely.
>>>>> The cbVMDeath() cannot proceed because the commandLoop() does not make
>>>>> progress.
>>>>>
>>>>> The vmDeathLock critical section in the cbVMDeath() event callback seems
>>>>> to be overkill (unnecessary).
>>>>> A less intrusive synchronization is required here: wait until
>>>>> the current command is completed
>>>>> before returning to JvmtiExport::post_vm_death().
>>>>>
>>>>> The new approach (see the new webrev) is to extend the resumeLock
>>>>> synchronization pattern
>>>>> to the whole VirtualMachine command set, not only the resume command.
>>>>> The resumeLock is renamed to vmDeathLock to reflect the new semantics.
>>>>>
>>>>> In general, we could consider doing the same for the rest of the JDWP
>>>>> command sets.
>>>>> But it is better to be careful and see how this change goes first.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thank you for the concerns!
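The idea from the 4-th round summary, that the VM death callback should only wait for the command already in flight instead of holding a lock across command execution, can be sketched roughly as follows. This is one possible reading, not the agent's code: the real agent uses JVMTI raw monitors and the gdata structure, while this sketch uses POSIX threads, assumes a single command thread, and the names `DeathGate`, `gate_begin_command`, `gate_end_command` and `gate_vm_death` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; stands in for the agent's vmDead flag and lock. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  idle;        /* signalled when no command is running */
    bool            vm_dead;
    bool            in_command;  /* single command thread assumed */
} DeathGate;

static void gate_init(DeathGate *g) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->vm_dead = false;
    g->in_command = false;
}

/* Command side: returns false if the VM is already dead, in which case
 * the caller should reply with a VM_DEAD error instead of running. */
static bool gate_begin_command(DeathGate *g) {
    pthread_mutex_lock(&g->lock);
    bool ok = !g->vm_dead;
    if (ok) {
        g->in_command = true;
    }
    pthread_mutex_unlock(&g->lock);
    return ok;
}

/* Command side: marks the command finished and wakes a waiting callback. */
static void gate_end_command(DeathGate *g) {
    pthread_mutex_lock(&g->lock);
    assert(g->in_command);
    g->in_command = false;
    pthread_cond_broadcast(&g->idle);
    pthread_mutex_unlock(&g->lock);
}

/* VM death side: marks the VM dead, then waits only for the command that
 * is already running. The lock is never held while a command executes. */
static void gate_vm_death(DeathGate *g) {
    pthread_mutex_lock(&g->lock);
    g->vm_dead = true;
    while (g->in_command) {
        pthread_cond_wait(&g->idle, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}
```

Because the mutex is released before the command body runs, a command that itself has to wait on another thread does not keep the death callback from setting the flag; the callback blocks only until `gate_end_command()` is called.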
>>>>>> Testing showed several tests failing with deadlocks. >>>>>> Scenarios are similar to that you describe. >>>>>> >>>>>> Trying to understand the details. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> On 11/4/14 4:09 PM, David Holmes wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> On 11/2/14 8:58 PM, David Holmes wrote: >>>>>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote: >>>>>>>>>> Serguei, >>>>>>>>>> >>>>>>>>>> Thank you for good finding. This approach looks much better >>>>>>>>>> for me. >>>>>>>>>> >>>>>>>>>> The fix looks good. >>>>>>>>>> >>>>>>>>>> Is it necessary to release vmDeathLock locks at >>>>>>>>>> eventHandler.c:1244 before call >>>>>>>>>> >>>>>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ? >>>>>>>>> >>>>>>>>> I agree this looks necessary, or at least more clean (if >>>>>>>>> things are >>>>>>>>> failing we really don't know what is happening). >>>>>>>> >>>>>>>> Agreed (replied to Dmitry). >>>>>>>> >>>>>>>>> >>>>>>>>> More generally I'm concerned about whether any of the code paths >>>>>>>>> taken >>>>>>>>> while holding the new lock can result in deadlock - in particular >>>>>>>>> with >>>>>>>>> regard to the resumeLock ? >>>>>>>> >>>>>>>> The cbVMDeath() function never holds both vmDeathLock and >>>>>>>> resumeLock at >>>>>>>> the same time, >>>>>>>> so there is no chance for a deadlock that involves both these >>>>>>>> locks. >>>>>>>> >>>>>>>> Two more locks used in the cbVMDeath() are the callbackBlock and >>>>>>>> callbackLock. >>>>>>>> These two locks look completely unrelated to the debugLoop_run(). >>>>>>>> >>>>>>>> The debugLoop_run() function also uses the cmdQueueLock. >>>>>>>> The debugLoop_run() never holds both vmDeathLock and >>>>>>>> cmdQueueLock at >>>>>>>> the >>>>>>>> same time. >>>>>>>> >>>>>>>> So that I do not see any potential to introduce new deadlock >>>>>>>> with the >>>>>>>> vmDeathLock. 
>>>>>>>> >>>>>>>> However, it is still easy to overlook something here. >>>>>>>> Please, let me know if you see any danger. >>>>>>> >>>>>>> I was mainly concerned about what might happen in the call chain >>>>>>> for >>>>>>> threadControl_resumeAll() (it certainly sounds like it might >>>>>>> need to >>>>>>> use a resumeLock :) ). I see direct use of the threadLock and >>>>>>> indirectly the eventHandler lock; but there are further call >>>>>>> paths I >>>>>>> did not explore. Wish there was an easy way to determine the >>>>>>> transitive closure of all locks used from a given call. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>>> -Dmitry >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> >>>>>>>>>>> It is 3-rd round of review for: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>>> >>>>>>>>>>> New webrev: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Summary >>>>>>>>>>> >>>>>>>>>>> For failing scenario, please, refer to the 1-st round RFR >>>>>>>>>>> below. >>>>>>>>>>> >>>>>>>>>>> I've found what is missed in the jdwp agent shutdown and >>>>>>>>>>> decided to >>>>>>>>>>> switch from a workaround to a real fix. >>>>>>>>>>> >>>>>>>>>>> The agent VM_DEATH callback sets the gdata field: >>>>>>>>>>> gdata->vmDead = 1. >>>>>>>>>>> The agent debugLoop_run() has a guard against the VM >>>>>>>>>>> shutdown: >>>>>>>>>>> >>>>>>>>>>> 165 } else if (gdata->vmDead && >>>>>>>>>>> 166 ((cmd->cmdSet) != >>>>>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) { >>>>>>>>>>> 167 /* Protect the VM from calls while dead. >>>>>>>>>>> 168 * VirtualMachine cmdSet quietly ignores >>>>>>>>>>> some >>>>>>>>>>> cmds >>>>>>>>>>> 169 * after VM death, so, it sends it's own >>>>>>>>>>> errors. 
>>>>>>>>>>> 170 */ >>>>>>>>>>> 171 outStream_setError(&out, >>>>>>>>>>> JDWP_ERROR(VM_DEAD)); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> However, the guard above does not help much if the VM_DEATH >>>>>>>>>>> event >>>>>>>>>>> happens in the middle of a command execution. >>>>>>>>>>> There is a lack of synchronization here. >>>>>>>>>>> >>>>>>>>>>> The fix introduces new lock (vmDeathLock) which does not >>>>>>>>>>> allow to >>>>>>>>>>> execute the commands >>>>>>>>>>> and the VM_DEATH event callback concurrently. >>>>>>>>>>> It should work well for any function that is used in >>>>>>>>>>> implementation of >>>>>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) . >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG >>>>>>>>>>> com/sun/jdi >>>>>>>>>>> tests >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> The updated webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The changes are: >>>>>>>>>>>> - added a comment recommended by Staffan >>>>>>>>>>>> - removed the ignore_wrong_phase() call from function >>>>>>>>>>>> classSignature() >>>>>>>>>>>> >>>>>>>>>>>> The classSignature() function is called in 16 places. >>>>>>>>>>>> Most of them do not tolerate the NULL in place of returned >>>>>>>>>>>> signature >>>>>>>>>>>> and will crash. >>>>>>>>>>>> I'm not comfortable to fix all the occurrences now and >>>>>>>>>>>> suggest to >>>>>>>>>>>> return to this >>>>>>>>>>>> issue after gaining experience with more failure cases that >>>>>>>>>>>> are >>>>>>>>>>>> still >>>>>>>>>>>> expected. 
>>>>>>>>>>>> The failure with the classSignature() involved was observed >>>>>>>>>>>> only >>>>>>>>>>>> once >>>>>>>>>>>> in the nightly >>>>>>>>>>>> and should be extremely rare reproducible. >>>>>>>>>>>> I'll file a placeholder bug if necessary. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Please, review the fix for: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Open webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Summary: >>>>>>>>>>>>> >>>>>>>>>>>>> The failing scenario: >>>>>>>>>>>>> The debugger and the debuggee are well aware a VM >>>>>>>>>>>>> shutdown has >>>>>>>>>>>>> been started in the target process. >>>>>>>>>>>>> The debugger at this point is not expected to send any >>>>>>>>>>>>> commands >>>>>>>>>>>>> to the JDWP agent. >>>>>>>>>>>>> However, the JDI layer (debugger side) and the jdwp >>>>>>>>>>>>> agent >>>>>>>>>>>>> (debuggee side) >>>>>>>>>>>>> are not in sync with the consumer layers. >>>>>>>>>>>>> >>>>>>>>>>>>> One reason is because the test debugger does not invoke >>>>>>>>>>>>> the JDI >>>>>>>>>>>>> method VirtualMachine.dispose(). >>>>>>>>>>>>> Another reason is that the Debugger and the debuggee >>>>>>>>>>>>> processes >>>>>>>>>>>>> are uneasy to sync in general. 
>>>>>>>>>>>>> >>>>>>>>>>>>> As a result the following steps are possible: >>>>>>>>>>>>> - The test debugger sends a 'quit' command to the >>>>>>>>>>>>> test >>>>>>>>>>>>> debuggee >>>>>>>>>>>>> - The debuggee is normally exiting >>>>>>>>>>>>> - The jdwp backend reports (over the jdwp >>>>>>>>>>>>> protocol) an >>>>>>>>>>>>> anonymous class unload event >>>>>>>>>>>>> - The JDI InternalEventHandler thread handles the >>>>>>>>>>>>> ClassUnloadEvent event >>>>>>>>>>>>> - The InternalEventHandler wants to uncache the >>>>>>>>>>>>> matching >>>>>>>>>>>>> reference type. >>>>>>>>>>>>> If there is more than one class with the same host >>>>>>>>>>>>> class >>>>>>>>>>>>> signature, it can't distinguish them, >>>>>>>>>>>>> and so, deletes all references and re-retrieves >>>>>>>>>>>>> them >>>>>>>>>>>>> again >>>>>>>>>>>>> (see tracing below): >>>>>>>>>>>>> MY_TRACE: JDI: >>>>>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature: >>>>>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>>>>>>>>>>> - The jdwp backend debugLoop_run() gets the command >>>>>>>>>>>>> from JDI >>>>>>>>>>>>> and calls the functions >>>>>>>>>>>>> classesForSignature() and classStatus() >>>>>>>>>>>>> recursively. >>>>>>>>>>>>> - The classStatus() makes a call to the JVMTI >>>>>>>>>>>>> GetClassStatus() >>>>>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>>>>>>>>>>> - As a result the jdwp backend reports the JVMTI >>>>>>>>>>>>> error >>>>>>>>>>>>> to the >>>>>>>>>>>>> JDI, and so, the test fails >>>>>>>>>>>>> >>>>>>>>>>>>> For details, see the analysis in bug report closed as a >>>>>>>>>>>>> dup of >>>>>>>>>>>>> the bug 6988950: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>>>>>>>>>>> >>>>>>>>>>>>> Some similar cases can be found in the two bug reports >>>>>>>>>>>>> (6988950 >>>>>>>>>>>>> and 8024865) describing this issue. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error
>>>>>>>>>>>>> as it is normal at the VM shutdown.
>>>>>>>>>>>>> The original jdwp backend implementation had a similar approach
>>>>>>>>>>>>> for the raw monitor functions.
>>>>>>>>>>>>> They use the ignore_vm_death() to work around the
>>>>>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors.
>>>>>>>>>>>>> For reference, please, see the file: src/share/back/util.c
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing:
>>>>>>>>>>>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>

From jiangli.zhou at oracle.com Fri Nov 7 19:49:36 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 07 Nov 2014 11:49:36 -0800
Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException
In-Reply-To: <545D2020.50801@oracle.com>
References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> <545D108B.5070504@oracle.com> <545D2020.50801@oracle.com>
Message-ID: <545D2250.6070607@oracle.com>

Eric,

I'm not a "R"eviewer yet. You need at least two reviewers, and at least
one of them should be a "R"eviewer.

Thanks,
Jiangli

On 11/07/2014 11:40 AM, Eric McCorkle wrote:
> Thanks,
>
> Are you a capital R? If not, wouldn't that mean I need two?
>
> On 11/07/14 13:33, Jiangli Zhou wrote:
>> Hi Eric,
>>
>> Looks okay. You also need a capital R reviewer for the change.
>> >> Thanks, >> Jiangli >> >> On 11/07/2014 05:07 AM, Eric McCorkle wrote: >>> On 11/06/14 18:53, Jiangli Zhou wrote: >>> >>>> Could you please point to the updated webrev? I don't see the update in >>>> http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html. >>>> >>> I made a mistake uploading it. It's here: >>> http://cr.openjdk.java.net/~emc/8058313/webrev.02/ >>> >>>> Thanks, >>>> Jiangli >>>>>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>>>>> Please review this issue so that it can go in along with 8058322. >>>>>>> Thanks. >>>>>>> >>>>>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>>>>> Thank you for the pointers. I have applied your changes and >>>>>>>> refreshed >>>>>>>> the webrev. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>>>>> >>>>>>>> Also, I have posted the test for this and another patch here: >>>>>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>>>>> >>>>>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>>>>> Hi Eric, >>>>>>>>> >>>>>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>>>>> Hi Eric, >>>>>>>>>>> >>>>>>>>>>> I wonder if we could specialize this particular case and avoid >>>>>>>>>>> changing >>>>>>>>>>> the parsing code. How about setting the _has_method_parameters >>>>>>>>>>> flag in >>>>>>>>>>> the ConstMethod when encounter such MethodParameter, and changing >>>>>>>>>>> JVM_GetMethodParameters() to return non-NULL value for such case >>>>>>>>>>> when >>>>>>>>>>> _has_method_parameters is true but method_parameters_length is 0. >>>>>>>>>>> Would >>>>>>>>>>> that work? >>>>>>>>>> Which parser are you talking about? The inline tables parser, or >>>>>>>>>> the >>>>>>>>>> class file parser. The class file parser has to change, >>>>>>>>>> because it >>>>>>>>>> was >>>>>>>>>> previously ignoring MethodParameters attributes with >>>>>>>>>> parameter_count 0. 
>>>>>>>>> It's the class parsing changes that I was referring to, mostly >>>>>>>>> relate to >>>>>>>>> the initialization and checking against method_parameters_length. >>>>>>>>> It's a >>>>>>>>> bit awkward to include the 0 case but also skipping it in the >>>>>>>>> loop. For >>>>>>>>> example, the following code in classFileParser.cpp changed ">" to >>>>>>>>> ">=" >>>>>>>>> in the if check, but has no real effect and is not need. >>>>>>>>> >>>>>>>>> 2486 // Copy method parameters >>>>>>>>> 2487 if (method_parameters_length >= 0) { >>>>>>>>> 2488 MethodParametersElement* elem = >>>>>>>>> m->constMethod()->method_parameters_start(); >>>>>>>>> 2489 for (int i = 0; i < method_parameters_length; i++) { >>>>>>>>> 2490 elem[i].name_cp_index = >>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>> 2491 method_parameters_data += 2; >>>>>>>>> 2492 elem[i].flags = >>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>> 2493 method_parameters_data += 2; >>>>>>>>> 2494 } >>>>>>>>> 2495 } >>>>>>>>> >>>>>>>>> >>>>>>>>>> I don't think your proposal will work. The inline tables' >>>>>>>>>> offsets are >>>>>>>>>> all dependent on what inline tables are actually present. If >>>>>>>>>> _has_method_parameters is set, then the inline tables code >>>>>>>>>> expects the >>>>>>>>>> last u2 of the inline tables to be a u2 indicating the number of >>>>>>>>>> method >>>>>>>>>> parameters entries, preceeded by the array of method parameters >>>>>>>>>> data. >>>>>>>>>> If _has_method_parameters is false, then it expects that there >>>>>>>>>> is no >>>>>>>>>> method parameters information at all (including no length >>>>>>>>>> field). If >>>>>>>>>> you were to set _has_method_parameters, but not store any >>>>>>>>>> information in >>>>>>>>>> the inline table, then it would cause errors for all the rest >>>>>>>>>> of the >>>>>>>>>> inline tables. >>>>>>>>> Thank you for reminding me of the complexity of the inlined table >>>>>>>>> calculation in the ConstMethod. 
My proposal would require tweaks in >>>>>>>>> that >>>>>>>>> area to correctly compute the table sizes. As it's easy to >>>>>>>>> introduce >>>>>>>>> bugs in that area, it's not worth to change the table calculation >>>>>>>>> code >>>>>>>>> for this purpose. I agree my proposal is not a better choice in >>>>>>>>> this >>>>>>>>> case. >>>>>>>>> >>>>>>>>>> What I do for the parameter_count = 0 case is just store >>>>>>>>>> a 0 u2 for zero-length method parameters information, and no data. >>>>>>>>>> All >>>>>>>>>> the existing inline tables code works fine with this case, so >>>>>>>>>> there >>>>>>>>>> aren't any serious changes to the inline tables code (other than >>>>>>>>>> allowing method parameters information to be stored when the >>>>>>>>>> array is >>>>>>>>>> length 0). But you have to make some change to the inline table >>>>>>>>>> code, >>>>>>>>>> otherwise the information won't be stored. >>>>>>>>> Ok. Could you please add comments to the change in >>>>>>>>> constMethod.cpp to >>>>>>>>> explain above? >>>>>>>>> >>>>>>>>> In jvm.cpp, since -1 represents no method parameter now. Maybe >>>>>>>>> checking >>>>>>>>> against explicity and add comments for the 0-length case. >>>>>>>>> >>>>>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, >>>>>>>>> jobject >>>>>>>>> method)) >>>>>>>>> { >>>>>>>>> ... >>>>>>>>> // No method parameter >>>>>>>>> if (num_params == -1) { >>>>>>>>> return (jobjectArray)NULL; >>>>>>>>> } >>>>>>>>> >>>>>>>>> /* handle the rest here */ >>>>>>>>> // make sure all the symbols are properly formatted >>>>>>>>> for (int i = 0; i < num_params; i++) { >>>>>>>>> ... 
>>>>>>>>> } >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Jiangli >>>>>>>>>>> >>>>>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> Please review this fix for parameter reflection which addresses >>>>>>>>>>>> hotspot >>>>>>>>>>>> falsely ignoring zero-length MethodParameter attributes. The >>>>>>>>>>>> JVMS >>>>>>>>>>>> allows a MethodParameters attribute with parameter_count = 0, >>>>>>>>>>>> and >>>>>>>>>>>> the >>>>>>>>>>>> parameter reflection spec states that a >>>>>>>>>>>> MalformedParametersException >>>>>>>>>>>> should be thrown if parameter_count does not match the number of >>>>>>>>>>>> real >>>>>>>>>>>> parameters to a method. Hotspot currently ignores >>>>>>>>>>>> MethodParameters >>>>>>>>>>>> attributes with parameter_count = 0; however, in a case where a >>>>>>>>>>>> (bad) >>>>>>>>>>>> MethodParameters attribute has parameter_count = 0, but the >>>>>>>>>>>> method >>>>>>>>>>>> has a >>>>>>>>>>>> nonzero number of real parameters, hotspot will return null from >>>>>>>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>>>>>>> reflection API >>>>>>>>>>>> acts like there is no MethodParameters attribute). >>>>>>>>>>>> >>>>>>>>>>>> This patch causes hotspot to record the fact that a zero-length >>>>>>>>>>>> MethodParameters attribute does exist, causing the exception >>>>>>>>>>>> to be >>>>>>>>>>>> thrown when it should be. 
>>>>>>>>>>>>
>>>>>>>>>>>> The bug is here:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313
>>>>>>>>>>>>
>>>>>>>>>>>> The webrev is here:
>>>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/

From eric.mccorkle at oracle.com Fri Nov 7 20:03:24 2014
From: eric.mccorkle at oracle.com (Eric McCorkle)
Date: Fri, 07 Nov 2014 15:03:24 -0500
Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException
In-Reply-To: <545D2250.6070607@oracle.com>
References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> <545D108B.5070504@oracle.com> <545D2020.50801@oracle.com> <545D2250.6070607@oracle.com>
Message-ID: <545D258C.4040106@oracle.com>

Ah, ok, thanks.

In that case, could a capital-R reviewer look at this, please?

Thanks,
Eric

On 11/07/14 14:49, Jiangli Zhou wrote:
> Eric,
>
> I'm not a "R"eviewer yet. You need at least two reviewers, and at least
> one of them should be a "R"eviewer.
>
> Thanks,
> Jiangli
>
> On 11/07/2014 11:40 AM, Eric McCorkle wrote:
>> Thanks,
>>
>> Are you a capital R? If not, wouldn't that mean I need two?
>>
>> On 11/07/14 13:33, Jiangli Zhou wrote:
>>> Hi Eric,
>>>
>>> Looks okay. You also need a capital R reviewer for the change.
>>>
>>> Thanks,
>>> Jiangli
>>>
>>> On 11/07/2014 05:07 AM, Eric McCorkle wrote:
>>>> On 11/06/14 18:53, Jiangli Zhou wrote:
>>>>
>>>>> Could you please point to the updated webrev? I don't see the update in
>>>>> http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html.
>>>>>
>>>> I made a mistake uploading it.
It's here: >>>> http://cr.openjdk.java.net/~emc/8058313/webrev.02/ >>>> >>>>> Thanks, >>>>> Jiangli >>>>>>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>>>>>> Please review this issue so that it can go in along with 8058322. >>>>>>>> Thanks. >>>>>>>> >>>>>>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>>>>>> Thank you for the pointers. I have applied your changes and >>>>>>>>> refreshed >>>>>>>>> the webrev. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>>>>>> >>>>>>>>> Also, I have posted the test for this and another patch here: >>>>>>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>>>>>> >>>>>>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>>>>>> Hi Eric, >>>>>>>>>> >>>>>>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>>>>>> Hi Eric, >>>>>>>>>>>> >>>>>>>>>>>> I wonder if we could specialize this particular case and avoid >>>>>>>>>>>> changing >>>>>>>>>>>> the parsing code. How about setting the _has_method_parameters >>>>>>>>>>>> flag in >>>>>>>>>>>> the ConstMethod when encounter such MethodParameter, and >>>>>>>>>>>> changing >>>>>>>>>>>> JVM_GetMethodParameters() to return non-NULL value for such >>>>>>>>>>>> case >>>>>>>>>>>> when >>>>>>>>>>>> _has_method_parameters is true but method_parameters_length >>>>>>>>>>>> is 0. >>>>>>>>>>>> Would >>>>>>>>>>>> that work? >>>>>>>>>>> Which parser are you talking about? The inline tables >>>>>>>>>>> parser, or >>>>>>>>>>> the >>>>>>>>>>> class file parser. The class file parser has to change, >>>>>>>>>>> because it >>>>>>>>>>> was >>>>>>>>>>> previously ignoring MethodParameters attributes with >>>>>>>>>>> parameter_count 0. >>>>>>>>>> It's the class parsing changes that I was referring to, mostly >>>>>>>>>> relate to >>>>>>>>>> the initialization and checking against method_parameters_length. >>>>>>>>>> It's a >>>>>>>>>> bit awkward to include the 0 case but also skipping it in the >>>>>>>>>> loop. 
For >>>>>>>>>> example, the following code in classFileParser.cpp changed ">" to >>>>>>>>>> ">=" >>>>>>>>>> in the if check, but has no real effect and is not needed. >>>>>>>>>> >>>>>>>>>> // Copy method parameters >>>>>>>>>> if (method_parameters_length >= 0) { >>>>>>>>>> MethodParametersElement* elem = >>>>>>>>>> m->constMethod()->method_parameters_start(); >>>>>>>>>> for (int i = 0; i < method_parameters_length; i++) { >>>>>>>>>> elem[i].name_cp_index = >>>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>>> method_parameters_data += 2; >>>>>>>>>> elem[i].flags = >>>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>>> method_parameters_data += 2; >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> I don't think your proposal will work. The inline tables' >>>>>>>>>>> offsets are >>>>>>>>>>> all dependent on what inline tables are actually present. If >>>>>>>>>>> _has_method_parameters is set, then the inline tables code >>>>>>>>>>> expects the >>>>>>>>>>> last u2 of the inline tables to be a u2 indicating the number of >>>>>>>>>>> method >>>>>>>>>>> parameters entries, preceded by the array of method parameters >>>>>>>>>>> data. >>>>>>>>>>> If _has_method_parameters is false, then it expects that there >>>>>>>>>>> is no >>>>>>>>>>> method parameters information at all (including no length >>>>>>>>>>> field). If >>>>>>>>>>> you were to set _has_method_parameters, but not store any >>>>>>>>>>> information in >>>>>>>>>>> the inline table, then it would cause errors for all the rest >>>>>>>>>>> of the >>>>>>>>>>> inline tables. >>>>>>>>>> Thank you for reminding me of the complexity of the inlined table >>>>>>>>>> calculation in the ConstMethod. My proposal would require >>>>>>>>>> tweaks in >>>>>>>>>> that >>>>>>>>>> area to correctly compute the table sizes.
As it's easy to >>>>>>>>>> introduce >>>>>>>>>> bugs in that area, it's not worth changing the table calculation >>>>>>>>>> code >>>>>>>>>> for this purpose. I agree my proposal is not a better choice in >>>>>>>>>> this >>>>>>>>>> case. >>>>>>>>>> >>>>>>>>>>> What I do for the parameter_count = 0 case is just store >>>>>>>>>>> a 0 u2 for zero-length method parameters information, and no >>>>>>>>>>> data. >>>>>>>>>>> All >>>>>>>>>>> the existing inline tables code works fine with this case, so >>>>>>>>>>> there >>>>>>>>>>> aren't any serious changes to the inline tables code (other than >>>>>>>>>>> allowing method parameters information to be stored when the >>>>>>>>>>> array is >>>>>>>>>>> length 0). But you have to make some change to the inline table >>>>>>>>>>> code, >>>>>>>>>>> otherwise the information won't be stored. >>>>>>>>>> Ok. Could you please add comments to the change in >>>>>>>>>> constMethod.cpp to >>>>>>>>>> explain the above? >>>>>>>>>> >>>>>>>>>> In jvm.cpp, since -1 now represents no method parameters, maybe >>>>>>>>>> check >>>>>>>>>> against it explicitly and add comments for the 0-length case. >>>>>>>>>> >>>>>>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, >>>>>>>>>> jobject >>>>>>>>>> method)) >>>>>>>>>> { >>>>>>>>>> ... >>>>>>>>>> // No method parameter >>>>>>>>>> if (num_params == -1) { >>>>>>>>>> return (jobjectArray)NULL; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> /* handle the rest here */ >>>>>>>>>> // make sure all the symbols are properly formatted >>>>>>>>>> for (int i = 0; i < num_params; i++) { >>>>>>>>>> ... >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jiangli >>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Jiangli >>>>>>>>>>>> >>>>>>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>>>>>>>> Hello, >>>>>>>>>>>>> >>>>>>>>>>>>> Please review this fix for parameter reflection which >>>>>>>>>>>>> addresses >>>>>>>>>>>>> hotspot >>>>>>>>>>>>> falsely ignoring zero-length MethodParameter attributes.
The >>>>>>>>>>>>> JVMS >>>>>>>>>>>>> allows a MethodParameters attribute with parameter_count = 0, >>>>>>>>>>>>> and >>>>>>>>>>>>> the >>>>>>>>>>>>> parameter reflection spec states that a >>>>>>>>>>>>> MalformedParametersException >>>>>>>>>>>>> should be thrown if parameter_count does not match the >>>>>>>>>>>>> number of >>>>>>>>>>>>> real >>>>>>>>>>>>> parameters to a method. Hotspot currently ignores >>>>>>>>>>>>> MethodParameters >>>>>>>>>>>>> attributes with parameter_count = 0; however, in a case >>>>>>>>>>>>> where a >>>>>>>>>>>>> (bad) >>>>>>>>>>>>> MethodParameters attribute has parameter_count = 0, but the >>>>>>>>>>>>> method >>>>>>>>>>>>> has a >>>>>>>>>>>>> nonzero number of real parameters, hotspot will return null >>>>>>>>>>>>> from >>>>>>>>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>>>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>>>>>>>> reflection API >>>>>>>>>>>>> acts like there is no MethodParameters attribute). >>>>>>>>>>>>> >>>>>>>>>>>>> This patch causes hotspot to record the fact that a >>>>>>>>>>>>> zero-length >>>>>>>>>>>>> MethodParameters attribute does exist, causing the exception >>>>>>>>>>>>> to be >>>>>>>>>>>>> thrown when it should be. 
>>>>>>>>>>>>> >>>>>>>>>>>>> The bug is here: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>>>>>>>>> >>>>>>>>>>>>> The webrev is here: >>>>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ > From coleen.phillimore at oracle.com Fri Nov 7 20:26:59 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 07 Nov 2014 15:26:59 -0500 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545D258C.4040106@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> <545D108B.5070504@oracle.com> <545D2020.50801@oracle.com> <545D2250.6070607@oracle.com> <545D258C.4040106@oracle.com> Message-ID: <545D2B13.2000306@oracle.com> Eric, I reviewed this also. This is a bit confusing storing a 0 for method_parameters_length=0 in ConstMethod but I think it's fine with the comments. I have bad news though, you may need to make serviceability agent changes. You can tell if you run nsk.sajdi.testlist in ute. Coleen On 11/7/14, 3:03 PM, Eric McCorkle wrote: > Ah, ok, thanks. > > In that case, I need a capital-R reviewer to look at this, please? > > Thanks, > Eric > > On 11/07/14 14:49, Jiangli Zhou wrote: >> Eric, >> >> I'm not a "R"eviewer yet. You need at least two reviewers, within them >> at least one should be "R"eviewer. >> >> Thanks, >> Jiangli >> >> On 11/07/2014 11:40 AM, Eric McCorkle wrote: >>> Thanks, >>> >>> Are you a capital R? If not, wouldn't that mean I need two? >>> >>> On 11/07/14 13:33, Jiangli Zhou wrote: >>>> Hi Eric, >>>> >>>> Looks okay. You also need a capital R reviewer for the change. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> The bug is here: >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>>>>>>>>>> >>>>>>>>>>>>>> The webrev is here: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ From vladimir.kozlov at oracle.com Fri Nov 7 21:12:14 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 07 Nov 2014 13:12:14 -0800 Subject: [8u40] backport RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining is disabled: assert(dmw->is_neutral()) failed: invariant Message-ID: <545D35AE.6090403@oracle.com> 8u40 backport request. Changes were pushed into jdk9 yesterday, no problems were found since then. Changes are applied to 8u cleanly. https://bugs.openjdk.java.net/browse/JDK-8062950 http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/ http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/ef2e397e8b7b Thanks, Vladimir From david.lloyd at redhat.com Fri Nov 7 18:53:58 2014 From: david.lloyd at redhat.com (David M. Lloyd) Date: Fri, 07 Nov 2014 12:53:58 -0600 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <545D0D33.70400@redhat.com> References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com> <545D0D33.70400@redhat.com> Message-ID: <545D1546.9050902@redhat.com> On 11/07/2014 12:19 PM, Andrew Haley wrote: > On 11/07/2014 06:10 PM, Christian Thalinger wrote: >> >>> On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote: >>> >>> On 11/07/2014 05:42 PM, Christian Thalinger wrote: >>>> >>>>> On Nov 7, 2014, at 9:21 AM, Andrew Haley wrote: >>>>> >>>>> The first patch: top-level build machinery changes. >>>>> >>>>> http://cr.openjdk.java.net/~aph/8064357-rev-1/ >>>> >>>> common/autoconf/flags.m4 >>>> >>>> + aarch64) >>>> + ZERO_ARCHFLAG="" >>>> + ;; >>>> >>>> Why is this required on aarch64 but not all the other architectures? >>> >>> I think it's because GCC rejects "-m64". >> >> That's interesting. I thought -m is some kind of common
I thought -m is some kind of common >> flag that works on all architectures. > > No, all the "-m" stuff is target-dependent. > >> Can someone verify this? > > mustang-01:~ $ gcc -m64 hello.c > gcc: error: unrecognized command line option '-m64' > mustang-01:~ $ gcc --version > gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16) Also the man page lists -m64 under only specific targets. -- - DML From erik.osterlund at lnu.se Sat Nov 8 09:43:32 2014 From: erik.osterlund at lnu.se (=?iso-8859-1?Q?Erik_=D6sterlund?=) Date: Sat, 8 Nov 2014 09:43:32 +0000 Subject: Branch Prediction? Message-ID: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> Hi, Just out of curiosity, is there some good reason why we don't have a branch prediction macro? For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, I feel a bit uneasy not telling the compiler that this is pretty likely to succeed, and relying on its guessing. Has it been excluded because it's considered not nice or perhaps it was simply never introduced because nobody found it useful? Could have some define like this for GCC, which for other compilers reduces to nothing: #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) It might not lead to drastic performance improvements, but it feels weird not to tell the compiler what we know and keep secrets from it. And I think it's also nice for documentation purposes that people reading it also understand that this expression is gonna be true most of the time, and deal with it accordingly. /Erik From aleksey.shipilev at oracle.com Sat Nov 8 14:09:39 2014 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Sat, 08 Nov 2014 17:09:39 +0300 Subject: Compiler branch hints (was: Branch Prediction?) 
In-Reply-To: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> References: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> Message-ID: <545E2423.7070100@oracle.com> Hi, On 08.11.2014 12:43, Erik Österlund wrote: > Just out of curiosity, is there some good reason why we don't have a > branch prediction macro? For every tight load a; cmpxchg(expect: a, > addr: &x, new_val: b); loop, I feel a bit uneasy not telling the > compiler that this is pretty likely to succeed, and relying on its > guessing. > > Has it been excluded because it's considered not nice or perhaps it > was simply never introduced because nobody found it useful? I would not use the term "branch prediction" here, since that erroneously relates the issue to CPU branch prediction. The macros you are proposing affect the code layout in compilers, not the CPU execution. Therefore, I have a hard time understanding how introducing these macros would affect the CAS loop example above. When you have a loop, you will generate a (back) branch to the loop header, and as far as I can see, there is only one sane layout for the loop anyway. More interesting layouts emerge when compilers can use the hint to selectively peel the loops, but I don't think GCC cares about that? That being said, it might be worthwhile to try and optimize some performance-critical code, e.g. in classloaders or GC to see what effect you are after. I think we can find hardware that lacks sophisticated HW branch prediction that will hide the cost. > Could have some define like this for GCC, which for other compilers > reduces to nothing: > > #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) #define > VM_EXPECT_FALSE(A) __builtin_expect((A), false) > > It might not lead to drastic performance improvements, but it feels > weird not to tell the compiler what we know and keep secrets from it.
> And I think it's also nice for documentation purposes that people > reading it also understand that this expression is gonna be true most > of the time, and deal with it accordingly. I like the Linux kernel notation for these: likely/unlikely. As in: if (likely(a == 42)) { ... } Thanks, -Aleksey. From erik.osterlund at lnu.se Sat Nov 8 14:50:09 2014 From: erik.osterlund at lnu.se (Erik Österlund) Date: Sat, 8 Nov 2014 14:50:09 +0000 Subject: Compiler branch hints (was: Branch Prediction?) In-Reply-To: <545E2423.7070100@oracle.com> References: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> <545E2423.7070100@oracle.com> Message-ID: <94A77717-2570-4537-8DAF-47BFC5572918@lnu.se> On 08 Nov 2014, at 15:09, Aleksey Shipilev wrote: > Hi, > > On 08.11.2014 12:43, Erik Österlund wrote: >> Just out of curiosity, is there some good reason why we don't have a >> branch prediction macro? For every tight load a; cmpxchg(expect: a, >> addr: &x, new_val: b); loop, I feel a bit uneasy not telling the >> compiler that this is pretty likely to succeed, and relying on its >> guessing. >> >> Has it been excluded because it's considered not nice or perhaps it >> was simply never introduced because nobody found it useful? > > I would not use the term "branch prediction" here, since that erroneously > relates the issue to CPU branch prediction. The macros you are proposing > affect the code layout in compilers, not the CPU execution. > > Therefore, I have a hard time understanding how introducing these > macros would affect the CAS loop example above. When you have a loop, you will > generate a (back) branch to the loop header, and as far as I can see, there is > only one sane layout for the loop anyway. More interesting layouts > emerge when compilers can use the hint to selectively peel the loops, > but I don't think GCC cares about that?
Well, there are loads of optimizations on loops, like loop unwinding too where you repeat the body of loops and trade space for faster loops if they are likely to repeat etc. And a bunch of others of course. Does GCC do such loop optimizations? Maybe, maybe not. And if so, under which circumstances and in which configurations? Not quite sure. But I'd be surprised to say the least, if GCC didn't do any loop optimizations at all. And regardless which ones they do, hinting which code path is more likely is good practice, and that's why such compiler intrinsics exist in the first place. If we know for sure the loop will almost always end, why not tell the compiler, so that it won't have to guess? And then we won't have to guess what kind of guessing it's doing if any at all either. By making it explicit, we won't even have to think about it and can sleep well at night knowing that we gave the information we could to the compiler and that if the generated code looks bad, then at least we can blame somebody else. ;) As for my load phi cas loop example, other libraries like libatomic use such macros for those exact cases, so I'm guessing there's good reason for it. > That being said, it might be worthwhile to try and optimize some > performance-critical code, e.g. in classloaders or GC to see what effect > you are after. I think we can find a hardware that lacks sophisticated > HW branch prediction that will hide the cost. I'm sure we can. And even if we can't and the performance difference is invisible, I still think it's a good practice, simply for code readability if nothing else. Anyone reading the code will know which code path (...not hardware...) is more likely (and performance critical), and take that into consideration when writing performance critical code. 
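For illustration only, here is a minimal sketch of the kind of hint macro discussed in this thread, applied to a load/CAS retry loop. The names VM_LIKELY, VM_UNLIKELY and fetch_add_cas are hypothetical and are not existing HotSpot code; the sketch uses std::atomic rather than HotSpot's own atomics, and assumes GCC or Clang for __builtin_expect, reducing to a plain expression on other compilers. Whether the hint changes the generated code at all is exactly the open question above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical hint macros (illustrative names, not existing HotSpot macros).
// On GCC/Clang they forward to __builtin_expect; elsewhere they do nothing.
#if defined(__GNUC__) || defined(__clang__)
#define VM_LIKELY(x)   __builtin_expect(!!(x), 1)
#define VM_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define VM_LIKELY(x)   (x)
#define VM_UNLIKELY(x) (x)
#endif

// Atomically adds delta to counter using a load/CAS retry loop and
// returns the value observed just before the successful update.
inline int fetch_add_cas(std::atomic<int>& counter, int delta) {
  int observed = counter.load(std::memory_order_relaxed);
  // A failed CAS is expected to be rare, so the retry branch is marked
  // unlikely. compare_exchange_weak refreshes 'observed' on failure,
  // which is why the loop body is empty.
  while (VM_UNLIKELY(!counter.compare_exchange_weak(observed, observed + delta))) {
  }
  return observed;
}
```

Even if the optimizer ignores the hint, the macro documents for the reader which path is expected to be hot.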
>> Could have some define like this for GCC, which for other compilers >> reduces to nothing: >> >> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) #define >> VM_EXPECT_FALSE(A) __builtin_expect((A), false) >> >> It might not lead to drastic performance improvements, but it feels >> weird not to tell the compiler what we know and keep secrets from it. >> And I think it's also nice for documentation purposes that people >> reading it also understand that this expression is gonna be true most >> of the time, and deal with it accordingly. > > I like Linux Kernel notation for these: likely/unlikely. As in; > > if (likely(a == 42)) { > ... > } Yeah I guess having a smaller name could be nicer. :) Given your reply, I'm guessing the answer to my original question (why we don't have the macro), is that it simply has not been introduced yet, rather than there existing some good reason why we would like to ban such compiler hints, and that it has been considered in the past? Thanks for the reply. /Erik > Thanks, > -Aleksey. From jaromir.hamala at gmail.com Sun Nov 9 18:53:00 2014 From: jaromir.hamala at gmail.com (Jaromir Hamala) Date: Sun, 9 Nov 2014 18:53:00 +0000 Subject: How to add a new intrinsic Message-ID: Hi, I've been playing with HotSpot and as a learning exercise I've tried to add a new intrinsic - the pause instruction on x86 CPUs. I have very limited knowledge of the HotSpot codebase and my knowledge of C++ is rusty at best. I think I've got it working for interpreted mode, but I'm not very sure :) At least it's not crashing the JVM and I'm not getting UnsatisfiedLinkError either - on x86 obviously. I have generated a webrev with my changes in HotSpot: https://dl.dropboxusercontent.com/u/3201393/pause-webrev/index.html I do not have ambitions to include it in OpenJDK, but I greatly appreciate any feedback / help. I guess the next step for me is to include C1 and eventually C2 support - again - any pointers are very highly appreciated!
Cheers, Jaromir -- “Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” Antoine de Saint Exupéry From aleksey.shipilev at oracle.com Sun Nov 9 19:48:37 2014 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Sun, 09 Nov 2014 22:48:37 +0300 Subject: How to add a new intrinsic In-Reply-To: References: Message-ID: <545FC515.60608@oracle.com> Hi Jaromir, On 11/09/2014 09:53 PM, Jaromir Hamala wrote: > I do not have ambitions to include it in OpenJDK, Why not? Please follow step 0 from here: http://openjdk.java.net/contribute/ -- submit the OCA. > but I greatly appreciate any feedback / help. I guess the next step > for me is to include C1 and eventually C2 support - again - any > pointers are very highly appreciated! This may be the shortest example of a simple Unsafe intrinsic handled in the interpreter, C1 and C2 (notice how much shorter the interpreter code is, since we "just" use the native methods as interpreter handlers): http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/ad6097d547e1 http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/1e41b0bc58a0 IIRC, for C1, you would need to: 1) Handle the intrinsic in LIRGenerator::do_Intrinsic (c1_LIRGenerator.cpp); it should add the nodes to the C1 IR, see e.g. membar_acquire(). You will have to create a new LirOp, with 0 arguments, in the LIR_Code enum, say, "lir_pause". 2) Lower the lir_pause to machine code in LIR_Assembler::emit_op0 (see c1_LIRAssembler.cpp). There should be a call to the macro-assembler defining the PAUSE instruction. For C2, you would need to: 1) Add the intrinsic definitions and intrinsic code into library_call.cpp/hpp. It would be easier to follow the code for some already-existing simple intrinsic, see e.g. inline_unsafe_prefetch. Your code should emit a new, special-named IR node, say, PauseNode. 2) Add the matching rule for PauseNode into the architecture description file (x86_32.ad or x86_64.ad).
This file matches the IR node to the concrete machine code to emit. It is usually macroed into assembler call. E.g. PrefetchRead node with mem argument is matched to Assembler::prefetchr in assembler_x86.cpp. There, the exact machine code is emitted. I think this is enough to make a working example. -Aleksey. From jaromir.hamala at gmail.com Sun Nov 9 20:38:48 2014 From: jaromir.hamala at gmail.com (Jaromir Hamala) Date: Sun, 9 Nov 2014 20:38:48 +0000 Subject: How to add a new intrinsic In-Reply-To: <545FC515.60608@oracle.com> References: <545FC515.60608@oracle.com> Message-ID: Hi Aleksey, thanks again for your feedback & help! I'm working on C1 right now and your examples made it way easier for me. I'll sort out the OCA thing. Cheers, Jaromir On Sun, Nov 9, 2014 at 7:48 PM, Aleksey Shipilev < aleksey.shipilev at oracle.com> wrote: > Hi Jaromir, > > On 11/09/2014 09:53 PM, Jaromir Hamala wrote: > > I do not have ambitions to include it in OpenJDK, > > Why not? Please follow the step 0 from here: > http://openjdk.java.net/contribute/ -- submit the OCA. > > > > but I greatly appreciate any feedback / help. I guess the next step > > for me is to include C1 and eventually C2 support - again - any > > pointers are very highly appreciated! > > This may be a shortest example for simple Unsafe intrinsic handled in > interpreter, C1 and C2 (notice how much shorter the interpreter code is, > since we "just" use the native methods as interpreter handlers): > http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/ad6097d547e1 > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/1e41b0bc58a0 > > > IIRC, for C1, you would need to: > > 1) Handle the intrinsic in LIRGenerator::do_Intrinsic > (c1_LIRGenerator.cpp), it should add the nodes to C1 IR, see e.g. > membar_acquire(). You will have to create a new LirOp, with 0 arguments, > in LIR_Code enum, say, "lir_pause". > > 2) Lower the lir_pause to machine code in LIR_Assembler::emit_op0 (see > c1_LIRAssembler.cpp). 
There should be a call to macro-assembler defining > the PAUSE instruction. > > > For C2, you would need to: > > 1) Add the intrinsic definitions and intrinsic code into > library_call.cpp/hpp. It would be easier to follow the code for some > already-existing simple intrinsic, see e.g. inline_unsafe_prefetch. Your > code should emit a new, special-named IR node, say, PauseNode. > > 2) Add the matching rule for PauseNode into architecture description > file (x86_32.ad or x86_64.ad). This file matches the IR node to the > concrete machine code to emit. It is usually macroed into assembler > call. E.g. PrefetchRead node with mem argument is matched to > Assembler::prefetchr in assembler_x86.cpp. There, the exact machine code > is emitted. > > > I think this is enough to make a working example. > > -Aleksey. > > -- “Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” Antoine de Saint Exupéry From rednaxelafx at gmail.com Mon Nov 10 01:01:33 2014 From: rednaxelafx at gmail.com (Krystal Mok) Date: Sun, 9 Nov 2014 17:01:33 -0800 Subject: How to add a new intrinsic In-Reply-To: References: <545FC515.60608@oracle.com> Message-ID: Hi Jaromir, Here are the slides of a presentation I did on HotSpot intrinsics: http://www.slideshare.net/RednaxelaFX/green-teajug-hotspotintrinsics02232013 Hope it helps. That said, back at Taobao we implemented the intrinsics for the pause instruction in HotSpot. From our experience, it's actually not a good entry-level task for someone new to the HotSpot code base, because of its semantics. In my opinion, a good intrinsic for the "pause" instruction should do the following: 1. Don't use the "pause" name for the intrinsic. It's not a good name to indicate its purpose. Gil suggested something like "sun.misc.Unsafe.spinLoopHint()", which I think is much better than "pause()". Let me refer to this intrinsic with the name "spinLoopHint()" in the rest of this email.
2. The interpreter version of this intrinsic should actually be implemented with the EmptyMethod intrinsic, instead of an actual x86 pause instruction. The interpreter contains a dispatch loop itself (in the case of the HotSpot template interpreter it's token-threaded / indirect-threaded, but logically it's still a dispatch loop), whereas the loop that you want to affect is a Java-level loop, which is one level of abstraction away from the interpreter. The pause instruction in the interpreter would be on a different level from the Java-level loop. The EmptyMethod intrinsic (-XX:+UseFastEmptyMethods) has already been removed from the current version of the HotSpot VM, because it interferes with the tiered compilation system by not having method invocation counter update logic. But you can probably revive the code for implementing the pause intrinsic in the interpreter. 3. The C1 and C2 versions. These should only treat the "spinLoopHint()" call as a hint. Instead of generating an explicit "PauseNode" in place of this call, you should probably mark the hint on a LoopNode (or in the case of C1, mark it on the basic block with the backedge). Then, only emit the x86 pause instruction at the backedge if the loop is not a CountedLoop, assuming spin loops shouldn't look like a counted loop. This is very different from the way other intrinsics are implemented, say, String.equals() or Unsafe.compareAndSwapInt(), where you could just treat it as a call and unconditionally emit the code in place of the call. Just my two cents. - Kris On Sun, Nov 9, 2014 at 12:38 PM, Jaromir Hamala wrote: > Hi Aleksey, > > thanks again for your feedback & help! I'm working on C1 right now and your > examples made it way easier for me. I'll sort out the OCA thing. 
> > Cheers, > Jaromir > > > > On Sun, Nov 9, 2014 at 7:48 PM, Aleksey Shipilev < > aleksey.shipilev at oracle.com> wrote: > > > Hi Jaromir, > > > > On 11/09/2014 09:53 PM, Jaromir Hamala wrote: > > > I do not have ambitions to include it in OpenJDK, > > > > Why not? Please follow the step 0 from here: > > http://openjdk.java.net/contribute/ -- submit the OCA. > > > > > > > but I greatly appreciate any feedback / help. I guess the next step > > > for me is to include C1 and eventually C2 support - again - any > > > pointers are very highly appreciated! > > > > This may be a shortest example for simple Unsafe intrinsic handled in > > interpreter, C1 and C2 (notice how much shorter the interpreter code is, > > since we "just" use the native methods as interpreter handlers): > > http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/ad6097d547e1 > > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/1e41b0bc58a0 > > > > > > IIRC, for C1, you would need to: > > > > 1) Handle the intrinsic in LIRGenerator::do_Intrinsic > > (c1_LIRGenerator.cpp), it should add the nodes to C1 IR, see e.g. > > membar_acquire(). You will have to create a new LirOp, with 0 arguments, > > in LIR_Code enum, say, "lir_pause". > > > > 2) Lower the lir_pause to machine code in LIR_Assembler::emit_op0 (see > > c1_LIRAssembler.cpp). There should be a call to macro-assembler defining > > the PAUSE instruction. > > > > > > For C2, you would need to: > > > > 1) Add the intrinsic definitions and intrinsic code into > > library_call.cpp/hpp. It would be easier to follow the code for some > > already-existing simple intrinsic, see e.g. inline_unsafe_prefetch. Your > > code should emit a new, special-named IR node, say, PauseNode. > > > > 2) Add the matching rule for PauseNode into architecture description > > file (x86_32.ad or x86_64.ad). This file matches the IR node to the > > concrete machine code to emit. It is usually macroed into assembler > > call. E.g. 
PrefetchRead node with mem argument is matched to > > Assembler::prefetchr in assembler_x86.cpp. There, the exact machine code > > is emitted. > > > > > > I think this is enough to make a working example. > > > > -Aleksey. > > > > > > > -- > “Perfection is achieved, not when there is nothing more to add, but when > there is nothing left to take away.” > Antoine de Saint Exupéry > From erik.joelsson at oracle.com Mon Nov 10 07:55:39 2014 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 10 Nov 2014 08:55:39 +0100 Subject: adjust-mflags mangles paths into arguments In-Reply-To: References: Message-ID: <54606F7B.7020109@oracle.com> Hello Eric, Thanks for the detailed report. This looks like https://bugs.openjdk.java.net/browse/JDK-8028407 which probably needs to be backported to 8u as gnu make 4.0 is only going to get more common going forward. /Erik On 2014-11-07 23:15, Eric Reischer wrote: > Steps to reproduce: > > $ hg clone http://hg.openjdk.java.net/jdk8u/jdk8u jdk8u > $ cd jdk8u > $ sh get_source.sh > $ sh configure --with-freetype-include=/usr/include/freetype2 > --with-freetype-lib=/usr/lib/x86_64-linux-gnu > --with-debug-level=slowdebug > $ LOG=trace JOBS=1 make all > > > {....} > + echo > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug > /home/emr/jdk8u/hotspot/make/linux/makefiles/top.make:91: Building > ad_stuff (from > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/platform.current) > (/home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/platform.current > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/adjust-mflags > newer) > ++ pwd > + echo > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug > /home/emr/jdk8u/hotspot/make/linux/makefiles/top.make:91: Building > ad_stuff (from > 
/home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/platform.current) > (/home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/platform.current > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/adjust-mflags > newer) > ++ > /home/emr/jdk8u/build/linux-x86_64-normal-server-slowdebug/hotspot/linux_amd64_compiler2/debug/../generated/adjust-mflags > '-rRw -I/home/emr/jdk8u/make/common -I/home/emr/jdk8u/make/common > -I/home/emr/jdk8u/make/common -I/home/emr/jdk8u/make/common > -I/home/emr/jdk8u/make/common' 1 > + /usr/bin/make VERBOSE= LOG_LEVEL=trace -R -I > /home/emr/jdk8u/make/common -f adlc.make -r -rRw -I/home/emr/ -j1 > -dk8u/make/common -I/home/emr/jdk8u/make/common > -I/home/emr/jdk8u/make/common -I/home/emr/jdk8u/make/common > -I/home/emr/jdk8u/make/common > /usr/bin/make: invalid option -- '8' > /usr/bin/make: invalid option -- 'u' > /usr/bin/make: invalid option -- '/' > /usr/bin/make: invalid option -- 'a' > /usr/bin/make: invalid option -- '/' > /usr/bin/make: invalid option -- 'c' > Usage: make [options] [target] ... > Options: > -b, -m Ignored for compatibility. > {....} > > As you can see, "adjust-mflags" is taking the path > /home/emr/jdk8u/make/common > and incorrectly converting it into > /home/emr/ -j1 -dk8u/make/common > > Naturally, everything is boned from that point on. This occurs when > the "all" target gets to building hotspot. 
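The mangling reported above can be reproduced in miniature. The sketch below is not the adjust-mflags script itself (its source is not shown in this thread); `naive_set_jobs` is a hypothetical stand-in that rewrites every `j` character as if the whole flag string were one bundle of single-letter options, which happens to yield the same `-I/home/emr/ -j1 -dk8u/make/common` fragment seen in the log. `token_aware_set_jobs` is an equally hypothetical alternative that only touches whole `-j` tokens and bundles of short flags.

```python
import re


def naive_set_jobs(mflags, jobs):
    """Hypothetical stand-in for the faulty rewrite: every 'j' in the
    string is treated as the jobs flag, including letters inside paths."""
    return mflags.replace("j", f" -j{jobs} -")


def token_aware_set_jobs(mflags, jobs):
    """Rewrite the jobs flag token by token, leaving -I<path> arguments alone."""
    out = []
    for tok in mflags.split():
        # Drop an existing stand-alone jobs flag such as -j or -j4.
        if re.fullmatch(r"-j\d*", tok):
            continue
        # A bundle of short flags (e.g. -rRw or -kj): strip only the 'j'.
        # Tokens starting with -I carry an include path and are never bundles.
        if not tok.startswith("-I") and re.fullmatch(r"-[A-Za-z]+", tok):
            tok = tok.replace("j", "")
            if tok == "-":
                continue
        out.append(tok)
    out.append(f"-j{jobs}")
    return " ".join(out)


flags = "-rRw -I/home/emr/jdk8u/make/common"
print(naive_set_jobs(flags, 1))        # path is split at the 'j' in jdk8u
print(token_aware_set_jobs(flags, 1))  # path survives intact
```

The point of the second function is only that flag rewriting has to respect token boundaries once make starts passing `-I` paths through MFLAGS; the actual fix is the one tracked in JDK-8028407.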
From erik.joelsson at oracle.com Mon Nov 10 08:08:41 2014 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 10 Nov 2014 09:08:41 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <545D150F.0@redhat.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> Message-ID: <54607289.9090002@oracle.com> Hello, I would certainly like to have these files updated, but unfortunately the license on these files changed from GPL2 to GPL3. This essentially means that the switch is non trivial from a legal perspective and the impression I've received when I last inquired about updating these files was that it's unlikely to ever happen unless a very strong case can be presented for why it's needed. So the reason we have the over engineered solution for config.guess is simply that it's much easier than getting legal approval for updating these files. /Erik On 2014-11-07 19:53, Andrew Haley wrote: > On 11/07/2014 06:00 PM, Volker Simonis wrote: >> 3. pull in the new version of config.guess and config.sub from [1] >> which already seem to have the changes you need. >> >> I'm all in favour of point three which would also allow us to get rid >> of some of the hacks which are currently in config.guess. And now, as >> we're still early in the jdk9 development the risk of doing this seems >> minimal, but let's see what the build-dev guy say? > So am I. build-dev people, do you want me to import config.guess from > upstream? I can create a new issue. > > Andrew. 
> From aph at redhat.com Mon Nov 10 08:54:03 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 10 Nov 2014 08:54:03 +0000 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <54607289.9090002@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> Message-ID: <54607D2B.3060007@redhat.com> On 10/11/14 08:08, Erik Joelsson wrote: > I would certainly like to have these files updated, but > unfortunately the license on these files changed from GPL2 to > GPL3. This essentially means that the switch is non trivial from a > legal perspective and the impression I've received when I last > inquired about updating these files was that it's unlikely to ever > happen unless a very strong case can be presented for why it's > needed. > > So the reason we have the over engineered solution for config.guess > is simply that it's much easier than getting legal approval for > updating these files. I see. That suggests to me that my simple one-liner is the way to go. Andrew. From volker.simonis at gmail.com Mon Nov 10 09:27:20 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 10 Nov 2014 10:27:20 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <54607289.9090002@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> Message-ID: On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson wrote: > Hello, > > I would certainly like to have these files updated, but unfortunately the > license on these files changed from GPL2 to GPL3. This essentially means > that the switch is non trivial from a legal perspective and the impression > I've received when I last inquired about updating these files was that it's > unlikely to ever happen unless a very strong case can be presented for why > it's needed. 
> > So the reason we have the over engineered solution for config.guess is > simply that it's much easier than getting legal approval for updating these > files. OK, but in that case I don't see any reason for keeping this "over-engineered" solution at all. If there will not be any pulls from upstream anyway then there's no reason for keeping these files untouched. I'd propose then to just remove the wrappers and do all the changes right in the corresponding files (of course that's not the topic of this change but should be done separately). Regards, Volker > > /Erik > > > On 2014-11-07 19:53, Andrew Haley wrote: >> >> On 11/07/2014 06:00 PM, Volker Simonis wrote: >>> >>> 3. pull in the new version of config.guess and config.sub from [1] >>> which already seem to have the changes you need. >>> >>> I'm all in favour of point three which would also allow us to get rid >>> of some of the hacks which are currently in config.guess. And now, as >>> we're still early in the jdk9 development the risk of doing this seems >>> minimal, but let's see what the build-dev guy say? >> >> So am I. build-dev people, do you want me to import config.guess from >> upstream? I can create a new issue. >> >> Andrew. >> > From erik.joelsson at oracle.com Mon Nov 10 09:42:00 2014 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 10 Nov 2014 10:42:00 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> Message-ID: <54608868.3010108@oracle.com> On 2014-11-10 10:27, Volker Simonis wrote: > On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson wrote: >> Hello, >> >> I would certainly like to have these files updated, but unfortunately the >> license on these files changed from GPL2 to GPL3. 
This essentially means >> that the switch is non trivial from a legal perspective and the impression >> I've received when I last inquired about updating these files was that it's >> unlikely to ever happen unless a very strong case can be presented for why >> it's needed. >> >> So the reason we have the over engineered solution for config.guess is >> simply that it's much easier than getting legal approval for updating these >> files. > OK, but in that case I don't see any reason for keeping this > "over-engineered" solution at all. If there will not be any pulls from > upstream anyway then there's no reason for keeping these file > untouched. I'd propose then to just remove the wrappers and do all the > chenges right in the corresponding files (of course that's not the > topic of this change but should be done separately). And again, the reason we didn't change the existing file but instead wrapped it, was that we don't have explicit legal approval for doing derivative work for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the person saying it is ok. /Erik > Regards, > Volker > >> /Erik >> >> >> On 2014-11-07 19:53, Andrew Haley wrote: >>> On 11/07/2014 06:00 PM, Volker Simonis wrote: >>>> 3. pull in the new version of config.guess and config.sub from [1] >>>> which already seem to have the changes you need. >>>> >>>> I'm all in favour of point three which would also allow us to get rid >>>> of some of the hacks which are currently in config.guess. And now, as >>>> we're still early in the jdk9 development the risk of doing this seems >>>> minimal, but let's see what the build-dev guy say? >>> So am I. build-dev people, do you want me to import config.guess from >>> upstream? I can create a new issue. >>> >>> Andrew. 
>>> From zoltan.majo at oracle.com Mon Nov 10 09:48:13 2014 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 10 Nov 2014 10:48:13 +0100 Subject: [8u40] Request for approval: Backport of 8057622(S) Message-ID: <546089DD.5030306@oracle.com> Hi, I would like to request the backport of the fix for JDK-8057622 to 8u40. The changes were pushed on November 6, 2014, nightly testing shows no problems. Unfortunately, the patch does not apply cleanly. [9] bug: https://bugs.openjdk.java.net/browse/JDK-8057622 [9] webrev: http://cr.openjdk.java.net/~zmajo/8057622/webrev.03/ [9] changeset: http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/e2441a0d98f3 [8u40] webrev: http://cr.openjdk.java.net/~zmajo/8057622_8u40/webrev.00/ All JPRT tests pass. Thank you and best regards, Zoltan From volker.simonis at gmail.com Mon Nov 10 10:32:35 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 10 Nov 2014 11:32:35 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <54608868.3010108@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> Message-ID: On Mon, Nov 10, 2014 at 10:42 AM, Erik Joelsson wrote: > > On 2014-11-10 10:27, Volker Simonis wrote: >> >> On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson >> wrote: >>> >>> Hello, >>> >>> I would certainly like to have these files updated, but unfortunately the >>> license on these files changed from GPL2 to GPL3. This essentially means >>> that the switch is non trivial from a legal perspective and the >>> impression >>> I've received when I last inquired about updating these files was that >>> it's >>> unlikely to ever happen unless a very strong case can be presented for >>> why >>> it's needed. >>> >>> So the reason we have the over engineered solution for config.guess is >>> simply that it's much easier than getting legal approval for updating >>> these >>> files. 
>> >> OK, but in that case I don't see any reason for keeping this >> "over-engineered" solution at all. If there will not be any pulls from >> upstream anyway then there's no reason for keeping these file >> untouched. I'd propose then to just remove the wrappers and do all the >> chenges right in the corresponding files (of course that's not the >> topic of this change but should be done separately). > > And again, the reason we didn't change the existing file but instead wrapped > it, was that we don't have explicit legal approval for doing derivative work > for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the > person saying it is ok. > OK, now I got it. I thought we just use the wrappers because we want to easily integrate the upstream versions. But instead it is only because we don't want to edit these files because of legal uncertainties. So in that case that means we're also not allowed to edit 'config.sub' and have to create a wrapper for it, right? Volker > /Erik > >> Regards, >> Volker >> >>> /Erik >>> >>> >>> On 2014-11-07 19:53, Andrew Haley wrote: >>>> >>>> On 11/07/2014 06:00 PM, Volker Simonis wrote: >>>>> >>>>> 3. pull in the new version of config.guess and config.sub from [1] >>>>> which already seem to have the changes you need. >>>>> >>>>> I'm all in favour of point three which would also allow us to get rid >>>>> of some of the hacks which are currently in config.guess. And now, as >>>>> we're still early in the jdk9 development the risk of doing this seems >>>>> minimal, but let's see what the build-dev guy say? >>>> >>>> So am I. build-dev people, do you want me to import config.guess from >>>> upstream? I can create a new issue. >>>> >>>> Andrew. 
>>>> > From mikael.vidstedt at oracle.com Mon Nov 10 14:38:41 2014 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 10 Nov 2014 15:38:41 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> Message-ID: <5460CDF1.8050205@oracle.com> On 2014-11-07 14:12, Volker Simonis wrote: > On Fri, Nov 7, 2014 at 12:56 PM, Mikael Vidstedt > wrote: >> Volker, >> >> Thanks for reminding me, this totally slipped my mind. >> >> I think it's fair to say say we've given this enough time for feedback, and >> that the feedback has been all supportive. With that in mind I consider the >> proposal approved and effective immediately! >> > OK great. So does this mean we can now push reviewed changes to the > ppc/aix subdirs right away? That is indeed the idea - modulo the "if at review code review time a change is for some reason deemed to be risky and/or otherwise have impact on shared files" part which, again, hopefully is rare. Cheers, Mikael > >> Cheers, >> Mikael >> >> >> On 2014-11-06 15:35, Volker Simonis wrote: >>> Hi Mikael, >>> >>> just wanted to ask what's the status of this project? >>> I hope it was not just a JavaOne hoax :) >>> >>> Regards, >>> Volker >>> >>> >>> On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis >>> wrote: >>>> Thanks Mikael, that sounds good! >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >>>> wrote: >>>>> Volker, >>>>> >>>>> The proposal is only to change how the changes are pushed, not which >>>>> forests >>>>> changes can be pushed to. That is, we would still require hotspot >>>>> changes to >>>>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or to >>>>> the >>>>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be >>>>> applied on >>>>> all those (four) forests. Reasonable? 
>>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>> >>>>> On 2014-09-12 11:38, Volker Simonis wrote: >>>>>> Hi Mikael, >>>>>> >>>>>> there's one more question that came to my mind: will the new rule >>>>>> apply to all hotspot respitories (i.e. jdk9/hs-rt/hotspot, >>>>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ? >>>>>> >>>>>> Thanks, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>>>> wrote: >>>>>>> Andrew/Volker, >>>>>>> >>>>>>> Thanks for the positive feedback. The goal of the proposal is to >>>>>>> simplify >>>>>>> pushing changes which are effectively not tested by the jprt system >>>>>>> anyway. >>>>>>> The proposed relaxation would not affect work on other infrastructure >>>>>>> projects in any relevant way, but would hopefully improve all our >>>>>>> lives >>>>>>> significantly immediately. >>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> >>>>>>> >>>>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>>>> Hi Mikael, >>>>>>>> >>>>>>>> thanks a lot for this proposal. I think this will dramatically >>>>>>>> simplify our work to keep our ports up to date! So I fully support >>>>>>>> it. >>>>>>>> >>>>>>>> Nevertheless, I think this can only be a first step towards fully >>>>>>>> open >>>>>>>> the JPRT system to developers outside Oracle. With "opening" I mean >>>>>>>> to >>>>>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>>>>> jobs as well as allowing porting projects to add hardware which >>>>>>>> builds >>>>>>>> and tests the HotSpot on alternative platforms. >>>>>>>> >>>>>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>>>>> doubts that this simplification will hopefully not push the >>>>>>>> realization of a truly OPEN JPRT system even further away. 
>>>>>>>> >>>>>>>> Regards, >>>>>>>> Volker >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>>>> wrote: >>>>>>>>> All, >>>>>>>>> >>>>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is >>>>>>>>> highly >>>>>>>>> platform dependent and also tightly coupled with the tool chains on >>>>>>>>> the >>>>>>>>> various platforms. Each platform/tool chain combination has its set >>>>>>>>> of >>>>>>>>> special quirks, and code must be implemented in a way such that it >>>>>>>>> only >>>>>>>>> relies on the common subset of syntax and functionality across all >>>>>>>>> these >>>>>>>>> combinations. History has taught us that even simple changes can >>>>>>>>> have >>>>>>>>> surprising results when compiled with different compilers. >>>>>>>>> >>>>>>>>> For more than a decade the Hotspot team has ensured a minimum >>>>>>>>> quality >>>>>>>>> level >>>>>>>>> by requiring all pushes to be done through a build and test system >>>>>>>>> (jprt) >>>>>>>>> which guarantees that the code resulting from applying a set of >>>>>>>>> changes >>>>>>>>> builds on a set of core platforms and that a set of core tests pass. >>>>>>>>> Only >>>>>>>>> if >>>>>>>>> all the builds and tests pass will the changes actually be pushed to >>>>>>>>> the >>>>>>>>> target repository. >>>>>>>>> >>>>>>>>> We believe that testing like the above, in combination with later >>>>>>>>> stages >>>>>>>>> of >>>>>>>>> testing, is vital to ensuring that the quality level of the Hotspot >>>>>>>>> code >>>>>>>>> remains high and that developers do not run into situations where >>>>>>>>> the >>>>>>>>> latest >>>>>>>>> version has build errors on some platforms. >>>>>>>>> >>>>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK platforms. >>>>>>>>> From >>>>>>>>> a >>>>>>>>> Hotspot perspective this new platform added a set of AIX/PPC >>>>>>>>> specific >>>>>>>>> files >>>>>>>>> including some platform specific changes to shared code. 
The AIX/PPC >>>>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The >>>>>>>>> same >>>>>>>>> thing applies for the shark and zero versions of Hotspot. >>>>>>>>> >>>>>>>>> While Hotspot developers remain committed to making sure changes are >>>>>>>>> developed in a way such that the quality level remains high across >>>>>>>>> all >>>>>>>>> platforms and variants, because of the above mentioned complexities >>>>>>>>> it >>>>>>>>> is >>>>>>>>> inevitable that from time to time changes will be made which >>>>>>>>> introduce >>>>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>>>> testing. >>>>>>>>> >>>>>>>>> To allow these issues to be resolved more quickly I would like to >>>>>>>>> propose >>>>>>>>> a >>>>>>>>> relaxation in the requirements on how changes to Hotspot are pushed. >>>>>>>>> Specifically I would like to allow for direct pushes to the hotspot/ >>>>>>>>> repository of files specific to the following ports/variants/tools: >>>>>>>>> >>>>>>>>> * AIX >>>>>>>>> * PPC >>>>>>>>> * Shark >>>>>>>>> * Zero >>>>>>>>> >>>>>>>>> Today this translates into the following files: >>>>>>>>> >>>>>>>>> - src/cpu/ppc/** >>>>>>>>> - src/cpu/zero/** >>>>>>>>> - src/os/aix/** >>>>>>>>> - src/os_cpu/aix_ppc/** >>>>>>>>> - src/os_cpu/bsd_zero/** >>>>>>>>> - src/os_cpu/linux_ppc/** >>>>>>>>> - src/os_cpu/linux_zero/** >>>>>>>>> >>>>>>>>> Note that all changes are still required to go through the normal >>>>>>>>> development and review cycle; the proposed relaxation only applies >>>>>>>>> to >>>>>>>>> how >>>>>>>>> the changes are pushed. >>>>>>>>> >>>>>>>>> If at code review time a change is for some reason deemed to be >>>>>>>>> risky >>>>>>>>> and/or >>>>>>>>> otherwise have impact on shared files the reviewer may request that >>>>>>>>> the >>>>>>>>> change to go through the regular push testing. For changes only >>>>>>>>> touching >>>>>>>>> the >>>>>>>>> above set of files this expected to be rare. 
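The file list in the proposal above amounts to a simple rule: a changeset qualifies for a direct push only if every file it touches lies under one of the listed directories. A minimal sketch of that rule follows; `may_push_directly` and `DIRECT_PUSH_PREFIXES` are hypothetical names for illustration, not part of jprt or any OpenJDK tooling, and paths are assumed to be relative to the hotspot/ repository root.

```python
# Directories named in the proposal (the "**" globs become plain prefixes).
DIRECT_PUSH_PREFIXES = (
    "src/cpu/ppc/",
    "src/cpu/zero/",
    "src/os/aix/",
    "src/os_cpu/aix_ppc/",
    "src/os_cpu/bsd_zero/",
    "src/os_cpu/linux_ppc/",
    "src/os_cpu/linux_zero/",
)


def may_push_directly(changed_files):
    """Return True only if the changeset is non-empty and every changed
    file is under one of the port-specific directories."""
    files = list(changed_files)
    # str.startswith accepts a tuple of prefixes.
    return bool(files) and all(f.startswith(DIRECT_PUSH_PREFIXES) for f in files)


print(may_push_directly(["src/cpu/ppc/vm/assembler_ppc.cpp"]))
print(may_push_directly(["src/cpu/ppc/vm/assembler_ppc.cpp",
                         "src/share/vm/runtime/os.cpp"]))
```

A check like this only covers the mechanical part of the rule. Under the proposal a reviewer can still ask for the regular push testing when a change looks risky.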
>>>>>>>>> >>>>>>>>> Please let me know what you think. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Mikael >>>>>>>>> From goetz.lindenmaier at sap.com Mon Nov 10 14:57:11 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 10 Nov 2014 14:57:11 +0000 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. Message-ID: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> Hi, I need to improve a number of things around compressed oops heap handling to achieve good performance on ppc. I prepared a first webrev for review: http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ A detailed technical description of the change is in the webrev and the corresponding bug. If requested, I will split the change into parts with more and less impact, respectively, on non-ppc platforms. The change is derived from well-tested code in our VM. Originally it was crafted to require the fewest changes to VM code; I have changed it to fit in better with the VM. I tested that this change delivers heaps at about the same addresses as before. Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap in a better compressed oops mode is found, though. I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. Best regards, Goetz. From vladimir.kozlov at oracle.com Mon Nov 10 15:52:29 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 10 Nov 2014 07:52:29 -0800 Subject: [8u40] Request for approval: Backport of 8057622(S) In-Reply-To: <546089DD.5030306@oracle.com> References: <546089DD.5030306@oracle.com> Message-ID: <5460DF3D.8040204@oracle.com> Looks good. Thanks, Vladimir On 11/10/14 1:48 AM, Zoltán Majó wrote: > Hi, > > > I would like to request the backport of the fix for JDK-8057622 to 8u40. > The changes were pushed on November 6, 2014, nightly testing shows no > problems. > > Unfortunately, the patch does not apply cleanly. 
> > [9] bug: https://bugs.openjdk.java.net/browse/JDK-8057622 > [9] webrev: http://cr.openjdk.java.net/~zmajo/8057622/webrev.03/ > [9] changeset: > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/e2441a0d98f3 > [8u40] webrev: http://cr.openjdk.java.net/~zmajo/8057622_8u40/webrev.00/ > > All JPRT tests pass. > > Thank you and best regards, > > > Zoltan > From zoltan.majo at oracle.com Mon Nov 10 15:55:52 2014 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Mon, 10 Nov 2014 16:55:52 +0100 Subject: [8u40] Request for approval: Backport of 8057622(S) In-Reply-To: <5460DF3D.8040204@oracle.com> References: <546089DD.5030306@oracle.com> <5460DF3D.8040204@oracle.com> Message-ID: <5460E008.70203@oracle.com> Thank you for the feedback, Vladimir! Best regards, Zoltan On 11/10/2014 04:52 PM, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 11/10/14 1:48 AM, Zoltán Majó wrote: >> Hi, >> >> >> I would like to request the backport of the fix for JDK-8057622 to 8u40. >> The changes were pushed on November 6, 2014, nightly testing shows no >> problems. >> >> Unfortunately, the patch does not apply cleanly. >> >> [9] bug: https://bugs.openjdk.java.net/browse/JDK-8057622 >> [9] webrev: http://cr.openjdk.java.net/~zmajo/8057622/webrev.03/ >> [9] changeset: >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/e2441a0d98f3 >> [8u40] webrev: http://cr.openjdk.java.net/~zmajo/8057622_8u40/webrev.00/ >> >> All JPRT tests pass. 
>> >> Thank you and best regards, >> >> >> Zoltan >> From omajid at redhat.com Mon Nov 10 17:45:37 2014 From: omajid at redhat.com (Omair Majid) Date: Mon, 10 Nov 2014 12:45:37 -0500 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com> Message-ID: <20141110174536.GA2885@redhat.com> * Christian Thalinger [2014-11-07 13:11]: > > On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote: > > On 11/07/2014 05:42 PM, Christian Thalinger wrote: > >> common/autoconf/flags.m4 > >> > >> + aarch64) > >> + ZERO_ARCHFLAG="" > >> + ;; > >> > >> Why is this required on aarch64 but not all the other architectures? > > > > I think it's because GCC rejects "-m64". > > That's interesting. I thought -m is some kind of common > flag that works on all architectures. Can someone verify this? I had to do a similar fix for zero on arm32: http://hg.openjdk.java.net/jdk8/jdk8/rev/1dfcc874461e#l2.7 Perhaps that can be re-used here? 
Thanks, Omair -- PGP Key: 66484681 (http://pgp.mit.edu/) Fingerprint = F072 555B 0A17 3957 4E95 0056 F286 F14F 6648 4681 From eric.mccorkle at oracle.com Mon Nov 10 19:03:22 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 10 Nov 2014 14:03:22 -0500 Subject: Review request for 8058313: Mismatch of method descriptor and MethodParameters.parameters_count should cause MalformedParameterException In-Reply-To: <545D2B13.2000306@oracle.com> References: <54516C9A.7070404@oracle.com> <54518820.50700@oracle.com> <545251AD.8050208@oracle.com> <54527A90.4030503@oracle.com> <5452CC7F.1090809@oracle.com> <5457F530.2070907@oracle.com> <54597EFC.2070509@oracle.com> <545BBF74.4020607@oracle.com> <545C09FB.9020907@oracle.com> <545CC42A.8030004@oracle.com> <545D108B.5070504@oracle.com> <545D2020.50801@oracle.com> <545D2250.6070607@oracle.com> <545D258C.4040106@oracle.com> <545D2B13.2000306@oracle.com> Message-ID: <54610BFA.2080703@oracle.com> I ran the test list in tonga, with a patched and unpatched checkout. No difference in failures. On 11/07/14 15:26, Coleen Phillimore wrote: > > Eric, > > I reviewed this also. This is a bit confusing storing a 0 for > method_parameters_length=0 in ConstMethod but I think it's fine with the > comments. > > I have bad news though, you may need to make serviceability agent > changes. You can tell if you run nsk.sajdi.testlist in ute. > > Coleen > > On 11/7/14, 3:03 PM, Eric McCorkle wrote: >> Ah, ok, thanks. >> >> In that case, I need a capital-R reviewer to look at this, please? >> >> Thanks, >> Eric >> >> On 11/07/14 14:49, Jiangli Zhou wrote: >>> Eric, >>> >>> I'm not a "R"eviewer yet. You need at least two reviewers, within them >>> at least one should be "R"eviewer. >>> >>> Thanks, >>> Jiangli >>> >>> On 11/07/2014 11:40 AM, Eric McCorkle wrote: >>>> Thanks, >>>> >>>> Are you a capital R? If not, wouldn't that mean I need two? >>>> >>>> On 11/07/14 13:33, Jiangli Zhou wrote: >>>>> Hi Eric, >>>>> >>>>> Looks okay. 
You also need a capital R reviewer for the change. >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>> On 11/07/2014 05:07 AM, Eric McCorkle wrote: >>>>>> On 11/06/14 18:53, Jiangli Zhou wrote: >>>>>> >>>>>>> Could you please point to the updated webrev? I don't see the >>>>>>> update in >>>>>>> http://cr.openjdk.java.net/~emc/8058313/webrev.01/src/share/vm/prims/jvm.cpp.sdiff.html. >>>>>>> >>>>>>> >>>>>>> >>>>>> I made a mistake uploading it. It's here: >>>>>> http://cr.openjdk.java.net/~emc/8058313/webrev.02/ >>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>>>> On 11/03/2014 01:35 PM, Eric McCorkle wrote: >>>>>>>>>> Please review this issue so that it can go in along with 8058322. >>>>>>>>>> Thanks. >>>>>>>>>> >>>>>>>>>> On 10/30/14 19:40, Eric McCorkle wrote: >>>>>>>>>>> Thank you for the pointers. I have applied your changes and >>>>>>>>>>> refreshed >>>>>>>>>>> the webrev. >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ >>>>>>>>>>> >>>>>>>>>>> Also, I have posted the test for this and another patch here: >>>>>>>>>>> http://cr.openjdk.java.net/~emc/8062556/ >>>>>>>>>>> >>>>>>>>>>> On 10/30/14 13:51, Jiangli Zhou wrote: >>>>>>>>>>>> Hi Eric, >>>>>>>>>>>> >>>>>>>>>>>> On 10/30/2014 07:56 AM, Eric McCorkle wrote: >>>>>>>>>>>>> On 10/29/14 20:36, Jiangli Zhou wrote: >>>>>>>>>>>>>> Hi Eric, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I wonder if we could specialize this particular case and >>>>>>>>>>>>>> avoid >>>>>>>>>>>>>> changing >>>>>>>>>>>>>> the parsing code. How about setting the >>>>>>>>>>>>>> _has_method_parameters >>>>>>>>>>>>>> flag in >>>>>>>>>>>>>> the ConstMethod when encounter such MethodParameter, and >>>>>>>>>>>>>> changing >>>>>>>>>>>>>> JVM_GetMethodParameters() to return non-NULL value for such >>>>>>>>>>>>>> case >>>>>>>>>>>>>> when >>>>>>>>>>>>>> _has_method_parameters is true but method_parameters_length >>>>>>>>>>>>>> is 0. >>>>>>>>>>>>>> Would >>>>>>>>>>>>>> that work? >>>>>>>>>>>>> Which parser are you talking about? 
The inline tables >>>>>>>>>>>>> parser, or >>>>>>>>>>>>> the >>>>>>>>>>>>> class file parser. The class file parser has to change, >>>>>>>>>>>>> because it >>>>>>>>>>>>> was >>>>>>>>>>>>> previously ignoring MethodParameters attributes with >>>>>>>>>>>>> parameter_count 0. >>>>>>>>>>>> It's the class parsing changes that I was referring to, mostly >>>>>>>>>>>> relate to >>>>>>>>>>>> the initialization and checking against >>>>>>>>>>>> method_parameters_length. >>>>>>>>>>>> It's a >>>>>>>>>>>> bit awkward to include the 0 case but also skipping it in the >>>>>>>>>>>> loop. For >>>>>>>>>>>> example, the following code in classFileParser.cpp changed >>>>>>>>>>>> ">" to >>>>>>>>>>>> ">=" >>>>>>>>>>>> in the if check, but has no real effect and is not need. >>>>>>>>>>>> >>>>>>>>>>>> 2486 // Copy method parameters >>>>>>>>>>>> 2487 if (method_parameters_length >= 0) { >>>>>>>>>>>> 2488 MethodParametersElement* elem = >>>>>>>>>>>> m->constMethod()->method_parameters_start(); >>>>>>>>>>>> 2489 for (int i = 0; i < method_parameters_length; i++) { >>>>>>>>>>>> 2490 elem[i].name_cp_index = >>>>>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>>>>> 2491 method_parameters_data += 2; >>>>>>>>>>>> 2492 elem[i].flags = >>>>>>>>>>>> Bytes::get_Java_u2(method_parameters_data); >>>>>>>>>>>> 2493 method_parameters_data += 2; >>>>>>>>>>>> 2494 } >>>>>>>>>>>> 2495 } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> I don't think your proposal will work. The inline tables' >>>>>>>>>>>>> offsets are >>>>>>>>>>>>> all dependent on what inline tables are actually present. If >>>>>>>>>>>>> _has_method_parameters is set, then the inline tables code >>>>>>>>>>>>> expects the >>>>>>>>>>>>> last u2 of the inline tables to be a u2 indicating the >>>>>>>>>>>>> number of >>>>>>>>>>>>> method >>>>>>>>>>>>> parameters entries, preceeded by the array of method >>>>>>>>>>>>> parameters >>>>>>>>>>>>> data. 
>>>>>>>>>>>>> If _has_method_parameters is false, then it expects that there >>>>>>>>>>>>> is no >>>>>>>>>>>>> method parameters information at all (including no length >>>>>>>>>>>>> field). If >>>>>>>>>>>>> you were to set _has_method_parameters, but not store any >>>>>>>>>>>>> information in >>>>>>>>>>>>> the inline table, then it would cause errors for all the rest >>>>>>>>>>>>> of the >>>>>>>>>>>>> inline tables. >>>>>>>>>>>> Thank you for reminding me of the complexity of the inlined >>>>>>>>>>>> table >>>>>>>>>>>> calculation in the ConstMethod. My proposal would require >>>>>>>>>>>> tweaks in >>>>>>>>>>>> that >>>>>>>>>>>> area to correctly compute the table sizes. As it's easy to >>>>>>>>>>>> introduce >>>>>>>>>>>> bugs in that area, it's not worth to change the table >>>>>>>>>>>> calculation >>>>>>>>>>>> code >>>>>>>>>>>> for this purpose. I agree my proposal is not a better choice in >>>>>>>>>>>> this >>>>>>>>>>>> case. >>>>>>>>>>>> >>>>>>>>>>>>> What I do for the parameter_count = 0 case is just >>>>>>>>>>>>> store >>>>>>>>>>>>> a 0 u2 for zero-length method parameters information, and no >>>>>>>>>>>>> data. >>>>>>>>>>>>> All >>>>>>>>>>>>> the existing inline tables code works fine with this case, so >>>>>>>>>>>>> there >>>>>>>>>>>>> aren't any serious changes to the inline tables code (other >>>>>>>>>>>>> than >>>>>>>>>>>>> allowing method parameters information to be stored when the >>>>>>>>>>>>> array is >>>>>>>>>>>>> length 0). But you have to make some change to the inline >>>>>>>>>>>>> table >>>>>>>>>>>>> code, >>>>>>>>>>>>> otherwise the information won't be stored. >>>>>>>>>>>> Ok. Could you please add comments to the change in >>>>>>>>>>>> constMethod.cpp to >>>>>>>>>>>> explain above? >>>>>>>>>>>> >>>>>>>>>>>> In jvm.cpp, since -1 represents no method parameter now. Maybe >>>>>>>>>>>> checking >>>>>>>>>>>> against explicity and add comments for the 0-length case. 
>>>>>>>>>>>> >>>>>>>>>>>> JVM_ENTRY(jobjectArray, JVM_GetMethodParameters(JNIEnv *env, >>>>>>>>>>>> jobject >>>>>>>>>>>> method)) >>>>>>>>>>>> { >>>>>>>>>>>> ... >>>>>>>>>>>> // No method parameter >>>>>>>>>>>> if (num_params == -1) { >>>>>>>>>>>> return (jobjectArray)NULL; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> /* handle the rest here */ >>>>>>>>>>>> // make sure all the symbols are properly formatted >>>>>>>>>>>> for (int i = 0; i < num_params; i++) { >>>>>>>>>>>> ... >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Jiangli >>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Jiangli >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/29/2014 03:39 PM, Eric McCorkle wrote: >>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this fix for parameter reflection which >>>>>>>>>>>>>>> addresses >>>>>>>>>>>>>>> hotspot >>>>>>>>>>>>>>> falsely ignoring zero-length MethodParameter attributes. >>>>>>>>>>>>>>> The >>>>>>>>>>>>>>> JVMS >>>>>>>>>>>>>>> allows a MethodParameters attribute with parameter_count >>>>>>>>>>>>>>> = 0, >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> parameter reflection spec states that a >>>>>>>>>>>>>>> MalformedParametersException >>>>>>>>>>>>>>> should be thrown if parameter_count does not match the >>>>>>>>>>>>>>> number of >>>>>>>>>>>>>>> real >>>>>>>>>>>>>>> parameters to a method. Hotspot currently ignores >>>>>>>>>>>>>>> MethodParameters >>>>>>>>>>>>>>> attributes with parameter_count = 0; however, in a case >>>>>>>>>>>>>>> where a >>>>>>>>>>>>>>> (bad) >>>>>>>>>>>>>>> MethodParameters attribute has parameter_count = 0, but the >>>>>>>>>>>>>>> method >>>>>>>>>>>>>>> has a >>>>>>>>>>>>>>> nonzero number of real parameters, hotspot will return null >>>>>>>>>>>>>>> from >>>>>>>>>>>>>>> JVM_GetMethodParameters, the result being that a >>>>>>>>>>>>>>> MalformedParametersException is not thrown (rather, the >>>>>>>>>>>>>>> reflection API >>>>>>>>>>>>>>> acts like there is no MethodParameters attribute). 
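The distinction discussed in this thread can be sketched as follows. This is a minimal standalone model, not the HotSpot code: it assumes, as the jvm.cpp comment above suggests, that -1 stands for "no MethodParameters attribute" while 0 stands for an attribute that is present with parameter_count = 0. The names `MethodInfo`, `ParamCheck`, and `check_method_parameters` are hypothetical, and where the real check lives (VM or reflection library) is not modeled here.

```cpp
#include <cassert>

// Hypothetical stand-in for what is recorded about a method; not a HotSpot type.
struct MethodInfo {
  int real_param_count;          // number of parameters in the method descriptor
  int method_parameters_length;  // -1: no MethodParameters attribute; >= 0: parameter_count
};

enum class ParamCheck { NoAttribute, Ok, Malformed };

// Because "absent" is encoded as -1, a zero-length attribute is still
// compared against the descriptor instead of being silently ignored.
ParamCheck check_method_parameters(const MethodInfo& m) {
  if (m.method_parameters_length == -1) {
    return ParamCheck::NoAttribute;   // reflection behaves as if nothing was recorded
  }
  if (m.method_parameters_length != m.real_param_count) {
    return ParamCheck::Malformed;     // the case that should surface as an exception
  }
  return ParamCheck::Ok;
}
```

With this encoding, a method that has two real parameters and an attribute with parameter_count = 0 is reported as malformed, whereas treating 0 as "absent" would hide the mismatch.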
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This patch causes hotspot to record the fact that a >>>>>>>>>>>>>>> zero-length >>>>>>>>>>>>>>> MethodParameters attribute does exist, causing the exception >>>>>>>>>>>>>>> to be >>>>>>>>>>>>>>> thrown when it should be. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The bug is here: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058313 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The webrev is here: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~emc/8058313/ > From christian.tornqvist at oracle.com Mon Nov 10 20:13:55 2014 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Mon, 10 Nov 2014 15:13:55 -0500 Subject: [8u40] Backport RFR: 8058251 - assert(_count > 0) failed: Negative counter when running runtime/NMT/MallocTrackingVerify.java Message-ID: <00e501cffd22$dac59150$9050b3f0$@oracle.com> Changes were pushed to jdk9 last week (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/8d5860808a16 ) , patch applied cleanly to 8u. Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8058251/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8058251 Thanks, Christian From coleen.phillimore at oracle.com Mon Nov 10 20:43:46 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 10 Nov 2014 15:43:46 -0500 Subject: [8u40] Backport RFR: 8058251 - assert(_count > 0) failed: Negative counter when running runtime/NMT/MallocTrackingVerify.java In-Reply-To: <00e501cffd22$dac59150$9050b3f0$@oracle.com> References: <00e501cffd22$dac59150$9050b3f0$@oracle.com> Message-ID: <54612382.1060802@oracle.com> Looks good. Thanks for backporting this, we need the NMT fixes in 8u. Coleen On 11/10/14, 3:13 PM, Christian Tornqvist wrote: > Changes were pushed to jdk9 last week > (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/8d5860808a16 ) , patch > applied cleanly to 8u. 
> > > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8058251/webrev.00/ > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8058251 > > > > Thanks, > > Christian > From christian.tornqvist at oracle.com Mon Nov 10 22:42:15 2014 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Mon, 10 Nov 2014 17:42:15 -0500 Subject: [8u40] Backport RFR: 8059803 - Update use of GetVersionEx to get correct Windows version in hs_err files Message-ID: <015e01cffd37$93ab4f20$bb01ed60$@oracle.com> Changes were pushed to jdk9 last week (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/092a9eddf58d ), patch applied cleanly to 8u. Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8059803/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8059803 Thanks, Christian From eric.mccorkle at oracle.com Tue Nov 11 00:23:21 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 10 Nov 2014 19:23:21 -0500 Subject: Review request for 8058322: Zero name_index item of MethodParameters attribute cause MalformedParameterException In-Reply-To: <545113BA.6010504@oracle.com> References: <545100EC.5000204@oracle.com> <545113BA.6010504@oracle.com> Message-ID: <546156F9.1070205@oracle.com> I had to make a barely-nontrivial change, to deal with the fact that apparently you can't assign NULL to a Handle on some platforms (an embedded platform). Please re-approve. Thanks. On 10/29/14 12:20, Coleen Phillimore wrote: > > Looks good, Eric. > > thanks, > Coleen > > On 10/29/14, 10:59 AM, Eric McCorkle wrote: >> Hello, >> >> Please review this simple change which addresses a failure condition in >> the method parameter reflection implementation. In the initial >> implementation of method parameter reflection, a parameter with a >> parameter_name index of 0 denoted a parameter with no name, and the VM >> translated this into the empty string when creating the Parameter object >> to return to Java code. 
However, towards the end of the 8 cycle, the >> spec was updated to state that a zero parameter_name index should denote >> a parameter with no name, and should result in Parameter.getName() >> returning an empty string, whereas the empty string /constant/ is >> expressly forbidden as a parameter name, and should result in >> MalformedParametersException. The reflection API was updated to reflect >> this behavior, but it seems the VM still translates a parameter_name >> index of 0 into the empty string. This patch removes that, resulting in >> correct behavior of the reflection API for parameters with no name. >> >> The webrev is here: >> http://cr.openjdk.java.net/~emc/8058322/ >> >> The bug is here: >> https://bugs.openjdk.java.net/browse/JDK-8058322 > From eric.mccorkle at oracle.com Tue Nov 11 00:39:56 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 10 Nov 2014 19:39:56 -0500 Subject: Review request for 8058322: Zero name_index item of MethodParameters attribute cause MalformedParameterException In-Reply-To: <546156F9.1070205@oracle.com> References: <545100EC.5000204@oracle.com> <545113BA.6010504@oracle.com> <546156F9.1070205@oracle.com> Message-ID: <54615ADC.1070706@oracle.com> Disregard this. Apologies; I had the wrong patch. On 11/10/14 19:23, Eric McCorkle wrote: > I had to make a barely-nontrivial change, to deal with the fact that > apparently you can't assign NULL to a Handle on some platforms (an > embedded platform). > > Please re-approve. > > Thanks. > > On 10/29/14 12:20, Coleen Phillimore wrote: >> >> Looks good, Eric. >> >> thanks, >> Coleen >> >> On 10/29/14, 10:59 AM, Eric McCorkle wrote: >>> Hello, >>> >>> Please review this simple change which addresses a failure condition in >>> the method parameter reflection implementation. 
In the initial >>> implementation of method parameter reflection, a parameter with a >>> parameter_name index of 0 denoted a parameter with no name, and the VM >>> translated this into the empty string when creating the Parameter object >>> to return to Java code. However, towards the end of the 8 cycle, the >>> spec was updated to state that a zero parameter_name index should denote >>> a parameter with no name, and should result in Parameter.getName() >>> returning an empty string, whereas the empty string /constant/ is >>> expressly forbidden as a parameter name, and should result in >>> MalformedParametersException. The reflection API was updated to reflect >>> this behavior, but it seems the VM still translates a parameter_name >>> index of 0 into the empty string. This patch removes that, resulting in >>> correct behavior of the reflection API for parameters with no name. >>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~emc/8058322/ >>> >>> The bug is here: >>> https://bugs.openjdk.java.net/browse/JDK-8058322 >> From eric.mccorkle at oracle.com Tue Nov 11 00:41:36 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Mon, 10 Nov 2014 19:41:36 -0500 Subject: Review request for 8058322: Zero name_index item of MethodParameters attribute cause MalformedParameterException In-Reply-To: <54615ADC.1070706@oracle.com> References: <545100EC.5000204@oracle.com> <545113BA.6010504@oracle.com> <546156F9.1070205@oracle.com> <54615ADC.1070706@oracle.com> Message-ID: <54615B40.7040209@oracle.com> Actually, the webrev script seems to have pulled in both patches. The change in question is in reflection.cpp. On 11/10/14 19:39, Eric McCorkle wrote: > Disregard this. Apologies; I had the wrong patch. > > On 11/10/14 19:23, Eric McCorkle wrote: >> I had to make a barely-nontrivial change, to deal with the fact that >> apparently you can't assign NULL to a Handle on some platforms (an >> embedded platform). >> >> Please re-approve. >> >> Thanks. 
>> >> On 10/29/14 12:20, Coleen Phillimore wrote: >>> >>> Looks good, Eric. >>> >>> thanks, >>> Coleen >>> >>> On 10/29/14, 10:59 AM, Eric McCorkle wrote: >>>> Hello, >>>> >>>> Please review this simple change which addresses a failure condition in >>>> the method parameter reflection implementation. In the initial >>>> implementation of method parameter reflection, a parameter with a >>>> parameter_name index of 0 denoted a parameter with no name, and the VM >>>> translated this into the empty string when creating the Parameter object >>>> to return to Java code. However, towards the end of the 8 cycle, the >>>> spec was updated to state that a zero parameter_name index should denote >>>> a parameter with no name, and should result in Parameter.getName() >>>> returning an empty string, whereas the empty string /constant/ is >>>> expressly forbidden as a parameter name, and should result in >>>> MalformedParametersException. The reflection API was updated to reflect >>>> this behavior, but it seems the VM still translates a parameter_name >>>> index of 0 into the empty string. This patch removes that, resulting in >>>> correct behavior of the reflection API for parameters with no name. >>>> >>>> The webrev is here: >>>> http://cr.openjdk.java.net/~emc/8058322/ >>>> >>>> The bug is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8058322 >>> From coleen.phillimore at oracle.com Tue Nov 11 00:44:40 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 10 Nov 2014 19:44:40 -0500 Subject: Review request for 8058322: Zero name_index item of MethodParameters attribute cause MalformedParameterException In-Reply-To: <546156F9.1070205@oracle.com> References: <545100EC.5000204@oracle.com> <545113BA.6010504@oracle.com> <546156F9.1070205@oracle.com> Message-ID: <54615BF8.4010505@oracle.com> I see your change to reflection.cpp at the end. It looks fine. 
Coleen On 11/10/14, 7:23 PM, Eric McCorkle wrote: > I had to make a barely-nontrivial change, to deal with the fact that > apparently you can't assign NULL to a Handle on some platforms (an > embedded platform). > > Please re-approve. > > Thanks. > > On 10/29/14 12:20, Coleen Phillimore wrote: >> Looks good, Eric. >> >> thanks, >> Coleen >> >> On 10/29/14, 10:59 AM, Eric McCorkle wrote: >>> Hello, >>> >>> Please review this simple change which addresses a failure condition in >>> the method parameter reflection implementation. In the initial >>> implementation of method parameter reflection, a parameter with a >>> parameter_name index of 0 denoted a parameter with no name, and the VM >>> translated this into the empty string when creating the Parameter object >>> to return to Java code. However, towards the end of the 8 cycle, the >>> spec was updated to state that a zero parameter_name index should denote >>> a parameter with no name, and should result in Parameter.getName() >>> returning an empty string, whereas the empty string /constant/ is >>> expressly forbidden as a parameter name, and should result in >>> MalformedParametersException. The reflection API was updated to reflect >>> this behavior, but it seems the VM still translates a parameter_name >>> index of 0 into the empty string. This patch removes that, resulting in >>> correct behavior of the reflection API for parameters with no name. 
>>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~emc/8058322/ >>> >>> The bug is here: >>> https://bugs.openjdk.java.net/browse/JDK-8058322 From george.triantafillou at oracle.com Tue Nov 11 01:12:09 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 10 Nov 2014 20:12:09 -0500 Subject: [8u40] Backport RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms Message-ID: <54616269.20408@oracle.com> Changes were pushed to jdk9 last week:http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/c2881c208f7a The patch applied cleanly to 8u. Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/ -George From coleen.phillimore at oracle.com Tue Nov 11 01:21:48 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 10 Nov 2014 20:21:48 -0500 Subject: [8u40] Backport RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <54616269.20408@oracle.com> References: <54616269.20408@oracle.com> Message-ID: <546164AC.2050807@oracle.com> This looks good - the title of the bug should be to rewrite MallocSiteHashOverflow.java since that's what this change does. It doesn't reenable it because my change does :) Coleen On 11/10/14, 8:12 PM, George Triantafillou wrote: > Changes were pushed to jdk9 last > week:http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/c2881c208f7a > > The patch applied cleanly to 8u. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 > Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/ > > -George > From david.holmes at oracle.com Tue Nov 11 04:57:13 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Nov 2014 14:57:13 +1000 Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx In-Reply-To: <545CF033.4010503@oracle.com> References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> <545CF033.4010503@oracle.com> Message-ID: <54619729.30704@oracle.com> Hi Max, On 8/11/2014 2:15 AM, Max Ockner wrote: > Hello all, > I have made these additonal changes: > -Moved the assert() statements into the lock and lock_without_safepoint > methods. > -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired > -Changed the Monitor::SafepointCheckRequired values for several locks > which were locked outside of a MutexLockerEx (some were locked with > MutexLocker, some were locked were locked without any MutexLocker* ) > > New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ Generally this is all okay - a few style and other nits below. However you missed adding an assert in Monitor::wait to check if the no_safepoint_check flag was being used correctly for the current monitor. Specific comments: src/share/vm/runtime/mutex.hpp This comment is no longer accurate with the moved check location: + // MutexLockerEx checks these flags when acquiring a lock + // to ensure consistent checking for each lock. The same goes for other references to MutexLockerEx in the enum description. Also copyright year needs updating. --- src/share/vm/runtime/mutex.cpp 898 //Ensure 961 //Ensure Space needed after // --- src/share/vm/runtime/mutexLocker.cpp + var = new type(Mutex::pri, #var, vm_block,safepoint_check_allowed); \ space needed after comma in k,s --- src/share/vm/runtime/mutexLocker.hpp Whitespace only changes - looks like leftovers from removed edits. 
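The consistency rule under review, including the wait() check requested above, can be modeled roughly as below. This is a toy sketch under stated assumptions, not the patch itself: `ToyMonitor` and its `*_allowed` methods are hypothetical, `_safepoint_check_never` is assumed as the third enum value alongside the `_safepoint_check_always` and `_safepoint_check_sometimes` values named in this thread, and the real code asserts rather than returning a bool.

```cpp
#include <cassert>

// Toy model only; HotSpot's Monitor asserts instead of returning a bool.
class ToyMonitor {
 public:
  enum SafepointCheckRequired {
    _safepoint_check_never,      // assumed name for the "never" case
    _safepoint_check_sometimes,
    _safepoint_check_always
  };

  explicit ToyMonitor(SafepointCheckRequired r) : _safepoint_check_required(r) {}

  // Acquiring WITH a safepoint check is inconsistent for "never" locks.
  bool lock_allowed() const {
    return _safepoint_check_required != _safepoint_check_never;
  }

  // Acquiring WITHOUT a safepoint check is inconsistent for "always" locks.
  bool lock_without_safepoint_check_allowed() const {
    return _safepoint_check_required != _safepoint_check_always;
  }

  // wait() takes a no_safepoint_check flag, so it needs the same rule.
  bool wait_allowed(bool no_safepoint_check) const {
    return no_safepoint_check ? lock_without_safepoint_check_allowed()
                              : lock_allowed();
  }

 private:
  SafepointCheckRequired _safepoint_check_required;
};
```

A "sometimes" lock passes every check in this model, which is why the validity of the "sometimes" locks has to be reviewed separately.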
Thanks, David > Additional testing: > jtreg ./jdk/test/java/lang/invoke > jtreg jfr tests > > Here is a list of ALL of the "sometimes" locks: > > "WorkGroup monitor" share/vm/utilities/workgroup.cpp > "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp > "CompactibleFreeListSpace._lock" > share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp > "freelist par lock" > share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp > "SR_lock" share/vm/runtime/thread.cpp > > The remaining "sometimes" locks can be found in > share/vm/runtime/mutexLocker.cpp: > > ParGCRareEvent_lock > Safepoint_lock > Threads_lock > VMOperationQueue_lock > VMOperationRequest_lock > Terminator_lock > Heap_lock > Compile_lock > PeriodicTask_lock > JfrStacktrace_lock > > I have not checked the validity of the "sometimes" locks, and I believe > that this should be a different project. > > Thanks for your help! > Max Ockner > On 10/15/2014 8:54 PM, David Holmes wrote: >> Hi Max, >> >> This is looking good. >> >> A few high-level initial comments: >> >> I think SafepointAllowed should be SafepointCheckNeeded >> >> Why are the checks in MutexLocker when the state is maintained in the >> mutex itself and the mutex/monitor has lock_without_safepoint, and >> wait() ? I would have expected to see the >> check in the mutex/monitor methods. >> >> Checking consistent usage of the _no_safepoint_check_flag is good. But >> another part of this is that a monitor/mutex that never checks for >> safepoints should never be held when a thread blocks at a safepoint - >> is there some way to easily check that? I was surprised how many locks >> are actually not checking for safepoints. >> >> Did you find any cases where the mutex/monitor was being used >> inconsistently and incorrectly? >> >> Did you analyse the "sometimes" cases to see if they were safe? 
>> (Aside: just for fun check out what happens if you lock the >> Threads_lock with a safepoint check and a safepoint has been requested >> :) ). >> >> Cheers, >> David >> >> On 16/10/2014 4:04 AM, Max Ockner wrote: >>> Hi all, >>> >>> I am a new member of the Hotspot runtime team in Burlington, MA. >>> Please review my first fix related to safepoint checking. >>> >>> Summary: MutexLockerEx can either acquire a lock with or without a >>> safepoint check. >>> In some cases, a particular lock must either safepoint check always or >>> never to avoid deadlocking. >>> Some other locks have semantics which allow them to avoid deadlocks >>> despite having a safepoint check only some of the time. >>> All locks that are OK having inconsistent safepoint checks have been >>> marked. All locks that should never safepoint check and all locks that >>> should always safepoint check have also been marked. >>> When a MutexLockerEx acquires a lock with or without a safepoint check, >>> the lock's safepointAllowed marker is checked to ensure consistent >>> safepoint checking. >>> >>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290 >>> >>> Tested with: >>> jprt "-testset hotspot" >>> jtreg hotspot >>> vm.quick.testlist >>> >>> Whitebox tests: >>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test >>> expects Assert ("This lock should always have a safepoint check") >>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test >>> expects Assert ("This lock should never have a safepoint check") >>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code >>> should not assert. (Lock is properly acquired with no safepoint check) >>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code >>> should not assert. 
(Lock is properly acquired with safepoint check) >>> >>> Thanks, >>> Max >>> > From david.holmes at oracle.com Tue Nov 11 05:33:52 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Nov 2014 15:33:52 +1000 Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx In-Reply-To: <545CF718.2020702@oracle.com> References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> <545CF033.4010503@oracle.com> <545CF718.2020702@oracle.com> Message-ID: <54619FC0.3070100@oracle.com> On 8/11/2014 2:45 AM, Bertrand Delsart wrote: > Hi Max, > > Like David, I think we should go further but this is one step in the > right direction. Thanks for doing it. > > Only noticed one small issue. The "or allow" part of the comment look > strange in lock_without_safepoint_check: > > void Monitor::lock_without_safepoint_check(Thread * Self) { > + //Ensure that the Monitor does not require or allow safepoint checks. This is okay to me. "require" maps to _safepoint_check_always, while "allow" maps to _safepoint_check_sometimes. Cheers, David > + assert(this->_safepoint_check_required != > Monitor::_safepoint_check_always, > + err_msg("This lock should always have a safepoint check: %s", > + this->name())); > > Regards, > > Bertrand (not a Reviewer). > > On 07/11/2014 17:15, Max Ockner wrote: >> Hello all, >> I have made these additonal changes: >> -Moved the assert() statements into the lock and lock_without_safepoint >> methods. 
>> -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired >> -Changed the Monitor::SafepointCheckRequired values for several locks >> which were locked outside of a MutexLockerEx (some were locked with >> MutexLocker, some were locked were locked without any MutexLocker* ) >> >> New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ >> >> Additional testing: >> jtreg ./jdk/test/java/lang/invoke >> jtreg jfr tests >> >> Here is a list of ALL of the "sometimes" locks: >> >> "WorkGroup monitor" share/vm/utilities/workgroup.cpp >> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp >> "CompactibleFreeListSpace._lock" >> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >> >> "freelist par lock" >> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >> >> "SR_lock" share/vm/runtime/thread.cpp >> >> The remaining "sometimes" locks can be found in >> share/vm/runtime/mutexLocker.cpp: >> >> ParGCRareEvent_lock >> Safepoint_lock >> Threads_lock >> VMOperationQueue_lock >> VMOperationRequest_lock >> Terminator_lock >> Heap_lock >> Compile_lock >> PeriodicTask_lock >> JfrStacktrace_lock >> >> I have not checked the validity of the "sometimes" locks, and I believe >> that this should be a different project. >> >> Thanks for your help! >> Max Ockner >> On 10/15/2014 8:54 PM, David Holmes wrote: >>> Hi Max, >>> >>> This is looking good. >>> >>> A few high-level initial comments: >>> >>> I think SafepointAllowed should be SafepointCheckNeeded >>> >>> Why are the checks in MutexLocker when the state is maintained in the >>> mutex itself and the mutex/monitor has lock_without_safepoint, and >>> wait() ? I would have expected to see the >>> check in the mutex/monitor methods. >>> >>> Checking consistent usage of the _no_safepoint_check_flag is good. 
But >>> another part of this is that a monitor/mutex that never checks for >>> safepoints should never be held when a thread blocks at a safepoint - >>> is there some way to easily check that? I was surprised how many locks >>> are actually not checking for safepoints. >>> >>> Did you find any cases where the mutex/monitor was being used >>> inconsistently and incorrectly? >>> >>> Did you analyse the "sometimes" cases to see if they were safe? >>> (Aside: just for fun check out what happens if you lock the >>> Threads_lock with a safepoint check and a safepoint has been requested >>> :) ). >>> >>> Cheers, >>> David >>> >>> On 16/10/2014 4:04 AM, Max Ockner wrote: >>>> Hi all, >>>> >>>> I am a new member of the Hotspot runtime team in Burlington, MA. >>>> Please review my first fix related to safepoint checking. >>>> >>>> Summary: MutexLockerEx can either acquire a lock with or without a >>>> safepoint check. >>>> In some cases, a particular lock must either safepoint check always or >>>> never to avoid deadlocking. >>>> Some other locks have semantics which allow them to avoid deadlocks >>>> despite having a safepoint check only some of the time. >>>> All locks that are OK having inconsistent safepoint checks have been >>>> marked. All locks that should never safepoint check and all locks that >>>> should always safepoint check have also been marked. >>>> When a MutexLockerEx acquires a lock with or without a safepoint check, >>>> the lock's safepointAllowed marker is checked to ensure consistent >>>> safepoint checking. 
>>>> >>>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290 >>>> >>>> Tested with: >>>> jprt "-testset hotspot" >>>> jtreg hotspot >>>> vm.quick.testlist >>>> >>>> Whitebox tests: >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test >>>> expects Assert ("This lock should always have a safepoint check") >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test >>>> expects Assert ("This lock should never have a safepoint check") >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code >>>> should not assert. (Lock is properly acquired with no safepoint check) >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code >>>> should not assert. (Lock is properly acquired with safepoint check) >>>> >>>> Thanks, >>>> Max >>>> >> > > From goetz.lindenmaier at sap.com Tue Nov 11 10:13:37 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 11 Nov 2014 10:13:37 +0000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> Hi Markus, Could you fix your issue and did the other tests pass? Are there any follow-up actions I should take? Best regards, Goetz. -----Original Message----- From: Markus Grönlund [mailto:markus.gronlund at oracle.com] Sent: Donnerstag, 6. 
November 2014 14:50 To: Lindenmaier, Goetz Cc: David Holmes; hotspot-dev at openjdk.java.net Subject: RE: RFR (L): 8062370: Various minor code improvements Hi Goetz, Thanks for looking into this. I think I will be able to update the internal code I am working on to accommodate your updates. I don't know if any other code will see potential issues - only testing will tell. So I would await the rollback and I will putback my updated code - let's see if other issues appear after this - we should know after this nights nightly testing (then we can re-evaluate the rollback). Thanks Markus -----Original Message----- From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] Sent: den 6 november 2014 13:49 To: David Holmes; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: RE: RFR (L): 8062370: Various minor code improvements Hi David, Well, yes, that's right. But then you can simply pass in count+1. It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated. But that both is quite special. Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf). Should I use the internal bug number for the rollback-fix? How should we proceed, as I can't fix you internal code? Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 6. November 2014 12:23 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: Re: RFR (L): 8062370: Various minor code improvements On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > Hi David, > > yes, windows does not null terminate if there is an overflow. > Obviously there are overflows, and they now see one less character. I > think this should be fixed where jio_vsnprintf is called. Having > non-null terminated strings isn't nice. I think it depends on what you consider an overflow. 
If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > But for now I will roll back this single change. I'll send a RFR soon. > > Where did you see the problem? It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 Thanks, David > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> Thanks David, I'll have a look. > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if we have > existing code that works with that then the fix is now overwriting the > last character. I can't quite see how to handle this in a cross > platform manner, but in the immediate term we should probably revert > that part of the changeset. > > David > >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 11:09 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> This change has introduced a bug: >> >> - return vsnprintf(str, count, fmt, args); >> + >> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && >> + (size_t)result >= count) || result == -1) { >> + str[count - 1] = '\0'; >> + result = -1; >> + } >> + >> + return result; >> >> some strings are getting their last character truncated on Windows. 
>> >> David >> >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> thanks for looking at the change! I fixed the issue in a new >>> webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Mittwoch, 5. November 2014 02:49 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> The only issue I see is in: >>> >>> src/share/vm/runtime/globals.cpp >>> >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>> >>> Thanks, >>> David >>> >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> this change contains a row of minor code improvements we did to >>>> fulfil our internal quality requirements. We would like to share >>>> these with openJDK. >>>> >>>> Please review and test this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>> >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>> 32+64 bit and, of course, the ppc platforms. >>>> >>>> >>>> Some details: >>>> >>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>> >>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>> >>>> We add some missing memory frees and some closing of files. >>>> >>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. 
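The behaviour discussed in this thread (always terminate the buffer and report truncation as -1) can be sketched as below. This is a minimal illustration following the shape of the quoted diff, not the actual jio_vsnprintf code; the names checked_vsnprintf and checked_snprintf are made up for the example.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats into str and always leaves a '\0' within the
// first count bytes. Returns the number of characters written, or -1 if the
// output did not fit (or the underlying call reported an error).
int checked_vsnprintf(char* str, std::size_t count, const char* fmt, va_list args) {
  assert(str != nullptr && count > 0);
  int result = std::vsnprintf(str, count, fmt, args);
  if (result < 0 || static_cast<std::size_t>(result) >= count) {
    str[count - 1] = '\0';  // overwrites the last byte of the buffer
    result = -1;
  }
  return result;
}

// Variadic convenience wrapper around checked_vsnprintf.
int checked_snprintf(char* str, std::size_t count, const char* fmt, ...) {
  va_list args;
  va_start(args, fmt);
  int result = checked_vsnprintf(str, count, fmt, args);
  va_end(args);
  return result;
}
```

With these semantics, count has to include room for the terminator: a caller that wants n visible characters passes a buffer of n+1 bytes and count = n+1. A caller that passes count = n and relies on a terminator already sitting at str[n] loses its last character, which matches the truncation reported on Windows.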
>>>> >>>> Best regards, >>>> >>>> Goetz >>>> >>>> >>>> >>>> From staffan.larsen at oracle.com Tue Nov 11 10:18:48 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 11 Nov 2014 11:18:48 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> Message-ID: <3A58190E-F5AF-49BB-93ED-09868082FAB9@oracle.com> > On 11 nov 2014, at 11:13, Lindenmaier, Goetz wrote: > > Hi Markus, > > Could you fix your issue and did the other tests pass? The issue was fixed and we see no more test failures. > Are there any follow-up actions I should take? There was a suggestion for better documentation of jio_vsnprintf so that future usages know what to expect. I think that would be good to follow up on. /Staffan > > Best regards, > Goetz. > > -----Original Message----- > From: Markus Gr?nlund [mailto:markus.gronlund at oracle.com] > Sent: Donnerstag, 6. November 2014 14:50 > To: Lindenmaier, Goetz > Cc: David Holmes; hotspot-dev at openjdk.java.net > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > Thanks for looking into this. > > I think I will be able to update the internal code I am working on to accommodate your updates. > > I don't know if any other code will see potential issues - only testing will tell. 
> > So I would await the rollback and I will putback my updated code - let's see if other issues appear after this - we should know after this nights nightly testing (then we can re-evaluate the rollback). > > Thanks > Markus > > > -----Original Message----- > From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] > Sent: den 6 november 2014 13:49 > To: David Holmes; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi David, > > Well, yes, that's right. But then you can simply pass in count+1. > It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated. > But that both is quite special. > > Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf). > > Should I use the internal bug number for the rollback-fix? > > How should we proceed, as I can't fix you internal code? > > Best regards, > Goetz. > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 12:23 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: >> Hi David, >> >> yes, windows does not null terminate if there is an overflow. >> Obviously there are overflows, and they now see one less character. I >> think this should be fixed where jio_vsnprintf is called. Having >> non-null terminated strings isn't nice. > > I think it depends on what you consider an overflow. If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > >> But for now I will roll back this single change. I'll send a RFR soon. 
>> >> Where did you see the problem? > > It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 > > Thanks, > David > >> >> Best regards, >> Goetz. >> >> >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 11:30 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >>> Thanks David, I'll have a look. >> >> It seems that windows vsnprintf may not null-terminate the string - >> which I think is what your patch was trying to address. But if we have >> existing code that works with that then the fix is now overwriting the >> last character. I can't quite see how to handle this in a cross >> platform manner, but in the immediate term we should probably revert >> that part of the changeset. >> >> David >> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Donnerstag, 6. November 2014 11:09 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Cc: Markus Gr?nlund >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> This change has introduced a bug: >>> >>> - return vsnprintf(str, count, fmt, args); >>> + >>> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && >>> + (size_t)result >= count) || result == -1) { >>> + str[count - 1] = '\0'; >>> + result = -1; >>> + } >>> + >>> + return result; >>> >>> some strings are getting their last character truncated on Windows. >>> >>> David >>> >>> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>>> Hi David, >>>> >>>> thanks for looking at the change! I fixed the issue in a new >>>> webrev: >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>>> >>>> Best regards, >>>> Goetz. 
>>>> >>>> -----Original Message----- >>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>> Sent: Mittwoch, 5. November 2014 02:49 >>>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>>> >>>> Hi Goetz, >>>> >>>> The only issue I see is in: >>>> >>>> src/share/vm/runtime/globals.cpp >>>> >>>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>>> >>>> Thanks, >>>> David >>>> >>>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>>> Hi, >>>>> >>>>> this change contains a row of minor code improvements we did to >>>>> fulfil our internal quality requirements. We would like to share >>>>> these with openJDK. >>>>> >>>>> Please review and test this change. I please need a sponsor. >>>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>>> >>>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>>> 32+64 bit and, of course, the ppc platforms. >>>>> >>>>> >>>>> Some details: >>>>> >>>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>>> >>>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>>> >>>>> We add some missing memory frees and some closing of files. >>>>> >>>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. 
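The CONST64 point above can be checked with a few lines of standard C++. This is an illustration only and does not use HotSpot's CONST64/UCONST64 macros; the names high_bit, min_long and same_bit_pattern are invented for the example.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// A hexadecimal literal that does not fit in a signed 64-bit type is given an
// unsigned type by the compiler, so 0x8000000000000000 is a positive value.
static_assert(std::is_unsigned<decltype(0x8000000000000000)>::value,
              "0x8000000000000000 has an unsigned type");
static_assert(std::is_signed<decltype(0x7fffffffffffffff)>::value,
              "0x7fffffffffffffff still fits a signed type");

// The value is one past the largest int64_t, so it needs an unsigned type.
const std::uint64_t high_bit = UINT64_C(0x8000000000000000);

// The most negative int64_t has the same bit pattern and can be obtained
// without writing an out-of-range signed literal.
const std::int64_t min_long = std::numeric_limits<std::int64_t>::min();

bool same_bit_pattern() {
  return static_cast<std::uint64_t>(min_long) == high_bit;
}
```

This is why a macro that produces a signed 64-bit constant is the wrong tool for this literal, while an unsigned variant represents it directly.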
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Goetz
>>>>>
>>>>>
>>>>>

From markus.gronlund at oracle.com  Tue Nov 11 10:37:24 2014
From: markus.gronlund at oracle.com (Markus Grönlund)
Date: Tue, 11 Nov 2014 02:37:24 -0800 (PST)
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap>
 <545981F4.3050006@oracle.com>
 <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap>
 <545B48D3.5040009@oracle.com>
 <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap>
 <545B4DC4.3020709@oracle.com>
 <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap>
 <545B5A14.1020508@oracle.com>
 <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap>
 <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap>
Message-ID: <2e9bd47c-366a-446b-89d0-b431a5816007@default>

Hi Goetz,

Thanks for following up on this.

I adjusted a few calls into jio_snprintf() that were particular for Windows to accommodate the updates.

From the test results I have seen so far, it seems no other issues appeared which could be related to this.

So I think the change should be left as is (no rollback), but maybe a comment could be added about the jio_snprintf() semantics (NULL termination on overflows, expected 'count' etc).

Thanks
Markus

-----Original Message-----
From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com]
Sent: den 11 november 2014 11:14
To: Markus Grönlund
Cc: David Holmes; hotspot-dev at openjdk.java.net
Subject: RE: RFR (L): 8062370: Various minor code improvements

Hi Markus,

Could you fix your issue and did the other tests pass?
Are there any follow-up actions I should take?

Best regards,
  Goetz.

-----Original Message-----
From: Markus Grönlund [mailto:markus.gronlund at oracle.com]
Sent: Donnerstag, 6.
November 2014 14:50 To: Lindenmaier, Goetz Cc: David Holmes; hotspot-dev at openjdk.java.net Subject: RE: RFR (L): 8062370: Various minor code improvements Hi Goetz, Thanks for looking into this. I think I will be able to update the internal code I am working on to accommodate your updates. I don't know if any other code will see potential issues - only testing will tell. So I would await the rollback and I will putback my updated code - let's see if other issues appear after this - we should know after this nights nightly testing (then we can re-evaluate the rollback). Thanks Markus -----Original Message----- From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] Sent: den 6 november 2014 13:49 To: David Holmes; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: RE: RFR (L): 8062370: Various minor code improvements Hi David, Well, yes, that's right. But then you can simply pass in count+1. It works also if the caller knows he will only use 'count' bytes of the string. In this case +1 must be allocated. But that both is quite special. Currently, if the string is truncated, there is no null byte on windows. And there are a lot of uses of this method in the VM (via jio_snprintf). Should I use the internal bug number for the rollback-fix? How should we proceed, as I can't fix you internal code? Best regards, Goetz. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 6. November 2014 12:23 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Cc: Markus Gr?nlund Subject: Re: RFR (L): 8062370: Various minor code improvements On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > Hi David, > > yes, windows does not null terminate if there is an overflow. > Obviously there are overflows, and they now see one less character. I > think this should be fixed where jio_vsnprintf is called. Having > non-null terminated strings isn't nice. I think it depends on what you consider an overflow. 
If the buffer is already null terminated and you pass in a count that covers up to the location before the null then there is no problem - except now the logic will introduce a second null in place of the last character. > But for now I will roll back this single change. I'll send a RFR soon. > > Where did you see the problem? It was in our closed code so I can't go into details. We have a non-public bug number: 8063089 Thanks, David > > Best regards, > Goetz. > > > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 11:30 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Gr?nlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >> Thanks David, I'll have a look. > > It seems that windows vsnprintf may not null-terminate the string - > which I think is what your patch was trying to address. But if we have > existing code that works with that then the fix is now overwriting the > last character. I can't quite see how to handle this in a cross > platform manner, but in the immediate term we should probably revert > that part of the changeset. > > David > >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 11:09 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Gr?nlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> This change has introduced a bug: >> >> - return vsnprintf(str, count, fmt, args); >> + >> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && >> + (size_t)result >= count) || result == -1) { >> + str[count - 1] = '\0'; >> + result = -1; >> + } >> + >> + return result; >> >> some strings are getting their last character truncated on Windows. 
>> >> David >> >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> thanks for looking at the change! I fixed the issue in a new >>> webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>> >>> Best regards, >>> Goetz. >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Mittwoch, 5. November 2014 02:49 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> Hi Goetz, >>> >>> The only issue I see is in: >>> >>> src/share/vm/runtime/globals.cpp >>> >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use os::strdup_check_oom. >>> >>> Thanks, >>> David >>> >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> this change contains a row of minor code improvements we did to >>>> fulfil our internal quality requirements. We would like to share >>>> these with openJDK. >>>> >>>> Please review and test this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>> >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>> 32+64 bit and, of course, the ppc platforms. >>>> >>>> >>>> Some details: >>>> >>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus not representable as i64 what is used in the CONST64 macro. This change adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>> >>>> We add some more strncpy uses. Also, we fix strncpy on windows. There, strncpy does not write a \0 into the last byte if the copied string is too long. >>>> >>>> We add some missing memory frees and some closing of files. >>>> >>>> jio_vsnprintf() works differently on windows and linux. This change adapts this to show the same behaviour on all platforms. See java.cpp. 
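The strncpy issue mentioned in the details above can be sketched as a small helper. Note that the C standard's strncpy leaves the destination unterminated whenever the source has at least n characters, so the explicit terminator is needed on every platform. The helper name copy_terminated is hypothetical and is not taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst (capacity dst_size bytes) and
// always terminates the result. Returns true if the whole string fit.
bool copy_terminated(char* dst, std::size_t dst_size, const char* src) {
  assert(dst != nullptr && src != nullptr && dst_size > 0);
  // Copy at most dst_size - 1 characters so the last byte stays free.
  std::strncpy(dst, src, dst_size - 1);
  // strncpy does not terminate dst when src is too long, so do it here.
  dst[dst_size - 1] = '\0';
  return std::strlen(src) < dst_size;
}
```

The return value lets a caller distinguish a complete copy from a truncated one without inspecting the buffer.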
>>>> >>>> Best regards, >>>> >>>> Goetz >>>> >>>> >>>> >>>> From george.triantafillou at oracle.com Tue Nov 11 13:36:47 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 11 Nov 2014 08:36:47 -0500 Subject: [8u40] Backport RFR: 8059803 - Update use of GetVersionEx to get correct Windows version in hs_err files In-Reply-To: <015e01cffd37$93ab4f20$bb01ed60$@oracle.com> References: <015e01cffd37$93ab4f20$bb01ed60$@oracle.com> Message-ID: <546210EF.9040202@oracle.com> Hi Christian, Looks good. -George On 11/10/2014 5:42 PM, Christian Tornqvist wrote: > Changes were pushed to jdk9 last week > (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/092a9eddf58d ), patch > applied cleanly to 8u. > > > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8059803/webrev.00/ > > > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8059803 > > > > Thanks, > > Christian > From george.triantafillou at oracle.com Tue Nov 11 14:29:53 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 11 Nov 2014 09:29:53 -0500 Subject: [8u40] Backport RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <546164AC.2050807@oracle.com> References: <54616269.20408@oracle.com> <546164AC.2050807@oracle.com> Message-ID: <54621D61.9060901@oracle.com> Thanks Coleen. -George On 11/10/2014 8:21 PM, Coleen Phillimore wrote: > > This looks good - the title of the bug should be to rewrite > MallocSiteHashOverflow.java since that's what this change does. It > doesn't reenable it because my change does :) > > Coleen > > On 11/10/14, 8:12 PM, George Triantafillou wrote: >> Changes were pushed to jdk9 last >> week:http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/c2881c208f7a >> >> The patch applied cleanly to 8u. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 >> Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/ >> >> -George >> > From harold.seigel at oracle.com Tue Nov 11 14:51:20 2014 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 11 Nov 2014 09:51:20 -0500 Subject: [8u40] Backport RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <54616269.20408@oracle.com> References: <54616269.20408@oracle.com> Message-ID: <54622268.8040706@oracle.com> Hi George, The backport looks good. Harold On 11/10/2014 8:12 PM, George Triantafillou wrote: > Changes were pushed to jdk9 last > week:http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/c2881c208f7a > > The patch applied cleanly to 8u. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8061969 > Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/ > > -George > From george.triantafillou at oracle.com Tue Nov 11 14:59:09 2014 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 11 Nov 2014 09:59:09 -0500 Subject: [8u40] Backport RFR: 8061969 - [TESTBUG] MallocSiteHashOverflow.java should be enabled for 32-bit platforms In-Reply-To: <54622268.8040706@oracle.com> References: <54616269.20408@oracle.com> <54622268.8040706@oracle.com> Message-ID: <5462243D.5010903@oracle.com> Thanks Harold. -George On 11/11/2014 9:51 AM, harold seigel wrote: > Hi George, > > The backport looks good. > > Harold > > On 11/10/2014 8:12 PM, George Triantafillou wrote: >> Changes were pushed to jdk9 last >> week:http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/c2881c208f7a >> >> The patch applied cleanly to 8u. 
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8061969
>> Webrev: http://cr.openjdk.java.net/~gtriantafill/8061969/webrev.01/
>>
>> -George
>>
>

From mikael.gerdin at oracle.com  Tue Nov 11 15:18:59 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Tue, 11 Nov 2014 16:18:59 +0100
Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs
Message-ID: <546228E3.8030207@oracle.com>

Hi all,

I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope
of getting some extra feedback from our resident concurrency experts.

Please review this subtle change to the order in which we read fields in
G1OffsetTableContigSpace::saved_mark_word, original included here for
reference:

  HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
    G1CollectedHeap* g1h = G1CollectedHeap::heap();
    assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
    if (_gc_time_stamp < g1h->get_gc_time_stamp())
      return top();
    else
      return Space::saved_mark_word();
  }

When getting a new gc alloc region several stores are performed where
store ordering needs to be enforced and several synchronization points
occur.

[write path]
ST(_saved_mark_word)
#StoreStore
ST(_gc_time_stamp)
ST(_top) // satisfying alloc request
#StoreStore
ST(_alloc_region) // publishing to other gc workers
#MonitorEnter
ST(_top) // potential further allocations
#MonitorExit
#MonitorEnter
ST(_top) // potential further allocations
#MonitorExit

When we inspect a region during remembered set scanning we need to
ensure that we never read memory which has been allocated by a GC worker
thread for the purpose of copying objects into.
The way this works is that a time stamp field is supposed to signal to a
scanning thread that it should look at addresses below _top if the time
stamp is old or addresses below _saved_mark_word if the time stamp is
current.
The current code does (as seen above)

[read path]
LD(_gc_time_stamp)
LD(_top)
or (depending on time stamp)
LD(_saved_mark_word)

Because these values are written to without full mutual exclusion we
need to be very careful about the order in which we read these values,
and this is where I argue that the current code is incorrect.
In order to observe a consistent view of the ordered stores in the
[write path] above we need to load the values in the reverse order they
were written, with proper #LoadLoad ordering enforced.

The problem which we've observed here is that after we've read the time
stamp as below the heap time stamp the top pointer can be updated by a
GC worker allocating objects into this region. To make sure that the top
value we see is in fact valid we must read it before we read the time
stamp which determines which value we should return from the
saved_mark_word function.

My suggested fix is to load _top first and enforce #LoadLoad ordering:

  HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
    G1CollectedHeap* g1h = G1CollectedHeap::heap();
    assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
    HeapWord* local_top = top();
    OrderAccess::loadload();
    if (_gc_time_stamp < g1h->get_gc_time_stamp()) {
      return local_top;
    } else {
      return Space::saved_mark_word();
    }
  }

I've successfully reproduced the crash with the original code by adding
some random sleep calls between the load of the time stamp and the load
of top so I'm fairly certain that this resolves the issue. I've also
verified that the fix I'm proposing does resolve the bug for the team
which encountered the issue, even if I can't reproduce that crash
locally.

I also plan to attempt to design around some of the races in this code
to reduce its complexity, but for the sake of backporting the fix to
8u40 I'd like to start with just adding the minimal fix.
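The reversed load order argued for above can be sketched in portable C++11. This is only an illustration under stated assumptions: it uses std::atomic acquire/release operations in place of HotSpot's OrderAccess, the Region type and the functions claim_for_gc and scan_limit are hypothetical, and the writer uses release stores for both the time stamp and top, which is stronger than the barriers listed in the [write path].

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a region; not G1's data structure.
struct Region {
  std::atomic<std::uintptr_t> top{0};
  std::atomic<std::uintptr_t> saved_mark{0};
  std::atomic<unsigned> time_stamp{0};
};

// Writer: record the old top as the saved mark, publish the new time stamp,
// and only then move top for the first allocation.
void claim_for_gc(Region& r, unsigned heap_time_stamp, std::uintptr_t new_top) {
  r.saved_mark.store(r.top.load(std::memory_order_relaxed),
                     std::memory_order_relaxed);
  r.time_stamp.store(heap_time_stamp, std::memory_order_release);
  r.top.store(new_top, std::memory_order_release);
}

// Reader: load top first, then the time stamp. If the top value read was
// stored after the stamp update, the acquire load guarantees that the
// updated stamp is observed, so that top value is never returned.
std::uintptr_t scan_limit(const Region& r, unsigned heap_time_stamp) {
  std::uintptr_t local_top = r.top.load(std::memory_order_acquire);
  if (r.time_stamp.load(std::memory_order_acquire) < heap_time_stamp) {
    return local_top;
  }
  return r.saved_mark.load(std::memory_order_relaxed);
}
```

Reading the time stamp first and top second, as in the original code, leaves a window in which the stamp looks old while top already covers memory handed out to a GC worker; the order above closes that window in this model.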

Bug: https://bugs.openjdk.java.net/browse/JDK-8058209
Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/

Testing: JPRT, local kitchensink (4 hours), gc test suite

Thanks
/Mikael

From aph at redhat.com  Tue Nov 11 16:53:55 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 11 Nov 2014 16:53:55 +0000
Subject: RFR: AARCH64: Top-level JDK changes
Message-ID: <54623F23.4000404@redhat.com>

The changes for the /jdk subdirectory.

http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594/

Andrew.

From aph at redhat.com  Tue Nov 11 19:02:21 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 11 Nov 2014 19:02:21 +0000
Subject: RFR: AARCH64: Changes to HotSpot shared code
Message-ID: <54625D3D.4000007@redhat.com>

http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch

Everything except cpu/ and os_cpu/.

Most of this is obvious and trivial, with a few exceptions.

In memory/metaspace.cpp, we allocated the memory for metadata in a
different way. This is because we want to be able to decode and encode
compressed metadata pointers with a single instruction, and we can
always do that iff the base address is of a particular form.

In opto/, we have made some changes in order to be able to use AArch64
store release instructions for volatile field stores. These don't
require leading or trailing barriers. We have tried several times to do
this without changing shared code, but it is impossible with the current
back-end interface.

In several places a release store is used where the AArch64 memory model
makes it unnecessary. From earlier emails on this list we discovered
that the only architecture which requires this release store is IA64,
and OpenJDK does not support it anyway.

We should perhaps look at re-engineering the way that memory barriers
and memory accesses are handled in HotSpot with a view to pushing all
these architecture-dependent assumptions out to the back ends.

Andrew.

From goetz.lindenmaier at sap.com  Wed Nov 12 08:32:04 2014
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 12 Nov 2014 08:32:04 +0000
Subject: AARCH64: Changes to HotSpot shared code
In-Reply-To: <54625D3D.4000007@redhat.com>
References: <54625D3D.4000007@redhat.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap>

Hi Andrew,

the change does not build as the file
sun.jvm.hotspot.debugger.MachineDescriptionAArch64.java
is missing.

If I add that file, I get
cc1plus: error: unrecognized command line option "-Wno-error=cpp"
We are using g++ 4.1.2.

Then I get
os_linux.cpp:1927: error: EM_AARCH64 was not declared in this scope
A few lines above, a missing EM_486 definition is dealt with. I guess
this should be fixed similarly.

After these fixes the change builds and runs on PPC nicely.

Concerning the change in metaspace:
I just proposed a similar change for the java heap, see
http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/
Don't you want to use 4*G << narrow_klass_shift() for alignment and the
test for zerobased compressed klasses? That's what we enforce for the
heap. I think the shift is superior to movk, as it preserves 0 (which is
rare for klass pointers, I admit).
And, could you add #if !defined(AARCH64) && !defined(PPC64) at this
place? We'll implement the optimized klass compression, too. ;)

In ci_LIR.hpp|cpp please use AARCH64 in #ifs.

I would propose to add aarch stuff in alphabetical order (or maybe
establish alphabetical order where absent) whenever all cpus are listed
(makefiles, os_linux, vm_version, ...)

Also, should we sort the cpu includes alphabetically at some point?

Best regards,
  Goetz.

-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley
Sent: Dienstag, 11.
November 2014 20:02 To: hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: RFR: AARCH64: Changes to HotSpot shared code http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch Everything except cpu/ and os_cpu/. Most of this is obvious and trivial, with a few exceptions. In memory/metaspace.cpp, we allocated the memory for metadata in a different way. This is because we want to be able to decode and encode compressed metadata pointers with a single instruction, and we can always do that iff the base address is of a particular form. In opto/, we have made some changes in order to be able to use AArch64 store release instructions for volatile field stores. These don't require leading or trailing barriers. We have tried several times to do this without changing shared code, but it is impossible with the current back-end interface. In several places a release store is used where the AArch64 memory model makes it unnecessary. From earlier emails on this list we discovered that the only architecture which requires this release store is IA64, and OpenJDK does not support it anyway. We should perhaps look at re-engineering the way that memory barriers and memory accesses are handled in HotSpot with a view to pushing all these architecture-dependent assumptions out to the back ends. Andrew. From bengt.rutisson at oracle.com Wed Nov 12 09:09:04 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 12 Nov 2014 10:09:04 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546228E3.8030207@oracle.com> References: <546228E3.8030207@oracle.com> Message-ID: <546323B0.3060109@oracle.com> Hi Mikael, Looks good to me. Really good work! Bengt On 2014-11-11 16:18, Mikael Gerdin wrote: > Hi all, > > I've sent this to hotspot-dev instead of just hotspot-gc-dev in the > hope of getting some extra feedback from our resident concurrency > experts. 
>
> Please review this subtle change to the order in which we read fields
> in G1OffsetTableContigSpace::saved_mark_word, original included here
> for reference:
>
> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
>   G1CollectedHeap* g1h = G1CollectedHeap::heap();
>   assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
>   if (_gc_time_stamp < g1h->get_gc_time_stamp())
>     return top();
>   else
>     return Space::saved_mark_word();
> }
>
> When getting a new gc alloc region several stores are performed where
> store ordering needs to be enforced and several synchronization points
> occur.
> [write path]
> ST(_saved_mark_word)
> #StoreStore
> ST(_gc_time_stamp)
> ST(_top) // satisfying alloc request
> #StoreStore
> ST(_alloc_region) // publishing to other gc workers
> #MonitorEnter
> ST(_top) // potential further allocations
> #MonitorExit
> #MonitorEnter
> ST(_top) // potential further allocations
> #MonitorExit
>
> When we inspect a region during remembered set scanning we need to
> ensure that we never read memory which has been allocated by a GC
> worker thread for the purpose of copying objects into.
> The way this works is that a time stamp field is supposed to signal to
> a scanning thread that it should look at addresses below _top if the
> time stamp is old or addresses below _saved_mark_word if the time
> stamp is current.
>
> The current code does (as seen above)
> [read path]
> LD(_gc_time_stamp)
> LD(_top)
> or (depending on time stamp)
> LD(_saved_mark_word)
>
> Because these values are written to without full mutual exclusion we
> need to be very careful about the order in which we read these values,
> and this is where I argue that the current code is incorrect.
> In order to observe a consistent view of the ordered stores in the
> [write path] above we need to load the values in the reverse order
> they were written, with proper #LoadLoad ordering enforced.
>
> The problem which we've observed here is that after we've read the
> time stamp as below the heap time stamp the top pointer can be updated
> by a GC worker allocating objects into this region. To make sure that
> the top value we see is in fact valid we must read it before we read
> the time stamp which determines which value we should return from the
> saved_mark_word function.
>
> My suggested fix is to load _top first and enforce #LoadLoad ordering:
> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
>   G1CollectedHeap* g1h = G1CollectedHeap::heap();
>   assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
>   HeapWord* local_top = top();
>   OrderAccess::loadload();
>   if (_gc_time_stamp < g1h->get_gc_time_stamp()) {
>     return local_top;
>   } else {
>     return Space::saved_mark_word();
>   }
> }
>
> I've successfully reproduced the crash with the original code by
> adding some random sleep calls between the load of the time stamp and
> the load of top so I'm fairly certain that this resolves the issue.
> I've also verified that the fix I'm proposing does resolve the bug for
> the team which encountered the issue, even if I can't reproduce that
> crash locally.
>
> I also plan to attempt to design around some of the races in this code to
> reduce its complexity, but for the sake of backporting the fix to 8u40
> I'd like to start with just adding the minimal fix.
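The "load in the reverse order of the stores" idea from the quoted mail can be sketched outside HotSpot roughly as follows. This is only an illustration under stated assumptions: Region, start_gc_allocation and scan_limit are hypothetical names, the writer is a simplified two-step version of the [write path] above, and std::atomic acquire/release operations stand in for OrderAccess::loadload() and the #StoreStore barriers (they are somewhat stronger than those barriers).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a heap region; field names follow the mail,
// the types and the use of std::atomic are assumptions of this sketch.
struct Region {
  std::atomic<std::uintptr_t> top{0};
  std::atomic<std::uintptr_t> saved_mark{0};
  std::atomic<unsigned> time_stamp{0};
};

// Writer (simplified): record the old top as the saved mark, publish the
// new time stamp, and only then bump top for a GC allocation.
void start_gc_allocation(Region& r, unsigned heap_time_stamp,
                         std::uintptr_t new_top) {
  r.saved_mark.store(r.top.load(std::memory_order_relaxed),
                     std::memory_order_relaxed);
  r.time_stamp.store(heap_time_stamp, std::memory_order_release);
  r.top.store(new_top, std::memory_order_release);
}

// Reader: load top FIRST, then the time stamp, i.e. the reverse of the
// order in which the writer stored them. If the reader observes a bumped
// top, it is then guaranteed to observe the current time stamp as well,
// so it never returns a top value that covers GC-allocated memory.
std::uintptr_t scan_limit(const Region& r, unsigned heap_time_stamp) {
  std::uintptr_t local_top = r.top.load(std::memory_order_acquire);
  unsigned stamp = r.time_stamp.load(std::memory_order_acquire);
  if (stamp < heap_time_stamp) {
    return local_top;  // stamp is old: top was not bumped by a GC worker
  }
  return r.saved_mark.load(std::memory_order_relaxed);
}
```

Reading the time stamp before top, as the original code does, leaves a window in which the stamp is seen as old while the top that is subsequently loaded already includes GC allocations.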
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ > Testing: JPRT, local kitchensink (4 hours), gc test suite > > Thanks > /Mikael From aph at redhat.com Wed Nov 12 09:39:06 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 12 Nov 2014 09:39:06 +0000 Subject: AARCH64: Changes to HotSpot shared code In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> Message-ID: <54632ABA.5000706@redhat.com> Hi, On 12/11/14 08:32, Lindenmaier, Goetz wrote: > the change does not build as the file > sun.jvm.hotspot.debugger.MachineDescriptionAArch64.java > is missing. Sorry, I will fix that in a new webrev. > If I add that file, I get > cc1plus: error: unrecognized command line option "-Wno-error=cpp" > We are using g++ 4.1.2. This is very awkward. Without this command it does not build on a recent GCC. I want to avoid yet more configury if I can, so I'll have a think. > Then I get > os_linux.cpp:1927: error: EM_AARCH64 was not declared in this scope > A few lines above it's dealt with a missing EM_486 definition. > I guess this should be fixed similarly. Okay, I will try to see why this does not work for you. > After these fixes the change builds and runs on PPC nicely. > > Considering the change in metaspace: I just proposed a similar change > for the java heap, see > http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ > > Don't you want to use 4*G << narrow_klass_shift() for alignment and > the test for zerobased compressed klasses? That's what we enforce for > the heap. I think the shift is superior to movk, as it preserves 0 > (which is rare for klass pointers, I admit). I don't understand what you are saying. We have a solution that can compress a klass in one instruction. What would we want that is different? 
> And, could you add #if !defined(AARCH64) && !defined(PPC64) at this place? We'll > implement the optimized klass compression, too. ;) Okay, if it makes sense to do so. > In ci_LIR.hpp|cpp please use AARCH64 in #ifs. > > I would propose to add aarch stuff in alphabetical order > (or maybe establish alphabetical order where absent) whenever all cpus are > listed (makefiles, os_linux, vm_version, ...) > > Also, should we sort the cpu includes alphabetically at some point? I'm trying to make the minimum changes, but I will do whatever people want. Andrew. From aph at redhat.com Wed Nov 12 09:45:39 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 12 Nov 2014 09:45:39 +0000 Subject: RFR: AARCH64: 8064594: Top-level JDK changes Message-ID: <54632C43.7020109@redhat.com> The changes for the /jdk subdirectory. http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594/ Andrew. From thomas.stuefe at gmail.com Wed Nov 12 10:31:37 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 12 Nov 2014 11:31:37 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: <2e9bd47c-366a-446b-89d0-b431a5816007@default> References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> Message-ID: Hi, could you please review this little addition? (added comments for jio_snprintf) http://cr.openjdk.java.net/~simonis/webrevs/8062370/ Thanks!
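For reference, the jio_snprintf semantics discussed in this thread (always NUL-terminate, report truncation as -1) can be sketched as below. bounded_vsnprintf and bounded_snprintf are hypothetical names, not the HotSpot functions, and the sketch assumes a C99-conforming vsnprintf underneath.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not HotSpot's jio_vsnprintf): formats into buf,
// always leaves buf NUL-terminated, and returns the number of characters
// written, or -1 if the output was truncated or formatting failed.
// 'count' must be the full buffer size, including the terminating byte.
int bounded_vsnprintf(char* buf, size_t count, const char* fmt, va_list args) {
  if (count == 0) {
    return -1;  // no room even for the terminator
  }
  int result = vsnprintf(buf, count, fmt, args);
  if (result < 0 || (size_t)result >= count) {
    buf[count - 1] = '\0';  // force termination on overflow or error
    return -1;
  }
  return result;
}

// Variadic convenience wrapper around bounded_vsnprintf.
int bounded_snprintf(char* buf, size_t count, const char* fmt, ...) {
  va_list args;
  va_start(args, fmt);
  int result = bounded_vsnprintf(buf, count, fmt, args);
  va_end(args);
  return result;
}
```

With an 8-byte buffer, formatting the string "0123456789" leaves "0123456" in the buffer and returns -1. A caller that passes the buffer size minus one as 'count' loses one more character on overflow, which is the kind of truncation reported in this thread.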
Best Regards, Thomas On Tue, Nov 11, 2014 at 11:37 AM, Markus Grönlund < markus.gronlund at oracle.com> wrote: > Hi Goetz, > > Thanks for following up on this. > > I adjusted a few calls into jio_snprintf() that were particular for > Windows to accommodate the updates. > > From the test results I have seen so far, it seems no other issues > appeared which could be related to this. > > So I think the change should be left as is (no rollback), but maybe a > comment could be added about the jio_snprintf() semantics (NULL termination > on overflows, expected 'count' etc). > > Thanks > Markus > > > -----Original Message----- > From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] > Sent: den 11 november 2014 11:14 > To: Markus Grönlund > Cc: David Holmes; hotspot-dev at openjdk.java.net > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi Markus, > > Could you fix your issue and did the other tests pass? > > Are there any follow-up actions I should take? > > Best regards, > Goetz. > > -----Original Message----- > From: Markus Grönlund [mailto:markus.gronlund at oracle.com] > Sent: Donnerstag, 6. November 2014 14:50 > To: Lindenmaier, Goetz > Cc: David Holmes; hotspot-dev at openjdk.java.net > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi Goetz, > > Thanks for looking into this. > > I think I will be able to update the internal code I am working on to > accommodate your updates. > > I don't know if any other code will see potential issues - only testing > will tell. > > So I would await the rollback and I will putback my updated code - let's > see if other issues appear after this - we should know after this nights > nightly testing (then we can re-evaluate the rollback).
> > Thanks > Markus > > > -----Original Message----- > From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] > Sent: den 6 november 2014 13:49 > To: David Holmes; hotspot-dev at openjdk.java.net > Cc: Markus Grönlund > Subject: RE: RFR (L): 8062370: Various minor code improvements > > Hi David, > > Well, yes, that's right. But then you can simply pass in count+1. > It works also if the caller knows he will only use 'count' bytes of the > string. In this case +1 must be allocated. > But that both is quite special. > > Currently, if the string is truncated, there is no null byte on windows. > And there are a lot of uses of this method in the VM (via jio_snprintf). > > Should I use the internal bug number for the rollback-fix? > > How should we proceed, as I can't fix your internal code? > > Best regards, > Goetz. > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 6. November 2014 12:23 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Cc: Markus Grönlund > Subject: Re: RFR (L): 8062370: Various minor code improvements > > On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: > > Hi David, > > > > yes, windows does not null terminate if there is an overflow. > > Obviously there are overflows, and they now see one less character. I > > think this should be fixed where jio_vsnprintf is called. Having > > non-null terminated strings isn't nice. > > I think it depends on what you consider an overflow. If the buffer is > already null terminated and you pass in a count that covers up to the > location before the null then there is no problem - except now the logic > will introduce a second null in place of the last character. > > > But for now I will roll back this single change. I'll send a RFR soon. > > > > Where did you see the problem? > > It was in our closed code so I can't go into details. We have a non-public > bug number: 8063089 > > Thanks, > David > > > > > Best regards, > > Goetz.
> > > > > > > > > > -----Original Message----- > > From: David Holmes [mailto:david.holmes at oracle.com] > > Sent: Donnerstag, 6. November 2014 11:30 > > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > > Cc: Markus Grönlund > > Subject: Re: RFR (L): 8062370: Various minor code improvements > > > > On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: > >> Thanks David, I'll have a look. > > > > It seems that windows vsnprintf may not null-terminate the string - > > which I think is what your patch was trying to address. But if we have > > existing code that works with that then the fix is now overwriting the > > last character. I can't quite see how to handle this in a cross > > platform manner, but in the immediate term we should probably revert > > that part of the changeset. > > > > David > > > >> Best regards, > >> Goetz. > >> > >> -----Original Message----- > >> From: David Holmes [mailto:david.holmes at oracle.com] > >> Sent: Donnerstag, 6. November 2014 11:09 > >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > >> Cc: Markus Grönlund > >> Subject: Re: RFR (L): 8062370: Various minor code improvements > >> > >> Hi Goetz, > >> > >> This change has introduced a bug: > >> > >> - return vsnprintf(str, count, fmt, args); > >> + > >> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && > >> + (size_t)result >= count) || result == -1) { > >> + str[count - 1] = '\0'; > >> + result = -1; > >> + } > >> + > >> + return result; > >> > >> some strings are getting their last character truncated on Windows. > >> > >> David > >> > >> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: > >>> Hi David, > >>> > >>> thanks for looking at the change! I fixed the issue in a new > >>> webrev: > >>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> -----Original Message----- > >>> From: David Holmes [mailto:david.holmes at oracle.com] > >>> Sent: Mittwoch, 5.
November 2014 02:49 > >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > >>> Subject: Re: RFR (L): 8062370: Various minor code improvements > >>> > >>> Hi Goetz, > >>> > >>> The only issue I see is in: > >>> > >>> src/share/vm/runtime/globals.cpp > >>> > >>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the > >>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use > os::strdup_check_oom. > >>> > >>> Thanks, > >>> David > >>> > >>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> this change contains a row of minor code improvements we did to > >>>> fulfil our internal quality requirements. We would like to share > >>>> these with openJDK. > >>>> > >>>> Please review and test this change. I please need a sponsor. > >>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ > >>>> https://bugs.openjdk.java.net/browse/JDK-8062370 > >>>> > >>>> We tested this on windows 64, linux x86_64, mac, solaris sparc > >>>> 32+64 bit and, of course, the ppc platforms. > >>>> > >>>> > >>>> Some details: > >>>> > >>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus > not representable as i64 what is used in the CONST64 macro. This change > adapts UCONST64 to use ui64, and the usages of these macros where necessary. > >>>> > >>>> We add some more strncpy uses. Also, we fix strncpy on windows. > There, strncpy does not write a \0 into the last byte if the copied string > is too long. > >>>> > >>>> We add some missing memory frees and some closing of files. > >>>> > >>>> jio_vsnprintf() works differently on windows and linux. This change > adapts this to show the same behaviour on all platforms. See java.cpp. 
> >>>> > >>>> Best regards, > >>>> > >>>> Goetz > >>>> > >>>> > >>>> > >>>> > From goetz.lindenmaier at sap.com Wed Nov 12 11:40:09 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 12 Nov 2014 11:40:09 +0000 Subject: AARCH64: Changes to HotSpot shared code In-Reply-To: <54632ABA.5000706@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> Hi Andrew, I think EM_AARCH64 is not defined in my case because the gcc / environment is too old to know about AARCH. metaspace: Ok, I think I missed that you can ignore the shift altogether as the compressed class space is enforced to be < 0xc0000000. Further, klasses are never 0. So on ppc we will still prefer to shift if the heap is < 32g, but 4G alignment helps in the other cases as long as the compressed class space is smaller than 4G. Best regards, Goetz. -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Mittwoch, 12. November 2014 10:39 To: Lindenmaier, Goetz; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: Re: AARCH64: Changes to HotSpot shared code Hi, On 12/11/14 08:32, Lindenmaier, Goetz wrote: > the change does not build as the file > sun.jvm.hotspot.debugger.MachineDescriptionAArch64.java > is missing. Sorry, I will fix that in a new webrev. > If I add that file, I get > cc1plus: error: unrecognized command line option "-Wno-error=cpp" > We are using g++ 4.1.2. This is very awkward. Without this command it does not build on a recent GCC. I want to avoid yet more configury if I can, so I'll have a think. > Then I get > os_linux.cpp:1927: error: EM_AARCH64 was not declared in this scope > A few lines above it's dealt with a missing EM_486 definition. > I guess this should be fixed similarly. Okay, I will try to see why this does not work for you. 
> After these fixes the change builds and runs on PPC nicely. > > Considering the change in metaspace: I just proposed a similar change > for the java heap, see > http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ > > Don't you want to use 4*G << narrow_klass_shift() for alignment and > the test for zerobased compressed klasses? That's what we enforce for > the heap. I think the shift is superior to movk, as it preserves 0 > (which is rare for klass pointers, I admit). I don't understand what you are saying. We have a solution that can compress a klass in one instruction. What would we want that is different? > And, could you add #if !defined(AARCH64) && !defined(PPC64) at this place? We'll > implement the optimized klass compression, too. ;) Okay, if it makes sense to do so. > In ci_LIR.hpp|cpp please use AARCH64 in #ifs. > > I would propose to add aarch stuff in alphabetical order > (or maybe establish alphabetical order where absent) whenever all cpus are > listed (makefiles, os_linux, vm_version, ...) > > Also, should we sort the cpu includes alphabetically at some point? I'm trying to make the minimum changes, but I will do whatever people want. Andrew. From aph at redhat.com Wed Nov 12 11:42:51 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 12 Nov 2014 11:42:51 +0000 Subject: AARCH64: Changes to HotSpot shared code In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> Message-ID: <546347BB.6050409@redhat.com> On 11/12/2014 11:40 AM, Lindenmaier, Goetz wrote: > I think EM_AARCH64 is not defined in my case because the gcc / environment > is too old to know about AARCH. Okay, I can #ifdef that. 
> metaspace: > Ok, I think I missed that you can ignore the shift altogether as the compressed class space > is enforced to be < 0xc0000000. Further, klasses are never 0. > So on ppc we will still prefer to shift if the heap is < 32g, but 4G alignment helps in the > other cases as long as the compressed class space is smaller than 4G. I suppose we're getting close to declaring that this should really be handled by the back end. Andrew. From aph at redhat.com Wed Nov 12 11:48:08 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 12 Nov 2014 11:48:08 +0000 Subject: RFR: AARCH64: 8064594: Top-level JDK changes Message-ID: <546348F8.9060900@redhat.com> The changes for the /jdk subdirectory. The missing files problem bit me again. New webrev: http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594-1/ Apologies, Andrew. From stefan.karlsson at oracle.com Wed Nov 12 12:39:07 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 12 Nov 2014 13:39:07 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists Message-ID: <546354EB.80606@oracle.com> Hi all, Please review the following two cleanup patches to move the conditional include lines to the end of the include lists. The patches also add missing macros.hpp includes that are needed when the INCLUDE_* defines are used. There are also a few minor cleanups near some usages of INCLUDE_ALL_GCS. http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix INCLUDE_ALL_GCS Some background to the sort order, the INCLUDE_* defines and macros.hpp: The include lines were inserted and sorted in the includeDB removal patch. As part of that patch all includes that were guarded by #ifndef were put at the end of the include list.
See: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - 6989984: Use standard include model for Hospot Later the selective inclusion of parts like, for example, CDS and the non-serial GCs were changed and now we also rely on the defines present in macros.hpp. With that change it's now important that all conditional includes are added after the inclusion of macros.hpp. See: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - 7189254: Change makefiles for more flexibility to override defaults thanks, StefanK From goetz.lindenmaier at sap.com Wed Nov 12 13:47:49 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 12 Nov 2014 13:47:49 +0000 Subject: AARCH64: Changes to HotSpot shared code In-Reply-To: <546347BB.6050409@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <546347BB.6050409@redhat.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF26DEC@DEWDFEMB12A.global.corp.sap> Yes, if you add defined(PPC64) I can handle the rest in the back end. But I can also add that later on ... basically I wanted to say that it's a good idea to align the class space that way. Best regards, Goetz. -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Mittwoch, 12. November 2014 12:43 To: Lindenmaier, Goetz; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: Re: AARCH64: Changes to HotSpot shared code On 11/12/2014 11:40 AM, Lindenmaier, Goetz wrote: > I think EM_AARCH64 is not defined in my case because the gcc / environment > is too old to know about AARCH. Okay, I can #ifdef that. > metaspace: > Ok, I think I missed that you can ignore the shift altogether as the compressed class space > is enforced to be < 0xc0000000. Further, klasses are never 0. 
> So on ppc we will still prefer to shift if the heap is < 32g, but 4G alignment helps in the > other cases as long as the compressed class space is smaller than 4G. I suppose we're getting close to declaring that this should really be handled by the back end. Andrew. From jesper.wilhelmsson at oracle.com Wed Nov 12 13:50:13 2014 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 12 Nov 2014 14:50:13 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <546354EB.80606@oracle.com> References: <546354EB.80606@oracle.com> Message-ID: <54636595.7080003@oracle.com> Looks good! /Jesper Stefan Karlsson skrev 12/11/14 13:39: > Hi all, > > Please, review the following two cleanup patches to move the conditional > include lines to the end of the include lists. The patches also add missing > macros.hpp includes, that are needed when the INCLUDE_* defines are used. > There are also a few minor cleanups near some usages of INCLUDE_ALL_GCS. > > http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS > http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix INCLUDE_ALL_GCS > > Some background to the sort order, the INCLUDE_* defines and macros.hpp: > > The include lines where inserted and sorted in the includeDB removal patch. As > part of that patch all includes that were guarded by #ifndef were put at the > end of the include list. See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - 6989984: Use > standard include model for Hospot > > Later the selective inclusion of parts like, for example, CDS and the > non-serial GCs were changed and now we also rely on the defines present in > macros.hpp. With that change it's now important that all conditional includes > are added after the inclusion of macros.hpp. 
See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - 7189254: > Change makefiles for more flexibility to override defaults > > thanks, > StefanK From stefan.karlsson at oracle.com Wed Nov 12 13:44:04 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 12 Nov 2014 14:44:04 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <54636595.7080003@oracle.com> References: <546354EB.80606@oracle.com> <54636595.7080003@oracle.com> Message-ID: <54636424.60800@oracle.com> On 2014-11-12 14:50, Jesper Wilhelmsson wrote: > Looks good! Thanks, Jesper! StefanK > /Jesper > > Stefan Karlsson skrev 12/11/14 13:39: >> Hi all, >> >> Please, review the following two cleanup patches to move the >> conditional include lines to the end of the include lists. The >> patches also add missing macros.hpp includes, that are needed when >> the INCLUDE_* defines are used. There are also a few minor cleanups >> near some usages of INCLUDE_ALL_GCS. >> >> http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS >> http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix >> INCLUDE_ALL_GCS >> >> Some background to the sort order, the INCLUDE_* defines and macros.hpp: >> >> The include lines where inserted and sorted in the includeDB removal >> patch. As part of that patch all includes that were guarded by >> #ifndef were put at the end of the include list. See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - >> 6989984: Use standard include model for Hospot >> >> Later the selective inclusion of parts like, for example, CDS and the >> non-serial GCs were changed and now we also rely on the defines >> present in macros.hpp. With that change it's now important that all >> conditional includes are added after the inclusion of macros.hpp. 
See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - >> 7189254: Change makefiles for more flexibility to override defaults >> >> thanks, >> StefanK > From bengt.rutisson at oracle.com Wed Nov 12 14:33:50 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 12 Nov 2014 15:33:50 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <546354EB.80606@oracle.com> References: <546354EB.80606@oracle.com> Message-ID: <54636FCE.70801@oracle.com> Hi Stefan, Looks good to me. Thanks, Bengt On 2014-11-12 13:39, Stefan Karlsson wrote: > Hi all, > > Please, review the following two cleanup patches to move the > conditional include lines to the end of the include lists. The patches > also add missing macros.hpp includes, that are needed when the > INCLUDE_* defines are used. There are also a few minor cleanups near > some usages of INCLUDE_ALL_GCS. > > http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS > http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix > INCLUDE_ALL_GCS > > Some background to the sort order, the INCLUDE_* defines and macros.hpp: > > The include lines where inserted and sorted in the includeDB removal > patch. As part of that patch all includes that were guarded by #ifndef > were put at the end of the include list. See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - > 6989984: Use standard include model for Hospot > > Later the selective inclusion of parts like, for example, CDS and the > non-serial GCs were changed and now we also rely on the defines > present in macros.hpp. With that change it's now important that all > conditional includes are added after the inclusion of macros.hpp. 
See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - > 7189254: Change makefiles for more flexibility to override defaults > > thanks, > StefanK From stefan.karlsson at oracle.com Wed Nov 12 14:49:23 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 12 Nov 2014 15:49:23 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <54636FCE.70801@oracle.com> References: <546354EB.80606@oracle.com> <54636FCE.70801@oracle.com> Message-ID: <54637373.3090607@oracle.com> On 2014-11-12 15:33, Bengt Rutisson wrote: > > Hi Stefan, > > Looks good to me. Thanks, Bengt. StefanK > > Thanks, > Bengt > > On 2014-11-12 13:39, Stefan Karlsson wrote: >> Hi all, >> >> Please, review the following two cleanup patches to move the >> conditional include lines to the end of the include lists. The >> patches also add missing macros.hpp includes, that are needed when >> the INCLUDE_* defines are used. There are also a few minor cleanups >> near some usages of INCLUDE_ALL_GCS. >> >> http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS >> http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix >> INCLUDE_ALL_GCS >> >> Some background to the sort order, the INCLUDE_* defines and macros.hpp: >> >> The include lines where inserted and sorted in the includeDB removal >> patch. As part of that patch all includes that were guarded by >> #ifndef were put at the end of the include list. See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - >> 6989984: Use standard include model for Hospot >> >> Later the selective inclusion of parts like, for example, CDS and the >> non-serial GCs were changed and now we also rely on the defines >> present in macros.hpp. With that change it's now important that all >> conditional includes are added after the inclusion of macros.hpp. 
See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - >> 7189254: Change makefiles for more flexibility to override defaults >> >> thanks, >> StefanK > From coleen.phillimore at oracle.com Wed Nov 12 15:57:53 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 12 Nov 2014 10:57:53 -0500 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <54637373.3090607@oracle.com> References: <546354EB.80606@oracle.com> <54636FCE.70801@oracle.com> <54637373.3090607@oracle.com> Message-ID: <54638381.10608@oracle.com> Looks good to me. And thanks for clarifying the rules. Coleen On 11/12/14, 9:49 AM, Stefan Karlsson wrote: > On 2014-11-12 15:33, Bengt Rutisson wrote: >> >> Hi Stefan, >> >> Looks good to me. > > Thanks, Bengt. > StefanK > >> >> Thanks, >> Bengt >> >> On 2014-11-12 13:39, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review the following two cleanup patches to move the >>> conditional include lines to the end of the include lists. The >>> patches also add missing macros.hpp includes, that are needed when >>> the INCLUDE_* defines are used. There are also a few minor cleanups >>> near some usages of INCLUDE_ALL_GCS. >>> >>> http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix >>> INCLUDE_CDS >>> http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix >>> INCLUDE_ALL_GCS >>> >>> Some background to the sort order, the INCLUDE_* defines and >>> macros.hpp: >>> >>> The include lines where inserted and sorted in the includeDB removal >>> patch. As part of that patch all includes that were guarded by >>> #ifndef were put at the end of the include list. See: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - >>> 6989984: Use standard include model for Hospot >>> >>> Later the selective inclusion of parts like, for example, CDS and >>> the non-serial GCs were changed and now we also rely on the defines >>> present in macros.hpp. 
With that change it's now important that all >>> conditional includes are added after the inclusion of macros.hpp. See: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - >>> 7189254: Change makefiles for more flexibility to override defaults >>> >>> thanks, >>> StefanK >> > From stefan.karlsson at oracle.com Wed Nov 12 16:17:57 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 12 Nov 2014 17:17:57 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <54638381.10608@oracle.com> References: <546354EB.80606@oracle.com> <54636FCE.70801@oracle.com> <54637373.3090607@oracle.com> <54638381.10608@oracle.com> Message-ID: <54638835.7070504@oracle.com> On 2014-11-12 16:57, Coleen Phillimore wrote: > > Looks good to me. And thanks for clarifying the rules. Thanks, Coleen. StefanK > > Coleen > > On 11/12/14, 9:49 AM, Stefan Karlsson wrote: >> On 2014-11-12 15:33, Bengt Rutisson wrote: >>> >>> Hi Stefan, >>> >>> Looks good to me. >> >> Thanks, Bengt. >> StefanK >> >>> >>> Thanks, >>> Bengt >>> >>> On 2014-11-12 13:39, Stefan Karlsson wrote: >>>> Hi all, >>>> >>>> Please, review the following two cleanup patches to move the >>>> conditional include lines to the end of the include lists. The >>>> patches also add missing macros.hpp includes, that are needed when >>>> the INCLUDE_* defines are used. There are also a few minor cleanups >>>> near some usages of INCLUDE_ALL_GCS. >>>> >>>> http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix >>>> INCLUDE_CDS >>>> http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix >>>> INCLUDE_ALL_GCS >>>> >>>> Some background to the sort order, the INCLUDE_* defines and >>>> macros.hpp: >>>> >>>> The include lines where inserted and sorted in the includeDB >>>> removal patch. As part of that patch all includes that were guarded >>>> by #ifndef were put at the end of the include list. 
See: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - >>>> 6989984: Use standard include model for Hospot >>>> >>>> Later the selective inclusion of parts like, for example, CDS and >>>> the non-serial GCs were changed and now we also rely on the defines >>>> present in macros.hpp. With that change it's now important that all >>>> conditional includes are added after the inclusion of macros.hpp. See: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - >>>> 7189254: Change makefiles for more flexibility to override defaults >>>> >>>> thanks, >>>> StefanK >>> >> > From lois.foltan at oracle.com Wed Nov 12 16:24:37 2014 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 12 Nov 2014 11:24:37 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <545D19D9.3040400@oracle.com> References: <545D19D9.3040400@oracle.com> Message-ID: <546389C5.2070608@oracle.com> Hi Max, Overall, this looks good. A couple of comments: src/share/vm/runtime/arguments.cpp - Within the altered if statement, shouldn't the second comparison of &s[1] to len actually be (strlen(&s[1] == (len-1)), since the original len was calculated to take into account the character in s[0]? This comment also applies to the second strncmp, shouldn't it be (len-1)? - Minor coding style, if the first part of an if statement logically does a "strncmp" and then a "strlen" it would be nice if follow on conditions in that if statement did the same for consistency. Please consider reversing the second "strlen" and "strncmp". I find it a little bit more readable that way. test/runtime/CommandLine/ObsoleteFlagErrorMessage.java - looks good, please update copyright before pushing. Thanks, Lois On 11/7/2014 2:13 PM, Max Ockner wrote: > ID: 8060449 > webrev: http://cr.openjdk.java.net/~coleenp/8060449/ > > Summary: A "newly obsolete" command line option is one which is no > longer supported, but still is acknowledged. 
There is a list of these > in arguments.cpp. > It used to be that only a fixed number of characters were checked when > comparing a given command line option to the list of obsolete flags > (strncmp was used, where the number of characters to check is equal to > the length of the flag name from the table.) > As a result, an arbitrary string appended to the end of an obsolete > argument goes unnoticed. > This issue is fixed by comparing the lengths of the given flag and the > flags from the obsolete flags table. > When a misspelled flag is fuzzy-matched to an obsolete flag, an > appropriate warning is given to save the user a few key strokes: (1) > unrecognized option [bad option]. (2) Did you mean [option]? (3) > [option] is obsolete as of [version]) > > A new test for this feature checks for the presence of all three > components of the above error message. > > Tested with: vm.quick.testlist > hotspot jtreg tests > jprt > > Thanks for your help! > Max Ockner From lois.foltan at oracle.com Wed Nov 12 16:26:37 2014 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 12 Nov 2014 11:26:37 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <546389C5.2070608@oracle.com> References: <545D19D9.3040400@oracle.com> <546389C5.2070608@oracle.com> Message-ID: <54638A3D.20206@oracle.com> On 11/12/2014 11:24 AM, Lois Foltan wrote: > Hi Max, > > Overall, this looks good. A couple of comments: > > src/share/vm/runtime/arguments.cpp > > - Within the altered if statement, shouldn't the second comparison > of &s[1] to len actually be (strlen(&s[1] == (len-1)), since the > original len was calculated to take into account the character in > s[0]? This comment also applies to the second strncmp, shouldn't it > be (len-1)? Scratch this first comment all together. I misread and thought the strlen was calculated differently against s and not flag_status.name at first. Sorry about that! 
Lois > > - Minor coding style, if the first part of an if statement logically > does a "strncmp" and then a "strlen" it would be nice if follow on > conditions in that if statement did the same for consistency. Please > consider reversing the second "strlen" and "strncmp". I find it a > little bit more readable that way. > > test/runtime/CommandLine/ObsoleteFlagErrorMessage.java > > - looks good, please update copyright before pushing. > > Thanks, > Lois > > On 11/7/2014 2:13 PM, Max Ockner wrote: >> ID: 8060449 >> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >> >> Summary: A "newly obsolete" command line option is one which is no >> longer supported, but still is acknowledged. There is a list of these >> in arguments.cpp. >> It used to be that only a fixed number of characters were checked >> when comparing a given command line option to the list of obsolete >> flags (strncmp was used, where the number of characters to check is >> equal to the length of the flag name from the table.) >> As a result, an arbitrary string appended to the end of an obsolete >> argument goes unnoticed. >> This issue is fixed by comparing the lengths of the given flag and >> the flags from the obsolete flags table. >> When a misspelled flag is fuzzy-matched to an obsolete flag, an >> appropriate warning is given to save the user a few key strokes: (1) >> unrecognized option [bad option]. (2) Did you mean [option]? (3) >> [option] is obsolete as of [version]) >> >> A new test for this feature checks for the presence of all three >> components of the above error message. >> >> Tested with: vm.quick.testlist >> hotspot jtreg tests >> jprt >> >> Thanks for your help! 
>> Max Ockner > From aph at redhat.com Wed Nov 12 16:49:36 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 12 Nov 2014 16:49:36 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> Message-ID: <54638FA0.8040204@redhat.com> On 11/12/2014 11:40 AM, Lindenmaier, Goetz wrote: > Hi Andrew, Hi, Thank you for your comment. I have prepared a new webrev at http://cr.openjdk.java.net/~aph/aarch64-8064611-1/ which I hope addresses everything you mentioned. I haven't re-ordered any of the lists of processors because I think this is a separate issue. Andrew. From daniel.daugherty at oracle.com Wed Nov 12 18:41:45 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 12 Nov 2014 11:41:45 -0700 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <545D19D9.3040400@oracle.com> References: <545D19D9.3040400@oracle.com> Message-ID: <5463A9E9.8010005@oracle.com> On 11/7/14 12:13 PM, Max Ockner wrote: > ID: 8060449 > webrev: http://cr.openjdk.java.net/~coleenp/8060449/ src/share/vm/runtime/arguments.cpp line 336: ) { This fragment is on a line by itself and far left. Minimally, it should align like this: line 331: if (... line 336: ) { However, I recommend a slightly different structure to this logic: size_t f_len = strlen(flag_status.name); size_t s_len = strlen(s); if (f_len == s_len || (f_len + 1) == s_len) { // this flag is the right length for a possible match if (strncmp(flag_status.name, s, f_len) == 0) || ((s[0] == '+' || s[0] == '-') && strncmp(flag_status.name, &s[1], f_len) == 0)) { // this flag is an exact match if (JDK_Version::current().compare(flag_status.accept_until) == -1) { ... 
} } } i++; I have no idea if the above formatting is going to be preserved by e-mail clients... Dan > > Summary: A "newly obsolete" command line option is one which is no > longer supported, but still is acknowledged. There is a list of these > in arguments.cpp. > It used to be that only a fixed number of characters were checked when > comparing a given command line option to the list of obsolete flags > (strncmp was used, where the number of characters to check is equal to > the length of the flag name from the table.) > As a result, an arbitrary string appended to the end of an obsolete > argument goes unnoticed. > This issue is fixed by comparing the lengths of the given flag and the > flags from the obsolete flags table. > When a misspelled flag is fuzzy-matched to an obsolete flag, an > appropriate warning is given to save the user a few key strokes: (1) > unrecognized option [bad option]. (2) Did you mean [option]? (3) > [option] is obsolete as of [version]) > > A new test for this feature checks for the presence of all three > components of the above error message. > > Tested with: vm.quick.testlist > hotspot jtreg tests > jprt > > Thanks for your help! > Max Ockner > > From max.ockner at oracle.com Wed Nov 12 19:43:40 2014 From: max.ockner at oracle.com (Max Ockner) Date: Wed, 12 Nov 2014 14:43:40 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <54638A3D.20206@oracle.com> References: <545D19D9.3040400@oracle.com> <546389C5.2070608@oracle.com> <54638A3D.20206@oracle.com> Message-ID: <5463B86C.5090803@oracle.com> Lois, Thank you for looking over this. I have reversed the order of the logic as you recommended and I have updated the copyrights. Max On 11/12/2014 11:26 AM, Lois Foltan wrote: > > On 11/12/2014 11:24 AM, Lois Foltan wrote: >> Hi Max, >> >> Overall, this looks good. 
A couple of comments: >> >> src/share/vm/runtime/arguments.cpp >> >> - Within the altered if statement, shouldn't the second comparison >> of &s[1] to len actually be (strlen(&s[1] == (len-1)), since the >> original len was calculated to take into account the character in >> s[0]? This comment also applies to the second strncmp, shouldn't it >> be (len-1)? > > Scratch this first comment all together. I misread and thought the > strlen was calculated differently against s and not flag_status.name > at first. Sorry about that! > Lois > >> >> - Minor coding style, if the first part of an if statement >> logically does a "strncmp" and then a "strlen" it would be nice if >> follow on conditions in that if statement did the same for >> consistency. Please consider reversing the second "strlen" and >> "strncmp". I find it a little bit more readable that way. >> >> test/runtime/CommandLine/ObsoleteFlagErrorMessage.java >> >> - looks good, please update copyright before pushing. >> >> Thanks, >> Lois >> >> On 11/7/2014 2:13 PM, Max Ockner wrote: >>> ID: 8060449 >>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >>> >>> Summary: A "newly obsolete" command line option is one which is no >>> longer supported, but still is acknowledged. There is a list of >>> these in arguments.cpp. >>> It used to be that only a fixed number of characters were checked >>> when comparing a given command line option to the list of obsolete >>> flags (strncmp was used, where the number of characters to check is >>> equal to the length of the flag name from the table.) >>> As a result, an arbitrary string appended to the end of an obsolete >>> argument goes unnoticed. >>> This issue is fixed by comparing the lengths of the given flag and >>> the flags from the obsolete flags table. >>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>> appropriate warning is given to save the user a few key strokes: (1) >>> unrecognized option [bad option]. (2) Did you mean [option]? 
(3) >>> [option] is obsolete as of [version]) >>> >>> A new test for this feature checks for the presence of all three >>> components of the above error message. >>> >>> Tested with: vm.quick.testlist >>> hotspot jtreg tests >>> jprt >>> >>> Thanks for your help! >>> Max Ockner >> > From max.ockner at oracle.com Wed Nov 12 20:04:38 2014 From: max.ockner at oracle.com (Max Ockner) Date: Wed, 12 Nov 2014 15:04:38 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <5463A9E9.8010005@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> Message-ID: <5463BD56.500@oracle.com> Dan, I have reformatted the "){" fragment on line 336 as you recommended. Thanks for catching that. For your second recommendation, I think I have a use case where the recommended code would not function properly: Let's say there is a boolean flag SomeFlag, and let's say that the user tries to type "-XX:SomeFlagg". The first if statement passes because strlen("SomeFlagg") = strlen("SomeFlag")+1. The second conditional checks if (strncmp(flag_status.name, s, f_len) == 0). But f_len, the length of "SomeFlag" is 8. The result is that the 9th character of the user's input, which is where s differs from flag_status.name, is not checked,so this condition is passed as well. Thanks, Max On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: > On 11/7/14 12:13 PM, Max Ockner wrote: >> ID: 8060449 >> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ > > src/share/vm/runtime/arguments.cpp > > line 336: ) { > This fragment is on a line by itself and far left. > Minimally, it should align like this: > > line 331: if (... 
> line 336: ) { > > However, I recommend a slightly different structure to > this logic: > > size_t f_len = strlen(flag_status.name); > size_t s_len = strlen(s); > if (f_len == s_len || (f_len + 1) == s_len) { > // this flag is the right length for a possible match > if (strncmp(flag_status.name, s, f_len) == 0) || > ((s[0] == '+' || s[0] == '-') && > strncmp(flag_status.name, &s[1], f_len) == 0)) { > // this flag is an exact match > if (JDK_Version::current().compare(flag_status.accept_until) > == -1) { > ... > } > } > } > i++; > > I have no idea if the above formatting is going to be > preserved by e-mail clients... > > Dan > > >> >> Summary: A "newly obsolete" command line option is one which is no >> longer supported, but still is acknowledged. There is a list of these >> in arguments.cpp. >> It used to be that only a fixed number of characters were checked >> when comparing a given command line option to the list of obsolete >> flags (strncmp was used, where the number of characters to check is >> equal to the length of the flag name from the table.) >> As a result, an arbitrary string appended to the end of an obsolete >> argument goes unnoticed. >> This issue is fixed by comparing the lengths of the given flag and >> the flags from the obsolete flags table. >> When a misspelled flag is fuzzy-matched to an obsolete flag, an >> appropriate warning is given to save the user a few key strokes: (1) >> unrecognized option [bad option]. (2) Did you mean [option]? (3) >> [option] is obsolete as of [version]) >> >> A new test for this feature checks for the presence of all three >> components of the above error message. >> >> Tested with: vm.quick.testlist >> hotspot jtreg tests >> jprt >> >> Thanks for your help! 
>> Max Ockner >> >> > From dean.long at oracle.com Wed Nov 12 20:23:42 2014 From: dean.long at oracle.com (Dean Long) Date: Wed, 12 Nov 2014 12:23:42 -0800 Subject: RFR: AARCH64: Changes to HotSpot shared code In-Reply-To: <54625D3D.4000007@redhat.com> References: <54625D3D.4000007@redhat.com> Message-ID: <5463C1CE.9040301@oracle.com> On 11/11/2014 11:02 AM, Andrew Haley wrote: > http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch > > Everything except cpu/ and os_cpu/. > > Most of this is obvious and trivial, with a few exceptions. > > In memory/metaspace.cpp, we allocated the memory for metadata in a > different way. This is because we want to be able to decode and > encode compressed metadata pointers with a single instruction, and we > can always do that iff the base address is of a particular form. > > In opto/, we have made some changes in order to be able to use AArch64 > store release instructions for volatile field stores. These don't > require leading or trailing barriers. We have tried several times to > do this without changing shared code, but it is impossible with the > current back-end interface. Is this something ppc64 can also take advantage of? I hope Vladimir can suggest a more flexible way to do this, perhaps with a runtime flag. > In several places a release store is used where the AArch64 memory > model makes it unnecessary. From earlier emails on this list we > discovered that the only architecture which requires this release > store is IA64, and OpenJDK does not support it anyway. We should > perhaps look at re-engineering the way that memory barriers and memory > accesses are handled in HotSpot with a view to pushing all these > architecture-dependent assumptions out to the back ends. I agree. More comments below. > Andrew. c1_Canonicalizer.cpp Can this be handled in the back-end? I imagine other platforms, such as x86, have similar limitations. 
c1_LIR.cpp It looks like you need a temp for convert because your backend because you're checking the FPSR. What happens if you ignore the FPSR, do you get a wrong result? c1_LinearScan.cpp I'm not familiar with what the changed code is doing. Can you explain why it applies to x86 and aarch64? c1_Runtime1.cpp This will break our closed port that NOP instructions for patching. How about moving your deopt-instead-of-patch support into Runtime1::patch_code() and enable it with a read-only platform-specific developer runtime flag (see INTPRESSURE for example)? compiledIC.hpp You should be able to use set_inst_mark()/cbuf.insts_mark() to set and retrieve the mark address. arguments.cpp I wish there was a way to fix ReservedCodeCacheSize in the back-end. dl From daniel.daugherty at oracle.com Wed Nov 12 20:38:05 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 12 Nov 2014 13:38:05 -0700 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <5463BD56.500@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> Message-ID: <5463C52D.4000600@oracle.com> On 11/12/14 1:04 PM, Max Ockner wrote: > Dan, > I have reformatted the "){" fragment on line 336 as you recommended. > Thanks for catching that. Thanks. > For your second recommendation, I think I have a use case where the > recommended code would not function properly: > > Let's say there is a boolean flag SomeFlag, and let's say that the > user tries to type "-XX:SomeFlagg". > > The first if statement passes because strlen("SomeFlagg") = > strlen("SomeFlag")+1. > The second conditional checks if (strncmp(flag_status.name, s, f_len) > == 0). But f_len, the length of "SomeFlag" is 8. The result is that > the 9th character of the user's input, which is where s differs from > flag_status.name, is not checked,so this condition is passed as well. Your use case catches a bug in what I posted. 
I had originally planned to change the two strncmp() calls to strcmp() so that we get a complete match, but then I couldn't remember if a straight strcmp() triggers Parfait warnings so I couldn't finish reasoning my way through that maze... Switching the 'f_len' parameter to 's_len' would solve the problem without triggering Parfait, but it is totally your call. Dan > > Thanks, > Max > > > > On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: >> On 11/7/14 12:13 PM, Max Ockner wrote: >>> ID: 8060449 >>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >> >> src/share/vm/runtime/arguments.cpp >> >> line 336: ) { >> This fragment is on a line by itself and far left. >> Minimally, it should align like this: >> >> line 331: if (... >> line 336: ) { >> >> However, I recommend a slightly different structure to >> this logic: >> >> size_t f_len = strlen(flag_status.name); >> size_t s_len = strlen(s); >> if (f_len == s_len || (f_len + 1) == s_len) { >> // this flag is the right length for a possible match >> if (strncmp(flag_status.name, s, f_len) == 0) || >> ((s[0] == '+' || s[0] == '-') && >> strncmp(flag_status.name, &s[1], f_len) == 0)) { >> // this flag is an exact match >> if (JDK_Version::current().compare(flag_status.accept_until) >> == -1) { >> ... >> } >> } >> } >> i++; >> >> I have no idea if the above formatting is going to be >> preserved by e-mail clients... >> >> Dan >> >> >>> >>> Summary: A "newly obsolete" command line option is one which is no >>> longer supported, but still is acknowledged. There is a list of >>> these in arguments.cpp. >>> It used to be that only a fixed number of characters were checked >>> when comparing a given command line option to the list of obsolete >>> flags (strncmp was used, where the number of characters to check is >>> equal to the length of the flag name from the table.) >>> As a result, an arbitrary string appended to the end of an obsolete >>> argument goes unnoticed. 
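A minimal sketch of the problem the summary describes, assuming a simplified matcher. The helper names matches_prefix_only and matches_exactly are hypothetical and are not taken from arguments.cpp. The first compares only strlen(name) characters, so trailing junk is accepted. The second also requires the lengths to agree.

```c
#include <string.h>

/* Hypothetical helper: old-style check. Only the first strlen(name)
 * characters of s are compared, so "SomeFlagg" matches "SomeFlag". */
static int matches_prefix_only(const char *name, const char *s) {
    size_t len = strlen(name);
    if (strncmp(name, s, len) == 0) {
        return 1;
    }
    /* Boolean flags may carry a leading '+' or '-'. */
    return (s[0] == '+' || s[0] == '-') && strncmp(name, &s[1], len) == 0;
}

/* Hypothetical helper: length-checked variant. The option (with or
 * without a leading sign) must have exactly the same length as name. */
static int matches_exactly(const char *name, const char *s) {
    size_t len = strlen(name);
    if (strlen(s) == len && strncmp(name, s, len) == 0) {
        return 1;
    }
    return (s[0] == '+' || s[0] == '-') &&
           strlen(&s[1]) == len &&
           strncmp(name, &s[1], len) == 0;
}
```

With these definitions, matches_prefix_only("SomeFlag", "SomeFlagg") is nonzero while matches_exactly("SomeFlag", "SomeFlagg") is zero. Both accept "SomeFlag" and "+SomeFlag".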
>>> This issue is fixed by comparing the lengths of the given flag and >>> the flags from the obsolete flags table. >>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>> appropriate warning is given to save the user a few key strokes: (1) >>> unrecognized option [bad option]. (2) Did you mean [option]? (3) >>> [option] is obsolete as of [version]) >>> >>> A new test for this feature checks for the presence of all three >>> components of the above error message. >>> >>> Tested with: vm.quick.testlist >>> hotspot jtreg tests >>> jprt >>> >>> Thanks for your help! >>> Max Ockner >>> >>> >> > From coleen.phillimore at oracle.com Wed Nov 12 22:45:00 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 12 Nov 2014 17:45:00 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <5463C52D.4000600@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> Message-ID: <5463E2EC.8010206@oracle.com> Hi, I think Max's code looks fine with the formatting changes you suggested, and changes that Lois have suggested. I didn't think your coding suggestion made it that much clearer really. Or it's 6 of one, half dozen of the other... Thanks! Coleen On 11/12/14, 3:38 PM, Daniel D. Daugherty wrote: > On 11/12/14 1:04 PM, Max Ockner wrote: >> Dan, >> I have reformatted the "){" fragment on line 336 as you >> recommended. Thanks for catching that. > > Thanks. > > >> For your second recommendation, I think I have a use case where the >> recommended code would not function properly: >> >> Let's say there is a boolean flag SomeFlag, and let's say that the >> user tries to type "-XX:SomeFlagg". >> >> The first if statement passes because strlen("SomeFlagg") = >> strlen("SomeFlag")+1. >> The second conditional checks if (strncmp(flag_status.name, s, f_len) >> == 0). But f_len, the length of "SomeFlag" is 8. 
The result is that >> the 9th character of the user's input, which is where s differs from >> flag_status.name, is not checked,so this condition is passed as well. > > Your use case catches a bug in what I posted. I had originally > planned to change the two strncmp() calls to strcmp() so that > we get a complete match, but then I couldn't remember if a > straight strcmp() triggers Parfait warnings so I couldn't > finish reasoning my way through that maze... > > Switching the 'f_len' parameter to 's_len' would solve the > problem without triggering Parfait, but it is totally your > call. > > Dan > >> >> Thanks, >> Max >> >> >> >> On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: >>> On 11/7/14 12:13 PM, Max Ockner wrote: >>>> ID: 8060449 >>>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >>> >>> src/share/vm/runtime/arguments.cpp >>> >>> line 336: ) { >>> This fragment is on a line by itself and far left. >>> Minimally, it should align like this: >>> >>> line 331: if (... >>> line 336: ) { >>> >>> However, I recommend a slightly different structure to >>> this logic: >>> >>> size_t f_len = strlen(flag_status.name); >>> size_t s_len = strlen(s); >>> if (f_len == s_len || (f_len + 1) == s_len) { >>> // this flag is the right length for a possible match >>> if (strncmp(flag_status.name, s, f_len) == 0) || >>> ((s[0] == '+' || s[0] == '-') && >>> strncmp(flag_status.name, &s[1], f_len) == 0)) { >>> // this flag is an exact match >>> if (JDK_Version::current().compare(flag_status.accept_until) >>> == -1) { >>> ... >>> } >>> } >>> } >>> i++; >>> >>> I have no idea if the above formatting is going to be >>> preserved by e-mail clients... >>> >>> Dan >>> >>> >>>> >>>> Summary: A "newly obsolete" command line option is one which is no >>>> longer supported, but still is acknowledged. There is a list of >>>> these in arguments.cpp. 
>>>> It used to be that only a fixed number of characters were checked >>>> when comparing a given command line option to the list of obsolete >>>> flags (strncmp was used, where the number of characters to check is >>>> equal to the length of the flag name from the table.) >>>> As a result, an arbitrary string appended to the end of an obsolete >>>> argument goes unnoticed. >>>> This issue is fixed by comparing the lengths of the given flag and >>>> the flags from the obsolete flags table. >>>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>>> appropriate warning is given to save the user a few key strokes: >>>> (1) unrecognized option [bad option]. (2) Did you mean [option]? >>>> (3) [option] is obsolete as of [version]) >>>> >>>> A new test for this feature checks for the presence of all three >>>> components of the above error message. >>>> >>>> Tested with: vm.quick.testlist >>>> hotspot jtreg tests >>>> jprt >>>> >>>> Thanks for your help! >>>> Max Ockner >>>> >>>> >>> >> > From christian.thalinger at oracle.com Wed Nov 12 23:48:51 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 12 Nov 2014 15:48:51 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <54623F23.4000404@redhat.com> References: <54623F23.4000404@redhat.com> Message-ID: Looks good to me. Maybe we should CC sound-dev (if that's the correct list)? > On Nov 11, 2014, at 8:53 AM, Andrew Haley wrote: > > The changes for the /jdk subdirectory. > > http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594/ > > Andrew. From christian.thalinger at oracle.com Wed Nov 12 23:51:59 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 12 Nov 2014 15:51:59 -0800 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: <546348F8.9060900@redhat.com> References: <546348F8.9060900@redhat.com> Message-ID: Oh. I just replied to the wrong email.
Anyway, here it goes again: Maybe we should CC sound-dev (if that's the correct list)? The new jvm.cfg files should only have a copyright year of 2014. Otherwise this looks good. > On Nov 12, 2014, at 3:48 AM, Andrew Haley wrote: > > The changes for the /jdk subdirectory. > > The missing files problem bit me again. > > New webrev: http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594-1/ > > Apologies, > Andrew. From dean.long at oracle.com Thu Nov 13 00:04:34 2014 From: dean.long at oracle.com (Dean Long) Date: Wed, 12 Nov 2014 16:04:34 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <54623F23.4000404@redhat.com> Message-ID: <5463F592.6060000@oracle.com> And adding build-infra-dev and jdk9-dev wouldn't hurt either. dl On 11/12/2014 3:48 PM, Christian Thalinger wrote: > Looks good to me. Maybe we should CC sound-dev (if that's the correct list)? > >> On Nov 11, 2014, at 8:53 AM, Andrew Haley wrote: >> >> The changes for the /jdk subdirectory. >> >> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594/ >> >> Andrew. From vladimir.kozlov at oracle.com Thu Nov 13 00:14:05 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 12 Nov 2014 16:14:05 -0800 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: References: Message-ID: <5463F7CD.5080002@oracle.com> Oh. I just replied to the wrong email. Anyway, here it goes again: Maybe we should CC sound-dev (if that's the correct list)? The new jvm.cfg files should only have a copyright year of 2014. Otherwise this looks good. > On Nov 12, 2014, at 3:48 AM, Andrew Haley wrote: > > The changes for the /jdk subdirectory. > > The missing files problem bit me again. > > New webrev: http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594-1/ > > Apologies, > Andrew.
From david.holmes at oracle.com Thu Nov 13 03:03:49 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 13:03:49 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> Message-ID: <54641F95.5030201@oracle.com> Hi Thomas, On 12/11/2014 8:31 PM, Thomas Stüfe wrote: > Hi, > > could you please review this little addition? (added comments for > jio_snprintf) > > http://cr.openjdk.java.net/~simonis/webrevs/8062370/ A new bug is needed for these changes. As people rarely look at the header file when reading the code could you augment the last line of the comment in jvm.cpp from: + // return always -1. to + // always return -1, and perform null termination. Thanks, David > Thanks! > > Best Regards, > > Thomas > > On Tue, Nov 11, 2014 at 11:37 AM, Markus Grönlund < > markus.gronlund at oracle.com> wrote: > >> Hi Goetz, >> >> Thanks for following up on this. >> >> I adjusted a few calls into jio_snprintf() that were particular for >> Windows to accommodate the updates. >> >> From the test results I have seen so far, it seems no other issues >> appeared which could be related to this. >> >> So I think the change should be left as is (no rollback), but maybe a >> comment could be added about the jio_snprintf() semantics (NULL termination >> on overflows, expected 'count' etc).
>> >> Thanks >> Markus >> >> >> -----Original Message----- >> From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] >> Sent: den 11 november 2014 11:14 >> To: Markus Grönlund >> Cc: David Holmes; hotspot-dev at openjdk.java.net >> Subject: RE: RFR (L): 8062370: Various minor code improvements >> >> Hi Markus, >> >> Could you fix your issue and did the other tests pass? >> >> Are there any follow-up actions I should take? >> >> Best regards, >> Goetz. >> >> -----Original Message----- >> From: Markus Grönlund [mailto:markus.gronlund at oracle.com] >> Sent: Donnerstag, 6. November 2014 14:50 >> To: Lindenmaier, Goetz >> Cc: David Holmes; hotspot-dev at openjdk.java.net >> Subject: RE: RFR (L): 8062370: Various minor code improvements >> >> Hi Goetz, >> >> Thanks for looking into this. >> >> I think I will be able to update the internal code I am working on to >> accommodate your updates. >> >> I don't know if any other code will see potential issues - only testing >> will tell. >> >> So I would await the rollback and I will putback my updated code - let's >> see if other issues appear after this - we should know after this nights >> nightly testing (then we can re-evaluate the rollback). >> >> Thanks >> Markus >> >> >> -----Original Message----- >> From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com] >> Sent: den 6 november 2014 13:49 >> To: David Holmes; hotspot-dev at openjdk.java.net >> Cc: Markus Grönlund >> Subject: RE: RFR (L): 8062370: Various minor code improvements >> >> Hi David, >> >> Well, yes, that's right. But then you can simply pass in count+1. >> It works also if the caller knows he will only use 'count' bytes of the >> string. In this case +1 must be allocated. >> But that both is quite special. >> >> Currently, if the string is truncated, there is no null byte on windows. >> And there are a lot of uses of this method in the VM (via jio_snprintf). >> >> Should I use the internal bug number for the rollback-fix? 
>> >> How should we proceed, as I can't fix you internal code? >> >> Best regards, >> Goetz. >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Donnerstag, 6. November 2014 12:23 >> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >> Cc: Markus Grönlund >> Subject: Re: RFR (L): 8062370: Various minor code improvements >> >> On 6/11/2014 8:43 PM, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>> yes, windows does not null terminate if there is an overflow. >>> Obviously there are overflows, and they now see one less character. I >>> think this should be fixed where jio_vsnprintf is called. Having >>> non-null terminated strings isn't nice. >> >> I think it depends on what you consider an overflow. If the buffer is >> already null terminated and you pass in a count that covers up to the >> location before the null then there is no problem - except now the logic >> will introduce a second null in place of the last character. >> >>> But for now I will roll back this single change. I'll send a RFR soon. >>> >>> Where did you see the problem? >> >> It was in our closed code so I can't go into details. We have a non-public >> bug number: 8063089 >> >> Thanks, >> David >> >>> >>> Best regards, >>> Goetz. >>> >>> >>> >>> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Donnerstag, 6. November 2014 11:30 >>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>> Cc: Markus Grönlund >>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>> >>> On 6/11/2014 8:17 PM, Lindenmaier, Goetz wrote: >>>> Thanks David, I'll have a look. >>> >>> It seems that windows vsnprintf may not null-terminate the string - >>> which I think is what your patch was trying to address. But if we have >>> existing code that works with that then the fix is now overwriting the >>> last character. 
I can't quite see how to handle this in a cross >>> platform manner, but in the immediate term we should probably revert >>> that part of the changeset. >>> >>> David >>> >>>> Best regards, >>>> Goetz. >>>> >>>> -----Original Message----- >>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>> Sent: Donnerstag, 6. November 2014 11:09 >>>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>>> Cc: Markus Grönlund >>>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>>> >>>> Hi Goetz, >>>> >>>> This change has introduced a bug: >>>> >>>> - return vsnprintf(str, count, fmt, args); >>>> + >>>> + int result = vsnprintf(str, count, fmt, args); if ((result > 0 && >>>> + (size_t)result >= count) || result == -1) { >>>> + str[count - 1] = '\0'; >>>> + result = -1; >>>> + } >>>> + >>>> + return result; >>>> >>>> some strings are getting their last character truncated on Windows. >>>> >>>> David >>>> >>>> On 5/11/2014 6:16 PM, Lindenmaier, Goetz wrote: >>>>> Hi David, >>>>> >>>>> thanks for looking at the change! I fixed the issue in a new >>>>> webrev: >>>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.01/ >>>>> >>>>> Best regards, >>>>> Goetz. >>>>> >>>>> -----Original Message----- >>>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>>> Sent: Mittwoch, 5. November 2014 02:49 >>>>> To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net >>>>> Subject: Re: RFR (L): 8062370: Various minor code improvements >>>>> >>>>> Hi Goetz, >>>>> >>>>> The only issue I see is in: >>>>> >>>>> src/share/vm/runtime/globals.cpp >>>>> >>>>> where you replaced NEW_C_HEAP_ARRAY with os::strdup. To keep the >>>>> "abort on OOM" semantics of NEW_C_HEAP_ARRAY you need to use >> os::strdup_check_oom. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 30/10/2014 6:28 PM, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> this change contains a row of minor code improvements we did to >>>>>> fulfil our internal quality requirements. 
We would like to share >>>>>> these with openJDK. >>>>>> >>>>>> Please review and test this change. I please need a sponsor. >>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8062370/webrev.00/ >>>>>> https://bugs.openjdk.java.net/browse/JDK-8062370 >>>>>> >>>>>> We tested this on windows 64, linux x86_64, mac, solaris sparc >>>>>> 32+64 bit and, of course, the ppc platforms. >>>>>> >>>>>> >>>>>> Some details: >>>>>> >>>>>> CONST64(0x8000000000000000) is wrong, as 0x8... is positive, and thus >> not representable as i64 what is used in the CONST64 macro. This change >> adapts UCONST64 to use ui64, and the usages of these macros where necessary. >>>>>> >>>>>> We add some more strncpy uses. Also, we fix strncpy on windows. >> There, strncpy does not write a \0 into the last byte if the copied string >> is too long. >>>>>> >>>>>> We add some missing memory frees and some closing of files. >>>>>> >>>>>> jio_vsnprintf() works differently on windows and linux. This change >> adapts this to show the same behaviour on all platforms. See java.cpp. >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Goetz >>>>>> >>>>>> >>>>>> >>>>>> >> From david.holmes at oracle.com Thu Nov 13 04:09:45 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 14:09:45 +1000 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: <77B724DF-8174-4E0D-B86C-7A320DEFB4A2@lnu.se> References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com> <545ACD53.3050108@oracle.com> <77B724DF-8174-4E0D-B86C-7A320DEFB4A2@lnu.se> Message-ID: <54642F09.50507@oracle.com> Hi Erik, Sorry for the delay in getting back to this. I'm fine with the change in this form at this time. 
Thanks, David On 8/11/2014 12:43 AM, Erik Österlund wrote: > Hi David, > > Full webrev of the proposed push: > http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04/ > > Incremental webrev of the proposed push: > http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04.incremental/ > > Note that the #define is still left in this change as Paul wanted this pushed before we push my solution for getting rid of it (using templates and inheritance). > > Thanks, > Erik > > On 06 Nov 2014, at 02:22, David Holmes wrote: > >> I'd like to see a final webrev please! I've lost track of this a bit. >> >> Thanks, >> David >> >> On 6/11/2014 8:08 AM, Erik Österlund wrote: >>> Okay, thanks a lot for the reviews Paul and Kim. :) >>> Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go. >>> >>> Thanks, >>> >>> /Erik >>> >>> On 05 Nov 2014, at 22:10, Paul Hohensee > wrote: >>> >>> I don't need a new webrev either, so afaic you're good to go. >>> >>> Thanks, >>> >>> Paul >>> >>> >>> On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett > wrote: >>> On Nov 3, 2014, at 7:21 PM, Erik Österlund > wrote: >>>> >>>>> [legacy issue, not in changed code] >>>>> I think the comment for generate_atomic_cmpxchg_long() is wrong in the >>>>> return value; shouldn't it be returning a jlong? Probably a C-Y bug. >>>> >>>> No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct. >>> >>> The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is, >>> >>> 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value, >>> >>> should be "// Support for jlong ..." 
>>> >>>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>>> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) >>>>> >>>>> Why is the new byte version using "q" for exchange_value, where the >>>>> existing int and long versions use "r"? [There might be a good >>>>> reason, and this is just my rusty assembler skills showing.] >>>> >>>> With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r"). >>> >>> Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping. >>> >>>> The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically. >>> >>> At least I got that part. >>> >>>> The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :) >>> >>> Indeed. >>> >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>>> src/os_cpu/windows_x86/vm/os_windows_x86.hpp >>>>> >>>>> The windows port seems to only support specialized cmpxchgb when >>>>> defined(AMD64), while the BSD/Linux variants don't have that >>>>> restriction. Why this inconsistency? Or am I missing something, >>>>> which seems entirely possible in this tangle. >>>> >>>> If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub. >>>> Then there is another MSVC assembly variant for #ifndef AMD64. >>>> This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things. >>> >>> Oops, you are correct. >>> >>>> Do you want a new webrev? (just polished comments and renamed the #define as per request) >>> >>> I don't think I need one, but others might want a closer to final version. 
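[Editorial illustration, not part of the original message.] The register-constraint point discussed above ("q" for the new value, "a" for the compare value because cmpxchgb compares against al) can be sketched as below. This is a minimal sketch assuming a GCC- or Clang-compatible compiler; cmpxchg_byte is a hypothetical free function, not the HotSpot Atomic::cmpxchg, it always emits the lock prefix (no multiprocessor check), and the non-x86 branch is only a fallback added here so the sketch compiles elsewhere.

```cpp
#include <cstdint>

// Hypothetical sketch, NOT the HotSpot source.
// Atomically: if (*dest == compare_value) *dest = exchange_value.
// Returns the value found in *dest before the operation.
inline int8_t cmpxchg_byte(int8_t exchange_value, volatile int8_t* dest,
                           int8_t compare_value) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  // "q" asks for a register whose low byte is addressable, so that %1
  // can be printed as a byte register; "a" pins compare_value to al,
  // which cmpxchgb uses implicitly and where it leaves the old value.
  __asm__ volatile("lock cmpxchgb %1,(%3)"
                   : "=a"(exchange_value)
                   : "q"(exchange_value), "a"(compare_value), "r"(dest)
                   : "cc", "memory");
  return exchange_value;
#else
  // Assumed fallback for other targets: the GCC/Clang atomic builtin.
  int8_t expected = compare_value;
  __atomic_compare_exchange_n(dest, &expected, exchange_value, false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
#endif
}
```

A caller detects success by comparing the returned value with compare_value, in the same way as for the wider cmpxchg variants.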
>>> >>> >>> > From david.holmes at oracle.com Thu Nov 13 04:20:20 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 14:20:20 +1000 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <546354EB.80606@oracle.com> References: <546354EB.80606@oracle.com> Message-ID: <54643184.9070500@oracle.com> Hi Stefan, Please ensure the main-line platforms build okay with PCH disabled. Thanks, David On 12/11/2014 10:39 PM, Stefan Karlsson wrote: > Hi all, > > Please, review the following two cleanup patches to move the conditional > include lines to the end of the include lists. The patches also add > missing macros.hpp includes, that are needed when the INCLUDE_* defines > are used. There are also a few minor cleanups near some usages of > INCLUDE_ALL_GCS. > > http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix INCLUDE_CDS > http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix > INCLUDE_ALL_GCS > > Some background to the sort order, the INCLUDE_* defines and macros.hpp: > > The include lines where inserted and sorted in the includeDB removal > patch. As part of that patch all includes that were guarded by #ifndef > were put at the end of the include list. See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - > 6989984: Use standard include model for Hospot > > Later the selective inclusion of parts like, for example, CDS and the > non-serial GCs were changed and now we also rely on the defines present > in macros.hpp. With that change it's now important that all conditional > includes are added after the inclusion of macros.hpp. 
See: > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - > 7189254: Change makefiles for more flexibility to override defaults > > thanks, > StefanK From david.holmes at oracle.com Thu Nov 13 04:28:22 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 14:28:22 +1000 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <5463F592.6060000@oracle.com> References: <54623F23.4000404@redhat.com> <5463F592.6060000@oracle.com> Message-ID: <54643366.8080105@oracle.com> On 13/11/2014 10:04 AM, Dean Long wrote: > And adding build-infra-dev and jdk9-dev wouldn't hurt either. Let's not get carried away for what is quite a trivial copying of existing platform specific patterns :) build-dev (not build-infra-dev) would be okay. jdk9-dev isn't needed if already on hotspot-dev, build-dev and sound-dev. These changes seem quite trivially fine to me. David H. > dl > > On 11/12/2014 3:48 PM, Christian Thalinger wrote: >> Looks good to me. Maybe we should CC sound-dev (if that's the correct >> list)? >> >>> On Nov 11, 2014, at 8:53 AM, Andrew Haley wrote: >>> >>> The changes for the /jdk subdirectory. >>> >>> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594/ >>> >>> Andrew. > From david.holmes at oracle.com Thu Nov 13 04:54:12 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 14:54:12 +1000 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546228E3.8030207@oracle.com> References: <546228E3.8030207@oracle.com> Message-ID: <54643974.9050805@oracle.com> Hi Mikael, Without knowing the details it is hard to determine the correctness of this. What you describe below sounds reasonable - but what about the opposite problem in the new code: what if you read an old top() then a new timestamp, before top() is updated? Will that work correctly or will the region between the old-top and new-top be missed? Cheers, David H. 
On 12/11/2014 1:18 AM, Mikael Gerdin wrote: > Hi all, > > I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope > of getting some extra feedback from our resident concurrency experts. > > Please review this subtle change to the order in which we read fields in > G1OffsetTableContigSpace::saved_mark_word, original included here for > reference: > 1003 > 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { > 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); > 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); > 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) > 1008 return top(); > 1009 else > 1010 return Space::saved_mark_word(); > 1011 } > 1012 > > When getting a new gc alloc region several stores are performed where > store ordering needs to be enforced and several synchronization points > occur. > [write path] > ST(_saved_mark_word) > #StoreStore > ST(_gc_time_stamp) > ST(_top) // satisfying alloc request > #StoreStore > ST(_alloc_region) // publishing to other gc workers > #MonitorEnter > ST(_top) // potential further allocations > #MonitorExit > #MonitorEnter > ST(_top) // potential further allocations > #MonitorExit > > When we inspect a region during remembered set scanning we need to > ensure that we never read memory which have been allocated by a GC > worker thread for the purpose of copying objects into. > The way this works is that a time stamp field is supposed to signal to a > scanning thread that it should look at addresses below _top if the time > stamp is old or addresses below _saved_mark_word if the time stamp is > current. > > The current code does (as seen above) > [read path] > LD(_gc_time_stamp) > LD(_top) > or (depending on time stamp) > LD(_saved_mark_word) > > Because these values are written to without full mutual exclusion we > need to be very careful about the order in which we read these values, > and this is where I argue that the current code is incorrect. 
> In order to observe a consistent view of the ordered stores in the > [write path] above we need to load the values in the reverse order they > were written, with proper #LoadLoad ordering enforced. > > The problem which we've observed here is that after we've read the time > stamp as below the heap time stamp the top pointer can be updated by a > GC worker allocating objects into this region. To make sure that the top > value we see is in fact valid we must read it before we read the time > stamp which determines which value we should return from the > saved_mark_word function. > > My suggested fix is to load _top first and enforce #LoadLoad ordering > enforced: > HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { > G1CollectedHeap* g1h = G1CollectedHeap::heap(); > assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); > HeapWord* local_top = top(); > OrderAccess::loadload(); > if (_gc_time_stamp < g1h->get_gc_time_stamp()) { > return local_top; > } else { > return Space::saved_mark_word(); > } > } > > I've successfully reproduced the crash with the original code by adding > some random sleep calls between the load of the time stamp and the load > of top so I'm fairly certain that this resolves the issue. I've also > verified that the fix I'm proposing does resolve the bug for the team > which encountered the issue, even if I can't reproduce that crash locally. > > I also plan to attempt design around some of the races in this code to > reduce its complexity, but for the sake of backporting the fix to 8u40 > I'd like to start with just adding the minimal fix. 
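[Editorial illustration, not part of the original message.] The read ordering proposed above (load _top first, then a load-load barrier, then the time stamp) can be sketched with standard C++ atomics. Region and this saved_mark_word are hypothetical stand-ins, not the G1 classes; an acquire fence is used in place of OrderAccess::loadload() on the assumption that it is at least as strong, since it orders earlier loads before later loads and stores. Only the reader side of the fix is modeled; the writer side and the remaining races mentioned in the message are not.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a heap region; NOT the HotSpot class.
struct Region {
  std::atomic<uintptr_t> top{0};            // current allocation top
  std::atomic<uintptr_t> saved_mark{0};     // limit recorded for scanning
  std::atomic<unsigned>  gc_time_stamp{0};  // region-local time stamp
};

// Reader side: capture top BEFORE inspecting the time stamp, so a top
// value published after the time stamp was observed is never returned.
inline uintptr_t saved_mark_word(const Region& r, unsigned heap_time_stamp) {
  uintptr_t local_top = r.top.load(std::memory_order_relaxed);
  // Stand-in for the load-load barrier in the proposed fix.
  std::atomic_thread_fence(std::memory_order_acquire);
  if (r.gc_time_stamp.load(std::memory_order_relaxed) < heap_time_stamp) {
    // Old time stamp: the top captured above is the scan limit.
    return local_top;
  }
  // Current time stamp: the region is a GC allocation region.
  return r.saved_mark.load(std::memory_order_relaxed);
}
```

The value of top that decides the scan limit is the one read before the time stamp check, which is the reordering the message argues for.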
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ > Testing: JPRT, local kitchensink (4 hours), gc test suite > > Thanks > /Mikael From erik.osterlund at lnu.se Thu Nov 13 08:09:23 2014 From: erik.osterlund at lnu.se (Erik Österlund) Date: Thu, 13 Nov 2014 08:09:23 +0000 Subject: RFR: 8058255: Native jbyte Atomic::cmpxchg for supported x86 platforms In-Reply-To: <54642F09.50507@oracle.com> References: <37B3D027-5B2E-417C-A679-D58AA250FCEF@lnu.se> <4CC8B7BA-1536-47A3-9CEF-069191E574B7@lnu.se> <47EB5B12-540E-45F7-8873-FA7BB015A8FE@oracle.com> <545ACD53.3050108@oracle.com> <77B724DF-8174-4E0D-B86C-7A320DEFB4A2@lnu.se> <54642F09.50507@oracle.com> Message-ID: <56B84BA3-56E9-4F99-AA6A-AE3D500B5CD1@lnu.se> Hi David, Thanks for the review. :) /Erik On 13 Nov 2014, at 05:09, David Holmes wrote: > Hi Erik, > > Sorry for the delay in getting back to this. > > I'm fine with the change in this form at this time. > > Thanks, > David > > On 8/11/2014 12:43 AM, Erik Österlund wrote: >> Hi David, >> >> Full webrev of the proposed push: >> http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04/ >> >> Incremental webrev of the proposed push: >> http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.04.incremental/ >> >> Note that the #define is still left in this change as Paul wanted this pushed before we push my solution for getting rid of it (using templates and inheritance). >> >> Thanks, >> Erik >> >> On 06 Nov 2014, at 02:22, David Holmes wrote: >> >>> I'd like to see a final webrev please! I've lost track of this a bit. >>> >>> Thanks, >>> David >>> >>> On 6/11/2014 8:08 AM, Erik Österlund wrote: >>>> Okay, thanks a lot for the reviews Paul and Kim. :) >>>> Kim can you confirm I'm good to go? Everything you mentioned is fixed and I'm ready to go. 
>>>> >>>> Thanks, >>>> >>>> /Erik >>>> >>>> On 05 Nov 2014, at 22:10, Paul Hohensee > wrote: >>>> >>>> I don't need a new webrev either, so afaic you're good to go. >>>> >>>> Thanks, >>>> >>>> Paul >>>> >>>> >>>> On Tue, Nov 4, 2014 at 1:15 PM, Kim Barrett > wrote: >>>> On Nov 3, 2014, at 7:21 PM, Erik Österlund > wrote: >>>>> >>>>>> [legacy issue, not in changed code] >>>>>> I think the comment for generate_atomic_cmpxchg_long() is wrong in the >>>>>> return value; shouldn't it be returning a jlong? Probably a C-Y bug. >>>>> >>>>> No generate_atomic_cmpxchg_long() is used for generating code stubs for jlong CAS. I.e. it returns the address of the generated stub rather than executing a CAS - hence the return type is correct. >>>> >>>> The comment that I'm complaining about is the one describing the operation being supported by the generator, whose return type should be jlong, just as the corresponding return type in the comment for the new cmpxchg_byte support is jbyte. That is, >>>> >>>> 623 // Support for jint atomic::atomic_cmpxchg_long(jlong exchange_value, >>>> >>>> should be "// Support for jlong ..." >>>> >>>>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>>>> 96 : "q" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp) >>>>>> >>>>>> Why is the new byte version using "q" for exchange_value, where the >>>>>> existing int and long versions use "r"? [There might be a good >>>>>> reason, and this is just my rusty assembler skills showing.] >>>>> >>>>> With the "q" constraint you select one of the 8-bit-addressable registers rax, rcx, rdx, rbx (as opposed to any register with "r"). >>>> >>>> Thanks for the explanation. I didn't remember that at all, and the documentation I skimmed yesterday wasn't helping. >>>> >>>>> The compare_value is assigned to eax using "a" which is also 8-bit-addressable (al). Also cmpxchgb needs it to be in al specifically. >>>> >>>> At least I got that part. 
>>>> >>>>> The former (allocating 8-bit-addressable registers) wasn't a concern for the other variants really, but here this is pretty important for the operands of cmpxchgb. :) >>>> >>>> Indeed. >>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> >>>>>> src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp >>>>>> src/os_cpu/windows_x86/vm/os_windows_x86.hpp >>>>>> >>>>>> The windows port seems to only support specialized cmpxchgb when >>>>>> defined(AMD64), while the BSD/Linux variants don't have that >>>>>> restriction. Why this inconsistency? Or am I missing something, >>>>>> which seems entirely possible in this tangle. >>>>> >>>>> If you look closely, you will see there are two definitions - one for AMD64 using a runtime-generated code stub. >>>>> Then there is another MSVC assembly variant for #ifndef AMD64. >>>>> This goes perfectly consistent with e.g. the jint cmpxchg for windows way of doing things. >>>> >>>> Oops, you are correct. >>>> >>>>> Do you want a new webrev? (just polished comments and renamed the #define as per request) >>>> >>>> I don't think I need one, but others might want a closer to final version. >>>> >>>> >>>> >> From tobias.hartmann at oracle.com Thu Nov 13 09:07:09 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 13 Nov 2014 10:07:09 +0100 Subject: [8u40] Backport RFR: 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is not compilable after 3 iterations' Message-ID: <546474BD.2080201@oracle.com> Hi, please review the following backport request for 8u40. 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is not compilable after 3 iterations' https://bugs.openjdk.java.net/browse/JDK-8056071 http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0d599246de33 The changes were pushed on Tuesday. Nightly testing showed no problems. The changes apply cleanly to 8u40. 
Thanks, Tobias From goetz.lindenmaier at sap.com Thu Nov 13 09:25:49 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 13 Nov 2014 09:25:49 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <54638FA0.8040204@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF270B0@DEWDFEMB12A.global.corp.sap> Hi Andrew, I retested the change, works fine. Thanks for the fixes! You missed to fix the define(AARCH64)/defined(TARGET_ARCH_aarch64) in c1_LIR.cpp. You only fixed the header file. Nothing more from my side! Best regards, Goetz. -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Mittwoch, 12. November 2014 17:50 To: Lindenmaier, Goetz; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: AARCH64: 8064611: Changes to HotSpot shared code On 11/12/2014 11:40 AM, Lindenmaier, Goetz wrote: > Hi Andrew, Hi, Thank you for your comment. I have prepared a new webrev at http://cr.openjdk.java.net/~aph/aarch64-8064611-1/ which I hope addresses everything you mentioned. I haven't re-ordered any of the lists of processors because I think this is a separate issue. Andrew. 
From aph at redhat.com Thu Nov 13 09:31:44 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 13 Nov 2014 09:31:44 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF270B0@DEWDFEMB12A.global.corp.sap> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF270B0@DEWDFEMB12A.global.corp.sap> Message-ID: <54647A80.60108@redhat.com> On 13/11/14 09:25, Lindenmaier, Goetz wrote: > You missed to fix the define(AARCH64)/defined(TARGET_ARCH_aarch64) in c1_LIR.cpp. Oh, rats. Another webrev coming up. Andrew. From aph at redhat.com Thu Nov 13 09:31:49 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 13 Nov 2014 09:31:49 +0000 Subject: RFR: AARCH64: Changes to HotSpot shared code In-Reply-To: <5463C1CE.9040301@oracle.com> References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> Message-ID: <54647A85.6020203@redhat.com> On 12/11/14 20:23, Dean Long wrote: > On 11/11/2014 11:02 AM, Andrew Haley wrote: >> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch >> >> Everything except cpu/ and os_cpu/. >> >> Most of this is obvious and trivial, with a few exceptions. >> >> In memory/metaspace.cpp, we allocated the memory for metadata in a >> different way. This is because we want to be able to decode and >> encode compressed metadata pointers with a single instruction, and we >> can always do that iff the base address is of a particular form. >> >> In opto/, we have made some changes in order to be able to use AArch64 >> store release instructions for volatile field stores. These don't >> require leading or trailing barriers. We have tried several times to >> do this without changing shared code, but it is impossible with the >> current back-end interface. 
> Is this something ppc64 can also take advantage of? I hope Vladimir can > suggest > a more flexible way to do this, perhaps with a runtime flag. Perhaps so, but as far as I'm aware AArch64 is the only CPU with exactly these semantics. From my point of view, it would be ideal if we simply emitted volatile store and volatile load as nodes and let the back end handle them. But if we do that we lose the opportunity to coalesce barriers in C2 optimization. Hmmm.... :-) >> In several places a release store is used where the AArch64 memory >> model makes it unnecessary. From earlier emails on this list we >> discovered that the only architecture which requires this release >> store is IA64, and OpenJDK does not support it anyway. We should >> perhaps look at re-engineering the way that memory barriers and memory >> accesses are handled in HotSpot with a view to pushing all these >> architecture-dependent assumptions out to the back ends. > I agree. More comments below. >> Andrew. > c1_Canonicalizer.cpp > Can this be handled in the back-end? I imagine other platforms, > such as x86, have similar limitations. It certainly could be. Maybe pd_valid_shift_count() ? But I'm striving not to touch any other ports. > c1_LIR.cpp > It looks like you need a temp for convert because your backend > because you're checking the FPSR. > What happens if you ignore the FPSR, do you get a wrong result? I've looked for a while, and I'm sorry but I don't understand which hunk this refers to. > c1_LinearScan.cpp > I'm not familiar with what the changed code is doing. Can you > explain why it applies to x86 and aarch64? It certainly was at the time. I'll investigate to see if this is still needed. > c1_Runtime1.cpp > This will break our closed port that NOP instructions for > patching. Ah, interesting. I spent quite a lot of time kicking around ideas for C1 patching, but (to my surprise) deoptimizing instead didn't seem to have significant adverse effect. 
> How about moving your deopt-instead-of-patch support > into Runtime1::patch_code() and enable it with a read-only > platform-specific developer runtime flag > (see INTPRESSURE for example)? Okay. I'll have a look at that. > compiledIC.hpp > You should be able to use set_inst_mark()/cbuf.insts_mark() to set > and retrieve the mark address. Okay. > arguments.cpp > I wish there was a way to fix ReservedCodeCacheSize in the back-end. Indeed. Thanks, Andrew. From goetz.lindenmaier at sap.com Thu Nov 13 09:38:12 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 13 Nov 2014 09:38:12 +0000 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF27104@DEWDFEMB12A.global.corp.sap> Hi, would somebody volunteer and look at this change? I'd appreciate it a lot! Best regards, Goetz. -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz Sent: Montag, 10. November 2014 15:57 To: hotspot-dev at openjdk.java.net Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. Hi, I need to improve a row of things around compressed oops heap handling to achieve good performance on ppc. I prepared a first webrev for review: http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ A detailed technical description of the change is in the webrev and according bug. If requested, I will split the change into parts with more respective less impact on non-ppc platforms. The change is derived from well-tested code in our VM. Originally it was crafted to require the least changes of VM coding, I changed it to be better streamlined with the VM. 
I tested this change to deliver heaps at about the same addresses as before. Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap in a better compressed oops mode is found, though. I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. Best regards, Goetz. From aph at redhat.com Thu Nov 13 09:45:41 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 13 Nov 2014 09:45:41 +0000 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: References: <546348F8.9060900@redhat.com> Message-ID: <54647DC5.9010102@redhat.com> On 12/11/14 23:51, Christian Thalinger wrote: > The new jvm.cfg files should only have a copyright year of 2014. Why, exactly? They have been around for a while. Andrew. From mikael.gerdin at oracle.com Thu Nov 13 10:19:43 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 13 Nov 2014 11:19:43 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <54643974.9050805@oracle.com> References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> Message-ID: <546485BF.30802@oracle.com> Hi David, On 2014-11-13 05:54, David Holmes wrote: > Hi Mikael, > > Without knowing the details it is hard to determine the correctness of > this. What you describe below sounds reasonable - but what about the > opposite problem in the new code: what if you read an old top() then a > new timestamp, before top() is updated? Will that work correctly or will > the region between the old-top and new-top be missed? I realize that not everyone is up to speed on the specifics of this code, but I appreciate you feedback on the general reasoning. Reading an old _top value is safe, and in fact we must enforce that the only _top values we ever return from this functions were set before the GC occurred. 
Reading a too recent _top value is the cause of the crash in this bug, since if this function returns a recently updated _top value that is because another GC worker has allocated into this region and is in the process of copying objects into it. The point of the timestamp value is to only return old values of _top and if the timestamp is current it should return another value. I've updated the webrev slightly due to off-list feedback that I should attempt to avoid reading the time stamp more than once (for the assert). I've also noticed that I messed up the indentation of a curly brace so I fixed that as well. Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ Incremental webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/ /Mikael > > Cheers, > David H. > > On 12/11/2014 1:18 AM, Mikael Gerdin wrote: >> Hi all, >> >> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope >> of getting some extra feedback from our resident concurrency experts. >> >> Please review this subtle change to the order in which we read fields in >> G1OffsetTableContigSpace::saved_mark_word, original included here for >> reference: >> 1003 >> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >> 1008 return top(); >> 1009 else >> 1010 return Space::saved_mark_word(); >> 1011 } >> 1012 >> >> When getting a new gc alloc region several stores are performed where >> store ordering needs to be enforced and several synchronization points >> occur.
>> [write path] >> ST(_saved_mark_word) >> #StoreStore >> ST(_gc_time_stamp) >> ST(_top) // satisfying alloc request >> #StoreStore >> ST(_alloc_region) // publishing to other gc workers >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> >> When we inspect a region during remembered set scanning we need to >> ensure that we never read memory which have been allocated by a GC >> worker thread for the purpose of copying objects into. >> The way this works is that a time stamp field is supposed to signal to a >> scanning thread that it should look at addresses below _top if the time >> stamp is old or addresses below _saved_mark_word if the time stamp is >> current. >> >> The current code does (as seen above) >> [read path] >> LD(_gc_time_stamp) >> LD(_top) >> or (depending on time stamp) >> LD(_saved_mark_word) >> >> Because these values are written to without full mutual exclusion we >> need to be very careful about the order in which we read these values, >> and this is where I argue that the current code is incorrect. >> In order to observe a consistent view of the ordered stores in the >> [write path] above we need to load the values in the reverse order they >> were written, with proper #LoadLoad ordering enforced. >> >> The problem which we've observed here is that after we've read the time >> stamp as below the heap time stamp the top pointer can be updated by a >> GC worker allocating objects into this region. To make sure that the top >> value we see is in fact valid we must read it before we read the time >> stamp which determines which value we should return from the >> saved_mark_word function. 
>> >> My suggested fix is to load _top first and enforce #LoadLoad ordering >> enforced: >> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >> HeapWord* local_top = top(); >> OrderAccess::loadload(); >> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >> return local_top; >> } else { >> return Space::saved_mark_word(); >> } >> } >> >> I've successfully reproduced the crash with the original code by adding >> some random sleep calls between the load of the time stamp and the load >> of top so I'm fairly certain that this resolves the issue. I've also >> verified that the fix I'm proposing does resolve the bug for the team >> which encountered the issue, even if I can't reproduce that crash >> locally. >> >> I also plan to attempt design around some of the races in this code to >> reduce its complexity, but for the sake of backporting the fix to 8u40 >> I'd like to start with just adding the minimal fix. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >> Testing: JPRT, local kitchensink (4 hours), gc test suite >> >> Thanks >> /Mikael From goetz.lindenmaier at sap.com Thu Nov 13 10:20:37 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 13 Nov 2014 10:20:37 +0000 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning Message-ID: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> Hi, please review, test and sponsor this tiny change. It fixes the debug build in the gc repository. https://bugs.openjdk.java.net/browse/JDK-8064786 http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ Best regards, Goetz. 
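[Editorial sketch, not part of the thread.] The 8064786 request above concerns a build that broke once -Wreturn-type was turned on. The webrev contents are not reproduced in this archive, so the following is only a generic illustration of that class of warning, not the actual change. It assumes a hypothetical helper should_not_reach_here() that aborts but is not declared noreturn, so the compiler cannot prove that control never falls off the end of a non-void function. All names here (BuildKind, build_kind_name, should_not_reach_here) are invented for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical enum, used only to give the switch something to dispatch on.
enum class BuildKind { Product, FastDebug, Debug };

// Hypothetical stand-in for a "should not reach here" check. It always
// aborts, but it is deliberately NOT marked [[noreturn]], so the compiler
// has to assume that it may return to the caller.
static void should_not_reach_here(const char* file, int line) {
  std::fprintf(stderr, "should not reach here: %s:%d\n", file, line);
  std::abort();
}

const char* build_kind_name(BuildKind kind) {
  switch (kind) {
    case BuildKind::Product:   return "product";
    case BuildKind::FastDebug: return "fastdebug";
    case BuildKind::Debug:     return "debug";
  }
  should_not_reach_here(__FILE__, __LINE__);
  // Without this return, GCC with -Wreturn-type reports that control
  // reaches the end of a non-void function. The statement is never
  // executed at run time because the helper aborts first.
  return nullptr;
}
```

Two ways to keep such a function warning-free are to add the unreachable return, as in the sketch, or to drop the not-reached call so that every path visibly ends in a return. Which of these the real patch chose is not stated in the messages quoted here.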
From stefan.karlsson at oracle.com Thu Nov 13 10:34:41 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 13 Nov 2014 11:34:41 +0100 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> Message-ID: <54648941.7070108@oracle.com> On 2014-11-13 11:20, Lindenmaier, Goetz wrote: > Hi, > > please review, test and sponsor this tiny change. It fixes the debug build in the gc repository. > https://bugs.openjdk.java.net/browse/JDK-8064786 > http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ Looks good. Thanks for fixing. Another approach would be to just remove the ShouldNotReachHere() lines. I'll push when we get another review. thanks, StefanK > > Best regards, > Goetz. From aph at redhat.com Thu Nov 13 11:31:20 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 13 Nov 2014 11:31:20 +0000 Subject: RFR: AARCH64: Changes to HotSpot shared code In-Reply-To: <54647A85.6020203@redhat.com> References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> <54647A85.6020203@redhat.com> Message-ID: <54649688.7040401@redhat.com> On 11/13/2014 09:31 AM, Andrew Haley wrote: >> > How about moving your deopt-instead-of-patch support >> > into Runtime1::patch_code() and enable it with a read-only >> > platform-specific developer runtime flag >> > (see INTPRESSURE for example)? > Okay. I'll have a look at that. Does this mean that I'll need to add a flag to all back ends? Andrew. 
From david.holmes at oracle.com Thu Nov 13 11:50:27 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 21:50:27 +1000 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546485BF.30802@oracle.com> References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com> Message-ID: <54649B03.5010900@oracle.com> On 13/11/2014 8:19 PM, Mikael Gerdin wrote: > Hi David, > > On 2014-11-13 05:54, David Holmes wrote: >> Hi Mikael, >> >> Without knowing the details it is hard to determine the correctness of >> this. What you describe below sounds reasonable - but what about the >> opposite problem in the new code: what if you read an old top() then a >> new timestamp, before top() is updated? Will that work correctly or will >> the region between the old-top and new-top be missed? > > I realize that not everyone is up to speed on the specifics of this > code, but I appreciate you feedback on the general reasoning. > > Reading an old _top value is safe, and in fact we must enforce that the > only _top values we ever return from this functions were set before the > GC occurred. > > Reading a too recent _top value is the cause of the crash in this bug, > since if this function returns a a recently updated _top value that is > because another GC worker has allocated into this region and is in the > process of copying objects into it. The point of the timestamp value is > to only return old values of _top and if the timestamp is current it > should return another value. Ok. I assumed there was some kind of monotonic progression that made this okay but as I said I'm not at all familiar with the code. > I've updated the webrev slightly due to off-list feedback that I should > attempt to avoid reading the time stamp more than once (for the assert). I noticed that too but I was looking at the reading of g1h->get_gc_time_stamp() twice, not the reading of _gc_time_stamp. 
Shouldn't you read them both once eg: HeapWord* local_top = top(); OrderAccess::loadload(); const unsigned local_time_stamp = _gc_time_stamp; const unsigned gc_time_stamp = g1h->get_gc_time_stamp(); assert(local_time_stamp <= gc_time_stamp, "invariant" ); if (local_time_stamp < gc_time_stamp) { return local_top; } else { return Space::saved_mark_word(); } ? David ----- > I've also noticed that I messed up the indentation of a curly brace so I > fixed that as well. > > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ > > Incremental webrev: > http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/ > > /Mikael > >> >> Cheers, >> David H. >> >> On 12/11/2014 1:18 AM, Mikael Gerdin wrote: >>> Hi all, >>> >>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope >>> of getting some extra feedback from our resident concurrency experts. >>> >>> Please review this subtle change to the order in which we read fields in >>> G1OffsetTableContigSpace::saved_mark_word, original included here for >>> reference: >>> 1003 >>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>> "invariant" ); >>> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >>> 1008 return top(); >>> 1009 else >>> 1010 return Space::saved_mark_word(); >>> 1011 } >>> 1012 >>> >>> When getting a new gc alloc region several stores are performed where >>> store ordering needs to be enforced and several synchronization points >>> occur. 
>>> [write path] >>> ST(_saved_mark_word) >>> #StoreStore >>> ST(_gc_time_stamp) >>> ST(_top) // satisfying alloc request >>> #StoreStore >>> ST(_alloc_region) // publishing to other gc workers >>> #MonitorEnter >>> ST(_top) // potential further allocations >>> #MonitorExit >>> #MonitorEnter >>> ST(_top) // potential further allocations >>> #MonitorExit >>> >>> When we inspect a region during remembered set scanning we need to >>> ensure that we never read memory which have been allocated by a GC >>> worker thread for the purpose of copying objects into. >>> The way this works is that a time stamp field is supposed to signal to a >>> scanning thread that it should look at addresses below _top if the time >>> stamp is old or addresses below _saved_mark_word if the time stamp is >>> current. >>> >>> The current code does (as seen above) >>> [read path] >>> LD(_gc_time_stamp) >>> LD(_top) >>> or (depending on time stamp) >>> LD(_saved_mark_word) >>> >>> Because these values are written to without full mutual exclusion we >>> need to be very careful about the order in which we read these values, >>> and this is where I argue that the current code is incorrect. >>> In order to observe a consistent view of the ordered stores in the >>> [write path] above we need to load the values in the reverse order they >>> were written, with proper #LoadLoad ordering enforced. >>> >>> The problem which we've observed here is that after we've read the time >>> stamp as below the heap time stamp the top pointer can be updated by a >>> GC worker allocating objects into this region. To make sure that the top >>> value we see is in fact valid we must read it before we read the time >>> stamp which determines which value we should return from the >>> saved_mark_word function. 
>>> >>> My suggested fix is to load _top first and enforce #LoadLoad ordering >>> enforced: >>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >>> HeapWord* local_top = top(); >>> OrderAccess::loadload(); >>> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >>> return local_top; >>> } else { >>> return Space::saved_mark_word(); >>> } >>> } >>> >>> I've successfully reproduced the crash with the original code by adding >>> some random sleep calls between the load of the time stamp and the load >>> of top so I'm fairly certain that this resolves the issue. I've also >>> verified that the fix I'm proposing does resolve the bug for the team >>> which encountered the issue, even if I can't reproduce that crash >>> locally. >>> >>> I also plan to attempt design around some of the races in this code to >>> reduce its complexity, but for the sake of backporting the fix to 8u40 >>> I'd like to start with just adding the minimal fix. >>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >>> Testing: JPRT, local kitchensink (4 hours), gc test suite >>> >>> Thanks >>> /Mikael From thomas.schatzl at oracle.com Thu Nov 13 11:52:46 2014 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 13 Nov 2014 12:52:46 +0100 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning In-Reply-To: <54648941.7070108@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> Message-ID: <1415879566.3449.1.camel@oracle.com> Hi, On Thu, 2014-11-13 at 11:34 +0100, Stefan Karlsson wrote: > On 2014-11-13 11:20, Lindenmaier, Goetz wrote: > > Hi, > > > > please review, test and sponsor this tiny change. It fixes the debug build in the gc repository. 
> > https://bugs.openjdk.java.net/browse/JDK-8064786 > > http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ > > Looks good. Thanks for fixing. Another approach would be to just remove > the ShouldNotReachHere() lines. > > I'll push when we get another review. Looks good to me. Thomas From david.holmes at oracle.com Thu Nov 13 11:53:33 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 21:53:33 +1000 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning In-Reply-To: <54648941.7070108@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> Message-ID: <54649BBD.2040009@oracle.com> On 13/11/2014 8:34 PM, Stefan Karlsson wrote: > On 2014-11-13 11:20, Lindenmaier, Goetz wrote: >> Hi, >> >> please review, test and sponsor this tiny change. It fixes the debug >> build in the gc repository. >> https://bugs.openjdk.java.net/browse/JDK-8064786 >> http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ > > Looks good. Thanks for fixing. Another approach would be to just remove > the ShouldNotReachHere() lines. > > I'll push when we get another review. Reviewed. But I'm concerned as to how this was not detected with the original fix. I assume we don't build with a compiler that complains about this code? Thanks, David > thanks, > StefanK > >> >> Best regards, >> Goetz. > From goetz.lindenmaier at sap.com Thu Nov 13 11:55:43 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 13 Nov 2014 11:55:43 +0000 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning In-Reply-To: <54649BBD.2040009@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> <54649BBD.2040009@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF271D4@DEWDFEMB12A.global.corp.sap> Maybe you don't do a debug build. 
Fastdbg and opt pass. I verified that it's not my old compiler, gcc 4.3.4 fails, too. Best regards, Goetz -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Donnerstag, 13. November 2014 12:54 To: Stefan Karlsson; Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Subject: Re: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning On 13/11/2014 8:34 PM, Stefan Karlsson wrote: > On 2014-11-13 11:20, Lindenmaier, Goetz wrote: >> Hi, >> >> please review, test and sponsor this tiny change. It fixes the debug >> build in the gc repository. >> https://bugs.openjdk.java.net/browse/JDK-8064786 >> http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ > > Looks good. Thanks for fixing. Another approach would be to just remove > the ShouldNotReachHere() lines. > > I'll push when we get another review. Reviewed. But I'm concerned as to how this was not detected with the original fix. I assume we don't build with a compiler that complains about this code? Thanks, David > thanks, > StefanK > >> >> Best regards, >> Goetz. > From david.holmes at oracle.com Thu Nov 13 11:59:05 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 13 Nov 2014 21:59:05 +1000 Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF271D4@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> <54649BBD.2040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF271D4@DEWDFEMB12A.global.corp.sap> Message-ID: <54649D09.7040406@oracle.com> On 13/11/2014 9:55 PM, Lindenmaier, Goetz wrote: > Maybe you don't do a debug build. Fastdbg and opt pass. Ah - no we don't do debug, just product and fastdebug. too many variations. 
But I'm concerned because this suggests there are more differences between debug and fastdebug than just optimization levels (which I thought was the difference). Thanks, David > I verified that it's not my old compiler, gcc 4.3.4 fails, too. > > Best regards, > Goetz > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Donnerstag, 13. November 2014 12:54 > To: Stefan Karlsson; Lindenmaier, Goetz; hotspot-dev at openjdk.java.net > Subject: Re: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning > > On 13/11/2014 8:34 PM, Stefan Karlsson wrote: >> On 2014-11-13 11:20, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> please review, test and sponsor this tiny change. It fixes the debug >>> build in the gc repository. >>> https://bugs.openjdk.java.net/browse/JDK-8064786 >>> http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ >> >> Looks good. Thanks for fixing. Another approach would be to just remove >> the ShouldNotReachHere() lines. >> >> I'll push when we get another review. > > Reviewed. > > But I'm concerned as to how this was not detected with the original fix. > I assume we don't build with a compiler that complains about this code? > > Thanks, > David > >> thanks, >> StefanK >> >>> >>> Best regards, >>> Goetz. 
>> From mikael.gerdin at oracle.com Thu Nov 13 12:08:58 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 13 Nov 2014 13:08:58 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <54649B03.5010900@oracle.com> References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com> <54649B03.5010900@oracle.com> Message-ID: <54649F5A.3060904@oracle.com> On 2014-11-13 12:50, David Holmes wrote: > On 13/11/2014 8:19 PM, Mikael Gerdin wrote: >> Hi David, >> >> On 2014-11-13 05:54, David Holmes wrote: >>> Hi Mikael, >>> >>> Without knowing the details it is hard to determine the correctness of >>> this. What you describe below sounds reasonable - but what about the >>> opposite problem in the new code: what if you read an old top() then a >>> new timestamp, before top() is updated? Will that work correctly or will >>> the region between the old-top and new-top be missed? >> >> I realize that not everyone is up to speed on the specifics of this >> code, but I appreciate you feedback on the general reasoning. >> >> Reading an old _top value is safe, and in fact we must enforce that the >> only _top values we ever return from this functions were set before the >> GC occurred. >> >> Reading a too recent _top value is the cause of the crash in this bug, >> since if this function returns a a recently updated _top value that is >> because another GC worker has allocated into this region and is in the >> process of copying objects into it. The point of the timestamp value is >> to only return old values of _top and if the timestamp is current it >> should return another value. > > Ok. I assumed there was some kind of monotonic progression that made > this okay but as I said I'm not at all familiar with the code. > >> I've updated the webrev slightly due to off-list feedback that I should >> attempt to avoid reading the time stamp more than once (for the assert). 
> > I noticed that too but I was looking at the reading of > g1h->get_gc_time_stamp() twice, not the reading of _gc_time_stamp. > Shouldn't you read them both once eg: > > HeapWord* local_top = top(); > OrderAccess::loadload(); > const unsigned local_time_stamp = _gc_time_stamp; > const unsigned gc_time_stamp = g1h->get_gc_time_stamp(); > assert(local_time_stamp <= gc_time_stamp, "invariant" ); > if (local_time_stamp < gc_time_stamp) { > return local_top; > } else { > return Space::saved_mark_word(); > } > > ? The G1CollectedHeap time stamp is incremented once before any gc workers are notified, so it should not be possible to observe different values of the time stamp during a collection. There is also a OrderAccess::fence call after incrementing the timestamp which I'm not sure is needed but I'll leave it in regardless. The time stamp is also incremented after the collection has completed but those time stamps should not be visible to gc workers either. /Mikael > > David > ----- > >> I've also noticed that I messed up the indentation of a curly brace so I >> fixed that as well. >> >> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ >> >> Incremental webrev: >> http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/ >> >> /Mikael >> >>> >>> Cheers, >>> David H. >>> >>> On 12/11/2014 1:18 AM, Mikael Gerdin wrote: >>>> Hi all, >>>> >>>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the >>>> hope >>>> of getting some extra feedback from our resident concurrency experts. 
>>>> >>>> Please review this subtle change to the order in which we read >>>> fields in >>>> G1OffsetTableContigSpace::saved_mark_word, original included here for >>>> reference: >>>> 1003 >>>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>>> "invariant" ); >>>> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >>>> 1008 return top(); >>>> 1009 else >>>> 1010 return Space::saved_mark_word(); >>>> 1011 } >>>> 1012 >>>> >>>> When getting a new gc alloc region several stores are performed where >>>> store ordering needs to be enforced and several synchronization points >>>> occur. >>>> [write path] >>>> ST(_saved_mark_word) >>>> #StoreStore >>>> ST(_gc_time_stamp) >>>> ST(_top) // satisfying alloc request >>>> #StoreStore >>>> ST(_alloc_region) // publishing to other gc workers >>>> #MonitorEnter >>>> ST(_top) // potential further allocations >>>> #MonitorExit >>>> #MonitorEnter >>>> ST(_top) // potential further allocations >>>> #MonitorExit >>>> >>>> When we inspect a region during remembered set scanning we need to >>>> ensure that we never read memory which have been allocated by a GC >>>> worker thread for the purpose of copying objects into. >>>> The way this works is that a time stamp field is supposed to signal >>>> to a >>>> scanning thread that it should look at addresses below _top if the time >>>> stamp is old or addresses below _saved_mark_word if the time stamp is >>>> current. >>>> >>>> The current code does (as seen above) >>>> [read path] >>>> LD(_gc_time_stamp) >>>> LD(_top) >>>> or (depending on time stamp) >>>> LD(_saved_mark_word) >>>> >>>> Because these values are written to without full mutual exclusion we >>>> need to be very careful about the order in which we read these values, >>>> and this is where I argue that the current code is incorrect. 
>>>> In order to observe a consistent view of the ordered stores in the >>>> [write path] above we need to load the values in the reverse order they >>>> were written, with proper #LoadLoad ordering enforced. >>>> >>>> The problem which we've observed here is that after we've read the time >>>> stamp as below the heap time stamp the top pointer can be updated by a >>>> GC worker allocating objects into this region. To make sure that the >>>> top >>>> value we see is in fact valid we must read it before we read the time >>>> stamp which determines which value we should return from the >>>> saved_mark_word function. >>>> >>>> My suggested fix is to load _top first and enforce #LoadLoad ordering >>>> enforced: >>>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >>>> HeapWord* local_top = top(); >>>> OrderAccess::loadload(); >>>> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >>>> return local_top; >>>> } else { >>>> return Space::saved_mark_word(); >>>> } >>>> } >>>> >>>> I've successfully reproduced the crash with the original code by adding >>>> some random sleep calls between the load of the time stamp and the load >>>> of top so I'm fairly certain that this resolves the issue. I've also >>>> verified that the fix I'm proposing does resolve the bug for the team >>>> which encountered the issue, even if I can't reproduce that crash >>>> locally. >>>> >>>> I also plan to attempt design around some of the races in this code to >>>> reduce its complexity, but for the sake of backporting the fix to 8u40 >>>> I'd like to start with just adding the minimal fix. 
>>>> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >>>> Testing: JPRT, local kitchensink (4 hours), gc test suite >>>> >>>> Thanks >>>> /Mikael From mikael.auno at oracle.com Thu Nov 13 13:56:04 2014 From: mikael.auno at oracle.com (Mikael Auno) Date: Thu, 13 Nov 2014 14:56:04 +0100 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job Message-ID: <5464B874.4060206@oracle.com> Hi, Could I please get a review of this addition of SVC tests to JPRT submit jobs. So far, I'm only adding JDI tests as those are the only ones I have completed code coverage analysis on to determine the best subset to add. The other areas will be added too, but I'm adding these now to get the ball rolling asap. I've run these through JPRT once already without failures and have got two more runs in the pipe. I've also looked through the history for these tests and found that they do not have any known instabilities to worry about. Issue: https://bugs.openjdk.java.net/browse/JDK-8064799 Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/ Thanks, Mikael From magnus.ihse.bursie at oracle.com Thu Nov 13 14:03:00 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 13 Nov 2014 15:03:00 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <20141110174536.GA2885@redhat.com> References: <545CFFA9.4070107@redhat.com> <6313454D-6690-4119-B55C-DBB356E4B3AC@oracle.com> <545D078E.2090509@redhat.com> <20141110174536.GA2885@redhat.com> Message-ID: <5464BA14.40306@oracle.com> On 2014-11-10 18:45, Omair Majid wrote: > * Christian Thalinger [2014-11-07 13:11]: >>> On Nov 7, 2014, at 9:55 AM, Andrew Haley wrote: >>> On 11/07/2014 05:42 PM, Christian Thalinger wrote: >>>> common/autoconf/flags.m4 >>>> >>>> + aarch64) >>>> + ZERO_ARCHFLAG="" >>>> + ;; >>>> >>>> Why is this required on aarch64 but not all the other architectures? 
>>> I think it's because GCC rejects "-m64". >> That's interesting. I thought -m is some kind of common >> flag that works on all architectures. Can someone verify this? > I had to do a similar fix for zero on arm32: > http://hg.openjdk.java.net/jdk8/jdk8/rev/1dfcc874461e#l2.7 > > Perhaps that can be re-used here? That's a good point. $COMPILER_TARGET_BITS_FLAG is used in several places. All of them ought to be guarded by tests on $COMPILER_SUPPORTS_TARGET_BITS_FLAG, but I don't think they are. But this sounds like such a situation -- what we should be testing for is not so much the platform as if $COMPILER_TARGET_BITS_FLAG is supported. /Magnus From magnus.ihse.bursie at oracle.com Thu Nov 13 14:09:45 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 13 Nov 2014 15:09:45 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> Message-ID: <5464BBA9.1000809@oracle.com> On 2014-11-10 11:32, Volker Simonis wrote: > On Mon, Nov 10, 2014 at 10:42 AM, Erik Joelsson > wrote: >> On 2014-11-10 10:27, Volker Simonis wrote: >>> On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson >>> wrote: >>>> Hello, >>>> >>>> I would certainly like to have these files updated, but unfortunately the >>>> license on these files changed from GPL2 to GPL3. This essentially means >>>> that the switch is non trivial from a legal perspective and the >>>> impression >>>> I've received when I last inquired about updating these files was that >>>> it's >>>> unlikely to ever happen unless a very strong case can be presented for >>>> why >>>> it's needed. >>>> >>>> So the reason we have the over engineered solution for config.guess is >>>> simply that it's much easier than getting legal approval for updating >>>> these >>>> files.
>>> OK, but in that case I don't see any reason for keeping this >>> "over-engineered" solution at all. If there will not be any pulls from >>> upstream anyway then there's no reason for keeping these files >>> untouched. I'd propose then to just remove the wrappers and do all the >>> changes right in the corresponding files (of course that's not the >>> topic of this change but should be done separately). >> And again, the reason we didn't change the existing file but instead wrapped >> it, was that we don't have explicit legal approval for doing derivative work >> for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the >> person saying it is ok. >> > OK, now I got it. I thought we just use the wrappers because we want > to easily integrate the upstream versions. But instead it is only > because we don't want to edit these files because of legal > uncertainties. > > So in that case that means we're also not allowed to edit 'config.sub' > and have to create a wrapper for it, right? Yes, you are correct. We cannot modify these files. As far as I understand, the legal reason for including these files is the explicit exception: # As a special exception to the GNU General Public License, if you # distribute this file as part of a program that contains a # configuration script generated by Autoconf, you may include it under # the same distribution terms that you use for the rest of that program. But this is just a distribution license, not a modification license. From my IANAL point of view, this exception should be enough to disregard if the file is also distributed under GPL2 or GPL3. Unfortunately, as Erik says, our lawyers are apprehensive of GPL3. So while we thought that we could be able to periodically sync these files with upstream (and remove our external "patches" after a while), we have not been able to do so. So, this fix will need to do the same dance with config.sub as for config.guess. Unfortunately.
:(

/Magnus

From magnus.ihse.bursie at oracle.com Thu Nov 13 14:12:28 2014
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Thu, 13 Nov 2014 15:12:28 +0100
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <545D0290.5080307@oracle.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com>
Message-ID: <5464BC4C.3050705@oracle.com>

On 2014-11-07 18:34, Vladimir Kozlov wrote:
> CCing to build-dev and JDK9-dev since it is top level changes.
>
> Note, it will go into staging aarch64 repo.
>
> Vladimir
>
> On 11/7/14 9:21 AM, Andrew Haley wrote:
>> The first patch: top-level build machinery changes.
>>
>> http://cr.openjdk.java.net/~aph/8064357-rev-1/
>>
>> Andrew.
>>

I have a question about the changes in platform.m4.

I was a bit surprised to see that VAR_CPU_ARCH is set to "aarch64" and not "arm". The meaning of the CPU_ARCH variable is supposed to cover platforms with similar architecture, regardless of address size. But maybe I'm just too unfamiliar with aarch64, and it is not a trivial 64-bit extension of arm, but more like a separate platform?

/Magnus

From aph at redhat.com Thu Nov 13 14:28:10 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 13 Nov 2014 14:28:10 +0000
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <5464BBA9.1000809@oracle.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com>
Message-ID: <5464BFFA.7050205@redhat.com>

On 11/13/2014 02:09 PM, Magnus Ihse Bursie wrote:
> From my IANAL point of view, this exception should be enough to
> disregard if the file is also distributed under GPL2 or GPL3.
> Unfortunately, as Erik says, our lawyers are apprehensive of GPL3. So
> while we thought that we could be able to periodically sync these files
> with upstream (and remove our external "patches" after a while), we have
> not been able to do so.
>
> So, this fix will need to do the same dance with config.sub as for
> config.guess. Unfortunately. :(

Fair enough. If I knew what you wanted me to do, I'd do it.

Andrew.

From aph at redhat.com Thu Nov 13 14:29:51 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 13 Nov 2014 14:29:51 +0000
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <5464BC4C.3050705@oracle.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <5464BC4C.3050705@oracle.com>
Message-ID: <5464C05F.2000706@redhat.com>

On 11/13/2014 02:12 PM, Magnus Ihse Bursie wrote:
> I was a bit surprised to see that VAR_CPU_ARCH is set to "aarch64" and
> not "arm". The meaning of the CPU_ARCH variable is supposed to cover
> platforms with similar architecture, regardless of address size. But
> maybe I'm just too unfamiliar with aarch64, and it is not a trivial
> 64-bit extension of arm, but more like a separate platform?

It is a completely new architecture. Almost the only thing it has in common with ARM is that it's from the same company. It is possible to run an AArch64 CPU in ARM mode, but there is no interworking.

Andrew.

From daniel.daugherty at oracle.com Thu Nov 13 14:35:46 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 07:35:46 -0700
Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs
In-Reply-To: <546485BF.30802@oracle.com>
References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com>
Message-ID: <5464C1C2.2080804@oracle.com>

> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/

src/share/vm/gc_implementation/g1/heapRegion.cpp

nit line 1007: assert(local_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
You got rid of the space after the '('. Can you also get rid of the space before ');'?

Thanks for putting _gc_time_stamp in a local so that it's not fetched more than once.
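For readers following the thread, here is a minimal, self-contained sketch of the read ordering under discussion: load top first, keep the later loads from moving ahead of it, and read the region time stamp only once into a local. It is an illustration, not the HotSpot code. The Region type, the scan_limit() function, and the use of std::atomic with an acquire fence (standing in roughly for OrderAccess::loadload()) are assumptions of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a heap region. The field names follow the
// thread, but the types and the std::atomic usage are assumptions.
struct Region {
  std::atomic<std::uintptr_t> top{0};
  std::atomic<std::uintptr_t> saved_mark{0};
  std::atomic<unsigned> gc_time_stamp{0};
};

// Returns the upper bound a scanning thread may inspect in this region.
std::uintptr_t scan_limit(const Region& r, unsigned heap_time_stamp) {
  // Load top first, before looking at the time stamp.
  std::uintptr_t local_top = r.top.load(std::memory_order_relaxed);
  // Keep the loads below from being reordered before the load of top
  // (roughly the role OrderAccess::loadload() plays in the proposed fix).
  std::atomic_thread_fence(std::memory_order_acquire);
  // Read the region time stamp once and reuse the local copy.
  unsigned local_time_stamp = r.gc_time_stamp.load(std::memory_order_relaxed);
  assert(local_time_stamp <= heap_time_stamp);
  if (local_time_stamp < heap_time_stamp) {
    // Old time stamp: the previously loaded top is a safe limit.
    return local_top;
  }
  // Current time stamp: fall back to the saved mark.
  return r.saved_mark.load(std::memory_order_relaxed);
}
```

With a region whose time stamp is older than the heap's, scan_limit() returns the value of top that was loaded before the time stamp; with a current time stamp it returns the saved mark instead.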
Would it be useful to move the assert() after the setting of local_top? That would give someone debugging the assertion failure core file a little more context. Dunno... your call.

Dan

On 11/13/14 3:19 AM, Mikael Gerdin wrote:
> Hi David,
>
> On 2014-11-13 05:54, David Holmes wrote:
>> Hi Mikael,
>>
>> Without knowing the details it is hard to determine the correctness of
>> this. What you describe below sounds reasonable - but what about the
>> opposite problem in the new code: what if you read an old top() then a
>> new timestamp, before top() is updated? Will that work correctly or will
>> the region between the old-top and new-top be missed?
>
> I realize that not everyone is up to speed on the specifics of this
> code, but I appreciate your feedback on the general reasoning.
>
> Reading an old _top value is safe, and in fact we must enforce that
> the only _top values we ever return from this function were set
> before the GC occurred.
>
> Reading a too recent _top value is the cause of the crash in this bug,
> since if this function returns a recently updated _top value that is
> because another GC worker has allocated into this region and is in the
> process of copying objects into it. The point of the timestamp value
> is to only return old values of _top and if the timestamp is current
> it should return another value.
>
> I've updated the webrev slightly due to off-list feedback that I
> should attempt to avoid reading the time stamp more than once (for the
> assert).
> I've also noticed that I messed up the indentation of a curly brace so
> I fixed that as well.
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/
>
> Incremental webrev:
> http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/
>
> /Mikael
>
>>
>> Cheers,
>> David H.
>>
>> On 12/11/2014 1:18 AM, Mikael Gerdin wrote:
>>> Hi all,
>>>
>>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope
>>> of getting some extra feedback from our resident concurrency experts.
>>>
>>> Please review this subtle change to the order in which we read fields in
>>> G1OffsetTableContigSpace::saved_mark_word, original included here for
>>> reference:
>>> 1003
>>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
>>> 1005   G1CollectedHeap* g1h = G1CollectedHeap::heap();
>>> 1006   assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
>>> 1007   if (_gc_time_stamp < g1h->get_gc_time_stamp())
>>> 1008     return top();
>>> 1009   else
>>> 1010     return Space::saved_mark_word();
>>> 1011 }
>>> 1012
>>>
>>> When getting a new gc alloc region several stores are performed where
>>> store ordering needs to be enforced and several synchronization points
>>> occur.
>>> [write path]
>>> ST(_saved_mark_word)
>>> #StoreStore
>>> ST(_gc_time_stamp)
>>> ST(_top) // satisfying alloc request
>>> #StoreStore
>>> ST(_alloc_region) // publishing to other gc workers
>>> #MonitorEnter
>>> ST(_top) // potential further allocations
>>> #MonitorExit
>>> #MonitorEnter
>>> ST(_top) // potential further allocations
>>> #MonitorExit
>>>
>>> When we inspect a region during remembered set scanning we need to
>>> ensure that we never read memory which has been allocated by a GC
>>> worker thread for the purpose of copying objects into.
>>> The way this works is that a time stamp field is supposed to signal to a
>>> scanning thread that it should look at addresses below _top if the time
>>> stamp is old or addresses below _saved_mark_word if the time stamp is
>>> current.
>>>
>>> The current code does (as seen above)
>>> [read path]
>>> LD(_gc_time_stamp)
>>> LD(_top)
>>> or (depending on time stamp)
>>> LD(_saved_mark_word)
>>>
>>> Because these values are written to without full mutual exclusion we
>>> need to be very careful about the order in which we read these values,
>>> and this is where I argue that the current code is incorrect.
>>> In order to observe a consistent view of the ordered stores in the
>>> [write path] above we need to load the values in the reverse order they
>>> were written, with proper #LoadLoad ordering enforced.
>>>
>>> The problem which we've observed here is that after we've read the time
>>> stamp as below the heap time stamp the top pointer can be updated by a
>>> GC worker allocating objects into this region. To make sure that the top
>>> value we see is in fact valid we must read it before we read the time
>>> stamp which determines which value we should return from the
>>> saved_mark_word function.
>>>
>>> My suggested fix is to load _top first and enforce #LoadLoad ordering:
>>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const {
>>>   G1CollectedHeap* g1h = G1CollectedHeap::heap();
>>>   assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" );
>>>   HeapWord* local_top = top();
>>>   OrderAccess::loadload();
>>>   if (_gc_time_stamp < g1h->get_gc_time_stamp()) {
>>>     return local_top;
>>>   } else {
>>>     return Space::saved_mark_word();
>>>   }
>>> }
>>>
>>> I've successfully reproduced the crash with the original code by adding
>>> some random sleep calls between the load of the time stamp and the load
>>> of top so I'm fairly certain that this resolves the issue. I've also
>>> verified that the fix I'm proposing does resolve the bug for the team
>>> which encountered the issue, even if I can't reproduce that crash
>>> locally.
>>>
>>> I also plan to attempt to design around some of the races in this code to
>>> reduce its complexity, but for the sake of backporting the fix to 8u40
>>> I'd like to start with just adding the minimal fix.
>>>
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209
>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/
>>> Testing: JPRT, local kitchensink (4 hours), gc test suite
>>>
>>> Thanks
>>> /Mikael

From daniel.daugherty at oracle.com Thu Nov 13 14:41:28 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 07:41:28 -0700
Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <54649D09.7040406@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> <54649BBD.2040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF271D4@DEWDFEMB12A.global.corp.sap> <54649D09.7040406@oracle.com>
Message-ID: <5464C318.7090406@oracle.com>

On 11/13/14 4:59 AM, David Holmes wrote:
> On 13/11/2014 9:55 PM, Lindenmaier, Goetz wrote:
>> Maybe you don't do a debug build. Fastdbg and opt pass.
>
> Ah - no we don't do debug, just product and fastdebug.
>
> too many variations. But I'm concerned because this suggests
> there are more differences between debug and fastdebug than just
> optimization levels (which I thought was the difference).

There are a few places where a debug build has more verification code because running that verification code takes a long time. I remember running into this in the StringTable stuff that I played with more than a year ago...

Dan

>
> Thanks,
> David
>
>> I verified that it's not my old compiler, gcc 4.3.4 fails, too.
>>
>> Best regards,
>> Goetz
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Thursday, 13.
November 2014 12:54
>> To: Stefan Karlsson; Lindenmaier, Goetz; hotspot-dev at openjdk.java.net
>> Subject: Re: RFR(XS): 8064786: Fix debug build after 8062808: Turn on
>> the -Wreturn-type warning
>>
>> On 13/11/2014 8:34 PM, Stefan Karlsson wrote:
>>> On 2014-11-13 11:20, Lindenmaier, Goetz wrote:
>>>> Hi,
>>>>
>>>> please review, test and sponsor this tiny change. It fixes the debug
>>>> build in the gc repository.
>>>> https://bugs.openjdk.java.net/browse/JDK-8064786
>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/
>>>
>>> Looks good. Thanks for fixing. Another approach would be to just remove
>>> the ShouldNotReachHere() lines.
>>>
>>> I'll push when we get another review.
>>
>> Reviewed.
>>
>> But I'm concerned as to how this was not detected with the original fix.
>> I assume we don't build with a compiler that complains about this code?
>>
>> Thanks,
>> David
>>
>>> thanks,
>>> StefanK
>>>
>>>>
>>>> Best regards,
>>>> Goetz.
>>>

From magnus.ihse.bursie at oracle.com Thu Nov 13 15:00:48 2014
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Thu, 13 Nov 2014 16:00:48 +0100
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <5464BFFA.7050205@redhat.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com>
Message-ID: <5464C7A0.4080304@oracle.com>

On 2014-11-13 15:28, Andrew Haley wrote:
> On 11/13/2014 02:09 PM, Magnus Ihse Bursie wrote:
>> From my IANAL point of view, this exception should be enough to
>> disregard if the file is also distributed under GPL2 or GPL3.
>> Unfortunately, as Erik says, our lawyers are apprehensive of GPL3. So
>> while we thought that we could be able to periodically sync these files
>> with upstream (and remove our external "patches" after a while), we have
>> not been able to do so.
>>
>> So, this fix will need to do the same dance with config.sub as for
>> config.guess. Unfortunately. :(
> Fair enough. If I knew what you wanted me to do, I'd do it.

hg mv config.sub autoconf-config.sub
hg cp config.guess config.sub

and then fix config.sub so that it runs autoconf-config.sub and modifies the output to what you expect it to be from config.sub when running on this particular platform.

Or do some other suitable workaround for config.sub. Perhaps it's easier to modify the input parameters rather than the output string, as we did for config.guess.

/Magnus

From aph at redhat.com Thu Nov 13 15:03:56 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 13 Nov 2014 15:03:56 +0000
Subject: RFR: AARCH64: Top-level JDK changes
In-Reply-To: <5464C7A0.4080304@oracle.com>
References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com>
Message-ID: <5464C85C.50908@redhat.com>

On 11/13/2014 03:00 PM, Magnus Ihse Bursie wrote:
>
> hg mv config.sub autoconf-config.sub
> hg cp config.guess config.sub
>
> and then fix config.sub so that it runs autoconf-config.sub and modifies
> the output to what you expect it to be from config.sub when running on
> this particular platform.

OK, I'll do that. Thanks!

Andrew.

From stefan.karlsson at oracle.com Thu Nov 13 14:59:13 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 13 Nov 2014 15:59:13 +0100
Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements
Message-ID: <5464C741.2060105@oracle.com>

Hi all,

Please, review this patch to replace usages of the CHECK_ macros in return statements, with the THREAD define.
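To make the macro issue described in the bug report easier to try out, here is a small self-contained sketch. It is not the HotSpot source: the Thread struct, the macro definitions, and the lookup functions are simplified stand-ins written for this illustration, and the real macros in HotSpot may differ in detail. It shows the two shapes that avoid unreachable code: passing THREAD directly in a return statement, or using CHECK_NULL on an intermediate call whose result is stored in a variable.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for HotSpot's exception macros (assumed shapes,
// written for this sketch only).
struct Thread { bool pending_exception = false; };

#define THREAD the_thread
#define TRAPS Thread* THREAD
#define HAS_PENDING_EXCEPTION (THREAD->pending_exception)
// Closes the call, checks for a pending exception, then opens a dummy
// expression so that the caller's trailing ");" still parses.
#define CHECK_NULL THREAD); if (HAS_PENDING_EXCEPTION) return NULL; (void)(0

int storage = 42;

// Hypothetical callee: flags a pending exception and returns NULL for a
// negative index, otherwise returns a valid pointer.
int* lookup(int index, TRAPS) {
  if (index < 0) {
    THREAD->pending_exception = true;
    return NULL;
  }
  return &storage;
}

// CHECK_NULL on an intermediate call: the expanded check is reachable.
int* lookup_checked(int index, TRAPS) {
  int* p = lookup(index, CHECK_NULL);
  return p;
}

// THREAD in a return statement: no dead code after the return; the
// caller sees both the NULL result and the pending exception.
int* lookup_forwarded(int index, TRAPS) {
  return lookup(index, THREAD);
}
```

Writing "return lookup(index, CHECK_NULL);" instead would expand to a return followed by an if-statement that can never execute, which is the pattern the patch removes.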
http://cr.openjdk.java.net/~stefank/8064811/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8064811

From the bug report:

Take the following method as an example:
Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  return klass_at(klass_ref_index_at(which), CHECK_NULL);
}

This will expand into:
Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  return klass_at(klass_ref_index_at(which), THREAD);
  if (HAS_PENDING_EXCEPTIONS) {
    return NULL;
  }
  (void)(0);
}

The if-statement will never be reached.

We have seen cases where the compiler warns about this, and the recent change to enable -Wreturn-type will make this more likely to happen.

The suggested solution is to change the example above into:
Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  return klass_at(klass_ref_index_at(which), THREAD);
}

thanks,
StefanK

From mikael.gerdin at oracle.com Thu Nov 13 15:30:32 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Thu, 13 Nov 2014 16:30:32 +0100
Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs
In-Reply-To: <5464C1C2.2080804@oracle.com>
References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com> <5464C1C2.2080804@oracle.com>
Message-ID: <5464CE98.80308@oracle.com>

Hi Dan,

On 2014-11-13 15:35, Daniel D. Daugherty wrote:
> > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/
>
> src/share/vm/gc_implementation/g1/heapRegion.cpp
> nit line 1007: assert(local_time_stamp <= g1h->get_gc_time_stamp(),
> "invariant" );
> You got rid of the space after the '('. Can you also get
> rid of the space before ');'?

I'll fix that before I push the change.

>
> Thanks for putting _gc_time_stamp in a local so that it's
> not fetched more than once.
>
> Would it be useful to move the assert() after the setting of
> local_top? That would give someone debugging the assertion
> failure core file a little more context. Dunno... your call.

The assert is not strictly related to the _top field, if the time stamp assert fails we're most likely experiencing a completely different problem. I'd rather leave the assert where it is.

/Mikael

From coleen.phillimore at oracle.com Thu Nov 13 15:33:15 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 13 Nov 2014 10:33:15 -0500
Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements
In-Reply-To: <5464C741.2060105@oracle.com>
References: <5464C741.2060105@oracle.com>
Message-ID: <5464CF3B.4060004@oracle.com>

The thing that I worry about with this change is that if someone adds code later after the return, they'll miss changing the THREAD parameter back into CHECK. But maybe it's okay because the thing returned will be NULL and code is likely to crash on someone trying to use the value returned. Ok. This is a good cleanup. I'm surprised there weren't more.

Thanks,
Coleen

On 11/13/14, 9:59 AM, Stefan Karlsson wrote:
> Hi all,
>
> Please, review this patch to replace usages of the CHECK_ macros in
> return statements, with the THREAD define.
>
> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8064811
>
> From the bug report:
>
> Take the following method as an example:
> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>   return klass_at(klass_ref_index_at(which), CHECK_NULL);
> }
>
> This will expand into:
> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>   return klass_at(klass_ref_index_at(which), THREAD);
>   if (HAS_PENDING_EXCEPTIONS) {
>     return NULL;
>   }
>   (void)(0);
> }
>
> The if-statement will never be reached.
>
> We have seen cases where the compiler warns about this, and the recent
> change to enable -Wreturn-type will make this more likely to happen.
>
> The suggested solution is to change the example above into:
> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>   return klass_at(klass_ref_index_at(which), THREAD);
> }
>
> thanks,
> StefanK

From stefan.karlsson at oracle.com Thu Nov 13 15:28:55 2014
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 13 Nov 2014 16:28:55 +0100
Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements
In-Reply-To: <5464CF3B.4060004@oracle.com>
References: <5464C741.2060105@oracle.com> <5464CF3B.4060004@oracle.com>
Message-ID: <5464CE37.9020102@oracle.com>

On 2014-11-13 16:33, Coleen Phillimore wrote:
>
> The thing that I worry about with this change is that if someone adds
> code later after the return, they'll miss changing the THREAD
> parameter back into CHECK.

I was thinking the same, but I knew that we used this idiom in other places in the JVM and thought that changing these would be OK.

An alternative approach would be to always read out the value into a variable:
Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL);
  return k;
}

I can do that if people feel more comfortable with it.

Thanks,
StefanK

> But maybe it's okay because the thing returned will be NULL and code
> is likely to crash on someone trying to use the value returned. Ok.
> This is a good cleanup. I'm surprised there weren't more.
>
> Thanks,
> Coleen
>
> On 11/13/14, 9:59 AM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please, review this patch to replace usages of the CHECK_ macros in
>> return statements, with the THREAD define.
>>
>> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8064811
>>
>> From the bug report:
>>
>> Take the following method as an example:
>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>   return klass_at(klass_ref_index_at(which), CHECK_NULL);
>> }
>>
>> This will expand into:
>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>   return klass_at(klass_ref_index_at(which), THREAD);
>>   if (HAS_PENDING_EXCEPTIONS) {
>>     return NULL;
>>   }
>>   (void)(0);
>> }
>>
>> The if-statement will never be reached.
>>
>> We have seen cases where the compiler warns about this, and the
>> recent change to enable -Wreturn-type will make this more likely to
>> happen.
>>
>> The suggested solution is to change the example above into:
>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>   return klass_at(klass_ref_index_at(which), THREAD);
>> }
>>
>> thanks,
>> StefanK
>

From mikael.auno at oracle.com Thu Nov 13 15:53:22 2014
From: mikael.auno at oracle.com (Mikael Auno)
Date: Thu, 13 Nov 2014 16:53:22 +0100
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5464B874.4060206@oracle.com>
References: <5464B874.4060206@oracle.com>
Message-ID: <5464D3F2.9060107@oracle.com>

On 2014-11-13 14:56, Mikael Auno wrote:
> Hi,
>
> Could I please get a review of this addition of SVC tests to JPRT submit
> jobs. So far, I'm only adding JDI tests as those are the only ones I
> have completed code coverage analysis on to determine the best subset to
> add. The other areas will be added too, but I'm adding these now to get
> the ball rolling asap.
>
> I've run these through JPRT once already without failures and have got
> two more runs in the pipe. I've also looked through the history for
> these tests and found that they do not have any known instabilities to
> worry about.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8064799
> Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/

The additional JPRT runs have completed now and have no failures. Here are also the durations (in seconds) for each test on each platform, in case anyone wonders ("com/sun/jdi" prefix stripped out):

> ------------------------------------------------------------------------
> | Test | lin_i586-c1 | lin_i586-c2 | lin_x64-c2 | osx_x64-c2 | sol_sparcv9-c2 | sol_x64-c2 | win_i586-c1 | win_i586-c2 | win_x64-c2 |
> ------------------------------------------------------------------------
> | .../AcceptTimeout | 1.24 | 1.277 | 1.308 | 1.349 | 1.54 | 1.204 | 2.184 | 2.293 | 2.34 |
> | .../AccessSpecifierTest | 1.689 | 1.883 | 2.021 | 2.303 | 4.892 | 2.048 | 1.332 | 1.707 | 2.19 |
> | .../AfterThreadDeathTest | 0.855 | 0.748 | 0.815 | 0.605 | 1.098 | 0.691 | 0.299 | 0.424 | 1.683 |
> | .../ArrayRangeTest | 0.659 | 0.823 | 0.794 | 0.702 | 1.445 | 0.837 | 0.783 | 0.503 | 1.267 |
> | .../ConstantPoolInfo | 0.589 | 0.74 | 0.791 | 0.621 | 1.067 | 0.607 | 0.315 | 0.408 | 0.674 |
> | .../CountFilterTest | 0.588 | 0.638 | 0.729 | 0.617 | 1.068 | 0.618 | 0.3 | 0.502 | 0.674 |
> | .../EarlyReturnNegativeTest | 0.724 | 0.8 | 0.824 | 0.675 | 1.186 | 0.642 | 0.362 | 0.627 | 0.736 |
> | .../EarlyReturnTest | 1.218 | 1.164 | 1.295 | 1.207 | 1.962 | 1.307 | 0.72 | 1.189 | 1.242 |
> | .../FieldWatchpoints | 0.616 | 0.628 | 0.728 | 0.6 | 1.052 | 0.683 | 0.3 | 0.408 | 0.674 |
> | .../FramesTest | 0.598 | 0.696 | 0.741 | 0.601 | 1.006 | 0.592 | 0.299 | 0.425 | 0.627 |
> | .../InstanceFilter | 0.604 | 0.677 | 0.696 | 0.587 | 1.005 | 0.608 | 0.284 | 0.393 | 0.69 |
> | .../InterfaceMethodsTest | 0.706 | 0.83 | 0.837 | 0.69 | 1.193 | 0.762 | 0.362 | 1.032 | 0.752 |
> | .../InvokeTest | 0.719 | 0.788 | 0.861 | 0.71 | 1.196 | 0.647 | 0.377 | 0.752 | 0.721 |
> | .../LocalVariableEqual | 0.66 | 0.662 | 0.714 | 0.622 | 1.087 | 0.715 | 0.315 | 0.383 | 0.612 |
> | .../LocationTest | 0.639 | 0.651 | 0.688 | 0.715 | 1.014 | 0.612 | 0.299 | 0.362 | 0.58 |
> | .../ModificationWatchpoints | 0.764 | 0.789 | 0.872 | 0.726 | 1.375 | 0.668 | 0.424 | 0.502 | 0.877 |
> | .../MonitorEventTest | 0.597 | 0.638 | 0.69 | 1.608 | 1.03 | 0.648 | 0.284 | 0.377 | 0.689 |
> | .../MonitorFrameInfo | 0.622 | 0.652 | 0.731 | 0.596 | 1.014 | 0.592 | 0.299 | 0.456 | 0.612 |
> | .../NullThreadGroupNameTest | 0.602 | 0.702 | 0.733 | 0.588 | 1.045 | 0.572 | 0.299 | 0.362 | 0.58 |
> | .../PopAndStepTest | 0.318 | 0.351 | 0.416 | 0.593 | 0.989 | 0.713 | 0.3 | 0.455 | 0.752 |
> | .../PopAsynchronousTest | 0.718 | 0.869 | 0.8 | 0.654 | 1.063 | 0.619 | 0.519 | 0.581 | 0.737 |
> | .../ProcessAttachTest | 6.748 | 6.482 | 6.781 | 7.115 | 9.167 | 6.973 | 6.043 | 6.355 | 6.881 |
> | .../redefineMethod/RedefineTest | 3.678 | 3.743 | 4.072 | 5.207 | 7.081 | 3.757 | 2.976 | 3.568 | 4.723 |
> | .../ReferrersTest | 0.846 | 0.811 | 0.866 | 1.493 | 2.295 | 1.096 | 0.642 | 0.892 | 1.31 |
> | .../RequestReflectionTest | 0.642 | 0.644 | 0.706 | 0.59 | 1.172 | 0.584 | 0.3 | 0.737 | 0.736 |
> | .../ResumeOneThreadTest | 0.612 | 0.661 | 0.688 | 0.669 | 1.073 | 0.583 | 0.502 | 0.362 | 0.658 |
> | .../RunToExit | 1.434 | 1.454 | 1.462 | 1.215 | 1.182 | 1.188 | 1.126 | 1.126 | 1.22 |
> | .../sde/MangleTest | 0.739 | 0.976 | 0.9 | 0.72 | 1.295 | 0.703 | 0.486 | 0.752 | 1.11 |
> | .../sde/TemperatureTableTest | 0.923 | 0.986 | 0.992 | 0.784 | 1.368 | 0.846 | 0.502 | 0.892 | 0.908 |
> | .../SourceNameFilterTest | 1.246 | 1.365 | 1.45 | 1.246 | 2.041 | 1.215 | 0.599 | 1.051 | 1.316 |
> | .../VarargsTest | 0.713 | 0.763 | 0.814 | 0.718 | 1.183 | 0.654 | 0.393 | 0.533 | 0.924 |
> | .../Vars | 0.568 | 0.609 | 0.692 | 0.628 | 1.012 | 0.583 | 0.284 | 0.362 | 0.564 |
> ------------------------------------------------------------------------
> | Total | 33.874 | 35.564 | 37.507 | 37.754 | 57.196 | 34.567 | 24.509 | 30.771 | 40.059 |
> ------------------------------------------------------------------------

Thanks,
Mikael

From serguei.spitsyn at oracle.com Thu Nov 13 16:41:21 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 13 Nov 2014 08:41:21 -0800
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5464D3F2.9060107@oracle.com>
References: <5464B874.4060206@oracle.com> <5464D3F2.9060107@oracle.com>
Message-ID: <5464DF31.3000503@oracle.com>

It looks good.

Thanks,
Serguei

On 11/13/14 7:53 AM, Mikael Auno wrote:
> On 2014-11-13 14:56, Mikael Auno wrote:
>> Hi,
>>
>> Could I please get a review of this addition of SVC tests to JPRT submit
>> jobs. So far, I'm only adding JDI tests as those are the only ones I
>> have completed code coverage analysis on to determine the best subset to
>> add. The other areas will be added too, but I'm adding these now to get
>> the ball rolling asap.
>>
>> I've run these through JPRT once already without failures and have got
>> two more runs in the pipe. I've also looked through the history for
>> these tests and found that they do not have any known instabilities to
>> worry about.
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8064799
>> Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/
> The additional JPRT runs have completed now and have no failures.
Here > are also the duration (in seconds) for each test on each platform to in > case anyone wonders ("com/sun/jdi" prefix stripped out): > >> ----------------------------------------------------------------------------------------------------------------------------------------------------------------- >> | | lin_i586-c1 | lin_i586-c2 | lin_x64-c2 | osx_x64-c2 | sol_sparcv9-c2 | sol_x64-c2 | win_i586-c1 | win_i586-c2 | win_x64-c2 | >> ----------------------------------------------------------------------------------------------------------------------------------------------------------------- >> | .../AcceptTimeout | 1.24 | 1.277 | 1.308 | 1.349 | 1.54 | 1.204 | 2.184 | 2.293 | 2.34 | >> | .../AccessSpecifierTest | 1.689 | 1.883 | 2.021 | 2.303 | 4.892 | 2.048 | 1.332 | 1.707 | 2.19 | >> | .../AfterThreadDeathTest | 0.855 | 0.748 | 0.815 | 0.605 | 1.098 | 0.691 | 0.299 | 0.424 | 1.683 | >> | .../ArrayRangeTest | 0.659 | 0.823 | 0.794 | 0.702 | 1.445 | 0.837 | 0.783 | 0.503 | 1.267 | >> | .../ConstantPoolInfo | 0.589 | 0.74 | 0.791 | 0.621 | 1.067 | 0.607 | 0.315 | 0.408 | 0.674 | >> | .../CountFilterTest | 0.588 | 0.638 | 0.729 | 0.617 | 1.068 | 0.618 | 0.3 | 0.502 | 0.674 | >> | .../EarlyReturnNegativeTest | 0.724 | 0.8 | 0.824 | 0.675 | 1.186 | 0.642 | 0.362 | 0.627 | 0.736 | >> | .../EarlyReturnTest | 1.218 | 1.164 | 1.295 | 1.207 | 1.962 | 1.307 | 0.72 | 1.189 | 1.242 | >> | .../FieldWatchpoints | 0.616 | 0.628 | 0.728 | 0.6 | 1.052 | 0.683 | 0.3 | 0.408 | 0.674 | >> | .../FramesTest | 0.598 | 0.696 | 0.741 | 0.601 | 1.006 | 0.592 | 0.299 | 0.425 | 0.627 | >> | .../InstanceFilter | 0.604 | 0.677 | 0.696 | 0.587 | 1.005 | 0.608 | 0.284 | 0.393 | 0.69 | >> | .../InterfaceMethodsTest | 0.706 | 0.83 | 0.837 | 0.69 | 1.193 | 0.762 | 0.362 | 1.032 | 0.752 | >> | .../InvokeTest | 0.719 | 0.788 | 0.861 | 0.71 | 1.196 | 0.647 | 0.377 | 0.752 | 0.721 | >> | .../LocalVariableEqual | 0.66 | 0.662 | 0.714 | 0.622 | 1.087 | 0.715 | 0.315 | 0.383 | 0.612 | >> | 
.../LocationTest | 0.639 | 0.651 | 0.688 | 0.715 | 1.014 | 0.612 | 0.299 | 0.362 | 0.58 | >> | .../ModificationWatchpoints | 0.764 | 0.789 | 0.872 | 0.726 | 1.375 | 0.668 | 0.424 | 0.502 | 0.877 | >> | .../MonitorEventTest | 0.597 | 0.638 | 0.69 | 1.608 | 1.03 | 0.648 | 0.284 | 0.377 | 0.689 | >> | .../MonitorFrameInfo | 0.622 | 0.652 | 0.731 | 0.596 | 1.014 | 0.592 | 0.299 | 0.456 | 0.612 | >> | .../NullThreadGroupNameTest | 0.602 | 0.702 | 0.733 | 0.588 | 1.045 | 0.572 | 0.299 | 0.362 | 0.58 | >> | .../PopAndStepTest | 0.318 | 0.351 | 0.416 | 0.593 | 0.989 | 0.713 | 0.3 | 0.455 | 0.752 | >> | .../PopAsynchronousTest | 0.718 | 0.869 | 0.8 | 0.654 | 1.063 | 0.619 | 0.519 | 0.581 | 0.737 | >> | .../ProcessAttachTest | 6.748 | 6.482 | 6.781 | 7.115 | 9.167 | 6.973 | 6.043 | 6.355 | 6.881 | >> | .../redefineMethod/RedefineTest | 3.678 | 3.743 | 4.072 | 5.207 | 7.081 | 3.757 | 2.976 | 3.568 | 4.723 | >> | .../ReferrersTest | 0.846 | 0.811 | 0.866 | 1.493 | 2.295 | 1.096 | 0.642 | 0.892 | 1.31 | >> | .../RequestReflectionTest | 0.642 | 0.644 | 0.706 | 0.59 | 1.172 | 0.584 | 0.3 | 0.737 | 0.736 | >> | .../ResumeOneThreadTest | 0.612 | 0.661 | 0.688 | 0.669 | 1.073 | 0.583 | 0.502 | 0.362 | 0.658 | >> | .../RunToExit | 1.434 | 1.454 | 1.462 | 1.215 | 1.182 | 1.188 | 1.126 | 1.126 | 1.22 | >> | .../sde/MangleTest | 0.739 | 0.976 | 0.9 | 0.72 | 1.295 | 0.703 | 0.486 | 0.752 | 1.11 | >> | .../sde/TemperatureTableTest | 0.923 | 0.986 | 0.992 | 0.784 | 1.368 | 0.846 | 0.502 | 0.892 | 0.908 | >> | .../SourceNameFilterTest | 1.246 | 1.365 | 1.45 | 1.246 | 2.041 | 1.215 | 0.599 | 1.051 | 1.316 | >> | .../VarargsTest | 0.713 | 0.763 | 0.814 | 0.718 | 1.183 | 0.654 | 0.393 | 0.533 | 0.924 | >> | .../Vars | 0.568 | 0.609 | 0.692 | 0.628 | 1.012 | 0.583 | 0.284 | 0.362 | 0.564 | >> ----------------------------------------------------------------------------------------------------------------------------------------------------------------- >> | Total | 33.874 | 35.564 | 37.507 | 
37.754 | 57.196 | 34.567 | 24.509 | 30.771 | 40.059 | >> ----------------------------------------------------------------------------------------------------------------------------------------------------------------- > Thanks, > Mikael From daniel.daugherty at oracle.com Thu Nov 13 17:11:30 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 13 Nov 2014 10:11:30 -0700 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <5464CE98.80308@oracle.com> References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com> <5464C1C2.2080804@oracle.com> <5464CE98.80308@oracle.com> Message-ID: <5464E642.9080200@oracle.com> On 11/13/14 8:30 AM, Mikael Gerdin wrote: > Hi Dan, > > On 2014-11-13 15:35, Daniel D. Daugherty wrote: >> > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ >> >> src/share/vm/gc_implementation/g1/heapRegion.cpp >> nit line 1007: assert(local_time_stamp <= g1h->get_gc_time_stamp(), >> "invariant" ); >> You got rid of the space after the '('. Can you also get >> rid of the space before ');'? > > I'll fix that before I push the change. Thanks! > >> >> Thanks for putting _gc_time_stamp in a local so that it's >> not fetched more than once. >> >> Would it be useful to move the assert() after the setting of >> local_top? That would give someone debugging the assertion >> failure core file a little more context. Dunno... your call. > > The assert is not strictly related to the _top field, if the time > stamp assert fails we're most likely experiencing a completely > different problem. > I'd rather leave the assert where it is. No problem. Dan > > /Mikael > >> >> Dan >> >> >> On 11/13/14 3:19 AM, Mikael Gerdin wrote: >>> Hi David, >>> >>> On 2014-11-13 05:54, David Holmes wrote: >>>> Hi Mikael, >>>> >>>> Without knowing the details it is hard to determine the correctness of >>>> this. 
What you describe below sounds reasonable - but what about the >>>> opposite problem in the new code: what if you read an old top() then a >>>> new timestamp, before top() is updated? Will that work correctly or >>>> will >>>> the region between the old-top and new-top be missed? >>> >>> I realize that not everyone is up to speed on the specifics of this >>> code, but I appreciate your feedback on the general reasoning. >>> >>> Reading an old _top value is safe, and in fact we must enforce that >>> the only _top values we ever return from this function were set >>> before the GC occurred. >>> >>> Reading a too recent _top value is the cause of the crash in this bug, >>> since if this function returns a recently updated _top value that is >>> because another GC worker has allocated into this region and is in the >>> process of copying objects into it. The point of the timestamp value >>> is to only return old values of _top and if the timestamp is current >>> it should return another value. >>> >>> I've updated the webrev slightly due to off-list feedback that I >>> should attempt to avoid reading the time stamp more than once (for the >>> assert). >>> I've also noticed that I messed up the indentation of a curly brace so >>> I fixed that as well. >>> >>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ >>> >>> Incremental webrev: >>> http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/ >>> >>> /Mikael >>> >>>> >>>> Cheers, >>>> David H. >>>> >>>> On 12/11/2014 1:18 AM, Mikael Gerdin wrote: >>>>> Hi all, >>>>> >>>>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the >>>>> hope >>>>> of getting some extra feedback from our resident concurrency experts.
>>>>> >>>>> Please review this subtle change to the order in which we read >>>>> fields in >>>>> G1OffsetTableContigSpace::saved_mark_word, original included here for >>>>> reference: >>>>> 1003 >>>>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>>> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>>> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>>>> "invariant" ); >>>>> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >>>>> 1008 return top(); >>>>> 1009 else >>>>> 1010 return Space::saved_mark_word(); >>>>> 1011 } >>>>> 1012 >>>>> >>>>> When getting a new gc alloc region several stores are performed where >>>>> store ordering needs to be enforced and several synchronization >>>>> points >>>>> occur. >>>>> [write path] >>>>> ST(_saved_mark_word) >>>>> #StoreStore >>>>> ST(_gc_time_stamp) >>>>> ST(_top) // satisfying alloc request >>>>> #StoreStore >>>>> ST(_alloc_region) // publishing to other gc workers >>>>> #MonitorEnter >>>>> ST(_top) // potential further allocations >>>>> #MonitorExit >>>>> #MonitorEnter >>>>> ST(_top) // potential further allocations >>>>> #MonitorExit >>>>> >>>>> When we inspect a region during remembered set scanning we need to >>>>> ensure that we never read memory which have been allocated by a GC >>>>> worker thread for the purpose of copying objects into. >>>>> The way this works is that a time stamp field is supposed to signal >>>>> to a >>>>> scanning thread that it should look at addresses below _top if the >>>>> time >>>>> stamp is old or addresses below _saved_mark_word if the time stamp is >>>>> current. 
>>>>> >>>>> The current code does (as seen above) >>>>> [read path] >>>>> LD(_gc_time_stamp) >>>>> LD(_top) >>>>> or (depending on time stamp) >>>>> LD(_saved_mark_word) >>>>> >>>>> Because these values are written to without full mutual exclusion we >>>>> need to be very careful about the order in which we read these >>>>> values, >>>>> and this is where I argue that the current code is incorrect. >>>>> In order to observe a consistent view of the ordered stores in the >>>>> [write path] above we need to load the values in the reverse order >>>>> they >>>>> were written, with proper #LoadLoad ordering enforced. >>>>> >>>>> The problem which we've observed here is that after we've read the >>>>> time >>>>> stamp as below the heap time stamp the top pointer can be updated >>>>> by a >>>>> GC worker allocating objects into this region. To make sure that the >>>>> top >>>>> value we see is in fact valid we must read it before we read the time >>>>> stamp which determines which value we should return from the >>>>> saved_mark_word function. >>>>> >>>>> My suggested fix is to load _top first and enforce #LoadLoad ordering >>>>> enforced: >>>>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>>> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>>> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >>>>> HeapWord* local_top = top(); >>>>> OrderAccess::loadload(); >>>>> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >>>>> return local_top; >>>>> } else { >>>>> return Space::saved_mark_word(); >>>>> } >>>>> } >>>>> >>>>> I've successfully reproduced the crash with the original code by >>>>> adding >>>>> some random sleep calls between the load of the time stamp and the >>>>> load >>>>> of top so I'm fairly certain that this resolves the issue. I've also >>>>> verified that the fix I'm proposing does resolve the bug for the team >>>>> which encountered the issue, even if I can't reproduce that crash >>>>> locally. 
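The ordering argument above (read _top first, then the time stamp, with load ordering in between) can be modelled outside HotSpot. What follows is only a minimal sketch using C++11 `std::atomic`, not the G1 code: `Region`, `claim_for_gc_alloc` and `scan_limit` are hypothetical names, addresses are plain integers, and release/acquire operations stand in for the #StoreStore and #LoadLoad barriers, which is an assumption of this model (it is somewhat stronger than the barriers listed in the mail).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap region; "addresses" are plain integers.
struct Region {
    std::atomic<std::size_t> top{0};         // models _top
    std::atomic<std::size_t> saved_mark{0};  // models _saved_mark_word
    std::atomic<unsigned>    time_stamp{0};  // models _gc_time_stamp
};

// Write path (model): a GC worker takes the region as its alloc region.
// saved_mark is stored before time_stamp, and time_stamp before the new top.
void claim_for_gc_alloc(Region& r, unsigned heap_time_stamp,
                        std::size_t first_alloc) {
    const std::size_t old_top = r.top.load(std::memory_order_relaxed);
    r.saved_mark.store(old_top, std::memory_order_relaxed);
    // Release: the saved_mark store cannot be reordered after this store.
    r.time_stamp.store(heap_time_stamp, std::memory_order_release);
    // Release: the time_stamp store cannot be reordered after this store.
    r.top.store(old_top + first_alloc, std::memory_order_release);
}

// Read path (model): load the fields in the reverse order they were written.
std::size_t scan_limit(const Region& r, unsigned heap_time_stamp) {
    // Load top FIRST. If this observes a post-allocation top, the acquire
    // guarantees the time stamp load below observes the current stamp.
    const std::size_t local_top = r.top.load(std::memory_order_acquire);
    const unsigned ts = r.time_stamp.load(std::memory_order_acquire);
    assert(ts <= heap_time_stamp && "invariant");
    if (ts < heap_time_stamp) {
        return local_top;  // stamp is old, so local_top predates this GC
    }
    return r.saved_mark.load(std::memory_order_relaxed);
}
```

In this model a scanner that sees the old stamp can only be holding a top value read before the region was claimed, which is the property the proposed fix relies on. Later allocations under the monitor and repeated GC cycles are not modelled here.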
>>>>> >>>>> I also plan to attempt to design around some of the races in this >>>>> code to >>>>> reduce its complexity, but for the sake of backporting the fix to >>>>> 8u40 >>>>> I'd like to start with just adding the minimal fix. >>>>> >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >>>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >>>>> Testing: JPRT, local kitchensink (4 hours), gc test suite >>>>> >>>>> Thanks >>>>> /Mikael >> From volker.simonis at gmail.com Thu Nov 13 17:24:27 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 13 Nov 2014 18:24:27 +0100 Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements In-Reply-To: <5464CE37.9020102@oracle.com> References: <5464C741.2060105@oracle.com> <5464CF3B.4060004@oracle.com> <5464CE37.9020102@oracle.com> Message-ID: Hi Stefan, thanks a LOT for doing this change. This one was on my list for years (see "6889002: CHECK macros in return constructs lead to unreachable code" at https://bugs.openjdk.java.net/browse/JDK-6889002) but I never finalized it. Your change looks good and I think it must be stressed that your code doesn't change any functionality because it only eliminates dead code! If we have code after your change which fails to check for pending exceptions or which expects to get a certain result in the case of a pending exception it was just as wrong already before your change because the exception check was never reached. So unless we see one of these errors I don't think it will be necessary to introduce all these temporary variables. When I started to work on this problem years ago I went the other way round: I looked at the called functions (e.g. klass_at() in your example) to check if they already return NULL in the case of a pending exception. As far as I remember they all behaved "well" - otherwise we would already have seen some problems with the current implementation.
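The dead-code point can be illustrated with a small self-contained sketch. The macros below are simplified stand-ins written for this example, not HotSpot's real definitions, and `Thread`, `lookup`, `lookup_in_return` and `lookup_via_local` are hypothetical names. The callee is deliberately badly behaved (it returns a non-NULL value while an exception is pending) so that the difference between the two forms becomes observable; with callees that already return NULL in that case, both forms give the same result.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the HotSpot exception macros.
struct Thread { bool pending_exception = false; };

#define TRAPS                 Thread* THREAD
#define HAS_PENDING_EXCEPTION (THREAD->pending_exception)
// Closes the call, appends a pending-exception check, then reopens a dummy
// expression so that the caller's own ");" still balances.
#define CHECK_NULL THREAD); if (HAS_PENDING_EXCEPTION) return nullptr; (void)(0

static int the_klass = 42;

// Deliberately badly behaved callee: may set a pending exception but still
// returns a non-null pointer.
int* lookup(bool fail, TRAPS) {
    if (fail) THREAD->pending_exception = true;
    return &the_klass;
}

// CHECK_NULL inside a return statement: the generated check sits after the
// "return" and can never execute.
int* lookup_in_return(bool fail, TRAPS) {
    return lookup(fail, CHECK_NULL);
}

// CHECK_NULL on an assignment: the generated check runs before "return k".
int* lookup_via_local(bool fail, TRAPS) {
    int* k = lookup(fail, CHECK_NULL);
    return k;
}
```

Calling `lookup_in_return(true, &t)` hands back the callee's pointer unchanged because the check is unreachable, whereas `lookup_via_local(true, &t)` returns a null pointer.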
Thanks and best regards, Volker PS: our HPUX C++ compiler will love this change:) On Thu, Nov 13, 2014 at 4:28 PM, Stefan Karlsson wrote: > > On 2014-11-13 16:33, Coleen Phillimore wrote: >> >> >> The thing that I worry about with this change is that if someone adds code >> later after the return, they'll miss changing the THREAD parameter back into >> CHECK. > > > > I was thinking the same, but I knew that we used this idiom in other places > in the JVM and thought that changing these would be OK. An alternative > approach would be to always read out the value into a variable: > > Klass* ConstantPool::klass_ref_at(int which, TRAPS) { > Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL); > return k; > } > > I can do that if people feel more comfortable with it. > > Thanks, > Stefank > > >> But maybe it's okay because the thing returned will be NULL and code is >> likely to crash on someone trying to use the value returned. Ok. This is a >> good cleanup. I'm surprised there weren't more. >> >> Thanks, >> Coleen >> >> On 11/13/14, 9:59 AM, Stefan Karlsson wrote: >>> >>> Hi all, >>> >>> Please, review this patch to replace usages of the CHECK_ macros in >>> return statements, with the THREAD define. >>> >>> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8064811 >>> >>> From the bug report: >>> >>> Take the following method as an example: >>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>> return klass_at(klass_ref_index_at(which), CHECK_NULL); >>> } >>> >>> This will expand into: >>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>> return klass_at(klass_ref_index_at(which), THREAD); >>> if (HAS_PENDING_EXCEPTIONS) { >>> return NULL; >>> } >>> (void)(0); >>> } >>> >>> The if-statement will never be reached. >>> >>> We have seen cases where the compiler warns about this, and the recent >>> change to enable -Wreturn-type will make this more likely to happen.
>>> >>> The suggested solution is to change the example above into: >>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>> return klass_at(klass_ref_index_at(which), THREAD); >>> } >>> >>> thanks, >>> StefanK >> >> > From christian.thalinger at oracle.com Thu Nov 13 18:33:46 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 13 Nov 2014 10:33:46 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <5464BBA9.1000809@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> Message-ID: <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> > On Nov 13, 2014, at 6:09 AM, Magnus Ihse Bursie wrote: > > On 2014-11-10 11:32, Volker Simonis wrote: >> On Mon, Nov 10, 2014 at 10:42 AM, Erik Joelsson >> wrote: >>> On 2014-11-10 10:27, Volker Simonis wrote: >>>> On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson >>>> wrote: >>>>> Hello, >>>>> >>>>> I would certainly like to have these files updated, but unfortunately the >>>>> license on these files changed from GPL2 to GPL3. This essentially means >>>>> that the switch is non trivial from a legal perspective and the >>>>> impression >>>>> I've received when I last inquired about updating these files was that >>>>> it's >>>>> unlikely to ever happen unless a very strong case can be presented for >>>>> why >>>>> it's needed. >>>>> >>>>> So the reason we have the over engineered solution for config.guess is >>>>> simply that it's much easier than getting legal approval for updating >>>>> these >>>>> files. >>>> OK, but in that case I don't see any reason for keeping this >>>> "over-engineered" solution at all. If there will not be any pulls from >>>> upstream anyway then there's no reason for keeping these file >>>> untouched. 
I'd propose then to just remove the wrappers and do all the >>>> changes right in the corresponding files (of course that's not the >>>> topic of this change but should be done separately). >>> And again, the reason we didn't change the existing file but instead wrapped >>> it, was that we don't have explicit legal approval for doing derivative work >>> for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the >>> person saying it is ok. >>> >> OK, now I got it. I thought we just use the wrappers because we want >> to easily integrate the upstream versions. But instead it is only >> because we don't want to edit these files because of legal >> uncertainties. >> >> So in that case that means we're also not allowed to edit 'config.sub' >> and have to create a wrapper for it, right? > > Yes, you are correct. We cannot modify these files. > > As far as I understand, the legal reason for including these files is the explicit exception: > > # As a special exception to the GNU General Public License, if you > # distribute this file as part of a program that contains a > # configuration script generated by Autoconf, you may include it under > # the same distribution terms that you use for the rest of that program. > > But this is just a distribution license, not a modification license. > > From my IANAL point of view, this exception should be enough to disregard if the file is also distributed under GPL2 or GPL3. Unfortunately, as Erik says, our lawyers are apprehensive of GPL3. So while we thought that we would be able to periodically sync these files with upstream (and remove our external "patches" after a while), we have not been able to do so. Why do we have these files in our repository in the first place? > > So, this fix will need to do the same dance with config.sub as for config.guess. Unfortunately.
:( > > /Magnus From joe.darcy at oracle.com Thu Nov 13 19:00:02 2014 From: joe.darcy at oracle.com (joe darcy) Date: Thu, 13 Nov 2014 11:00:02 -0800 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: <54647DC5.9010102@redhat.com> References: <546348F8.9060900@redhat.com> <54647DC5.9010102@redhat.com> Message-ID: <5464FFB2.2050205@oracle.com> FWIW, if I were creating a new file by first copying an old file, I would use a copyright range from the creation date of the old file to the current year. -Joe On 11/13/2014 1:45 AM, Andrew Haley wrote: > On 12/11/14 23:51, Christian Thalinger wrote: >> The new jvm.cfg files should only have a copyright year of 2014. > Why, exactly? They have been around for a while. > > Andrew. > From eric.mccorkle at oracle.com Thu Nov 13 21:04:36 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Thu, 13 Nov 2014 16:04:36 -0500 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present Message-ID: <54651CE4.2070003@oracle.com> Hello, Please review this simple fix for a JDK test failure that was introduced by the change for JDK-8058313. A condition for an assertion was left as ">", when it should have been changed to ">=". Note that this only occurs in artificial test cases; javac does not produce classfiles with zero-length MethodParameters attributes. 
The webrev is here: http://cr.openjdk.java.net/~emc/8064571/ The bug report is here: https://bugs.openjdk.java.net/browse/JDK-8064571 From coleen.phillimore at oracle.com Thu Nov 13 21:09:49 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 13 Nov 2014 16:09:49 -0500 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54651CE4.2070003@oracle.com> References: <54651CE4.2070003@oracle.com> Message-ID: <54651E1D.2030202@oracle.com> Looks good. Thanks for running the java/lang/instrument tests. Coleen On 11/13/14, 4:04 PM, Eric McCorkle wrote: > Hello, > > Please review this simple fix for a JDK test failure that was introduced > by the change for JDK-8058313. A condition for an assertion was left as > ">", when it should have been changed to ">=". > > Note that this only occurs in artificial test cases; javac does not > produce classfiles with zero-length MethodParameters attributes. > > The webrev is here: > http://cr.openjdk.java.net/~emc/8064571/ > > The bug report is here: > https://bugs.openjdk.java.net/browse/JDK-8064571 From lois.foltan at oracle.com Thu Nov 13 21:20:11 2014 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 13 Nov 2014 16:20:11 -0500 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54651CE4.2070003@oracle.com> References: <54651CE4.2070003@oracle.com> Message-ID: <5465208B.1050607@oracle.com> Looks good. Lois On 11/13/2014 4:04 PM, Eric McCorkle wrote: > Hello, > > Please review this simple fix for a JDK test failure that was introduced > by the change for JDK-8058313. A condition for an assertion was left as > ">", when it should have been changed to ">=". 
> > Note that this only occurs in artificial test cases; javac does not > produce classfiles with zero-length MethodParameters attributes. > > The webrev is here: > http://cr.openjdk.java.net/~emc/8064571/ > > The bug report is here: > https://bugs.openjdk.java.net/browse/JDK-8064571 From serguei.spitsyn at oracle.com Thu Nov 13 22:30:00 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 13 Nov 2014 14:30:00 -0800 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54651CE4.2070003@oracle.com> References: <54651CE4.2070003@oracle.com> Message-ID: <546530E8.70505@oracle.com> It looks good. Thanks, Serguei On 11/13/14 1:04 PM, Eric McCorkle wrote: > Hello, > > Please review this simple fix for a JDK test failure that was introduced > by the change for JDK-8058313. A condition for an assertion was left as > ">", when it should have been changed to ">=". > > Note that this only occurs in artificial test cases; javac does not > produce classfiles with zero-length MethodParameters attributes. > > The webrev is here: > http://cr.openjdk.java.net/~emc/8064571/ > > The bug report is here: > https://bugs.openjdk.java.net/browse/JDK-8064571 From max.ockner at oracle.com Thu Nov 13 22:59:43 2014 From: max.ockner at oracle.com (Max Ockner) Date: Thu, 13 Nov 2014 17:59:43 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <5463C52D.4000600@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> Message-ID: <546537DF.20803@oracle.com> ID: 8060449 I have made the following additional changes: -switched the order of the length check and the strncmp in is_newly_obsolete() -fixed formatting + copyright issues. 
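The effect of pairing the length check with strncmp can be shown in isolation. This is a minimal sketch under stated assumptions, not the code from arguments.cpp: `prefix_only_match` and `exact_match` are hypothetical helpers, and the sketch ignores `=value` forms and the version comparison done by is_newly_obsolete().

```cpp
#include <cassert>
#include <cstring>

// Behaviour described in the original summary: only strlen(name) characters
// are compared, so any trailing characters in the user's option are ignored.
bool prefix_only_match(const char* name, const char* s) {
    if (s[0] == '+' || s[0] == '-') s++;  // skip the boolean-flag sign
    return std::strncmp(name, s, std::strlen(name)) == 0;
}

// Length check first, then the bounded compare: a match now requires the
// option to be exactly the flag name (after an optional sign).
bool exact_match(const char* name, const char* s) {
    if (s[0] == '+' || s[0] == '-') s++;
    const std::size_t n = std::strlen(name);
    return std::strlen(s) == n && std::strncmp(name, s, n) == 0;
}
```

With the use case from the thread, `prefix_only_match("SomeFlag", "+SomeFlagg")` accepts the misspelled option, while `exact_match("SomeFlag", "+SomeFlagg")` rejects it and `exact_match("SomeFlag", "-SomeFlag")` still matches.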
I decided against further restructuring the logic in is_newly_obsolete() because I don't believe that it will make the code any clearer. webrev: http://cr.openjdk.java.net/~coleenp/8060449/ Please approve. Thanks again, Max Ockner On 11/12/2014 3:38 PM, Daniel D. Daugherty wrote: > On 11/12/14 1:04 PM, Max Ockner wrote: >> Dan, >> I have reformatted the "){" fragment on line 336 as you >> recommended. Thanks for catching that. > > Thanks. > > >> For your second recommendation, I think I have a use case where the >> recommended code would not function properly: >> >> Let's say there is a boolean flag SomeFlag, and let's say that the >> user tries to type "-XX:SomeFlagg". >> >> The first if statement passes because strlen("SomeFlagg") = >> strlen("SomeFlag")+1. >> The second conditional checks if (strncmp(flag_status.name, s, f_len) >> == 0). But f_len, the length of "SomeFlag" is 8. The result is that >> the 9th character of the user's input, which is where s differs from >> flag_status.name, is not checked,so this condition is passed as well. > > Your use case catches a bug in what I posted. I had originally > planned to change the two strncmp() calls to strcmp() so that > we get a complete match, but then I couldn't remember if a > straight strcmp() triggers Parfait warnings so I couldn't > finish reasoning my way through that maze... > > Switching the 'f_len' parameter to 's_len' would solve the > problem without triggering Parfait, but it is totally your > call. > > Dan > >> >> Thanks, >> Max >> >> >> >> On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: >>> On 11/7/14 12:13 PM, Max Ockner wrote: >>>> ID: 8060449 >>>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >>> >>> src/share/vm/runtime/arguments.cpp >>> >>> line 336: ) { >>> This fragment is on a line by itself and far left. >>> Minimally, it should align like this: >>> >>> line 331: if (... 
>>> line 336: ) { >>> >>> However, I recommend a slightly different structure to >>> this logic: >>> >>> size_t f_len = strlen(flag_status.name); >>> size_t s_len = strlen(s); >>> if (f_len == s_len || (f_len + 1) == s_len) { >>> // this flag is the right length for a possible match >>> if (strncmp(flag_status.name, s, f_len) == 0) || >>> ((s[0] == '+' || s[0] == '-') && >>> strncmp(flag_status.name, &s[1], f_len) == 0)) { >>> // this flag is an exact match >>> if (JDK_Version::current().compare(flag_status.accept_until) >>> == -1) { >>> ... >>> } >>> } >>> } >>> i++; >>> >>> I have no idea if the above formatting is going to be >>> preserved by e-mail clients... >>> >>> Dan >>> >>> >>>> >>>> Summary: A "newly obsolete" command line option is one which is no >>>> longer supported, but still is acknowledged. There is a list of >>>> these in arguments.cpp. >>>> It used to be that only a fixed number of characters were checked >>>> when comparing a given command line option to the list of obsolete >>>> flags (strncmp was used, where the number of characters to check is >>>> equal to the length of the flag name from the table.) >>>> As a result, an arbitrary string appended to the end of an obsolete >>>> argument goes unnoticed. >>>> This issue is fixed by comparing the lengths of the given flag and >>>> the flags from the obsolete flags table. >>>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>>> appropriate warning is given to save the user a few key strokes: >>>> (1) unrecognized option [bad option]. (2) Did you mean [option]? >>>> (3) [option] is obsolete as of [version]) >>>> >>>> A new test for this feature checks for the presence of all three >>>> components of the above error message. >>>> >>>> Tested with: vm.quick.testlist >>>> hotspot jtreg tests >>>> jprt >>>> >>>> Thanks for your help! 
>>>> Max Ockner >>>> >>>> >>> >> > From vladimir.kozlov at oracle.com Thu Nov 13 23:06:28 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 13 Nov 2014 15:06:28 -0800 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: <5464FFB2.2050205@oracle.com> References: <546348F8.9060900@redhat.com> <54647DC5.9010102@redhat.com> <5464FFB2.2050205@oracle.com> Message-ID: <54653974.70804@oracle.com> Looks like the only comment we have is to change 2013 year to 2014 in the Copyright header in new files (and keep creation year from old file as Joe suggested). I can do that trivial change and push these changes into aarch64 staging repo. webrev: http://cr.openjdk.java.net/~aph/aarch64-JDK-8064594-1/ I included David's review from other thread to keep all reviews here: On 11/12/14 8:28 PM, David Holmes wrote: > On 13/11/2014 10:04 AM, Dean Long wrote: >> And adding build-infra-dev and jdk9-dev wouldn't hurt either. > > Let's not get carried away for what is quite a trivial copying of > existing platform specific patterns :) build-dev (not build-infra-dev) > would be okay. jdk9-dev isn't needed if already on hotspot-dev, > build-dev and sound-dev. > > These changes seem quite trivially fine to me. > > David H. Thanks, Vladimir On 11/13/14 11:00 AM, joe darcy wrote: > FWIW, if I were creating a new file by first copying an old file, I > would use a copyright range from the creation date of the old file to > the current year. > > -Joe > > On 11/13/2014 1:45 AM, Andrew Haley wrote: >> On 12/11/14 23:51, Christian Thalinger wrote: >>> The new jvm.cfg files should only have a copyright year of 2014. >> Why, exactly? They have been around for a while. >> >> Andrew. >> > From max.ockner at oracle.com Thu Nov 13 23:06:02 2014 From: max.ockner at oracle.com (Max Ockner) Date: Thu, 13 Nov 2014 18:06:02 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. 
In-Reply-To: <5463C52D.4000600@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> Message-ID: <5465395A.4080209@oracle.com> Correction - new webrev is at http://cr.openjdk.java.net/~coleenp/8060449.1/ Max Ockner On 11/12/2014 3:38 PM, Daniel D. Daugherty wrote: > On 11/12/14 1:04 PM, Max Ockner wrote: >> Dan, >> I have reformatted the "){" fragment on line 336 as you >> recommended. Thanks for catching that. > > Thanks. > > >> For your second recommendation, I think I have a use case where the >> recommended code would not function properly: >> >> Let's say there is a boolean flag SomeFlag, and let's say that the >> user tries to type "-XX:SomeFlagg". >> >> The first if statement passes because strlen("SomeFlagg") = >> strlen("SomeFlag")+1. >> The second conditional checks if (strncmp(flag_status.name, s, f_len) >> == 0). But f_len, the length of "SomeFlag" is 8. The result is that >> the 9th character of the user's input, which is where s differs from >> flag_status.name, is not checked,so this condition is passed as well. > > Your use case catches a bug in what I posted. I had originally > planned to change the two strncmp() calls to strcmp() so that > we get a complete match, but then I couldn't remember if a > straight strcmp() triggers Parfait warnings so I couldn't > finish reasoning my way through that maze... > > Switching the 'f_len' parameter to 's_len' would solve the > problem without triggering Parfait, but it is totally your > call. > > Dan > >> >> Thanks, >> Max >> >> >> >> On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: >>> On 11/7/14 12:13 PM, Max Ockner wrote: >>>> ID: 8060449 >>>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >>> >>> src/share/vm/runtime/arguments.cpp >>> >>> line 336: ) { >>> This fragment is on a line by itself and far left. >>> Minimally, it should align like this: >>> >>> line 331: if (... 
>>> line 336: ) { >>> >>> However, I recommend a slightly different structure to >>> this logic: >>> >>> size_t f_len = strlen(flag_status.name); >>> size_t s_len = strlen(s); >>> if (f_len == s_len || (f_len + 1) == s_len) { >>> // this flag is the right length for a possible match >>> if (strncmp(flag_status.name, s, f_len) == 0) || >>> ((s[0] == '+' || s[0] == '-') && >>> strncmp(flag_status.name, &s[1], f_len) == 0)) { >>> // this flag is an exact match >>> if (JDK_Version::current().compare(flag_status.accept_until) >>> == -1) { >>> ... >>> } >>> } >>> } >>> i++; >>> >>> I have no idea if the above formatting is going to be >>> preserved by e-mail clients... >>> >>> Dan >>> >>> >>>> >>>> Summary: A "newly obsolete" command line option is one which is no >>>> longer supported, but still is acknowledged. There is a list of >>>> these in arguments.cpp. >>>> It used to be that only a fixed number of characters were checked >>>> when comparing a given command line option to the list of obsolete >>>> flags (strncmp was used, where the number of characters to check is >>>> equal to the length of the flag name from the table.) >>>> As a result, an arbitrary string appended to the end of an obsolete >>>> argument goes unnoticed. >>>> This issue is fixed by comparing the lengths of the given flag and >>>> the flags from the obsolete flags table. >>>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>>> appropriate warning is given to save the user a few key strokes: >>>> (1) unrecognized option [bad option]. (2) Did you mean [option]? >>>> (3) [option] is obsolete as of [version]) >>>> >>>> A new test for this feature checks for the presence of all three >>>> components of the above error message. >>>> >>>> Tested with: vm.quick.testlist >>>> hotspot jtreg tests >>>> jprt >>>> >>>> Thanks for your help! 
>>>> Max Ockner
>>>>
>>>>
>>>
>>
>

From lois.foltan at oracle.com Thu Nov 13 23:43:34 2014
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 13 Nov 2014 18:43:34 -0500
Subject: RFR:8060449:Proper error messages for newly obsolete command line flags.
In-Reply-To: <5465395A.4080209@oracle.com>
References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <5465395A.4080209@oracle.com>
Message-ID: <54654226.9070309@oracle.com>

Hi Max,

This looks good! Three really minor coding style comments included
for completeness but I don't need to see another code review if you
choose to fix these.

src/share/vm/runtime/arguments.cpp
- line # 336, usually the { would be placed on line 335 at the end
  of the if statement's conditional expression
- line # 945, need a blank space between the "){"
- line #952 the closing } is not lined up with the if keyword

Again, these are minor.
Lois

From daniel.daugherty at oracle.com Thu Nov 13 23:52:08 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 16:52:08 -0700
Subject: RFR:8060449:Proper error messages for newly obsolete command line flags.
In-Reply-To: <54654226.9070309@oracle.com>
References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <5465395A.4080209@oracle.com> <54654226.9070309@oracle.com>
Message-ID: <54654428.8040002@oracle.com>

Max,

I'm good with this version also.

src/share/vm/runtime/arguments.cpp
No content comments; same formatting comments as Lois.

Background: It is generally faster to do length checks before
string comparisons. However, in this case, speed is not an issue.

test/runtime/CommandLine/ObsoleteFlagErrorMessage.java
lines 39-41: you should have a space after '//' and before
your comment begins for readability.

I don't need to see another code review if you choose to fix
any of the formatting issues.

Dan

From vladimir.kozlov at oracle.com Thu Nov 13 23:56:05 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 13 Nov 2014 15:56:05 -0800
Subject: RFR: AARCH64: Changes to HotSpot shared code
In-Reply-To: <54649688.7040401@redhat.com>
References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> <54647A85.6020203@redhat.com> <54649688.7040401@redhat.com>
Message-ID: <54654515.1080801@oracle.com>

On 11/13/14 3:31 AM, Andrew Haley wrote:
> On 11/13/2014 09:31 AM, Andrew Haley wrote:
>>>> How about moving your deopt-instead-of-patch support
>>>> into Runtime1::patch_code() and enable it with a read-only
>>>> platform-specific developer runtime flag
>>>> (see INTPRESSURE for example)?
>> Okay. I'll have a look at that.
>
> Does this mean that I'll need to add a flag to all back ends?

Yes. And we will help with closed changes.

Vladimir

>
> Andrew.
>
>

From vladimir.kozlov at oracle.com Fri Nov 14 01:13:51 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 13 Nov 2014 17:13:51 -0800
Subject: [9] RFR(XS): 8059596: VM startup fails with 'Invalid code heap sizes' if -XX:ReservedCodeCacheSize is set
In-Reply-To: <542D1847.7090107@oracle.com>
References: <542D1847.7090107@oracle.com>
Message-ID: <5465574F.2040401@oracle.com>

Hi Tobias,

At the line 1149 we set ReservedCodeCacheSize to ERGO so it is not DEFAULT anymore:

1149 FLAG_SET_ERGO(uintx, ReservedCodeCacheSize, ReservedCodeCacheSize * 5);

As result segments sizes are not set there.
They will be set in CodeCache::initialize_heaps() as (ReservedCodeCacheSize - NonNMethodCodeHeapSize) / 2:

CodeHeap 'non-nmethods': size=5700Kb used=2278Kb max_used=2279Kb free=3421Kb
CodeHeap 'profiled nmethods': size=120032Kb used=120Kb max_used=120Kb free=119912Kb
CodeHeap 'non-profiled nmethods': size=120032Kb used=22Kb max_used=22Kb free=120

But it is not *5 sizes:

define_pd_global(intx, NonProfiledCodeHeapSize, 21*M);
define_pd_global(intx, ProfiledCodeHeapSize, 22*M);

And it skips the assert in arguments.cpp

Vladimir

On 10/2/14 2:17 AM, Tobias Hartmann wrote:
> Hi,
>
> please review this small patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8059596
> Webrev: http://cr.openjdk.java.net/~thartmann/8059596/webrev.00/
>
> Problem:
> The VM startup fails with 'Invalid code heap sizes' if
> -XX:ReservedCodeCacheSize >= 240M is specified. The problem is that in
> Arguments::set_tiered_flags() the code cache size is multiplied by 5 if
> TieredCompilation is enabled. This should only be done for default values.
>
> Solution:
> Add missing FLAG_IS_DEFAULT(ReservedCodeCacheSize) check.
>
> Thanks,
> Tobias

From david.holmes at oracle.com Fri Nov 14 02:28:08 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 14 Nov 2014 12:28:08 +1000
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5464D3F2.9060107@oracle.com>
References: <5464B874.4060206@oracle.com> <5464D3F2.9060107@oracle.com>
Message-ID: <546568B8.2050305@oracle.com>

Hi Mikael,

So to be clear this adds the JDK JDI tests (select subset thereof) to
"-testset hotspot" for full forest submissions to JPRT.

Test overhead seems okay. Hopefully the test stability will be
exhibited in JPRT as well as other testing.

Ok.

David

On 14/11/2014 1:53 AM, Mikael Auno wrote:
> On 2014-11-13 14:56, Mikael Auno wrote:
>> Hi,
>>
>> Could I please get a review of this addition of SVC tests to JPRT submit
>> jobs. So far, I'm only adding JDI tests as those are the only ones I
>> have completed code coverage analysis on to determine the best subset to
>> add. The other areas will be added too, but I'm adding these now to get
>> the ball rolling asap.
>>
>> I've run these through JPRT once already without failures and have got
>> two more runs in the pipe. I've also looked through the history for
>> these tests and found that they do not have any known instabilities to
>> worry about.
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8064799
>> Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/
>
> The additional JPRT runs have completed now and have no failures. Here
> are also the durations (in seconds) for each test on each platform, in
> case anyone wonders ("com/sun/jdi" prefix stripped out):
>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>> |                                 | lin_i586-c1 | lin_i586-c2 | lin_x64-c2 | osx_x64-c2 | sol_sparcv9-c2 | sol_x64-c2 | win_i586-c1 | win_i586-c2 | win_x64-c2 |
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>> | .../AcceptTimeout               | 1.24        | 1.277       | 1.308      | 1.349      | 1.54           | 1.204      | 2.184       | 2.293       | 2.34       |
>> | .../AccessSpecifierTest         | 1.689       | 1.883       | 2.021      | 2.303      | 4.892          | 2.048      | 1.332       | 1.707       | 2.19       |
>> | .../AfterThreadDeathTest        | 0.855       | 0.748       | 0.815      | 0.605      | 1.098          | 0.691      | 0.299       | 0.424       | 1.683      |
>> | .../ArrayRangeTest              | 0.659       | 0.823       | 0.794      | 0.702      | 1.445          | 0.837      | 0.783       | 0.503       | 1.267      |
>> | .../ConstantPoolInfo            | 0.589       | 0.74        | 0.791      | 0.621      | 1.067          | 0.607      | 0.315       | 0.408       | 0.674      |
>> | .../CountFilterTest             | 0.588       | 0.638       | 0.729      | 0.617      | 1.068          | 0.618      | 0.3         | 0.502       | 0.674      |
>> | .../EarlyReturnNegativeTest     | 0.724       | 0.8         | 0.824      | 0.675      | 1.186          | 0.642      | 0.362       | 0.627       | 0.736      |
>> | .../EarlyReturnTest             | 1.218       | 1.164       | 1.295      | 1.207      | 1.962          | 1.307      | 0.72        | 1.189       | 1.242      |
>> | .../FieldWatchpoints            | 0.616       | 0.628       | 0.728      | 0.6        | 1.052          | 0.683      | 0.3         | 0.408       | 0.674      |
>> | .../FramesTest                  | 0.598       | 0.696       | 0.741      | 0.601      | 1.006          | 0.592      | 0.299       | 0.425       | 0.627      |
>> | .../InstanceFilter              | 0.604       | 0.677       | 0.696      | 0.587      | 1.005          | 0.608      | 0.284       | 0.393       | 0.69       |
>> | .../InterfaceMethodsTest        | 0.706       | 0.83        | 0.837      | 0.69       | 1.193          | 0.762      | 0.362       | 1.032       | 0.752      |
>> | .../InvokeTest                  | 0.719       | 0.788       | 0.861      | 0.71       | 1.196          | 0.647      | 0.377       | 0.752       | 0.721      |
>> | .../LocalVariableEqual          | 0.66        | 0.662       | 0.714      | 0.622      | 1.087          | 0.715      | 0.315       | 0.383       | 0.612      |
>> | .../LocationTest                | 0.639       | 0.651       | 0.688      | 0.715      | 1.014          | 0.612      | 0.299       | 0.362       | 0.58       |
>> | .../ModificationWatchpoints     | 0.764       | 0.789       | 0.872      | 0.726      | 1.375          | 0.668      | 0.424       | 0.502       | 0.877      |
>> | .../MonitorEventTest            | 0.597       | 0.638       | 0.69       | 1.608      | 1.03           | 0.648      | 0.284       | 0.377       | 0.689      |
>> | .../MonitorFrameInfo            | 0.622       | 0.652       | 0.731      | 0.596      | 1.014          | 0.592      | 0.299       | 0.456       | 0.612      |
>> | .../NullThreadGroupNameTest     | 0.602       | 0.702       | 0.733      | 0.588      | 1.045          | 0.572      | 0.299       | 0.362       | 0.58       |
>> | .../PopAndStepTest              | 0.318       | 0.351       | 0.416      | 0.593      | 0.989          | 0.713      | 0.3         | 0.455       | 0.752      |
>> | .../PopAsynchronousTest         | 0.718       | 0.869       | 0.8        | 0.654      | 1.063          | 0.619      | 0.519       | 0.581       | 0.737      |
>> | .../ProcessAttachTest           | 6.748       | 6.482       | 6.781      | 7.115      | 9.167          | 6.973      | 6.043       | 6.355       | 6.881      |
>> | .../redefineMethod/RedefineTest | 3.678       | 3.743       | 4.072      | 5.207      | 7.081          | 3.757      | 2.976       | 3.568       | 4.723      |
>> | .../ReferrersTest               | 0.846       | 0.811       | 0.866      | 1.493      | 2.295          | 1.096      | 0.642       | 0.892       | 1.31       |
>> | .../RequestReflectionTest       | 0.642       | 0.644       | 0.706      | 0.59       | 1.172          | 0.584      | 0.3         | 0.737       | 0.736      |
>> | .../ResumeOneThreadTest         | 0.612       | 0.661       | 0.688      | 0.669      | 1.073          | 0.583      | 0.502       | 0.362       | 0.658      |
>> | .../RunToExit                   | 1.434       | 1.454       | 1.462      | 1.215      | 1.182          | 1.188      | 1.126       | 1.126       | 1.22       |
>> | .../sde/MangleTest              | 0.739       | 0.976       | 0.9        | 0.72       | 1.295          | 0.703      | 0.486       | 0.752       | 1.11       |
>> | .../sde/TemperatureTableTest    | 0.923       | 0.986       | 0.992      | 0.784      | 1.368          | 0.846      | 0.502       | 0.892       | 0.908      |
>> | .../SourceNameFilterTest        | 1.246       | 1.365       | 1.45       | 1.246      | 2.041          | 1.215      | 0.599       | 1.051       | 1.316      |
>> | .../VarargsTest                 | 0.713       | 0.763       | 0.814      | 0.718      | 1.183          | 0.654      | 0.393       | 0.533       | 0.924      |
>> | .../Vars                        | 0.568       | 0.609       | 0.692      | 0.628      | 1.012          | 0.583      | 0.284       | 0.362       | 0.564      |
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>> | Total                           | 33.874      | 35.564      | 37.507     | 37.754     | 57.196         | 34.567     | 24.509      | 30.771      | 40.059     |
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Thanks,
> Mikael
>

From david.holmes at oracle.com Fri Nov 14 02:39:10 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 14 Nov 2014 12:39:10 +1000
Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements
In-Reply-To:
References: <5464C741.2060105@oracle.com> <5464CF3B.4060004@oracle.com> <5464CE37.9020102@oracle.com>
Message-ID: <54656B4E.4000809@oracle.com>

On 14/11/2014 3:24 AM, Volker Simonis wrote:
> Hi Stefan,
>
> thanks a LOT for doing this change. This one was on my list for
> years (see "6889002: CHECK macros in return constructs lead to
> unreachable code" at https://bugs.openjdk.java.net/browse/JDK-6889002)
> but I never finalized it.
>
> Your change looks good and I think it must be stressed that your code
> doesn't change any functionality because it only eliminates dead code!
>
> If we have code after your change which fails to check for
> pending exceptions or which expects to get a certain result in the
> case of a pending exception, it was just as wrong already before your
> change because the exception check was never reached. So unless we see
> one of these errors I don't think it will be necessary to introduce
> all these temporary variables.
>
> When I started to work on this problem years ago I went the other way
> round: I looked at the called functions (i.e. klass_at() in your
> example) to check if they already return NULL in the case of a pending
> exception. As far as I remember they all behaved "well" - otherwise we
> would already have seen some problems with the current implementation.

This was my concern too - what actually gets returned in those cases!
But if this has been verified, and as you say we don't see problems
because of this, then the change to just THREAD seems okay.

Personally though I prefer the alternative style:

Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL);
  return k;
}

I'll leave it up to Stefan. :)

Cheers,
David

> Thanks and best regards,
> Volker
>
> PS: our HPUX C++ compiler will love this change:)
>
>
> On Thu, Nov 13, 2014 at 4:28 PM, Stefan Karlsson
> wrote:
>>
>> On 2014-11-13 16:33, Coleen Phillimore wrote:
>>>
>>>
>>> The thing that I worry about with this change is that if someone adds code
>>> later after the return, they'll miss changing the THREAD parameter back into
>>> CHECK.
>>
>>
>>
>> I was thinking the same, but I knew that we used this idiom in other places
>> in the JVM and thought that changing these would be OK. An alternative
>> approach would be to always read out the value into a variable:
>>
>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>   Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL);
>>   return k;
>> }
>>
>> I can do that if people feel more comfortable with it.
>>
>> Thanks,
>> Stefank
>>
>>
>>> But maybe it's okay because the thing returned will be NULL and code is
>>> likely to crash on someone trying to use the value returned. Ok. This is a
>>> good cleanup. I'm surprised there weren't more.
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 11/13/14, 9:59 AM, Stefan Karlsson wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please, review this patch to replace usages of the CHECK_ macros in
>>>> return statements, with the THREAD define.
>>>>
>>>> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/
>>>> https://bugs.openjdk.java.net/browse/JDK-8064811
>>>>
>>>> From the bug report:
>>>>
>>>> Take the following method as an example:
>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>>>   return klass_at(klass_ref_index_at(which), CHECK_NULL);
>>>> }
>>>>
>>>> This will expand into:
>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>>>   return klass_at(klass_ref_index_at(which), THREAD);
>>>>   if (HAS_PENDING_EXCEPTIONS) {
>>>>     return NULL;
>>>>   }
>>>>   (void)(0);
>>>> }
>>>>
>>>> The if-statement will never be reached.
>>>>
>>>> We have seen cases where the compiler warns about this, and the recent
>>>> change to enable -Wreturn-type will make this more likely to happen.
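The expansion described in the bug report can be tried out with a small stand-alone model. This is only a sketch under stated assumptions: the `Thread` and `Klass` structs, `pending_exception`, `the_klass`, and the three `klass_ref_at_*` functions are hypothetical stand-ins, and the macros are simplified imitations of the HotSpot ones rather than copies of them.

```cpp
#include <cstddef>

// Hypothetical stand-in for the VM thread: just a pending-exception bit.
struct Thread {
  bool pending_exception = false;
};

// Simplified model of the macros discussed in the thread. CHECK_NULL closes
// the call's argument list, appends an exception check, and leaves an open
// "(void)(0" for the call's own ")" to close.
#define TRAPS Thread* THREAD
#define CHECK_NULL THREAD); if (THREAD->pending_exception) return NULL; (void)(0

struct Klass { int id; };
static Klass the_klass = {42};

// Hypothetical callee: flags an exception and returns NULL for a bad index.
Klass* klass_at(int which, TRAPS) {
  if (which < 0) {
    THREAD->pending_exception = true;
    return NULL;
  }
  return &the_klass;
}

// Form from the bug report: the check generated by CHECK_NULL lands after
// the return statement and is therefore never executed.
Klass* klass_ref_at_dead(int which, TRAPS) {
  return klass_at(which, CHECK_NULL);
}

// Proposed form: pass THREAD directly, no dead code is generated.
Klass* klass_ref_at_thread(int which, TRAPS) {
  return klass_at(which, THREAD);
}

// Alternative form: the check runs before the return because the result is
// first read into a temporary.
Klass* klass_ref_at_temp(int which, TRAPS) {
  Klass* k = klass_at(which, CHECK_NULL);
  return k;
}
```

In this model all three variants return the same values, because the hypothetical `klass_at` already returns NULL when it flags an exception; that is the property Volker says he checked for the real callees.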
>>>>
>>>> The suggested solution is to change the example above into:
>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>>>   return klass_at(klass_ref_index_at(which), THREAD);
>>>> }
>>>>
>>>> thanks,
>>>> StefanK
>>>
>>>
>>

From vladimir.kozlov at oracle.com Fri Nov 14 03:10:48 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 13 Nov 2014 19:10:48 -0800
Subject: AARCH64: 8064611: Changes to HotSpot shared code
In-Reply-To: <54638FA0.8040204@redhat.com>
References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com>
Message-ID: <546572B8.9080005@oracle.com>

Hi Andrew,

os_linux.cpp

I assume you will implement Goetz's suggestion (similar to EM_486).

memory/metaspace.cpp

Use {} for the first check. Align the parameters of the ReservedSpace() calls to '('.
can_use_cds_with_metaspace_addr() should be called only inside #if INCLUDE_CDS.

opto/graphKit.cpp

Maybe it should be PPC64_ONLY(release) NOT_PPC64(unordered)?

opto/library_call.cpp

I don't see membar changes for volatile stores in inline_unsafe_access().

opto/parse2.cpp

To avoid platform-specific code, use StoreNode::release_if_reference(T_OBJECT)
there, since that method returns the correct value with your change.

opto/parse3.cpp

I am fine with using AARCH macros here. Our code style requires using {}
for an if's body - please add them at lines 288-290.

runtime/arguments.cpp

Is 128MB really the max value for ReservedCodeCacheSize on aarch64? What is
the default ReservedCodeCacheSize?

You may need to change the following code if you can allocate only 128MB:

2547 } else if (ReservedCodeCacheSize > 2*G) {
2548   // Code cache size larger than MAXINT is not supported.
2549   jio_fprintf(defaultStream::error_stream(),

I think you need to add a new platform-specific flag CodeCacheSizeLimit and
use it instead of our hard-coded 2Gb (maxint).
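The idea of replacing the hard-coded 2G bound with a per-platform limit can be sketched in isolation. This is a hedged illustration only, not HotSpot code: the `Platform` struct, the two constants, the function name, and the message text are hypothetical, and the 128M figure is merely the value asked about in the review, not a confirmed AArch64 limit.

```cpp
#include <cstdint>
#include <string>

constexpr std::uint64_t M = 1024ULL * 1024ULL;
constexpr std::uint64_t G = 1024ULL * M;

// Hypothetical per-platform description; in the proposal this role would be
// played by a platform-specific flag named CodeCacheSizeLimit.
struct Platform {
  const char*   name;
  std::uint64_t code_cache_size_limit;
};

// Illustrative values only: 2G mirrors the existing hard-coded check,
// 128M is the figure raised as a question in the review.
constexpr Platform kDefaultPlatform = {"default", 2 * G};
constexpr Platform kLimitedPlatform = {"limited", 128 * M};

// Returns an empty string when the requested size is acceptable, otherwise
// a (made-up) diagnostic. Like the existing check, the limit is inclusive.
std::string check_reserved_code_cache_size(std::uint64_t requested,
                                           const Platform& platform) {
  if (requested > platform.code_cache_size_limit) {
    return "Invalid ReservedCodeCacheSize=" + std::to_string(requested / M) +
           "M. Must be at most " +
           std::to_string(platform.code_cache_size_limit / M) + "M.";
  }
  return std::string();
}
```

With these assumed values, a request of 240M passes on the default platform but is rejected on the limited one, which is the situation the existing `> 2*G` test alone would not catch.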
In new files, the last year in the copyright notice should be 2014.

Thanks,
Vladimir

On 11/12/14 8:49 AM, Andrew Haley wrote:
> On 11/12/2014 11:40 AM, Lindenmaier, Goetz wrote:
>> Hi Andrew,
>
> Hi,
>
> Thank you for your comment. I have prepared a new webrev at
>
> http://cr.openjdk.java.net/~aph/aarch64-8064611-1/
>
> which I hope addresses everything you mentioned.
>
> I haven't re-ordered any of the lists of processors because I think
> this is a separate issue.
>
> Andrew.
>

From dean.long at oracle.com Fri Nov 14 06:15:25 2014
From: dean.long at oracle.com (Dean Long)
Date: Thu, 13 Nov 2014 22:15:25 -0800
Subject: RFR: AARCH64: Changes to HotSpot shared code
In-Reply-To: <54647A85.6020203@redhat.com>
References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> <54647A85.6020203@redhat.com>
Message-ID: <54659DFD.3060001@oracle.com>

On 11/13/2014 1:31 AM, Andrew Haley wrote:
> On 12/11/14 20:23, Dean Long wrote:
>> On 11/11/2014 11:02 AM, Andrew Haley wrote:
>>> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch
>>>
>>> Everything except cpu/ and os_cpu/.
>>>
>>> Most of this is obvious and trivial, with a few exceptions.
>>>
>>> In memory/metaspace.cpp, we allocated the memory for metadata in a
>>> different way. This is because we want to be able to decode and
>>> encode compressed metadata pointers with a single instruction, and we
>>> can always do that iff the base address is of a particular form.
>>>
>>> In opto/, we have made some changes in order to be able to use AArch64
>>> store release instructions for volatile field stores. These don't
>>> require leading or trailing barriers. We have tried several times to
>>> do this without changing shared code, but it is impossible with the
>>> current back-end interface.
>> Is this something ppc64 can also take advantage of? I hope Vladimir can
>> suggest
>> a more flexible way to do this, perhaps with a runtime flag.
> Perhaps so, but as far as I'm aware AArch64 is the only CPU with
> exactly these semantics. From my point of view, it would be ideal if
> we simply emitted volatile store and volatile load as nodes and let
> the back end handle them. But if we do that we lose the opportunity
> to coalesce barriers in C2 optimization. Hmmm.... :-)
>
>>> In several places a release store is used where the AArch64 memory
>>> model makes it unnecessary. From earlier emails on this list we
>>> discovered that the only architecture which requires this release
>>> store is IA64, and OpenJDK does not support it anyway. We should
>>> perhaps look at re-engineering the way that memory barriers and memory
>>> accesses are handled in HotSpot with a view to pushing all these
>>> architecture-dependent assumptions out to the back ends.
>> I agree. More comments below.
>>> Andrew.
>> c1_Canonicalizer.cpp
>> Can this be handled in the back-end? I imagine other platforms,
>> such as x86, have similar limitations.
> It certainly could be. Maybe pd_valid_shift_count() ? But I'm striving
> not to touch any other ports.

What I'm actually wondering is what happens if you remove the AARCH64
log2_scale check altogether. As far as I can tell, it isn't needed,
because in do_UnsafeGetRaw, the scale is only used directly in the
LIR_Address for X86 and ARM. For other platforms, we do:

    LIR_Opr tmp = new_pointer_register();
    __ shift_left(index_op, log2_scale, tmp);
    addr = new LIR_Address(base_op, tmp, dst_type);

so you don't have to worry about a mis-scaled load on AARCH64.

>> c1_LIR.cpp
>> It looks like you need a temp for convert because your backend
>> is checking the FPSR.
>> What happens if you ignore the FPSR, do you get a wrong result?
> I've looked for a while, and I'm sorry but I don't understand which
> hunk this refers to.

This one, for lir_convert:

#if defined(PPC) || defined(TARGET_ARCH_aarch64)
  if (opConvert->_tmp1->is_valid()) do_temp(opConvert->_tmp1);
  if (opConvert->_tmp2->is_valid()) do_temp(opConvert->_tmp2);
#endif

I'm wondering if, for example, d2l in the back-end needs to check
FPSCR.IOC? If you get the correct result even if FPSCR.IOC is set,
then you should be able to simply ignore FPSCR.IOC.

dl

>> c1_LinearScan.cpp
>> I'm not familiar with what the changed code is doing. Can you
>> explain why it applies to x86 and aarch64?
> It certainly was at the time. I'll investigate to see if this is
> still needed.
>
>> c1_Runtime1.cpp
>> This will break our closed port that uses NOP instructions for
>> patching.
> Ah, interesting. I spent quite a lot of time kicking around ideas for
> C1 patching, but (to my surprise) deoptimizing instead didn't seem to
> have a significant adverse effect.
>
>> How about moving your deopt-instead-of-patch support
>> into Runtime1::patch_code() and enable it with a read-only
>> platform-specific developer runtime flag
>> (see INTPRESSURE for example)?
> Okay. I'll have a look at that.
>
>> compiledIC.hpp
>> You should be able to use set_inst_mark()/cbuf.insts_mark() to set
>> and retrieve the mark address.
> Okay.
>
>> arguments.cpp
>> I wish there was a way to fix ReservedCodeCacheSize in the back-end.
> Indeed.
>
> Thanks,
> Andrew.

From staffan.larsen at oracle.com Fri Nov 14 07:51:50 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Fri, 14 Nov 2014 08:51:50 +0100
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5464D3F2.9060107@oracle.com>
References: <5464B874.4060206@oracle.com> <5464D3F2.9060107@oracle.com>
Message-ID: <1B9130A1-DC34-4562-8E1C-CF59E41010E2@oracle.com>

Looks good!

Thanks,
/Staffan

From goetz.lindenmaier at sap.com Fri Nov 14 07:56:49 2014
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 14 Nov 2014 07:56:49 +0000
Subject: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning
In-Reply-To: <54649BBD.2040009@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CF27134@DEWDFEMB12A.global.corp.sap> <54648941.7070108@oracle.com> <54649BBD.2040009@oracle.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CF27540@DEWDFEMB12A.global.corp.sap>

Hi Stefan,

thanks for handling this!
David, Thomas, thanks for reviews!

Best regards,
Goetz.

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Donnerstag, 13.
November 2014 12:54 To: Stefan Karlsson; Lindenmaier, Goetz; hotspot-dev at openjdk.java.net Subject: Re: RFR(XS): 8064786: Fix debug build after 8062808: Turn on the -Wreturn-type warning On 13/11/2014 8:34 PM, Stefan Karlsson wrote: > On 2014-11-13 11:20, Lindenmaier, Goetz wrote: >> Hi, >> >> please review, test and sponsor this tiny change. It fixes the debug >> build in the gc repository. >> https://bugs.openjdk.java.net/browse/JDK-8064786 >> http://cr.openjdk.java.net/~goetz/webrevs/8064786-warnRet/webrev.00/ > > Looks good. Thanks for fixing. Another approach would be to just remove > the ShouldNotReachHere() lines. > > I'll push when we get another review. Reviewed. But I'm concerned as to how this was not detected with the original fix. I assume we don't build with a compiler that complains about this code? Thanks, David > thanks, > StefanK > >> >> Best regards, >> Goetz. > From Alan.Bateman at oracle.com Fri Nov 14 08:03:38 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 14 Nov 2014 08:03:38 +0000 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job In-Reply-To: <5464B874.4060206@oracle.com> References: <5464B874.4060206@oracle.com> Message-ID: <5465B75A.5080609@oracle.com> On 13/11/2014 13:56, Mikael Auno wrote: > Hi, > > Could I please get a review of this addition of SVC tests to JPRT submit > jobs. So far, I'm only adding JDI tests as those are the only ones I > have completed code coverage analysis on to determine the best subset to > add. The other areas will be added too, but I'm adding these now to get > the ball rolling asap. > > I've run these through JPRT once already without failures and have got > two more runs in the pipe. I've also looked through the history for > these tests and found that they do not have any known instabilities to > worry about. 
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8064799 > Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/ > > This doesn't look very maintainable. If we are adding *_sanity groups then would be better to move them to their own section of the file so that they are in the middle of the main grouping of the tests? Also I think we should avoid list individual tests if we can, would it be better to leave out JDI until your analysis is completed? -Alan From bengt.rutisson at oracle.com Fri Nov 14 08:10:35 2014 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 14 Nov 2014 09:10:35 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <5464E642.9080200@oracle.com> References: <546228E3.8030207@oracle.com> <54643974.9050805@oracle.com> <546485BF.30802@oracle.com> <5464C1C2.2080804@oracle.com> <5464CE98.80308@oracle.com> <5464E642.9080200@oracle.com> Message-ID: <5465B8FB.4020004@oracle.com> Hi Mikael, Latest webrev looks good to me too. Thanks, Bengt On 2014-11-13 18:11, Daniel D. Daugherty wrote: > On 11/13/14 8:30 AM, Mikael Gerdin wrote: >> Hi Dan, >> >> On 2014-11-13 15:35, Daniel D. Daugherty wrote: >>> > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ >>> >>> src/share/vm/gc_implementation/g1/heapRegion.cpp >>> nit line 1007: assert(local_time_stamp <= >>> g1h->get_gc_time_stamp(), >>> "invariant" ); >>> You got rid of the space after the '('. Can you also get >>> rid of the space before ');'? >> >> I'll fix that before I push the change. > > Thanks! > > >> >>> >>> Thanks for putting _gc_time_stamp in a local so that it's >>> not fetched more than once. >>> >>> Would it be useful to move the assert() after the setting of >>> local_top? That would give someone debugging the assertion >>> failure core file a little more context. Dunno... your call. 
>> >> The assert is not strictly related to the _top field, if the time >> stamp assert fails we're most likely experiencing a completely >> different problem. >> I'd rather leave the assert where it is. > > No problem. > > Dan > > >> >> /Mikael >> >>> >>> Dan >>> >>> >>> On 11/13/14 3:19 AM, Mikael Gerdin wrote: >>>> Hi David, >>>> >>>> On 2014-11-13 05:54, David Holmes wrote: >>>>> Hi Mikael, >>>>> >>>>> Without knowing the details it is hard to determine the >>>>> correctness of >>>>> this. What you describe below sounds reasonable - but what about the >>>>> opposite problem in the new code: what if you read an old top() >>>>> then a >>>>> new timestamp, before top() is updated? Will that work correctly >>>>> or will >>>>> the region between the old-top and new-top be missed? >>>> >>>> I realize that not everyone is up to speed on the specifics of this >>>> code, but I appreciate you feedback on the general reasoning. >>>> >>>> Reading an old _top value is safe, and in fact we must enforce that >>>> the only _top values we ever return from this functions were set >>>> before the GC occurred. >>>> >>>> Reading a too recent _top value is the cause of the crash in this bug, >>>> since if this function returns a a recently updated _top value that is >>>> because another GC worker has allocated into this region and is in the >>>> process of copying objects into it. The point of the timestamp value >>>> is to only return old values of _top and if the timestamp is current >>>> it should return another value. >>>> >>>> I've updated the webrev slightly due to off-list feedback that I >>>> should attempt to avoid reading the time stamp more than once (for the >>>> assert). >>>> I've also noticed that I messed up the indentation of a curly brace so >>>> I fixed that as well. 
>>>> >>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.1/ >>>> >>>> Incremental webrev: >>>> http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0_to_1/ >>>> >>>> /Mikael >>>> >>>>> >>>>> Cheers, >>>>> David H. >>>>> >>>>> On 12/11/2014 1:18 AM, Mikael Gerdin wrote: >>>>>> Hi all, >>>>>> >>>>>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the >>>>>> hope >>>>>> of getting some extra feedback from our resident concurrency >>>>>> experts. >>>>>> >>>>>> Please review this subtle change to the order in which we read >>>>>> fields in >>>>>> G1OffsetTableContigSpace::saved_mark_word, original included here >>>>>> for >>>>>> reference: >>>>>> 1003 >>>>>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>>>> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>>>> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>>>>> "invariant" ); >>>>>> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >>>>>> 1008 return top(); >>>>>> 1009 else >>>>>> 1010 return Space::saved_mark_word(); >>>>>> 1011 } >>>>>> 1012 >>>>>> >>>>>> When getting a new gc alloc region several stores are performed >>>>>> where >>>>>> store ordering needs to be enforced and several synchronization >>>>>> points >>>>>> occur. >>>>>> [write path] >>>>>> ST(_saved_mark_word) >>>>>> #StoreStore >>>>>> ST(_gc_time_stamp) >>>>>> ST(_top) // satisfying alloc request >>>>>> #StoreStore >>>>>> ST(_alloc_region) // publishing to other gc workers >>>>>> #MonitorEnter >>>>>> ST(_top) // potential further allocations >>>>>> #MonitorExit >>>>>> #MonitorEnter >>>>>> ST(_top) // potential further allocations >>>>>> #MonitorExit >>>>>> >>>>>> When we inspect a region during remembered set scanning we need to >>>>>> ensure that we never read memory which have been allocated by a GC >>>>>> worker thread for the purpose of copying objects into. 
>>>>>> The way this works is that a time stamp field is supposed to signal >>>>>> to a >>>>>> scanning thread that it should look at addresses below _top if >>>>>> the time >>>>>> stamp is old or addresses below _saved_mark_word if the time >>>>>> stamp is >>>>>> current. >>>>>> >>>>>> The current code does (as seen above) >>>>>> [read path] >>>>>> LD(_gc_time_stamp) >>>>>> LD(_top) >>>>>> or (depending on time stamp) >>>>>> LD(_saved_mark_word) >>>>>> >>>>>> Because these values are written to without full mutual exclusion we >>>>>> need to be very careful about the order in which we read these >>>>>> values, >>>>>> and this is where I argue that the current code is incorrect. >>>>>> In order to observe a consistent view of the ordered stores in the >>>>>> [write path] above we need to load the values in the reverse >>>>>> order they >>>>>> were written, with proper #LoadLoad ordering enforced. >>>>>> >>>>>> The problem which we've observed here is that after we've read >>>>>> the time >>>>>> stamp as below the heap time stamp the top pointer can be updated >>>>>> by a >>>>>> GC worker allocating objects into this region. To make sure that the >>>>>> top >>>>>> value we see is in fact valid we must read it before we read the >>>>>> time >>>>>> stamp which determines which value we should return from the >>>>>> saved_mark_word function. 
>>>>>> >>>>>> My suggested fix is to load _top first and enforce #LoadLoad >>>>>> ordering >>>>>> enforced: >>>>>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>>>>> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>>>>> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>>>>> "invariant" ); >>>>>> HeapWord* local_top = top(); >>>>>> OrderAccess::loadload(); >>>>>> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >>>>>> return local_top; >>>>>> } else { >>>>>> return Space::saved_mark_word(); >>>>>> } >>>>>> } >>>>>> >>>>>> I've successfully reproduced the crash with the original code by >>>>>> adding >>>>>> some random sleep calls between the load of the time stamp and >>>>>> the load >>>>>> of top so I'm fairly certain that this resolves the issue. I've also >>>>>> verified that the fix I'm proposing does resolve the bug for the >>>>>> team >>>>>> which encountered the issue, even if I can't reproduce that crash >>>>>> locally. >>>>>> >>>>>> I also plan to attempt design around some of the races in this >>>>>> code to >>>>>> reduce its complexity, but for the sake of backporting the fix to >>>>>> 8u40 >>>>>> I'd like to start with just adding the minimal fix. 
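[Editorial sketch, not part of the original thread.] The read-in-reverse-order argument above can be illustrated with standard C++11 atomics. This is only a minimal sketch under stated assumptions: `Region`, `start_gc_alloc` and `scan_limit` are hypothetical names, `std::atomic` acquire/release operations stand in for HotSpot's `OrderAccess` barriers, and the release store on `top` is stronger than the #StoreStore placement listed in the write path. It is not the G1 code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a heap region; the field names follow the
// discussion above, the type itself is not a HotSpot class.
struct Region {
  std::atomic<std::uintptr_t> top{0};         // current allocation pointer
  std::atomic<std::uintptr_t> saved_mark{0};  // value of top before this GC
  std::atomic<unsigned>       time_stamp{0};  // GC in which the region was last set up
};

// Writer side: the region becomes a GC alloc region and the first request is
// satisfied. Stores happen in the order saved_mark, time_stamp, top.
inline void start_gc_alloc(Region& r, unsigned heap_stamp, std::uintptr_t new_top) {
  r.saved_mark.store(r.top.load(std::memory_order_relaxed),
                     std::memory_order_relaxed);
  // Release: saved_mark is visible to anyone who observes the new stamp.
  r.time_stamp.store(heap_stamp, std::memory_order_release);
  // Release: the new stamp is visible to anyone who observes the new top.
  r.top.store(new_top, std::memory_order_release);
}

// Reader side: load in the reverse order of the stores. Loading top first
// means that an "old" time stamp read afterwards implies top was old too.
inline std::uintptr_t scan_limit(const Region& r, unsigned heap_stamp) {
  std::uintptr_t local_top = r.top.load(std::memory_order_acquire);
  if (r.time_stamp.load(std::memory_order_acquire) < heap_stamp) {
    return local_top;  // region untouched by this GC: scan up to top
  }
  return r.saved_mark.load(std::memory_order_relaxed);  // scan only pre-GC part
}
```

With this ordering, a scanning thread that observes a top value written during the current GC is guaranteed to also observe the current time stamp, so it falls back to `saved_mark` instead of returning memory that a worker is still copying into.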
>>>>>> >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >>>>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >>>>>> Testing: JPRT, local kitchensink (4 hours), gc test suite >>>>>> >>>>>> Thanks >>>>>> /Mikael >>> > From magnus.ihse.bursie at oracle.com Fri Nov 14 08:39:36 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 14 Nov 2014 09:39:36 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> Message-ID: <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> > 13 nov 2014 kl. 19:33 skrev Christian Thalinger : > > >> On Nov 13, 2014, at 6:09 AM, Magnus Ihse Bursie wrote: >> >>> On 2014-11-10 11:32, Volker Simonis wrote: >>> On Mon, Nov 10, 2014 at 10:42 AM, Erik Joelsson >>> wrote: >>>> On 2014-11-10 10:27, Volker Simonis wrote: >>>>> On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson >>>>> wrote: >>>>>> Hello, >>>>>> >>>>>> I would certainly like to have these files updated, but unfortunately the >>>>>> license on these files changed from GPL2 to GPL3. This essentially means >>>>>> that the switch is non trivial from a legal perspective and the >>>>>> impression >>>>>> I've received when I last inquired about updating these files was that >>>>>> it's >>>>>> unlikely to ever happen unless a very strong case can be presented for >>>>>> why >>>>>> it's needed. >>>>>> >>>>>> So the reason we have the over engineered solution for config.guess is >>>>>> simply that it's much easier than getting legal approval for updating >>>>>> these >>>>>> files. >>>>> OK, but in that case I don't see any reason for keeping this >>>>> "over-engineered" solution at all. 
If there will not be any pulls from >>>>> upstream anyway then there's no reason for keeping these files >>>>> untouched. I'd propose then to just remove the wrappers and do all the >>>>> changes right in the corresponding files (of course that's not the >>>>> topic of this change but should be done separately). >>>> And again, the reason we didn't change the existing file but instead wrapped >>>> it, was that we don't have explicit legal approval for doing derivative work >>>> for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the >>>> person saying it is ok. >>> OK, now I got it. I thought we just use the wrappers because we want >>> to easily integrate the upstream versions. But instead it is only >>> because we don't want to edit these files because of legal >>> uncertainties. >>> >>> So in that case that means we're also not allowed to edit 'config.sub' >>> and have to create a wrapper for it, right? >> >> Yes, you are correct. We cannot modify these files. >> >> As far as I understand, the legal reason for including these files is the explicit exception: >> >> # As a special exception to the GNU General Public License, if you >> # distribute this file as part of a program that contains a >> # configuration script generated by Autoconf, you may include it under >> # the same distribution terms that you use for the rest of that program. >> >> But this is just a distribution license, not a modification license. >> >> From my IANAL point of view, this exception should be enough to disregard if the file is also distributed under GPL2 or GPL3. Unfortunately, as Erik says, our lawyers are apprehensive of GPL3. So while we thought that we could be able to periodically sync these files with upstream (and remove our external "patches" after a while), we have not been able to do so. > > Why do we have these files in our repository in the first place? Because they are needed by the configure script. 
They are a sort of runtime libraries for configure, but since they are written as shell scripts, the source code form and the executable form is the same. The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. /Magnus > >> >> So, this fix will need to do the same dance with config.sub as for config.guess. Unfortunately. :( >> >> /Magnus > From staffan.larsen at oracle.com Fri Nov 14 08:40:27 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 14 Nov 2014 09:40:27 +0100 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job In-Reply-To: <5465B75A.5080609@oracle.com> References: <5464B874.4060206@oracle.com> <5465B75A.5080609@oracle.com> Message-ID: <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> > On 14 nov 2014, at 09:03, Alan Bateman wrote: > > On 13/11/2014 13:56, Mikael Auno wrote: >> Hi, >> >> Could I please get a review of this addition of SVC tests to JPRT submit >> jobs. So far, I'm only adding JDI tests as those are the only ones I >> have completed code coverage analysis on to determine the best subset to >> add. The other areas will be added too, but I'm adding these now to get >> the ball rolling asap. >> >> I've run these through JPRT once already without failures and have got >> two more runs in the pipe. I've also looked through the history for >> these tests and found that they do not have any known instabilities to >> worry about. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8064799 >> Webrev: http://cr.openjdk.java.net/~miauno/8064799/webrev.00/ >> >> > This doesn't look very maintainable. If we are adding *_sanity groups then would be better to move them to their own section of the file so that they are in the middle of the main grouping of the tests? 
Also I think we should avoid list individual tests if we can, would it be better to leave out JDI until your analysis is completed? So the goal here has been to increase the test coverage of hotspot jprt push jobs, but with a limited impact on execution time. This is all to make sure hotspot changes do not break serviceability features. While it would be great to run all tests at all times, we don't have time for that. Mikael has been doing code coverage analysis to find the subset of tests that provides the biggest bang for the buck. Starting with JDI is as good as any place to start. I agree that listing individual tests is not particularly appealing, but I don't see many other options. We could possibly use @key tags to select the tests but there isn't much support in makefiles and jprt for that if I recall. We could use sub-folders, but that quickly gets out of hand. We could move the _sanity lists to one place in the file to make it easier to see the rest. /Staffan From mikael.auno at oracle.com Fri Nov 14 08:47:49 2014 From: mikael.auno at oracle.com (Mikael Auno) Date: Fri, 14 Nov 2014 09:47:49 +0100 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job In-Reply-To: <546568B8.2050305@oracle.com> References: <5464B874.4060206@oracle.com> <5464D3F2.9060107@oracle.com> <546568B8.2050305@oracle.com> Message-ID: <5465C1B5.3080109@oracle.com> On 2014-11-14 03:28, David Holmes wrote: > So to be clear this adds the JDK JDI tests (select subset thereof) to > "-testset hotspot" for full forest submissions to JPRT. Yes, that is precisely the idea; adding SVC tests to hotspot JPRT pushes so that development in other teams does not break SVC features. This has already been done by compiler and runtime in the same way too, although their test groups are in hotspot/test instead of jdk/test. 
Mikael From stefan.karlsson at oracle.com Fri Nov 14 08:39:36 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 14 Nov 2014 09:39:36 +0100 Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements In-Reply-To: References: <5464C741.2060105@oracle.com> <5464CF3B.4060004@oracle.com> <5464CE37.9020102@oracle.com> Message-ID: <5465BFC8.7080809@oracle.com> Hi Volker, On 2014-11-13 18:24, Volker Simonis wrote: > Hi Stefan, > > thanks a LOT for doing this change. This one was on my list since > years (see "6889002: CHECK macros in return constructs lead to > unreachable code" at https://bugs.openjdk.java.net/browse/JDK-6889002) > but I never finalized it. I didn't CHECK for already open bugs/enhancements for this ;) > > Your change looks good and I think it must be stressed that your code > doesn't change any functionality because it only eliminates dead code! > > If we will have code after your change which fails to check for > pending exceptions or which expects to get a certain result in the > case of a pending exceptions it was just es wrong already before your > change because the exception check was never reached. So unless we see > one of these errors I don't think it will be necessary to introduce > all these temporary variables. > > When I started to work on this problem years ago I went the other way > round: I looked at the called functions (i.e. klass_at() in your > example) to check if they already return NULL in the case of a pending > exception. As far as I remember they all behaved "well" - otherwise we > would already have seen some problems with the current implementation. Thanks for verifying this. > > Thanks and best regards, > Volker > > PS: our HPUX C++ compiler will love this change:) Great! 
Thanks, StefanK > > > On Thu, Nov 13, 2014 at 4:28 PM, Stefan Karlsson > wrote: >> On 2014-11-13 16:33, Coleen Phillimore wrote: >>> >>> The thing that I worry about with this change is that if someone adds code >>> later after the return, they'll miss changing the THREAD parameter back into >>> CHECK. >> >> >> I was thinking the same, but I knew the we used this idiom in other places >> in the JVM and thought that changing these would be OK. An alternative >> approach would be to always read out the value into a variable: >> >> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >> Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL); >> return k; >> } >> >> I can do that if people feel more comfortable with it. >> >> Thanks, >> Stefank >> >> >>> But maybe it's okay because the thing returned will be NULL and code is >>> likely to crash on someone trying to use the value returned. Ok. This is a >>> good cleanup. I'm surprised there weren't more. >>> >>> Thanks, >>> Coleen >>> >>> On 11/13/14, 9:59 AM, Stefan Karlsson wrote: >>>> Hi all, >>>> >>>> Please, review this patch to replace usages of the CHECK_ macros in >>>> return statements, with the THREAD define. >>>> >>>> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/ >>>> https://bugs.openjdk.java.net/browse/JDK-8064811 >>>> >>>> From the bug report: >>>> >>>> Take the following method as an example: >>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>>> return klass_at(klass_ref_index_at(which), CHECK_NULL); >>>> } >>>> >>>> This will expand into: >>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>>> return klass_at(klass_ref_index_at(which), THREAD); >>>> if (HAS_PENDING_EXCEPTIONS) { >>>> return NULL; >>>> } >>>> (void)(0); >>>> } >>>> >>>> The if-statement will never be reached. >>>> >>>> We have seen cases where the compiler warns about this, and the recent >>>> change to enable -Wreturn-type will make this more likely to happen. 
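[Editorial sketch, not part of the original thread.] The dead-code expansion described above can be reproduced in a small standalone program. This is a minimal sketch under stated assumptions: `Thread`, `Klass`, `klass_at` and the three macros below are simplified, hypothetical stand-ins written for illustration; they only imitate the shape of HotSpot's `TRAPS`/`CHECK_NULL` machinery and are not copied from it.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-ins for the VM types discussed in the thread.
struct Thread { bool pending_exception = false; };
struct Klass {};
static Klass the_klass;

#define TRAPS Thread* THREAD
#define HAS_PENDING_EXCEPTION (THREAD->pending_exception)
// The macro closes the call's argument list and appends an exception check
// as a separate statement after the call.
#define CHECK_NULL THREAD); if (HAS_PENDING_EXCEPTION) return NULL; (void)(0

// Callee that "throws" by flagging the thread and returning NULL.
Klass* klass_at(int index, TRAPS) {
  if (index < 0) {
    THREAD->pending_exception = true;
    return NULL;
  }
  return &the_klass;
}

// CHECK_NULL inside a return statement: the appended check comes after the
// return, so it can never execute.
Klass* klass_ref_at_dead_check(int which, TRAPS) {
  return klass_at(which, CHECK_NULL);
}

// The change under review: pass THREAD and rely on the callee returning NULL.
Klass* klass_ref_at_thread(int which, TRAPS) {
  return klass_at(which, THREAD);
}

// The alternative style from the discussion: with a temporary, the appended
// check sits between the call and the return and is reachable.
Klass* klass_ref_at_temp(int which, TRAPS) {
  Klass* k = klass_at(which, CHECK_NULL);
  return k;
}
```

In this sketch all three variants return the same value, because the hypothetical `klass_at` already returns NULL when it flags an exception. That is the property the thread relies on when arguing that replacing `CHECK_NULL` with `THREAD` in return statements does not change behavior.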
>>>> >>>> The suggested solution is to change the example above into: >>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>>> return klass_at(klass_ref_index_at(which), THREAD); >>>> } >>>> >>>> thanks, >>>> StefanK >>> From Alan.Bateman at oracle.com Fri Nov 14 08:53:43 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 14 Nov 2014 08:53:43 +0000 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job In-Reply-To: <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> References: <5464B874.4060206@oracle.com> <5465B75A.5080609@oracle.com> <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> Message-ID: <5465C317.2060201@oracle.com> On 14/11/2014 08:40, Staffan Larsen wrote: > : > > So the goal here has been to increase the test coverage of hotspot > jprt push jobs, but with a limited impact on execution time. This is > all to make sure hotspot changes do not break serviceability features. > While it would be great to run all tests at all times, we don't have > time for that. Mikael has been doing code coverage analysis to find > the subset of tests that provides the biggest bang for the buck. > Starting with JDI is as good as any place to start. > > I agree that listing individual tests is not particularly appealing, > but I don't see many other options. We could possibly use @key tags to > select the tests but there isn't much support in makefiles and jprt > for that if I recall. We could use sub-folders, but that quickly gets > out of hand. > > We could move the _sanity lists to one place in the file to make it > easier to see the rest. > My main concern is keeping the test group hierarchy easy to understand and maintain. It has to be easy to identify any tests that aren't run by any of the top-level groups for example. For serviceability then the original idea was to have the jdk_svc group that ran all of the tests, this in turn consisted of 5 sub-groups to cover the various areas. 
This sub-groups will execute concurrently on different JPRT clients and works reasonably well, albeit with some imbalance in the execution time. The "real" definitions of the groups end just after the section with the comment "Client area groups" so if you can move the new sanity groups down to below that with a good comment to explain what they are then it should okay. At some point we need to look at removing completely the "Profile based ..." groups at the end. They are completely unmaintainable and there are much better ways of doing this now (with @requires). -Alan From stefan.karlsson at oracle.com Fri Nov 14 08:44:53 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 14 Nov 2014 09:44:53 +0100 Subject: RFR: 8064811: Use THREAD instead of CHECK_NULL in return statements In-Reply-To: <54656B4E.4000809@oracle.com> References: <5464C741.2060105@oracle.com> <5464CF3B.4060004@oracle.com> <5464CE37.9020102@oracle.com> <54656B4E.4000809@oracle.com> Message-ID: <5465C105.7080909@oracle.com> On 2014-11-14 03:39, David Holmes wrote: > On 14/11/2014 3:24 AM, Volker Simonis wrote: >> Hi Stefan, >> >> thanks a LOT for doing this change. This one was on my list since >> years (see "6889002: CHECK macros in return constructs lead to >> unreachable code" at https://bugs.openjdk.java.net/browse/JDK-6889002) >> but I never finalized it. >> >> Your change looks good and I think it must be stressed that your code >> doesn't change any functionality because it only eliminates dead code! >> >> If we will have code after your change which fails to check for >> pending exceptions or which expects to get a certain result in the >> case of a pending exceptions it was just es wrong already before your >> change because the exception check was never reached. So unless we see >> one of these errors I don't think it will be necessary to introduce >> all these temporary variables. 
>> >> When I started to work on this problem years ago I went the other way >> round: I looked at the called functions (i.e. klass_at() in your >> example) to check if they already return NULL in the case of a pending >> exception. As far as I remember they all behaved "well" - otherwise we >> would already have seen some problems with the current implementation. > > This was my concern too - what actually gets returned in those cases! > But if this has been verified, and as you say we don't see problems > because of this, then the change to just THREAD seems okay. > > Personally though I prefer the alternative style: > > Klass* ConstantPool::klass_ref_at(int which, TRAPS) { > Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL); > return k; > } > > I'll leave it up to Stefan. :) OK. I'll push the patch as it's currently written. Then if we decide to use temporary variables instead, we can do that as a separate patch. Thanks, StefanK > > Cheers, > David > > >> Thanks and best regards, >> Volker >> >> PS: our HPUX C++ compiler will love this change:) >> >> >> On Thu, Nov 13, 2014 at 4:28 PM, Stefan Karlsson >> wrote: >>> >>> On 2014-11-13 16:33, Coleen Phillimore wrote: >>>> >>>> >>>> The thing that I worry about with this change is that if someone >>>> adds code >>>> later after the return, they'll miss changing the THREAD parameter >>>> back into >>>> CHECK. >>> >>> >>> >>> I was thinking the same, but I knew the we used this idiom in other >>> places >>> in the JVM and thought that changing these would be OK. An alternative >>> approach would be to always read out the value into a variable: >>> >>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>> Klass* k = klass_at(klass_ref_index_at(which), CHECK_NULL); >>> return k; >>> } >>> >>> I can do that if people feel more comfortable with it. 
>>> >>> Thanks, >>> Stefank >>> >>> >>>> But maybe it's okay because the thing returned will be NULL and >>>> code is >>>> likely to crash on someone trying to use the value returned. Ok. >>>> This is a >>>> good cleanup. I'm surprised there weren't more. >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 11/13/14, 9:59 AM, Stefan Karlsson wrote: >>>>> >>>>> Hi all, >>>>> >>>>> Please, review this patch to replace usages of the CHECK_ macros in >>>>> return statements, with the THREAD define. >>>>> >>>>> http://cr.openjdk.java.net/~stefank/8064811/webrev.01/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8064811 >>>>> >>>>> From the bug report: >>>>> >>>>> Take the following method as an example: >>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>>>> return klass_at(klass_ref_index_at(which), CHECK_NULL); >>>>> } >>>>> >>>>> This will expand into: >>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) { >>>>> return klass_at(klass_ref_index_at(which), THREAD); >>>>> if (HAS_PENDING_EXCEPTIONS) { >>>>> return NULL; >>>>> } >>>>> (void)(0); >>>>> } >>>>> >>>>> The if-statement will never be reached. >>>>> >>>>> We have seen cases where the compiler warns about this, and the >>>>> recent >>>>> change to enable -Wreturn-type will make this more likely to happen. 
>>>>>
>>>>> The suggested solution is to change the example above into:
>>>>> Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
>>>>>   return klass_at(klass_ref_index_at(which), THREAD);
>>>>> }
>>>>>
>>>>> thanks,
>>>>> StefanK
>>>>
>>>>
>>>

From mikael.auno at oracle.com Fri Nov 14 09:32:16 2014
From: mikael.auno at oracle.com (Mikael Auno)
Date: Fri, 14 Nov 2014 10:32:16 +0100
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5465C317.2060201@oracle.com>
References: <5464B874.4060206@oracle.com> <5465B75A.5080609@oracle.com> <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> <5465C317.2060201@oracle.com>
Message-ID: <5465CC20.40802@oracle.com>

On 2014-11-14 09:53, Alan Bateman wrote:
> On 14/11/2014 08:40, Staffan Larsen wrote:
>> :
>>
>> So the goal here has been to increase the test coverage of hotspot
>> jprt push jobs, but with a limited impact on execution time. This is
>> all to make sure hotspot changes do not break serviceability features.
>> While it would be great to run all tests at all times, we don't have
>> time for that. Mikael has been doing code coverage analysis to find
>> the subset of tests that provides the biggest bang for the buck.
>> Starting with JDI is as good as any place to start.
>>
>> I agree that listing individual tests is not particularly appealing,
>> but I don't see many other options. We could possibly use @key tags to
>> select the tests but there isn't much support in makefiles and jprt
>> for that if I recall. We could use sub-folders, but that quickly gets
>> out of hand.
>>
>> We could move the _sanity lists to one place in the file to make it
>> easier to see the rest.
>>
>
> My main concern is keeping the test group hierarchy easy to understand
> and maintain. It has to be easy to identify any tests that aren't run by
> any of the top-level groups for example.
For serviceability then the > original idea was to have the jdk_svc group that ran all of the tests, > this in turn consisted of 5 sub-groups to cover the various areas. This > sub-groups will execute concurrently on different JPRT clients and works > reasonably well, albeit with some imbalance in the execution time. > > The "real" definitions of the groups end just after the section with the > comment "Client area groups" so if you can move the new sanity groups > down to below that with a good comment to explain what they are then it > should okay. > > At some point we need to look at removing completely the "Profile based > ..." groups at the end. They are completely unmaintainable and there are > much better ways of doing this now (with @requires). Here's an updated webrev with your proposed changes. http://cr.openjdk.java.net/~miauno/8064799/webrev.01/ Mikael From aph at redhat.com Fri Nov 14 09:46:19 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 14 Nov 2014 09:46:19 +0000 Subject: RFR: AARCH64: Changes to HotSpot shared code In-Reply-To: <54654515.1080801@oracle.com> References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> <54647A85.6020203@redhat.com> <54649688.7040401@redhat.com> <54654515.1080801@oracle.com> Message-ID: <5465CF6B.3050305@redhat.com> On 13/11/14 23:56, Vladimir Kozlov wrote: > On 11/13/14 3:31 AM, Andrew Haley wrote: >> On 11/13/2014 09:31 AM, Andrew Haley wrote: >>>>> How about moving your deopt-instead-of-patch support >>>>> into Runtime1::patch_code() and enable it with a read-only >>>>> platform-specific developer runtime flag >>>>> (see INTPRESSURE for example)? >>> Okay. I'll have a look at that. >> >> Does this mean that I'll need to add a flag to all back ends? > > Yes. And we will help with closed changes. Okay, I've got that. Will do. Andrew. 
From Alan.Bateman at oracle.com Fri Nov 14 09:55:40 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 14 Nov 2014 09:55:40 +0000
Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job
In-Reply-To: <5465CC20.40802@oracle.com>
References: <5464B874.4060206@oracle.com> <5465B75A.5080609@oracle.com> <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> <5465C317.2060201@oracle.com> <5465CC20.40802@oracle.com>
Message-ID: <5465D19C.3040609@oracle.com>

On 14/11/2014 09:32, Mikael Auno wrote:
> :
> Here's an updated webrev with your proposed changes.
>
> http://cr.openjdk.java.net/~miauno/8064799/webrev.01/
>
This looks okay, thanks for taking the concern on board.

-Alan

From thomas.stuefe at gmail.com Fri Nov 14 09:59:49 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 14 Nov 2014 10:59:49 +0100
Subject: RFR (L): 8062370: Various minor code improvements
In-Reply-To: <54641F95.5030201@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> <54641F95.5030201@oracle.com>
Message-ID:

Hi David, thanks for looking.
See here the corrected webrev:

http://cr.openjdk.java.net/~simonis/webrevs/8064779/

for this new bug report:

https://bugs.openjdk.java.net/browse/JDK-8064779

Best Regards, Thomas

On Thu, Nov 13, 2014 at 4:03 AM, David Holmes wrote:

> Hi Thomas,
>
> On 12/11/2014 8:31 PM, Thomas Stüfe wrote:
>
>> Hi,
>>
>> could you please review this little addition? (added comments for
>> jio_snprintf)
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/8062370/
>>
>
> A new bug is needed for these changes.
>
> As people rarely look at the header file when reading the code could you
> augment the last line of the comment in jvm.cpp from:
>
> + // return always -1.
>
> to
>
> + // always return -1, and perform null termination.
>
> Thanks,
> David
>
>

From aph at redhat.com Fri Nov 14 10:07:12 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 14 Nov 2014 10:07:12 +0000
Subject: RFR: AARCH64: Changes to HotSpot shared code
In-Reply-To: <54659DFD.3060001@oracle.com>
References: <54625D3D.4000007@redhat.com> <5463C1CE.9040301@oracle.com> <54647A85.6020203@redhat.com> <54659DFD.3060001@oracle.com>
Message-ID: <5465D450.3090107@redhat.com>

On 14/11/14 06:15, Dean Long wrote:
> On 11/13/2014 1:31 AM, Andrew Haley wrote:
>> On 12/11/14 20:23, Dean Long wrote:
>>> On 11/11/2014 11:02 AM, Andrew Haley wrote:
>>>> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch
>>>>
>>>> Everything except cpu/ and os_cpu/.
>>>>
>>>> Most of this is obvious and trivial, with a few exceptions.
>>>>
>>>> In memory/metaspace.cpp, we allocated the memory for metadata in a
>>>> different way. This is because we want to be able to decode and
>>>> encode compressed metadata pointers with a single instruction, and we
>>>> can always do that iff the base address is of a particular form.
>>>>
>>>> In opto/, we have made some changes in order to be able to use AArch64
>>>> store release instructions for volatile field stores. These don't
>>>> require leading or trailing barriers.
We have tried several times to >>>> do this without changing shared code, but it is impossible with the >>>> current back-end interface. >>> Is this something ppc64 can also take advantage of? I hope Vladimir can >>> suggest >>> a more flexible way to do this, perhaps with a runtime flag. >> Perhaps so, but as far as I'm aware AArch64 is the only CPU with >> exactly these semantics. From my point of view, it would be ideal if >> we simply emitted volatile store and volatile load as nodes and let >> the back end handle them. But if we do that we lose the opportunity >> to coalesce barriers in C2 optimization. Hmmm.... :-) >> >>>> In several places a release store is used where the AArch64 memory >>>> model makes it unnecessary. From earlier emails on this list we >>>> discovered that the only architecture which requires this release >>>> store is IA64, and OpenJDK does not support it anyway. We should >>>> perhaps look at re-engineering the way that memory barriers and memory >>>> accesses are handled in HotSpot with a view to pushing all these >>>> architecture-dependent assumptions out to the back ends. >>> I agree. More comments below. >>>> Andrew. >>> c1_Canonicalizer.cpp >>> Can this be handled in the back-end? I imagine other platforms, >>> such as x86, have similar limitations. >> It certainly could be. Maybe pd_valid_shift_count() ? But I'm striving >> not to touch any other ports. > > What I'm actually wondering is what happens if you remove the AARCH64 > log2_scale check > altogether. As far as I can tell, it isn't needed, because in > do_UnsafeGetRaw, the scale > is only used directly in the LIR_Address for X86 and ARM. For other > platforms, we do: > > LIR_Opr tmp = new_pointer_register(); > __ shift_left(index_op, log2_scale, tmp); > addr = new LIR_Address(base_op, tmp, dst_type); > > so you don't have to worry about a mis-scaled load on AARCH64. Aha! Okay, that may be a more recent change. 
I don't think I would have made that change if it wasn't necessary at the time, but never mind, one hunk is gone, thanks. >>> c1_LIR.cpp >>> It looks like you need a temp for convert because your backend >>> because you're checking the FPSR. >>> What happens if you ignore the FPSR, do you get a wrong result? >> I've looked for a while, and I'm sorry but I don't understand which >> hunk this refers to. > > This one, for lir_convert: > > #if defined(PPC) || defined(TARGET_ARCH_aarch64) > if (opConvert->_tmp1->is_valid()) do_temp(opConvert->_tmp1); > if (opConvert->_tmp2->is_valid()) do_temp(opConvert->_tmp2); > #endif > > I'm wondering if, for example, d2l in the back-end needs to check FPSCR.IOC? > If you get the correct result even if FPSCR.IOC is set, then you should > be able to > simply ignore FPSCR.IOC. Yes, you're quite right. Thanks again, Andrew. From edward.nevill at linaro.org Fri Nov 14 10:34:09 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Fri, 14 Nov 2014 10:34:09 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546572B8.9080005@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> Message-ID: <1415961249.26700.18.camel@mylittlepony.linaroharston> On Thu, 2014-11-13 at 19:10 -0800, Vladimir Kozlov wrote: > runtime/arguments.cpp > > Is it really 128MB max value for ReservedCodeCacheSize on aarch64? What > is default ReservedCodeCacheSize size? Yes. The limit is imposed by the maximum span of the B/BL instructions. In practice we have not found this to be a problem. Overnight testing with Hadoop, Specjbb, SpecJVM, JTreg, jcstress shows it gets nowhere near 128M. 
The default value of ReservedCodeCacheSize depends on the arch and the C1/C2/Tiered settings. So, for example:

aarch64(C1) ReservedCodeCacheSize = 32*M
aarch64(C2) ReservedCodeCacheSize = 48*M
ppc(C2)     ReservedCodeCacheSize = 256*M
x86(C1)     ReservedCodeCacheSize = 32*M
x86(C2)     ReservedCodeCacheSize = 48*M

However, in arguments.cpp, set_tiered_flags increases ReservedCodeCacheSize if TieredCompilation is enabled:

  // Increase the code cache size - tiered compiles a lot more.
  if (FLAG_IS_DEFAULT(ReservedCodeCacheSize)) {
    FLAG_SET_ERGO(uintx, ReservedCodeCacheSize, ReservedCodeCacheSize * 5);
  }

Hence the reason we had to put in the limit of 128M (because 5*48M > 128M, unless we were willing to reduce the default ReservedCodeCacheSize in the non-tiered case to < 128M/5).

>
> You may need to change next code if you can allocate only 128MB:
>
> 2547 } else if (ReservedCodeCacheSize > 2*G) {
> 2548 // Code cache size larger than MAXINT is not supported.
> 2549 jio_fprintf(defaultStream::error_stream(),
>
> I think you need to add new platforms specific flag CodeCacheSizeLimit
> and use it instead of our hard-coded 2Gb (maxint).

OK. So what you are suggesting is adding CodeCacheSizeLimit as a product_pd to globals.hpp, then adding a define_pd_global to each of globals_aarch64.hpp, globals_x86.hpp, ....? Or something else?

All the best,
Ed.

From aph at redhat.com Fri Nov 14 10:40:53 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 14 Nov 2014 10:40:53 +0000
Subject: AARCH64: 8064611: Changes to HotSpot shared code
In-Reply-To: <546572B8.9080005@oracle.com>
References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com>
Message-ID: <5465DC35.9030601@redhat.com>

On 14/11/14 03:10, Vladimir Kozlov wrote:
> Is it really 128MB max value for ReservedCodeCacheSize on aarch64?
Well, here's the story. Branches can reach 128M.

One of the core assumptions HotSpot makes (for inline caches and a few other things) is that you can atomically patch a branch or call. Patching multi-word blocks of code on AArch64 is very hard because there is no ordering of memory access between cores and no synchronization between instruction and data caches. And you can only patch nops, branches, and traps: anything else is undefined behaviour.

So, we need to patch running code. If branches are over 128M, we're going to find it hard. The only decent (and architecturally well-defined) way I found was to use a load from the constant pool to supply the destination. And that causes a delay, even when reading from L1 cache. Every call is potentially a far call, and (once you're over 128M) so is every branch from compiled code into the runtime.

(There are several other ways to handle far branches, but they're all pretty unpleasant. For example, it is possible to handle it optimistically: compile short branches and assume every branch will reach, and deoptimize if we get unlucky, but eww.)

I have written code to handle a large code cache and tried various ideas, but I abandoned it. The key insight for me was the realization that the code cache is just that: it's a cache. And IMO it makes more sense to live with a smaller code cache than pessimize everything.

Having said all that, I admit the decision to limit the cache to 128M might be the wrong choice for some workloads, so I am quite happy to revisit this problem at a later date, but I don't think it's critical right now.

> What is default ReservedCodeCacheSize size?

I don't quite understand what you're asking. On AArch64, or other systems? Default is 64M * 5 for C2.

Andrew.
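
As a back-of-the-envelope check on the 128M figure in this thread: the AArch64 B and BL instructions encode a signed 26-bit offset counted in 4-byte instructions, so a direct branch reaches roughly +/-128 MiB. The sketch below works that out at compile time and shows why multiplying a 48M default by 5 for tiered compilation overshoots that range. The constant and function names (`kImmBits`, `reachable`, and so on) are mine, chosen for illustration; none of this is HotSpot code.

```cpp
#include <cstdint>

// AArch64 B/BL: 26-bit signed immediate, counted in 4-byte instructions.
constexpr int64_t kImmBits = 26;
constexpr int64_t kInsnBytes = 4;
constexpr int64_t M = 1024 * 1024;

// Furthest forward and backward byte offsets a direct branch can encode.
constexpr int64_t kMaxForward  = ((int64_t{1} << (kImmBits - 1)) - 1) * kInsnBytes;
constexpr int64_t kMaxBackward = -(int64_t{1} << (kImmBits - 1)) * kInsnBytes;

// True if a direct branch located at byte address 'from' can reach 'to'.
constexpr bool reachable(int64_t from, int64_t to) {
  return (to - from) >= kMaxBackward && (to - from) <= kMaxForward;
}

static_assert(kMaxBackward == -128 * M, "backward reach is 128 MiB");
static_assert(kMaxForward == 128 * M - 4, "forward reach is 128 MiB minus one instruction");
static_assert(5 * 48 * M > 128 * M, "5 * 48M would exceed the direct branch range");
```

If the whole code cache fits within that span, any call site can reach any other code in the cache with a single patchable branch, which is the property the 128M cap is meant to preserve.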
From goetz.lindenmaier at sap.com Fri Nov 14 11:27:23 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 14 Nov 2014 11:27:23 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5465DC35.9030601@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2762E@DEWDFEMB12A.global.corp.sap> Hi, on PPC, we solved this with the trampoline stubs we introduced. Short calls are done directly. If we need a longer call, we jump to the trampoline stub that does the load from the constant pool. So short calls are efficient, but longer ones have an overhead of a short branch. Another advantage for us is that we can schedule the short branch for Power6 well. Drawback is that the trampoline stub is sitting there for every call, wasting code cache and constant pool entries. Best regards, Goetz. -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley Sent: Freitag, 14. November 2014 11:41 To: Vladimir Kozlov; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: Re: AARCH64: 8064611: Changes to HotSpot shared code On 14/11/14 03:10, Vladimir Kozlov wrote: > Is it really 128MB max value for ReservedCodeCacheSize on aarch64? Well, here's the story. Branches can reach 128M. One of the core assumptions HotSpot makes (for inline caches and a few other things) is that you can atomically patch a branch or call. Patching multi-word blocks of code on AArch64 is very hard because there is no ordering of memory access between cores and no synchronization between instruction and data caches. 
And you can only patch nops, branches, and traps: anything else is undefined behaviour.

So, we need to patch running code. If branches are over 128M, we're going to find it hard. The only decent (and architecturally well-defined) way I found was to use a load from the constant pool to supply the destination. And that causes a delay, even when reading from L1 cache. Every call is potentially a far call, and (once you're over 128M) so is every branch from compiled code into the runtime.

(There are several other ways to handle far branches, but they're all pretty unpleasant. For example, it is possible to handle it optimistically: compile short branches and assume every branch will reach, and deoptimize if we get unlucky, but eww.)

I have written code to handle a large code cache and tried various ideas, but I abandoned it. The key insight for me was the realization that the code cache is just that: it's a cache. And IMO it makes more sense to live with a smaller code cache than pessimize everything.

Having said all that, I admit the decision to limit the cache to 128M might be the wrong choice for some workloads, so I am quite happy to revisit this problem at a later date, but I don't think it's critical right now.

> What is default ReservedCodeCacheSize size?

I don't quite understand what you're asking. On AArch64, or other systems? Default is 64M * 5 for C2.

Andrew.
From tobias.hartmann at oracle.com Fri Nov 14 11:41:05 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 14 Nov 2014 12:41:05 +0100 Subject: [9] RFR(XS): 8059596: VM startup fails with 'Invalid code heap sizes' if -XX:ReservedCodeCacheSize is set In-Reply-To: <5465574F.2040401@oracle.com> References: <542D1847.7090107@oracle.com> <5465574F.2040401@oracle.com> Message-ID: <5465EA51.8050802@oracle.com> Hi Vladimir, On 14.11.2014 02:13, Vladimir Kozlov wrote: > Hi Tobias, > > At the line 1149 we set ReservedCodeCacheSize to ERGO so it is not DEFAULT anymore: > > 1149 FLAG_SET_ERGO(uintx, ReservedCodeCacheSize, ReservedCodeCacheSize * 5); > > As result segments sizes are not set there. They will be set to in > CodeCache::initialize_heaps() as (ReservedCodeCacheSize - > NonNMethodCodeHeapSize) / 2: > > CodeHeap 'non-nmethods': size=5700Kb used=2278Kb max_used=2279Kb free=3421Kb > CodeHeap 'profiled nmethods': size=120032Kb used=120Kb max_used=120Kb free=119912Kb > CodeHeap 'non-profiled nmethods': size=120032Kb used=22Kb max_used=22Kb free=120 > > But it is not *5 sizes: > > define_pd_global(intx, NonProfiledCodeHeapSize, 21*M); > define_pd_global(intx, ProfiledCodeHeapSize, 22*M); > > And it skips the assert in arguments.cpp Thanks for pointing that out. As Vladimir I. already suggested, I will move the logic into CodeCache::initialize_heaps() and check consistency there. I filed JDK-8059611 for that. Thanks, Tobias > > Vladimir > > On 10/2/14 2:17 AM, Tobias Hartmann wrote: >> Hi, >> >> please review this small patch. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8059596 >> Webrev: http://cr.openjdk.java.net/~thartmann/8059596/webrev.00/ >> >> Problem: >> The VM startup fails with 'Invalid code heap sizes' if >> -XX:ReservedCodeCacheSize >= 240M is specified. The problem is that in >> Arguments::set_tiered_flags() the code cache size is increased by 5 if >> TieredCompilation is enabled. This should only be done for default values. 
>> >> Solution: >> Add missing FLAG_IS_DEFAULT(ReservedCodeCacheSize) check. >> >> Thanks, >> Tobias From aph at redhat.com Fri Nov 14 11:48:27 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 14 Nov 2014 11:48:27 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF2762E@DEWDFEMB12A.global.corp.sap> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF2762E@DEWDFEMB12A.global.corp.sap> Message-ID: <5465EC0B.80605@redhat.com> On 11/14/2014 11:27 AM, Lindenmaier, Goetz wrote: > on PPC, we solved this with the trampoline stubs we introduced. > > Short calls are done directly. If we need a longer call, we jump to the > trampoline stub that does the load from the constant pool. > So short calls are efficient, but longer ones have an overhead of a short > branch. Another advantage for us is that we can schedule the short > branch for Power6 well. > > Drawback is that the trampoline stub is sitting there for every call, > wasting code cache and constant pool entries. Yes, I considered that too. Andrew. 
From mikael.auno at oracle.com Fri Nov 14 11:58:23 2014 From: mikael.auno at oracle.com (Mikael Auno) Date: Fri, 14 Nov 2014 12:58:23 +0100 Subject: RFR: 8064799: [TESTBUG] JT-Reg Serviceability tests to be run as part of JPRT submit job In-Reply-To: <5465D19C.3040609@oracle.com> References: <5464B874.4060206@oracle.com> <5465B75A.5080609@oracle.com> <6D4C1371-535D-4EA6-BE39-4B08AE9AB039@oracle.com> <5465C317.2060201@oracle.com> <5465CC20.40802@oracle.com> <5465D19C.3040609@oracle.com> Message-ID: <5465EE5F.7060902@oracle.com> On 2014-11-14 10:55, Alan Bateman wrote: > On 14/11/2014 09:32, Mikael Auno wrote: >> : >> Here's an updated webrev with your proposed changes. >> >> http://cr.openjdk.java.net/~miauno/8064799/webrev.01/ >> > This looks to okay, thanks for taking the concern on board. > > -Alan Perfect. Thanks for the reviews, all of you. Mikael From sgehwolf at redhat.com Fri Nov 14 13:18:43 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 14 Nov 2014 14:18:43 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven Message-ID: <1415971123.3278.55.camel@localhost.localdomain> Hi, Could I please get a review and sponsor for the following fix: bug: https://bugs.openjdk.java.net/browse/JDK-8064815 webrev: https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware it throws a StackOverflowError. This is because the stack bound calculation does not account for red and yellow pages. The bug has a slightly different patch attached. The changes to hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws StackOverflowError without this fix and works fine with this fix applied. Note that this problem seems to surface on architectures where pages are large. PPC is one such instance. 
Page size there is 64KB and Zero initially sets its minimal stack allowance to 64KB (one page), src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets potentially increased if min_stack_allowed is small. The case on PPC Zero. However, then later at runtime the calculation of available stack is wrong since it does not account for red and yellow pages. Thus it thinks there is too little stack available where in fact more stack is available. Thanks, Severin From mikael.gerdin at oracle.com Fri Nov 14 13:20:32 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 14 Nov 2014 14:20:32 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546228E3.8030207@oracle.com> References: <546228E3.8030207@oracle.com> Message-ID: <546601A0.6080708@oracle.com> All, I've just realized (with the help of Stefan K.) that my webrev.1 is in fact incorrect. It actually re-introduces the race by using a local variable for the time stamp assert. With that in mind I want to push the contents of webrev.0 (with the indentation fix) instead since that has gone through extensive testing. I plan to push this today and I'm already working on a bunch of cleanups to this code. /Mikael On 2014-11-11 16:18, Mikael Gerdin wrote: > Hi all, > > I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope > of getting some extra feedback from our resident concurrency experts. 
> > Please review this subtle change to the order in which we read fields in > G1OffsetTableContigSpace::saved_mark_word, original included here for > reference: > 1003 > 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { > 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); > 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); > 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) > 1008 return top(); > 1009 else > 1010 return Space::saved_mark_word(); > 1011 } > 1012 > > When getting a new gc alloc region several stores are performed where > store ordering needs to be enforced and several synchronization points > occur. > [write path] > ST(_saved_mark_word) > #StoreStore > ST(_gc_time_stamp) > ST(_top) // satisfying alloc request > #StoreStore > ST(_alloc_region) // publishing to other gc workers > #MonitorEnter > ST(_top) // potential further allocations > #MonitorExit > #MonitorEnter > ST(_top) // potential further allocations > #MonitorExit > > When we inspect a region during remembered set scanning we need to > ensure that we never read memory which have been allocated by a GC > worker thread for the purpose of copying objects into. > The way this works is that a time stamp field is supposed to signal to a > scanning thread that it should look at addresses below _top if the time > stamp is old or addresses below _saved_mark_word if the time stamp is > current. > > The current code does (as seen above) > [read path] > LD(_gc_time_stamp) > LD(_top) > or (depending on time stamp) > LD(_saved_mark_word) > > Because these values are written to without full mutual exclusion we > need to be very careful about the order in which we read these values, > and this is where I argue that the current code is incorrect. > In order to observe a consistent view of the ordered stores in the > [write path] above we need to load the values in the reverse order they > were written, with proper #LoadLoad ordering enforced. 
> > The problem which we've observed here is that after we've read the time > stamp as below the heap time stamp the top pointer can be updated by a > GC worker allocating objects into this region. To make sure that the top > value we see is in fact valid we must read it before we read the time > stamp which determines which value we should return from the > saved_mark_word function. > > My suggested fix is to load _top first and enforce #LoadLoad ordering > enforced: > HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { > G1CollectedHeap* g1h = G1CollectedHeap::heap(); > assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); > HeapWord* local_top = top(); > OrderAccess::loadload(); > if (_gc_time_stamp < g1h->get_gc_time_stamp()) { > return local_top; > } else { > return Space::saved_mark_word(); > } > } > > I've successfully reproduced the crash with the original code by adding > some random sleep calls between the load of the time stamp and the load > of top so I'm fairly certain that this resolves the issue. I've also > verified that the fix I'm proposing does resolve the bug for the team > which encountered the issue, even if I can't reproduce that crash locally. > > I also plan to attempt design around some of the races in this code to > reduce its complexity, but for the sake of backporting the fix to 8u40 > I'd like to start with just adding the minimal fix. 
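
The read ordering argued for above can be modeled outside HotSpot with C++11 atomics. This is only a sketch under stated assumptions: `Region`, `publish` and `scan_limit` are hypothetical names, the fields are reduced to plain integers, and acquire/release operations stand in for the #StoreStore and #LoadLoad barriers. It is not the G1 code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical model of a region: allocation top, scan limit, time stamp.
struct Region {
  std::atomic<uintptr_t> top{0};
  std::atomic<uintptr_t> saved_mark{0};
  std::atomic<unsigned>  time_stamp{0};
};

// Writer side: record the scan limit, then mark the region as current,
// and only then move top to satisfy an allocation.
void publish(Region& r, unsigned heap_stamp, uintptr_t new_top) {
  r.saved_mark.store(r.top.load(std::memory_order_relaxed),
                     std::memory_order_relaxed);
  r.time_stamp.store(heap_stamp, std::memory_order_release);  // ~ #StoreStore, ST(stamp)
  r.top.store(new_top, std::memory_order_release);            // ~ ST(top) after the stamp
}

// Reader side: load top first, then the stamp, i.e. the reverse of the
// order in which the writer stored them.
uintptr_t scan_limit(const Region& r, unsigned heap_stamp) {
  uintptr_t local_top = r.top.load(std::memory_order_acquire);  // ~ LD(top); #LoadLoad
  if (r.time_stamp.load(std::memory_order_acquire) < heap_stamp) {
    // Old stamp: local_top cannot be a value stored after the stamp update,
    // because observing such a top would make the new stamp visible too.
    return local_top;
  }
  return r.saved_mark.load(std::memory_order_relaxed);
}
```

Reading the stamp first and top second, as the original function does, leaves a window in which the stamp is observed as old and top is then observed after a worker has already bumped it; reversing the loads closes that window in this model.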
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 > Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ > Testing: JPRT, local kitchensink (4 hours), gc test suite > > Thanks > /Mikael From stefan.karlsson at oracle.com Fri Nov 14 13:16:01 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 14 Nov 2014 14:16:01 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546601A0.6080708@oracle.com> References: <546228E3.8030207@oracle.com> <546601A0.6080708@oracle.com> Message-ID: <54660091.6060506@oracle.com> On 2014-11-14 14:20, Mikael Gerdin wrote: > All, > > I've just realized (with the help of Stefan K.) that my webrev.1 is in > fact incorrect. It actually re-introduces the race by using a local > variable for the time stamp assert. > > With that in mind I want to push the contents of webrev.0 (with the > indentation fix) instead since that has gone through extensive testing. > > I plan to push this today and I'm already working on a bunch of > cleanups to this code. Looks good. Thanks, StefanK > > /Mikael > > On 2014-11-11 16:18, Mikael Gerdin wrote: >> Hi all, >> >> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope >> of getting some extra feedback from our resident concurrency experts. 
>> >> Please review this subtle change to the order in which we read fields in >> G1OffsetTableContigSpace::saved_mark_word, original included here for >> reference: >> 1003 >> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >> "invariant" ); >> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >> 1008 return top(); >> 1009 else >> 1010 return Space::saved_mark_word(); >> 1011 } >> 1012 >> >> When getting a new gc alloc region several stores are performed where >> store ordering needs to be enforced and several synchronization points >> occur. >> [write path] >> ST(_saved_mark_word) >> #StoreStore >> ST(_gc_time_stamp) >> ST(_top) // satisfying alloc request >> #StoreStore >> ST(_alloc_region) // publishing to other gc workers >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> >> When we inspect a region during remembered set scanning we need to >> ensure that we never read memory which have been allocated by a GC >> worker thread for the purpose of copying objects into. >> The way this works is that a time stamp field is supposed to signal to a >> scanning thread that it should look at addresses below _top if the time >> stamp is old or addresses below _saved_mark_word if the time stamp is >> current. >> >> The current code does (as seen above) >> [read path] >> LD(_gc_time_stamp) >> LD(_top) >> or (depending on time stamp) >> LD(_saved_mark_word) >> >> Because these values are written to without full mutual exclusion we >> need to be very careful about the order in which we read these values, >> and this is where I argue that the current code is incorrect. 
>> In order to observe a consistent view of the ordered stores in the >> [write path] above we need to load the values in the reverse order they >> were written, with proper #LoadLoad ordering enforced. >> >> The problem which we've observed here is that after we've read the time >> stamp as below the heap time stamp the top pointer can be updated by a >> GC worker allocating objects into this region. To make sure that the top >> value we see is in fact valid we must read it before we read the time >> stamp which determines which value we should return from the >> saved_mark_word function. >> >> My suggested fix is to load _top first and enforce #LoadLoad ordering >> enforced: >> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >> HeapWord* local_top = top(); >> OrderAccess::loadload(); >> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >> return local_top; >> } else { >> return Space::saved_mark_word(); >> } >> } >> >> I've successfully reproduced the crash with the original code by adding >> some random sleep calls between the load of the time stamp and the load >> of top so I'm fairly certain that this resolves the issue. I've also >> verified that the fix I'm proposing does resolve the bug for the team >> which encountered the issue, even if I can't reproduce that crash >> locally. >> >> I also plan to attempt design around some of the races in this code to >> reduce its complexity, but for the sake of backporting the fix to 8u40 >> I'd like to start with just adding the minimal fix. 
>> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >> Testing: JPRT, local kitchensink (4 hours), gc test suite >> >> Thanks >> /Mikael From volker.simonis at gmail.com Fri Nov 14 13:46:24 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 14 Nov 2014 14:46:24 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <1415971123.3278.55.camel@localhost.localdomain> References: <1415971123.3278.55.camel@localhost.localdomain> Message-ID: Hi Severin, I can sponsor this change if we get one more review. The only comment I have is that in ZeroStack::suggest_size() there doesn't seem to be a handling for the potentially negative values returned by ZeroStack::abi_stack_available(). Regards, Volker On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: > Hi, > > Could I please get a review and sponsor for the following fix: > > bug: https://bugs.openjdk.java.net/browse/JDK-8064815 > webrev: > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ > > When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware > it throws a StackOverflowError. This is because the stack bound > calculation does not account for red and yellow pages. > > The bug has a slightly different patch attached. The changes to > hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. > > Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws > StackOverflowError without this fix and works fine with this fix > applied. > > Note that this problem seems to surface on architectures where pages are > large. PPC is one such instance. Page size there is 64KB and Zero > initially sets its minimal stack allowance to 64KB (one page), > src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets > potentially increased if min_stack_allowed is small. The case on PPC > Zero. 
> > However, then later at runtime the calculation of available stack is > wrong since it does not account for red and yellow pages. Thus it thinks > there is too little stack available where in fact more stack is > available. > > Thanks, > Severin > From eric.mccorkle at oracle.com Fri Nov 14 13:56:30 2014 From: eric.mccorkle at oracle.com (Eric McCorkle) Date: Fri, 14 Nov 2014 08:56:30 -0500 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54651E1D.2030202@oracle.com> References: <54651CE4.2070003@oracle.com> <54651E1D.2030202@oracle.com> Message-ID: <54660A0E.1070702@oracle.com> Actually, I realized the comparison in the assert is tautologically false. Therefore, I removed the assert altogether. Please re-approve. On 11/13/14 16:09, Coleen Phillimore wrote: > > Looks good. Thanks for running the java/lang/instrument tests. > > Coleen > > On 11/13/14, 4:04 PM, Eric McCorkle wrote: >> Hello, >> >> Please review this simple fix for a JDK test failure that was introduced >> by the change for JDK-8058313. A condition for an assertion was left as >> ">", when it should have been changed to ">=". >> >> Note that this only occurs in artificial test cases; javac does not >> produce classfiles with zero-length MethodParameters attributes. 
>> >> The webrev is here: >> http://cr.openjdk.java.net/~emc/8064571/ >> >> The bug report is here: >> https://bugs.openjdk.java.net/browse/JDK-8064571 > From stefan.karlsson at oracle.com Fri Nov 14 13:53:33 2014 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 14 Nov 2014 14:53:33 +0100 Subject: RFR: 8064580 and 8064581: Move INCLUDE_CDS and INCLUDE_ALL_GCS to the end of the include lists In-Reply-To: <54643184.9070500@oracle.com> References: <546354EB.80606@oracle.com> <54643184.9070500@oracle.com> Message-ID: <5466095D.5090509@oracle.com> Hi David, On 2014-11-13 05:20, David Holmes wrote: > Hi Stefan, > > Please ensure the main-line platforms build okay with PCH disabled. Good suggestion. We actually fail to link on Windows because of a missing include of utilities/macros.hpp in a file that uses INCLUDE_TRACE. :) I've contacted Serviceability and they will fix this and go through all files that use INCLUDE_TRACE and make sure that they include macros.hpp. Thanks, StefanK > > Thanks, > David > > On 12/11/2014 10:39 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please, review the following two cleanup patches to move the conditional >> include lines to the end of the include lists. The patches also add >> missing macros.hpp includes, that are needed when the INCLUDE_* defines >> are used. There are also a few minor cleanups near some usages of >> INCLUDE_ALL_GCS. >> >> http://cr.openjdk.java.net/~stefank/8064580/webrev.01 - Fix >> INCLUDE_CDS >> http://cr.openjdk.java.net/~stefank/8064581/webrev.01 - Fix >> INCLUDE_ALL_GCS >> >> Some background to the sort order, the INCLUDE_* defines and macros.hpp: >> >> The include lines were inserted and sorted in the includeDB removal >> patch. As part of that patch all includes that were guarded by #ifndef >> were put at the end of the include list.
See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f95d63e2154a - >> 6989984: Use standard include model for Hospot >> >> Later the selective inclusion of parts like, for example, CDS and the >> non-serial GCs were changed and now we also rely on the defines present >> in macros.hpp. With that change it's now important that all conditional >> includes are added after the inclusion of macros.hpp. See: >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fb19af007ffc - >> 7189254: Change makefiles for more flexibility to override defaults >> >> thanks, >> StefanK From bill.pittore at oracle.com Fri Nov 14 14:48:37 2014 From: bill.pittore at oracle.com (bill pittore) Date: Fri, 14 Nov 2014 09:48:37 -0500 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <546601A0.6080708@oracle.com> References: <546228E3.8030207@oracle.com> <546601A0.6080708@oracle.com> Message-ID: <54661645.8030803@oracle.com> I don't want to hold up your current push but have a question inline. On 11/14/2014 8:20 AM, Mikael Gerdin wrote: > All, > > I've just realized (with the help of Stefan K.) that my webrev.1 is in > fact incorrect. It actually re-introduces the race by using a local > variable for the time stamp assert. > > With that in mind I want to push the contents of webrev.0 (with the > indentation fix) instead since that has gone through extensive testing. > > I plan to push this today and I'm already working on a bunch of > cleanups to this code. > > /Mikael > > On 2014-11-11 16:18, Mikael Gerdin wrote: >> Hi all, >> >> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope >> of getting some extra feedback from our resident concurrency experts. 
>> >> Please review this subtle change to the order in which we read fields in >> G1OffsetTableContigSpace::saved_mark_word, original included here for >> reference: >> 1003 >> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >> "invariant" ); >> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >> 1008 return top(); >> 1009 else >> 1010 return Space::saved_mark_word(); >> 1011 } >> 1012 >> >> When getting a new gc alloc region several stores are performed where >> store ordering needs to be enforced and several synchronization points >> occur. >> [write path] >> ST(_saved_mark_word) >> #StoreStore >> ST(_gc_time_stamp) >> ST(_top) // satisfying alloc request If at this point in time the read thread running on some other core (on a system with weak memory ordering) reads _top and _gc_time_stamp I'm not convinced that the values will always be what you think. The store of _top could float above the store of _gc_time_stamp. I don't know the gc code in question so I don't know if that's good or bad but I do think it's possible. If you need strict ordering of the two writes then a StoreStore between them will guarantee that. Bertrand or David H. please correct me if I'm wrong. thanks, bill >> #StoreStore >> ST(_alloc_region) // publishing to other gc workers >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> #MonitorEnter >> ST(_top) // potential further allocations >> #MonitorExit >> >> When we inspect a region during remembered set scanning we need to >> ensure that we never read memory which have been allocated by a GC >> worker thread for the purpose of copying objects into. 
>> The way this works is that a time stamp field is supposed to signal to a >> scanning thread that it should look at addresses below _top if the time >> stamp is old or addresses below _saved_mark_word if the time stamp is >> current. >> >> The current code does (as seen above) >> [read path] >> LD(_gc_time_stamp) >> LD(_top) >> or (depending on time stamp) >> LD(_saved_mark_word) >> >> Because these values are written to without full mutual exclusion we >> need to be very careful about the order in which we read these values, >> and this is where I argue that the current code is incorrect. >> In order to observe a consistent view of the ordered stores in the >> [write path] above we need to load the values in the reverse order they >> were written, with proper #LoadLoad ordering enforced. >> >> The problem which we've observed here is that after we've read the time >> stamp as below the heap time stamp the top pointer can be updated by a >> GC worker allocating objects into this region. To make sure that the top >> value we see is in fact valid we must read it before we read the time >> stamp which determines which value we should return from the >> saved_mark_word function. >> >> My suggested fix is to load _top first and enforce #LoadLoad ordering >> enforced: >> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >> HeapWord* local_top = top(); >> OrderAccess::loadload(); >> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >> return local_top; >> } else { >> return Space::saved_mark_word(); >> } >> } >> >> I've successfully reproduced the crash with the original code by adding >> some random sleep calls between the load of the time stamp and the load >> of top so I'm fairly certain that this resolves the issue. 
I've also >> verified that the fix I'm proposing does resolve the bug for the team >> which encountered the issue, even if I can't reproduce that crash >> locally. >> >> I also plan to attempt design around some of the races in this code to >> reduce its complexity, but for the sake of backporting the fix to 8u40 >> I'd like to start with just adding the minimal fix. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >> Testing: JPRT, local kitchensink (4 hours), gc test suite >> >> Thanks >> /Mikael From mikael.gerdin at oracle.com Fri Nov 14 15:32:47 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Fri, 14 Nov 2014 16:32:47 +0100 Subject: RFR: JDK-8058209: Race in G1 card scanning could allow scanning of memory covered by PLABs In-Reply-To: <54661645.8030803@oracle.com> References: <546228E3.8030207@oracle.com> <546601A0.6080708@oracle.com> <54661645.8030803@oracle.com> Message-ID: <5466209F.1050109@oracle.com> Hi Bill, On 2014-11-14 15:48, bill pittore wrote: > I don't want to hold up your current push but have a question inline. Thanks, I'm in a bit of a hurry to get this in time to be able to backport it to 8u40. > > On 11/14/2014 8:20 AM, Mikael Gerdin wrote: >> All, >> >> I've just realized (with the help of Stefan K.) that my webrev.1 is in >> fact incorrect. It actually re-introduces the race by using a local >> variable for the time stamp assert. >> >> With that in mind I want to push the contents of webrev.0 (with the >> indentation fix) instead since that has gone through extensive testing. >> >> I plan to push this today and I'm already working on a bunch of >> cleanups to this code. >> >> /Mikael >> >> On 2014-11-11 16:18, Mikael Gerdin wrote: >>> Hi all, >>> >>> I've sent this to hotspot-dev instead of just hotspot-gc-dev in the hope >>> of getting some extra feedback from our resident concurrency experts. 
>>> >>> Please review this subtle change to the order in which we read fields in >>> G1OffsetTableContigSpace::saved_mark_word, original included here for >>> reference: >>> 1003 >>> 1004 HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>> 1005 G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>> 1006 assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), >>> "invariant" ); >>> 1007 if (_gc_time_stamp < g1h->get_gc_time_stamp()) >>> 1008 return top(); >>> 1009 else >>> 1010 return Space::saved_mark_word(); >>> 1011 } >>> 1012 >>> >>> When getting a new gc alloc region several stores are performed where >>> store ordering needs to be enforced and several synchronization points >>> occur. >>> [write path] >>> ST(_saved_mark_word) >>> #StoreStore >>> ST(_gc_time_stamp) >>> ST(_top) // satisfying alloc request > If at this point in time the read thread running on some other core (on > a system with weak memory ordering) reads _top and _gc_time_stamp I'm > not convinced that the values will always be what you think. The store > of _top could float above the store of _gc_time_stamp. I don't know the > gc code in question so I don't know if that's good or bad but I do think > it's possible. If you need strict ordering of the two writes then a > StoreStore between them will guarantee that. Bertrand or David H. please > correct me if I'm wrong. I think you are correct about this. I chose to only fix this for the case I can verify on x86 to make a minimal fix for the crash in 8u40. 
As I mentioned earlier in the thread I plan to rework this code a bit to reduce the number of races and introduce proper usage of barriers (and/or use store_release/load_acquire as I think they can reduce the overhead slightly) /Mikael > > thanks, > bill >>> #StoreStore >>> ST(_alloc_region) // publishing to other gc workers >>> #MonitorEnter >>> ST(_top) // potential further allocations >>> #MonitorExit >>> #MonitorEnter >>> ST(_top) // potential further allocations >>> #MonitorExit >>> >>> When we inspect a region during remembered set scanning we need to >>> ensure that we never read memory which have been allocated by a GC >>> worker thread for the purpose of copying objects into. >>> The way this works is that a time stamp field is supposed to signal to a >>> scanning thread that it should look at addresses below _top if the time >>> stamp is old or addresses below _saved_mark_word if the time stamp is >>> current. >>> >>> The current code does (as seen above) >>> [read path] >>> LD(_gc_time_stamp) >>> LD(_top) >>> or (depending on time stamp) >>> LD(_saved_mark_word) >>> >>> Because these values are written to without full mutual exclusion we >>> need to be very careful about the order in which we read these values, >>> and this is where I argue that the current code is incorrect. >>> In order to observe a consistent view of the ordered stores in the >>> [write path] above we need to load the values in the reverse order they >>> were written, with proper #LoadLoad ordering enforced. >>> >>> The problem which we've observed here is that after we've read the time >>> stamp as below the heap time stamp the top pointer can be updated by a >>> GC worker allocating objects into this region. To make sure that the top >>> value we see is in fact valid we must read it before we read the time >>> stamp which determines which value we should return from the >>> saved_mark_word function. 
>>> >>> My suggested fix is to load _top first and enforce #LoadLoad ordering >>> enforced: >>> HeapWord* G1OffsetTableContigSpace::saved_mark_word() const { >>> G1CollectedHeap* g1h = G1CollectedHeap::heap(); >>> assert( _gc_time_stamp <= g1h->get_gc_time_stamp(), "invariant" ); >>> HeapWord* local_top = top(); >>> OrderAccess::loadload(); >>> if (_gc_time_stamp < g1h->get_gc_time_stamp()) { >>> return local_top; >>> } else { >>> return Space::saved_mark_word(); >>> } >>> } >>> >>> I've successfully reproduced the crash with the original code by adding >>> some random sleep calls between the load of the time stamp and the load >>> of top so I'm fairly certain that this resolves the issue. I've also >>> verified that the fix I'm proposing does resolve the bug for the team >>> which encountered the issue, even if I can't reproduce that crash >>> locally. >>> >>> I also plan to attempt design around some of the races in this code to >>> reduce its complexity, but for the sake of backporting the fix to 8u40 >>> I'd like to start with just adding the minimal fix. >>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8058209 >>> Webrev: http://cr.openjdk.java.net/~mgerdin/8058209/webrev.0/ >>> Testing: JPRT, local kitchensink (4 hours), gc test suite >>> >>> Thanks >>> /Mikael > From coleen.phillimore at oracle.com Fri Nov 14 16:04:23 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 14 Nov 2014 11:04:23 -0500 Subject: RFR:8060449:Proper error messages for newly obsolete command line flags. In-Reply-To: <54654428.8040002@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <5465395A.4080209@oracle.com> <54654226.9070309@oracle.com> <54654428.8040002@oracle.com> Message-ID: <54662807.5010902@oracle.com> Yes, this looks good with the changes that Lois and Dan suggested for formatting. Thanks! Coleen On 11/13/14, 6:52 PM, Daniel D. 
Daugherty wrote: > Max, > > I'm good with this version also. > > src/share/vm/runtime/arguments.cpp > No content comments; same formatting comments as Lois. > > Background: It is generally faster to do length checks before > string comparisons. However, in this case, speed is not an issue. > > test/runtime/CommandLine/ObsoleteFlagErrorMessage.java > lines 39-41: you should have a space after '//' and before your > comment begins for readability. > > I don't need to see another code review if you choose to fix any > of the formatting issues. > > Dan > > > On 11/13/14 4:43 PM, Lois Foltan wrote: >> Hi Max, >> >> This looks good! Three really minor coding style comments included >> for completeness but I don't need to see another code review if you >> choose to fix these. >> >> src/share/vm/runtime/arguments.cpp >> - line # 336, usually the { would be placed on line 335 at the end >> of the if statement's conditional expression >> - line # 945, need a blank space between the "){" >> - line #952 the closing } is not lined up with the if keyword >> >> Again, these are minor. >> Lois >> >> >> On 11/13/2014 6:06 PM, Max Ockner wrote: >>> Correction - new webrev is at >>> http://cr.openjdk.java.net/~coleenp/8060449.1/ >>> >>> Max Ockner >>> >>> >>> On 11/12/2014 3:38 PM, Daniel D. Daugherty wrote: >>>> On 11/12/14 1:04 PM, Max Ockner wrote: >>>>> Dan, >>>>> I have reformatted the "){" fragment on line 336 as you >>>>> recommended. Thanks for catching that. >>>> >>>> Thanks. >>>> >>>> >>>>> For your second recommendation, I think I have a use case where >>>>> the recommended code would not function properly: >>>>> >>>>> Let's say there is a boolean flag SomeFlag, and let's say that the >>>>> user tries to type "-XX:SomeFlagg". >>>>> >>>>> The first if statement passes because strlen("SomeFlagg") = >>>>> strlen("SomeFlag")+1. >>>>> The second conditional checks if (strncmp(flag_status.name, s, >>>>> f_len) == 0). But f_len, the length of "SomeFlag" is 8. 
The >>>>> result is that the 9th character of the user's input, which is >>>>> where s differs from flag_status.name, is not checked,so this >>>>> condition is passed as well. >>>> >>>> Your use case catches a bug in what I posted. I had originally >>>> planned to change the two strncmp() calls to strcmp() so that >>>> we get a complete match, but then I couldn't remember if a >>>> straight strcmp() triggers Parfait warnings so I couldn't >>>> finish reasoning my way through that maze... >>>> >>>> Switching the 'f_len' parameter to 's_len' would solve the >>>> problem without triggering Parfait, but it is totally your >>>> call. >>>> >>>> Dan >>>> >>>>> >>>>> Thanks, >>>>> Max >>>>> >>>>> >>>>> >>>>> On 11/12/2014 1:41 PM, Daniel D. Daugherty wrote: >>>>>> On 11/7/14 12:13 PM, Max Ockner wrote: >>>>>>> ID: 8060449 >>>>>>> webrev: http://cr.openjdk.java.net/~coleenp/8060449/ >>>>>> >>>>>> src/share/vm/runtime/arguments.cpp >>>>>> >>>>>> line 336: ) { >>>>>> This fragment is on a line by itself and far left. >>>>>> Minimally, it should align like this: >>>>>> >>>>>> line 331: if (... >>>>>> line 336: ) { >>>>>> >>>>>> However, I recommend a slightly different structure to >>>>>> this logic: >>>>>> >>>>>> size_t f_len = strlen(flag_status.name); >>>>>> size_t s_len = strlen(s); >>>>>> if (f_len == s_len || (f_len + 1) == s_len) { >>>>>> // this flag is the right length for a possible match >>>>>> if (strncmp(flag_status.name, s, f_len) == 0) || >>>>>> ((s[0] == '+' || s[0] == '-') && >>>>>> strncmp(flag_status.name, &s[1], f_len) == 0)) { >>>>>> // this flag is an exact match >>>>>> if >>>>>> (JDK_Version::current().compare(flag_status.accept_until) == -1) { >>>>>> ... >>>>>> } >>>>>> } >>>>>> } >>>>>> i++; >>>>>> >>>>>> I have no idea if the above formatting is going to be >>>>>> preserved by e-mail clients... 
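One way to close the hole Max describes (a trailing character such as the extra "g" in "SomeFlagg" going unchecked) is to strip the optional '+' or '-' first and then require equal lengths before the bounded compare. This is a minimal sketch only, not the code in the webrev; `matches_obsolete_flag` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: true only if option `s`, optionally prefixed by '+' or '-',
// names exactly `flag_name`.
inline bool matches_obsolete_flag(const char* flag_name, const char* s) {
    if (s[0] == '+' || s[0] == '-') {
        ++s;  // skip the boolean prefix so both strings start at the name
    }
    std::size_t f_len = std::strlen(flag_name);
    std::size_t s_len = std::strlen(s);
    // Equal lengths plus a bounded compare means every character of both
    // strings is checked, so "SomeFlagg" no longer matches "SomeFlag".
    return f_len == s_len && std::strncmp(flag_name, s, f_len) == 0;
}
```

With this shape, "SomeFlag", "+SomeFlag" and "-SomeFlag" match the table entry "SomeFlag", while "SomeFlagg" and "SomeFla" do not.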
>>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Summary: A "newly obsolete" command line option is one which is >>>>>>> no longer supported, but still is acknowledged. There is a list >>>>>>> of these in arguments.cpp. >>>>>>> It used to be that only a fixed number of characters were >>>>>>> checked when comparing a given command line option to the list >>>>>>> of obsolete flags (strncmp was used, where the number of >>>>>>> characters to check is equal to the length of the flag name from >>>>>>> the table.) >>>>>>> As a result, an arbitrary string appended to the end of an >>>>>>> obsolete argument goes unnoticed. >>>>>>> This issue is fixed by comparing the lengths of the given flag >>>>>>> and the flags from the obsolete flags table. >>>>>>> When a misspelled flag is fuzzy-matched to an obsolete flag, an >>>>>>> appropriate warning is given to save the user a few key strokes: >>>>>>> (1) unrecognized option [bad option]. (2) Did you mean [option]? >>>>>>> (3) [option] is obsolete as of [version]) >>>>>>> >>>>>>> A new test for this feature checks for the presence of all three >>>>>>> components of the above error message. >>>>>>> >>>>>>> Tested with: vm.quick.testlist >>>>>>> hotspot jtreg tests >>>>>>> jprt >>>>>>> >>>>>>> Thanks for your help! 
>>>>>>> Max Ockner >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From coleen.phillimore at oracle.com Fri Nov 14 16:10:13 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 14 Nov 2014 11:10:13 -0500 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> <54641F95.5030201@oracle.com> Message-ID: <54662965.30900@oracle.com> This looks good and I can sponsor this. There is one typo: + * respectivly, with the following differences: should be 'respectively'. The new comment is much clearer. thanks, Coleen On 11/14/14, 4:59 AM, Thomas Stüfe wrote: > Hi David, > > thanks for looking. See here the corrected webrev: > > http://cr.openjdk.java.net/~simonis/webrevs/8064779/ > > for this new bug report: > > https://bugs.openjdk.java.net/browse/JDK-8064779 > Best Regards, Thomas > > > On Thu, Nov 13, 2014 at 4:03 AM, David Holmes > wrote: > >> Hi Thomas, >> >> On 12/11/2014 8:31 PM, Thomas Stüfe wrote: >> >>> Hi, >>> >>> could you please review this little addition? (added comments for >>> jio_snprintf) >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/8062370/ >>> >> A new bug is needed for these changes. >> >> As people rarely look at the header file when reading the code could you >> augment the last line of the comment in jvm.cpp from: >> >> + // return always -1. >> >> to >> >> + // always return -1, and perform null termination.
>> >> Thanks, >> David >> >> From coleen.phillimore at oracle.com Fri Nov 14 16:15:01 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 14 Nov 2014 11:15:01 -0500 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54660A0E.1070702@oracle.com> References: <54651CE4.2070003@oracle.com> <54651E1D.2030202@oracle.com> <54660A0E.1070702@oracle.com> Message-ID: <54662A85.2010000@oracle.com> Okay, yes this is fine. We need Jiangli to review it too. Coleen On 11/14/14, 8:56 AM, Eric McCorkle wrote: > Actually, I realized the comparison in the assert is tautologically > false. Therefore, I removed the assert altogether. Please re-approve. > > On 11/13/14 16:09, Coleen Phillimore wrote: >> Looks good. Thanks for running the java/lang/instrument tests. >> >> Coleen >> >> On 11/13/14, 4:04 PM, Eric McCorkle wrote: >>> Hello, >>> >>> Please review this simple fix for a JDK test failure that was introduced >>> by the change for JDK-8058313. A condition for an assertion was left as >>> ">", when it should have been changed to ">=". >>> >>> Note that this only occurs in artificial test cases; javac does not >>> produce classfiles with zero-length MethodParameters attributes. 
>>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~emc/8064571/ >>> >>> The bug report is here: >>> https://bugs.openjdk.java.net/browse/JDK-8064571 From aph at redhat.com Fri Nov 14 16:40:43 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 14 Nov 2014 16:40:43 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546572B8.9080005@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> Message-ID: <5466308B.2080303@redhat.com> TVM for the review. I'm working my way through all the comments. On 11/14/2014 03:10 AM, Vladimir Kozlov wrote: > I think you need to add new platforms specific flag CodeCacheSizeLimit > and use it instead of our hard-coded 2Gb (maxint). I'm trying to understand the right way to do this. Surely it must be possible to define a flag and set its default value in one place and override it in the platform files, but I can't find any example which does this. Do we really have to change all platforms? Andrew. From vladimir.kozlov at oracle.com Fri Nov 14 17:06:18 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 09:06:18 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5465DC35.9030601@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> Message-ID: <5466368A.1030408@oracle.com> Thank you Edward and Andrew for explanation. 
So defaults are next: aarch64(C1) ReservedCodeCacheSize = 32*M aarch64(C2) ReservedCodeCacheSize = 48*M I understand that *5 will exceed 128MB limit. My concern here is that our experience shows that TieredCompilation will eat 200MB very easy. So with 128MB you may hit performance issues during long runs (C1 does not stop compiling) because there will be not enough space for code and we will start evicting old compiled code or stop compiling. On 11/14/14 2:34 AM, Edward Nevill wrote: >> >You may need to change next code if you can allocate only 128MB: >> > >> >2547 } else if (ReservedCodeCacheSize > 2*G) { >> >2548 // Code cache size larger than MAXINT is not supported. >> >2549 jio_fprintf(defaultStream::error_stream(), >> > >> >I think you need to add new platforms specific flag CodeCacheSizeLimit >> >and use it instead of our hard-coded 2Gb (maxint). > OK. So what you are suggesting is adding CodeCacheSizeLimit as a product_pd to globals.hpp, then adding a define_pd_global to each of globals_aarch64.hpp, globals_x86.hpp, ....? Or something else? Yes, I thought about product_pd() and define_pd_global(). But then some crazy (security) guys may start playing with it and file P1 bugs. Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT NOT_AARCH64(2*G) AARCH64_ONLY(138*M)) in this place so you can use it in these 2 checks? > > All the best, > Ed. Thanks, Vladimir On 11/14/14 2:40 AM, Andrew Haley wrote: > On 14/11/14 03:10, Vladimir Kozlov wrote: >> Is it really 128MB max value for ReservedCodeCacheSize on aarch64? > > Well, here's the story. Branches can reach 128M. One of the core > assumptions HotSpot makes (for inline caches and a few other things) > is that you can atomically patch a branch or call. Patching > multi-word blocks of code on AArch64 is very hard because there is no > ordering of memory access between cores and no synchronization between > instruction and data caches.
And you can only patch nops, branches, > and traps: anything else is undefined behaviour. > > So, we need to patch running code. If branches are over 128M, we're > going to find it hard. The only decent (and architecturally > well-defined) way I found was to use a load from the constant pool to > supply the destination. And that causes a delay, even when reading > from L1 cache. Every call is potentially a far call, and (once you're > over 128M) so is every branch from compiled code into the runtime. > (There are several other ways to handle far branches, but they're all > pretty unpleasant. For example, it is possible to handle it > optimistically: compile short branches and assume every branch will > reach, and deoptimize if we get unlucky, but eww.) > > I have written code to handle a large code cache and tried various > ideas, but I abandoned it. The key insight for me was the realization > that the code cache is just that: it's a cache. And IMO it makes more > sense to live with a smaller code cache than pessimize everything. > > Having said all that, I admit the decision to limit the cache to 128M > might be the wrong choice for some workloads, so I am quite happy to > revisit this problem at a later date, but I don't think it's critical > right now. > >> What is default ReservedCodeCacheSize size? > > I don't quite understand what you're asking. On AArch64, or other > systems? Default is 64M * 5 for C2. > > Andrew. > From jiangli.zhou at oracle.com Fri Nov 14 17:16:10 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 14 Nov 2014 09:16:10 -0800 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54662A85.2010000@oracle.com> References: <54651CE4.2070003@oracle.com> <54651E1D.2030202@oracle.com> <54660A0E.1070702@oracle.com> <54662A85.2010000@oracle.com> Message-ID: <546638DA.3040909@oracle.com> Hi Eric, Looks okay to me also.
Thanks, Jiangli On 11/14/2014 08:15 AM, Coleen Phillimore wrote: > > Okay, yes this is fine. We need Jiangli to review it too. > Coleen > > On 11/14/14, 8:56 AM, Eric McCorkle wrote: >> Actually, I realized the comparison in the assert is tautologically >> false. Therefore, I removed the assert altogether. Please re-approve. >> >> On 11/13/14 16:09, Coleen Phillimore wrote: >>> Looks good. Thanks for running the java/lang/instrument tests. >>> >>> Coleen >>> >>> On 11/13/14, 4:04 PM, Eric McCorkle wrote: >>>> Hello, >>>> >>>> Please review this simple fix for a JDK test failure that was >>>> introduced >>>> by the change for JDK-8058313. A condition for an assertion was >>>> left as >>>> ">", when it should have been changed to ">=". >>>> >>>> Note that this only occurs in artificial test cases; javac does not >>>> produce classfiles with zero-length MethodParameters attributes. >>>> >>>> The webrev is here: >>>> http://cr.openjdk.java.net/~emc/8064571/ >>>> >>>> The bug report is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8064571 > From aph at redhat.com Fri Nov 14 17:17:00 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 14 Nov 2014 17:17:00 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5466368A.1030408@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <5466368A.1030408@oracle.com> Message-ID: <5466390C.2040305@redhat.com> On 11/14/2014 05:06 PM, Vladimir Kozlov wrote: > Thank you Edward and Andrew for explanation. > > So defaults are next: > > aarch64(C1) ReservedCodeCacheSize = 32*M > aarch64(C2) ReservedCodeCacheSize = 48*M > > I understand that *5 will exceed 128MB limit. 
My concern here is > that our experience shows that TieredCompilation will eat 200MB very > easy. So with 128MB you may hit performance issues during long runs > (C1 does not stop compiling) because there will be not enough space > for code and we will start evicting old compiled code or stop > compiling. Perhaps there is something I am not understanding: isn't evicting old compiled code really a good thing to do, thus avoiding many indirections? I suppose it's a matter of whether any application has a working set size of more than 128Mb for the code cache, although I can see that fragmentation might be an issue. Anyway, as I indicated, I'm quite willing to revisit this at a later date, if that's okay with you. > Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT > NOT_AARCH64(2*G) ARCH64_ONLY(138*M)) in this place so you can use it > in these 2 checks? Will do. Thanks, Andrew. From aph at redhat.com Fri Nov 14 17:19:03 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 14 Nov 2014 17:19:03 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5466390C.2040305@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <5466368A.1030408@oracle.com> <5466390C.2040305@redhat.com> Message-ID: <54663987.3040506@redhat.com> On 11/14/2014 05:17 PM, Andrew Haley wrote: > Perhaps there is something I am not understanding: isn't evicting old > compiled code really a good thing to do, thus avoiding many > indirections? I suppose it's a matter of whether any application has > a working set size of more than 128Mb for the code cache, although I > can see that fragmentation might be an issue. 
Sorry, I see what you mean now: C1-compiled bloated code will displace really good-quality C2-compiled code. A light went on in my brain just after I hit "send"... Andrew. From vladimir.kozlov at oracle.com Fri Nov 14 17:25:49 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 09:25:49 -0800 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: References: <1415971123.3278.55.camel@localhost.localdomain> Message-ID: <54663B1D.6060408@oracle.com> So this code assumes that 'stack_used' local is allocated on stack and uses its address to calculated used space. But where is a guarantee that passed 'thread' is the current thread? Thanks, Vladimir On 11/14/14 5:46 AM, Volker Simonis wrote: > Hi Severin, > > I can sponsor this change if we get one more review. > > The only comment I have is that in ZeroStack::suggest_size() there > doesn't seem to be a handling for the potentially negative values > returned by ZeroStack::abi_stack_available(). > > Regards, > Volker > > > On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: >> Hi, >> >> Could I please get a review and sponsor for the following fix: >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8064815 >> webrev: >> https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ >> >> When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware >> it throws a StackOverflowError. This is because the stack bound >> calculation does not account for red and yellow pages. >> >> The bug has a slightly different patch attached. The changes to >> hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. >> >> Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws >> StackOverflowError without this fix and works fine with this fix >> applied. >> >> Note that this problem seems to surface on architectures where pages are >> large. PPC is one such instance. 
Page size there is 64KB and Zero >> initially sets its minimal stack allowance to 64KB (one page), >> src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets >> potentially increased if min_stack_allowed is small. The case on PPC >> Zero. >> >> However, then later at runtime the calculation of available stack is >> wrong since it does not account for red and yellow pages. Thus it thinks >> there is too little stack available where in fact more stack is >> available. >> >> Thanks, >> Severin >> From vladimir.kozlov at oracle.com Fri Nov 14 17:32:13 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 09:32:13 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5466390C.2040305@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <5466368A.1030408@oracle.com> <5466390C.2040305@redhat.com> Message-ID: <54663C9D.9050200@oracle.com> On 11/14/14 9:17 AM, Andrew Haley wrote: > On 11/14/2014 05:06 PM, Vladimir Kozlov wrote: >> Thank you Edward and Andrew for explanation. >> >> So defaults are next: >> >> aarch64(C1) ReservedCodeCacheSize = 32*M >> aarch64(C2) ReservedCodeCacheSize = 48*M >> >> I understand that *5 will exceed 128MB limit. My concern here is >> that our experience shows that TieredCompilation will eat 200MB very >> easy. So with 128MB you may hit performance issues during long runs >> (C1 does not stop compiling) because there will be not enough space >> for code and we will start evicting old compiled code or stop >> compiling. > > Perhaps there is something I am not understanding: isn't evicting old > compiled code really a good thing to do, thus avoiding many > indirections? 
I suppose it's a matter of whether any application has > a working set size of more than 128Mb for the code cache, although I > can see that fragmentation might be an issue. > > Anyway, as I indicated, I'm quite willing to revisit this at a later > date, if that's okay with you. Yes, I am perfectly fine with that. > >> Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT >> NOT_AARCH64(2*G) ARCH64_ONLY(138*M)) in this place so you can use it >> in these 2 checks? > > Will do. Thanks, Vladimir > > Thanks, > Andrew. > From dean.long at oracle.com Fri Nov 14 19:14:04 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 14 Nov 2014 11:14:04 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5466368A.1030408@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <5466368A.1030408@oracle.com> Message-ID: <5466547C.4010803@oracle.com> Is there a way to do this so it doesn't limit our closed aarch64 port? Perhaps #ifndef CODE_CACHE_SIZE_LIMIT #define CODE_CACHE_SIZE_LIMIT 2*G #endif which would allow a port to override it by defining CODE_CACHE_SIZE_LIMIT in a file such as globalDefinitions_.hpp. dl On 11/14/2014 9:06 AM, Vladimir Kozlov wrote: > Yes, I thought about product_pd() and define_pd_global(). But then > some crazy (security) guys may start playing with it and file P1 bugs. > Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT > NOT_AARCH64(2*G) ARCH64_ONLY(138*M)) in this place so you can use it > in these 2 checks? 
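For readers following the thread, the override pattern Dean suggests above can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from any webrev: the K/M/G helpers, the parenthesized macro body, and reserved_code_cache_size_ok() are hypothetical names introduced only for this example, and the commented-out 128*M override merely mirrors the AArch64 branch-range limit discussed earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Size helpers in the spirit of HotSpot's K/M/G constants
// (defined locally here; an assumption for this sketch).
const size_t K = 1024;
const size_t M = K * K;
const size_t G = M * K;

// A port that needs a smaller limit would define the macro in a platform
// header that is included before this point, for example (hypothetical):
//   #define CODE_CACHE_SIZE_LIMIT (128 * M)

// Shared code supplies the default only if no port has defined the limit.
#ifndef CODE_CACHE_SIZE_LIMIT
#define CODE_CACHE_SIZE_LIMIT (2 * G)
#endif

// Hypothetical helper standing in for the two checks mentioned in the
// thread; it only compares a requested size against the limit.
bool reserved_code_cache_size_ok(size_t requested) {
  assert(requested != 0);  // a zero-sized request is not meaningful here
  return requested <= CODE_CACHE_SIZE_LIMIT;
}
```

With no override in effect the helper accepts 240*M and rejects 3*G. The parentheses around the macro body are a defensive choice in this sketch, so that the macro expands safely inside larger expressions.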
From vladimir.kozlov at oracle.com Fri Nov 14 19:45:49 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 11:45:49 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <5466547C.4010803@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <5465DC35.9030601@redhat.com> <5466368A.1030408@oracle.com> <5466547C.4010803@oracle.com> Message-ID: <54665BED.9050607@oracle.com> Yes, it is very good suggestion. And it will allow to avoid a platform check in arguments.cpp. Andrew, please, use this approach. Thanks, Vladimir On 11/14/14 11:14 AM, Dean Long wrote: > Is there a way to do this so it doesn't limit our closed aarch64 port? > Perhaps > > #ifndef CODE_CACHE_SIZE_LIMIT > #define CODE_CACHE_SIZE_LIMIT 2*G > #endif > > which would allow a port to override it by defining > CODE_CACHE_SIZE_LIMIT in > a file such as globalDefinitions_.hpp. > > dl > > On 11/14/2014 9:06 AM, Vladimir Kozlov wrote: >> Yes, I thought about product_pd() and define_pd_global(). But then >> some crazy (security) guys may start playing with it and file P1 bugs. >> Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT >> NOT_AARCH64(2*G) ARCH64_ONLY(138*M)) in this place so you can use it >> in these 2 checks? 
> From dean.long at oracle.com Fri Nov 14 20:44:13 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 14 Nov 2014 12:44:13 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> Message-ID: <5466699D.2050808@oracle.com> On 11/14/2014 12:39 AM, Magnus Ihse Bursie wrote: >> 13 nov 2014 kl. 19:33 skrev Christian Thalinger : >> >> >>> On Nov 13, 2014, at 6:09 AM, Magnus Ihse Bursie wrote: >>> >>>> On 2014-11-10 11:32, Volker Simonis wrote: >>>> On Mon, Nov 10, 2014 at 10:42 AM, Erik Joelsson >>>> wrote: >>>>> On 2014-11-10 10:27, Volker Simonis wrote: >>>>>> On Mon, Nov 10, 2014 at 9:08 AM, Erik Joelsson >>>>>> wrote: >>>>>>> Hello, >>>>>>> >>>>>>> I would certainly like to have these files updated, but unfortunately the >>>>>>> license on these files changed from GPL2 to GPL3. This essentially means >>>>>>> that the switch is non trivial from a legal perspective and the >>>>>>> impression >>>>>>> I've received when I last inquired about updating these files was that >>>>>>> it's >>>>>>> unlikely to ever happen unless a very strong case can be presented for >>>>>>> why >>>>>>> it's needed. >>>>>>> >>>>>>> So the reason we have the over engineered solution for config.guess is >>>>>>> simply that it's much easier than getting legal approval for updating >>>>>>> these >>>>>>> files. >>>>>> OK, but in that case I don't see any reason for keeping this >>>>>> "over-engineered" solution at all. If there will not be any pulls from >>>>>> upstream anyway then there's no reason for keeping these file >>>>>> untouched. 
I'd propose then to just remove the wrappers and do all the >>>>>> chenges right in the corresponding files (of course that's not the >>>>>> topic of this change but should be done separately). >>>>> And again, the reason we didn't change the existing file but instead wrapped >>>>> it, was that we don't have explicit legal approval for doing derivative work >>>>> for these 3rd party files. Maybe it's ok, maybe it's not, I will not be the >>>>> person saying it is ok. >>>> OK, now I got it. I thought we just use the wrappers because we want >>>> to easily integrate the upstream versions. But instead it is only >>>> because we don't want to edit these files because of legal >>>> uncertainties. >>>> >>>> So in that case that means we're also not allowed to edit 'config.sub' >>>> and have to create a wrapper for it, right? >>> Yes, you are correct. We cannot modify these files. >>> >>> As far as I understand, the legal reason for including these files are the explicit exception: >>> >>> # As a special exception to the GNU General Public License, if you >>> # distribute this file as part of a program that contains a >>> # configuration script generated by Autoconf, you may include it under >>> # the same distribution terms that you use for the rest of that program. >>> >>> But this is just a distribution license, not a modification license. >>> >>> From my IANAL point of view, this exception should be enough to disregard if the file is also distributed under GPL2 or GPL3. Unfortunately, as Erik says, our lawyers are apprehensive of GLP3. So while we thought that we could be able to periodically sync these files with upstream (and remove our external "patches" after a while), we have not been able to do so. >> Why do we have these files in our repository in the first place? > Because they are needed by the configure script. 
They are a sort of runtime libraries for configure, but since they are written as shell scripts, the source code form and the executable form is the same. > > The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. What if we require Autoconf to be installed on the host? Does that solve any problems? dl > /Magnus > >>> So, this fix will need to do the same dance with config.sub as for guess.guess. Unfortunately. :( >>> >>> /Magnus From serguei.spitsyn at oracle.com Sat Nov 15 00:56:04 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 14 Nov 2014 16:56:04 -0800 Subject: Review request for JDK-8064571: java/lang/instrument/IsModifiableClassAgent.java: assert(length > 0) failed: should only be called if table is present In-Reply-To: <54660A0E.1070702@oracle.com> References: <54651CE4.2070003@oracle.com> <54651E1D.2030202@oracle.com> <54660A0E.1070702@oracle.com> Message-ID: <5466A4A4.2040008@oracle.com> Good. Thanks, Serguei On 11/14/14 5:56 AM, Eric McCorkle wrote: > Actually, I realized the comparison in the assert is tautologically > false. Therefore, I removed the assert altogether. Please re-approve. > > On 11/13/14 16:09, Coleen Phillimore wrote: >> Looks good. Thanks for running the java/lang/instrument tests. >> >> Coleen >> >> On 11/13/14, 4:04 PM, Eric McCorkle wrote: >>> Hello, >>> >>> Please review this simple fix for a JDK test failure that was introduced >>> by the change for JDK-8058313. A condition for an assertion was left as >>> ">", when it should have been changed to ">=". >>> >>> Note that this only occurs in artificial test cases; javac does not >>> produce classfiles with zero-length MethodParameters attributes. 
>>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~emc/8064571/ >>> >>> The bug report is here: >>> https://bugs.openjdk.java.net/browse/JDK-8064571 From vladimir.kozlov at oracle.com Sat Nov 15 01:03:18 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 17:03:18 -0800 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> Message-ID: <5466A656.40707@oracle.com> Hi Goetz, It is very significant rewriting and it takes time to evaluate it. And I would not say it is simpler then before :) These is what I found so far. The idea to try to allocate in a range instead of just below UnscaledOopHeapMax or OopEncodingHeapMax is good. So I would ask to do several attempts (3?) on non_PPC64 platforms too. It is matter of preference but I am not comfortable with switch in loop. For me sequential 'if (addr == 0)' checks is simpler. One thing worries me that you release found space and try to get it again with ReservedHeapSpace. Is it possible to add new ReservedHeapSpace ctor which simple use already allocated space? The next code in ReservedHeapSpace() is hard to understand (): (UseCompressedOops && (requested_address == NULL || requested_address+size > (char*)OopEncodingHeapMax) ? may be move all this into noaccess_prefix_size() and add comments. Why you need prefix when requested_address == NULL? Remove next comment in universe.cpp: // SAPJVM GL 2014-09-22 Again you will release space so why bother to include space for classes?: + // For small heaps, save some space for compressed class pointer + // space so it can be decoded with no base. virtualspace.cpp With new code size+noaccess_prefix could be requested. 
But later it is not used if WIN64_ONLY(&& UseLargePages) and you will have empty non-protected page below heap. matcher.hpp Universe::narrow_oop_use_implicit_null_checks() should be true for such case too. So you can add new condition with || to existing ones. The only condition you relax is base != NULL. Right? arguments.* files Why you need PropertyList_add changes. Do you have platform specific changes? Thanks, Vladimir On 11/10/14 6:57 AM, Lindenmaier, Goetz wrote: > Hi, > > I need to improve a row of things around compressed oops heap handling > to achieve good performance on ppc. > I prepared a first webrev for review: > http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ > > A detailed technical description of the change is in the webrev and according bug. > > If requested, I will split the change into parts with more respective less impact on > non-ppc platforms. > > The change is derived from well-tested code in our VM. Originally it was > crafted to require the least changes of VM coding, I changed it to be better > streamlined with the VM. > I tested this change to deliver heaps at about the same addresses as before. > Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap > in a better compressed oops mode is found, though. > I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. > > Best regards, > Goetz. > > From vladimir.kozlov at oracle.com Sat Nov 15 01:55:07 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 14 Nov 2014 17:55:07 -0800 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <5466A656.40707@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> <5466A656.40707@oracle.com> Message-ID: <5466B27B.2020903@oracle.com> One more thing. 
You should allow an allocation in the range when returned from OS allocated address does not match requested address. We had such cases on OSX, for example, when OS allocates at different address but still inside range. Regards, Vladimir On 11/14/14 5:03 PM, Vladimir Kozlov wrote: > Hi Goetz, > > It is very significant rewriting and it takes time to evaluate it. > And I would not say it is simpler then before :) > These is what I found so far. > > The idea to try to allocate in a range instead of just below UnscaledOopHeapMax or OopEncodingHeapMax is good. So I > would ask to do several attempts (3?) on non_PPC64 platforms too. > > It is matter of preference but I am not comfortable with switch in loop. For me sequential 'if (addr == 0)' checks is > simpler. > > One thing worries me that you release found space and try to get it again with ReservedHeapSpace. Is it possible to add > new ReservedHeapSpace ctor which simple use already allocated space? > > The next code in ReservedHeapSpace() is hard to understand (): > > (UseCompressedOops && (requested_address == NULL || requested_address+size > (char*)OopEncodingHeapMax) ? > > may be move all this into noaccess_prefix_size() and add comments. > Why you need prefix when requested_address == NULL? > > Remove next comment in universe.cpp: > > // SAPJVM GL 2014-09-22 > > > Again you will release space so why bother to include space for classes?: > > + // For small heaps, save some space for compressed class pointer > + // space so it can be decoded with no base. > > virtualspace.cpp > > With new code size+noaccess_prefix could be requested. But later it is not used if WIN64_ONLY(&& UseLargePages) and you > will have empty non-protected page below heap. > > matcher.hpp > > Universe::narrow_oop_use_implicit_null_checks() should be true for such case too. So you can add new condition with || > to existing ones. The only condition you relax is base != NULL. Right? 
> > arguments.* files > > Why you need PropertyList_add changes. > > Do you have platform specific changes? > > Thanks, > Vladimir > > On 11/10/14 6:57 AM, Lindenmaier, Goetz wrote: >> Hi, >> >> I need to improve a row of things around compressed oops heap handling >> to achieve good performance on ppc. >> I prepared a first webrev for review: >> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ >> >> A detailed technical description of the change is in the webrev and according bug. >> >> If requested, I will split the change into parts with more respective less impact on >> non-ppc platforms. >> >> The change is derived from well-tested code in our VM. Originally it was >> crafted to require the least changes of VM coding, I changed it to be better >> streamlined with the VM. >> I tested this change to deliver heaps at about the same addresses as before. >> Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap >> in a better compressed oops mode is found, though. >> I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. >> >> Best regards, >> Goetz. >> >> From tobias.hartmann at oracle.com Mon Nov 17 06:44:15 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 17 Nov 2014 07:44:15 +0100 Subject: [8u40] Backport RFR: 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is not compilable after 3 iterations' In-Reply-To: <546474BD.2080201@oracle.com> References: <546474BD.2080201@oracle.com> Message-ID: <5469993F.90001@oracle.com> Hi, can I get a review for this? Thanks, Tobias On 13.11.2014 10:07, Tobias Hartmann wrote: > Hi, > > please review the following backport request for 8u40. 
> > 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is > not compilable after 3 iterations' > https://bugs.openjdk.java.net/browse/JDK-8056071 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0d599246de33 > > The changes were pushed on Tuesday. Nightly testing showed no problems. The > changes apply cleanly to 8u40. > > Thanks, > Tobias > From goetz.lindenmaier at sap.com Mon Nov 17 08:32:51 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 17 Nov 2014 08:32:51 +0000 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <5466A656.40707@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> <5466A656.40707@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> Hi Vladimir, > It is very significant rewriting and it takes time to evaluate it. Yes, I know ... and I don't want to push, but nevertheless a ping can be useful sometimes. Thanks a lot for looking at it. > And I would not say it is simpler then before :) If I fix what you propose it's gonna get even more simple ;) > These is what I found so far. > The idea to try to allocate in a range instead of just below > UnscaledOopHeapMax or OopEncodingHeapMax is good. So I would ask to do > several attempts (3?) on non_PPC64 platforms too. Set to 3. > It is matter of preference but I am not comfortable with switch in loop. > For me sequential 'if (addr == 0)' checks is simpler. I'll fix this. > One thing worries me that you release found space and try to get it > again with ReservedHeapSpace. Is it possible to add new > ReservedHeapSpace ctor which simple use already allocated space? This was to keep diff's small, but I also think a new constructor is good. I'll fix this. 
> The next code in ReservedHeapSpace() is hard to understand (): >(UseCompressedOops && (requested_address == NULL || requested_address+size > (char*)OopEncodingHeapMax) ? > may be move all this into noaccess_prefix_size() and add comments. I have to redo this anyways if I make new constructors. > Why you need prefix when requested_address == NULL? If we allocate with NULL, we most probably will get a heap where base != NULL and thus need a noaccess prefix. > Remove next comment in universe.cpp: > // SAPJVM GL 2014-09-22 Removed. > Again you will release space so why bother to include space for classes?: >+ // For small heaps, save some space for compressed class pointer >+ // space so it can be decoded with no base. This was done like this before. We must assure the upper bound of the heap is low enough that the compressed class space still fits in there. virtualspace.cpp > With new code size+noaccess_prefix could be requested. But later it is > not used if WIN64_ONLY(&& UseLargePages) and you will have empty > non-protected page below heap. There's several points to this: * Also if not protectable, the heap base has to be below the real start of the heap. Else the first object in the heap will be compressed to 'null' and decompression will fail. * If we don't reserve the memory other stuff can end up in this space. On errors, if would be quite unexpected to find memory there. * To get a heap for the new disjoint mode I must control the size of this. Requesting a heap starting at (aligned base + prefix) is more likely to fail. * The size for the prefix must anyways be considered when deciding whether the heap is small enough to run with compressed oops. So distinguishing the case where we really can omit this would require quite some additional checks everywhere, and I thought it's not worth it. matcher.hpp > Universe::narrow_oop_use_implicit_null_checks() should be true for such > case too. So you can add new condition with || to existing ones. 
The > only condition you relax is base != NULL. Right? Yes, that's how it's intended. arguments.* files > Why you need PropertyList_add changes. Oh, the code using it got lost. I commented on this in the description in the webrev. "To more efficiently run expensive tests in various compressed oop modes, we set a property with the mode the VM is running in. So far it's called "com.sap.vm.test.compressedOopsMode" better suggestions are welcome (and necessary I guess). Our long running tests that are supposed to run in a dedicated compressed oop mode check this property and abort themselves if it's not the expected mode." When I know about the heap I do Arguments::PropertyList_add(new SystemProperty("com.sap.vm.test.compressedOopsMode", narrow_oop_mode_to_string(narrow_oop_mode()), false)); in universe.cpp. On some OSes it's deterministic which modes work, there we don't start such tests. Others, as you mentioned OSX, are very indeterministic. Here we save testruntime with this. But it's not that important. We can still parse the PrintCompresseOopsMode output after the test and discard the run. > Do you have platform specific changes? Yes, for ppc and aix. I'll submit them once this is in. >From your other mail: > One more thing. You should allow an allocation in the range when returned from OS allocated address does not match > requested address. We had such cases on OSX, for example, when OS allocates at different address but still inside range. Good point. I'll fix that in os::attempt_reserve_memory_in_range. I'll ping again once a new webrev is done! Best regards, Goetz. On 11/10/14 6:57 AM, Lindenmaier, Goetz wrote: > Hi, > > I need to improve a row of things around compressed oops heap handling > to achieve good performance on ppc. > I prepared a first webrev for review: > http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ > > A detailed technical description of the change is in the webrev and according bug. 
> > If requested, I will split the change into parts with more respective less impact on > non-ppc platforms. > > The change is derived from well-tested code in our VM. Originally it was > crafted to require the least changes of VM coding, I changed it to be better > streamlined with the VM. > I tested this change to deliver heaps at about the same addresses as before. > Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap > in a better compressed oops mode is found, though. > I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. > > Best regards, > Goetz. > > From thomas.stuefe at gmail.com Mon Nov 17 10:15:27 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 17 Nov 2014 11:15:27 +0100 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> <5466A656.40707@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> Message-ID: > > > From your other mail: > > One more thing. You should allow an allocation in the range when > returned from OS allocated address does not match > > requested address. We had such cases on OSX, for example, when OS > allocates at different address but still inside range. > Good point. I'll fix that in os::attempt_reserve_memory_in_range. > > I tested this but it did not really improve on matters. On Mac OS and Solaris it seemed that the OS ignores the wish-address given with mmap(3) completely for new allocations; so it returns whatever it pleases (instead of, say, try its best to find a pointer in the vicinity of the wish pointer). Which almost never was in the specified range. So, we felt it did not improve matters over a simple mmap(NULL,...) which happens later anyway as a fallback. 
Kind Regards, Thomas From magnus.ihse.bursie at oracle.com Mon Nov 17 12:11:16 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 17 Nov 2014 13:11:16 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <5466699D.2050808@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> Message-ID: <5469E5E4.7040606@oracle.com> On 2014-11-14 21:44, Dean Long wrote: > >> >> The distribution exception is there exactly since anyone should be >> able to distribute the files with their configure script. That does >> not mean that you are allowed to edit it, though. > What if we require Autoconf to be installed on the host? Does that > solve any problems? No, unfortunately not. /Magnus From markus.gronlund at oracle.com Mon Nov 17 12:16:09 2014 From: markus.gronlund at oracle.com (Markus Grönlund) Date: Mon, 17 Nov 2014 04:16:09 -0800 (PST) Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> <54641F95.5030201@oracle.com> Message-ID: Thomas, thanks for adding the comments, looks good. 
Cheers Markus -----Original Message----- From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] Sent: den 14 november 2014 11:00 To: David Holmes Cc: HotSpot Open Source Developers Subject: Re: RFR (L): 8062370: Various minor code improvements Hi David, thanks for looking. See here the corrected webrev: http://cr.openjdk.java.net/~simonis/webrevs/8064779/ for this new bug report: https://bugs.openjdk.java.net/browse/JDK-8064779 Best Regards, Thomas On Thu, Nov 13, 2014 at 4:03 AM, David Holmes wrote: > Hi Thomas, > > On 12/11/2014 8:31 PM, Thomas Stüfe wrote: > >> Hi, >> >> could you please review this little addition? (added comments for >> jio_snprintf) >> >> http://cr.openjdk.java.net/~simonis/webrevs/8062370/ >> > > A new bug is needed for these changes. > > As people rarely look at the header file when reading the code could > you augment the last line of the comment in jvm.cpp from: > > + // return always -1. > > to > > + // always return -1, and perform null termination. > > Thanks, > David > > From thomas.stuefe at gmail.com Mon Nov 17 13:34:54 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 17 Nov 2014 14:34:54 +0100 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> <54641F95.5030201@oracle.com> Message-ID: thanks, and thanks to Coleen for sponsoring!
On Mon, Nov 17, 2014 at 1:16 PM, Markus Grönlund wrote: > Thomas, > > thanks for adding the comments, looks good. > > Cheers > Markus > > -----Original Message----- > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > Sent: den 14 november 2014 11:00 > To: David Holmes > Cc: HotSpot Open Source Developers > Subject: Re: RFR (L): 8062370: Various minor code improvements > > Hi David, > > thanks for looking. See here the corrected webrev: > > http://cr.openjdk.java.net/~simonis/webrevs/8064779/ > > for this new bug report: > > https://bugs.openjdk.java.net/browse/JDK-8064779 > Best Regards, Thomas > > > On Thu, Nov 13, 2014 at 4:03 AM, David Holmes > wrote: > > > Hi Thomas, > > > > On 12/11/2014 8:31 PM, Thomas Stüfe wrote: > > > >> Hi, > >> > >> could you please review this little addition? (added comments for > >> jio_snprintf) > >> > >> http://cr.openjdk.java.net/~simonis/webrevs/8062370/ > >> > > > > A new bug is needed for these changes. > > > > As people rarely look at the header file when reading the code could > > you augment the last line of the comment in jvm.cpp from: > > > > + // return always -1. > > > > to > > > > + // always return -1, and perform null termination. > > > > Thanks, > > David > > > > > From sgehwolf at redhat.com Mon Nov 17 13:54:39 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 17 Nov 2014 14:54:39 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: References: <1415971123.3278.55.camel@localhost.localdomain> Message-ID: <1416232479.3760.18.camel@localhost.localdomain> Hi Volker, On Fri, 2014-11-14 at 14:46 +0100, Volker Simonis wrote: > Hi Severin, > > I can sponsor this change if we get one more review. Thanks! > The only comment I have is that in ZeroStack::suggest_size() there > doesn't seem to be a handling for the potentially negative values > returned by ZeroStack::abi_stack_available(). OK. I could do something like this.
diff --git a/src/cpu/zero/vm/stack_zero.cpp b/src/cpu/zero/vm/stack_zero.cpp --- a/src/cpu/zero/vm/stack_zero.cpp +++ b/src/cpu/zero/vm/stack_zero.cpp @@ -30,7 +30,9 @@ int ZeroStack::suggest_size(Thread *thread) const { assert(needs_setup(), "already set up"); - return align_size_down(abi_stack_available(thread) / 2, wordSize); + ssize_t abi_available = abi_stack_available(thread); + assert(abi_available >= 0, "available abi stack must be >= 0"); + return align_size_down(abi_available / 2, wordSize); } Did you have something else in mind? I'm not sure if it's necessary since suggest_size is only being called when the zero stack is set up (src/cpu/zero/vm/stubGenerator_zero.cpp). Thoughts? Thanks, Severin > On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: > > Hi, > > > > Could I please get a review and sponsor for the following fix: > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8064815 > > webrev: > > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ > > > > When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware > > it throws a StackOverflowError. This is because the stack bound > > calculation does not account for red and yellow pages. > > > > The bug has a slightly different patch attached. The changes to > > hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. > > > > Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws > > StackOverflowError without this fix and works fine with this fix > > applied. > > > > Note that this problem seems to surface on architectures where pages are > > large. PPC is one such instance. Page size there is 64KB and Zero > > initially sets its minimal stack allowance to 64KB (one page), > > src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets > > potentially increased if min_stack_allowed is small. The case on PPC > > Zero. 
> > > > However, then later at runtime the calculation of available stack is > > wrong since it does not account for red and yellow pages. Thus it thinks > > there is too little stack available where in fact more stack is > > available. > > > > Thanks, > > Severin > > From aph at redhat.com Mon Nov 17 16:14:45 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 17 Nov 2014 16:14:45 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546572B8.9080005@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> Message-ID: <546A1EF5.6060607@redhat.com> Hi, New webrev at http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611-2/ I have tried to integrate everybody's suggestions, and I hope that I haven't missed anything. Thanks to everyone who helped. Andrew. From sgehwolf at redhat.com Mon Nov 17 17:00:19 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 17 Nov 2014 18:00:19 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <54663B1D.6060408@oracle.com> References: <1415971123.3278.55.camel@localhost.localdomain> <54663B1D.6060408@oracle.com> Message-ID: <1416243619.3760.32.camel@localhost.localdomain> Hi Vladimir, On Fri, 2014-11-14 at 09:25 -0800, Vladimir Kozlov wrote: > So this code assumes that 'stack_used' local is allocated on stack and > uses its address to calculated used space. But where is a guarantee > that passed 'thread' is the current thread? Yes, in src/cpu/zero/vm/stubGenerator_zero.cpp the stack is alloca'ed. I don't think there is currently a guarantee that the passed thread is the current thread. What kind of guarantee are you looking for? Any suggestions? 
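The concern quoted above can be sketched as follows. This is an illustration under assumptions, not the Zero code: ThreadInfo and stack_used are hypothetical names, and the stack is assumed to grow towards lower addresses. Taking the address of a local only measures the stack of the thread that executes the function, so the sketch refuses to run for any other thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for the VM's Thread (illustration only).
struct ThreadInfo {
  std::thread::id id;         // which thread this record describes
  std::uintptr_t stack_base;  // highest stack address (assumes a downward-growing stack)
};

// Estimates the bytes of stack used by `t` from the address of a local.
// The local lives on the *calling* thread's stack, so the result is only
// meaningful when `t` describes the caller; otherwise throw.
std::ptrdiff_t stack_used(const ThreadInfo& t) {
  if (t.id != std::this_thread::get_id()) {
    throw std::logic_error("stack_used must run on the thread it measures");
  }
  char marker = 0;
  std::uintptr_t here = reinterpret_cast<std::uintptr_t>(&marker);
  return static_cast<std::ptrdiff_t>(t.stack_base - here);
}
```

Inside the VM a failed check would more likely abort than throw; the exception is used here only so the mismatch can be observed from ordinary code.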
Cheers, Severin > Thanks, > Vladimir > > On 11/14/14 5:46 AM, Volker Simonis wrote: > > Hi Severin, > > > > I can sponsor this change if we get one more review. > > > > The only comment I have is that in ZeroStack::suggest_size() there > > doesn't seem to be a handling for the potentially negative values > > returned by ZeroStack::abi_stack_available(). > > > > Regards, > > Volker > > > > > > On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: > >> Hi, > >> > >> Could I please get a review and sponsor for the following fix: > >> > >> bug: https://bugs.openjdk.java.net/browse/JDK-8064815 > >> webrev: > >> https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ > >> > >> When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware > >> it throws a StackOverflowError. This is because the stack bound > >> calculation does not account for red and yellow pages. > >> > >> The bug has a slightly different patch attached. The changes to > >> hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. > >> > >> Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws > >> StackOverflowError without this fix and works fine with this fix > >> applied. > >> > >> Note that this problem seems to surface on architectures where pages are > >> large. PPC is one such instance. Page size there is 64KB and Zero > >> initially sets its minimal stack allowance to 64KB (one page), > >> src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets > >> potentially increased if min_stack_allowed is small. The case on PPC > >> Zero. > >> > >> However, then later at runtime the calculation of available stack is > >> wrong since it does not account for red and yellow pages. Thus it thinks > >> there is too little stack available where in fact more stack is > >> available. 
> >> > >> Thanks, > >> Severin > >> From vladimir.kozlov at oracle.com Mon Nov 17 17:43:09 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 17 Nov 2014 09:43:09 -0800 Subject: [8u40] Backport RFR: 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is not compilable after 3 iterations' In-Reply-To: <546474BD.2080201@oracle.com> References: <546474BD.2080201@oracle.com> Message-ID: <546A33AD.50803@oracle.com> Looks good. Thanks, Vladimir On 11/13/14 1:07 AM, Tobias Hartmann wrote: > Hi, > > please review the following backport request for 8u40. > > 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is > not compilable after 3 iterations' > https://bugs.openjdk.java.net/browse/JDK-8056071 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0d599246de33 > > The changes were pushed on Tuesday. Nightly testing showed no problems. The > changes apply cleanly to 8u40. > > Thanks, > Tobias > From vladimir.kozlov at oracle.com Mon Nov 17 17:54:30 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 17 Nov 2014 09:54:30 -0800 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <1416243619.3760.32.camel@localhost.localdomain> References: <1415971123.3278.55.camel@localhost.localdomain> <54663B1D.6060408@oracle.com> <1416243619.3760.32.camel@localhost.localdomain> Message-ID: <546A3656.4000400@oracle.com> I mean next guarantee: guarantee(Thread::current() == thread, "should run in the same thread"); otherwise 'thread->stack_base() - (address) &stack_used' will give wrong result if threads are different. Thanks, Vladimir On 11/17/14 9:00 AM, Severin Gehwolf wrote: > Hi Vladimir, > > On Fri, 2014-11-14 at 09:25 -0800, Vladimir Kozlov wrote: >> So this code assumes that 'stack_used' local is allocated on stack and >> uses its address to calculated used space. But where is a guarantee >> that passed 'thread' is the current thread? 
> > Yes, in src/cpu/zero/vm/stubGenerator_zero.cpp the stack is alloca'ed. I > don't think there is currently a guarantee that the passed thread is the > current thread. What kind of guarantee are you looking for? Any > suggestions? > > Cheers, > Severin > >> Thanks, >> Vladimir >> >> On 11/14/14 5:46 AM, Volker Simonis wrote: >>> Hi Severin, >>> >>> I can sponsor this change if we get one more review. >>> >>> The only comment I have is that in ZeroStack::suggest_size() there >>> doesn't seem to be a handling for the potentially negative values >>> returned by ZeroStack::abi_stack_available(). >>> >>> Regards, >>> Volker >>> >>> >>> On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: >>>> Hi, >>>> >>>> Could I please get a review and sponsor for the following fix: >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8064815 >>>> webrev: >>>> https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ >>>> >>>> When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware >>>> it throws a StackOverflowError. This is because the stack bound >>>> calculation does not account for red and yellow pages. >>>> >>>> The bug has a slightly different patch attached. The changes to >>>> hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. >>>> >>>> Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws >>>> StackOverflowError without this fix and works fine with this fix >>>> applied. >>>> >>>> Note that this problem seems to surface on architectures where pages are >>>> large. PPC is one such instance. Page size there is 64KB and Zero >>>> initially sets its minimal stack allowance to 64KB (one page), >>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets >>>> potentially increased if min_stack_allowed is small. The case on PPC >>>> Zero. >>>> >>>> However, then later at runtime the calculation of available stack is >>>> wrong since it does not account for red and yellow pages. 
Thus it thinks >>>> there is too little stack available where in fact more stack is >>>> available. >>>> >>>> Thanks, >>>> Severin >>>> > > > From volker.simonis at gmail.com Mon Nov 17 19:14:10 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 17 Nov 2014 20:14:10 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <1416232479.3760.18.camel@localhost.localdomain> References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> Message-ID: On Mon, Nov 17, 2014 at 2:54 PM, Severin Gehwolf wrote: > Hi Volker, > > On Fri, 2014-11-14 at 14:46 +0100, Volker Simonis wrote: >> Hi Severin, >> >> I can sponsor this change if we get one more review. > > Thanks! > >> The only comment I have is that in ZeroStack::suggest_size() there >> doesn't seem to be a handling for the potentially negative values >> returned by ZeroStack::abi_stack_available(). > > OK. I could do something like this. > > diff --git a/src/cpu/zero/vm/stack_zero.cpp > b/src/cpu/zero/vm/stack_zero.cpp > --- a/src/cpu/zero/vm/stack_zero.cpp > +++ b/src/cpu/zero/vm/stack_zero.cpp > @@ -30,7 +30,9 @@ > > int ZeroStack::suggest_size(Thread *thread) const { > assert(needs_setup(), "already set up"); > - return align_size_down(abi_stack_available(thread) / 2, wordSize); > + ssize_t abi_available = abi_stack_available(thread); > + assert(abi_available >= 0, "available abi stack must be >= 0"); > + return align_size_down(abi_available / 2, wordSize); > } > > Did you have something else in mind? I'm not sure if it's necessary > since suggest_size is only being called when the zero stack is set up > (src/cpu/zero/vm/stubGenerator_zero.cpp). Thoughts? > I think an assertion would be good (but 'abi_available' should be an 'int' because 'ssize_t' is only specified to hold values from -1 to SSIZE_MAX). 
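To illustrate why a negative value matters here, a small sketch under assumptions: suggest_size_checked is a hypothetical, simplified stand-in for the size suggestion, not the Zero implementation. It returns 0 instead of asserting, which is a different choice from the assertion proposed in this thread, made only so that the behaviour is observable in a non-debug build.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified size suggestion (illustration only).
// `available` may be negative when the stack limit is already exceeded.
// Returns half of the available bytes rounded down to a multiple of
// `word_size`, or 0 when nothing is available.
int suggest_size_checked(int available, int word_size) {
  assert(word_size > 0);
  if (available <= 0) {
    return 0;  // fail early instead of passing a negative size along
  }
  int half = available / 2;
  return half - (half % word_size);
}

// Shows the hazard: a negative int converted to size_t becomes a huge value.
std::size_t as_alloca_size(int suggested) {
  return static_cast<std::size_t>(suggested);
}
```

For example, suggest_size_checked(100, 8) yields 48, while as_alloca_size(-64) yields a value close to the maximum of size_t, which is the kind of request a later overflow check would have to catch.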
As far as I can see, the possibly negative result of ZeroStack::suggest_size() is cast to a size_t in StubGenerator::call_stub() before doing an alloca() and just a few lines later we call EntryFrame::build() which will do an overflow check. Nevertheless, it's better to fail early in the case of a problem. Thanks, Volker > Thanks, > Severin > >> On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: >> > Hi, >> > >> > Could I please get a review and sponsor for the following fix: >> > >> > bug: https://bugs.openjdk.java.net/browse/JDK-8064815 >> > webrev: >> > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ >> > >> > When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware >> > it throws a StackOverflowError. This is because the stack bound >> > calculation does not account for red and yellow pages. >> > >> > The bug has a slightly different patch attached. The changes to >> > hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. >> > >> > Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws >> > StackOverflowError without this fix and works fine with this fix >> > applied. >> > >> > Note that this problem seems to surface on architectures where pages are >> > large. PPC is one such instance. Page size there is 64KB and Zero >> > initially sets its minimal stack allowance to 64KB (one page), >> > src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets >> > potentially increased if min_stack_allowed is small. The case on PPC >> > Zero. >> > >> > However, then later at runtime the calculation of available stack is >> > wrong since it does not account for red and yellow pages. Thus it thinks >> > there is too little stack available where in fact more stack is >> > available.
>> > >> > Thanks, >> > Severin >> > > > > From vladimir.kozlov at oracle.com Mon Nov 17 20:55:35 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 17 Nov 2014 12:55:35 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A1EF5.6060607@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> Message-ID: <546A60C7.1070408@oracle.com> library_call.cpp: Please, add a comments. And why MemBarAcquire is not generated here for volatile loads on aarch64? In parse3.cpp we still do. Otherwise this looks good. Thanks, Vladimir On 11/17/14 8:14 AM, Andrew Haley wrote: > Hi, > > New webrev at > > http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611-2/ > > I have tried to integrate everybody's suggestions, and I hope that > I haven't missed anything. > > Thanks to everyone who helped. > > Andrew. > From coleen.phillimore at oracle.com Mon Nov 17 22:49:44 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 17 Nov 2014 17:49:44 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM Message-ID: <546A7B88.8000504@oracle.com> Summary: note all MemberNames created on internal list for adjusting method entries. The JVM MemberNameTable code will push all member names on the list rather than trying to index by method_idnum. The code to look up MemberName types wasn't used so was removed. Class redefinition iterates through the table sequentially to update the Method* pointers in saved member names. This change will work with David Chase's change to the Java code for bug 8013267 without the extra code dealing with class redefinition. 
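The approach described in the summary above might look roughly like this. It is a sketch under assumptions: Method, MemberName and MemberNameTable here are simplified, hypothetical stand-ins that only mirror the idea (append every member name, walk the list on redefinition), not the actual HotSpot classes or their memory management.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the VM types (illustration only).
struct Method { int idnum; };
struct MemberName { Method* target; };

class MemberNameTable {
 public:
  // Every created member name is appended; nothing is indexed by idnum,
  // so several member names may refer to the same method.
  void add(MemberName* mn) { names_.push_back(mn); }

  // On class redefinition, walk the list sequentially and repoint every
  // entry that still refers to the old method.
  // Returns the number of entries that were updated.
  std::size_t adjust_method_entries(Method* old_m, Method* new_m) {
    std::size_t updated = 0;
    for (MemberName* mn : names_) {
      if (mn->target == old_m) {
        mn->target = new_m;
        ++updated;
      }
    }
    return updated;
  }

 private:
  std::vector<MemberName*> names_;
};
```

The linear walk costs time proportional to the number of recorded member names, but it cannot miss an entry the way a table with one slot per method index can when two member names share a method.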
Tested with vm.quick.testlist, jck tests and jtreg tests, including the mlvm tests that failed in the bug report. open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ bug link https://bugs.openjdk.java.net/browse/JDK-8042235 Thanks, Coleen From dean.long at oracle.com Mon Nov 17 23:06:01 2014 From: dean.long at oracle.com (Dean Long) Date: Mon, 17 Nov 2014 15:06:01 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A1EF5.6060607@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> Message-ID: <546A7F59.7030903@oracle.com> I only have 2 issues: 1) I'm still hoping for an explanation of the is_last_use change in c1_LinearScan.cpp 2) Do you remember why compactingPermGenGen.o needs -O1? The rest looks good. dl On 11/17/2014 8:14 AM, Andrew Haley wrote: > Hi, > > New webrev at > > http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611-2/ > > I have tried to integrate everybody's suggestions, and I hope that > I haven't missed anything. > > Thanks to everyone who helped. > > Andrew. > From serguei.spitsyn at oracle.com Tue Nov 18 00:44:24 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 17 Nov 2014 16:44:24 -0800 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546A7B88.8000504@oracle.com> References: <546A7B88.8000504@oracle.com> Message-ID: <546A9668.9050709@oracle.com> Coleen, The fix look good. Thank you for taking care about this! Thanks, Serguei On 11/17/14 2:49 PM, Coleen Phillimore wrote: > Summary: note all MemberNames created on internal list for adjusting > method entries. 
> > The JVM MemberNameTable code will push all member names on the list > rather than trying to index by method_idnum. The code to look up > MemberName types wasn't used so was removed. Class redefinition > iterates through the table sequentially to update the Method* pointers > in saved member names. > > This change will work with David Chase's change to the Java code for > bug 8013267 without the extra code dealing with class redefinition. > > Tested with vm.quick.testlist, jck tests and jtreg tests, including > the mlvm tests that failed in the bug report. > > open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ > bug link https://bugs.openjdk.java.net/browse/JDK-8042235 > > Thanks, > Coleen From christian.thalinger at oracle.com Tue Nov 18 00:59:05 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 17 Nov 2014 16:59:05 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <5469E5E4.7040606@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> Message-ID: > On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: > > On 2014-11-14 21:44, Dean Long wrote: >> >>> >>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >> What if we require Autoconf to be installed on the host? Does that solve any problems? > No, unfortunately not. Why not? 
> > /Magnus From david.holmes at oracle.com Tue Nov 18 01:37:05 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Nov 2014 11:37:05 +1000 Subject: RFR (L): 8062370: Various minor code improvements In-Reply-To: References: <4295855A5C1DE049A61835A1887419CC2CF22447@DEWDFEMB12A.global.corp.sap> <545981F4.3050006@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF24352@DEWDFEMB12A.global.corp.sap> <545B48D3.5040009@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF248C2@DEWDFEMB12A.global.corp.sap> <545B4DC4.3020709@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2492F@DEWDFEMB12A.global.corp.sap> <545B5A14.1020508@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF249B5@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF26820@DEWDFEMB12A.global.corp.sap> <2e9bd47c-366a-446b-89d0-b431a5816007@default> <54641F95.5030201@oracle.com> Message-ID: <546AA2C1.8010308@oracle.com> For the record looks good to me. Thanks, David On 14/11/2014 7:59 PM, Thomas Stüfe wrote: > Hi David, > > thanks for looking. See here the corrected webrev: > > http://cr.openjdk.java.net/~simonis/webrevs/8064779/ > > for this new bug report: > > https://bugs.openjdk.java.net/browse/JDK-8064779 > > Best Regards, Thomas > > > On Thu, Nov 13, 2014 at 4:03 AM, David Holmes > wrote: > > Hi Thomas, > > On 12/11/2014 8:31 PM, Thomas Stüfe wrote: > > Hi, > > could you please review this little addition? (added comments for > jio_snprintf) > > http://cr.openjdk.java.net/~simonis/webrevs/8062370/ > > > > A new bug is needed for these changes. > > As people rarely look at the header file when reading the code could > you augment the last line of the comment in jvm.cpp from: > > + // return always -1. > > to > > + // always return -1, and perform null termination.
> > Thanks, > David > From dean.long at oracle.com Tue Nov 18 03:20:16 2014 From: dean.long at oracle.com (Dean Long) Date: Mon, 17 Nov 2014 19:20:16 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> Message-ID: <546ABAF0.1030706@oracle.com> On 11/17/2014 4:59 PM, Christian Thalinger wrote: >> On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: >> >> On 2014-11-14 21:44, Dean Long wrote: >>>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >>> What if we require Autoconf to be installed on the host? Does that solve any problems? >> No, unfortunately not. > Why not? Yes, not just any version of Autoconf, but a newer version with aarch64 support. Then we wouldn't have to edit the script, right? dl >> /Magnus From tobias.hartmann at oracle.com Tue Nov 18 06:03:48 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 18 Nov 2014 07:03:48 +0100 Subject: [8u40] Backport RFR: 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is not compilable after 3 iterations' In-Reply-To: <546A33AD.50803@oracle.com> References: <546474BD.2080201@oracle.com> <546A33AD.50803@oracle.com> Message-ID: <546AE144.1060307@oracle.com> Thanks, Vladimir. Best, Tobias On 17.11.2014 18:43, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 11/13/14 1:07 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following backport request for 8u40. 
>> >> 8056071: compiler/whitebox/IsMethodCompilableTest.java fails with 'method() is >> not compilable after 3 iterations' >> https://bugs.openjdk.java.net/browse/JDK-8056071 >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0d599246de33 >> >> The changes were pushed on Tuesday. Nightly testing showed no problems. The >> changes apply cleanly to 8u40. >> >> Thanks, >> Tobias >> From mikael.gerdin at oracle.com Tue Nov 18 09:03:18 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 18 Nov 2014 10:03:18 +0100 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A7F59.7030903@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A7F59.7030903@oracle.com> Message-ID: <546B0B56.4020701@oracle.com> Dean, On 2014-11-18 00:06, Dean Long wrote: > I only have 2 issues: > > 1) I'm still hoping for an explanation of the is_last_use change in > c1_LinearScan.cpp > 2) Do you remember why compactingPermGenGen.o needs -O1? compactingPermGenGen.cpp was removed as of JDK8-b58, more than two years ago. /Mikael > > The rest looks good. > > dl > > On 11/17/2014 8:14 AM, Andrew Haley wrote: >> Hi, >> >> New webrev at >> >> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611-2/ >> >> I have tried to integrate everybody's suggestions, and I hope that >> I haven't missed anything. >> >> Thanks to everyone who helped. >> >> Andrew. 
>> > From aph at redhat.com Tue Nov 18 09:34:42 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 18 Nov 2014 09:34:42 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A7F59.7030903@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A7F59.7030903@oracle.com> Message-ID: <546B12B2.3020103@redhat.com> On 17/11/14 23:06, Dean Long wrote: > 1) I'm still hoping for an explanation of the is_last_use change in > c1_LinearScan.cpp I don't exactly know; it was a long time ago. The reasonable thing for me to do is take it out and do some testing, but I am a little nervous about the amount of real-world testing I'd need to do to be sure. I can appreciate that we don't really want to have mysteries like this in new code, though. > 2) Do you remember why compactingPermGenGen.o needs -O1? It's probably obsolete: there was a bug in amd64 which needed it, and the -O1 is still there in amd64.make. There's also the NOOPT for the copied fdlibm routines in sharedRuntimeTrig.o: I suspect that's obsolete too. The right thing to do here is, I suggest, to take out both of these dubious fragments but be prepared to re-commit them. Andrew. 
From roland.westrelin at oracle.com Tue Nov 18 09:38:18 2014 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Tue, 18 Nov 2014 10:38:18 +0100 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A60C7.1070408@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> Message-ID: There's very little code that's marked with #ifdef X86 or #ifdef SPARC in the C2 code. Wouldn't we want to hide everything that is AARCH64 specific behind functions in Matcher like we do elsewhere? Roland. From aph at redhat.com Tue Nov 18 09:38:47 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 18 Nov 2014 09:38:47 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546A60C7.1070408@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> Message-ID: <546B13A7.7080103@redhat.com> On 17/11/14 20:55, Vladimir Kozlov wrote: > library_call.cpp: > > Please, add a comments. > And why MemBarAcquire is not generated here for volatile loads on > aarch64? In parse3.cpp we still do. Okay. I've been thinking about that. In aarch64.ad I elide an unnecessary MemBarAcquire (by looking at the ideal graph). This is something of a kludge, and perhaps it would be better not to generate it in the first place. What do you think? Thanks, Andrew.
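As a loose analogy for the MemBarAcquire question above, the following C++ sketch contrasts a load followed by a separate acquire fence with a load that carries acquire semantics itself. It assumes, which the thread does not state, that the port can use a load-acquire instruction for such loads; it is not C2 code and does not model the full semantics of Java volatile accesses. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> payload{0};
std::atomic<bool> ready{false};

// Writer: publish the payload, then set the flag with release semantics.
void publish(int v) {
  payload.store(v, std::memory_order_relaxed);
  ready.store(true, std::memory_order_release);
}

// Variant 1: relaxed load plus a separate acquire fence, comparable to
// emitting a load followed by a stand-alone acquire barrier.
int read_with_fence() {
  if (!ready.load(std::memory_order_relaxed)) return -1;
  std::atomic_thread_fence(std::memory_order_acquire);
  return payload.load(std::memory_order_relaxed);
}

// Variant 2: the load itself is an acquire load, so no extra fence is
// needed; a separate barrier after it would be redundant.
int read_with_acquire_load() {
  if (!ready.load(std::memory_order_acquire)) return -1;
  return payload.load(std::memory_order_relaxed);
}
```

Both readers give the same ordering guarantee to the caller; the difference is only where the ordering is expressed, which is the trade-off between eliding the barrier afterwards and not generating it in the first place.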
From aph at redhat.com Tue Nov 18 09:39:41 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 18 Nov 2014 09:39:41 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546B0B56.4020701@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A7F59.7030903@oracle.com> <546B0B56.4020701@oracle.com> Message-ID: <546B13DD.1030004@redhat.com> On 18/11/14 09:03, Mikael Gerdin wrote: > compactingPermGenGen.cpp was removed as of JDK8-b58, more than two years > ago. Ah, perhaps we should remove that from all back-ends, then. :-) Thanks, Andrew. From volker.simonis at gmail.com Tue Nov 18 10:03:05 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 18 Nov 2014 11:03:05 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: <5460CDF1.8050205@oracle.com> References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> <5460CDF1.8050205@oracle.com> Message-ID: On Mon, Nov 10, 2014 at 3:38 PM, Mikael Vidstedt wrote: > > On 2014-11-07 14:12, Volker Simonis wrote: >> >> On Fri, Nov 7, 2014 at 12:56 PM, Mikael Vidstedt >> wrote: >>> >>> Volker, >>> >>> Thanks for reminding me, this totally slipped my mind. >>> >>> I think it's fair to say say we've given this enough time for feedback, >>> and >>> that the feedback has been all supportive. With that in mind I consider >>> the >>> proposal approved and effective immediately! >>> >> OK great. So does this mean we can now push reviewed changes to the >> ppc/aix subdirs right away? 
> > > That is indeed the idea - modulo the "if at review code review time a change > is for some reason deemed to be risky and/or otherwise have impact on shared > files" part which, again, hopefully is rare. > You're right - it works! I've just pushed my first AIX-only change to hotspot-rt! Thanks, Volker > Cheers, > Mikael > > >> >>> Cheers, >>> Mikael >>> >>> >>> On 2014-11-06 15:35, Volker Simonis wrote: >>>> >>>> Hi Mikael, >>>> >>>> just wanted to ask what's the status of this project? >>>> I hope it was not just a JavaOne hoax :) >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis >>>> wrote: >>>>> >>>>> Thanks Mikael, that sounds good! >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >>>>> wrote: >>>>>> >>>>>> Volker, >>>>>> >>>>>> The proposal is only to change how the changes are pushed, not which >>>>>> forests >>>>>> changes can be pushed to. That is, we would still require hotspot >>>>>> changes to >>>>>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or >>>>>> to >>>>>> the >>>>>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be >>>>>> applied on >>>>>> all those (four) forests. Reasonable? >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>>>> >>>>>> On 2014-09-12 11:38, Volker Simonis wrote: >>>>>>> >>>>>>> Hi Mikael, >>>>>>> >>>>>>> there's one more question that came to my mind: will the new rule >>>>>>> apply to all hotspot respitories (i.e. jdk9/hs-rt/hotspot, >>>>>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>>>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ? >>>>>>> >>>>>>> Thanks, >>>>>>> Volker >>>>>>> >>>>>>> >>>>>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>>>>> wrote: >>>>>>>> >>>>>>>> Andrew/Volker, >>>>>>>> >>>>>>>> Thanks for the positive feedback. 
The goal of the proposal is to >>>>>>>> simplify >>>>>>>> pushing changes which are effectively not tested by the jprt system >>>>>>>> anyway. >>>>>>>> The proposed relaxation would not affect work on other >>>>>>>> infrastructure >>>>>>>> projects in any relevant way, but would hopefully improve all our >>>>>>>> lives >>>>>>>> significantly immediately. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> >>>>>>>> >>>>>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>>>>> >>>>>>>>> Hi Mikael, >>>>>>>>> >>>>>>>>> thanks a lot for this proposal. I think this will dramatically >>>>>>>>> simplify our work to keep our ports up to date! So I fully support >>>>>>>>> it. >>>>>>>>> >>>>>>>>> Nevertheless, I think this can only be a first step towards fully >>>>>>>>> open >>>>>>>>> the JPRT system to developers outside Oracle. With "opening" I mean >>>>>>>>> to >>>>>>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>>>>>> jobs as well as allowing porting projects to add hardware which >>>>>>>>> builds >>>>>>>>> and tests the HotSpot on alternative platforms. >>>>>>>>> >>>>>>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>>>>>> doubts that this simplification will hopefully not push the >>>>>>>>> realization of a truly OPEN JPRT system even further away. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Volker >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> All, >>>>>>>>>> >>>>>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is >>>>>>>>>> highly >>>>>>>>>> platform dependent and also tightly coupled with the tool chains >>>>>>>>>> on >>>>>>>>>> the >>>>>>>>>> various platforms. 
Each platform/tool chain combination has its >>>>>>>>>> set >>>>>>>>>> of >>>>>>>>>> special quirks, and code must be implemented in a way such that it >>>>>>>>>> only >>>>>>>>>> relies on the common subset of syntax and functionality across all >>>>>>>>>> these >>>>>>>>>> combinations. History has taught us that even simple changes can >>>>>>>>>> have >>>>>>>>>> surprising results when compiled with different compilers. >>>>>>>>>> >>>>>>>>>> For more than a decade the Hotspot team has ensured a minimum >>>>>>>>>> quality >>>>>>>>>> level >>>>>>>>>> by requiring all pushes to be done through a build and test system >>>>>>>>>> (jprt) >>>>>>>>>> which guarantees that the code resulting from applying a set of >>>>>>>>>> changes >>>>>>>>>> builds on a set of core platforms and that a set of core tests >>>>>>>>>> pass. >>>>>>>>>> Only >>>>>>>>>> if >>>>>>>>>> all the builds and tests pass will the changes actually be pushed >>>>>>>>>> to >>>>>>>>>> the >>>>>>>>>> target repository. >>>>>>>>>> >>>>>>>>>> We believe that testing like the above, in combination with later >>>>>>>>>> stages >>>>>>>>>> of >>>>>>>>>> testing, is vital to ensuring that the quality level of the >>>>>>>>>> Hotspot >>>>>>>>>> code >>>>>>>>>> remains high and that developers do not run into situations where >>>>>>>>>> the >>>>>>>>>> latest >>>>>>>>>> version has build errors on some platforms. >>>>>>>>>> >>>>>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK >>>>>>>>>> platforms. >>>>>>>>>> From >>>>>>>>>> a >>>>>>>>>> Hotspot perspective this new platform added a set of AIX/PPC >>>>>>>>>> specific >>>>>>>>>> files >>>>>>>>>> including some platform specific changes to shared code. The >>>>>>>>>> AIX/PPC >>>>>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The >>>>>>>>>> same >>>>>>>>>> thing applies for the shark and zero versions of Hotspot. 
>>>>>>>>>> >>>>>>>>>> While Hotspot developers remain committed to making sure changes >>>>>>>>>> are >>>>>>>>>> developed in a way such that the quality level remains high across >>>>>>>>>> all >>>>>>>>>> platforms and variants, because of the above mentioned >>>>>>>>>> complexities >>>>>>>>>> it >>>>>>>>>> is >>>>>>>>>> inevitable that from time to time changes will be made which >>>>>>>>>> introduce >>>>>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>>>>> testing. >>>>>>>>>> >>>>>>>>>> To allow these issues to be resolved more quickly I would like to >>>>>>>>>> propose >>>>>>>>>> a >>>>>>>>>> relaxation in the requirements on how changes to Hotspot are >>>>>>>>>> pushed. >>>>>>>>>> Specifically I would like to allow for direct pushes to the >>>>>>>>>> hotspot/ >>>>>>>>>> repository of files specific to the following >>>>>>>>>> ports/variants/tools: >>>>>>>>>> >>>>>>>>>> * AIX >>>>>>>>>> * PPC >>>>>>>>>> * Shark >>>>>>>>>> * Zero >>>>>>>>>> >>>>>>>>>> Today this translates into the following files: >>>>>>>>>> >>>>>>>>>> - src/cpu/ppc/** >>>>>>>>>> - src/cpu/zero/** >>>>>>>>>> - src/os/aix/** >>>>>>>>>> - src/os_cpu/aix_ppc/** >>>>>>>>>> - src/os_cpu/bsd_zero/** >>>>>>>>>> - src/os_cpu/linux_ppc/** >>>>>>>>>> - src/os_cpu/linux_zero/** >>>>>>>>>> >>>>>>>>>> Note that all changes are still required to go through the normal >>>>>>>>>> development and review cycle; the proposed relaxation only applies >>>>>>>>>> to >>>>>>>>>> how >>>>>>>>>> the changes are pushed. >>>>>>>>>> >>>>>>>>>> If at code review time a change is for some reason deemed to be >>>>>>>>>> risky >>>>>>>>>> and/or >>>>>>>>>> otherwise have impact on shared files the reviewer may request >>>>>>>>>> that >>>>>>>>>> the >>>>>>>>>> change to go through the regular push testing. For changes only >>>>>>>>>> touching >>>>>>>>>> the >>>>>>>>>> above set of files this expected to be rare. >>>>>>>>>> >>>>>>>>>> Please let me know what you think. 
>>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Mikael >>>>>>>>>> > From sgehwolf at redhat.com Tue Nov 18 10:15:44 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 18 Nov 2014 11:15:44 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> Message-ID: <1416305744.3379.20.camel@localhost.localdomain> Hi Volker, On Mon, 2014-11-17 at 20:14 +0100, Volker Simonis wrote: > On Mon, Nov 17, 2014 at 2:54 PM, Severin Gehwolf wrote: > > Hi Volker, > > > > On Fri, 2014-11-14 at 14:46 +0100, Volker Simonis wrote: > >> Hi Severin, > >> > >> I can sponsor this change if we get one more review. > > > > Thanks! > > > >> The only comment I have is that in ZeroStack::suggest_size() there > >> doesn't seem to be a handling for the potentially negative values > >> returned by ZeroStack::abi_stack_available(). > > > > OK. I could do something like this. > > > > diff --git a/src/cpu/zero/vm/stack_zero.cpp > > b/src/cpu/zero/vm/stack_zero.cpp > > --- a/src/cpu/zero/vm/stack_zero.cpp > > +++ b/src/cpu/zero/vm/stack_zero.cpp > > @@ -30,7 +30,9 @@ > > > > int ZeroStack::suggest_size(Thread *thread) const { > > assert(needs_setup(), "already set up"); > > - return align_size_down(abi_stack_available(thread) / 2, wordSize); > > + ssize_t abi_available = abi_stack_available(thread); > > + assert(abi_available >= 0, "available abi stack must be >= 0"); > > + return align_size_down(abi_available / 2, wordSize); > > } > > > > Did you have something else in mind? I'm not sure if it's necessary > > since suggest_size is only being called when the zero stack is set up > > (src/cpu/zero/vm/stubGenerator_zero.cpp). Thoughts? > > > > I think an assertion would be good. Sure. Updated webrev coming soon. > (but 'abi_available' should be an > 'int' because 'ssize_t' is only specified to hold values from -1 to > SSIZE_MAX). OK, thanks! 
In that case I think it's safer to also change back the return type of abi_stack_available(). ssize_t worked since it's defined as signed int on linux. > As far as I can see, the possibly negative result of > ZeroStack::suggest_size() is casted into a size_t in > StubGenerator::call_stub() before doing an alloca() and just a few > lines later we call EntryFrame::build() which will do an overflow > check. Nevertheless its better to fail early in the case of a problem. Makes sense. Thanks for clarifications and the review! Cheers, Severin > Thank, > Volker > > > Thanks, > > Severin > > > >> On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: > >> > Hi, > >> > > >> > Could I please get a review and sponsor for the following fix: > >> > > >> > bug: https://bugs.openjdk.java.net/browse/JDK-8064815 > >> > webrev: > >> > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ > >> > > >> > When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware > >> > it throws a StackOverflowError. This is because the stack bound > >> > calculation does not account for red and yellow pages. > >> > > >> > The bug has a slightly different patch attached. The changes to > >> > hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. > >> > > >> > Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws > >> > StackOverflowError without this fix and works fine with this fix > >> > applied. > >> > > >> > Note that this problem seems to surface on architectures where pages are > >> > large. PPC is one such instance. Page size there is 64KB and Zero > >> > initially sets its minimal stack allowance to 64KB (one page), > >> > src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets > >> > potentially increased if min_stack_allowed is small. The case on PPC > >> > Zero. > >> > > >> > However, then later at runtime the calculation of available stack is > >> > wrong since it does not account for red and yellow pages. 
Thus it thinks > >> > there is too little stack available where in fact more stack is > >> > available. > >> > > >> > Thanks, > >> > Severin > >> > > > > > > > From volker.simonis at gmail.com Tue Nov 18 10:21:34 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 18 Nov 2014 11:21:34 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <1416305744.3379.20.camel@localhost.localdomain> References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> <1416305744.3379.20.camel@localhost.localdomain> Message-ID: On Tue, Nov 18, 2014 at 11:15 AM, Severin Gehwolf wrote: > Hi Volker, > > On Mon, 2014-11-17 at 20:14 +0100, Volker Simonis wrote: >> On Mon, Nov 17, 2014 at 2:54 PM, Severin Gehwolf wrote: >> > Hi Volker, >> > >> > On Fri, 2014-11-14 at 14:46 +0100, Volker Simonis wrote: >> >> Hi Severin, >> >> >> >> I can sponsor this change if we get one more review. >> > >> > Thanks! >> > >> >> The only comment I have is that in ZeroStack::suggest_size() there >> >> doesn't seem to be a handling for the potentially negative values >> >> returned by ZeroStack::abi_stack_available(). >> > >> > OK. I could do something like this. >> > >> > diff --git a/src/cpu/zero/vm/stack_zero.cpp >> > b/src/cpu/zero/vm/stack_zero.cpp >> > --- a/src/cpu/zero/vm/stack_zero.cpp >> > +++ b/src/cpu/zero/vm/stack_zero.cpp >> > @@ -30,7 +30,9 @@ >> > >> > int ZeroStack::suggest_size(Thread *thread) const { >> > assert(needs_setup(), "already set up"); >> > - return align_size_down(abi_stack_available(thread) / 2, wordSize); >> > + ssize_t abi_available = abi_stack_available(thread); >> > + assert(abi_available >= 0, "available abi stack must be >= 0"); >> > + return align_size_down(abi_available / 2, wordSize); >> > } >> > >> > Did you have something else in mind? 
I'm not sure if it's necessary >> > since suggest_size is only being called when the zero stack is set up >> > (src/cpu/zero/vm/stubGenerator_zero.cpp). Thoughts? >> > >> >> I think an assertion would be good. > > Sure. Updated webrev coming soon. > >> (but 'abi_available' should be an >> 'int' because 'ssize_t' is only specified to hold values from -1 to >> SSIZE_MAX). > > OK, thanks! In that case I think it's safer to also change back the > return type of abi_stack_available(). ssize_t worked since it's defined > as signed int on linux. > I don't like ssize_t because of it's unclear semantics. I understand that it happens to work on Linux because it's defined as a signed int there. But then why not use an int in the first place? So I'd prefer if you'd use an int. Thanks, Volker >> As far as I can see, the possibly negative result of >> ZeroStack::suggest_size() is casted into a size_t in >> StubGenerator::call_stub() before doing an alloca() and just a few >> lines later we call EntryFrame::build() which will do an overflow >> check. Nevertheless its better to fail early in the case of a problem. > > Makes sense. Thanks for clarifications and the review! > > Cheers, > Severin > >> Thank, >> Volker >> >> > Thanks, >> > Severin >> > >> >> On Fri, Nov 14, 2014 at 2:18 PM, Severin Gehwolf wrote: >> >> > Hi, >> >> > >> >> > Could I please get a review and sponsor for the following fix: >> >> > >> >> > bug: https://bugs.openjdk.java.net/browse/JDK-8064815 >> >> > webrev: >> >> > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.0/ >> >> > >> >> > When running Maven ("mvn") on a Zero variant build on PPC/PPC64 hardware >> >> > it throws a StackOverflowError. This is because the stack bound >> >> > calculation does not account for red and yellow pages. >> >> > >> >> > The bug has a slightly different patch attached. The changes to >> >> > hotspot/src/os/linux/vm/os_linux.cpp aren't needed for this bug. 
>> >> > >> >> > Testing done: A Zero variant build of OpenJDK 9 on PPC/PPC64 throws >> >> > StackOverflowError without this fix and works fine with this fix >> >> > applied. >> >> > >> >> > Note that this problem seems to surface on architectures where pages are >> >> > large. PPC is one such instance. Page size there is 64KB and Zero >> >> > initially sets its minimal stack allowance to 64KB (one page), >> >> > src/os_cpu/linux_zero/vm/os_linux_zero.cpp. In os::init_2 this gets >> >> > potentially increased if min_stack_allowed is small. The case on PPC >> >> > Zero. >> >> > >> >> > However, then later at runtime the calculation of available stack is >> >> > wrong since it does not account for red and yellow pages. Thus it thinks >> >> > there is too little stack available where in fact more stack is >> >> > available. >> >> > >> >> > Thanks, >> >> > Severin >> >> > >> > >> > >> > > > > From sgehwolf at redhat.com Tue Nov 18 11:05:57 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 18 Nov 2014 12:05:57 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> <1416305744.3379.20.camel@localhost.localdomain> Message-ID: <1416308757.3379.30.camel@localhost.localdomain> Hi Volker, Vladimir, On Tue, 2014-11-18 at 11:21 +0100, Volker Simonis wrote: [...] > >> I think an assertion would be good. > > > > Sure. Updated webrev coming soon. > > > >> (but 'abi_available' should be an > >> 'int' because 'ssize_t' is only specified to hold values from -1 to > >> SSIZE_MAX). > > > > OK, thanks! In that case I think it's safer to also change back the > > return type of abi_stack_available(). ssize_t worked since it's defined > > as signed int on linux. > > > > I don't like ssize_t because of it's unclear semantics. I understand > that it happens to work on Linux because it's defined as a signed int > there. 
But then why not use an int in the first place? So I'd prefer > if you'd use an int. Agreed. Changed it back to use an int. On Mon, 2014-11-17 at 09:54 -0800, Vladimir Kozlov wrote: > I mean next guarantee: > > guarantee(Thread::current() == thread, "should run in the same > thread"); > > otherwise 'thread->stack_base() - (address) &stack_used' will give > wrong result if threads are different. Thanks for the clarification, Vladimir! I've added this guarantee. Updated webrev: https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.1/ Thanks again for the reviews! Cheers, Severin From vladimir.kozlov at oracle.com Tue Nov 18 15:36:00 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 18 Nov 2014 07:36:00 -0800 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <1416308757.3379.30.camel@localhost.localdomain> References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> <1416305744.3379.20.camel@localhost.localdomain> <1416308757.3379.30.camel@localhost.localdomain> Message-ID: <546B6760.8050502@oracle.com> Looks good. Thanks, Vladimir On 11/18/14 3:05 AM, Severin Gehwolf wrote: > Hi Volker, Vladimir, > > On Tue, 2014-11-18 at 11:21 +0100, Volker Simonis wrote: > [...] >>>> I think an assertion would be good. >>> >>> Sure. Updated webrev coming soon. >>> >>>> (but 'abi_available' should be an >>>> 'int' because 'ssize_t' is only specified to hold values from -1 to >>>> SSIZE_MAX). >>> >>> OK, thanks! In that case I think it's safer to also change back the >>> return type of abi_stack_available(). ssize_t worked since it's defined >>> as signed int on linux. >>> >> >> I don't like ssize_t because of it's unclear semantics. I understand >> that it happens to work on Linux because it's defined as a signed int >> there. But then why not use an int in the first place? So I'd prefer >> if you'd use an int. > > Agreed. Changed it back to use an int. 
> > On Mon, 2014-11-17 at 09:54 -0800, Vladimir Kozlov wrote: >> I mean next guarantee: >> >> guarantee(Thread::current() == thread, "should run in the same >> thread"); >> >> otherwise 'thread->stack_base() - (address) &stack_used' will give >> wrong result if threads are different. > > Thanks for the clarification, Vladimir! I've added this guarantee. > > Updated webrev: > https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.1/ > > Thanks again for the reviews! > > Cheers, > Severin > From magnus.ihse.bursie at oracle.com Tue Nov 18 15:36:25 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 18 Nov 2014 16:36:25 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> Message-ID: <546B6779.9050608@oracle.com> On 2014-11-18 01:59, Christian Thalinger wrote: >> On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: >> >> On 2014-11-14 21:44, Dean Long wrote: >>>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >>> What if we require Autoconf to be installed on the host? Does that solve any problems? >> No, unfortunately not. > Why not? Autoconf picks up these files automatically from the build-aux directory. That's also the reason we need to rename the original files and provide wrappers with the same name, since we can't even redirect that functionality to a file with another name. 
/Magnus From coleen.phillimore at oracle.com Tue Nov 18 15:52:01 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 18 Nov 2014 10:52:01 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546A9668.9050709@oracle.com> References: <546A7B88.8000504@oracle.com> <546A9668.9050709@oracle.com> Message-ID: <546B6B21.3060201@oracle.com> Thank you Serguei. I forgot to mention in the RFR that you actually diagnosed the problem, and this fix is based on the one you provided in the bug. Thanks! Coleen On 11/17/14, 7:44 PM, serguei.spitsyn at oracle.com wrote: > Coleen, > > The fix look good. > Thank you for taking care about this! > > Thanks, > Serguei > > On 11/17/14 2:49 PM, Coleen Phillimore wrote: >> Summary: note all MemberNames created on internal list for adjusting >> method entries. >> >> The JVM MemberNameTable code will push all member names on the list >> rather than trying to index by method_idnum. The code to look up >> MemberName types wasn't used so was removed. Class redefinition >> iterates through the table sequentially to update the Method* >> pointers in saved member names. >> >> This change will work with David Chase's change to the Java code for >> bug 8013267 without the extra code dealing with class redefinition. >> >> Tested with vm.quick.testlist, jck tests and jtreg tests, including >> the mlvm tests that failed in the bug report. 
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >> >> Thanks, >> Coleen > From aph at redhat.com Tue Nov 18 17:06:21 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 18 Nov 2014 17:06:21 +0000 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <5464C85C.50908@redhat.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> Message-ID: <546B7C8D.7050409@redhat.com> On 11/13/2014 03:03 PM, Andrew Haley wrote: > On 11/13/2014 03:00 PM, Magnus Ihse Bursie wrote: >> >> hg mv config.sub autoconf-config.sub >> hg cp config.guess config.sub >> >> and then fix config.sub so that it runs autoconf-config.sub and modifies >> the output to what you expect it to be from config.sub when running on >> this particular platform. > > OK, I'll do that. Thanks! I've kicked around a few ideas, and I think this is right now. http://cr.openjdk.java.net/~aph/aarch64-8064357-3/ Thanks, Andrew. 
From christian.thalinger at oracle.com Tue Nov 18 18:09:07 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 18 Nov 2014 10:09:07 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <546B6779.9050608@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> <546B6779.9050608@oracle.com> Message-ID: <94199476-15AD-4226-8C18-2708F6F94DC3@oracle.com> > On Nov 18, 2014, at 7:36 AM, Magnus Ihse Bursie wrote: > > On 2014-11-18 01:59, Christian Thalinger wrote: >>> On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: >>> >>> On 2014-11-14 21:44, Dean Long wrote: >>>>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >>>> What if we require Autoconf to be installed on the host? Does that solve any problems? >>> No, unfortunately not. >> Why not? > > Autoconf picks up these files automatically from the build-aux directory. That's also the reason we need to rename the original files and provide wrappers with the same name, since we can't even redirect that functionality to a file with another name. So do I understand you correctly that the files we need are automatically copied into the workspace but since we want to use our own, old versions we renamed them and use these instead? 
> > /Magnus From volker.simonis at gmail.com Tue Nov 18 18:23:08 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 18 Nov 2014 19:23:08 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: <546B6760.8050502@oracle.com> References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> <1416305744.3379.20.camel@localhost.localdomain> <1416308757.3379.30.camel@localhost.localdomain> <546B6760.8050502@oracle.com> Message-ID: Thanks Vladimir. @Severin: I've just pushed the change to hotspot-rt: http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/acc869dcded3 Regards, Volker On Tue, Nov 18, 2014 at 4:36 PM, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > > On 11/18/14 3:05 AM, Severin Gehwolf wrote: >> >> Hi Volker, Vladimir, >> >> On Tue, 2014-11-18 at 11:21 +0100, Volker Simonis wrote: >> [...] >>>>> >>>>> I think an assertion would be good. >>>> >>>> >>>> Sure. Updated webrev coming soon. >>>> >>>>> (but 'abi_available' should be an >>>>> 'int' because 'ssize_t' is only specified to hold values from -1 to >>>>> SSIZE_MAX). >>>> >>>> >>>> OK, thanks! In that case I think it's safer to also change back the >>>> return type of abi_stack_available(). ssize_t worked since it's defined >>>> as signed int on linux. >>>> >>> >>> I don't like ssize_t because of it's unclear semantics. I understand >>> that it happens to work on Linux because it's defined as a signed int >>> there. But then why not use an int in the first place? So I'd prefer >>> if you'd use an int. >> >> >> Agreed. Changed it back to use an int. >> >> On Mon, 2014-11-17 at 09:54 -0800, Vladimir Kozlov wrote: >>> >>> I mean next guarantee: >>> >>> guarantee(Thread::current() == thread, "should run in the same >>> thread"); >>> >>> otherwise 'thread->stack_base() - (address) &stack_used' will give >>> wrong result if threads are different. >> >> >> Thanks for the clarification, Vladimir! 
I've added this guarantee. >> >> Updated webrev: >> https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.1/ >> >> Thanks again for the reviews! >> >> Cheers, >> Severin >> > From daniel.daugherty at oracle.com Tue Nov 18 18:24:29 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 18 Nov 2014 11:24:29 -0700 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546A7B88.8000504@oracle.com> References: <546A7B88.8000504@oracle.com> Message-ID: <546B8EDD.1060809@oracle.com> On 11/17/14 3:49 PM, Coleen Phillimore wrote: > Summary: note all MemberNames created on internal list for adjusting > method entries. > > The JVM MemberNameTable code will push all member names on the list > rather than trying to index by method_idnum. The code to look up > MemberName types wasn't used so was removed. Class redefinition > iterates through the table sequentially to update the Method* pointers > in saved member names. > > This change will work with David Chase's change to the Java code for > bug 8013267 without the extra code dealing with class redefinition. > > Tested with vm.quick.testlist, jck tests and jtreg tests, including > the mlvm tests that failed in the bug report. > > open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ src/share/vm/classfile/javaClasses.hpp No comments. src/share/vm/classfile/javaClasses.cpp No comments. src/share/vm/oops/instanceKlass.hpp No comments. src/share/vm/oops/instanceKlass.cpp line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) MemberNameTable(idnum_allocated_count()); Not your bug, but what should happen if this new fails? Or is this one of the operator overrides that handles that by killing the VM? src/share/vm/prims/jvm.cpp nit line 612: methodHandle m (THREAD, method); Please delete space between 'm ('. lines 607-609, 616: uses of 'new_obj' Should all of these be switched to 'new_obj_h()'? 
In particular, is line 616 subject to being moved by GC if the methodHandle creation goes to a safepoint? line 620: return JNIHandles::make_local(env, oop(new_obj_h())); is the 'oop(...)' around 'new_obj_h()' redundant? I might be rusty, but isn't 'new_obj_h()' the unhandled oop? src/share/vm/prims/methodHandles.hpp No comments. src/share/vm/prims/methodHandles.cpp No comments. test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java line 142 // static class FooTransformer implements ClassFileTransformer, Opcodes { Do you still need this line? Dan > bug link https://bugs.openjdk.java.net/browse/JDK-8042235 > > Thanks, > Coleen From christian.thalinger at oracle.com Tue Nov 18 18:41:28 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 18 Nov 2014 10:41:28 -0800 Subject: Branch Prediction? In-Reply-To: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> References: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> Message-ID: <874361AB-9854-4CFA-8C16-4569B5CE3A4B@oracle.com> I'm not sure if the silence means nobody knows or nobody cares. Speaking for myself, I don't know of any history on this. > On Nov 8, 2014, at 1:43 AM, Erik Österlund wrote: > > Hi, > > Just out of curiosity, is there some good reason why we don't have a branch prediction macro? > For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, I feel a bit uneasy not telling the compiler that this is pretty likely to succeed, and relying on its guessing. > > Has it been excluded because it's considered not nice or perhaps it was simply never introduced because nobody found it useful? > > Could have some define like this for GCC, which for other compilers reduces to nothing: > > #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) > #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) > > It might not lead to drastic performance improvements, but it feels weird not to tell the compiler what we know and keep secrets from it.
And I think it's also nice for documentation purposes that people reading it also understand that this expression is gonna be true most of the time, and deal with it accordingly. > > /Erik From christian.thalinger at oracle.com Tue Nov 18 18:42:26 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 18 Nov 2014 10:42:26 -0800 Subject: Branch Prediction? In-Reply-To: <874361AB-9854-4CFA-8C16-4569B5CE3A4B@oracle.com> References: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> <874361AB-9854-4CFA-8C16-4569B5CE3A4B@oracle.com> Message-ID: ...or I could just read the next email. Doh. > On Nov 18, 2014, at 10:41 AM, Christian Thalinger wrote: > > I'm not sure if the silence means nobody knows or nobody cares. Speaking for myself, I don't know of any history on this. > >> On Nov 8, 2014, at 1:43 AM, Erik Österlund wrote: >> >> Hi, >> >> Just out of curiosity, is there some good reason why we don't have a branch prediction macro? >> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, I feel a bit uneasy not telling the compiler that this is pretty likely to succeed, and relying on its guessing. >> >> Has it been excluded because it's considered not nice or perhaps it was simply never introduced because nobody found it useful? >> >> Could have some define like this for GCC, which for other compilers reduces to nothing: >> >> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) >> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) >> >> It might not lead to drastic performance improvements, but it feels weird not to tell the compiler what we know and keep secrets from it. And I think it's also nice for documentation purposes that people reading it also understand that this expression is gonna be true most of the time, and deal with it accordingly.
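A minimal sketch of the macro proposed in the quoted message, applied to the kind of load / compare-exchange retry loop it describes. The macro names VM_EXPECT_TRUE and VM_EXPECT_FALSE come from the proposal and are not an existing HotSpot API; the function atomic_add_with_hint is hypothetical, and std::atomic stands in for HotSpot's own cmpxchg primitives. The sketch also normalizes the condition with !! and passes 1 or 0, a small variation on the quoted defines.

```cpp
#include <atomic>
#include <cassert>

// Proposed (hypothetical) hint macros: on GCC-compatible compilers they wrap
// __builtin_expect, on every other compiler they reduce to the bare condition.
#if defined(__GNUC__) || defined(__clang__)
#define VM_EXPECT_TRUE(A)  __builtin_expect(!!(A), 1)
#define VM_EXPECT_FALSE(A) __builtin_expect(!!(A), 0)
#else
#define VM_EXPECT_TRUE(A)  (A)
#define VM_EXPECT_FALSE(A) (A)
#endif

// Hypothetical helper: atomically add 'delta' to 'x' with a
// load / compare-exchange retry loop and return the new value.
int atomic_add_with_hint(std::atomic<int>& x, int delta) {
  int expected = x.load(std::memory_order_relaxed);
  // The exchange is expected to succeed, so a failure is marked as unlikely.
  // On failure compare_exchange_weak stores the current value in 'expected',
  // so the loop simply retries with the refreshed value.
  while (VM_EXPECT_FALSE(!x.compare_exchange_weak(expected, expected + delta))) {
    // Contended or spurious failure: nothing to do but retry.
  }
  return expected + delta;
}
```

Whether the hint changes the generated code depends on the compiler and target; as the message says, part of the value is documenting which outcome is expected.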
>> >> /Erik > From christian.thalinger at oracle.com Tue Nov 18 18:47:51 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 18 Nov 2014 10:47:51 -0800 Subject: RFR: AARCH64: 8064594: Top-level JDK changes In-Reply-To: <5464FFB2.2050205@oracle.com> References: <546348F8.9060900@redhat.com> <54647DC5.9010102@redhat.com> <5464FFB2.2050205@oracle.com> Message-ID: <92D90230-1F37-4506-9B1D-576FEF49F805@oracle.com> > On Nov 13, 2014, at 11:00 AM, joe darcy wrote: > > FWIW, if I were creating a new file by first copying an old file, I would use a copyright range from the creation date of the old file to the current year. Fair enough. > > -Joe > > On 11/13/2014 1:45 AM, Andrew Haley wrote: >> On 12/11/14 23:51, Christian Thalinger wrote: >>> The new jvm.cfg files should only have a copyright year of 2014. >> Why, exactly? They have been around for a while. >> >> Andrew. >> > From serguei.spitsyn at oracle.com Tue Nov 18 19:58:37 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Nov 2014 11:58:37 -0800 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546B6B21.3060201@oracle.com> References: <546A7B88.8000504@oracle.com> <546A9668.9050709@oracle.com> <546B6B21.3060201@oracle.com> Message-ID: <546BA4ED.6020905@oracle.com> No problem, Coleen. Thanks! Serguei On 11/18/14 7:52 AM, Coleen Phillimore wrote: > > Thank you Serguei. I forgot to mention in the RFR that you actually > diagnosed the problem, and this fix is based on the one you provided > in the bug. > > Thanks! > Coleen > > On 11/17/14, 7:44 PM, serguei.spitsyn at oracle.com wrote: >> Coleen, >> >> The fix look good. >> Thank you for taking care about this! >> >> Thanks, >> Serguei >> >> On 11/17/14 2:49 PM, Coleen Phillimore wrote: >>> Summary: note all MemberNames created on internal list for adjusting >>> method entries. 
>>> >>> The JVM MemberNameTable code will push all member names on the list >>> rather than trying to index by method_idnum. The code to look up >>> MemberName types wasn't used so was removed. Class redefinition >>> iterates through the table sequentially to update the Method* >>> pointers in saved member names. >>> >>> This change will work with David Chase's change to the Java code for >>> bug 8013267 without the extra code dealing with class redefinition. >>> >>> Tested with vm.quick.testlist, jck tests and jtreg tests, including >>> the mlvm tests that failed in the bug report. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >>> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >>> >>> Thanks, >>> Coleen >> > From max.ockner at oracle.com Tue Nov 18 21:53:34 2014 From: max.ockner at oracle.com (Max Ockner) Date: Tue, 18 Nov 2014 16:53:34 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <5463C52D.4000600@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> Message-ID: <546BBFDE.8040003@oracle.com> Hello all, Please review this minor cleanup: Bug ID: 8060074 Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ Summary: (1) os::free takes two arguments, but never uses the second argument, which is a MEMFLAG. I have removed this argument from every os::free call. (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp also takes a MEMFLAG argument, which is only used to call os::free. Now an unused argument, it has been removed from all FreeHeap calls. No other methods which directly call os::free() have this problem. (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp takes A MEMFLAG argument which is passed to FreeHeap, and nothing else. This argument is now unused, and has been removed. No other methods which call FreeHeap have this problem. 
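To illustrate the shape of the cleanup in (1)-(3) above, here is a minimal standalone sketch. It is not the HotSpot source: the `sketch` namespace, the `MemTagSketch` enum and the `os_malloc`, `os_free`, `free_heap` and `FREE_C_HEAP_ARRAY_SKETCH` names are hypothetical stand-ins for os::malloc, os::free, FreeHeap and FREE_C_HEAP_ARRAY. It assumes only what the summary states, namely that the tag is consumed on the allocation side and was never read on the free side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the memory-type tag; not HotSpot's MEMFLAGS.
enum MemTagSketch { mtSketchInternal, mtSketchCompiler };

namespace sketch {
  // Allocation keeps its tag parameter: this is where a tracker would use it.
  inline void* os_malloc(std::size_t size, MemTagSketch tag) {
    (void)tag;  // a real tracker would record the tag here
    return std::malloc(size);
  }

  // Before the cleanup: void os_free(void* p, MemTagSketch tag);
  // The tag was never read, so the parameter is dropped.
  inline void os_free(void* p) { std::free(p); }

  // The wrapper only forwarded the tag to os_free, so it loses it as well.
  inline void free_heap(void* p) { os_free(p); }
}

// The macro passed its tag to the wrapper and nothing else,
// so its third argument goes away too.
#define FREE_C_HEAP_ARRAY_SKETCH(type, old) sketch::free_heap((char*)(old))

namespace sketch {
  // Allocate, touch and release an array through the trimmed-down API.
  inline bool round_trip() {
    int* a = static_cast<int*>(os_malloc(4 * sizeof(int), mtSketchInternal));
    if (a == nullptr) return false;
    a[3] = 42;
    bool ok = (a[3] == 42);
    FREE_C_HEAP_ARRAY_SKETCH(int, a);
    return ok;
  }
}
```

Call sites then shrink from three macro arguments to two, and the allocation side is untouched.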
No methods or macros which use the FREE_C_HEAP_ARRAY macro needed cleanup. I have also removed the extra argument from the definitions of the above methods. Tests: jtreg hotspot tests with -vmoption:"-XX:NativeMemoryTracking=detail" Thanks for your help, Max Ockner From harold.seigel at oracle.com Tue Nov 18 22:05:57 2014 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 18 Nov 2014 17:05:57 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546BBFDE.8040003@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> Message-ID: <546BC2C5.1080107@oracle.com> Hi Max, Changes look good. Could you remove memflags from this comment in allocation.hpp? // FREE_C_HEAP_OBJ(objname, type, memflags) Thanks, Harold On 11/18/2014 4:53 PM, Max Ockner wrote: > Hello all, > Please review this minor cleanup: > > Bug ID: 8060074 > Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ > > Summary: > (1) os::free takes two arguments, but never uses the second argument, > which is a MEMFLAG. I have removed this argument from every os::free > call. > (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp > also takes a MEMFLAG argument, which is only used to call os::free. > Now an unused argument, it has been removed from all FreeHeap calls. > No other methods which directly call os::free() have this problem. > (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp > takes A MEMFLAG argument which is passed to FreeHeap, and nothing > else. This argument is now unused, and has been removed. No other > methods which call FreeHeap have this problem. > > No methods or macros which use the FREE_C_HEAP_ARRAY macro needed > cleanup. I have also removed the extra argument from the definitions > of the above methods.
> > Tests: jtreg hotspot tests with > -vmoption:"-XX:NativeMemoryTracking=detail" > > Thanks for your help, > Max Ockner From vladimir.kozlov at oracle.com Tue Nov 18 23:54:13 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 18 Nov 2014 15:54:13 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546B13A7.7080103@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546B13A7.7080103@redhat.com> Message-ID: <546BDC25.4040202@oracle.com> For these changes let's generate them as you did until now and rework this later. Thanks, Vladimir On 11/18/14 1:38 AM, Andrew Haley wrote: > On 17/11/14 20:55, Vladimir Kozlov wrote: >> library_call.cpp: >> >> Please, add comments. >> And why MemBarAcquire is not generated here for volatile loads on >> aarch64? In parse3.cpp we still do. > > Okay. > > I've been thinking about that. In aarch64.ad I elide an unnecessary > MemBarAcquire (by looking at the ideal graph). This is something of a > kludge, and perhaps it would be better not to generate it in the first > place. What do you think? > > Thanks, > Andrew. > From david.holmes at oracle.com Wed Nov 19 01:45:22 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Nov 2014 11:45:22 +1000 Subject: Branch Prediction? In-Reply-To: References: <2C1B2FA9-32C0-49EE-A617-3ECFCF90C006@lnu.se> <874361AB-9854-4CFA-8C16-4569B5CE3A4B@oracle.com> Message-ID: <546BF632.60007@oracle.com> On 19/11/2014 4:42 AM, Christian Thalinger wrote: > ...or I could just read the next email. Doh. Not so obvious when the subject was changed to "Compiler branch hints". I had to go to the archives to find it. My 2c.
The performance focus has been on generated code, not the runtime code. Personally I dislike these magic macros as they clutter the code. David >> On Nov 18, 2014, at 10:41 AM, Christian Thalinger wrote: >> >> I'm not sure if the silence means nobody knows or nobody cares. Speaking for myself, I don't know of any history on this. >> >>> On Nov 8, 2014, at 1:43 AM, Erik Österlund wrote: >>> >>> Hi, >>> >>> Just out of curiosity, is there some good reason why we don't have a branch prediction macro? >>> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, I feel a bit uneasy not telling the compiler that this is pretty likely to succeed, and relying on its guessing. >>> >>> Has it been excluded because it's considered not nice or perhaps it was simply never introduced because nobody found it useful? >>> >>> Could have some define like this for GCC, which for other compilers reduces to nothing: >>> >>> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) >>> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) >>> >>> It might not lead to drastic performance improvements, but it feels weird not to tell the compiler what we know and keep secrets from it. And I think it's also nice for documentation purposes that people reading it also understand that this expression is gonna be true most of the time, and deal with it accordingly.
>>> >>> /Erik >> > From dean.long at oracle.com Wed Nov 19 01:48:33 2014 From: dean.long at oracle.com (Dean Long) Date: Tue, 18 Nov 2014 17:48:33 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546B12B2.3020103@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A7F59.7030903@oracle.com> <546B12B2.3020103@redhat.com> Message-ID: <546BF6F1.1010507@oracle.com> On 11/18/2014 1:34 AM, Andrew Haley wrote: > On 17/11/14 23:06, Dean Long wrote: >> 1) I'm still hoping for an explanation of the is_last_use change in >> c1_LinearScan.cpp > I don't exactly know; it was a long time ago. The reasonable thing > for me to do is take it out and do some testing, but I am a little > nervous about the amount of real-world testing I'd need to do to be > sure. I can appreciate that we don't really want to have mysteries > like this in new code, though. > >> 2) Do you remember why compactingPermGenGen.o needs -O1? > It's probably obsolete: there was a bug in amd64 which needed it, and > the -O1 is still there in amd64.make. There's also the NOOPT for the > copied fdlibm routines in sharedRuntimeTrig.o: I suspect that's > obsolete too. > > The right thing to do here is, I suggest, to take out both of these > dubious fragments but be prepared to re-commit them. That sounds good to me. dl > Andrew. 
From david.holmes at oracle.com Wed Nov 19 01:53:38 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Nov 2014 11:53:38 +1000 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> <5460CDF1.8050205@oracle.com> Message-ID: <546BF822.7010702@oracle.com> On 18/11/2014 8:03 PM, Volker Simonis wrote: > On Mon, Nov 10, 2014 at 3:38 PM, Mikael Vidstedt > wrote: >> >> On 2014-11-07 14:12, Volker Simonis wrote: >>> >>> On Fri, Nov 7, 2014 at 12:56 PM, Mikael Vidstedt >>> wrote: >>>> >>>> Volker, >>>> >>>> Thanks for reminding me, this totally slipped my mind. >>>> >>>> I think it's fair to say say we've given this enough time for feedback, >>>> and >>>> that the feedback has been all supportive. With that in mind I consider >>>> the >>>> proposal approved and effective immediately! >>>> >>> OK great. So does this mean we can now push reviewed changes to the >>> ppc/aix subdirs right away? >> >> >> That is indeed the idea - modulo the "if at review code review time a change >> is for some reason deemed to be risky and/or otherwise have impact on shared >> files" part which, again, hopefully is rare. >> > > You're right - it works! > I've just pushed my first AIX-only change to hotspot-rt! Congratulations! Unfortunately it caused us a problem as now the repos can change whilst a job is going through JPRT - this requires a new merge due to multiple heads and so triggered a failure. But Mikael is working on it :) David > Thanks, > Volker > >> Cheers, >> Mikael >> >> >>> >>>> Cheers, >>>> Mikael >>>> >>>> >>>> On 2014-11-06 15:35, Volker Simonis wrote: >>>>> >>>>> Hi Mikael, >>>>> >>>>> just wanted to ask what's the status of this project? 
>>>>> I hope it was not just a JavaOne hoax :) >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis >>>>> wrote: >>>>>> >>>>>> Thanks Mikael, that sounds good! >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >>>>>> wrote: >>>>>>> >>>>>>> Volker, >>>>>>> >>>>>>> The proposal is only to change how the changes are pushed, not which >>>>>>> forests >>>>>>> changes can be pushed to. That is, we would still require hotspot >>>>>>> changes to >>>>>>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or >>>>>>> to >>>>>>> the >>>>>>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be >>>>>>> applied on >>>>>>> all those (four) forests. Reasonable? >>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> >>>>>>> >>>>>>> On 2014-09-12 11:38, Volker Simonis wrote: >>>>>>>> >>>>>>>> Hi Mikael, >>>>>>>> >>>>>>>> there's one more question that came to my mind: will the new rule >>>>>>>> apply to all hotspot respitories (i.e. jdk9/hs-rt/hotspot, >>>>>>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>>>>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot) ? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Volker >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Andrew/Volker, >>>>>>>>> >>>>>>>>> Thanks for the positive feedback. The goal of the proposal is to >>>>>>>>> simplify >>>>>>>>> pushing changes which are effectively not tested by the jprt system >>>>>>>>> anyway. >>>>>>>>> The proposed relaxation would not affect work on other >>>>>>>>> infrastructure >>>>>>>>> projects in any relevant way, but would hopefully improve all our >>>>>>>>> lives >>>>>>>>> significantly immediately. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Mikael >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>>>>>> >>>>>>>>>> Hi Mikael, >>>>>>>>>> >>>>>>>>>> thanks a lot for this proposal. 
I think this will dramatically >>>>>>>>>> simplify our work to keep our ports up to date! So I fully support >>>>>>>>>> it. >>>>>>>>>> >>>>>>>>>> Nevertheless, I think this can only be a first step towards fully >>>>>>>>>> open >>>>>>>>>> the JPRT system to developers outside Oracle. With "opening" I mean >>>>>>>>>> to >>>>>>>>>> allow OpenJDK commiters from outside Oracle to submit and run JPRT >>>>>>>>>> jobs as well as allowing porting projects to add hardware which >>>>>>>>>> builds >>>>>>>>>> and tests the HotSpot on alternative platforms. >>>>>>>>>> >>>>>>>>>> So while I'm all in favor of your proposal I hope you can allay my >>>>>>>>>> doubts that this simplification will hopefully not push the >>>>>>>>>> realization of a truly OPEN JPRT system even further away. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Volker >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> All, >>>>>>>>>>> >>>>>>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is >>>>>>>>>>> highly >>>>>>>>>>> platform dependent and also tightly coupled with the tool chains >>>>>>>>>>> on >>>>>>>>>>> the >>>>>>>>>>> various platforms. Each platform/tool chain combination has its >>>>>>>>>>> set >>>>>>>>>>> of >>>>>>>>>>> special quirks, and code must be implemented in a way such that it >>>>>>>>>>> only >>>>>>>>>>> relies on the common subset of syntax and functionality across all >>>>>>>>>>> these >>>>>>>>>>> combinations. History has taught us that even simple changes can >>>>>>>>>>> have >>>>>>>>>>> surprising results when compiled with different compilers. 
>>>>>>>>>>> >>>>>>>>>>> For more than a decade the Hotspot team has ensured a minimum >>>>>>>>>>> quality >>>>>>>>>>> level >>>>>>>>>>> by requiring all pushes to be done through a build and test system >>>>>>>>>>> (jprt) >>>>>>>>>>> which guarantees that the code resulting from applying a set of >>>>>>>>>>> changes >>>>>>>>>>> builds on a set of core platforms and that a set of core tests >>>>>>>>>>> pass. >>>>>>>>>>> Only >>>>>>>>>>> if >>>>>>>>>>> all the builds and tests pass will the changes actually be pushed >>>>>>>>>>> to >>>>>>>>>>> the >>>>>>>>>>> target repository. >>>>>>>>>>> >>>>>>>>>>> We believe that testing like the above, in combination with later >>>>>>>>>>> stages >>>>>>>>>>> of >>>>>>>>>>> testing, is vital to ensuring that the quality level of the >>>>>>>>>>> Hotspot >>>>>>>>>>> code >>>>>>>>>>> remains high and that developers do not run into situations where >>>>>>>>>>> the >>>>>>>>>>> latest >>>>>>>>>>> version has build errors on some platforms. >>>>>>>>>>> >>>>>>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK >>>>>>>>>>> platforms. >>>>>>>>>>> From >>>>>>>>>>> a >>>>>>>>>>> Hotspot perspective this new platform added a set of AIX/PPC >>>>>>>>>>> specific >>>>>>>>>>> files >>>>>>>>>>> including some platform specific changes to shared code. The >>>>>>>>>>> AIX/PPC >>>>>>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. The >>>>>>>>>>> same >>>>>>>>>>> thing applies for the shark and zero versions of Hotspot. 
>>>>>>>>>>> >>>>>>>>>>> While Hotspot developers remain committed to making sure changes >>>>>>>>>>> are >>>>>>>>>>> developed in a way such that the quality level remains high across >>>>>>>>>>> all >>>>>>>>>>> platforms and variants, because of the above mentioned >>>>>>>>>>> complexities >>>>>>>>>>> it >>>>>>>>>>> is >>>>>>>>>>> inevitable that from time to time changes will be made which >>>>>>>>>>> introduce >>>>>>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>>>>>> testing. >>>>>>>>>>> >>>>>>>>>>> To allow these issues to be resolved more quickly I would like to >>>>>>>>>>> propose >>>>>>>>>>> a >>>>>>>>>>> relaxation in the requirements on how changes to Hotspot are >>>>>>>>>>> pushed. >>>>>>>>>>> Specifically I would like to allow for direct pushes to the >>>>>>>>>>> hotspot/ >>>>>>>>>>> repository of files specific to the following >>>>>>>>>>> ports/variants/tools: >>>>>>>>>>> >>>>>>>>>>> * AIX >>>>>>>>>>> * PPC >>>>>>>>>>> * Shark >>>>>>>>>>> * Zero >>>>>>>>>>> >>>>>>>>>>> Today this translates into the following files: >>>>>>>>>>> >>>>>>>>>>> - src/cpu/ppc/** >>>>>>>>>>> - src/cpu/zero/** >>>>>>>>>>> - src/os/aix/** >>>>>>>>>>> - src/os_cpu/aix_ppc/** >>>>>>>>>>> - src/os_cpu/bsd_zero/** >>>>>>>>>>> - src/os_cpu/linux_ppc/** >>>>>>>>>>> - src/os_cpu/linux_zero/** >>>>>>>>>>> >>>>>>>>>>> Note that all changes are still required to go through the normal >>>>>>>>>>> development and review cycle; the proposed relaxation only applies >>>>>>>>>>> to >>>>>>>>>>> how >>>>>>>>>>> the changes are pushed. >>>>>>>>>>> >>>>>>>>>>> If at code review time a change is for some reason deemed to be >>>>>>>>>>> risky >>>>>>>>>>> and/or >>>>>>>>>>> otherwise have impact on shared files the reviewer may request >>>>>>>>>>> that >>>>>>>>>>> the >>>>>>>>>>> change to go through the regular push testing. For changes only >>>>>>>>>>> touching >>>>>>>>>>> the >>>>>>>>>>> above set of files this expected to be rare. 
>>>>>>>>>>> >>>>>>>>>>> Please let me know what you think. >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Mikael >>>>>>>>>>> >> From dean.long at oracle.com Wed Nov 19 01:52:51 2014 From: dean.long at oracle.com (Dean Long) Date: Tue, 18 Nov 2014 17:52:51 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> Message-ID: <546BF7F3.5020507@oracle.com> That would be my preference too, but then we have to touch all the other ports. I'll let Vladimir have the final vote. dl On 11/18/2014 1:38 AM, Roland Westrelin wrote: > There's very little code that's marked with #ifdef X86 or #ifdef SPARC in the C2 code. Wouldn't we want to hide everything that is AARCH64 specific behind functions in Matcher like we do elsewhere? > > Roland. From paul.hohensee at gmail.com Wed Nov 19 02:36:06 2014 From: paul.hohensee at gmail.com (Paul Hohensee) Date: Wed, 19 Nov 2014 10:36:06 +0800 Subject: Branch Prediction? Message-ID: History as I remember it. :) It's been considered, and decided against. The platforms which openjdk currently targets all have decent to spectacular hardware branch prediction. Those that didn't, such as Niagara 1, ignored prediction bits. Conclusion is that it's not worth complicating the code with, as David says, 'magic macros'. Paul David Holmes wrote: On 19/11/2014 4:42 AM, Christian Thalinger wrote: > ...or I could just read the next email. Doh. Not so obvious when the subject was changed to "Compiler branch hints". I had to go to the archives to find it. My 2c. The performance focus has been on generated code, not the runtime code. Personally I dislike these magic macros as they clutter the code.
David >> On Nov 18, 2014, at 10:41 AM, Christian Thalinger < christian.thalinger at oracle.com> wrote: >> >> I'm not sure if the silence means nobody knows or nobody cares. Speaking for myself, I don't know of any history on this. >> >>> On Nov 8, 2014, at 1:43 AM, Erik Österlund wrote: >>> >>> Hi, >>> >>> Just out of curiosity, is there some good reason why we don't have a branch prediction macro? >>> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, I feel a bit uneasy not telling the compiler that this is pretty likely to succeed, and relying on its guessing. >>> >>> Has it been excluded because it's considered not nice or perhaps it was simply never introduced because nobody found it useful? >>> >>> Could have some define like this for GCC, which for other compilers reduces to nothing: >>> >>> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) >>> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) >>> >>> It might not lead to drastic performance improvements, but it feels weird not to tell the compiler what we know and keep secrets from it. And I think it's also nice for documentation purposes that people reading it also understand that this expression is gonna be true most of the time, and deal with it accordingly. >>> >>> /Erik From david.holmes at oracle.com Wed Nov 19 02:47:56 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Nov 2014 12:47:56 +1000 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546BBFDE.8040003@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> Message-ID: <546C04DC.1010606@oracle.com> Hi Max, So I would have assumed memflags were being passed to all the "free" routines for NMT purposes. Otherwise how does NMT track this?
Thanks, David On 19/11/2014 7:53 AM, Max Ockner wrote: > Hello all, > Please review this minor cleanup: > > Bug ID: 8060074 > Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ > > Summary: > (1) os::free takes two arguments, but never uses the second argument, > which is a MEMFLAG. I have removed this argument from every os::free call. > (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp > also takes a MEMFLAG argument, which is only used to call os::free. Now > an unused argument, it has been removed from all FreeHeap calls. No > other methods which directly call os::free() have this problem. > (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp > takes A MEMFLAG argument which is passed to FreeHeap, and nothing else. > This argument is now unused, and has been removed. No other methods > which call FreeHeap have this problem. > > No methods or macros which use the FREE_C_HEAP_ARRAY macro needed > cleanup. I have also removed the extra argument from the definitions of > the above methods. > > Tests: jtreg hotspot tests with -vmoption:"-XX:NativeMemoryTracking=detail" > > Thanks for your help, > Max Ockner From vladimir.kozlov at oracle.com Wed Nov 19 03:03:29 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 18 Nov 2014 19:03:29 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546BF7F3.5020507@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> Message-ID: <546C0881.8050905@oracle.com> Yes, we can hide AARCH64 using something similar to the CODE_CACHE_SIZE_LIMIT macro, which could be overridden in platform specific files if needed: USE_STORE_RELEASE_FOR_VOLATILE.
Or a slightly more complicated declaration similar to the support_IRIW_for_not_multiple_copy_atomic_cpu boolean constant. Dean, will it help us if we do that? If yes, then we should do that. I did not insist on removing AARCH64 from C2 because there was discussion about generating special volatile load/store nodes instead of using the memory order flag. So we may rewrite this code anyway. Regards, Vladimir On 11/18/14 5:52 PM, Dean Long wrote: > That would be my preference too, but then we have to touch all the other > ports. > I'll let Vladimir have the final vote. > > dl > > On 11/18/2014 1:38 AM, Roland Westrelin wrote: >> There's very little code that's marked with #ifdef X86 or #ifdef SPARC >> in the C2 code. Wouldn't we want to hide everything that is AARCH64 >> specific behind functions in Matcher like we do elsewhere? >> >> Roland. > From dean.long at oracle.com Wed Nov 19 03:45:40 2014 From: dean.long at oracle.com (Dean Long) Date: Tue, 18 Nov 2014 19:45:40 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546C0881.8050905@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> Message-ID: <546C1264.6090308@oracle.com> On 11/18/2014 7:03 PM, Vladimir Kozlov wrote: > Yes, we can hide AARCH64 using something similar to > CODE_CACHE_SIZE_LIMIT macro which could be overwritten in platform > specific files if needed: USE_STORE_RELEASE_FOR_VOLATILE. > Or slightly more complicated declaration similar to > support_IRIW_for_not_multiple_copy_atomic_cpu boolean constant. > > Dean, will it help us if we do that? If yes, then we should do that. > Yes, this will help us.
Following the boolean constant example, we would have something like:

#ifdef USE_STORE_RELEASE_FOR_VOLATILE
const bool use_store_release_for_volatile = true;
#else
const bool use_store_release_for_volatile = false;
#endif

> I did not insisted on removing AARCH64 from C2 because there was > discussion about generating special volatile load/store nodes instead > of using memory order flag. So we may rewrite this code anyway. > I like the sound of this, but do we still need a way to turn it off, so that platforms that emit explicit barriers can still collapse redundant barriers? Please file a bug/RFE for this so we don't forget it. dl > Regards, > Vladimir > > On 11/18/14 5:52 PM, Dean Long wrote: >> That would be my preference too, but then we have to touch all the other >> ports. >> I'll let Vladimir have the final vote. >> >> dl >> >> On 11/18/2014 1:38 AM, Roland Westrelin wrote: >>> There's very little code that's marked with #ifdef X86 or #ifdef SPARC >>> in the C2 code. Wouldn't we want to hide everything that is AARCH64 >>> specific behind functions in Matcher like we do elsewhere? >>> >>> Roland. >> From david.holmes at oracle.com Wed Nov 19 05:17:33 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Nov 2014 15:17:33 +1000 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java Message-ID: <546C27ED.7090700@oracle.com> webrev: http://cr.openjdk.java.net/~dholmes/8035663/webrev.jdk9/ This test failure exposed a number of issues with the logic in unsafe.cpp for handling atomic updates of Java long fields on platforms without any direct support for a 64-bit CAS operation - platforms for which supports_cx8 is not true. This only impacts our SE Embedded PPC32 platform (where we have been using this fix for some time now) but in case other such platforms come along I wanted to get this pushed to mainline.
What the unsafe code did was to use the object containing the field as a lock object for reading and writing the field. This seems reasonable on the surface but in fact had a fatal flaw - because we were locking a Java-level visible object inside what was considered to be a lock-free code-path by the application and library logic, we could actually induce a deadlock - which is why the test failed. In addition the code had two further flaws: 1. Because the field could also be updated via direct assignment in Java code the unsafe code needed to perform an Atomic::load of the field. And for good measure we also employ an Atomic::store to ensure no interference with direct reads of the field in Java code. 2. The address of the field was being calculated before using the ObjectLocker to lock the object, but locking could encounter a safepoint check allowing the object to be relocated by the GC, and we would then use a stale address. To fix all of this we: - introduce a special Mutex to use instead of the deadlock-inducing Java object - use Atomic::load and Atomic::store to access the jlong field - avoid safepoints when locking (alternatively you could ensure you calculate the address after acquiring the lock) Thanks, David From thomas.stuefe at gmail.com Wed Nov 19 07:38:27 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 19 Nov 2014 08:38:27 +0100 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546C04DC.1010606@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C04DC.1010606@oracle.com> Message-ID: Hi Max, Thank you, I like this change, this has bugged me for a long time. I am not an official reviewer, but for me it looks good. @David: NMT does not use this (afaik never did); it uses malloc headers to keep track of allocation meta data.
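For readers unfamiliar with the malloc-header technique mentioned above, here is a minimal sketch of the general idea. It is not NMT's actual implementation: the `nmt_sketch` namespace, `Header`, `Tag`, `tracked_malloc` and `tracked_free` are all hypothetical names, and the layout is an assumption made for illustration. What it shows is why the free side does not need a tag argument: the tag and size are stored in a small header in front of each block and read back from there on free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace nmt_sketch {
  // Hypothetical memory-type tags; not HotSpot's MEMFLAGS values.
  enum Tag { TagInternal = 0, TagCompiler = 1, TagCount = 2 };

  // Header stored immediately in front of every user block. The alignment
  // keeps the payload that follows suitably aligned for any type.
  struct alignas(std::max_align_t) Header {
    std::size_t size;
    Tag tag;
  };

  // Running total of live payload bytes per tag.
  static std::size_t bytes_by_tag[TagCount] = {0, 0};

  // Allocate header + payload, record the metadata, return the payload.
  inline void* tracked_malloc(std::size_t size, Tag tag) {
    void* raw = std::malloc(sizeof(Header) + size);
    if (raw == nullptr) return nullptr;
    Header* h = new (raw) Header{size, tag};
    bytes_by_tag[tag] += size;
    return h + 1;  // first byte after the header
  }

  // No tag parameter: the header in front of p says what is being released.
  inline void tracked_free(void* p) {
    if (p == nullptr) return;
    Header* h = static_cast<Header*>(p) - 1;
    bytes_by_tag[h->tag] -= h->size;
    std::free(h);
  }
}
```

The cost of this design is a few extra bytes per allocation; the benefit is that callers of the free routine cannot pass a tag that disagrees with the one used at allocation time.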
Kind regards, Thomas On Wed, Nov 19, 2014 at 3:47 AM, David Holmes wrote: > Hi Max, > > So I would have assumed memflags were being passed to all the "free" > routines for NMT purposes. Otherwise how does NMT track this? > > Thanks, > David > > > On 19/11/2014 7:53 AM, Max Ockner wrote: > >> Hello all, >> Please review this minor cleanup: >> >> Bug ID: 8060074 >> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >> >> Summary: >> (1) os::free takes two arguments, but never uses the second argument, >> which is a MEMFLAG. I have removed this argument from every os::free call. >> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >> also takes a MEMFLAG argument, which is only used to call os::free. Now >> an unused argument, it has been removed from all FreeHeap calls. No >> other methods which directly call os::free() have this problem. >> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >> takes A MEMFLAG argument which is passed to FreeHeap, and nothing else. >> This argument is now unused, and has been removed. No other methods >> which call FreeHeap have this problem. >> >> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >> cleanup. I have also removed the extra argument from the definitions of >> the above methods. 
>> >> Tests: jtreg hotspot tests with -vmoption:"-XX:NativeMemoryTracking=detail" >> >> Thanks for your help, >> Max Ockner >> > From volker.simonis at gmail.com Wed Nov 19 08:23:44 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 19 Nov 2014 09:23:44 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: <546BF822.7010702@oracle.com> References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> <5460CDF1.8050205@oracle.com> <546BF822.7010702@oracle.com> Message-ID: On Wed, Nov 19, 2014 at 2:53 AM, David Holmes wrote: > On 18/11/2014 8:03 PM, Volker Simonis wrote: >> >> On Mon, Nov 10, 2014 at 3:38 PM, Mikael Vidstedt >> wrote: >>> >>> >>> On 2014-11-07 14:12, Volker Simonis wrote: >>>> >>>> >>>> On Fri, Nov 7, 2014 at 12:56 PM, Mikael Vidstedt >>>> wrote: >>>>> >>>>> >>>>> Volker, >>>>> >>>>> Thanks for reminding me, this totally slipped my mind. >>>>> >>>>> I think it's fair to say we've given this enough time for feedback, >>>>> and >>>>> that the feedback has been all supportive. With that in mind I consider >>>>> the >>>>> proposal approved and effective immediately! >>>>> >>>> OK great. So does this mean we can now push reviewed changes to the >>>> ppc/aix subdirs right away? >>> >>> >>> >>> That is indeed the idea - modulo the "if at review code review time a >>> change >>> is for some reason deemed to be risky and/or otherwise have impact on >>> shared >>> files" part which, again, hopefully is rare. >>> >> >> You're right - it works! >> I've just pushed my first AIX-only change to hotspot-rt! > > > Congratulations! > > Unfortunately it caused us a problem as now the repos can change whilst a > job is going through JPRT - this requires a new merge due to multiple heads > and so triggered a failure. But Mikael is working on it :) It's astonishing how even the smallest changes can introduce non-foreseeable problems.
Hopefully Mikael will be able to fix the problem somehow (if he hasn't had too much of the champagne already :) Regards, Volker > > David > > >> Thanks, >> Volker >> >>> Cheers, >>> Mikael >>> >>> >>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>> >>>>> On 2014-11-06 15:35, Volker Simonis wrote: >>>>>> >>>>>> >>>>>> Hi Mikael, >>>>>> >>>>>> just wanted to ask what's the status of this project? >>>>>> I hope it was not just a JavaOne hoax :) >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Fri, Sep 19, 2014 at 8:47 PM, Volker Simonis >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Thanks Mikael, that sounds good! >>>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>>> >>>>>>> >>>>>>> On Fri, Sep 19, 2014 at 8:03 PM, Mikael Vidstedt >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Volker, >>>>>>>> >>>>>>>> The proposal is only to change how the changes are pushed, not which >>>>>>>> forests >>>>>>>> changes can be pushed to. That is, we would still require hotspot >>>>>>>> changes to >>>>>>>> be pushed to one of the group repositories (jdk9/hs-{comp,gc,rt}) or >>>>>>>> to >>>>>>>> the >>>>>>>> jdk8u/hs-dev forest (jdk8u), but I propose that the relaxation be >>>>>>>> applied on >>>>>>>> all those (four) forests. Reasonable? >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> >>>>>>>> >>>>>>>> On 2014-09-12 11:38, Volker Simonis wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Mikael, >>>>>>>>> >>>>>>>>> there's one more question that came to my mind: will the new rule >>>>>>>>> apply to all hotspot repositories (i.e. jdk9/hs-rt/hotspot, >>>>>>>>> jdk9/hs-comp/hotspot, jdk9/hs-gc/hotspot, jdk9/hs-hs/hotspot AND >>>>>>>>> jdk8u/jdk8u-dev/hotspot, jdk8u/hs-dev/hotspot)? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Volker >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Sep 11, 2014 at 12:16 AM, Mikael Vidstedt >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Andrew/Volker, >>>>>>>>>> >>>>>>>>>> Thanks for the positive feedback.
The goal of the proposal is to >>>>>>>>>> simplify >>>>>>>>>> pushing changes which are effectively not tested by the jprt >>>>>>>>>> system >>>>>>>>>> anyway. >>>>>>>>>> The proposed relaxation would not affect work on other >>>>>>>>>> infrastructure >>>>>>>>>> projects in any relevant way, but would hopefully improve all our >>>>>>>>>> lives >>>>>>>>>> significantly immediately. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Mikael >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2014-09-10 01:45, Volker Simonis wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi Mikael, >>>>>>>>>>> >>>>>>>>>>> thanks a lot for this proposal. I think this will dramatically >>>>>>>>>>> simplify our work to keep our ports up to date! So I fully >>>>>>>>>>> support >>>>>>>>>>> it. >>>>>>>>>>> >>>>>>>>>>> Nevertheless, I think this can only be a first step towards fully >>>>>>>>>>> open >>>>>>>>>>> the JPRT system to developers outside Oracle. With "opening" I >>>>>>>>>>> mean >>>>>>>>>>> to >>>>>>>>>>> allow OpenJDK commiters from outside Oracle to submit and run >>>>>>>>>>> JPRT >>>>>>>>>>> jobs as well as allowing porting projects to add hardware which >>>>>>>>>>> builds >>>>>>>>>>> and tests the HotSpot on alternative platforms. >>>>>>>>>>> >>>>>>>>>>> So while I'm all in favor of your proposal I hope you can allay >>>>>>>>>>> my >>>>>>>>>>> doubts that this simplification will hopefully not push the >>>>>>>>>>> realization of a truly OPEN JPRT system even further away. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Volker >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Sep 9, 2014 at 11:24 PM, Mikael Vidstedt >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> All, >>>>>>>>>>>> >>>>>>>>>>>> Made up primarily of low level C++ code, the Hotspot codebase is >>>>>>>>>>>> highly >>>>>>>>>>>> platform dependent and also tightly coupled with the tool chains >>>>>>>>>>>> on >>>>>>>>>>>> the >>>>>>>>>>>> various platforms. 
Each platform/tool chain combination has its >>>>>>>>>>>> set >>>>>>>>>>>> of >>>>>>>>>>>> special quirks, and code must be implemented in a way such that >>>>>>>>>>>> it >>>>>>>>>>>> only >>>>>>>>>>>> relies on the common subset of syntax and functionality across >>>>>>>>>>>> all >>>>>>>>>>>> these >>>>>>>>>>>> combinations. History has taught us that even simple changes can >>>>>>>>>>>> have >>>>>>>>>>>> surprising results when compiled with different compilers. >>>>>>>>>>>> >>>>>>>>>>>> For more than a decade the Hotspot team has ensured a minimum >>>>>>>>>>>> quality >>>>>>>>>>>> level >>>>>>>>>>>> by requiring all pushes to be done through a build and test >>>>>>>>>>>> system >>>>>>>>>>>> (jprt) >>>>>>>>>>>> which guarantees that the code resulting from applying a set of >>>>>>>>>>>> changes >>>>>>>>>>>> builds on a set of core platforms and that a set of core tests >>>>>>>>>>>> pass. >>>>>>>>>>>> Only >>>>>>>>>>>> if >>>>>>>>>>>> all the builds and tests pass will the changes actually be >>>>>>>>>>>> pushed >>>>>>>>>>>> to >>>>>>>>>>>> the >>>>>>>>>>>> target repository. >>>>>>>>>>>> >>>>>>>>>>>> We believe that testing like the above, in combination with >>>>>>>>>>>> later >>>>>>>>>>>> stages >>>>>>>>>>>> of >>>>>>>>>>>> testing, is vital to ensuring that the quality level of the >>>>>>>>>>>> Hotspot >>>>>>>>>>>> code >>>>>>>>>>>> remains high and that developers do not run into situations >>>>>>>>>>>> where >>>>>>>>>>>> the >>>>>>>>>>>> latest >>>>>>>>>>>> version has build errors on some platforms. >>>>>>>>>>>> >>>>>>>>>>>> Recently the AIX/PPC port was added to the set of OpenJDK >>>>>>>>>>>> platforms. >>>>>>>>>>>> From >>>>>>>>>>>> a >>>>>>>>>>>> Hotspot perspective this new platform added a set of AIX/PPC >>>>>>>>>>>> specific >>>>>>>>>>>> files >>>>>>>>>>>> including some platform specific changes to shared code. The >>>>>>>>>>>> AIX/PPC >>>>>>>>>>>> platform is not tested by Oracle as part of Hotspot push jobs. 
>>>>>>>>>>>> The >>>>>>>>>>>> same >>>>>>>>>>>> thing applies for the shark and zero versions of Hotspot. >>>>>>>>>>>> >>>>>>>>>>>> While Hotspot developers remain committed to making sure changes >>>>>>>>>>>> are >>>>>>>>>>>> developed in a way such that the quality level remains high >>>>>>>>>>>> across >>>>>>>>>>>> all >>>>>>>>>>>> platforms and variants, because of the above mentioned >>>>>>>>>>>> complexities >>>>>>>>>>>> it >>>>>>>>>>>> is >>>>>>>>>>>> inevitable that from time to time changes will be made which >>>>>>>>>>>> introduce >>>>>>>>>>>> issues on specific platforms or tool chains not part of the core >>>>>>>>>>>> testing. >>>>>>>>>>>> >>>>>>>>>>>> To allow these issues to be resolved more quickly I would like >>>>>>>>>>>> to >>>>>>>>>>>> propose >>>>>>>>>>>> a >>>>>>>>>>>> relaxation in the requirements on how changes to Hotspot are >>>>>>>>>>>> pushed. >>>>>>>>>>>> Specifically I would like to allow for direct pushes to the >>>>>>>>>>>> hotspot/ >>>>>>>>>>>> repository of files specific to the following >>>>>>>>>>>> ports/variants/tools: >>>>>>>>>>>> >>>>>>>>>>>> * AIX >>>>>>>>>>>> * PPC >>>>>>>>>>>> * Shark >>>>>>>>>>>> * Zero >>>>>>>>>>>> >>>>>>>>>>>> Today this translates into the following files: >>>>>>>>>>>> >>>>>>>>>>>> - src/cpu/ppc/** >>>>>>>>>>>> - src/cpu/zero/** >>>>>>>>>>>> - src/os/aix/** >>>>>>>>>>>> - src/os_cpu/aix_ppc/** >>>>>>>>>>>> - src/os_cpu/bsd_zero/** >>>>>>>>>>>> - src/os_cpu/linux_ppc/** >>>>>>>>>>>> - src/os_cpu/linux_zero/** >>>>>>>>>>>> >>>>>>>>>>>> Note that all changes are still required to go through the >>>>>>>>>>>> normal >>>>>>>>>>>> development and review cycle; the proposed relaxation only >>>>>>>>>>>> applies >>>>>>>>>>>> to >>>>>>>>>>>> how >>>>>>>>>>>> the changes are pushed. 
>>>>>>>>>>>> >>>>>>>>>>>> If at code review time a change is for some reason deemed to be >>>>>>>>>>>> risky >>>>>>>>>>>> and/or >>>>>>>>>>>> otherwise have impact on shared files the reviewer may request >>>>>>>>>>>> that >>>>>>>>>>>> the >>>>>>>>>>>> change to go through the regular push testing. For changes only >>>>>>>>>>>> touching >>>>>>>>>>>> the >>>>>>>>>>>> above set of files this expected to be rare. >>>>>>>>>>>> >>>>>>>>>>>> Please let me know what you think. >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Mikael >>>>>>>>>>>> >>> > From erik.joelsson at oracle.com Wed Nov 19 08:56:09 2014 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 19 Nov 2014 09:56:09 +0100 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <94199476-15AD-4226-8C18-2708F6F94DC3@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> <546B6779.9050608@oracle.com> <94199476-15AD-4226-8C18-2708F6F94DC3@oracle.com> Message-ID: <546C5B29.8050602@oracle.com> On 2014-11-18 19:09, Christian Thalinger wrote: >> On Nov 18, 2014, at 7:36 AM, Magnus Ihse Bursie wrote: >> >> On 2014-11-18 01:59, Christian Thalinger wrote: >>>> On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: >>>> >>>> On 2014-11-14 21:44, Dean Long wrote: >>>>>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >>>>> What if we require Autoconf to be installed on the host? Does that solve any problems? >>>> No, unfortunately not. >>> Why not? >> Autoconf picks up these files automatically from the build-aux directory. 
That's also the reason we need to rename the original files and provide wrappers with the same name, since we can't even redirect that functionality to a file with another name. > So do I understand you correctly that the files we need are automatically copied into the workspace but since we want to use our own, old versions we renamed them and use these instead? No, I will try to clarify. Autoconf is a tool that takes one (or more) input files (written in the m4 macro language) and generates a shell script, often named "configure". This shell script is what you would typically run to configure your project. Autoconf defines an API of m4 macros specifically for configure scripts, which is basically what makes it useful. Most of these macros are expanded into the generated configure script. However, for reasons unknown to us, some of the more complex functionality has been split out into separate shell script "library" files. These library files, often referred to as "build-aux", are supposed to be distributed with the project source code, along with the generated configure script. We distribute them in common/autoconf/build-aux. These files can be found in the source distribution of autoconf or by downloading from the official scm repo for them. They are not part of the binary distribution of autoconf, on my Ubuntu system at least. For this reason, it wouldn't help to require autoconf to be installed, as that wouldn't provide the files. For non-GPL projects to be able to distribute the files in build-aux, they come with a special exception to the GPL, which basically allows them to be freely distributed as long as they are part of a configure script. This exception does not seem to cover creating derivative works from them.
/Erik From sgehwolf at redhat.com Wed Nov 19 09:54:04 2014 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 19 Nov 2014 10:54:04 +0100 Subject: (9) RFR: 8064815 Zero+PPC64: Stack overflow when running Maven In-Reply-To: References: <1415971123.3278.55.camel@localhost.localdomain> <1416232479.3760.18.camel@localhost.localdomain> <1416305744.3379.20.camel@localhost.localdomain> <1416308757.3379.30.camel@localhost.localdomain> <546B6760.8050502@oracle.com> Message-ID: <1416390844.3144.6.camel@localhost.localdomain> On Tue, 2014-11-18 at 19:23 +0100, Volker Simonis wrote: > Thanks Vladimir. > > @Severin: I've just pushed the change to hotspot-rt: > > http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/acc869dcded3 Thanks! Cheers, Severin > Regards, > Volker > > On Tue, Nov 18, 2014 at 4:36 PM, Vladimir Kozlov > wrote: > > Looks good. > > > > Thanks, > > Vladimir > > > > > > On 11/18/14 3:05 AM, Severin Gehwolf wrote: > >> > >> Hi Volker, Vladimir, > >> > >> On Tue, 2014-11-18 at 11:21 +0100, Volker Simonis wrote: > >> [...] > >>>>> > >>>>> I think an assertion would be good. > >>>> > >>>> > >>>> Sure. Updated webrev coming soon. > >>>> > >>>>> (but 'abi_available' should be an > >>>>> 'int' because 'ssize_t' is only specified to hold values from -1 to > >>>>> SSIZE_MAX). > >>>> > >>>> > >>>> OK, thanks! In that case I think it's safer to also change back the > >>>> return type of abi_stack_available(). ssize_t worked since it's defined > >>>> as signed int on linux. > >>>> > >>> > >>> I don't like ssize_t because of it's unclear semantics. I understand > >>> that it happens to work on Linux because it's defined as a signed int > >>> there. But then why not use an int in the first place? So I'd prefer > >>> if you'd use an int. > >> > >> > >> Agreed. Changed it back to use an int. 
> >> > >> On Mon, 2014-11-17 at 09:54 -0800, Vladimir Kozlov wrote: > >>> > >>> I mean next guarantee: > >>> > >>> guarantee(Thread::current() == thread, "should run in the same > >>> thread"); > >>> > >>> otherwise 'thread->stack_base() - (address) &stack_used' will give > >>> wrong result if threads are different. > >> > >> > >> Thanks for the clarification, Vladimir! I've added this guarantee. > >> > >> Updated webrev: > >> https://jerboaa.fedorapeople.org/bugs/openjdk/JDK-8064815/webrev.1/ > >> > >> Thanks again for the reviews! > >> > >> Cheers, > >> Severin > >> > > From aleksey.shipilev at oracle.com Wed Nov 19 11:21:40 2014 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 19 Nov 2014 14:21:40 +0300 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C27ED.7090700@oracle.com> References: <546C27ED.7090700@oracle.com> Message-ID: <546C7D44.1000502@oracle.com> Hi David, On 11/19/2014 08:17 AM, David Holmes wrote: > This test failure exposed a number of issues with the logic in > unsafe.cpp for handling atomic updates of Java long fields on platforms > without any direct support for a 64-bit CAS operation - platforms for > which supports_cx8 is not true. This only impacts our SE Embedded PPC32 > platform (where we have been using this fix for some time now) but in > case other such platforms came along I wanted to get this pushed to > mainline. Does this also apply to "double" handling? > What the unsafe code did was to use the object containing the field as a > lock object for reading and writing the field. This seems reasonable on > the surface but in fact had a fatal flaw - because we were locking a > Java-level visible object inside what was considered to be a lock-free > code-path by the application and library logic, we could actually induce > a deadlock - which is why the test failed. Ouch. > In addition the code had two further flaws: > > 1. 
Because the field could also be updated via direct assignment in Java > code the unsafe code needed to perform an Atomic::load of the field. And > for good measure we also employ an Atomic::store to ensure no > interference with direct reads of the field in Java code. I am confused by this explanation. This seems to imply Atomic::store and Atomic::load are already providing the access atomicity. Why do we need the lock? Given the interaction with Java code, I wonder if this should be handled in per-platform Atomic::* definitions, not in Unsafe stubs? Also, if Atomic::* provide the atomic support, why do we have this: 1178 #ifdef SUPPORTS_NATIVE_CX8 1179 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e; 1180 #else 1181 if (VM_Version::supports_cx8()) 1182 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e; 1183 else { 1184 jboolean success = false; 1185 MutexLockerEx mu(UnsafeJlong_lock, Mutex::_no_safepoint_check_flag); 1186 jlong val = Atomic::load(addr); 1187 if (val == e) { Atomic::store(x, addr); success = true; } 1188 return success; 1189 } 1190 #endif ...instead of delegating to Atomic::cmpxchg directly? If locking is needed to maintain the atomicity in absence of 8-byte CAS on target platform, the Atomic::* should IMO maintain the global lock guarding *all* load/stores as well as CASes there. Thanks, -Aleksey. 
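For readers following the exchange, the fallback branch of the quoted snippet (lock, load, compare, store) can be sketched as a standalone analogue in standard C++. This is only an illustration under stated assumptions, not the HotSpot code: std::mutex stands in for UnsafeJlong_lock, std::atomic<std::int64_t> stands in for Atomic::load and Atomic::store on a jlong, and the function name locked_cas64 is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical global lock, playing the role of UnsafeJlong_lock.
static std::mutex jlong_cas_lock;

// Emulates a 64-bit compare-and-swap for targets without a native one.
// The mutex makes the load/compare/store sequence atomic with respect to
// other callers of this function. The individual load and store are
// themselves atomic so that a plain atomic write made elsewhere is never
// observed half-written.
bool locked_cas64(std::atomic<std::int64_t>* addr,
                  std::int64_t expected,
                  std::int64_t desired) {
    assert(addr != nullptr);
    std::lock_guard<std::mutex> guard(jlong_cas_lock);
    std::int64_t current = addr->load();
    if (current != expected) {
        return false;  // value changed, the caller may retry
    }
    addr->store(desired);
    return true;
}
```

Note that in this sketch only CAS callers take the lock; whether ordinary loads and stores should go through the same lock is the open question raised in this message.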
From david.holmes at oracle.com Wed Nov 19 12:03:35 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 19 Nov 2014 22:03:35 +1000 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C7D44.1000502@oracle.com> References: <546C27ED.7090700@oracle.com> <546C7D44.1000502@oracle.com> Message-ID: <546C8717.4090506@oracle.com> Hi Aleksey, On 19/11/2014 9:21 PM, Aleksey Shipilev wrote: > Hi David, > > On 11/19/2014 08:17 AM, David Holmes wrote: >> This test failure exposed a number of issues with the logic in >> unsafe.cpp for handling atomic updates of Java long fields on platforms >> without any direct support for a 64-bit CAS operation - platforms for >> which supports_cx8 is not true. This only impacts our SE Embedded PPC32 >> platform (where we have been using this fix for some time now) but in >> case other such platforms came along I wanted to get this pushed to >> mainline. > > Does this also apply to "double" handling? It would in theory, but there's no API for CAS of double fields. >> What the unsafe code did was to use the object containing the field as a >> lock object for reading and writing the field. This seems reasonable on >> the surface but in fact had a fatal flaw - because we were locking a >> Java-level visible object inside what was considered to be a lock-free >> code-path by the application and library logic, we could actually induce >> a deadlock - which is why the test failed. > > Ouch. > >> In addition the code had two further flaws: >> >> 1. Because the field could also be updated via direct assignment in Java >> code the unsafe code needed to perform an Atomic::load of the field. And >> for good measure we also employ an Atomic::store to ensure no >> interference with direct reads of the field in Java code. > > I am confused by this explanation. This seems to imply Atomic::store and > Atomic::load are already providing the access atomicity. Why do we need > the lock? 
The lock provides the atomicity needed for the logical CAS operation. Atomic loads and stores don't help with that. But atomic loads and stores are needed to ensure direct assignment of the volatile field (which is already implemented via Atomic::store) can't interfere with a concurrent "cas" operation. > Given the interaction with Java code, I wonder if this should be handled > in per-platform Atomic::* definitions, not in Unsafe stubs? Not sure what you mean by this, but this is the implementation of the Unsafe API - which is done using the runtime facilities of the Atomic class. This code handles both the supports_cx8 and doesn't_support_cx8 cases. > Also, if > Atomic::* provide the atomic support, why do we have this: > > 1178 #ifdef SUPPORTS_NATIVE_CX8 > 1179 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e; > 1180 #else > 1181 if (VM_Version::supports_cx8()) > 1182 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e; > 1183 else { > 1184 jboolean success = false; > 1185 MutexLockerEx mu(UnsafeJlong_lock, > Mutex::_no_safepoint_check_flag); > 1186 jlong val = Atomic::load(addr); > 1187 if (val == e) { Atomic::store(x, addr); success = true; } > 1188 return success; > 1189 } > 1190 #endif > > ...instead of delegating to Atomic::cmpxchg directly? If locking is > needed to maintain the atomicity in absence of 8-byte CAS on target > platform, the Atomic::* should IMO maintain the global lock guarding > *all* load/stores as well as CASes there. The aim here is not to try to provide an Atomic::cmpxchg(jlong) implementation for general use, but to provide the atomic operations needed by the Unsafe class. We've deliberately steered away from using a 64-bit CAS in the runtime precisely because it is not available on all platforms, and a lock-based solution for the general runtime would become a bottleneck. Of course if the original implementation had instead gone down that path we wouldn't have known anything different - but it didn't. 
This patch simply fixes the current broken implementation of Unsafe. Thanks, David > Thanks, > -Aleksey. > From lois.foltan at oracle.com Wed Nov 19 12:15:26 2014 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 19 Nov 2014 07:15:26 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546BBFDE.8040003@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> Message-ID: <546C89DE.2080409@oracle.com> Hi Max, Looks good. Minor comment, several files need the copyright statements updated before pushing. No need for another webrev, though. Thanks, Lois On 11/18/2014 4:53 PM, Max Ockner wrote: > Hello all, > Please review this minor cleanup: > > Bug ID: 8060074 > Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ > > Summary: > (1) os::free takes two arguments, but never uses the second argument, > which is a MEMFLAG. I have removed this argument from every os::free > call. > (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp > also takes a MEMFLAG argument, which is only used to call os::free. > Now an unused argument, it has been removed from all FreeHeap calls. > No other methods which directly call os::free() have this problem. > (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp > takes A MEMFLAG argument which is passed to FreeHeap, and nothing > else. This argument is now unused, and has been removed. No other > methods which call FreeHeap have this problem. > > No methods or macros which use the FREE_C_HEAP_ARRAY macro needed > cleanup. I have also removed the extra argument from the definitions > of the above methods. 
> > Tests: jtreg hotspot tests with > -vmoption:"-XX:NativeMemoryTracking=detail" > > Thanks for your help, > Max Ockner From aleksey.shipilev at oracle.com Wed Nov 19 12:44:58 2014 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 19 Nov 2014 15:44:58 +0300 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C8717.4090506@oracle.com> References: <546C27ED.7090700@oracle.com> <546C7D44.1000502@oracle.com> <546C8717.4090506@oracle.com> Message-ID: <546C90CA.5090804@oracle.com> On 11/19/2014 03:03 PM, David Holmes wrote: > On 19/11/2014 9:21 PM, Aleksey Shipilev wrote: > The lock provides the atomicity needed for the logical CAS operation. > Atomic loads and stores don't help with that. But atomic loads and > stores are needed to ensure direct assignment of the volatile field > (which is already implemented via Atomic::store) can't interfere with a > concurrent "cas" operation. I understand the intent now. I guess our saving grace is that the absence of NATIVE_CX8 precludes the use of intrinsics or other codegen tricks that bypass the current Unsafe stubs. >> Given the interaction with Java code, I wonder if this should be handled >> in per-platform Atomic::* definitions, not in Unsafe stubs? > > Not sure what you mean by this, but this is the implementation of the > Unsafe API - which is done using the runtime facilities of the Atomic > class. This code handles both the supports_cx8 and doesn't_support_cx8 > cases. This is my point: we leak the platform-specific details into Unsafe API stubs. I think this contributes to technical debt. Implementing the locked Atomic::cmpxchg(jlong) for doesn't_support_cx8 case still seems more fitting. Your excellent comment in unsafe.cpp really belongs in global Atomic definitions.
This also becomes a more correct fix: if compilers inject runtime call to Atomic::load/store for volatile long fields when NATIVE_CX8=false, then they would properly serialize with lock-protected CAS. > The aim here is not to try to provide an Atomic::cmpxchg(jlong) > implementation for general use, but to provide the atomic operations > needed by the Unsafe class. We've deliberately steered away from > using a 64-bit CAS in the runtime precisely because it is not > available on all platforms, and a lock-based solution for the general > runtime would become a bottleneck. Of course if the original > implementation had instead gone down that path we wouldn't have known > anything different - but it didn't. Why? jlong atomicity on that particular platform does not seem an isolated and one-off Unsafe issue. This does seem like a generic problem you will face on that particular platform in future. It feels weird to steer away from implementing the Atomic APIs because "it can become a bottleneck". That is, you may want to steer away from *using* the APIs excessively if you know it has problems on some platforms. But steering away from *implementing* leads to finally implementing it in ad-hoc places, e.g. Unsafe stubs in this case. > This patch simply fixes the current broken implementation of Unsafe. I agree with the approach to dodge the deadlock, but if we touch this area of the runtime code, it seems a good idea to make it more consistent and less spread-out. Thanks, -Aleksey. From lois.foltan at oracle.com Wed Nov 19 13:13:32 2014 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 19 Nov 2014 08:13:32 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546A7B88.8000504@oracle.com> References: <546A7B88.8000504@oracle.com> Message-ID: <546C977C.9050603@oracle.com> Hi Coleen, I think this looks good and I actually like the implementation better than the former indexing approach. 
One minor comment: src/share/vm/prims/methodHandles.cpp - line #278 if statement conditional, the expression "m->method_holder()" could be changed to "m_klass" which is set at the top of the routine to contain m->method_holder(). m_klass seems to be used consistently in that manner throughout the routine. Lois On 11/17/2014 5:49 PM, Coleen Phillimore wrote: > Summary: note all MemberNames created on internal list for adjusting > method entries. > > The JVM MemberNameTable code will push all member names on the list > rather than trying to index by method_idnum. The code to look up > MemberName types wasn't used so was removed. Class redefinition > iterates through the table sequentially to update the Method* pointers > in saved member names. > > This change will work with David Chase's change to the Java code for > bug 8013267 without the extra code dealing with class redefinition. > > Tested with vm.quick.testlist, jck tests and jtreg tests, including > the mlvm tests that failed in the bug report. > > open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ > bug link https://bugs.openjdk.java.net/browse/JDK-8042235 > > Thanks, > Coleen From coleen.phillimore at oracle.com Wed Nov 19 16:24:41 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 11:24:41 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546C977C.9050603@oracle.com> References: <546A7B88.8000504@oracle.com> <546C977C.9050603@oracle.com> Message-ID: <546CC449.9020709@oracle.com> On 11/19/14, 8:13 AM, Lois Foltan wrote: > Hi Coleen, > > I think this looks good and I actually like the implementation better > than the former indexing approach. One minor comment: > > src/share/vm/prims/methodHandles.cpp > - line #278 if statement conditional, the expression > "m->method_holder()" could be changed to "m_klass" > which is set at the top of the routine to contain > m->method_holder(). 
m_klass seems to be used > consistently in that manner throughout the routine. Hi Lois, Thank you for reviewing this. Yes, m_klass is better so I changed it. I didn't change the other uses of m->method_holder() in this function though because I'm not touching those lines. thanks! Coleen > > Lois > > On 11/17/2014 5:49 PM, Coleen Phillimore wrote: >> Summary: note all MemberNames created on internal list for adjusting >> method entries. >> >> The JVM MemberNameTable code will push all member names on the list >> rather than trying to index by method_idnum. The code to look up >> MemberName types wasn't used so was removed. Class redefinition >> iterates through the table sequentially to update the Method* >> pointers in saved member names. >> >> This change will work with David Chase's change to the Java code for >> bug 8013267 without the extra code dealing with class redefinition. >> >> Tested with vm.quick.testlist, jck tests and jtreg tests, including >> the mlvm tests that failed in the bug report. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Wed Nov 19 16:27:56 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 11:27:56 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546C977C.9050603@oracle.com> References: <546A7B88.8000504@oracle.com> <546C977C.9050603@oracle.com> Message-ID: <546CC50C.4000304@oracle.com> On 11/19/14, 8:13 AM, Lois Foltan wrote: > Hi Coleen, > > I think this looks good and I actually like the implementation better > than the former indexing approach. One minor comment: > > src/share/vm/prims/methodHandles.cpp > - line #278 if statement conditional, the expression > "m->method_holder()" could be changed to "m_klass" > which is set at the top of the routine to contain > m->method_holder(). 
m_klass seems to be used > consistently in that manner throughout the routine. \ m_klass is a stupid KlassHandle so I'd have to upcast it to InstanceKlass but method_holder() returns the right type InstanceKlass so I now think m->method_holder() is better. I should get back to work deciding whether we can remove these KlassHandle types.... Thanks! Coleen > > Lois > > On 11/17/2014 5:49 PM, Coleen Phillimore wrote: >> Summary: note all MemberNames created on internal list for adjusting >> method entries. >> >> The JVM MemberNameTable code will push all member names on the list >> rather than trying to index by method_idnum. The code to look up >> MemberName types wasn't used so was removed. Class redefinition >> iterates through the table sequentially to update the Method* >> pointers in saved member names. >> >> This change will work with David Chase's change to the Java code for >> bug 8013267 without the extra code dealing with class redefinition. >> >> Tested with vm.quick.testlist, jck tests and jtreg tests, including >> the mlvm tests that failed in the bug report. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Wed Nov 19 16:59:29 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 11:59:29 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546B8EDD.1060809@oracle.com> References: <546A7B88.8000504@oracle.com> <546B8EDD.1060809@oracle.com> Message-ID: <546CCC71.2060501@oracle.com> Thank you Dan for reviewing this. New webrev, see below for why. http://cr.openjdk.java.net/~coleenp/8042235_2/ On 11/18/14, 1:24 PM, Daniel D. Daugherty wrote: > On 11/17/14 3:49 PM, Coleen Phillimore wrote: >> Summary: note all MemberNames created on internal list for adjusting >> method entries. 
>> >> The JVM MemberNameTable code will push all member names on the list >> rather than trying to index by method_idnum. The code to look up >> MemberName types wasn't used so was removed. Class redefinition >> iterates through the table sequentially to update the Method* >> pointers in saved member names. >> >> This change will work with David Chase's change to the Java code for >> bug 8013267 without the extra code dealing with class redefinition. >> >> Tested with vm.quick.testlist, jck tests and jtreg tests, including >> the mlvm tests that failed in the bug report. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ > > src/share/vm/classfile/javaClasses.hpp > No comments. > > src/share/vm/classfile/javaClasses.cpp > No comments. > > src/share/vm/oops/instanceKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.cpp > line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) > MemberNameTable(idnum_allocated_count()); > Not your bug, but what should happen if this new fails? Or > is this one of the operator overrides that handles that by > killing the VM? This is one of the ResourceObj::new calls that does a vm_exit_out_of_memory. We could add some code to handle allocation failure at this part (maybe allocate a very small member name array instead) but I don't think it's worth it at this point. > > src/share/vm/prims/jvm.cpp > nit line 612: methodHandle m (THREAD, method); > Please delete space between 'm ('. Okay. > > lines 607-609, 616: uses of 'new_obj' > Should all of these be switched to 'new_obj_h()'? > > In particular, is line 616 subject to being moved by GC > if the methodHandle creation goes to a safepoint? So methodHandle doesn't go to a safepoint, and the uses of new_obj (unhandled) are safe in that call until the add_member_name() call. But you have to know that the other calls that use new_obj don't go to a safepoint. It's always safest to Handle early and pass around handles. 
This function is odd in that it uses the oop to copy into first though. This comment made me examine this code and the call to register_finalizer can safepoint. In this case the MemberName Method* could be redefined and not copied. This can't happen because MemberName is a final class without finalizers, but I've rearranged the code to be safe for this case if it ever happens. The webrev above is this rearranged code. I've rerun the java/lang/invoke tests on it. > > line 620: return JNIHandles::make_local(env, oop(new_obj_h())); > is the 'oop(...)' around 'new_obj_h()' redundant? I might > be rusty, but isn't 'new_obj_h()' the unhandled oop? > Yes, this was unnecessary. > src/share/vm/prims/methodHandles.hpp > No comments. > > src/share/vm/prims/methodHandles.cpp > No comments. > > test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java > line 142 // static class FooTransformer implements > ClassFileTransformer, Opcodes { > Do you still need this line? No, I removed it. Thanks! Coleen > > > Dan > > >> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Wed Nov 19 17:01:20 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 12:01:20 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546C89DE.2080409@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C89DE.2080409@oracle.com> Message-ID: <546CCCE0.2010802@oracle.com> Max, This looks great to me also. Thank you for making this change! I can update the copyright when I sponsor it for you. Thanks, Coleen On 11/19/14, 7:15 AM, Lois Foltan wrote: > Hi Max, > > Looks good. Minor comment, several files need the copyright > statements updated before pushing. No need for another webrev, though. 
> > Thanks, > Lois > > > On 11/18/2014 4:53 PM, Max Ockner wrote: >> Hello all, >> Please review this minor cleanup: >> >> Bug ID: 8060074 >> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >> >> Summary: >> (1) os::free takes two arguments, but never uses the second argument, >> which is a MEMFLAG. I have removed this argument from every os::free >> call. >> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >> also takes a MEMFLAG argument, which is only used to call os::free. >> Now an unused argument, it has been removed from all FreeHeap calls. >> No other methods which directly call os::free() have this problem. >> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >> takes a MEMFLAG argument which is passed to FreeHeap, and nothing >> else. This argument is now unused, and has been removed. No other >> methods which call FreeHeap have this problem. >> >> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >> cleanup. I have also removed the extra argument from the definitions >> of the above methods. >> >> Tests: jtreg hotspot tests with >> -vmoption:"-XX:NativeMemoryTracking=detail" >> >> Thanks for your help, >> Max Ockner > From daniel.daugherty at oracle.com Wed Nov 19 17:27:38 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 19 Nov 2014 10:27:38 -0700 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546CCC71.2060501@oracle.com> References: <546A7B88.8000504@oracle.com> <546B8EDD.1060809@oracle.com> <546CCC71.2060501@oracle.com> Message-ID: <546CD30A.30401@oracle.com> On 11/19/14 9:59 AM, Coleen Phillimore wrote: > > Thank you Dan for reviewing this. New webrev, see below for why. > > http://cr.openjdk.java.net/~coleenp/8042235_2/ src/share/vm/classfile/javaClasses.hpp No comments. src/share/vm/classfile/javaClasses.cpp No comments. src/share/vm/oops/instanceKlass.hpp No comments. src/share/vm/oops/instanceKlass.cpp No comments.
src/share/vm/prims/jvm.cpp No comments. That JVM_Clone() code is scary... :-) src/share/vm/prims/methodHandles.hpp No comments. src/share/vm/prims/methodHandles.cpp No comments. test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java No comments. Thumbs up. Dan > > > On 11/18/14, 1:24 PM, Daniel D. Daugherty wrote: >> On 11/17/14 3:49 PM, Coleen Phillimore wrote: >>> Summary: note all MemberNames created on internal list for adjusting >>> method entries. >>> >>> The JVM MemberNameTable code will push all member names on the list >>> rather than trying to index by method_idnum. The code to look up >>> MemberName types wasn't used so was removed. Class redefinition >>> iterates through the table sequentially to update the Method* >>> pointers in saved member names. >>> >>> This change will work with David Chase's change to the Java code for >>> bug 8013267 without the extra code dealing with class redefinition. >>> >>> Tested with vm.quick.testlist, jck tests and jtreg tests, including >>> the mlvm tests that failed in the bug report. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >> >> src/share/vm/classfile/javaClasses.hpp >> No comments. >> >> src/share/vm/classfile/javaClasses.cpp >> No comments. >> >> src/share/vm/oops/instanceKlass.hpp >> No comments. >> >> src/share/vm/oops/instanceKlass.cpp >> line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) >> MemberNameTable(idnum_allocated_count()); >> Not your bug, but what should happen if this new fails? Or >> is this one of the operator overrides that handles that by >> killing the VM? > > This is one of the ResourceObj::new calls that does a > vm_exit_out_of_memory. We could add some code to handle allocation > failure at this part (maybe allocate a very small member name array > instead) but I don't think it's worth it at this point. > >> >> src/share/vm/prims/jvm.cpp >> nit line 612: methodHandle m (THREAD, method); >> Please delete space between 'm ('. > > Okay. 
>> >> lines 607-609, 616: uses of 'new_obj' >> Should all of these be switched to 'new_obj_h()'? >> >> In particular, is line 616 subject to being moved by GC >> if the methodHandle creation goes to a safepoint? > > So methodHandle doesn't go to a safepoint, and the uses of new_obj > (unhandled) are safe in that call until the add_member_name() call. > But you have to know that the other calls that use new_obj don't go to > a safepoint. It's always safest to Handle early and pass around > handles. This function is odd in that it uses the oop to copy into > first though. > > This comment made me examine this code and the call to > register_finalizer can safepoint. In this case the MemberName Method* > could be redefined and not copied. This can't happen because > MemberName is a final class without finalizers, but I've rearranged > the code to be safe for this case if it ever happens. > > The webrev above is this rearranged code. I've rerun the > java/lang/invoke tests on it. >> >> line 620: return JNIHandles::make_local(env, oop(new_obj_h())); >> is the 'oop(...)' around 'new_obj_h()' redundant? I might >> be rusty, but isn't 'new_obj_h()' the unhandled oop? >> > > Yes, this was unnecessary. >> src/share/vm/prims/methodHandles.hpp >> No comments. >> >> src/share/vm/prims/methodHandles.cpp >> No comments. >> >> test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java >> line 142 // static class FooTransformer implements >> ClassFileTransformer, Opcodes { >> Do you still need this line? > > No, I removed it. > > Thanks! 
> Coleen >> >> >> Dan >> >> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >>> >>> Thanks, >>> Coleen >> > From coleen.phillimore at oracle.com Wed Nov 19 17:36:58 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 12:36:58 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546CD30A.30401@oracle.com> References: <546A7B88.8000504@oracle.com> <546B8EDD.1060809@oracle.com> <546CCC71.2060501@oracle.com> <546CD30A.30401@oracle.com> Message-ID: <546CD53A.4050204@oracle.com> Thanks Dan! On 11/19/14, 12:27 PM, Daniel D. Daugherty wrote: > On 11/19/14 9:59 AM, Coleen Phillimore wrote: >> >> Thank you Dan for reviewing this. New webrev, see below for why. >> >> http://cr.openjdk.java.net/~coleenp/8042235_2/ > > src/share/vm/classfile/javaClasses.hpp > No comments. > > src/share/vm/classfile/javaClasses.cpp > No comments. > > src/share/vm/oops/instanceKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.cpp > No comments. > > src/share/vm/prims/jvm.cpp > No comments. That JVM_Clone() code is scary... :-) There are scarier things :) Coleen > > src/share/vm/prims/methodHandles.hpp > No comments. > > src/share/vm/prims/methodHandles.cpp > No comments. > > test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java > No comments. > > > Thumbs up. > > Dan > > > > >> >> >> On 11/18/14, 1:24 PM, Daniel D. Daugherty wrote: >>> On 11/17/14 3:49 PM, Coleen Phillimore wrote: >>>> Summary: note all MemberNames created on internal list for >>>> adjusting method entries. >>>> >>>> The JVM MemberNameTable code will push all member names on the list >>>> rather than trying to index by method_idnum. The code to look up >>>> MemberName types wasn't used so was removed. Class redefinition >>>> iterates through the table sequentially to update the Method* >>>> pointers in saved member names. 
>>>> >>>> This change will work with David Chase's change to the Java code >>>> for bug 8013267 without the extra code dealing with class >>>> redefinition. >>>> >>>> Tested with vm.quick.testlist, jck tests and jtreg tests, including >>>> the mlvm tests that failed in the bug report. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >>> >>> src/share/vm/classfile/javaClasses.hpp >>> No comments. >>> >>> src/share/vm/classfile/javaClasses.cpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.hpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.cpp >>> line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) >>> MemberNameTable(idnum_allocated_count()); >>> Not your bug, but what should happen if this new fails? Or >>> is this one of the operator overrides that handles that by >>> killing the VM? >> >> This is one of the ResourceObj::new calls that does a >> vm_exit_out_of_memory. We could add some code to handle allocation >> failure at this part (maybe allocate a very small member name array >> instead) but I don't think it's worth it at this point. >> >>> >>> src/share/vm/prims/jvm.cpp >>> nit line 612: methodHandle m (THREAD, method); >>> Please delete space between 'm ('. >> >> Okay. >>> >>> lines 607-609, 616: uses of 'new_obj' >>> Should all of these be switched to 'new_obj_h()'? >>> >>> In particular, is line 616 subject to being moved by GC >>> if the methodHandle creation goes to a safepoint? >> >> So methodHandle doesn't go to a safepoint, and the uses of new_obj >> (unhandled) are safe in that call until the add_member_name() call. >> But you have to know that the other calls that use new_obj don't go >> to a safepoint. It's always safest to Handle early and pass around >> handles. This function is odd in that it uses the oop to copy into >> first though. >> >> This comment made me examine this code and the call to >> register_finalizer can safepoint. 
In this case the MemberName >> Method* could be redefined and not copied. This can't happen because >> MemberName is a final class without finalizers, but I've rearranged >> the code to be safe for this case if it ever happens. >> >> The webrev above is this rearranged code. I've rerun the >> java/lang/invoke tests on it. >>> >>> line 620: return JNIHandles::make_local(env, oop(new_obj_h())); >>> is the 'oop(...)' around 'new_obj_h()' redundant? I might >>> be rusty, but isn't 'new_obj_h()' the unhandled oop? >>> >> >> Yes, this was unnecessary. >>> src/share/vm/prims/methodHandles.hpp >>> No comments. >>> >>> src/share/vm/prims/methodHandles.cpp >>> No comments. >>> >>> test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java >>> line 142 // static class FooTransformer implements >>> ClassFileTransformer, Opcodes { >>> Do you still need this line? >> >> No, I removed it. >> >> Thanks! >> Coleen >>> >>> >>> Dan >>> >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >>>> >>>> Thanks, >>>> Coleen >>> >> > From aph at redhat.com Wed Nov 19 17:49:40 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 19 Nov 2014 17:49:40 +0000 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <20141118180315.GB22927@redhat.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> Message-ID: <546CD834.50004@redhat.com> I think this covers everything that reviewers have mentioned: http://cr.openjdk.java.net/~aph/aarch64-8064357-4/ Andrew. 
From david.holmes at oracle.com Wed Nov 19 20:26:44 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 20 Nov 2014 06:26:44 +1000 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C90CA.5090804@oracle.com> References: <546C27ED.7090700@oracle.com> <546C7D44.1000502@oracle.com> <546C8717.4090506@oracle.com> <546C90CA.5090804@oracle.com> Message-ID: <546CFD04.7090003@oracle.com> On 19/11/2014 10:44 PM, Aleksey Shipilev wrote: > On 11/19/2014 03:03 PM, David Holmes wrote: >> On 19/11/2014 9:21 PM, Aleksey Shipilev wrote: >> The lock provides the atomicity needed for the logical CAS operation. >> Atomic loads and stores don't help with that. But atomic loads and >> stores are needed to ensure direct assignment of the volatile field >> (which is already implemented via Atomic::store) can't interfere with a >> concurrent "cas" operation. > > I understand the intent now. I guess our saving grace is that the > absence of NATIVE_CX8 precludes the use of intrinsics or other codegen > tricks that bypass the current Unsafe stubs. Yes. Also I should have given a clearer picture of the context. This part of the Unsafe API is not intended for general use and mixing with regular Java code. It is only used as the implementation for a few of the java.util.concurrent classes. It isn't even used by AtomicLongFieldUpdater (which implements the locking at the Java level directly using the updater instance as the lock object). As per the test name the problem usage here was exposed via Phaser - which uses Unsafe directly for performance - to access its "private volatile long state" field. > >>> Given the interaction with Java code, I wonder if this should be handled >>> in per-platform Atomic::* definitions, not in Unsafe stubs? >> >> Not sure what you mean by this, but this is the implementation of the >> Unsafe API - which is done using the runtime facilities of the Atomic >> class. 
This code handles both the supports_cx8 and doesn't_support_cx8 >> cases. > > This is my point: we leak the platform-specific details into Unsafe API > stubs. I think this contributes to technical debt. Implementing the > locked Atomic::cmpxchg(jlong) for doesn't_support_cx8 case still seems > more fitting. Your excellent comment in unsafe.cpp really belongs in > global Atomic definitions. And it could have been implemented that way from the beginning but was not. > This also becomes a more correct fix: if compilers inject runtime call > to Atomic::load/store for volatile long fields when NATIVE_CX8=false, > then they would properly serialize with lock-protected CAS. Normal accesses to volatile long fields are already atomic as per the requirements of the spec. They are atomic wrt each other but not wrt these lock-based operations. The 'correct' fix is not to make direct accesses serialize with the lock-protected CAS, (which would kill the performance of non-CASing Java code) but to not mix such accesses in the first place. Again this is not an application coding issue but a library implementation issue. > >> The aim here is not to try to provide an Atomic::cmpxchg(jlong) >> implementation for general use, but to provide the atomic operations >> needed by the Unsafe class. We've deliberately steered away from >> using a 64-bit CAS in the runtime precisely because it is not >> available on all platforms, and a lock-based solution for the general >> runtime would become a bottleneck. Of course if the original >> implementation had instead gone down that path we wouldn't have known >> anything different - but it didn't. > > Why? jlong atomicity on that particular platform does not seem an > isolated and one-off Unsafe issue. This does seem like a generic problem > you will face on that particular platform in future. Simple fact is that not all 32-bit platforms support a 64-bit CAS. 
So we have avoided implementing anything in the VM runtime that requires a 64-bit CAS on a 32-bit platform. If we implemented the Atomic::cmpxchg(jlong) for all platforms as you suggest then people would have used it without too much thought - potentially leading to problems in key algorithms on platforms without supports_cx8. We've actually caught this happening a few times over recent years. When it does we question whether the variable in question really needs to be 64-bit on a 32-bit platform and the answer so far has always been no - so we either use int, or intptr_t so that we get 32-bit on 32-bit and 64-bit on 64-bit. > It feels weird to steer away from implementing the Atomic APIs because > "it can become a bottleneck". That is, you may want to steer away from > *using* the APIs excessively if you know it has problems on some > platforms. But steering away from *implementing* leads to finally > implementing it in ad-hoc places, e.g. Unsafe stubs in this case. It is not as general as you make out. We only need 64-bit CAS in very limited places. > >> This patch simply fixes the current broken implementation of Unsafe. > > I agree with the approach to dodge the deadlock, but if we touch this > area of the runtime code, it seems a good idea to make it more > consistent and less spread-out. Touching the code to fix the existing bugs does not mean we should re-design and implement this part of the runtime support. Thanks for looking at it. David > > Thanks, > -Aleksey. > From serguei.spitsyn at oracle.com Wed Nov 19 21:21:04 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Nov 2014 13:21:04 -0800 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546CCC71.2060501@oracle.com> References: <546A7B88.8000504@oracle.com> <546B8EDD.1060809@oracle.com> <546CCC71.2060501@oracle.com> Message-ID: <546D09C0.2060105@oracle.com> Looks good.
Thanks, Serguei On 11/19/14 8:59 AM, Coleen Phillimore wrote: > > Thank you Dan for reviewing this. New webrev, see below for why. > > http://cr.openjdk.java.net/~coleenp/8042235_2/ > > > On 11/18/14, 1:24 PM, Daniel D. Daugherty wrote: >> On 11/17/14 3:49 PM, Coleen Phillimore wrote: >>> Summary: note all MemberNames created on internal list for adjusting >>> method entries. >>> >>> The JVM MemberNameTable code will push all member names on the list >>> rather than trying to index by method_idnum. The code to look up >>> MemberName types wasn't used so was removed. Class redefinition >>> iterates through the table sequentially to update the Method* >>> pointers in saved member names. >>> >>> This change will work with David Chase's change to the Java code for >>> bug 8013267 without the extra code dealing with class redefinition. >>> >>> Tested with vm.quick.testlist, jck tests and jtreg tests, including >>> the mlvm tests that failed in the bug report. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >> >> src/share/vm/classfile/javaClasses.hpp >> No comments. >> >> src/share/vm/classfile/javaClasses.cpp >> No comments. >> >> src/share/vm/oops/instanceKlass.hpp >> No comments. >> >> src/share/vm/oops/instanceKlass.cpp >> line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) >> MemberNameTable(idnum_allocated_count()); >> Not your bug, but what should happen if this new fails? Or >> is this one of the operator overrides that handles that by >> killing the VM? > > This is one of the ResourceObj::new calls that does a > vm_exit_out_of_memory. We could add some code to handle allocation > failure at this part (maybe allocate a very small member name array > instead) but I don't think it's worth it at this point. > >> >> src/share/vm/prims/jvm.cpp >> nit line 612: methodHandle m (THREAD, method); >> Please delete space between 'm ('. > > Okay. >> >> lines 607-609, 616: uses of 'new_obj' >> Should all of these be switched to 'new_obj_h()'? 
>> >> In particular, is line 616 subject to being moved by GC >> if the methodHandle creation goes to a safepoint? > > So methodHandle doesn't go to a safepoint, and the uses of new_obj > (unhandled) are safe in that call until the add_member_name() call. > But you have to know that the other calls that use new_obj don't go to > a safepoint. It's always safest to Handle early and pass around > handles. This function is odd in that it uses the oop to copy into > first though. > > This comment made me examine this code and the call to > register_finalizer can safepoint. In this case the MemberName Method* > could be redefined and not copied. This can't happen because > MemberName is a final class without finalizers, but I've rearranged > the code to be safe for this case if it ever happens. > > The webrev above is this rearranged code. I've rerun the > java/lang/invoke tests on it. >> >> line 620: return JNIHandles::make_local(env, oop(new_obj_h())); >> is the 'oop(...)' around 'new_obj_h()' redundant? I might >> be rusty, but isn't 'new_obj_h()' the unhandled oop? >> > > Yes, this was unnecessary. >> src/share/vm/prims/methodHandles.hpp >> No comments. >> >> src/share/vm/prims/methodHandles.cpp >> No comments. >> >> test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java >> line 142 // static class FooTransformer implements >> ClassFileTransformer, Opcodes { >> Do you still need this line? > > No, I removed it. > > Thanks! > Coleen >> >> >> Dan >> >> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >>> >>> Thanks, >>> Coleen >> > From max.ockner at oracle.com Wed Nov 19 21:57:50 2014 From: max.ockner at oracle.com (Max Ockner) Date: Wed, 19 Nov 2014 16:57:50 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. 
In-Reply-To: <546C89DE.2080409@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C89DE.2080409@oracle.com> Message-ID: <546D125E.7020405@oracle.com> Thank you all for your help! Max On 11/19/2014 7:15 AM, Lois Foltan wrote: > Hi Max, > > Looks good. Minor comment, several files need the copyright > statements updated before pushing. No need for another webrev, though. > > Thanks, > Lois > > > On 11/18/2014 4:53 PM, Max Ockner wrote: >> Hello all, >> Please review this minor cleanup: >> >> Bug ID: 8060074 >> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >> >> Summary: >> (1) os::free takes two arguments, but never uses the second argument, >> which is a MEMFLAG. I have removed this argument from every os::free >> call. >> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >> also takes a MEMFLAG argument, which is only used to call os::free. >> Now an unused argument, it has been removed from all FreeHeap calls. >> No other methods which directly call os::free() have this problem. >> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >> takes A MEMFLAG argument which is passed to FreeHeap, and nothing >> else. This argument is now unused, and has been removed. No other >> methods which call FreeHeap have this problem. >> >> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >> cleanup. I have also removed the extra argument from the definitions >> of the above methods. 
>> >> Tests: jtreg hotspot tests with >> -vmoption:"-XX:NativeMemoryTracking=detail" >> >> Thanks for your help, >> Max Ockner > From coleen.phillimore at oracle.com Wed Nov 19 22:42:08 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 19 Nov 2014 17:42:08 -0500 Subject: RFR 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <546D09C0.2060105@oracle.com> References: <546A7B88.8000504@oracle.com> <546B8EDD.1060809@oracle.com> <546CCC71.2060501@oracle.com> <546D09C0.2060105@oracle.com> Message-ID: <546D1CC0.5010504@oracle.com> On 11/19/14, 4:21 PM, serguei.spitsyn at oracle.com wrote: > Looks good. Thank you again, Serguei. Coleen > > Thanks, > Serguei > > > On 11/19/14 8:59 AM, Coleen Phillimore wrote: >> >> Thank you Dan for reviewing this. New webrev, see below for why. >> >> http://cr.openjdk.java.net/~coleenp/8042235_2/ >> >> >> On 11/18/14, 1:24 PM, Daniel D. Daugherty wrote: >>> On 11/17/14 3:49 PM, Coleen Phillimore wrote: >>>> Summary: note all MemberNames created on internal list for >>>> adjusting method entries. >>>> >>>> The JVM MemberNameTable code will push all member names on the list >>>> rather than trying to index by method_idnum. The code to look up >>>> MemberName types wasn't used so was removed. Class redefinition >>>> iterates through the table sequentially to update the Method* >>>> pointers in saved member names. >>>> >>>> This change will work with David Chase's change to the Java code >>>> for bug 8013267 without the extra code dealing with class >>>> redefinition. >>>> >>>> Tested with vm.quick.testlist, jck tests and jtreg tests, including >>>> the mlvm tests that failed in the bug report. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8042235/ >>> >>> src/share/vm/classfile/javaClasses.hpp >>> No comments. >>> >>> src/share/vm/classfile/javaClasses.cpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.hpp >>> No comments.
>>> >>> src/share/vm/oops/instanceKlass.cpp >>> line 2951: _member_names = new (ResourceObj::C_HEAP, mtClass) >>> MemberNameTable(idnum_allocated_count()); >>> Not your bug, but what should happen if this new fails? Or >>> is this one of the operator overrides that handles that by >>> killing the VM? >> >> This is one of the ResourceObj::new calls that does a >> vm_exit_out_of_memory. We could add some code to handle allocation >> failure at this part (maybe allocate a very small member name array >> instead) but I don't think it's worth it at this point. >> >>> >>> src/share/vm/prims/jvm.cpp >>> nit line 612: methodHandle m (THREAD, method); >>> Please delete space between 'm ('. >> >> Okay. >>> >>> lines 607-609, 616: uses of 'new_obj' >>> Should all of these be switched to 'new_obj_h()'? >>> >>> In particular, is line 616 subject to being moved by GC >>> if the methodHandle creation goes to a safepoint? >> >> So methodHandle doesn't go to a safepoint, and the uses of new_obj >> (unhandled) are safe in that call until the add_member_name() call. >> But you have to know that the other calls that use new_obj don't go >> to a safepoint. It's always safest to Handle early and pass around >> handles. This function is odd in that it uses the oop to copy into >> first though. >> >> This comment made me examine this code and the call to >> register_finalizer can safepoint. In this case the MemberName >> Method* could be redefined and not copied. This can't happen because >> MemberName is a final class without finalizers, but I've rearranged >> the code to be safe for this case if it ever happens. >> >> The webrev above is this rearranged code. I've rerun the >> java/lang/invoke tests on it. >>> >>> line 620: return JNIHandles::make_local(env, oop(new_obj_h())); >>> is the 'oop(...)' around 'new_obj_h()' redundant? I might >>> be rusty, but isn't 'new_obj_h()' the unhandled oop? >>> >> >> Yes, this was unnecessary. >>> src/share/vm/prims/methodHandles.hpp >>> No comments. 
>>> >>> src/share/vm/prims/methodHandles.cpp >>> No comments. >>> >>> test/compiler/jsr292/RedefineMethodUsedByMultipleMethodHandles.java >>> line 142 // static class FooTransformer implements >>> ClassFileTransformer, Opcodes { >>> Do you still need this line? >> >> No, I removed it. >> >> Thanks! >> Coleen >>> >>> >>> Dan >>> >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8042235 >>>> >>>> Thanks, >>>> Coleen >>> >> > From christian.thalinger at oracle.com Thu Nov 20 00:28:57 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 19 Nov 2014 16:28:57 -0800 Subject: RFR: AARCH64: Top-level JDK changes In-Reply-To: <546C5B29.8050602@oracle.com> References: <545CFFA9.4070107@redhat.com> <545D0290.5080307@oracle.com> <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <0B65D4AA-B876-4106-9CA0-2F2AF5F43F83@oracle.com> <62FBBB3A-B1CE-43D8-9D99-3087BDDBDC37@oracle.com> <5466699D.2050808@oracle.com> <5469E5E4.7040606@oracle.com> <546B6779.9050608@oracle.com> <94199476-15AD-4226-8C18-2708F6F94DC3@oracle.com> <546C5B29.8050602@oracle.com> Message-ID: <1A79BF26-783C-4026-92B6-2150C3C49B4D@oracle.com> > On Nov 19, 2014, at 12:56 AM, Erik Joelsson wrote: > > > On 2014-11-18 19:09, Christian Thalinger wrote: >>> On Nov 18, 2014, at 7:36 AM, Magnus Ihse Bursie wrote: >>> >>> On 2014-11-18 01:59, Christian Thalinger wrote: >>>>> On Nov 17, 2014, at 4:11 AM, Magnus Ihse Bursie wrote: >>>>> >>>>> On 2014-11-14 21:44, Dean Long wrote: >>>>>>> The distribution exception is there exactly since anyone should be able to distribute the files with their configure script. That does not mean that you are allowed to edit it, though. >>>>>> What if we require Autoconf to be installed on the host? Does that solve any problems? >>>>> No, unfortunately not. >>>> Why not? >>> Autoconf picks up these files automatically from the build-aux directory. 
That's also the reason we need to rename the original files and provide wrappers with the same name, since we can't even redirect that functionality to a file with another name. >> So do I understand you correctly that the files we need are automatically copied into the workspace but since we want to use our own, old versions we renamed them and use these instead? > No, I will try to clarify. > > Autoconf is a tool that takes one (or more) input files (written in m4 macro language) and generates a shell script, often named "configure". This shell script is what you would typically run to configure your project. Autoconf defines an API of m4 macros specifically for configure scripts which is basically what makes it useful. Most of these macros are expanded into the generated configure script. > > However, for reasons unknown to us, some of the more complex functionality has been split out into separate shell script "library" files. These library files, often referred to as "build-aux" are supposed to be distributed with the project source code, along with the generated configure script. We distribute them in common/autoconf/build-aux. These files can be found in the source distribution of autoconf or by downloading from the official scm repo for them. They are not part of the binary distribution of autoconf on my Ubuntu system at least. Well, that's because config.guess and config.sub are part of automake: http://git.savannah.gnu.org/gitweb/?p=automake.git;a=tree;f=lib;hb=HEAD and installed e.g.
in this directory on Solaris: $ ls /usr/share/automake-1.10/ acinstall* ansi2knr.1 Automake/ config-ml.in config.sub* depcomp* INSTALL mdate-sh* mkinstalldirs* symlink-tree* ylwrap* am/ ansi2knr.c compile* config.guess* COPYING elisp-comp* install-sh* missing* py-compile* texinfo.tex $ automake --add-missing makes a copy of these files, if necessary: configure.ac:3: the top level configure.ac:3: installing `./config.sub' configure.ac:3: installing `./config.guess' $ ls -l config.* lrwxrwxrwx 1 cthaling staff 37 2014-11-19 16:26 config.guess -> /usr/share/automake-1.10/config.guess* lrwxrwxrwx 1 cthaling staff 35 2014-11-19 16:26 config.sub -> /usr/share/automake-1.10/config.sub* > For this reason, it wouldn't help requiring autoconf to be installed as that wouldn't provide the files. > > For non GPL projects to be able to distribute the files in build-aux, they come with a special exception to GPL, which basically allows them to be freely distributed as long as they are part of a configure script. This exception does not seem to give any exception for deriving work from them. > > /Erik From erik.osterlund at lnu.se Thu Nov 20 08:58:14 2014 From: erik.osterlund at lnu.se (Erik Österlund) Date: Thu, 20 Nov 2014 08:58:14 +0000 Subject: Branch Prediction? In-Reply-To: References: Message-ID: <710192F3-FB58-4972-AA10-28B77988B403@lnu.se> Okay, I suspected there would have been a conscious decision, thanks for the info. :) /Erik On 19 Nov 2014, at 02:36, Paul Hohensee wrote: > History as I remember it. :) > > It's been considered, and decided against. The platforms which openjdk > currently targets all have decent to spectacular hardware branch > prediction. Those that didn't, such as Niagara 1, ignored prediction > bits. Conclusion is that it's not worth complicating the code with, as > David says, 'magic macros'. > > Paul > > David Holmes wrote: > On 19/11/2014 4:42 AM, Christian Thalinger wrote: >> ...or I could just read the next email. Doh.
> > Not so obvious when the subject was changed to "Compiler branch hints". > I had to go to the archives to find it. > > My 2c. The performance focus has been on generated code, not the runtime > code. Personally I dislike these magic macros as they clutter the code. > > David > >>> On Nov 18, 2014, at 10:41 AM, Christian Thalinger < > christian.thalinger at oracle.com> wrote: >>> >>> I'm not sure if the silence means nobody knows or nobody cares. > Speaking for myself, I don't know of any history on this. >>> >>>> On Nov 8, 2014, at 1:43 AM, Erik Österlund > wrote: >>>> >>>> Hi, >>>> >>>> Just out of curiosity, is there some good reason why we don't have a > branch prediction macro? >>>> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, > I feel a bit uneasy not telling the compiler that this is pretty likely to > succeed, and relying on its guessing. >>>> >>>> Has it been excluded because it's considered not nice or perhaps it was > simply never introduced because nobody found it useful? >>>> >>>> Could have some define like this for GCC, which for other compilers > reduces to nothing: >>>> >>>> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) >>>> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) >>>> >>>> It might not lead to drastic performance improvements, but it feels > weird not to tell the compiler what we know and keep secrets from it. And I > think it's also nice for documentation purposes that people reading it also > understand that this expression is gonna be true most of the time, and deal > with it accordingly. >>>> >>>> /Erik From vitalyd at gmail.com Thu Nov 20 13:40:25 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 20 Nov 2014 08:40:25 -0500 Subject: Branch Prediction? In-Reply-To: References: Message-ID: Paul, Compilers can use these macros to move cold/unlikely basic blocks further away from the likely paths, leading to better icache utilization. Was that aspect considered at all?
I'll agree that overuse of these macros would be a maintenance problem, but seems like judicious use of these in targeted places may yield small benefit. Thanks Sent from my phone On Nov 18, 2014 9:36 PM, "Paul Hohensee" wrote: > History as I remember it. :) > > It's been considered, and decided against. The platforms which openjdk > currently targets all have decent to spectacular hardware branch > prediction. Those that didn't, such as Niagara 1, ignored prediction > bits. Conclusion is that it's not worth complicating the code with, as > David says, 'magic macros'. > > Paul > > David Holmes wrote: > On 19/11/2014 4:42 AM, Christian Thalinger wrote: > > ...or I could just read the next email. Doh. > > Not so obvious when the subject was changed to "Compiler branch hints". > I had to go to the archives to find it. > > My 2c. The performance focus has been on generated code, not the runtime > code. Personally I dislike these magic macros as they clutter the code. > > David > > >> On Nov 18, 2014, at 10:41 AM, Christian Thalinger < > christian.thalinger at oracle.com> wrote: > >> > >> I'm not sure if the silence means nobody knows or nobody cares. > Speaking for myself, I don't know of any history on this. > >> > >>> On Nov 8, 2014, at 1:43 AM, Erik Österlund > wrote: > >>> > >>> Hi, > >>> > >>> Just out of curiosity, is there some good reason why we don't have a > branch prediction macro? > >>> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); loop, > I feel a bit uneasy not telling the compiler that this is pretty likely to > succeed, and relying on its guessing. > >>> > >>> Has it been excluded because it's considered not nice or perhaps it was > simply never introduced because nobody found it useful?
> >>> > >>> Could have some define like this for GCC, which for other compilers > reduces to nothing: > >>> > >>> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) > >>> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) > >>> > >>> It might not lead to drastic performance improvements, but it feels > weird not to tell the compiler what we know and keep secrets from it. And I > think it's also nice for documentation purposes that people reading it also > understand that this expression is gonna be true most of the time, and deal > with it accordingly. > >>> > >>> /Erik > From aph at redhat.com Thu Nov 20 14:25:28 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 20 Nov 2014 14:25:28 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546C1264.6090308@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> Message-ID: <546DF9D8.3090505@redhat.com> On 11/19/2014 03:45 AM, Dean Long wrote: > On 11/18/2014 7:03 PM, Vladimir Kozlov wrote: >> Yes, we can hide AARCH64 using something similar to >> CODE_CACHE_SIZE_LIMIT macro which could be overwritten in platform >> specific files if needed: USE_STORE_RELEASE_FOR_VOLATILE. >> Or slightly more complicated declaration similar to >> support_IRIW_for_not_multiple_copy_atomic_cpu boolean constant. >> >> Dean, will it help us if we do that? If yes, then we should do that. >> > Yes, this will help us. 
Following the boolean constant example, we > would have something like: > > #ifdef USE_STORE_RELEASE_FOR_VOLATILE > const bool use_store_release_for_volatile = true; > #else > const bool use_store_release_for_volatile = false; > #endif The problem with this is that the AArch64 stlr instruction isn't exactly store release, so this #define is rather misleading. Other processors have store release instructions, but none of them do quite what the AArch64 does. But really I just want this patch to move forward, so I'll agree to anything which is not ridiculous. Andrew. From paul.hohensee at gmail.com Thu Nov 20 14:28:53 2014 From: paul.hohensee at gmail.com (Paul Hohensee) Date: Thu, 20 Nov 2014 09:28:53 -0500 Subject: Branch Prediction? In-Reply-To: References: Message-ID: No, we didn't think about likely-path optimization. If it were me (and it's not :) ), I'd want to see some generated code examples that would benefit. Perhaps quantify the benefit by getting a libjvm profile, look at the code for the top 10 or 20 methods, reorganize it via a proof-of-concept implementation and measure the result. I'd also take into consideration that future hardware improvements (e.g., larger icache line size) might negate any current improvement, so it'd have to be a 'big' benefit. Again if it were me (and it's not :) ), it would take maybe a per-instance 10% improvement before I'd accept the intellectual overhead cost of a permanent implementation. I worry a lot about long term source code maintainability, hence I have a sort of automatic skepticism about features that complicate it. :) Hope this helps, Paul On Thu, Nov 20, 2014 at 8:40 AM, Vitaly Davidovich wrote: > Paul, > > Compilers can use these macros to move cold/unlikely basic blocks further > away from the likely paths, leading to better icache utilization. Was that > aspect considered at all?
I'll agree that overuse of these macros would be > a maintenance problem, but seems like judicious use of these in targeted > places may yield small benefit. > > Thanks > > Sent from my phone > On Nov 18, 2014 9:36 PM, "Paul Hohensee" wrote: > >> History as I remember it. :) >> >> It's been considered, and decided against. The platforms which openjdk >> currently targets all have decent to spectacular hardware branch >> prediction. Those that didn't, such as Niagara 1, ignored prediction >> bits. Conclusion is that it's not worth complicating the code with, as >> David says, 'magic macros'. >> >> Paul >> >> David Holmes wrote: >> On 19/11/2014 4:42 AM, Christian Thalinger wrote: >> > ...or I could just read the next email. Doh. >> >> Not so obvious when the subject was changed to "Compiler branch hints". >> I had to go to the archives to find it. >> >> My 2c. The performance focus has been on generated code, not the runtime >> code. Personally I dislike these magic macros as they clutter the code. >> >> David >> >> >> On Nov 18, 2014, at 10:41 AM, Christian Thalinger < >> christian.thalinger at oracle.com> wrote: >> >> >> >> I'm not sure if the silence means nobody knows or nobody cares. >> Speaking for myself, I don't know of any history on this. >> >> >> >>> On Nov 8, 2014, at 1:43 AM, Erik Österlund >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> Just out of curiosity, is there some good reason why we don't have a >> branch prediction macro? >> >>> For every tight load a; cmpxchg(expect: a, addr: &x, new_val: b); >> loop, >> I feel a bit uneasy not telling the compiler that this is pretty likely to >> succeed, and relying on its guessing. >> >>> >> >>> Has it been excluded because it's considered not nice or perhaps it >> was >> simply never introduced because nobody found it useful?
>> >>> >> >>> Could have some define like this for GCC, which for other compilers >> reduces to nothing: >> >>> >> >>> #define VM_EXPECT_TRUE(A) __builtin_expect((A), true) >> >>> #define VM_EXPECT_FALSE(A) __builtin_expect((A), false) >> >>> >> >>> It might not lead to drastic performance improvements, but it feels >> weird not to tell the compiler what we know and keep secrets from it. And >> I >> think it's also nice for documentation purposes that people reading it >> also >> understand that this expression is gonna be true most of the time, and >> deal >> with it accordingly. >> >>> >> >>> /Erik >> > From aph at redhat.com Thu Nov 20 15:04:37 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 20 Nov 2014 15:04:37 +0000 Subject: Branch Prediction? In-Reply-To: References: Message-ID: <546E0305.3020502@redhat.com> On 11/20/2014 02:28 PM, Paul Hohensee wrote: > If it were me (and it's not :) ), I'd want to see some generated code > examples that would benefit. Perhaps quantify the benefit by getting a > libjvm profile, look at the code for the top 10 or 20 methods, reorganize > it via a proof-of-concept implementation and measure the result. I'd also > take into consideration that future hardware improvements (e.g., larger > icache line size) might negate any current improvement, so it'd have to be > a 'big' benefit. Again if it were me (and it's not :) ), it would take > maybe a per-instance 10% improvement before I'd accept the intellectual > overhead cost of a permanent implementation. I worry a lot about long term > source code maintainability, hence I have a sort of automatic skepticism > about features that complicate it. :) I agree, but: The problem with the "benchmark and see" approach is that a fast system is made up of many very small optimizations, each one of which may be lost in the noise. Andrew.
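A minimal sketch of the hint macros floated in this thread, applied to the load/cmpxchg retry loop Erik describes. It assumes GCC or Clang for `__builtin_expect` and expands to the bare expression elsewhere. `atomic_add` and the use of `std::atomic` are illustrative stand-ins, not HotSpot code, and the `!!(A)` normalization is a small variation on the macros as originally posted.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical hint macros following the proposal in this thread.
// On GCC/Clang they forward to __builtin_expect; elsewhere they are no-ops.
#if defined(__GNUC__) || defined(__clang__)
#define VM_EXPECT_TRUE(A)  __builtin_expect(!!(A), 1)
#define VM_EXPECT_FALSE(A) __builtin_expect(!!(A), 0)
#else
#define VM_EXPECT_TRUE(A)  (A)
#define VM_EXPECT_FALSE(A) (A)
#endif

// Illustrative helper (not a HotSpot function): atomically adds 'delta' to
// '*addr' with a load / compare-exchange retry loop and returns the new value.
inline int atomic_add(std::atomic<int>* addr, int delta) {
  int expected = addr->load(std::memory_order_relaxed);
  // The exchange is expected to succeed, so a failed attempt is marked as the
  // unlikely case. On failure compare_exchange_weak refreshes 'expected'.
  while (VM_EXPECT_FALSE(!addr->compare_exchange_weak(expected, expected + delta))) {
    // Cold path: contention or a spurious failure; retry with the new value.
  }
  return expected + delta;
}
```

Whether the hint changes the generated code at all depends on the compiler and optimization level, which is why the thread asks for measurements before adopting anything like this.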
From mikael.gerdin at oracle.com Thu Nov 20 15:38:32 2014 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Thu, 20 Nov 2014 16:38:32 +0100 Subject: Branch Prediction? In-Reply-To: <546E0305.3020502@redhat.com> References: <546E0305.3020502@redhat.com> Message-ID: <546E0AF8.2020607@oracle.com> On 2014-11-20 16:04, Andrew Haley wrote: > On 11/20/2014 02:28 PM, Paul Hohensee wrote: >> If it were me (and it's not :) ), I'd want to see some generated code >> examples that would benefit. Perhaps quantify the benefit by getting a >> libjvm profile, look at the code for the top 10 or 20 methods, reorganize >> it via a proof-of-concept implementation and measure the result. I'd also >> take into consideration that future hardware improvements (e.g., larger >> icache line size) might negate any current improvement, so it'd have to be >> a 'big' benefit. Again if it were me (and it's not :) ), it would take >> maybe a per-instance 10% improvement before I'd accept the intellectual >> overhead cost of a permanent implementation. I worry a lot about long term >> source code maintainability, hence I have a sort of automatic skepticism >> about features that complicate it. :) > > I agree, but: > > The problem with the "benchmark and see" approach is that a fast > system is made up of many very small optimizations, each one of which > may be lost in the noise. +1 I, personally, would be fine with adding these kinds of hints to the hot paths of the scavenging collectors, for example. But someone needs to do the actual performance measurements to show that it at least gives something. /Mikael > > Andrew. > From vitalyd at gmail.com Thu Nov 20 15:43:15 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 20 Nov 2014 10:43:15 -0500 Subject: Branch Prediction?
In-Reply-To: <546E0AF8.2020607@oracle.com> References: <546E0305.3020502@redhat.com> <546E0AF8.2020607@oracle.com> Message-ID: I'm guessing the "problem" will be tagging enough of the existing code to make a dent in performance benchmarks. Is it feasible to, say, modify the scavenging collector with places you guys know are hot/cold, and then run one of the standard GC perf benchmarks that you have (or a spec bench that's used as a performance proxy for GC testing)? Also, as Erik mentioned in the initial post, there's some nice documentation aspect to these macros in that it tells the reader how control usually flows through the functions. At the end, it's just like comments -- if they're not maintained, they can cause more harm than good. But the likely/unlikely macros are frequently used in performance sensitive projects that I've seen (just empirically). On Thu, Nov 20, 2014 at 10:38 AM, Mikael Gerdin wrote: > > On 2014-11-20 16:04, Andrew Haley wrote: > >> On 11/20/2014 02:28 PM, Paul Hohensee wrote: >> >>> If it were me (and it's not :) ), I'd want to see some generated code >>> examples that would benefit. Perhaps quantify the benefit by getting a >>> libjvm profile, look at the code for the top 10 or 20 methods, reorganize >>> it via a proof-of-concept implementation and measure the result. I'd >>> also >>> take into consideration that future hardware improvements (e.g., larger >>> icache line size) might negate any current improvement, so it'd have to >>> be >>> a 'big' benefit. Again if it were me (and it's not :) ), it would take >>> maybe a per-instance 10% improvement before I'd accept the intellectual >>> overhead cost of a permanent implementation. I worry a lot about long >>> term >>> source code maintainability, hence I have a sort of automatic skepticism >>> about features that complicate it.
:) >>> >> >> I agree, but: >> >> The problem with the "benchmark and see" approach is that a fast >> system is made up of many very small optimizations, each one of which >> may be lost in the noise. >> > > +1 > > I, personally, would be fine with adding these kinds of hints to the hot > paths of the scavenging collectors, for example. > But someone needs to do the actual performance measurements to show that > it at least gives something. > > /Mikael > > >> Andrew. >> >> From vladimir.kozlov at oracle.com Thu Nov 20 18:05:41 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 20 Nov 2014 10:05:41 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546DF9D8.3090505@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> Message-ID: <546E2D75.8080900@oracle.com> I based the name on your comment: + // AArch64 uses store release (which does everything we need to keep + // the machine in order) but we still need a compiler barrier here. You can name it as you like. Our main suggestion is to use such a Boolean constant and normal if() statements instead of ifdef AARCH64 and AARCH64_ONLY/NOT_AARCH64 macros in C2 code (src/share/vm/opto/* files). We already do similar things for the PPC64 port which sets support_IRIW_for_* constant.
thanks, Vladimir On 11/20/14 6:25 AM, Andrew Haley wrote: > On 11/19/2014 03:45 AM, Dean Long wrote: >> On 11/18/2014 7:03 PM, Vladimir Kozlov wrote: >>> Yes, we can hide AARCH64 using something similar to >>> CODE_CACHE_SIZE_LIMIT macro which could be overwritten in platform >>> specific files if needed: USE_STORE_RELEASE_FOR_VOLATILE. >>> Or slightly more complicated declaration similar to >>> support_IRIW_for_not_multiple_copy_atomic_cpu boolean constant. >>> >>> Dean, will it help us if we do that? If yes, then we should do that. >>> >> Yes, this will help us. Following the boolean constant example, we >> would have something like: >> >> #ifdef USE_STORE_RELEASE_FOR_VOLATILE >> const bool use_store_release_for_volatile = true; >> #else >> const bool use_store_release_for_volatile = false; >> #endif > > The problem with this is that the AArch64 stlr instruction isn't > exactly store release, so this #define is rather misleading. Other > processors have store release instructions, but none of them do quite > what the AArch64 does. > > But really I just want this patch to move forward, so I'll agree to > anything which is not ridiculous. > > Andrew.
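A rough sketch of the suggestion above: a platform header defines a macro, shared code turns it into a `const bool`, and call sites use an ordinary `if` rather than `#ifdef AARCH64`. The macro and constant names are taken from the thread (and may still be renamed); `emit_volatile_store` and the operation names it returns are hypothetical placeholders, not C2 APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A port would define USE_STORE_RELEASE_FOR_VOLATILE in its platform-specific
// header; shared code only ever looks at the resulting constant.
#ifdef USE_STORE_RELEASE_FOR_VOLATILE
const bool use_store_release_for_volatile = true;
#else
const bool use_store_release_for_volatile = false;
#endif

// Hypothetical stand-in for shared code that plans a volatile store.
// The returned strings are placeholder operation names for illustration only.
inline std::vector<std::string> emit_volatile_store(bool store_release) {
  std::vector<std::string> ops;
  if (store_release) {
    // Platforms whose store instruction already provides the needed ordering.
    ops.push_back("store_release");
  } else {
    // Other platforms bracket a plain store with explicit barriers.
    ops.push_back("barrier_before");
    ops.push_back("plain_store");
    ops.push_back("barrier_after");
  }
  return ops;
}
```

A call site would then read `emit_volatile_store(use_store_release_for_volatile)`. Because the condition is a compile-time constant, the compiler can still drop the unused branch, while both branches keep being parsed and type-checked on every platform.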
> From aph at redhat.com Thu Nov 20 18:13:54 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 20 Nov 2014 18:13:54 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546E2D75.8080900@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> <546E2D75.8080900@oracle.com> Message-ID: <546E2F62.4030104@redhat.com> On 11/20/2014 06:05 PM, Vladimir Kozlov wrote: > I based the name on your comment: > > + // AArch64 uses store release (which does everything we need to keep > + // the machine in order) but we still need a compiler barrier here. Ah. Okay, I'll have to think of a good name for it, then. > You can name it as you like. Our main suggestion is to use such Boolean > constant and normal if() statements instead of ifdef AARCH64 and > AARCH64_ONLY/NOT_AARCH64 macros in C2 code (src/share/vm/opto/* files). > > We already do similar things for PPC64 port which sets > support_IRIW_for_* constant. Okay, Andrew. From max.ockner at oracle.com Thu Nov 20 19:23:57 2014 From: max.ockner at oracle.com (Max Ockner) Date: Thu, 20 Nov 2014 14:23:57 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546C04DC.1010606@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C04DC.1010606@oracle.com> Message-ID: <546E3FCD.7020808@oracle.com> David, Thomas beat me to it. os::free does not ever use the memtracking argument that was being passed to it. 
This does not break memory tracking though, because this memory tracking parameter is supplied and saved in a malloc header during memory allocation. os::free accesses and uses this header through Memtracker::record_free(memblock). Let me know if anything else stands out to you. I would love to have the David Holmes stamp of approval for this fix. Other notes: I have fixed the copyrights and made a new webrev, but it hasn't been copied to openjdk yet. Shouldn't be an issue because there isn't really anything new to see. Either way, sorry about that. Updated Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8060074.1/ Thanks, Max Ockner On 11/18/2014 9:47 PM, David Holmes wrote: > Hi Max, > > So I would have assumed memflags were being passed to all the "free" > routines for NMT purposes. Otherwise how does NMT track this? > > Thanks, > David > > On 19/11/2014 7:53 AM, Max Ockner wrote: >> Hello all, >> Please review this minor cleanup: >> >> Bug ID: 8060074 >> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >> >> Summary: >> (1) os::free takes two arguments, but never uses the second argument, >> which is a MEMFLAG. I have removed this argument from every os::free >> call. >> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >> also takes a MEMFLAG argument, which is only used to call os::free. Now >> an unused argument, it has been removed from all FreeHeap calls. No >> other methods which directly call os::free() have this problem. >> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >> takes A MEMFLAG argument which is passed to FreeHeap, and nothing else. >> This argument is now unused, and has been removed. No other methods >> which call FreeHeap have this problem. >> >> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >> cleanup. I have also removed the extra argument from the definitions of >> the above methods. 
>> >> Tests: jtreg hotspot tests with >> -vmoption:"-XX:NativeMemoryTracking=detail" >> >> Thanks for your help, >> Max Ockner From max.ockner at oracle.com Thu Nov 20 21:56:30 2014 From: max.ockner at oracle.com (Max Ockner) Date: Thu, 20 Nov 2014 16:56:30 -0500 Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx In-Reply-To: <54619729.30704@oracle.com> References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> <545CF033.4010503@oracle.com> <54619729.30704@oracle.com> Message-ID: <546E638E.2020208@oracle.com> Hello again, I have made changes based on all comments. There is now a pair of assert statements in the Monitor/Mutex wait() methods. When I reran tests, I caught another lock that I had to change to "sometimes", but the assert that caught this lock was not in wait. There are currently no locks that try to pass an incorrect safepoint check argument to wait(). Instead, gcbasher did not catch this lock last time, when the only asserts were in lock and lock_without_safepoint. This lock is "CMS_markBitMap_lock" in share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp. I'm guessing that it was not caught by the tests because this section of code is not reached very often. My initial inspection failed to catch this lock because it is passed around between various structures and renamed 4 times. I have not yet found a good way to check for this situation, even with a debugger. Are there any tests which actually manage to hit every line of code? How should I handle this situation where I can't rely on the tests that I have run to tell me if I have missed something? At what point can I assume that everything is OK? Thanks, Max Ockner On 11/10/2014 11:57 PM, David Holmes wrote: > Hi Max, > > On 8/11/2014 2:15 AM, Max Ockner wrote: >> Hello all, >> I have made these additional changes: >> -Moved the assert() statements into the lock and lock_without_safepoint >> methods.
>> -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired >> -Changed the Monitor::SafepointCheckRequired values for several locks >> which were locked outside of a MutexLockerEx (some were locked with >> MutexLocker, some were locked without any MutexLocker* ) >> >> New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ > > Generally this is all okay - a few style and other nits below. > > However you missed adding an assert in Monitor::wait to check if the > no_safepoint_check flag was being used correctly for the current monitor. > > Specific comments: > > src/share/vm/runtime/mutex.hpp > > This comment is no longer accurate with the moved check location: > > + // MutexLockerEx checks these flags when acquiring a lock > + // to ensure consistent checking for each lock. > > The same goes for other references to MutexLockerEx in the enum > description. > > Also copyright year needs updating. > > --- > > src/share/vm/runtime/mutex.cpp > > 898 //Ensure > 961 //Ensure > > Space needed after // > > --- > > src/share/vm/runtime/mutexLocker.cpp > > + var = new type(Mutex::pri, #var, vm_block,safepoint_check_allowed); \ > > space needed after comma in k,s > > --- > > src/share/vm/runtime/mutexLocker.hpp > > Whitespace only changes - looks like leftovers from removed edits.
> > > > Thanks, > David > > >> Additional testing: >> jtreg ./jdk/test/java/lang/invoke >> jtreg jfr tests >> >> Here is a list of ALL of the "sometimes" locks: >> >> "WorkGroup monitor" share/vm/utilities/workgroup.cpp >> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp >> "CompactibleFreeListSpace._lock" >> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >> >> "freelist par lock" >> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >> >> "SR_lock" share/vm/runtime/thread.cpp >> >> The remaining "sometimes" locks can be found in >> share/vm/runtime/mutexLocker.cpp: >> >> ParGCRareEvent_lock >> Safepoint_lock >> Threads_lock >> VMOperationQueue_lock >> VMOperationRequest_lock >> Terminator_lock >> Heap_lock >> Compile_lock >> PeriodicTask_lock >> JfrStacktrace_lock >> >> I have not checked the validity of the "sometimes" locks, and I believe >> that this should be a different project. >> >> Thanks for your help! >> Max Ockner >> On 10/15/2014 8:54 PM, David Holmes wrote: >>> Hi Max, >>> >>> This is looking good. >>> >>> A few high-level initial comments: >>> >>> I think SafepointAllowed should be SafepointCheckNeeded >>> >>> Why are the checks in MutexLocker when the state is maintained in the >>> mutex itself and the mutex/monitor has lock_without_safepoint, and >>> wait() ? I would have expected to see the >>> check in the mutex/monitor methods. >>> >>> Checking consistent usage of the _no_safepoint_check_flag is good. But >>> another part of this is that a monitor/mutex that never checks for >>> safepoints should never be held when a thread blocks at a safepoint - >>> is there some way to easily check that? I was surprised how many locks >>> are actually not checking for safepoints. >>> >>> Did you find any cases where the mutex/monitor was being used >>> inconsistently and incorrectly? >>> >>> Did you analyse the "sometimes" cases to see if they were safe? 
>>> (Aside: just for fun check out what happens if you lock the >>> Threads_lock with a safepoint check and a safepoint has been requested >>> :) ). >>> >>> Cheers, >>> David >>> >>> On 16/10/2014 4:04 AM, Max Ockner wrote: >>>> Hi all, >>>> >>>> I am a new member of the Hotspot runtime team in Burlington, MA. >>>> Please review my first fix related to safepoint checking. >>>> >>>> Summary: MutexLockerEx can either acquire a lock with or without a >>>> safepoint check. >>>> In some cases, a particular lock must either safepoint check always or >>>> never to avoid deadlocking. >>>> Some other locks have semantics which allow them to avoid deadlocks >>>> despite having a safepoint check only some of the time. >>>> All locks that are OK having inconsistent safepoint checks have been >>>> marked. All locks that should never safepoint check and all locks that >>>> should always safepoint check have also been marked. >>>> When a MutexLockerEx acquires a lock with or without a safepoint >>>> check, >>>> the lock's safepointAllowed marker is checked to ensure consistent >>>> safepoint checking. >>>> >>>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290 >>>> >>>> Tested with: >>>> jprt "-testset hotspot" >>>> jtreg hotspot >>>> vm.quick.testlist >>>> >>>> Whitebox tests: >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test >>>> expects Assert ("This lock should always have a safepoint check") >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test >>>> expects Assert ("This lock should never have a safepoint check") >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code >>>> should not assert. (Lock is properly acquired with no safepoint check) >>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code >>>> should not assert. 
(Lock is properly acquired with safepoint check) >>>> >>>> Thanks, >>>> Max >>>> >> From tobias.hartmann at oracle.com Fri Nov 21 08:37:00 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 21 Nov 2014 09:37:00 +0100 Subject: [8u40] Backport RFR: 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer Message-ID: <546EF9AC.9060109@oracle.com> Hi, please review the following backport request for 8u40. 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer https://bugs.openjdk.java.net/browse/JDK-8050079 http://cr.openjdk.java.net/~thartmann/8050079/webrev.03/ http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0bdada928884 The fix does not apply cleanly to 8u40 because of the changes to test/TEST.groups. I removed them: http://cr.openjdk.java.net/~thartmann/8050079_8u/webrev.00/ The changes were pushed into 9 on Thursday. I'll wait for the nightlies before pushing into 8u40. Thanks, Tobias From goetz.lindenmaier at sap.com Fri Nov 21 13:31:55 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 21 Nov 2014 13:31:55 +0000 Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> <5466A656.40707@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF29936@DEWDFEMB12A.global.corp.sap> Hi, I prepared a new webrev trying to cover all the issues mentioned below. http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.01/ I moved functionality from os.cpp and universe.cpp into ReservedHeapSpace::initialize_compressed_heap(). This class offers to save _base and _special, which I would have to reimplement if I had improved the methods I had added to os.cpp to also allocate large page heaps. 
Anyways, I think this class is the right place to gather code trying to reserve the heap. Also, I get along without setting the shift, base, implicit_null_check etc. fields of Universe, so there is no unnecessary calling back and forth between the two classes. Universe gets the heap back, and then sets the properties it needs to configure the compressed oops. All code handling the noaccess prefix is in a single method, too. Best regards, Goetz. Btw, I had to workaround a SS12u1 problem: it wouldn't compile char * x = (char*)UnscaledOopHeapMax - size in 32-bit mode. -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz Sent: Montag, 17. November 2014 09:33 To: 'Vladimir Kozlov'; 'hotspot-dev at openjdk.java.net' Subject: RE: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. Hi Vladimir, > It is very significant rewriting and it takes time to evaluate it. Yes, I know ... and I don't want to push, but nevertheless a ping can be useful sometimes. Thanks a lot for looking at it. > And I would not say it is simpler then before :) If I fix what you propose it's gonna get even more simple ;) > These is what I found so far. > The idea to try to allocate in a range instead of just below > UnscaledOopHeapMax or OopEncodingHeapMax is good. So I would ask to do > several attempts (3?) on non_PPC64 platforms too. Set to 3. > It is matter of preference but I am not comfortable with switch in loop. > For me sequential 'if (addr == 0)' checks is simpler. I'll fix this. > One thing worries me that you release found space and try to get it > again with ReservedHeapSpace. Is it possible to add new > ReservedHeapSpace ctor which simple use already allocated space? This was to keep diff's small, but I also think a new constructor is good. I'll fix this. 
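The "try several attempts in a range" idea discussed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the webrev: `ReserveFn`, `reserve_in_range`, and the parameter names are hypothetical, and the OS reservation call is replaced by a callback so the sketch runs anywhere. It assumes that an address is usable when the whole block ends at or below a given limit (for example UnscaledOopHeapMax or OopEncodingHeapMax).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for an OS reservation call: given a requested start
// address, it returns the address actually reserved, or 0 on failure.
using ReserveFn =
    std::function<std::uintptr_t(std::uintptr_t requested, std::size_t size)>;

// Try up to `attempts` evenly spaced start addresses in [lo, hi], walking
// from high to low, and return the first reservation that ends at or below
// `limit`. Returns 0 if no attempt produced a usable block.
std::uintptr_t reserve_in_range(std::uintptr_t lo, std::uintptr_t hi,
                                std::size_t size, std::uintptr_t limit,
                                int attempts, const ReserveFn& reserve) {
  assert(lo <= hi && attempts > 0);
  const std::uintptr_t step =
      (attempts > 1) ? (hi - lo) / static_cast<std::uintptr_t>(attempts - 1) : 0;
  for (int i = 0; i < attempts; ++i) {
    const std::uintptr_t want = hi - static_cast<std::uintptr_t>(i) * step;
    const std::uintptr_t got = reserve(want, size);
    if (got != 0 && got + size <= limit) {
      return got;  // usable block found
    }
    // A real implementation would release `got` here before retrying.
  }
  return 0;
}
```

With three attempts over the range 0x1000..0x3000 the sketch requests 0x3000, 0x2000 and 0x1000 in that order, so a higher address is preferred when the callback grants it.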
> The next code in ReservedHeapSpace() is hard to understand (): >(UseCompressedOops && (requested_address == NULL || requested_address+size > (char*)OopEncodingHeapMax) ? > may be move all this into noaccess_prefix_size() and add comments. I have to redo this anyways if I make new constructors. > Why you need prefix when requested_address == NULL? If we allocate with NULL, we most probably will get a heap where base != NULL and thus need a noaccess prefix. > Remove next comment in universe.cpp: > // SAPJVM GL 2014-09-22 Removed. > Again you will release space so why bother to include space for classes?: >+ // For small heaps, save some space for compressed class pointer >+ // space so it can be decoded with no base. This was done like this before. We must assure the upper bound of the heap is low enough that the compressed class space still fits in there. virtualspace.cpp > With new code size+noaccess_prefix could be requested. But later it is > not used if WIN64_ONLY(&& UseLargePages) and you will have empty > non-protected page below heap. There's several points to this: * Also if not protectable, the heap base has to be below the real start of the heap. Else the first object in the heap will be compressed to 'null' and decompression will fail. * If we don't reserve the memory other stuff can end up in this space. On errors, if would be quite unexpected to find memory there. * To get a heap for the new disjoint mode I must control the size of this. Requesting a heap starting at (aligned base + prefix) is more likely to fail. * The size for the prefix must anyways be considered when deciding whether the heap is small enough to run with compressed oops. So distinguishing the case where we really can omit this would require quite some additional checks everywhere, and I thought it's not worth it. matcher.hpp > Universe::narrow_oop_use_implicit_null_checks() should be true for such > case too. So you can add new condition with || to existing ones. 
The > only condition you relax is base != NULL. Right? Yes, that's how it's intended. arguments.* files > Why you need PropertyList_add changes. Oh, the code using it got lost. I commented on this in the description in the webrev. "To more efficiently run expensive tests in various compressed oop modes, we set a property with the mode the VM is running in. So far it's called "com.sap.vm.test.compressedOopsMode" better suggestions are welcome (and necessary I guess). Our long running tests that are supposed to run in a dedicated compressed oop mode check this property and abort themselves if it's not the expected mode." When I know about the heap I do Arguments::PropertyList_add(new SystemProperty("com.sap.vm.test.compressedOopsMode", narrow_oop_mode_to_string(narrow_oop_mode()), false)); in universe.cpp. On some OSes it's deterministic which modes work, there we don't start such tests. Others, as you mentioned OSX, are very indeterministic. Here we save testruntime with this. But it's not that important. We can still parse the PrintCompresseOopsMode output after the test and discard the run. > Do you have platform specific changes? Yes, for ppc and aix. I'll submit them once this is in. >From your other mail: > One more thing. You should allow an allocation in the range when returned from OS allocated address does not match > requested address. We had such cases on OSX, for example, when OS allocates at different address but still inside range. Good point. I'll fix that in os::attempt_reserve_memory_in_range. I'll ping again once a new webrev is done! Best regards, Goetz. On 11/10/14 6:57 AM, Lindenmaier, Goetz wrote: > Hi, > > I need to improve a row of things around compressed oops heap handling > to achieve good performance on ppc. > I prepared a first webrev for review: > http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ > > A detailed technical description of the change is in the webrev and according bug. 
>
> If requested, I will split the change into parts with more respective less impact on
> non-ppc platforms.
>
> The change is derived from well-tested code in our VM. Originally it was
> crafted to require the least changes of VM coding, I changed it to be better
> streamlined with the VM.
> I tested this change to deliver heaps at about the same addresses as before.
> Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap
> in a better compressed oops mode is found, though.
> I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*.
>
> Best regards,
> Goetz.
>
>

From erik.helin at oracle.com Fri Nov 21 15:30:12 2014
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 21 Nov 2014 16:30:12 +0100
Subject: RFR: 8065656: Use DWARF debug symbols for Solaris
Message-ID: <20141121153011.GA12067@ehelin-desktop>

Hi all,

this patch changes the debug symbols format on Solaris from STABS
[0] to DWARF [1] for libjvm.so. Since the supported compiler on Solaris
has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format
is now deprecated in the supported compiler [3]:

  -xdebugformat=stabs generates debugging information
  using the stabs standard format. The stabs format is no
  longer supported.

Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]:

  The -xdebugformat=stabs for all compilers might be removed in a future
  release. The only debugger format option will be -xdebugformat=dwarf,
  which is currently the default.

So, it seems to be a good time to change the debug format to DWARF when
compiling with Oracle Solaris Studio. I also changed the debug format for
GCC on Solaris to be DWARF, since the STABS support in GCC is in
maintenance mode [5].

More reasons for using DWARF instead of STABS are:
- Better support by Oracle Studio Performance Analyzer (the performance
  team have requested that we use DWARF v2 or later instead of STABS).
- DWARF provides a better debugging experience for C++ compared to STABS. The one drawback of using DWARF compared to STABS is that the size of the debuginfo increases. For a SPARC fastdebug build the size of libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB. To summarize, we need to change from STABS to DWARF because STABS is deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the release notes). I would suggest to change sooner rather than later, given that the change to DWARF also brings Oracle Studio Performance Analyzer support as well as a better C++ debugging experience in dbx. Webrev: http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8065656 Testing: - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC and Solaris 11.1 on x86-64 using JPRT. - Verified that DWARF v2 symbols are produced with objdump. Thanks, Erik [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html [1]: http://www.dwarfstd.org/ [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From magnus.ihse.bursie at oracle.com Fri Nov 21 15:35:52 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 21 Nov 2014 16:35:52 +0100 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> <5460CDF1.8050205@oracle.com> <546BF822.7010702@oracle.com> Message-ID: <546F5BD8.9010405@oracle.com> On 2014-11-19 09:23, Volker Simonis wrote: > On Wed, Nov 19, 2014 at 2:53 AM, David Holmes wrote: >> On 18/11/2014 8:03 PM, Volker Simonis wrote: >>> You're right - it works! 
>>> I've just pushed my first AIX-only change to hotspot-rt! >> >> Congratulations! >> >> Unfortunately it caused us a problem as now the repos can change whilst a >> job is going through JPRT - this requires a new merge due to multiple heads >> and so triggered a failure. But Mikael is working on it :) > It's astonishing how even the smallest changes can introduce > non-foreseeable problems. > > Hopefully Mikael will be able to fix the problem somehow (if he didn't > had too much of the champagne already :) Actually, having non-atomic commits *is* a foreseeable problem. ;-) Which is what you get when you just throw a bunch of unrelated mercurials together without any synchronization in-between them. In fact, I'm surprised this kind of thing does not happen any more often, but then again, a lot of our procedures have probably evolved partially as a response to this constraint. I'd like to see us moving to using atomic commits for the whole project. Unfortunately, there is no simple solution to this even for the open source code only, and it gets worse when adding the closed components. /Magnus From magnus.ihse.bursie at oracle.com Fri Nov 21 15:46:38 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 21 Nov 2014 16:46:38 +0100 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546CD834.50004@redhat.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> Message-ID: <546F5E5E.3030309@oracle.com> On 2014-11-19 18:49, Andrew Haley wrote: > I think this covers everything that reviewers have mentioned: > > http://cr.openjdk.java.net/~aph/aarch64-8064357-4/ > > Andrew. > Almost there! :-) 1) Comment in config.sub identifies it as config.guess. 
2) As I understand it, your first attempt is to just dispatch through
to autoconf-config.sub if there are no aarch64 arguments. Good! However,
the code could be made clearer: The dot is small and some comment
clarifying that we will exit after this line might be helpful to the
reader. Also, you're using a $sub_args which is not defined. This will
not break, but it makes the reader confused. (I suspect copy/paste glitch.)

3) The webrev indicates that the original config.sub was renamed to
config.sub.orig, instead of autoconf-config.sub. But since that is how
it is called in the new config.sub wrapper, the patch is unlikely to
work. When you test this, make sure you have no lingering files in your
workspace that can mess up the results. (e.g. no "?" files in hg status).

/Magnus

From aph at redhat.com Fri Nov 21 16:01:50 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 21 Nov 2014 16:01:50 +0000
Subject: RFR: AARCH64: 8064357: Top-level JDK changes
In-Reply-To: <546F5E5E.3030309@oracle.com>
References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com>
Message-ID: <546F61EE.7050609@redhat.com>

On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote:
>
> Almost there! :-)
>
> 1) Comment in config.sub identifies it as config.guess.
>
> 2) As I understand it, your first attempt is to just dispatch through
> to autoconf-config.sub if there are no aarch64 arguments. Good! However,
> the code could be made clearer: The dot is small and some comment
> clarifying that we will exit after this line might be helpful to the
> reader.

I could just add exit $? after the call to autoconf-config.sub.
That way it does not matter whether autoconf-config.sub exits or not, and the reader can tell that we don't go any further. > Also, you're using a $sub_args which is not defined. This will > not break, but it makes the reader confused. (I suspect copy/paste glitch.) Oh, bah. Right you are. > 3) The webrev indicates that the original config.sub was renamed to > config.sub.orig, instead of autoconf-config.sub. How strange. I'll fix that. > But since that is how > it is called in the new config.sub wrapper, the patch is unlikely to > work. When you test this, make sure you have no lingering files in your > workspace that can mess up the results. (e.g. no "?" files in hg status). OK. Andrew. From vladimir.kozlov at oracle.com Fri Nov 21 16:07:20 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 21 Nov 2014 08:07:20 -0800 Subject: [8u40] Backport RFR: 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer In-Reply-To: <546EF9AC.9060109@oracle.com> References: <546EF9AC.9060109@oracle.com> Message-ID: <546F6338.7040702@oracle.com> Good. Thanks, Vladimir On 11/21/14 12:37 AM, Tobias Hartmann wrote: > Hi, > > please review the following backport request for 8u40. > > 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer > https://bugs.openjdk.java.net/browse/JDK-8050079 > http://cr.openjdk.java.net/~thartmann/8050079/webrev.03/ > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0bdada928884 > > The fix does not apply cleanly to 8u40 because of the changes to > test/TEST.groups. I removed them: > > http://cr.openjdk.java.net/~thartmann/8050079_8u/webrev.00/ > > The changes were pushed into 9 on Thursday. I'll wait for the nightlies before > pushing into 8u40. > > Thanks, > Tobias > From daniel.daugherty at oracle.com Fri Nov 21 16:14:06 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty)
Date: Fri, 21 Nov 2014 09:14:06 -0700
Subject: RFR: 8065656: Use DWARF debug symbols for Solaris
In-Reply-To: <20141121153011.GA12067@ehelin-desktop>
References: <20141121153011.GA12067@ehelin-desktop>
Message-ID: <546F64CE.8090100@oracle.com>

> http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/

make/solaris/makefiles/gcc.make
    No comments.

make/solaris/makefiles/sparcWorks.make
    No comments.

Thumbs up!

Dan


On 11/21/14 8:30 AM, Erik Helin wrote:
> Hi all,
>
> this patch changes the debug symbols format on Solaris from STABS
> [0] to DWARF [1] for libjvm.so. Since the supported compiler on Solaris
> has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format
> is now deprecated in the supported compiler [3]:
>
> -xdebugformat=stabs generates debugging information
> using the stabs standard format. The stabs format is no
> longer supported.
>
> Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]:
>
> The -xdebugformat=stabs for all compilers might be removed in a future
> release. The only debugger format option will be -xdebugformat=dwarf,
> which is currently the default.
>
> So, it seems to be a good time to change the debug format to DWARF when
> compiling with Oracle Solaris Studio. I also changed the debug format for
> GCC on Solaris to be DWARF, since the STABS support in GCC is in
> maintenance mode [5].
>
> More reasons for using DWARF instead of STABS are:
> - Better support by Oracle Studio Performance Analyzer (the performance
> team have requested that we use DWARF v2 or later instead of STABS).
> - DWARF provides a better debugging experience for C++ compared to STABS.
>
> The one drawback of using DWARF compared to STABS is that the size of the
> debuginfo increases. For a SPARC fastdebug build the size of
> libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB.
> > To summarize, we need to change from STABS to DWARF because STABS is > deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the > release notes). I would suggest to change sooner rather than later, given > that the change to DWARF also brings Oracle Studio Performance Analyzer > support as well as a better C++ debugging experience in dbx. > > Webrev: > http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/ > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8065656 > > Testing: > - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC > and Solaris 11.1 on x86-64 using JPRT. > - Verified that DWARF v2 symbols are produced with objdump. > > Thanks, > Erik > > [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html > [1]: http://www.dwarfstd.org/ > [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html > [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html > [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html > [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From charlie.hunt at oracle.com Fri Nov 21 16:30:18 2014 From: charlie.hunt at oracle.com (charlie hunt) Date: Fri, 21 Nov 2014 10:30:18 -0600 Subject: RFR: 8065656: Use DWARF debug symbols for Solaris In-Reply-To: <20141121153011.GA12067@ehelin-desktop> References: <20141121153011.GA12067@ehelin-desktop> Message-ID: <5C3C7BAF-ACF0-421F-A20B-F75022AF84D0@oracle.com> Looks good, though I am not an official (R)eviewer. Charlie > On Nov 21, 2014, at 9:30 AM, Erik Helin wrote: > > Hi all, > > this patch changes the debug symbols format on Solaris from STABS > [0] to DWARF [1] for libjvm.so. Since the supported compiler on Solaris > has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format > is now deprecated in the supported compiler [3]: > > -xdebugformat=stabs generates debugging information > using the stabs standard format. The stabs format is no > longer supported. 
>
> Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]:
>
> The -xdebugformat=stabs for all compilers might be removed in a future
> release. The only debugger format option will be -xdebugformat=dwarf,
> which is currently the default.
>
> So, it seems to be a good time to change the debug format to DWARF when
> compiling with Oracle Solaris Studio. I also changed the debug format for
> GCC on Solaris to be DWARF, since the STABS support in GCC is in
> maintenance mode [5].
>
> More reasons for using DWARF instead of STABS are:
> - Better support by Oracle Studio Performance Analyzer (the performance
> team have requested that we use DWARF v2 or later instead of STABS).
> - DWARF provides a better debugging experience for C++ compared to STABS.
>
> The one drawback of using DWARF compared to STABS is that the size of the
> debuginfo increases. For a SPARC fastdebug build the size of
> libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB.
>
> To summarize, we need to change from STABS to DWARF because STABS is
> deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the
> release notes). I would suggest to change sooner rather than later, given
> that the change to DWARF also brings Oracle Studio Performance Analyzer
> support as well as a better C++ debugging experience in dbx.
>
> Webrev:
> http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8065656
>
> Testing:
> - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC
> and Solaris 11.1 on x86-64 using JPRT.
> - Verified that DWARF v2 symbols are produced with objdump.
> > Thanks, > Erik > > [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html > [1]: http://www.dwarfstd.org/ > [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html > [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html > [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html > [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From aph at redhat.com Fri Nov 21 17:02:09 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 21 Nov 2014 17:02:09 +0000 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F5E5E.3030309@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> Message-ID: <546F7011.50600@redhat.com> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: > > 1) Comment in config.sub identifies it as config.guess. > > 2) As as understand it, your first attempt is to just dispatch through > to autoconf-config.sub if there is no aarch64 arguments. Good! However, > the code could be made clearer: The dot is small and some comment > clarifying that we will exit after this line might be helpful to the > reader. Also, you're using a $sub_args which is not defined. This will > not break, but it makes the reader confused. (I suspect copy/paste glitch.) > > 3) The webrev indicates that the original config.sub was renamed to > config.sub.orig, instead of autoconf-config.sub. But since that is how > it is called in the new config.sub wrapper, the patch is unlikely to > work. When you test this, make sure you have no lingering files in your > workspace that can mess up the results. (e.g. no "?" files in hg status). http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ Thanks, Andrew. 
From aph at redhat.com Fri Nov 21 17:33:25 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 21 Nov 2014 17:33:25 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546E2F62.4030104@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> <546E2D75.8080900@oracle.com> <546E2F62.4030104@redhat.com> Message-ID: <546F7765.1070907@redhat.com> On 11/20/2014 06:13 PM, Andrew Haley wrote: > On 11/20/2014 06:05 PM, Vladimir Kozlov wrote: >> I based the name on your comment: >> >> + // AArch64 uses store release (which does everything we need to keep >> + // the machine in order) but we still need a compiler barrier here. > > Ah. Okay, I'll have to think of a good name for it, then. > >> You can name it as you like. Our main suggestion is to use such Boolean >> constant and normal if() statements instead of ifdef AARCH64 and >> AARCH64_ONLY/NOT_AARCH64 macros in C2 code (src/share/vm/opto/* files). >> >> We already do similar things for PPC64 port which sets >> support_IRIW_for_* constant. > > Okay, I've done something similar but more useful. I've added an experimental flag: UseBarriersForVolatile. This defaults to true for all targets, but we can override it in the back end. That gives me the chance to do some benchmarking on various AArch64 targets to see which ones benefit from the new load acquire/store release instructions. 

I have kept AARCH64_ONLY for two hunks:

--- old/src/share/vm/opto/memnode.hpp	2014-11-21 12:09:22.766963837 -0500
+++ new/src/share/vm/opto/memnode.hpp	2014-11-21 12:09:22.546983320 -0500
@@ -503,6 +503,10 @@
   // Conservatively release stores of object references in order to
   // ensure visibility of object initialization.
   static inline MemOrd release_if_reference(const BasicType t) {
+    // AArch64 doesn't need a release store here because object
+    // initialization contains the necessary barriers.
+    AARCH64_ONLY(return unordered);
+
     const MemOrd mo = (t == T_ARRAY ||
                        t == T_ADDRESS || // Might be the address of an object reference (`boxing').
                        t == T_OBJECT) ? release : unordered;

--- old/src/share/vm/opto/graphKit.cpp	2014-11-21 12:09:20.017207376 -0500
+++ new/src/share/vm/opto/graphKit.cpp	2014-11-21 12:09:19.787227745 -0500
@@ -3813,7 +3813,8 @@
 
   // Smash zero into card
   if( !UseConcMarkSweepGC ) {
-    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release);
+    __ store(__ ctrl(), card_adr, zero, bt, adr_type,
+             NOT_AARCH64(MemNode::release) AARCH64_ONLY(MemNode::unordered));
   } else {
     // Specialized path for CM store barrier
     __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type);

The first hunk is only required by IA64 as far as I am aware, but I
am nervous about making it IA64_ONLY. The second hunk is a release
node which is not as far as I am aware required by any target, and
should simply be removed.

This isn't a RFA because it's not tested yet, but what do you think?

Andrew.

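The review suggestion quoted in this message (a Boolean constant plus ordinary if() statements instead of AARCH64_ONLY/NOT_AARCH64) can be illustrated with a small standalone C++ sketch. This is not HotSpot code: `kCardMarkNeedsRelease`, `card_mark_order`, `Slot`, `store_and_mark` and `read_if_dirty` are hypothetical names, `std::memory_order_relaxed` merely stands in for MemNode::unordered, and the value of the constant is an assumption that a real port would set in its back end.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical per-platform constant. A port whose object initialization
// already provides the needed barriers would set this to false.
constexpr bool kCardMarkNeedsRelease = true;

// Ordinary if() in shared code instead of per-platform macros.
inline std::memory_order card_mark_order(bool needs_release) {
  if (needs_release) {
    return std::memory_order_release;
  }
  return std::memory_order_relaxed;  // stands in for MemNode::unordered
}

struct Slot {
  std::atomic<int>          field{0};    // stands in for the oop field
  std::atomic<std::uint8_t> card{0xff};  // 0xff = clean, 0 = dirty
};

// Writer: store the field, then "smash zero into card".
inline void store_and_mark(Slot& s, int value, bool needs_release) {
  s.field.store(value, std::memory_order_relaxed);
  s.card.store(0, card_mark_order(needs_release));
}

// Reader: if the dirty card is observed with an acquire load and the card
// mark was a release store, the preceding field store is visible as well.
inline bool read_if_dirty(const Slot& s, int& out) {
  if (s.card.load(std::memory_order_acquire) != 0) {
    return false;
  }
  out = s.field.load(std::memory_order_relaxed);
  return true;
}

// Small single-threaded usage check; call it from main() or a test.
inline void demo() {
  Slot s;
  store_and_mark(s, 42, kCardMarkNeedsRelease);
  int v = 0;
  const bool dirty = read_if_dirty(s, v);
  assert(dirty && v == 42);
  (void)dirty;
}
```

The point of the pattern is that shared code stays free of port names: each platform only supplies the value of the constant, and the choice between a release and an unordered card mark is made in one readable place.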
From magnus.ihse.bursie at oracle.com Fri Nov 21 18:03:58 2014 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 21 Nov 2014 19:03:58 +0100 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F7011.50600@redhat.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> Message-ID: <546F7E8E.2090405@oracle.com> On 2014-11-21 18:02, Andrew Haley wrote: > On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >> 1) Comment in config.sub identifies it as config.guess. >> >> 2) As as understand it, your first attempt is to just dispatch through >> to autoconf-config.sub if there is no aarch64 arguments. Good! However, >> the code could be made clearer: The dot is small and some comment >> clarifying that we will exit after this line might be helpful to the >> reader. Also, you're using a $sub_args which is not defined. This will >> not break, but it makes the reader confused. (I suspect copy/paste glitch.) >> >> 3) The webrev indicates that the original config.sub was renamed to >> config.sub.orig, instead of autoconf-config.sub. But since that is how >> it is called in the new config.sub wrapper, the patch is unlikely to >> work. When you test this, make sure you have no lingering files in your >> workspace that can mess up the results. (e.g. no "?" files in hg status). > http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ Looks good to me. Thanks for working through all these iterations! 
/Magnus From vladimir.kozlov at oracle.com Fri Nov 21 18:06:58 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 21 Nov 2014 10:06:58 -0800 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546F7765.1070907@redhat.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> <546E2D75.8080900@oracle.com> <546E2F62.4030104@redhat.com> <546F7765.1070907@redhat.com> Message-ID: <546F7F42.5090100@oracle.com> On 11/21/14 9:33 AM, Andrew Haley wrote: > On 11/20/2014 06:13 PM, Andrew Haley wrote: >> On 11/20/2014 06:05 PM, Vladimir Kozlov wrote: >>> I based the name on your comment: >>> >>> + // AArch64 uses store release (which does everything we need to keep >>> + // the machine in order) but we still need a compiler barrier here. >> >> Ah. Okay, I'll have to think of a good name for it, then. >> >>> You can name it as you like. Our main suggestion is to use such Boolean >>> constant and normal if() statements instead of ifdef AARCH64 and >>> AARCH64_ONLY/NOT_AARCH64 macros in C2 code (src/share/vm/opto/* files). >>> >>> We already do similar things for PPC64 port which sets >>> support_IRIW_for_* constant. >> >> Okay, > > I've done something similar but more useful. I've added an > experimental flag: UseBarriersForVolatile. This defaults to true for > all targets, but we can override it in the back end. That gives me > the chance to do some benchmarking on various AArch64 targets to see > which ones benefit from the new load acquire/store release > instructions. Okay. 
> > I have kept AARCH64_ONLY for two hunks: > > --- old/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.766963837 -0500 > +++ new/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.546983320 -0500 > @@ -503,6 +503,10 @@ > // Conservatively release stores of object references in order to > // ensure visibility of object initialization. > static inline MemOrd release_if_reference(const BasicType t) { > + // AArch64 doesn't need a release store here because object > + // initialization contains the necessary barriers. > + AARCH64_ONLY(return unordered); > + > const MemOrd mo = (t == T_ARRAY || > t == T_ADDRESS || // Might be the address of an object reference (`boxing'). > t == T_OBJECT) ? release : unordered; This could be needed for ppc64 too, not only for IA64. > > --- old/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:20.017207376 -0500 > +++ new/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:19.787227745 -0500 > @@ -3813,7 +3813,8 @@ > > // Smash zero into card > if( !UseConcMarkSweepGC ) { > - __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release); > + __ store(__ ctrl(), card_adr, zero, bt, adr_type, > + NOT_AARCH64(MemNode::release) AARCH64_ONLY(MemNode::unordered)); > } else { > // Specialized path for CM store barrier > __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type); Looks like PPC64 needs that. In ppc.ad: // Use release_store for card-marking to ensure that previous // oop-stores are visible before the card-mark change. enc_class enc_cms_card_mark(memory mem, iRegLdst releaseFieldAddr) %{ > > The first hunk is only required by IA64 as far as I am aware, but I > am nervous about making it IA64_ONLY. The second hunk is a release > node which is not as far as I am aware required by any target, and > should simply be removed. > > This isn't a RFA because it's not tested yet, but what do you think? Since it affects ppc64 and ia64 we need to ask Goetz and Co. 
I would suggest to put both these places under platform specific flags/bool constant. Thanks, Vladimir > > Andrew. > From vladimir.kozlov at oracle.com Fri Nov 21 18:09:35 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 21 Nov 2014 10:09:35 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F7E8E.2090405@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> Message-ID: <546F7FDF.7050302@oracle.com> Thank you, Magnus I will push this into aarch64 staging repo after testing in JPRT. Vladimir On 11/21/14 10:03 AM, Magnus Ihse Bursie wrote: > > On 2014-11-21 18:02, Andrew Haley wrote: >> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>> 1) Comment in config.sub identifies it as config.guess. >>> >>> 2) As as understand it, your first attempt is to just dispatch through >>> to autoconf-config.sub if there is no aarch64 arguments. Good! However, >>> the code could be made clearer: The dot is small and some comment >>> clarifying that we will exit after this line might be helpful to the >>> reader. Also, you're using a $sub_args which is not defined. This will >>> not break, but it makes the reader confused. (I suspect copy/paste >>> glitch.) >>> >>> 3) The webrev indicates that the original config.sub was renamed to >>> config.sub.orig, instead of autoconf-config.sub. But since that is how >>> it is called in the new config.sub wrapper, the patch is unlikely to >>> work. When you test this, make sure you have no lingering files in your >>> workspace that can mess up the results. (e.g. no "?" files in hg >>> status). 
>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ > > Looks good to me. > > Thanks for working through all these iterations! > > /Magnus From dean.long at oracle.com Fri Nov 21 19:31:08 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 21 Nov 2014 11:31:08 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F7E8E.2090405@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> Message-ID: <546F92FC.4080700@oracle.com> One minor comment: do we want to preserve the history in the new config.sub, or check it in as a new file? dl On 11/21/2014 10:03 AM, Magnus Ihse Bursie wrote: > > On 2014-11-21 18:02, Andrew Haley wrote: >> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>> 1) Comment in config.sub identifies it as config.guess. >>> >>> 2) As as understand it, your first attempt is to just dispatch through >>> to autoconf-config.sub if there is no aarch64 arguments. Good! However, >>> the code could be made clearer: The dot is small and some comment >>> clarifying that we will exit after this line might be helpful to the >>> reader. Also, you're using a $sub_args which is not defined. This will >>> not break, but it makes the reader confused. (I suspect copy/paste >>> glitch.) >>> >>> 3) The webrev indicates that the original config.sub was renamed to >>> config.sub.orig, instead of autoconf-config.sub. But since that is how >>> it is called in the new config.sub wrapper, the patch is unlikely to >>> work. When you test this, make sure you have no lingering files in your >>> workspace that can mess up the results. (e.g. no "?" files in hg >>> status). 
>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ > > Looks good to me. > > Thanks for working through all these iterations! > > /Magnus From vladimir.kozlov at oracle.com Fri Nov 21 20:03:48 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 21 Nov 2014 12:03:48 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F92FC.4080700@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> <546F92FC.4080700@oracle.com> Message-ID: <546F9AA4.3090206@oracle.com> The history is preserved since patch is applied above config.sub. I did 'hg copy config.sub autoconf-config.sub' and then applied the patch. And there are only 2 changesets in config.sub: changeset: 574:b66c81dfa291 user: ohair date: Mon Jan 14 16:38:25 2013 -0800 summary: 8005284: build-infra: nonstandard copyright headers under common/autoconf/build-aux changeset: 423:e1830598f0b7 parent: 417:42f275168fa5 user: ohair date: Tue Apr 10 08:18:28 2012 -0700 summary: 7074397: Build infrastructure changes (makefile re-write) Thanks, Vladimir On 11/21/14 11:31 AM, Dean Long wrote: > One minor comment: do we want to preserve the history in the new > config.sub, > or check it in as a new file? > > dl > > On 11/21/2014 10:03 AM, Magnus Ihse Bursie wrote: >> >> On 2014-11-21 18:02, Andrew Haley wrote: >>> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>>> 1) Comment in config.sub identifies it as config.guess. >>>> >>>> 2) As as understand it, your first attempt is to just dispatch through >>>> to autoconf-config.sub if there is no aarch64 arguments. Good! 
However, >>>> the code could be made clearer: The dot is small and some comment >>>> clarifying that we will exit after this line might be helpful to the >>>> reader. Also, you're using a $sub_args which is not defined. This will >>>> not break, but it makes the reader confused. (I suspect copy/paste >>>> glitch.) >>>> >>>> 3) The webrev indicates that the original config.sub was renamed to >>>> config.sub.orig, instead of autoconf-config.sub. But since that is how >>>> it is called in the new config.sub wrapper, the patch is unlikely to >>>> work. When you test this, make sure you have no lingering files in your >>>> workspace that can mess up the results. (e.g. no "?" files in hg >>>> status). >>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ >> >> Looks good to me. >> >> Thanks for working through all these iterations! >> >> /Magnus > From dean.long at oracle.com Fri Nov 21 20:28:32 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 21 Nov 2014 12:28:32 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F9AA4.3090206@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> <546F92FC.4080700@oracle.com> <546F9AA4.3090206@oracle.com> Message-ID: <546FA070.2060801@oracle.com> I was thinking 'hg mv config.sub autoconf-config.sub' and then 'hg add config.sub' so it doesn't look like we are modifying the upstream version of config.sub, but maybe it's not a big deal? dl On 11/21/2014 12:03 PM, Vladimir Kozlov wrote: > The history is preserved since patch is applied above config.sub. > I did 'hg copy config.sub autoconf-config.sub' and then applied the > patch. 
> > And there are only 2 changesets in config.sub: > > changeset: 574:b66c81dfa291 > user: ohair > date: Mon Jan 14 16:38:25 2013 -0800 > summary: 8005284: build-infra: nonstandard copyright headers under > common/autoconf/build-aux > > changeset: 423:e1830598f0b7 > parent: 417:42f275168fa5 > user: ohair > date: Tue Apr 10 08:18:28 2012 -0700 > summary: 7074397: Build infrastructure changes (makefile re-write) > > Thanks, > Vladimir > > > On 11/21/14 11:31 AM, Dean Long wrote: >> One minor comment: do we want to preserve the history in the new >> config.sub, >> or check it in as a new file? >> >> dl >> >> On 11/21/2014 10:03 AM, Magnus Ihse Bursie wrote: >>> >>> On 2014-11-21 18:02, Andrew Haley wrote: >>>> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>>>> 1) Comment in config.sub identifies it as config.guess. >>>>> >>>>> 2) As as understand it, your first attempt is to just dispatch >>>>> through >>>>> to autoconf-config.sub if there is no aarch64 arguments. Good! >>>>> However, >>>>> the code could be made clearer: The dot is small and some comment >>>>> clarifying that we will exit after this line might be helpful to the >>>>> reader. Also, you're using a $sub_args which is not defined. This >>>>> will >>>>> not break, but it makes the reader confused. (I suspect copy/paste >>>>> glitch.) >>>>> >>>>> 3) The webrev indicates that the original config.sub was renamed to >>>>> config.sub.orig, instead of autoconf-config.sub. But since that is >>>>> how >>>>> it is called in the new config.sub wrapper, the patch is unlikely to >>>>> work. When you test this, make sure you have no lingering files in >>>>> your >>>>> workspace that can mess up the results. (e.g. no "?" files in hg >>>>> status). >>>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ >>> >>> Looks good to me. >>> >>> Thanks for working through all these iterations! 
>>> >>> /Magnus >> From aph at redhat.com Fri Nov 21 20:58:28 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 21 Nov 2014 20:58:28 +0000 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546F7E8E.2090405@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> Message-ID: <546FA774.7040401@redhat.com> On 11/21/2014 06:03 PM, Magnus Ihse Bursie wrote: > Thanks for working through all these iterations! Oh, the thanks should be the other way around. I really want this to be right. Andrew. From ysr1729 at gmail.com Fri Nov 21 21:42:23 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 21 Nov 2014 13:42:23 -0800 Subject: RFR: 8065656: Use DWARF debug symbols for Solaris In-Reply-To: <5C3C7BAF-ACF0-421F-A20B-F75022AF84D0@oracle.com> References: <20141121153011.GA12067@ehelin-desktop> <5C3C7BAF-ACF0-421F-A20B-F75022AF84D0@oracle.com> Message-ID: What does Peter think? For those too young to remember, Peter invented stabs way back when he was a young grad student at Berkeley :-) It has given admirable service, like so much else he has touched! -- Ramki ysr1729 > On Nov 21, 2014, at 08:30, charlie hunt wrote: > > Looks good, though I am not an official (R)eviewer. > > Charlie > >> On Nov 21, 2014, at 9:30 AM, Erik Helin wrote: >> >> Hi all, >> >> this patch changes the debug symbols format on Solaris from STABS >> [0] to DWARF [1] for libjvm.so. 
Since the supported compiler on Solaris >> has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format >> is now deprecated in the supported compiler [3]: >> >> -xdebugformat=stabs generates debugging information >> using the stabs standard format. The stabs format is no >> longer supported. >> >> Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]: >> >> The -xdebugformat=stabs for all compilers might be removed in a future >> release. The only debugger format option will be -xdebugformat=dwarf, >> which is currently the default. >> >> So, it seems to be a good time to change the debug format to DWARF when >> compiling with Oracle Solaris Studio. I also changed the debug format for >> GCC on Solaris to be DWARF, since the STABS support in GCC is in >> maintenance mode [5]. >> >> More reasons for using DWARF instead of STABS are: >> - Better support by Oracle Studio Performance Analyzer (the performance >> team have requested that we use DWARF v2 or later instead of STABS). >> - DWARF provides a better debugging experience for C++ compared to STABS. >> >> The one drawback of using DWARF compared to STABS is that the size of the >> debuginfo increases. For a SPARC fastdebug build the size of >> libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB. >> >> To summarize, we need to change from STABS to DWARF because STABS is >> deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the >> release notes). I would suggest to change sooner rather than later, given >> that the change to DWARF also brings Oracle Studio Performance Analyzer >> support as well as a better C++ debugging experience in dbx. >> >> Webrev: >> http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/ >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8065656 >> >> Testing: >> - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC >> and Solaris 11.1 on x86-64 using JPRT.
>> - Verified that DWARF v2 symbols are produced with objdump. >> >> Thanks, >> Erik >> >> [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html >> [1]: http://www.dwarfstd.org/ >> [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html >> [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html >> [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html >> [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From ysr1729 at gmail.com Fri Nov 21 21:47:13 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 21 Nov 2014 13:47:13 -0800 Subject: RFR: 8065656: Use DWARF debug symbols for Solaris In-Reply-To: References: <20141121153011.GA12067@ehelin-desktop> <5C3C7BAF-ACF0-421F-A20B-F75022AF84D0@oracle.com> Message-ID: Peter's email id corrected, this time... ysr1729 > On Nov 21, 2014, at 13:42, Srinivas Ramakrishna wrote: > > > What does Peter think? For those too young to remember, Peter invented stabs way back when he was a young grad student at Berkeley :-) > It has given admirable service, like so much else he has touched! > > -- Ramki > > ysr1729 > >> On Nov 21, 2014, at 08:30, charlie hunt wrote: >> >> Looks good, though I am not an official (R)eviewer. >> >> Charlie >> >>> On Nov 21, 2014, at 9:30 AM, Erik Helin wrote: >>> >>> Hi all, >>> >>> this patch changes the debug symbols format on Solaris from STABS >>> [0] to DWARF [1] for libjvm.so. Since the supported compiler on Solaris >>> has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format >>> is now deprecated in the supported compiler [3]: >>> >>> -xdebugformat=stabs generates debugging information >>> using the stabs standard format. The stabs format is no >>> longer supported. >>> >>> Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]: >>> >>> The -xdebugformat=stabs for all compilers might be removed in a future >>> release.
The only debugger format option will be -xdebugformat=dwarf, >>> which is currently the default. >>> >>> So, it seems to be a good time to change the debug format to DWARF when >>> compiling with Oracle Solaris Studio. I also changed the debug format for >>> GCC on Solaris to be DWARF, since the STABS support in GCC is in >>> maintenance mode [5]. >>> >>> More reasons for using DWARF instead of STABS are: >>> - Better support by Oracle Studio Performance Analyzer (the performance >>> team have requested that we use DWARF v2 or later instead of STABS). >>> - DWARF provides a better debugging experience for C++ compared to STABS. >>> >>> The one drawback of using DWARF compared to STABS is that the size of the >>> debuginfo increases. For a SPARC fastdebug build the size of >>> libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB. >>> >>> To summarize, we need to change from STABS to DWARF because STABS is >>> deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the >>> release notes). I would suggest to change sooner rather than later, given >>> that the change to DWARF also brings Oracle Studio Performance Analyzer >>> support as well as a better C++ debugging experience in dbx. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/ >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8065656 >>> >>> Testing: >>> - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC >>> and Solaris 11.1 on x86-64 using JPRT. >>> - Verified that DWARF v2 symbols are produced with objdump.
>>> >>> Thanks, >>> Erik >>> >>> [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html >>> [1]: http://www.dwarfstd.org/ >>> [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html >>> [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html >>> [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html >>> [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From vladimir.kozlov at oracle.com Fri Nov 21 22:37:10 2014 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 21 Nov 2014 14:37:10 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546FA070.2060801@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> <546F92FC.4080700@oracle.com> <546F9AA4.3090206@oracle.com> <546FA070.2060801@oracle.com> Message-ID: <546FBE96.20303@oracle.com> I did as you said: hg mv common/autoconf/build-aux/config.sub common/autoconf/build-aux/autoconf-config.sub $ hg st M common/autoconf/generated-configure.sh M common/autoconf/jdk-options.m4 M common/autoconf/platform.m4 A common/autoconf/build-aux/autoconf-config.sub R common/autoconf/build-aux/config.sub $ cp new_config.sub common/autoconf/build-aux/config.sub $ hg add common/autoconf/build-aux/config.sub $ hg st M common/autoconf/generated-configure.sh M common/autoconf/jdk-options.m4 M common/autoconf/platform.m4 A common/autoconf/build-aux/autoconf-config.sub R common/autoconf/build-aux/config.sub And webrev shows config.sub diffs vs original one and not as new file. 
Vladimir On 11/21/14 12:28 PM, Dean Long wrote: > I was thinking 'hg mv config.sub autoconf-config.sub' and then > 'hg add config.sub' so it doesn't look like we are modifying the > upstream version > of config.sub, but maybe it's not a big deal? > > dl > > On 11/21/2014 12:03 PM, Vladimir Kozlov wrote: >> The history is preserved since patch is applied above config.sub. >> I did 'hg copy config.sub autoconf-config.sub' and then applied the >> patch. >> >> And there are only 2 changesets in config.sub: >> >> changeset: 574:b66c81dfa291 >> user: ohair >> date: Mon Jan 14 16:38:25 2013 -0800 >> summary: 8005284: build-infra: nonstandard copyright headers under >> common/autoconf/build-aux >> >> changeset: 423:e1830598f0b7 >> parent: 417:42f275168fa5 >> user: ohair >> date: Tue Apr 10 08:18:28 2012 -0700 >> summary: 7074397: Build infrastructure changes (makefile re-write) >> >> Thanks, >> Vladimir >> >> >> On 11/21/14 11:31 AM, Dean Long wrote: >>> One minor comment: do we want to preserve the history in the new >>> config.sub, >>> or check it in as a new file? >>> >>> dl >>> >>> On 11/21/2014 10:03 AM, Magnus Ihse Bursie wrote: >>>> >>>> On 2014-11-21 18:02, Andrew Haley wrote: >>>>> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>>>>> 1) Comment in config.sub identifies it as config.guess. >>>>>> >>>>>> 2) As as understand it, your first attempt is to just dispatch >>>>>> through >>>>>> to autoconf-config.sub if there is no aarch64 arguments. Good! >>>>>> However, >>>>>> the code could be made clearer: The dot is small and some comment >>>>>> clarifying that we will exit after this line might be helpful to the >>>>>> reader. Also, you're using a $sub_args which is not defined. This >>>>>> will >>>>>> not break, but it makes the reader confused. (I suspect copy/paste >>>>>> glitch.) >>>>>> >>>>>> 3) The webrev indicates that the original config.sub was renamed to >>>>>> config.sub.orig, instead of autoconf-config.sub. 
But since that is >>>>>> how >>>>>> it is called in the new config.sub wrapper, the patch is unlikely to >>>>>> work. When you test this, make sure you have no lingering files in >>>>>> your >>>>>> workspace that can mess up the results. (e.g. no "?" files in hg >>>>>> status). >>>>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ >>>> >>>> Looks good to me. >>>> >>>> Thanks for working through all these iterations! >>>> >>>> /Magnus >>> > From Peter.B.Kessler at Oracle.COM Fri Nov 21 23:03:25 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Fri, 21 Nov 2014 15:03:25 -0800 Subject: RFR: 8065656: Use DWARF debug symbols for Solaris In-Reply-To: References: <20141121153011.GA12067@ehelin-desktop> <5C3C7BAF-ACF0-421F-A20B-F75022AF84D0@oracle.com> Message-ID: <546FC4BD.2050209@Oracle.COM> I don't think I invented the stab format. I think stab.h came with adb and sdb, but might predate even those. Stab.h covers the basics of global symbols, procedures, source files, line numbers, etc. (The earliest copy I can find is http://svnweb.freebsd.org/csrg/include/stab.h?revision=12194&view=markup) I probably did invent the N_PC (0x30) stab variant for the Berkeley Pascal compiler. The trick there was that, rather than having to negotiate for some modest number of the limited (< 2 bytes) available space for stab entries, to claim only one entry for all the Pascal symbolic information and put all the information I needed into the string part of the "symbol". That left lots of room for other languages, and separated the Pascal compiler (and all the other languages) from having to edit stab.h as we figured out what we wanted in the way of debugging information. Maybe that is the origin of the "symbol table in the string" (stabs) idea. That said, I'm happy to see stabs replaced by something better. I'm also not an upper-case R reviewer. ... peter On 11/21/14 01:42 PM, Srinivas Ramakrishna wrote: > > What does Peter think? 
For those too young to remember, Peter invented stabs way back when he was a young grad student at Berkeley :-) > It has given admirable service, like so much else he has touched! > > -- Ramki > > ysr1729 > >> On Nov 21, 2014, at 08:30, charlie hunt wrote: >> >> Looks good, though I am not an official (R)eviewer. >> >> Charlie >> >>> On Nov 21, 2014, at 9:30 AM, Erik Helin wrote: >>> >>> Hi all, >>> >>> this patch changes the debug symbols format on Solaris from STABS >>> [0] to DWARF [1] for libjvm.so. Since the supported compiler on Solaris >>> has been updated to Oracle Solaris Studio 12.3 [2], the STABS debug format >>> is now deprecated in the supported compiler [3]: >>> >>> -xdebugformat=stabs generates debugging information >>> using the stabs standard format. The stabs format is no >>> longer supported. >>> >>> Furthermore, in Oracle Solaris Studio 12.4, the release notes says [4]: >>> >>> The -xdebugformat=stabs for all compilers might be removed in a future >>> release. The only debugger format option will be -xdebugformat=dwarf, >>> which is currently the default. >>> >>> So, it seems to be a good time to change the debug format to DWARF when >>> compiling with Oracle Solaris Studio. I also changed the debug format for >>> GCC on Solaris to be DWARF, since the STABS support in GCC is in >>> maintenance mode [5]. >>> >>> More reasons for using DWARF instead of STABS are: >>> - Better support by Oracle Studio Performance Analyzer (the performance >>> team have requested that we use DWARF v2 or later instead of STABS). >>> - DWARF provides a better debugging experience for C++ compared to STABS. >>> >>> The one drawback of using DWARF compared to STABS is that the size of the >>> debuginfo increases. For a SPARC fastdebug build the size of >>> libjvm.debuginfo built with STABS is 782 MB and with DWARF 1002 MB.
>>> >>> To summarize, we need to change from STABS to DWARF because STABS is >>> deprecated in 12.3 (even "more" deprecated 12.4 given the wording in the >>> release notes). I would suggest to change sooner rather than later, given >>> that the change to DWARF also brings Oracle Studio Performance Analyzer >>> support as well as a better C++ debugging experience in dbx. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~ehelin/8065656/webrev.00/ >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8065656 >>> >>> Testing: >>> - Compiled with Oracle Solaris Studio 12.3 on both Solaris 11.1 on SPARC >>> and Solaris 11.1 on x86-64 using JPRT. >>> - Verified that DWARF v2 symbols are produced with objdump. >>> >>> Thanks, >>> Erik >>> >>> [0]: http://www.sourceware.org/gdb/onlinedocs/stabs.html >>> [1]: http://www.dwarfstd.org/ >>> [2]: http://mail.openjdk.java.net/pipermail/jdk9-dev/2014-October/001489.html >>> [3]: https://docs.oracle.com/cd/E24457_01/html/E22003/cplusplus.1.html >>> [4]: https://docs.oracle.com/cd/E37069_01/html/E37070/gnxfn.html >>> [5]: https://sourceware.org/ml/binutils/2013-01/msg00028.html From dean.long at oracle.com Sat Nov 22 00:29:20 2014 From: dean.long at oracle.com (Dean Long) Date: Fri, 21 Nov 2014 16:29:20 -0800 Subject: RFR: AARCH64: 8064357: Top-level JDK changes In-Reply-To: <546FBE96.20303@oracle.com> References: <545D150F.0@redhat.com> <54607289.9090002@oracle.com> <54608868.3010108@oracle.com> <5464BBA9.1000809@oracle.com> <5464BFFA.7050205@redhat.com> <5464C7A0.4080304@oracle.com> <5464C85C.50908@redhat.com> <546B7C8D.7050409@redhat.com> <20141118180315.GB22927@redhat.com> <546CD834.50004@redhat.com> <546F5E5E.3030309@oracle.com> <546F7011.50600@redhat.com> <546F7E8E.2090405@oracle.com> <546F92FC.4080700@oracle.com> <546F9AA4.3090206@oracle.com> <546FA070.2060801@oracle.com> <546FBE96.20303@oracle.com> Message-ID: <546FD8E0.5050000@oracle.com> This may be a bug in webrev. 
If 'hg log -f' shows it as a new file, that should be enough. dl On 11/21/2014 2:37 PM, Vladimir Kozlov wrote: > I did as you said: > > hg mv common/autoconf/build-aux/config.sub > common/autoconf/build-aux/autoconf-config.sub > > $ hg st > M common/autoconf/generated-configure.sh > M common/autoconf/jdk-options.m4 > M common/autoconf/platform.m4 > A common/autoconf/build-aux/autoconf-config.sub > R common/autoconf/build-aux/config.sub > > $ cp new_config.sub common/autoconf/build-aux/config.sub > $ hg add common/autoconf/build-aux/config.sub > $ hg st > M common/autoconf/generated-configure.sh > M common/autoconf/jdk-options.m4 > M common/autoconf/platform.m4 > A common/autoconf/build-aux/autoconf-config.sub > R common/autoconf/build-aux/config.sub > > And webrev shows config.sub diffs vs original one and not as new file. > > Vladimir > > On 11/21/14 12:28 PM, Dean Long wrote: >> I was thinking 'hg mv config.sub autoconf-config.sub' and then >> 'hg add config.sub' so it doesn't look like we are modifying the >> upstream version >> of config.sub, but maybe it's not a big deal? >> >> dl >> >> On 11/21/2014 12:03 PM, Vladimir Kozlov wrote: >>> The history is preserved since patch is applied above config.sub. >>> I did 'hg copy config.sub autoconf-config.sub' and then applied the >>> patch. >>> >>> And there are only 2 changesets in config.sub: >>> >>> changeset: 574:b66c81dfa291 >>> user: ohair >>> date: Mon Jan 14 16:38:25 2013 -0800 >>> summary: 8005284: build-infra: nonstandard copyright headers under >>> common/autoconf/build-aux >>> >>> changeset: 423:e1830598f0b7 >>> parent: 417:42f275168fa5 >>> user: ohair >>> date: Tue Apr 10 08:18:28 2012 -0700 >>> summary: 7074397: Build infrastructure changes (makefile re-write) >>> >>> Thanks, >>> Vladimir >>> >>> >>> On 11/21/14 11:31 AM, Dean Long wrote: >>>> One minor comment: do we want to preserve the history in the new >>>> config.sub, >>>> or check it in as a new file? 
>>>> >>>> dl >>>> >>>> On 11/21/2014 10:03 AM, Magnus Ihse Bursie wrote: >>>>> >>>>> On 2014-11-21 18:02, Andrew Haley wrote: >>>>>> On 11/21/2014 03:46 PM, Magnus Ihse Bursie wrote: >>>>>>> 1) Comment in config.sub identifies it as config.guess. >>>>>>> >>>>>>> 2) As as understand it, your first attempt is to just dispatch >>>>>>> through >>>>>>> to autoconf-config.sub if there is no aarch64 arguments. Good! >>>>>>> However, >>>>>>> the code could be made clearer: The dot is small and some comment >>>>>>> clarifying that we will exit after this line might be helpful to >>>>>>> the >>>>>>> reader. Also, you're using a $sub_args which is not defined. This >>>>>>> will >>>>>>> not break, but it makes the reader confused. (I suspect copy/paste >>>>>>> glitch.) >>>>>>> >>>>>>> 3) The webrev indicates that the original config.sub was renamed to >>>>>>> config.sub.orig, instead of autoconf-config.sub. But since that is >>>>>>> how >>>>>>> it is called in the new config.sub wrapper, the patch is >>>>>>> unlikely to >>>>>>> work. When you test this, make sure you have no lingering files in >>>>>>> your >>>>>>> workspace that can mess up the results. (e.g. no "?" files in hg >>>>>>> status). >>>>>> http://cr.openjdk.java.net/~aph/aarch64-8064357-5/ >>>>> >>>>> Looks good to me. >>>>> >>>>> Thanks for working through all these iterations! 
>>>>> >>>>> /Magnus >>>> >> From david.holmes at oracle.com Mon Nov 24 05:06:24 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 24 Nov 2014 15:06:24 +1000 Subject: Proposal: Allowing selective pushes to hotspot without jprt In-Reply-To: <546F5BD8.9010405@oracle.com> References: <540F7021.5080100@oracle.com> <5410CDA9.7030405@oracle.com> <541C6FD9.5050602@oracle.com> <545CB362.60501@oracle.com> <5460CDF1.8050205@oracle.com> <546BF822.7010702@oracle.com> <546F5BD8.9010405@oracle.com> Message-ID: <5472BCD0.7060700@oracle.com> On 22/11/2014 1:35 AM, Magnus Ihse Bursie wrote: > > On 2014-11-19 09:23, Volker Simonis wrote: >> On Wed, Nov 19, 2014 at 2:53 AM, David Holmes >> wrote: >>> On 18/11/2014 8:03 PM, Volker Simonis wrote: >>>> You're right - it works! >>>> I've just pushed my first AIX-only change to hotspot-rt! >>> >>> Congratulations! >>> >>> Unfortunately it caused us a problem as now the repos can change >>> whilst a >>> job is going through JPRT - this requires a new merge due to multiple >>> heads >>> and so triggered a failure. But Mikael is working on it :) >> It's astonishing how even the smallest changes can introduce >> non-foreseeable problems. >> >> Hopefully Mikael will be able to fix the problem somehow (if he didn't >> had too much of the champagne already :) > > Actually, having non-atomic commits *is* a foreseeable problem. ;-) > Which is what you get when you just throw a bunch of unrelated > mercurials together without any synchronization in-between them. In > fact, I'm surprised this kind of thing does not happen any more often, > but then again, a lot of our procedures have probably evolved partially > as a response to this constraint. This wasn't the non-atomic commit problem, but rather a concurrent commit to the same repo while a JPRT job was running. At push time it surprised JPRT to find a new head in the target repo - something it considered to be an error. 
David ----- > I'd like to see us moving to using atomic commits for the whole project. > Unfortunately, there is no simple solution to this even for the open > source code only, and it gets worse when adding the closed components. > > /Magnus From tobias.hartmann at oracle.com Mon Nov 24 06:39:08 2014 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 24 Nov 2014 07:39:08 +0100 Subject: [8u40] Backport RFR: 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer In-Reply-To: <546F6338.7040702@oracle.com> References: <546EF9AC.9060109@oracle.com> <546F6338.7040702@oracle.com> Message-ID: <5472D28C.6010100@oracle.com> Thanks, Vladimir. Best, Tobias On 21.11.2014 17:07, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 11/21/14 12:37 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following backport request for 8u40. >> >> 8050079: crash while compiling java.lang.ref.Finalizer::runFinalizer >> https://bugs.openjdk.java.net/browse/JDK-8050079 >> http://cr.openjdk.java.net/~thartmann/8050079/webrev.03/ >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0bdada928884 >> >> The fix does not apply cleanly to 8u40 because of the changes to >> test/TEST.groups. I removed them: >> >> http://cr.openjdk.java.net/~thartmann/8050079_8u/webrev.00/ >> >> The changes were pushed into 9 on Thursday. I'll wait for the nightlies before >> pushing into 8u40. >> >> Thanks, >> Tobias >> From yasuenag at gmail.com Mon Nov 24 13:21:41 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 24 Nov 2014 22:21:41 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <543E80F8.3080204@gmail.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> Message-ID: <547330E5.1050708@gmail.com> Hi all, I've uploaded webrev for this issue about a month ago. 
Could you review it and sponsor it? Thanks, Yasumasa On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: > Hi David, > > I've uploaded new webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ > > >> I wasn't suggesting that you make such a change though because it is large and disruptive. > >> Unfactoring check_or_create_dump is a step backwards in terms of code sharing. > > I restored check_or_create_dump() to os_posix.cpp . > And I changed get_core_path() to create message which represents core dump path > (including filename) in each OS. > > >> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). > > I implemented all parameters in Linux kernel documentation: > https://www.kernel.org/doc/Documentation/sysctl/kernel.txt > > So I think that parameters which are processed are enough. > > > Thanks, > > Yasumasa > > > > (2014/10/15 9:41), David Holmes wrote: >> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: >>> Hi David, >>> >>> Thank you for comments! >>> I've uploaded new webrev. Could you review it again? >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ >>> >>> I am an author of jdk9. So I cannot commit it. >>> Could you be a sponsor for this enhancement? >>> >>> >>>> In which case that should be handled by the linux specific >>>> get_core_path() function. >>> >>> Agree. >>> So I implemented it in os_linux.cpp . >>> But part of format characters (%P: global pid, %s: signal, %t dump time) >>> are not processed >>> in this function because I think these parameters are difficult to >>> handle in it. >>> >>> %P: I could not find API for this. >>> %s: We have to change arguments of get_core_path() . >>> %t: This parameter means timestamp of coredump. It is decided in Kernel. >>> >>> >>>> Fixing this means changing all the os_posix using platforms. But your >>>> patch is not about this part. 
:) >>> >>> I moved os::check_or_create_dump() to each OS implementations (AIX, BSD, >>> Solaris, Linux) . >>> So I can write Linux specific code to check_or_create_dump() . >>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) >> >> I wasn't suggesting that you make such a change though because it is large and disruptive. The simple handling of the | part of core_pattern was basically ok. Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). Unfactoring check_or_create_dump is a step backwards in terms of code sharing. >> >> Sorry this has grown too large for me to deal with right now. >> >> David >> ----- >> >>> >>>> Though I'm unclear whether it both invokes the program and creates a >>>> core dump file; or just invokes the program? >>> >>> If '|' is set, Linux kernel will just redirect core image to user process. >>> Kernel documentation says as below: >>> ------------ >>> . If the first character of the pattern is a '|', the kernel will treat >>> the rest of the pattern as a command to run. The core dump will be >>> written to the standard input of that program instead of to a file. >>> ------------ >>> >>> And implementation of coredump (do_coredump()) follows to it. >>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c >>> >>> >>> In case of ABRT, ABRT dumps core image to default location >>> (/core.) >>> if user set unlimited to resource limit of core (ulimit -c) . >>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c >>> >>> >>>> A few style nits - you need spaces around keywords and before braces >>>> I also suggest saying "Core dumps may be processed with ..." rather >>>> than "treated". >>>> And as you don't do anything in the non-redirect case I suggest >>>> collapsing this: >>> >>> I've fixed them. 
>>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> (2014/10/13 9:41), David Holmes wrote: >>>> Hi Yasumasa, >>>> >>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: >>>>> Hi David, >>>>> >>>>> Sorry for my English. >>>>> >>>>> I want to propose that JVM should create message according to core >>>>> pattern (/proc/sys/kernel/core_pattern) . >>>>> So I filed it to JBS and created a patch. >>>> >>>> So I've had a quick look at this core_pattern business and it seems to >>>> me that there are two aspects to this. >>>> >>>> First, without the leading |, the entry in the core_pattern file is a >>>> naming pattern for the core file. In which case that should be handled >>>> by the linux specific get_core_path() function. Though that in itself >>>> can't fully report the expected name, as part of it is provided in the >>>> shared code in os::check_or_create_dump. Fixing this means changing >>>> all the os_posix using platforms. But your patch is not about this >>>> part. :) >>>> >>>> Second, with a leading | the core_pattern is actually the name of a >>>> program to execute when the program is about to core dump, and that is >>>> what you report with your patch. Though I'm unclear whether it both >>>> invokes the program and creates a core dump file; or just invokes the >>>> program? >>>> >>>> So with regards to this second part your patch seems functionally ok. >>>> I do dislike having a big chunk of linux specific code in this "posix" >>>> support file but ... >>>> >>>> A few style nits - you need spaces around keywords and before braces eg: >>>> >>>> if(x){ >>>> >>>> should be >>>> >>>> if (x) { >>>> >>>> I also suggest saying "Core dumps may be processed with ..." rather >>>> than "treated". 
>>>> >>>> And as you don't do anything in the non-redirect case I suggest >>>> collapsing this: >>>> >>>> 83 is_redirect = core_pattern[0] == '|'; >>>> 84 } >>>> 85 >>>> 86 if(is_redirect){ >>>> 87 jio_snprintf(buffer, bufferSize, >>>> 88 "Core dumps may be treated with \"%s\"", >>>> &core_pattern[1]); >>>> 89 } >>>> >>>> to just >>>> >>>> 83 if (core_pattern[0] == '|') { // redirect >>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be >>>> processed with \"%s\"", &core_pattern[1]); >>>> 85 } >>>> 86 } >>>> >>>> Comments from other runtime folk appreciated. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> 2014/10/07 15:43 "David Holmes" >>>> >: >>>>> >>>>> Hi Yasumasa, >>>>> >>>>> I'm sorry but I don't understand what you are proposing. When you >>>>> say >>>>> "treat" do you mean "create"? Otherwise what do you mean by >>>>> "treated"? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: >>>>> > I'm in Hackergarten @ JavaOne :-) >>>>> > >>>>> > >>>>> > Hi all, >>>>> > >>>>> > I would like to enhance the messages in hs_err report. >>>>> > Modern Linux kernel can treat core dump with user process >>>>> (e.g. ABRT) >>>>> > However, hs_err report cannot detect it. >>>>> > >>>>> > I think that hs_err report should output messages as below: >>>>> > ------------- >>>>> > Failed to write core dump. Core dumps may be treated with >>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p >>>>> %u %g %t e" >>>>> > ------------- >>>>> > >>>>> > I've uploaded webrev of this enhancement. >>>>> > Could you review it? >>>>> > >>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ >>>>> > >>>>> > This patch works fine on Fedora20 x86_64. 
>>>>> > >>>>> > >>>>> > >>>>> > Thanks, >>>>> > >>>>> > Yasumasa >>>>> > >>>>> From aph at redhat.com Mon Nov 24 14:07:30 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 24 Nov 2014 14:07:30 +0000 Subject: Using LoadAcquire & StoreRelease instructions Message-ID: <54733BA2.9060909@redhat.com> Just to recap: when we have LoadAcquire & StoreRelease instructions we don't also need fences. We respect MemNode::unordered in the C2 aarch64.ad and generate LoadAcquire & StoreRelease. I have changed HotSpot in a few places so that we can disable the separate fences. However, there are two places in the HotSpot code base where I've had to conditionalize on AArch64 because a store is marked as a release where we don't need it to be. The first is a store to a non-volatile OOP field, which I think you said was for IA-64, because IA-64 does not have a store fence at the end of object initialization. I understand that argument and it makes sense, but can we make this IA64_ONLY, or is it wanted for other architectures as well? --- old/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.766963837 -0500 +++ new/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.546983320 -0500 @@ -503,6 +503,10 @@ // Conservatively release stores of object references in order to // ensure visibility of object initialization. static inline MemOrd release_if_reference(const BasicType t) { + // AArch64 doesn't need a release store here because object + // initialization contains the necessary barriers. + AARCH64_ONLY(return unordered); + const MemOrd mo = (t == T_ARRAY || t == T_ADDRESS || // Might be the address of an object reference (`boxing'). t == T_OBJECT) ? release : unordered; The second is for release stores into the card table, which I believe are not needed on any architecture. (G1 is irrelevant here: it has its own card table code, with the fences it needs.) If updating the card table really does need to be a release store then we must insert a fence here for every architecture. 
However, I don't think we do, and the release here has a significant performance impact. --- old/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:20.017207376 -0500 +++ new/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:19.787227745 -0500 @@ -3813,7 +3813,8 @@ // Smash zero into card if( !UseConcMarkSweepGC ) { - __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release); + __ store(__ ctrl(), card_adr, zero, bt, adr_type, + NOT_AARCH64(MemNode::release) AARCH64_ONLY(MemNode::unordered)); } else { // Specialized path for CM store barrier __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type); So, while I am perfectly happy to just disable these for AArch64, it would be better for all concerned to have a resolution of this. Thanks, Andrew. From goetz.lindenmaier at sap.com Mon Nov 24 14:42:39 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 24 Nov 2014 14:42:39 +0000 Subject: Using LoadAcquire & StoreRelease instructions In-Reply-To: <54733BA2.9060909@redhat.com> References: <54733BA2.9060909@redhat.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF29EB9@DEWDFEMB12A.global.corp.sap> Hi Andrew, yes, you are right in both points. The first is only needed for IA64, because there we leave out the store fence after the initialization. The second, in graphKit, is superfluous. We figured that, too, not too long ago. Thanks for improving this, Martin & Goetz. -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Montag, 24. November 2014 15:08 To: Lindenmaier, Goetz; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net Subject: Using LoadAcquire & StoreRelease instructions Just to recap: when we have LoadAcquire & StoreRelease instructions we don't also need fences. We respect MemNode::unordered in the C2 aarch64.ad and generate LoadAcquire & StoreRelease. I have changed HotSpot in a few places so that we can disable the separate fences. 
However, there are two places in the HotSpot code base where I've had to conditionalize on AArch64 because a store is marked as a release where we don't need it to be. The first is a store to a non-volatile OOP field, which I think you said was for IA-64, because IA-64 does not have a store fence at the end of object initialization. I understand that argument and it makes sense, but can we make this IA64_ONLY, or is it wanted for other architectures as well? --- old/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.766963837 -0500 +++ new/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.546983320 -0500 @@ -503,6 +503,10 @@ // Conservatively release stores of object references in order to // ensure visibility of object initialization. static inline MemOrd release_if_reference(const BasicType t) { + // AArch64 doesn't need a release store here because object + // initialization contains the necessary barriers. + AARCH64_ONLY(return unordered); + const MemOrd mo = (t == T_ARRAY || t == T_ADDRESS || // Might be the address of an object reference (`boxing'). t == T_OBJECT) ? release : unordered; The second is for release stores into the card table, which I believe are not needed on any architecture. (G1 is irrelevant here: it has its own card table code, with the fences it needs.) If updating the card table really does need to be a release store then we must insert a fence here for every architecture. However, I don't think we do, and the release here has a significant performance impact. 
--- old/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:20.017207376 -0500 +++ new/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:19.787227745 -0500 @@ -3813,7 +3813,8 @@ // Smash zero into card if( !UseConcMarkSweepGC ) { - __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release); + __ store(__ ctrl(), card_adr, zero, bt, adr_type, + NOT_AARCH64(MemNode::release) AARCH64_ONLY(MemNode::unordered)); } else { // Specialized path for CM store barrier __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type); So, while I am perfectly happy to just disable these for AArch64, it would be better for all concerned to have a resolution of this. Thanks, Andrew. From thomas.stuefe at gmail.com Mon Nov 24 17:32:00 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 24 Nov 2014 18:32:00 +0100 Subject: RFR(xs): 8065788: os::reserve_memory() on Windows should not assert that allocation size is aligned to OS allocation granularity. Message-ID: Hi, a very small change: Bug Report: https://bugs.openjdk.java.net/browse/JDK-8065788 WebRev: http://cr.openjdk.java.net/~simonis/webrevs/8065788/ os::reserve_memory() on Windows asserts that allocation size is aligned to os::vm_allocation_granularity(). This assert is wrong and should be removed. Allocation granularity affects the alignment of attach addresses, not of the allocated size. The latter is aligned to page size, but asserting that would be unnecessarily strict, as VirtualAlloc() will just quietly align size up to page size. For details see MSDN on VirtualAlloc(): http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx Kind Regards, Thomas Stüfe From christian.tornqvist at oracle.com Mon Nov 24 19:36:15 2014 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Mon, 24 Nov 2014 14:36:15 -0500 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers.
In-Reply-To: <546E3FCD.7020808@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C04DC.1010606@oracle.com> <546E3FCD.7020808@oracle.com> Message-ID: <00b401d0081d$e9d99910$bd8ccb30$@oracle.com> Hi Max, This looks good, thanks for cleaning this up :) Thanks, Christian -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Max Ockner Sent: Thursday, November 20, 2014 2:24 PM To: David Holmes Cc: hotspot-dev at openjdk.java.net Subject: Re: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. David, Thomas beat me to it. os::free does not ever use the memtracking argument that was being passed to it. This does not break memory tracking though, because this memory tracking parameter is supplied and saved in a malloc header during memory allocation. os::free accesses and uses this header through Memtracker::record_free(memblock). Let me know if anything else stands out to you. I would love to have the David Holmes stamp of approval for this fix. Other notes: I have fixed the copyrights and made a new webrev, but it hasn't been copied to openjdk yet. Shouldn't be an issue because there isn't really anything new to see. Either way, sorry about that. Updated Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8060074.1/ Thanks, Max Ockner On 11/18/2014 9:47 PM, David Holmes wrote: > Hi Max, > > So I would have assumed memflags were being passed to all the "free" > routines for NMT purposes. Otherwise how does NMT track this? > > Thanks, > David > > On 19/11/2014 7:53 AM, Max Ockner wrote: >> Hello all, >> Please review this minor cleanup: >> >> Bug ID: 8060074 >> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >> >> Summary: >> (1) os::free takes two arguments, but never uses the second argument, >> which is a MEMFLAG. 
I have removed this argument from every os::free >> call. >> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >> also takes a MEMFLAG argument, which is only used to call os::free. >> Now an unused argument, it has been removed from all FreeHeap calls. >> No other methods which directly call os::free() have this problem. >> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >> takes a MEMFLAG argument which is passed to FreeHeap, and nothing else. >> This argument is now unused, and has been removed. No other methods >> which call FreeHeap have this problem. >> >> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >> cleanup. I have also removed the extra argument from the definitions >> of the above methods. >> >> Tests: jtreg hotspot tests with >> -vmoption:"-XX:NativeMemoryTracking=detail" >> >> Thanks for your help, >> Max Ockner From yasuenag at gmail.com Tue Nov 25 03:34:44 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 25 Nov 2014 12:34:44 +0900 Subject: guarantee(PageArmed == 0) failed: invaliant Message-ID: <5473F8D4.8000107@gmail.com> Hi all, My customer encountered a crash with the messages below: -------- Internal Error (safepoint.cpp:309) guarantee(PageArmed == 0) failed: invaliant -------- - JDK: JDK6u37 x64 - OS: RHEL 5.4 x86_64 I found similar issues in JBS: - JDK-7116986 - JDK-7156454 - JDK-8033717 I read safepoint.cpp in jdk9, and I guess this error is caused by the code below: -------- if (int(iterations) == DeferPollingPageLoopCount) { guarantee (PageArmed == 0, "invariant") ; PageArmed = 1 ; os::make_polling_page_unreadable(); } -------- "iterations" is defined as "unsigned int", and is incremented on each loop iteration. On the other hand, DeferPollingPageLoopCount is defined as intx and its default value is "-1". With that default, "PageArmed" is set to 1 here:
-------- if (DeferPollingPageLoopCount < 0) { // Make polling safepoint aware guarantee (PageArmed == 0, "invariant") ; PageArmed = 1 ; os::make_polling_page_unreadable(); } -------- If "iterations" overflows, do we hit this guarantee? I think this "if" statement should be rewritten as below: -------- diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp --- a/src/share/vm/runtime/safepoint.cpp Mon Nov 24 09:57:02 2014 +0100 +++ b/src/share/vm/runtime/safepoint.cpp Tue Nov 25 12:19:58 2014 +0900 @@ -288,7 +288,8 @@ // 9. On windows consider using the return value from SwitchThreadTo() // to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions. - if (int(iterations) == DeferPollingPageLoopCount) { + if ((DeferPollingPageLoopCount >= 0) && + (int(iterations) == DeferPollingPageLoopCount)) { guarantee (PageArmed == 0, "invariant") ; PageArmed = 1 ; os::make_polling_page_unreadable(); -------- If it is correct, I will file it in JBS and upload a webrev. Could you help me resolve this issue? Thanks, Yasumasa From david.holmes at oracle.com Tue Nov 25 03:24:05 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 25 Nov 2014 13:24:05 +1000 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C27ED.7090700@oracle.com> References: <546C27ED.7090700@oracle.com> Message-ID: <5473F655.4020602@oracle.com> Still looking for a Reviewer, please. Thanks, David On 19/11/2014 3:17 PM, David Holmes wrote: > webrev: > > http://cr.openjdk.java.net/~dholmes/8035663/webrev.jdk9/ > > This test failure exposed a number of issues with the logic in > unsafe.cpp for handling atomic updates of Java long fields on platforms > without any direct support for a 64-bit CAS operation - platforms for > which supports_cx8 is not true.
This only impacts our SE Embedded PPC32 > platform (where we have been using this fix for some time now) but in > case other such platforms came along I wanted to get this pushed to > mainline. > > What the unsafe code did was to use the object containing the field as a > lock object for reading and writing the field. This seems reasonable on > the surface but in fact had a fatal flaw - because we were locking a > Java-level visible object inside what was considered to be a lock-free > code-path by the application and library logic, we could actually induce > a deadlock - which is why the test failed. > > In addition the code had two further flaws: > > 1. Because the field could also be updated via direct assignment in Java > code the unsafe code needed to perform an Atomic::load of the field. And > for good measure we also employ an Atomic::store to ensure no > interference with direct reads of the field in Java code. > > 2. The address of the field was being calculated before using the > ObjectLocker to lock the object, but locking could encounter a safepoint > check allowing the object to relocated by the GC, and we would then use > a stale address. 
> > To fix all of this we: > - introduce a special Mutex to use instead of the deadlock-inducing Java > object > - use Atomic::load and Atomic::store to access the jlong field > - avoid safepoints when locking (alternatively you could ensure you > calculate the address after acquiring the lock ) > > Thanks, > David From david.holmes at oracle.com Tue Nov 25 07:04:20 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 25 Nov 2014 17:04:20 +1000 Subject: guarantee(PageArmed == 0) failed: invaliant In-Reply-To: <5473F8D4.8000107@gmail.com> References: <5473F8D4.8000107@gmail.com> Message-ID: <547429F4.2020803@oracle.com> Hi Yasumasa, On 25/11/2014 1:34 PM, Yasumasa Suenaga wrote: > Hi all, > > My customer encountered crash with below messages: > -------- > Internal Error (safepoint.cpp:309) > guarantee(PageArmed == 0) failed: invaliant > -------- > - JDK: JDK6u37 x64 > - OS: RHEL 5.4 x86_64 > > I found similar issues in JBS: > - JDK-7116986 > - JDK-7156454 > - JDK-8033717 > > I read safepoint.cpp in jdk9, I guess this error is caused in below: > -------- > if (int(iterations) == DeferPollingPageLoopCount) { > guarantee (PageArmed == 0, "invariant") ; > PageArmed = 1 ; > os::make_polling_page_unreadable(); > } > -------- > > "iterations" is defined as "unsigned int", and increments in each loop. > On the other hand, DeferPollingPageLoopCount is defined intx and default > value is "-1" . > > "PageArmed" sets to 1. > -------- > if (DeferPollingPageLoopCount < 0) { > // Make polling safepoint aware > guarantee (PageArmed == 0, "invariant") ; > PageArmed = 1 ; > os::make_polling_page_unreadable(); > } > -------- > > > If "iterations" is overflowed, do we encounter this guarantee ? > I think this "if" statement should rewrite as below: No we want this overflow to trigger the guarantee failure - it indicates a problem elsewhere in the VM because a thread is not reaching the safepoint that has been requested, in a timely manner. 
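The wraparound discussed in this thread can be sketched in isolation. The snippet below is a minimal standalone illustration, not HotSpot code: kDeferPollingPageLoopCount, hits_defer_count(), hits_defer_count_guarded() and demo() are hypothetical names, and only the shape of the comparison and the default of -1 are taken from the messages above. It assumes a compiler on which converting UINT_MAX to int yields -1 (implementation-defined before C++20, though mainstream compilers behave this way).

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-in for the DeferPollingPageLoopCount flag;
// -1 is the default value quoted in the thread.
static const std::intptr_t kDeferPollingPageLoopCount = -1;

// Same shape as the comparison quoted from safepoint.cpp.
// Assumption: int(UINT_MAX) == -1 on the target compiler.
static bool hits_defer_count(unsigned int iterations) {
  return int(iterations) == kDeferPollingPageLoopCount;
}

// Guarded variant along the lines of the proposed patch:
// a negative count can never match.
static bool hits_defer_count_guarded(unsigned int iterations) {
  return kDeferPollingPageLoopCount >= 0 &&
         int(iterations) == kDeferPollingPageLoopCount;
}

// Walks through the interesting iteration counts.
static void demo() {
  assert(!hits_defer_count(0u));
  assert(!hits_defer_count(1u));
  assert(hits_defer_count(UINT_MAX));           // counter at UINT_MAX compares equal to -1
  assert(!hits_defer_count_guarded(UINT_MAX));  // the guard suppresses that match
}
```

Calling demo() passes silently when assertions are enabled. It only shows the arithmetic; it says nothing about whether suppressing the match is desirable in the VM, which is the point under discussion.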
When crashes like this occur you need to examine all the running threads to find out which are not safepoint-safe and then determine what they are doing and why they have not performed a safepoint check. David ------ > -------- > diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp > --- a/src/share/vm/runtime/safepoint.cpp Mon Nov 24 09:57:02 2014 +0100 > +++ b/src/share/vm/runtime/safepoint.cpp Tue Nov 25 12:19:58 2014 +0900 > @@ -288,7 +288,8 @@ > // 9. On windows consider using the return value from SwitchThreadTo() > // to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions. > > - if (int(iterations) == DeferPollingPageLoopCount) { > + if ((DeferPollingPageLoopCount >= 0) && > + (int(iterations) == DeferPollingPageLoopCount)) { > guarantee (PageArmed == 0, "invariant") ; > PageArmed = 1 ; > os::make_polling_page_unreadable(); > -------- > > > If it is correct, I will file it to JBS and upload webrev. > Could you help me to resolve this issue? > > > Thanks, > > Yasumasa > From david.holmes at oracle.com Tue Nov 25 07:25:25 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 25 Nov 2014 17:25:25 +1000 Subject: RFR:8060074:Cleanup of unused memory tracking parameters in os::free and its callers. In-Reply-To: <546E3FCD.7020808@oracle.com> References: <545D19D9.3040400@oracle.com> <5463A9E9.8010005@oracle.com> <5463BD56.500@oracle.com> <5463C52D.4000600@oracle.com> <546BBFDE.8040003@oracle.com> <546C04DC.1010606@oracle.com> <546E3FCD.7020808@oracle.com> Message-ID: <54742EE5.7050306@oracle.com> Hi Max, On 21/11/2014 5:23 AM, Max Ockner wrote: > David, > Thomas beat me to it. os::free does not ever use the memtracking > argument that was being passed to it. This does not break memory > tracking though, because this memory tracking parameter is supplied and > saved in a malloc header during memory allocation. > os::free accesses and uses this header through > Memtracker::record_free(memblock). 
As it was explained to me, this was needed in NMT1 but not NMT2. So that's fine. > Let me know if anything else stands out to you. I would love to have the > David Holmes stamp of approval for this fix. Seems fine to me. :) Thanks, David > Other notes: > I have fixed the copyrights and made a new webrev, but it hasn't been > copied to openjdk yet. Shouldn't be an issue because there isn't really > anything new to see. Either way, sorry about that. > Updated Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8060074.1/ > > Thanks, > Max Ockner > > > > > On 11/18/2014 9:47 PM, David Holmes wrote: >> Hi Max, >> >> So I would have assumed memflags were being passed to all the "free" >> routines for NMT purposes. Otherwise how does NMT track this? >> >> Thanks, >> David >> >> On 19/11/2014 7:53 AM, Max Ockner wrote: >>> Hello all, >>> Please review this minor cleanup: >>> >>> Bug ID: 8060074 >>> Webrev: http://cr.openjdk.java.net/~coleenp/8060074/ >>> >>> Summary: >>> (1) os::free takes two arguments, but never uses the second argument, >>> which is a MEMFLAG. I have removed this argument from every os::free >>> call. >>> (2) The FreeHeap method in src/share/vm/memory/allocation.inline.hpp >>> also takes a MEMFLAG argument, which is only used to call os::free. Now >>> an unused argument, it has been removed from all FreeHeap calls. No >>> other methods which directly call os::free() have this problem. >>> (3) The FREE_C_HEAP_ARRAY macro in src/share/vm/memory/allocation.hpp >>> takes a MEMFLAG argument which is passed to FreeHeap, and nothing else. >>> This argument is now unused, and has been removed. No other methods >>> which call FreeHeap have this problem. >>> >>> No methods or macros which use the FREE_C_HEAP_ARRAY macro needed >>> cleanup. I have also removed the extra argument from the definitions of >>> the above methods.
>>> >>> Tests: jtreg hotspot tests with >>> -vmoption:"-XX:NativeMemoryTracking=detail" >>> >>> Thanks for your help, >>> Max Ockner > From david.holmes at oracle.com Tue Nov 25 08:03:59 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 25 Nov 2014 18:03:59 +1000 Subject: RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx In-Reply-To: <546E638E.2020208@oracle.com> References: <543EB71A.8000403@oracle.com> <543F174F.7040204@oracle.com> <545CF033.4010503@oracle.com> <54619729.30704@oracle.com> <546E638E.2020208@oracle.com> Message-ID: <547437EF.5010806@oracle.com> Hi Max, On 21/11/2014 7:56 AM, Max Ockner wrote: > Hello again, > > I have made changes based on all comments. There is now a pair of assert > statements in the Monitor/Mutex wait() methods. When I reran tests, I Is there an updated webrev? > caught another lock that I had to change to "sometimes", but the assert > that caught this lock was not in wait. There are currently no locks that > try to pass an incorrect safepoint check argument to wait(). > Instead, gcbasher did not catch this lock last time, when the only > asserts were in lock and lock_without_safepoint. This lock is > "CMS_markBitMap_lock" in > share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp. > I'm guessing that it was not caught by the tests because this section of > code is not reached very often. My initial inspection failed to catch > this lock because it is passed around between various structures and > renamed 4 times. I have not yet found a good way to check for this > situation, even with a debugger. > > Are there any tests which actually manage to hit every line of code? No. There is too much code that is dependent on low-level details of the GC used, the compilation strategy, plus the set of runtime flags used (and whether product or fastdebug). That's why we have lots of tests run in lots of different ways, to try to get coverage.
> How should I handle this situation where I can't rely on the tests that > I have run to tell me if I have missed something? > At what point can I assume that everything is OK? Difficult to answer in general - there are a number of recommended test suites used by the runtime team, but your changes will also impact GC and compiler code and so may not get exercised by the runtime test suites (unless run with various compiler and GC options). Perhaps an ad-hoc test run similar to nightlies? Or you push after the weekly snapshot so as to maximise nightly testing, and if there are issues exposed then you have time to address them or revert the fix. Cheers, David > Thanks, > Max Ockner > > On 11/10/2014 11:57 PM, David Holmes wrote: >> Hi Max, >> >> On 8/11/2014 2:15 AM, Max Ockner wrote: >>> Hello all, >>> I have made these additonal changes: >>> -Moved the assert() statements into the lock and lock_without_safepoint >>> methods. >>> -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired >>> -Changed the Monitor::SafepointCheckRequired values for several locks >>> which were locked outside of a MutexLockerEx (some were locked with >>> MutexLocker, some were locked were locked without any MutexLocker* ) >>> >>> New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/ >> >> Generally this is all okay - a few style and other nits below. >> >> However you missed adding an assert in Monitor::wait to check if the >> no_safepoint_check flag was being used correctly for the current monitor. >> >> Specific comments: >> >> src/share/vm/runtime/mutex.hpp >> >> This comment is no longer accurate with the moved check location: >> >> + // MutexLockerEx checks these flags when acquiring a lock >> + // to ensure consistent checking for each lock. >> >> The same goes for other references to MutexLockerEx in the enum >> description. >> >> Also copyright year needs updating. 
>> >> --- >> >> src/share/vm/runtime/mutex.cpp >> >> 898 //Ensure >> 961 //Ensure >> >> Space needed after // >> >> --- >> >> src/share/vm/runtime/mutexLocker.cpp >> >> + var = new type(Mutex::pri, #var, vm_block,safepoint_check_allowed); \ >> >> space needed after comma in k,s >> >> --- >> >> src/share/vm/runtime/mutexLocker.hpp >> >> Whitespace only changes - looks like leftovers from removed edits. >> >> >> >> Thanks, >> David >> >> >>> Additional testing: >>> jtreg ./jdk/test/java/lang/invoke >>> jtreg jfr tests >>> >>> Here is a list of ALL of the "sometimes" locks: >>> >>> "WorkGroup monitor" share/vm/utilities/workgroup.cpp >>> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp >>> "CompactibleFreeListSpace._lock" >>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >>> >>> "freelist par lock" >>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >>> >>> "SR_lock" share/vm/runtime/thread.cpp >>> >>> The remaining "sometimes" locks can be found in >>> share/vm/runtime/mutexLocker.cpp: >>> >>> ParGCRareEvent_lock >>> Safepoint_lock >>> Threads_lock >>> VMOperationQueue_lock >>> VMOperationRequest_lock >>> Terminator_lock >>> Heap_lock >>> Compile_lock >>> PeriodicTask_lock >>> JfrStacktrace_lock >>> >>> I have not checked the validity of the "sometimes" locks, and I believe >>> that this should be a different project. >>> >>> Thanks for your help! >>> Max Ockner >>> On 10/15/2014 8:54 PM, David Holmes wrote: >>>> Hi Max, >>>> >>>> This is looking good. >>>> >>>> A few high-level initial comments: >>>> >>>> I think SafepointAllowed should be SafepointCheckNeeded >>>> >>>> Why are the checks in MutexLocker when the state is maintained in the >>>> mutex itself and the mutex/monitor has lock_without_safepoint, and >>>> wait() ? I would have expected to see the >>>> check in the mutex/monitor methods. >>>> >>>> Checking consistent usage of the _no_safepoint_check_flag is good. 
But >>>> another part of this is that a monitor/mutex that never checks for >>>> safepoints should never be held when a thread blocks at a safepoint - >>>> is there some way to easily check that? I was surprised how many locks >>>> are actually not checking for safepoints. >>>> >>>> Did you find any cases where the mutex/monitor was being used >>>> inconsistently and incorrectly? >>>> >>>> Did you analyse the "sometimes" cases to see if they were safe? >>>> (Aside: just for fun check out what happens if you lock the >>>> Threads_lock with a safepoint check and a safepoint has been requested >>>> :) ). >>>> >>>> Cheers, >>>> David >>>> >>>> On 16/10/2014 4:04 AM, Max Ockner wrote: >>>>> Hi all, >>>>> >>>>> I am a new member of the Hotspot runtime team in Burlington, MA. >>>>> Please review my first fix related to safepoint checking. >>>>> >>>>> Summary: MutexLockerEx can either acquire a lock with or without a >>>>> safepoint check. >>>>> In some cases, a particular lock must either safepoint check always or >>>>> never to avoid deadlocking. >>>>> Some other locks have semantics which allow them to avoid deadlocks >>>>> despite having a safepoint check only some of the time. >>>>> All locks that are OK having inconsistent safepoint checks have been >>>>> marked. All locks that should never safepoint check and all locks that >>>>> should always safepoint check have also been marked. >>>>> When a MutexLockerEx acquires a lock with or without a safepoint >>>>> check, >>>>> the lock's safepointAllowed marker is checked to ensure consistent >>>>> safepoint checking. 
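As a rough illustration of the consistency rule summarised above, the check can be modelled as below. This is only a sketch under stated assumptions, not the code from the webrev: ToyMonitor, the enum value names and the *_is_consistent helpers are hypothetical. Only the idea that each lock is marked as always, never or sometimes checking for safepoints comes from the summary. The real change uses assert(); the sketch returns a bool so that it can be exercised without aborting.

```cpp
// Hypothetical model of the per-lock marker described in the summary.
enum class SafepointCheckRequired {
    Never,      // lock must always be taken without a safepoint check
    Sometimes,  // lock tolerates both styles of acquisition
    Always      // lock must always be taken with a safepoint check
};

struct ToyMonitor {
    SafepointCheckRequired required;

    // Acquiring WITH a safepoint check is inconsistent only for "Never" locks.
    bool lock_is_consistent() const {
        return required != SafepointCheckRequired::Never;
    }

    // Acquiring WITHOUT a safepoint check is inconsistent only for "Always" locks.
    bool lock_without_safepoint_check_is_consistent() const {
        return required != SafepointCheckRequired::Always;
    }
};
```

A "Sometimes" lock passes both checks, which is why such locks may still deserve a separate manual review.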
>>>>> >>>>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290 >>>>> >>>>> Tested with: >>>>> jprt "-testset hotspot" >>>>> jtreg hotspot >>>>> vm.quick.testlist >>>>> >>>>> Whitebox tests: >>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test >>>>> expects Assert ("This lock should always have a safepoint check") >>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test >>>>> expects Assert ("This lock should never have a safepoint check") >>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code >>>>> should not assert. (Lock is properly acquired with no safepoint check) >>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code >>>>> should not assert. (Lock is properly acquired with safepoint check) >>>>> >>>>> Thanks, >>>>> Max >>>>> >>> > From david.holmes at oracle.com Tue Nov 25 08:38:58 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 25 Nov 2014 18:38:58 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <547330E5.1050708@gmail.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> Message-ID: <54744022.2030208@oracle.com> Sorry Yasumasa, this fell off my radar and I was hoping for others to comment. We still need a second reviewer. The change in: src/os/aix/vm/os_aix.cpp src/os/solaris/vm/os_solaris.cpp jio_snprintf(buffer, bufferSize, "%s/core or core.%d", current_process_id()); has no argument for the %s - presumably p was intended. Thanks, David On 24/11/2014 11:21 PM, Yasumasa Suenaga wrote: > Hi all, > > I've uploaded webrev for this issue about a month ago. > Could you review it and sponsor it? 
> > > Thanks, > > Yasumasa > > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: >> Hi David, >> >> I've uploaded new webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ >> >> >>> I wasn't suggesting that you make such a change though because it is >>> large and disruptive. >> >>> Unfactoring check_or_create_dump is a step backwards in terms of code >>> sharing. >> >> I restored check_or_create_dump() to os_posix.cpp . >> And I changed get_core_path() to create message which represents core >> dump path >> (including filename) in each OS. >> >> >>> Expanding the get_core_path in os_linux.cpp to handle the >>> core_pattern may be okay (but I don't know enough about it to >>> validate everything). >> >> I implemented all parameters in Linux kernel documentation: >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt >> >> So I think that parameters which are processed are enough. >> >> >> Thanks, >> >> Yasumasa >> >> >> >> (2014/10/15 9:41), David Holmes wrote: >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: >>>> Hi David, >>>> >>>> Thank you for comments! >>>> I've uploaded new webrev. Could you review it again? >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ >>>> >>>> I am an author of jdk9. So I cannot commit it. >>>> Could you be a sponsor for this enhancement? >>>> >>>> >>>>> In which case that should be handled by the linux specific >>>>> get_core_path() function. >>>> >>>> Agree. >>>> So I implemented it in os_linux.cpp . >>>> But part of format characters (%P: global pid, %s: signal, %t dump >>>> time) >>>> are not processed >>>> in this function because I think these parameters are difficult to >>>> handle in it. >>>> >>>> %P: I could not find API for this. >>>> %s: We have to change arguments of get_core_path() . >>>> %t: This parameter means timestamp of coredump. It is decided in >>>> Kernel. >>>> >>>> >>>>> Fixing this means changing all the os_posix using platforms. 
But your >>>>> patch is not about this part. :) >>>> >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, >>>> BSD, >>>> Solaris, Linux) . >>>> So I can write Linux specific code to check_or_create_dump() . >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) >>> >>> I wasn't suggesting that you make such a change though because it is >>> large and disruptive. The simple handling of the | part of >>> core_pattern was basically ok. Expanding the get_core_path in >>> os_linux.cpp to handle the core_pattern may be okay (but I don't know >>> enough about it to validate everything). Unfactoring >>> check_or_create_dump is a step backwards in terms of code sharing. >>> >>> Sorry this has grown too large for me to deal with right now. >>> >>> David >>> ----- >>> >>>> >>>>> Though I'm unclear whether it both invokes the program and creates a >>>>> core dump file; or just invokes the program? >>>> >>>> If '|' is set, Linux kernel will just redirect core image to user >>>> process. >>>> Kernel documentation says as below: >>>> ------------ >>>> . If the first character of the pattern is a '|', the kernel will treat >>>> the rest of the pattern as a command to run. The core dump will be >>>> written to the standard input of that program instead of to a file. >>>> ------------ >>>> >>>> And implementation of coredump (do_coredump()) follows to it. >>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c >>>> >>>> >>>> >>>> In case of ABRT, ABRT dumps core image to default location >>>> (/core.) >>>> if user set unlimited to resource limit of core (ulimit -c) . >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c >>>> >>>> >>>>> A few style nits - you need spaces around keywords and before braces >>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>> than "treated". 
>>>>> And as you don't do anything in the non-redirect case I suggest >>>>> collapsing this: >>>> >>>> I've fixed them. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> (2014/10/13 9:41), David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> Sorry for my English. >>>>>> >>>>>> I want to propose that JVM should create message according to core >>>>>> pattern (/proc/sys/kernel/core_pattern) . >>>>>> So I filed it to JBS and created a patch. >>>>> >>>>> So I've had a quick look at this core_pattern business and it seems to >>>>> me that there are two aspects to this. >>>>> >>>>> First, without the leading |, the entry in the core_pattern file is a >>>>> naming pattern for the core file. In which case that should be handled >>>>> by the linux specific get_core_path() function. Though that in itself >>>>> can't fully report the expected name, as part of it is provided in the >>>>> shared code in os::check_or_create_dump. Fixing this means changing >>>>> all the os_posix using platforms. But your patch is not about this >>>>> part. :) >>>>> >>>>> Second, with a leading | the core_pattern is actually the name of a >>>>> program to execute when the program is about to core dump, and that is >>>>> what you report with your patch. Though I'm unclear whether it both >>>>> invokes the program and creates a core dump file; or just invokes the >>>>> program? >>>>> >>>>> So with regards to this second part your patch seems functionally ok. >>>>> I do dislike having a big chunk of linux specific code in this "posix" >>>>> support file but ... >>>>> >>>>> A few style nits - you need spaces around keywords and before >>>>> braces eg: >>>>> >>>>> if(x){ >>>>> >>>>> should be >>>>> >>>>> if (x) { >>>>> >>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>> than "treated". 
>>>>> >>>>> And as you don't do anything in the non-redirect case I suggest >>>>> collapsing this: >>>>> >>>>> 83 is_redirect = core_pattern[0] == '|'; >>>>> 84 } >>>>> 85 >>>>> 86 if(is_redirect){ >>>>> 87 jio_snprintf(buffer, bufferSize, >>>>> 88 "Core dumps may be treated with \"%s\"", >>>>> &core_pattern[1]); >>>>> 89 } >>>>> >>>>> to just >>>>> >>>>> 83 if (core_pattern[0] == '|') { // redirect >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be >>>>> processed with \"%s\"", &core_pattern[1]); >>>>> 85 } >>>>> 86 } >>>>> >>>>> Comments from other runtime folk appreciated. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> 2014/10/07 15:43 "David Holmes" >>>>> >: >>>>>> >>>>>> Hi Yasumasa, >>>>>> >>>>>> I'm sorry but I don't understand what you are proposing. When you >>>>>> say >>>>>> "treat" do you mean "create"? Otherwise what do you mean by >>>>>> "treated"? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: >>>>>> > I'm in Hackergarten @ JavaOne :-) >>>>>> > >>>>>> > >>>>>> > Hi all, >>>>>> > >>>>>> > I would like to enhance the messages in hs_err report. >>>>>> > Modern Linux kernel can treat core dump with user process >>>>>> (e.g. ABRT) >>>>>> > However, hs_err report cannot detect it. >>>>>> > >>>>>> > I think that hs_err report should output messages as below: >>>>>> > ------------- >>>>>> > Failed to write core dump. Core dumps may be treated with >>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s >>>>>> %c %p >>>>>> %u %g %t e" >>>>>> > ------------- >>>>>> > >>>>>> > I've uploaded webrev of this enhancement. >>>>>> > Could you review it? >>>>>> > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ >>>>>> > >>>>>> > This patch works fine on Fedora20 x86_64. 
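The redirect check discussed in this thread can be sketched in isolation as follows. This is a simplified stand-in, not the patch itself: describe_core_pattern is a hypothetical helper, it takes the pattern as a string instead of reading /proc/sys/kernel/core_pattern, and it uses std::string rather than jio_snprintf. The message text follows the "processed with" wording suggested in the review.

```cpp
#include <string>

// Hypothetical helper: returns the hint to print when the core pattern
// redirects the dump to a program, or an empty string otherwise.
std::string describe_core_pattern(const std::string& core_pattern) {
    // A leading '|' means the kernel pipes the core image to a command
    // instead of writing a file.
    if (!core_pattern.empty() && core_pattern[0] == '|') {
        return "Core dumps may be processed with \"" + core_pattern.substr(1) + "\"";
    }
    // Plain naming pattern: nothing to report in this sketch.
    return std::string();
}
```

Keeping the non-redirect case empty mirrors the suggestion to collapse the code, since nothing is done for it there either.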
>>>>>> > >>>>>> > >>>>>> > >>>>>> > Thanks, >>>>>> > >>>>>> > Yasumasa >>>>>> > >>>>>> From yasuenag at gmail.com Tue Nov 25 08:48:33 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 25 Nov 2014 17:48:33 +0900 Subject: guarantee(PageArmed == 0) failed: invaliant In-Reply-To: <547429F4.2020803@oracle.com> References: <5473F8D4.8000107@gmail.com> <547429F4.2020803@oracle.com> Message-ID: Hi David, Thank you for details. I can understand purpose for this guarantee. I read hs_err again, I found thread which state is _thread_new . I guess it is reason of this issue, but I cannot evaluate because core image was not available. If this crash will be reproduced, I will try check details. Thanks, Yasumasa 2014/11/25 16:04 "David Holmes" : > Hi Yasumasa, > > On 25/11/2014 1:34 PM, Yasumasa Suenaga wrote: > > Hi all, > > > > My customer encountered crash with below messages: > > -------- > > Internal Error (safepoint.cpp:309) > > guarantee(PageArmed == 0) failed: invaliant > > -------- > > - JDK: JDK6u37 x64 > > - OS: RHEL 5.4 x86_64 > > > > I found similar issues in JBS: > > - JDK-7116986 > > - JDK-7156454 > > - JDK-8033717 > > > > I read safepoint.cpp in jdk9, I guess this error is caused in below: > > -------- > > if (int(iterations) == DeferPollingPageLoopCount) { > > guarantee (PageArmed == 0, "invariant") ; > > PageArmed = 1 ; > > os::make_polling_page_unreadable(); > > } > > -------- > > > > "iterations" is defined as "unsigned int", and increments in each loop. > > On the other hand, DeferPollingPageLoopCount is defined intx and default > > value is "-1" . > > > > "PageArmed" sets to 1. > > -------- > > if (DeferPollingPageLoopCount < 0) { > > // Make polling safepoint aware > > guarantee (PageArmed == 0, "invariant") ; > > PageArmed = 1 ; > > os::make_polling_page_unreadable(); > > } > > -------- > > > > > > If "iterations" is overflowed, do we encounter this guarantee ? 
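The arithmetic behind this question can be checked with a small stand-alone sketch. It assumes a 32-bit unsigned int and uses std::intptr_t as a stand-in for HotSpot's intx; the two function names are hypothetical and only model the two forms of the condition quoted in this mail.

```cpp
#include <cstdint>

// Models the existing comparison: the unsigned counter is converted to int
// and compared against the flag value.
bool unguarded_match(unsigned int iterations, std::intptr_t defer_count) {
    // Converting a value above INT_MAX to int wraps modulo 2^32 on
    // two's-complement targets (guaranteed by the standard since C++20),
    // so the maximum counter value becomes -1.
    return static_cast<int>(iterations) == defer_count;
}

// Models the proposed form: a negative flag value never matches.
bool guarded_match(unsigned int iterations, std::intptr_t defer_count) {
    return defer_count >= 0 && static_cast<int>(iterations) == defer_count;
}
```

With the default flag value of -1, the unguarded form matches once the counter reaches its maximum value, while the guarded form never does.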
> > I think this "if" statement should rewrite as below: > > No we want this overflow to trigger the guarantee failure - it indicates > a problem elsewhere in the VM because a thread is not reaching the > safepoint that has been requested, in a timely manner. > > When crashes like this occur you need to examine all the running threads > to find out which are not safepoint-safe and then determine what they > are doing and why they have not performed a safepoint check. > > David > ------ > > > -------- > > diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp > > --- a/src/share/vm/runtime/safepoint.cpp Mon Nov 24 09:57:02 2014 > +0100 > > +++ b/src/share/vm/runtime/safepoint.cpp Tue Nov 25 12:19:58 2014 > +0900 > > @@ -288,7 +288,8 @@ > > // 9. On windows consider using the return value from > SwitchThreadTo() > > // to drive subsequent spin/SwitchThreadTo()/Sleep(N) > decisions. > > > > - if (int(iterations) == DeferPollingPageLoopCount) { > > + if ((DeferPollingPageLoopCount >= 0) && > > + (int(iterations) == DeferPollingPageLoopCount)) { > > guarantee (PageArmed == 0, "invariant") ; > > PageArmed = 1 ; > > os::make_polling_page_unreadable(); > > -------- > > > > > > If it is correct, I will file it to JBS and upload webrev. > > Could you help me to resolve this issue? > > > > > > Thanks, > > > > Yasumasa > > > From staffan.larsen at oracle.com Tue Nov 25 09:15:34 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 25 Nov 2014 10:15:34 +0100 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <547330E5.1050708@gmail.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> Message-ID: src/os/bsd/vm/os_linux.cpp: I'm inclined to think this is too complicated and hard to test and maintain (and I see no tests in the webrev).
Could we not simplify this to print a helpful message instead? Something that prints the core_pattern and perhaps some of the values that could be used for substitution, but does not do the actual substitution? I think that would go a long way but be a lot more maintainable. src/os/bsd/vm/os_bsd.cpp: On OS X cores are by default written to /cores/core.. This is configureable with the kern.corefile sysctl variable, although it is rare to do so. /Staffan > On 24 nov 2014, at 14:21, Yasumasa Suenaga wrote: > > Hi all, > > I've uploaded webrev for this issue about a month ago. > Could you review it and sponsor it? > > > Thanks, > > Yasumasa > > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: >> Hi David, >> >> I've uploaded new webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ >> >> >>> I wasn't suggesting that you make such a change though because it is large and disruptive. >> >>> Unfactoring check_or_create_dump is a step backwards in terms of code sharing. >> >> I restored check_or_create_dump() to os_posix.cpp . >> And I changed get_core_path() to create message which represents core dump path >> (including filename) in each OS. >> >> >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). >> >> I implemented all parameters in Linux kernel documentation: >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt >> >> So I think that parameters which are processed are enough. >> >> >> Thanks, >> >> Yasumasa >> >> >> >> (2014/10/15 9:41), David Holmes wrote: >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: >>>> Hi David, >>>> >>>> Thank you for comments! >>>> I've uploaded new webrev. Could you review it again? >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ >>>> >>>> I am an author of jdk9. So I cannot commit it. >>>> Could you be a sponsor for this enhancement? 
>>>> >>>> >>>>> In which case that should be handled by the linux specific >>>>> get_core_path() function. >>>> >>>> Agree. >>>> So I implemented it in os_linux.cpp . >>>> But part of format characters (%P: global pid, %s: signal, %t dump time) >>>> are not processed >>>> in this function because I think these parameters are difficult to >>>> handle in it. >>>> >>>> %P: I could not find API for this. >>>> %s: We have to change arguments of get_core_path() . >>>> %t: This parameter means timestamp of coredump. It is decided in Kernel. >>>> >>>> >>>>> Fixing this means changing all the os_posix using platforms. But your >>>>> patch is not about this part. :) >>>> >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, BSD, >>>> Solaris, Linux) . >>>> So I can write Linux specific code to check_or_create_dump() . >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) >>> >>> I wasn't suggesting that you make such a change though because it is large and disruptive. The simple handling of the | part of core_pattern was basically ok. Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). Unfactoring check_or_create_dump is a step backwards in terms of code sharing. >>> >>> Sorry this has grown too large for me to deal with right now. >>> >>> David >>> ----- >>> >>>> >>>>> Though I'm unclear whether it both invokes the program and creates a >>>>> core dump file; or just invokes the program? >>>> >>>> If '|' is set, Linux kernel will just redirect core image to user process. >>>> Kernel documentation says as below: >>>> ------------ >>>> . If the first character of the pattern is a '|', the kernel will treat >>>> the rest of the pattern as a command to run. The core dump will be >>>> written to the standard input of that program instead of to a file. >>>> ------------ >>>> >>>> And implementation of coredump (do_coredump()) follows to it. 
>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c >>>> >>>> >>>> In case of ABRT, ABRT dumps core image to default location >>>> (/core.) >>>> if user set unlimited to resource limit of core (ulimit -c) . >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c >>>> >>>> >>>>> A few style nits - you need spaces around keywords and before braces >>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>> than "treated". >>>>> And as you don't do anything in the non-redirect case I suggest >>>>> collapsing this: >>>> >>>> I've fixed them. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> (2014/10/13 9:41), David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> Sorry for my English. >>>>>> >>>>>> I want to propose that JVM should create message according to core >>>>>> pattern (/proc/sys/kernel/core_pattern) . >>>>>> So I filed it to JBS and created a patch. >>>>> >>>>> So I've had a quick look at this core_pattern business and it seems to >>>>> me that there are two aspects to this. >>>>> >>>>> First, without the leading |, the entry in the core_pattern file is a >>>>> naming pattern for the core file. In which case that should be handled >>>>> by the linux specific get_core_path() function. Though that in itself >>>>> can't fully report the expected name, as part of it is provided in the >>>>> shared code in os::check_or_create_dump. Fixing this means changing >>>>> all the os_posix using platforms. But your patch is not about this >>>>> part. :) >>>>> >>>>> Second, with a leading | the core_pattern is actually the name of a >>>>> program to execute when the program is about to core dump, and that is >>>>> what you report with your patch. Though I'm unclear whether it both >>>>> invokes the program and creates a core dump file; or just invokes the >>>>> program? 
>>>>> >>>>> So with regards to this second part your patch seems functionally ok. >>>>> I do dislike having a big chunk of linux specific code in this "posix" >>>>> support file but ... >>>>> >>>>> A few style nits - you need spaces around keywords and before braces eg: >>>>> >>>>> if(x){ >>>>> >>>>> should be >>>>> >>>>> if (x) { >>>>> >>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>> than "treated". >>>>> >>>>> And as you don't do anything in the non-redirect case I suggest >>>>> collapsing this: >>>>> >>>>> 83 is_redirect = core_pattern[0] == '|'; >>>>> 84 } >>>>> 85 >>>>> 86 if(is_redirect){ >>>>> 87 jio_snprintf(buffer, bufferSize, >>>>> 88 "Core dumps may be treated with \"%s\"", >>>>> &core_pattern[1]); >>>>> 89 } >>>>> >>>>> to just >>>>> >>>>> 83 if (core_pattern[0] == '|') { // redirect >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be >>>>> processed with \"%s\"", &core_pattern[1]); >>>>> 85 } >>>>> 86 } >>>>> >>>>> Comments from other runtime folk appreciated. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> 2014/10/07 15:43 "David Holmes" >>>>> >: >>>>>> >>>>>> Hi Yasumasa, >>>>>> >>>>>> I'm sorry but I don't understand what you are proposing. When you >>>>>> say >>>>>> "treat" do you mean "create"? Otherwise what do you mean by >>>>>> "treated"? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: >>>>>> > I'm in Hackergarten @ JavaOne :-) >>>>>> > >>>>>> > >>>>>> > Hi all, >>>>>> > >>>>>> > I would like to enhance the messages in hs_err report. >>>>>> > Modern Linux kernel can treat core dump with user process >>>>>> (e.g. ABRT) >>>>>> > However, hs_err report cannot detect it. >>>>>> > >>>>>> > I think that hs_err report should output messages as below: >>>>>> > ------------- >>>>>> > Failed to write core dump. 
Core dumps may be treated with >>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p >>>>>> %u %g %t e" >>>>>> > ------------- >>>>>> > >>>>>> > I've uploaded webrev of this enhancement. >>>>>> > Could you review it? >>>>>> > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ >>>>>> > >>>>>> > This patch works fine on Fedora20 x86_64. >>>>>> > >>>>>> > >>>>>> > >>>>>> > Thanks, >>>>>> > >>>>>> > Yasumasa >>>>>> > >>>>>> From markus.gronlund at oracle.com Tue Nov 25 10:21:46 2014 From: markus.gronlund at oracle.com (Markus Grönlund) Date: Tue, 25 Nov 2014 02:21:46 -0800 (PST) Subject: RFR(xs): 8065788: os::reserve_memory() on Windows should not assert that allocation size is aligned to OS allocation granularity. In-Reply-To: References: Message-ID: Hi Thomas, Thanks for finding and addressing this - looks good. Cheers Markus -----Original Message----- From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] Sent: 24 November 2014 18:32 To: HotSpot Open Source Developers Subject: RFR(xs): 8065788: os::reserve_memory() on Windows should not assert that allocation size is aligned to OS allocation granularity. Hi, a very small change: Bug Report: https://bugs.openjdk.java.net/browse/JDK-8065788 WebRev: http://cr.openjdk.java.net/~simonis/webrevs/8065788/ os::reserve_memory() on Windows asserts that allocation size is aligned to os::vm_allocation_granularity(). This assert is wrong and should be removed. Allocation granularity affects the alignment of attach addresses, not of the allocated size. The latter is aligned to page size, but asserting that would be unnecessarily strict, as VirtualAlloc() will just quietly align size up to page size.
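The difference between the two alignments can be illustrated with a small sketch. The constants below are assumed typical Windows values (4 KiB pages, 64 KiB allocation granularity) chosen for illustration; real code would query them, for example through GetSystemInfo(). The helper names are hypothetical and are not taken from the webrev.

```cpp
#include <cstddef>

// Assumed typical values, hard-coded only for this sketch.
constexpr std::size_t kPageSize = 4 * 1024;
constexpr std::size_t kAllocationGranularity = 64 * 1024;

// Rounds value up to the next multiple of alignment (a power of two).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// True if value is a multiple of the allocation granularity. This is the
// property that matters for attach addresses, not for sizes.
constexpr bool is_granularity_aligned(std::size_t value) {
    return (value & (kAllocationGranularity - 1)) == 0;
}
```

A request of 5000 bytes would be rounded up to 8192 bytes, which is page aligned but not a multiple of the allocation granularity, so an assert on granularity would reject a size that is otherwise fine.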
For details see MSDN on VirtualAlloc(): http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx Kind Regards, Thomas Stüfe From goetz.lindenmaier at sap.com Tue Nov 25 15:06:18 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 25 Nov 2014 15:06:18 +0000 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> Hi, please review and sponsor this tiny fix: http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8065915 It needs to go to hotspot-comp. 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp must also include node.hpp. Without that include the build breaks: opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined Best regards, Goetz. From coleen.phillimore at oracle.com Tue Nov 25 20:17:49 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 25 Nov 2014 15:17:49 -0500 Subject: [8u40] RFR: 8042235: redefining method used by multiple MethodHandles crashes VM Message-ID: <5474E3ED.3060306@oracle.com> This is a backport of the bug fix for bug https://bugs.openjdk.java.net/browse/JDK-8042593 The fix has been in JDK9 for a week with a couple of nights of successful testing. The patch applied cleanly to jdk8u40. http://cr.openjdk.java.net/~coleenp/8042235_8u40/ Please approve this backport.
thanks, Coleen From coleen.phillimore at oracle.com Tue Nov 25 23:05:36 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 25 Nov 2014 18:05:36 -0500 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <546C27ED.7090700@oracle.com> References: <546C27ED.7090700@oracle.com> Message-ID: <54750B40.4090403@oracle.com> David, I think this code looks good. The mutex seems like the right approach. Thank you for fixing this. It looked really broken now that you've described it in such readable detail. Coleen On 11/19/14, 12:17 AM, David Holmes wrote: > webrev: > > http://cr.openjdk.java.net/~dholmes/8035663/webrev.jdk9/ > > This test failure exposed a number of issues with the logic in > unsafe.cpp for handling atomic updates of Java long fields on > platforms without any direct support for a 64-bit CAS operation - > platforms for which supports_cx8 is not true. This only impacts our SE > Embedded PPC32 platform (where we have been using this fix for some > time now) but in case other such platforms came along I wanted to get > this pushed to mainline. > > What the unsafe code did was to use the object containing the field as > a lock object for reading and writing the field. This seems reasonable > on the surface but in fact had a fatal flaw - because we were locking > a Java-level visible object inside what was considered to be a > lock-free code-path by the application and library logic, we could > actually induce a deadlock - which is why the test failed. > > In addition the code had two further flaws: > > 1. Because the field could also be updated via direct assignment in > Java code the unsafe code needed to perform an Atomic::load of the > field. And for good measure we also employ an Atomic::store to ensure > no interference with direct reads of the field in Java code. > > 2. 
The address of the field was being calculated before using the > ObjectLocker to lock the object, but locking could encounter a > safepoint check allowing the object to relocated by the GC, and we > would then use a stale address. > > To fix all of this we: > - introduce a special Mutex to use instead of the deadlock-inducing > Java object > - use Atomic::load and Atomic::store to access the jlong field > - avoid safepoints when locking (alternatively you could ensure you > calculate the address after acquiring the lock ) > > Thanks, > David From david.holmes at oracle.com Wed Nov 26 01:32:33 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 26 Nov 2014 11:32:33 +1000 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> Message-ID: <54752DB1.7070008@oracle.com> On 26/11/2014 1:06 AM, Lindenmaier, Goetz wrote: > Hi, > > please review and sponsor this tiny fix: > > http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8065915 > > It needs to go to hotspot-comp. > > 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp also must include node.hpp. This breaks the build. If compile.hpp uses things from node.hpp then shouldn't it include node.hpp? David > opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined > > Best regards, > Goetz. 
> From david.holmes at oracle.com Wed Nov 26 02:01:17 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 26 Nov 2014 12:01:17 +1000 Subject: RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java In-Reply-To: <54750B40.4090403@oracle.com> References: <546C27ED.7090700@oracle.com> <54750B40.4090403@oracle.com> Message-ID: <5475346D.9040007@oracle.com> Thanks Coleen! David On 26/11/2014 9:05 AM, Coleen Phillimore wrote: > > David, > > I think this code looks good. The mutex seems like the right approach. > Thank you for fixing this. It looked really broken now that you've > described it in such readable detail. > > Coleen > > On 11/19/14, 12:17 AM, David Holmes wrote: >> webrev: >> >> http://cr.openjdk.java.net/~dholmes/8035663/webrev.jdk9/ >> >> This test failure exposed a number of issues with the logic in >> unsafe.cpp for handling atomic updates of Java long fields on >> platforms without any direct support for a 64-bit CAS operation - >> platforms for which supports_cx8 is not true. This only impacts our SE >> Embedded PPC32 platform (where we have been using this fix for some >> time now) but in case other such platforms came along I wanted to get >> this pushed to mainline. >> >> What the unsafe code did was to use the object containing the field as >> a lock object for reading and writing the field. This seems reasonable >> on the surface but in fact had a fatal flaw - because we were locking >> a Java-level visible object inside what was considered to be a >> lock-free code-path by the application and library logic, we could >> actually induce a deadlock - which is why the test failed. >> >> In addition the code had two further flaws: >> >> 1. Because the field could also be updated via direct assignment in >> Java code the unsafe code needed to perform an Atomic::load of the >> field. 
And for good measure we also employ an Atomic::store to ensure >> no interference with direct reads of the field in Java code. >> >> 2. The address of the field was being calculated before using the >> ObjectLocker to lock the object, but locking could encounter a >> safepoint check allowing the object to relocated by the GC, and we >> would then use a stale address. >> >> To fix all of this we: >> - introduce a special Mutex to use instead of the deadlock-inducing >> Java object >> - use Atomic::load and Atomic::store to access the jlong field >> - avoid safepoints when locking (alternatively you could ensure you >> calculate the address after acquiring the lock ) >> >> Thanks, >> David > From yasuenag at gmail.com Wed Nov 26 03:39:33 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 26 Nov 2014 12:39:33 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <54744022.2030208@oracle.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> <54744022.2030208@oracle.com> Message-ID: Hi David, Thank you for reviewing! I will fix it after discussion with Staffan. Thanks Yasumasa 2014/11/25 17:39 "David Holmes" : > Sorry Yasumasa, this fell off my radar and I was hoping for others to > comment. We still need a second reviewer. > > The change in: > src/os/aix/vm/os_aix.cpp > src/os/solaris/vm/os_solaris.cpp > > jio_snprintf(buffer, bufferSize, "%s/core or core.%d", > current_process_id()); > > has no argument for the %s - presumably p was intended. > > Thanks, > David > > On 24/11/2014 11:21 PM, Yasumasa Suenaga wrote: > >> Hi all, >> >> I've uploaded webrev for this issue about a month ago. >> Could you review it and sponsor it? 
>> >> >> Thanks, >> >> Yasumasa >> >> >> On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: >> >>> Hi David, >>> >>> I've uploaded new webrev: >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ >>> >>> >>> I wasn't suggesting that you make such a change though because it is >>>> large and disruptive. >>>> >>> >>> Unfactoring check_or_create_dump is a step backwards in terms of code >>>> sharing. >>>> >>> >>> I restored check_or_create_dump() to os_posix.cpp . >>> And I changed get_core_path() to create message which represents core >>> dump path >>> (including filename) in each OS. >>> >>> >>> Expanding the get_core_path in os_linux.cpp to handle the >>>> core_pattern may be okay (but I don't know enough about it to >>>> validate everything). >>>> >>> >>> I implemented all parameters in Linux kernel documentation: >>> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt >>> >>> So I think that parameters which are processed are enough. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> >>> (2014/10/15 9:41), David Holmes wrote: >>> >>>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: >>>> >>>>> Hi David, >>>>> >>>>> Thank you for comments! >>>>> I've uploaded new webrev. Could you review it again? >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ >>>>> >>>>> I am an author of jdk9. So I cannot commit it. >>>>> Could you be a sponsor for this enhancement? >>>>> >>>>> >>>>> In which case that should be handled by the linux specific >>>>>> get_core_path() function. >>>>>> >>>>> >>>>> Agree. >>>>> So I implemented it in os_linux.cpp . >>>>> But part of format characters (%P: global pid, %s: signal, %t dump >>>>> time) >>>>> are not processed >>>>> in this function because I think these parameters are difficult to >>>>> handle in it. >>>>> >>>>> %P: I could not find API for this. >>>>> %s: We have to change arguments of get_core_path() . >>>>> %t: This parameter means timestamp of coredump. It is decided in >>>>> Kernel. 
>>>>> >>>>> >>>>> Fixing this means changing all the os_posix using platforms. But your >>>>>> patch is not about this part. :) >>>>>> >>>>> >>>>> I moved os::check_or_create_dump() to each OS implementations (AIX, >>>>> BSD, >>>>> Solaris, Linux) . >>>>> So I can write Linux specific code to check_or_create_dump() . >>>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) >>>>> >>>> >>>> I wasn't suggesting that you make such a change though because it is >>>> large and disruptive. The simple handling of the | part of >>>> core_pattern was basically ok. Expanding the get_core_path in >>>> os_linux.cpp to handle the core_pattern may be okay (but I don't know >>>> enough about it to validate everything). Unfactoring >>>> check_or_create_dump is a step backwards in terms of code sharing. >>>> >>>> Sorry this has grown too large for me to deal with right now. >>>> >>>> David >>>> ----- >>>> >>>> >>>>> Though I'm unclear whether it both invokes the program and creates a >>>>>> core dump file; or just invokes the program? >>>>>> >>>>> >>>>> If '|' is set, Linux kernel will just redirect core image to user >>>>> process. >>>>> Kernel documentation says as below: >>>>> ------------ >>>>> . If the first character of the pattern is a '|', the kernel will treat >>>>> the rest of the pattern as a command to run. The core dump will be >>>>> written to the standard input of that program instead of to a file. >>>>> ------------ >>>>> >>>>> And implementation of coredump (do_coredump()) follows to it. >>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/ >>>>> linux.git/tree/fs/coredump.c >>>>> >>>>> >>>>> >>>>> In case of ABRT, ABRT dumps core image to default location >>>>> (/core.) >>>>> if user set unlimited to resource limit of core (ulimit -c) . 
>>>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c >>>>> >>>>> >>>>> A few style nits - you need spaces around keywords and before braces >>>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>>> than "treated". >>>>>> And as you don't do anything in the non-redirect case I suggest >>>>>> collapsing this: >>>>>> >>>>> >>>>> I've fixed them. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> (2014/10/13 9:41), David Holmes wrote: >>>>> >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: >>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>> Sorry for my English. >>>>>>> >>>>>>> I want to propose that JVM should create message according to core >>>>>>> pattern (/proc/sys/kernel/core_pattern) . >>>>>>> So I filed it to JBS and created a patch. >>>>>>> >>>>>> >>>>>> So I've had a quick look at this core_pattern business and it seems to >>>>>> me that there are two aspects to this. >>>>>> >>>>>> First, without the leading |, the entry in the core_pattern file is a >>>>>> naming pattern for the core file. In which case that should be handled >>>>>> by the linux specific get_core_path() function. Though that in itself >>>>>> can't fully report the expected name, as part of it is provided in the >>>>>> shared code in os::check_or_create_dump. Fixing this means changing >>>>>> all the os_posix using platforms. But your patch is not about this >>>>>> part. :) >>>>>> >>>>>> Second, with a leading | the core_pattern is actually the name of a >>>>>> program to execute when the program is about to core dump, and that is >>>>>> what you report with your patch. Though I'm unclear whether it both >>>>>> invokes the program and creates a core dump file; or just invokes the >>>>>> program? >>>>>> >>>>>> So with regards to this second part your patch seems functionally ok. >>>>>> I do dislike having a big chunk of linux specific code in this "posix" >>>>>> support file but ... 
>>>>>> >>>>>> A few style nits - you need spaces around keywords and before >>>>>> braces eg: >>>>>> >>>>>> if(x){ >>>>>> >>>>>> should be >>>>>> >>>>>> if (x) { >>>>>> >>>>>> I also suggest saying "Core dumps may be processed with ..." rather >>>>>> than "treated". >>>>>> >>>>>> And as you don't do anything in the non-redirect case I suggest >>>>>> collapsing this: >>>>>> >>>>>> 83 is_redirect = core_pattern[0] == '|'; >>>>>> 84 } >>>>>> 85 >>>>>> 86 if(is_redirect){ >>>>>> 87 jio_snprintf(buffer, bufferSize, >>>>>> 88 "Core dumps may be treated with \"%s\"", >>>>>> &core_pattern[1]); >>>>>> 89 } >>>>>> >>>>>> to just >>>>>> >>>>>> 83 if (core_pattern[0] == '|') { // redirect >>>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be >>>>>> processed with \"%s\"", &core_pattern[1]); >>>>>> 85 } >>>>>> 86 } >>>>>> >>>>>> Comments from other runtime folk appreciated. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> 2014/10/07 15:43 "David Holmes" >>>>>> >: >>>>>>> >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> I'm sorry but I don't understand what you are proposing. When you >>>>>>> say >>>>>>> "treat" do you mean "create"? Otherwise what do you mean by >>>>>>> "treated"? >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: >>>>>>> > I'm in Hackergarten @ JavaOne :-) >>>>>>> > >>>>>>> > >>>>>>> > Hi all, >>>>>>> > >>>>>>> > I would like to enhance the messages in hs_err report. >>>>>>> > Modern Linux kernel can treat core dump with user process >>>>>>> (e.g. ABRT) >>>>>>> > However, hs_err report cannot detect it. >>>>>>> > >>>>>>> > I think that hs_err report should output messages as below: >>>>>>> > ------------- >>>>>>> > Failed to write core dump. Core dumps may be treated with >>>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s >>>>>>> %c %p >>>>>>> %u %g %t e" >>>>>>> > ------------- >>>>>>> > >>>>>>> > I've uploaded webrev of this enhancement. 
>>>>>>> > Could you review it? >>>>>>> > >>>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ >>>>>>> > >>>>>>> > This patch works fine on Fedora20 x86_64. >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > Thanks, >>>>>>> > >>>>>>> > Yasumasa >>>>>>> > >>>>>>> >>>>>>> From yasuenag at gmail.com Wed Nov 26 03:54:48 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 26 Nov 2014 12:54:48 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> Message-ID: Hi Staffan, Thank you for reviewing! os_linux.cpp: I want to print the coredump location correctly in hs_err. So I want to output whether the coredump is processed by another process or is written to a file. If os::get_core_path() should be simpler, I will print the raw string in core_pattern. os_bsd.cpp: I don't have OS X. So I cannot check it. I am focusing on Linux in this enhancement. Could you file it as another enhancement if it is needed? Thanks, Yasumasa 2014/11/25 18:15 "Staffan Larsen" : > src/os/bsd/vm/os_linux.cpp: > I'm inclined to think this is too complicated and hard to test and > maintain (and I see no tests in the webrev). Could we not simplify this to > print a helpful message instead? Something that prints the core_pattern and > perhaps some of the values that could be used for substitution, but does > not do the actual substitution? I think that would go a long way but be a > lot more maintainable. > > src/os/bsd/vm/os_bsd.cpp: > On OS X cores are by default written to /cores/core.. This is > configurable with the kern.corefile sysctl variable, although it is rare > to do so. > > /Staffan > > > On 24 nov 2014, at 14:21, Yasumasa Suenaga wrote: > > > > Hi all, > > > > I've uploaded webrev for this issue about a month ago.
> > Could you review it and sponsor it? > > > > > > Thanks, > > > > Yasumasa > > > > > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: > >> Hi David, > >> > >> I've uploaded new webrev: > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ > >> > >> > >>> I wasn't suggesting that you make such a change though because it is > large and disruptive. > >> > >>> Unfactoring check_or_create_dump is a step backwards in terms of code > sharing. > >> > >> I restored check_or_create_dump() to os_posix.cpp . > >> And I changed get_core_path() to create message which represents core > dump path > >> (including filename) in each OS. > >> > >> > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern > may be okay (but I don't know enough about it to validate everything). > >> > >> I implemented all parameters in Linux kernel documentation: > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt > >> > >> So I think that parameters which are processed are enough. > >> > >> > >> Thanks, > >> > >> Yasumasa > >> > >> > >> > >> (2014/10/15 9:41), David Holmes wrote: > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: > >>>> Hi David, > >>>> > >>>> Thank you for comments! > >>>> I've uploaded new webrev. Could you review it again? > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ > >>>> > >>>> I am an author of jdk9. So I cannot commit it. > >>>> Could you be a sponsor for this enhancement? > >>>> > >>>> > >>>>> In which case that should be handled by the linux specific > >>>>> get_core_path() function. > >>>> > >>>> Agree. > >>>> So I implemented it in os_linux.cpp . > >>>> But part of format characters (%P: global pid, %s: signal, %t dump > time) > >>>> are not processed > >>>> in this function because I think these parameters are difficult to > >>>> handle in it. > >>>> > >>>> %P: I could not find API for this. > >>>> %s: We have to change arguments of get_core_path() . > >>>> %t: This parameter means timestamp of coredump. 
It is decided in > Kernel. > >>>> > >>>> > >>>>> Fixing this means changing all the os_posix using platforms. But your > >>>>> patch is not about this part. :) > >>>> > >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, > BSD, > >>>> Solaris, Linux) . > >>>> So I can write Linux specific code to check_or_create_dump() . > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) > >>> > >>> I wasn't suggesting that you make such a change though because it is > large and disruptive. The simple handling of the | part of core_pattern was > basically ok. Expanding the get_core_path in os_linux.cpp to handle the > core_pattern may be okay (but I don't know enough about it to validate > everything). Unfactoring check_or_create_dump is a step backwards in terms > of code sharing. > >>> > >>> Sorry this has grown too large for me to deal with right now. > >>> > >>> David > >>> ----- > >>> > >>>> > >>>>> Though I'm unclear whether it both invokes the program and creates a > >>>>> core dump file; or just invokes the program? > >>>> > >>>> If '|' is set, Linux kernel will just redirect core image to user > process. > >>>> Kernel documentation says as below: > >>>> ------------ > >>>> . If the first character of the pattern is a '|', the kernel will > treat > >>>> the rest of the pattern as a command to run. The core dump will be > >>>> written to the standard input of that program instead of to a file. > >>>> ------------ > >>>> > >>>> And implementation of coredump (do_coredump()) follows to it. > >>>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c > >>>> > >>>> > >>>> In case of ABRT, ABRT dumps core image to default location > >>>> (/core.) > >>>> if user set unlimited to resource limit of core (ulimit -c) . 
> >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c > >>>> > >>>> > >>>>> A few style nits - you need spaces around keywords and before braces > >>>>> I also suggest saying "Core dumps may be processed with ..." rather > >>>>> than "treated". > >>>>> And as you don't do anything in the non-redirect case I suggest > >>>>> collapsing this: > >>>> > >>>> I've fixed them. > >>>> > >>>> > >>>> Thanks, > >>>> > >>>> Yasumasa > >>>> > >>>> > >>>> (2014/10/13 9:41), David Holmes wrote: > >>>>> Hi Yasumasa, > >>>>> > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: > >>>>>> Hi David, > >>>>>> > >>>>>> Sorry for my English. > >>>>>> > >>>>>> I want to propose that JVM should create message according to core > >>>>>> pattern (/proc/sys/kernel/core_pattern) . > >>>>>> So I filed it to JBS and created a patch. > >>>>> > >>>>> So I've had a quick look at this core_pattern business and it seems > to > >>>>> me that there are two aspects to this. > >>>>> > >>>>> First, without the leading |, the entry in the core_pattern file is a > >>>>> naming pattern for the core file. In which case that should be > handled > >>>>> by the linux specific get_core_path() function. Though that in itself > >>>>> can't fully report the expected name, as part of it is provided in > the > >>>>> shared code in os::check_or_create_dump. Fixing this means changing > >>>>> all the os_posix using platforms. But your patch is not about this > >>>>> part. :) > >>>>> > >>>>> Second, with a leading | the core_pattern is actually the name of a > >>>>> program to execute when the program is about to core dump, and that > is > >>>>> what you report with your patch. Though I'm unclear whether it both > >>>>> invokes the program and creates a core dump file; or just invokes the > >>>>> program? > >>>>> > >>>>> So with regards to this second part your patch seems functionally ok. > >>>>> I do dislike having a big chunk of linux specific code in this > "posix" > >>>>> support file but ... 
> >>>>> > >>>>> A few style nits - you need spaces around keywords and before braces > eg: > >>>>> > >>>>> if(x){ > >>>>> > >>>>> should be > >>>>> > >>>>> if (x) { > >>>>> > >>>>> I also suggest saying "Core dumps may be processed with ..." rather > >>>>> than "treated". > >>>>> > >>>>> And as you don't do anything in the non-redirect case I suggest > >>>>> collapsing this: > >>>>> > >>>>> 83 is_redirect = core_pattern[0] == '|'; > >>>>> 84 } > >>>>> 85 > >>>>> 86 if(is_redirect){ > >>>>> 87 jio_snprintf(buffer, bufferSize, > >>>>> 88 "Core dumps may be treated with \"%s\"", > >>>>> &core_pattern[1]); > >>>>> 89 } > >>>>> > >>>>> to just > >>>>> > >>>>> 83 if (core_pattern[0] == '|') { // redirect > >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be > >>>>> processed with \"%s\"", &core_pattern[1]); > >>>>> 85 } > >>>>> 86 } > >>>>> > >>>>> Comments from other runtime folk appreciated. > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Yasumasa > >>>>>> > >>>>>> 2014/10/07 15:43 "David Holmes" >>>>>> >: > >>>>>> > >>>>>> Hi Yasumasa, > >>>>>> > >>>>>> I'm sorry but I don't understand what you are proposing. When you > >>>>>> say > >>>>>> "treat" do you mean "create"? Otherwise what do you mean by > >>>>>> "treated"? > >>>>>> > >>>>>> Thanks, > >>>>>> David > >>>>>> > >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: > >>>>>> > I'm in Hackergarten @ JavaOne :-) > >>>>>> > > >>>>>> > > >>>>>> > Hi all, > >>>>>> > > >>>>>> > I would like to enhance the messages in hs_err report. > >>>>>> > Modern Linux kernel can treat core dump with user process > >>>>>> (e.g. ABRT) > >>>>>> > However, hs_err report cannot detect it. > >>>>>> > > >>>>>> > I think that hs_err report should output messages as below: > >>>>>> > ------------- > >>>>>> > Failed to write core dump. 
Core dumps may be treated with > >>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s > %c %p > >>>>>> %u %g %t e" > >>>>>> > ------------- > >>>>>> > > >>>>>> > I've uploaded webrev of this enhancement. > >>>>>> > Could you review it? > >>>>>> > > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ > >>>>>> > > >>>>>> > This patch works fine on Fedora20 x86_64. > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> > Thanks, > >>>>>> > > >>>>>> > Yasumasa > >>>>>> > > >>>>>> > > From goetz.lindenmaier at sap.com Wed Nov 26 07:39:30 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 26 Nov 2014 07:39:30 +0000 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff In-Reply-To: <54752DB1.7070008@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> <54752DB1.7070008@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2A7EA@DEWDFEMB12A.global.corp.sap> Hi David If I remember correctly that causes bigger problems because of some cyclic dependencies or the like. But I didn't try it this time. The best thing would be to introduce node.inline.hpp ... But here I just want to fix the build. Best regards, Goetz -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Mittwoch, 26. November 2014 02:33 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net; Vladimir Ivanov Subject: Re: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff On 26/11/2014 1:06 AM, Lindenmaier, Goetz wrote: > Hi, > > please review and sponsor this tiny fix: > > http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8065915 > > It needs to go to hotspot-comp. > > 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp also must include node.hpp. 
This breaks the build. If compile.hpp uses things from node.hpp then shouldn't it include node.hpp? David > opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined > > Best regards, > Goetz. > From serguei.spitsyn at oracle.com Wed Nov 26 08:59:56 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 26 Nov 2014 00:59:56 -0800 Subject: RFR (S) 8008678: JSR 292: constant pool reconstitution must support pseudo strings Message-ID: <5475968C.40800@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-8008678 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8008678-JVMTI-pseudo.1/ Summary: The pseudo-strings are currently not supported in reconstitution of constant pool. This is an explanation from John Rose about what the pseudo-strings are: "We still need "live" oop constants pre-linked into the constant pool of bytecodes which implement some method handles. We use the anonymous class pseudo-string feature for that. The relevant code is here: http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java These oops are what "pseudo-strings" are. The odd name refers to the fact that, even though they are random oops, they appear in the constant pool where one would expect (because of class file syntax) to find a string." ... If you really wanted to reconstitute a class file for an anonymous class, and if that class has oop patching (pseudo-strings), you would need either to (a) reconstitute the patches array handed to Unsafe.defineAnonymousClass, or (b) accept whatever odd strings were there first, as an approximation. The "odd strings" are totally insignificant, and are typically something like "CONSTANT_PLACEHOLDER_42" (see java/lang/invoke/InvokerBytecodeGenerator.java)." 
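To make the quoted description concrete, here is a minimal toy sketch in C++. It is not HotSpot code: ToyConstantPool, Entry, and every member name are hypothetical and exist only for illustration. It models a String slot that loses its link to the Utf8 entry once it is patched with a live object, and a side map from String index to Utf8 index (the idea proposed in this review) that lets a reconstituter still find the placeholder text.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model for illustration only; none of these names exist in HotSpot.
enum class Tag { Utf8, String };

struct Entry {
  Tag tag;
  std::string utf8;               // payload of a Utf8 entry
  int utf8_index = -1;            // String entry: index of its Utf8 entry, -1 once patched
  const void* patched = nullptr;  // String entry: the live object after patching
};

class ToyConstantPool {
 public:
  // Appends a Utf8 entry and returns its index.
  int add_utf8(const std::string& s) {
    Entry e{};
    e.tag = Tag::Utf8;
    e.utf8 = s;
    entries_.push_back(e);
    return static_cast<int>(entries_.size()) - 1;
  }

  // Appends a String entry and records the String -> Utf8 association in the side map.
  int add_string(int utf8_index) {
    assert(entries_.at(utf8_index).tag == Tag::Utf8);
    Entry e{};
    e.tag = Tag::String;
    e.utf8_index = utf8_index;
    entries_.push_back(e);
    int idx = static_cast<int>(entries_.size()) - 1;
    string_to_utf8_[idx] = utf8_index;
    return idx;
  }

  // Turns a String entry into a "pseudo-string": the slot now holds an arbitrary
  // object and no longer refers to its Utf8 entry.
  void patch(int string_index, const void* obj) {
    Entry& e = entries_.at(string_index);
    assert(e.tag == Tag::String);
    e.utf8_index = -1;
    e.patched = obj;
  }

  bool is_pseudo_string(int string_index) const {
    return entries_.at(string_index).patched != nullptr;
  }

  // Reconstitution lookup: uses the side map, so it also works for patched slots.
  int utf8_index_for(int string_index) const {
    return string_to_utf8_.at(string_index);
  }

  const std::string& utf8_at(int utf8_index) const {
    return entries_.at(utf8_index).utf8;
  }

 private:
  std::vector<Entry> entries_;
  std::map<int, int> string_to_utf8_;  // String cp index -> Utf8 cp index
};
```

In this toy model the map is filled when the String entry is created, so patching the slot later does not lose the association; where and when the real map is populated is a detail of the webrev and is not modelled here.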
Reconstitution of the ConstantPool is needed for both the JVMTI GetConstantPool() and RetransformClasses(). Finally, it goes to the ConstantPool::copy_cpool_bytes(). The problem is that a pseudo-string is a patched string that does not have a reference to the string symbol anymore: unresolved_string_at(idx) == NULL. The fix is to create and fill in a map from JVM_CONSTANT_String cp index to the JVM_CONSTANT_Utf8 cp index to be able to restore this association in the JvmtiConstantPoolReconstituter. Testing: Run: - java/lang/instrument tests - new jtreg test (see webrev) that was written by Filipp Zhinkin Thanks, Serguei From thomas.stuefe at gmail.com Wed Nov 26 11:50:29 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 26 Nov 2014 12:50:29 +0100 Subject: RFR(xs): 8065788: os::reserve_memory() on Windows should not assert that allocation size is aligned to OS allocation granularity. In-Reply-To: References: Message-ID: Thank you Markus! Could I have another Reviewer, and maybe someone who sponsors the change? Please note that this bug is not purely theoretical, but prevents us from reserving just one page on Windows, which then leads to everyone aligning the reserve size up to vm_allocation_granularity() just to make the assert go away. Actually, I see a lot of "align_size_up()"s with vm_allocation_granularity() which may be unnecessary and could probably get cleaned up. Kind Regards, Thomas On Tue, Nov 25, 2014 at 11:21 AM, Markus Grönlund <markus.gronlund at oracle.com> wrote: > Hi Thomas, > > Thanks for finding and addressing this - looks good. > > Cheers > Markus > > -----Original Message----- > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > Sent: den 24 november 2014 18:32 > To: HotSpot Open Source Developers > Subject: RFR(xs): 8065788: os::reserve_memory() on Windows should not > assert that allocation size is aligned to OS allocation granularity.
> > Hi, > > a very small change: > > Bug Report: https://bugs.openjdk.java.net/browse/JDK-8065788 > WebRev: http://cr.openjdk.java.net/~simonis/webrevs/8065788/ > > os::reserve_memory() on Windows asserts that allocation size is aligned > to os::vm_allocation_granularity(). This assert is wrong and should be > removed. > > Allocation granularity affects the alignment of attach addresses, not of > the allocated size. The latter is aligned to page size, but asserting that > would be unnecessarily strict, as VirtualAlloc() will just quietly align > size up to page size. > > For details see MSDN on VirtualAlloc(): > > http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx > > > Kind Regards, > > Thomas Stüfe > From thomas.stuefe at gmail.com Wed Nov 26 14:12:52 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 26 Nov 2014 15:12:52 +0100 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> Message-ID: Hi Yasumasa, I am not a Reviewer. Barring the general decision of the real reviewers, here are some thoughts: os_linux.cpp - jio_snprintf() returns -1 on truncation. n += written may walk backwards. I would probably check for (written >= 0) and also, at the start of the loop, for (n < sizeof(core_path)). - code is used in error reporting. I would be hesitant to create larger buffers on the stack. malloc may be better. - code does not detect truncation of core_path (unlikely but possible). The rest is more a matter of taste: - I would prefer sizeof(core_path) over PATH_MAX at all places where you refer to the size of the buffer. So you could make the buffer very small and test e.g. how your code behaves with truncation.
- when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets may be a tiny bit simpler. Kind Regards, Thomas On Wed, Nov 26, 2014 at 4:54 AM, Yasumasa Suenaga wrote: > Hi Staffan, > > Thank you for reviewing! > > os_linux.cpp: > I want to print the coredump location correctly in hs_err. So I want to output > whether the coredump is processed by another process or is written to a file. > If os::get_core_path() should be simpler, I will print the raw string in > core_pattern. > > os_bsd.cpp: > I don't have OS X. So I cannot check it. > I am focusing on Linux in this enhancement. Could you file it as another > enhancement if it is needed? > > Thanks, > > Yasumasa > > 2014/11/25 18:15 "Staffan Larsen" : > > > src/os/bsd/vm/os_linux.cpp: > > I'm inclined to think this is too complicated and hard to test and > > maintain (and I see no tests in the webrev). Could we not simplify this > to > > print a helpful message instead? Something that prints the core_pattern > and > > perhaps some of the values that could be used for substitution, but does > > not do the actual substitution? I think that would go a long way but be a > > lot more maintainable. > > > > src/os/bsd/vm/os_bsd.cpp: > > On OS X cores are by default written to /cores/core.. This is > > configurable with the kern.corefile sysctl variable, although it is rare > > to do so. > > > > /Staffan > > > > > On 24 nov 2014, at 14:21, Yasumasa Suenaga wrote: > > > > > > Hi all, > > > > > > I've uploaded webrev for this issue about a month ago. > > > Could you review it and sponsor it? > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: > > >> Hi David, > > >> > > >> I've uploaded new webrev: > > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ > > >> > > >> > > >>> I wasn't suggesting that you make such a change though because it is > > large and disruptive.
> > >> > > >>> Unfactoring check_or_create_dump is a step backwards in terms of code > > sharing. > > >> > > >> I restored check_or_create_dump() to os_posix.cpp . > > >> And I changed get_core_path() to create message which represents core > > dump path > > >> (including filename) in each OS. > > >> > > >> > > >>> Expanding the get_core_path in os_linux.cpp to handle the > core_pattern > > may be okay (but I don't know enough about it to validate everything). > > >> > > >> I implemented all parameters in Linux kernel documentation: > > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt > > >> > > >> So I think that parameters which are processed are enough. > > >> > > >> > > >> Thanks, > > >> > > >> Yasumasa > > >> > > >> > > >> > > >> (2014/10/15 9:41), David Holmes wrote: > > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: > > >>>> Hi David, > > >>>> > > >>>> Thank you for comments! > > >>>> I've uploaded new webrev. Could you review it again? > > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ > > >>>> > > >>>> I am an author of jdk9. So I cannot commit it. > > >>>> Could you be a sponsor for this enhancement? > > >>>> > > >>>> > > >>>>> In which case that should be handled by the linux specific > > >>>>> get_core_path() function. > > >>>> > > >>>> Agree. > > >>>> So I implemented it in os_linux.cpp . > > >>>> But part of format characters (%P: global pid, %s: signal, %t dump > > time) > > >>>> are not processed > > >>>> in this function because I think these parameters are difficult to > > >>>> handle in it. > > >>>> > > >>>> %P: I could not find API for this. > > >>>> %s: We have to change arguments of get_core_path() . > > >>>> %t: This parameter means timestamp of coredump. It is decided in > > Kernel. > > >>>> > > >>>> > > >>>>> Fixing this means changing all the os_posix using platforms. But > your > > >>>>> patch is not about this part. 
:) > > >>>> > > >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, > > BSD, > > >>>> Solaris, Linux) . > > >>>> So I can write Linux specific code to check_or_create_dump() . > > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) > > >>> > > >>> I wasn't suggesting that you make such a change though because it is > > large and disruptive. The simple handling of the | part of core_pattern > was > > basically ok. Expanding the get_core_path in os_linux.cpp to handle the > > core_pattern may be okay (but I don't know enough about it to validate > > everything). Unfactoring check_or_create_dump is a step backwards in > terms > > of code sharing. > > >>> > > >>> Sorry this has grown too large for me to deal with right now. > > >>> > > >>> David > > >>> ----- > > >>> > > >>>> > > >>>>> Though I'm unclear whether it both invokes the program and creates > a > > >>>>> core dump file; or just invokes the program? > > >>>> > > >>>> If '|' is set, Linux kernel will just redirect core image to user > > process. > > >>>> Kernel documentation says as below: > > >>>> ------------ > > >>>> . If the first character of the pattern is a '|', the kernel will > > treat > > >>>> the rest of the pattern as a command to run. The core dump will > be > > >>>> written to the standard input of that program instead of to a > file. > > >>>> ------------ > > >>>> > > >>>> And implementation of coredump (do_coredump()) follows to it. > > >>>> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c > > >>>> > > >>>> > > >>>> In case of ABRT, ABRT dumps core image to default location > > >>>> (/core.) > > >>>> if user set unlimited to resource limit of core (ulimit -c) . > > >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c > > >>>> > > >>>> > > >>>>> A few style nits - you need spaces around keywords and before > braces > > >>>>> I also suggest saying "Core dumps may be processed with ..." 
rather > > >>>>> than "treated". > > >>>>> And as you don't do anything in the non-redirect case I suggest > > >>>>> collapsing this: > > >>>> > > >>>> I've fixed them. > > >>>> > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Yasumasa > > >>>> > > >>>> > > >>>> (2014/10/13 9:41), David Holmes wrote: > > >>>>> Hi Yasumasa, > > >>>>> > > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: > > >>>>>> Hi David, > > >>>>>> > > >>>>>> Sorry for my English. > > >>>>>> > > >>>>>> I want to propose that JVM should create message according to core > > >>>>>> pattern (/proc/sys/kernel/core_pattern) . > > >>>>>> So I filed it to JBS and created a patch. > > >>>>> > > >>>>> So I've had a quick look at this core_pattern business and it seems > > to > > >>>>> me that there are two aspects to this. > > >>>>> > > >>>>> First, without the leading |, the entry in the core_pattern file > is a > > >>>>> naming pattern for the core file. In which case that should be > > handled > > >>>>> by the linux specific get_core_path() function. Though that in > itself > > >>>>> can't fully report the expected name, as part of it is provided in > > the > > >>>>> shared code in os::check_or_create_dump. Fixing this means changing > > >>>>> all the os_posix using platforms. But your patch is not about this > > >>>>> part. :) > > >>>>> > > >>>>> Second, with a leading | the core_pattern is actually the name of a > > >>>>> program to execute when the program is about to core dump, and that > > is > > >>>>> what you report with your patch. Though I'm unclear whether it both > > >>>>> invokes the program and creates a core dump file; or just invokes > the > > >>>>> program? > > >>>>> > > >>>>> So with regards to this second part your patch seems functionally > ok. > > >>>>> I do dislike having a big chunk of linux specific code in this > > "posix" > > >>>>> support file but ... 
> > >>>>> > > >>>>> A few style nits - you need spaces around keywords and before > braces > > eg: > > >>>>> > > >>>>> if(x){ > > >>>>> > > >>>>> should be > > >>>>> > > >>>>> if (x) { > > >>>>> > > >>>>> I also suggest saying "Core dumps may be processed with ..." rather > > >>>>> than "treated". > > >>>>> > > >>>>> And as you don't do anything in the non-redirect case I suggest > > >>>>> collapsing this: > > >>>>> > > >>>>> 83 is_redirect = core_pattern[0] == '|'; > > >>>>> 84 } > > >>>>> 85 > > >>>>> 86 if(is_redirect){ > > >>>>> 87 jio_snprintf(buffer, bufferSize, > > >>>>> 88 "Core dumps may be treated with \"%s\"", > > >>>>> &core_pattern[1]); > > >>>>> 89 } > > >>>>> > > >>>>> to just > > >>>>> > > >>>>> 83 if (core_pattern[0] == '|') { // redirect > > >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may > be > > >>>>> processed with \"%s\"", &core_pattern[1]); > > >>>>> 85 } > > >>>>> 86 } > > >>>>> > > >>>>> Comments from other runtime folk appreciated. > > >>>>> > > >>>>> Thanks, > > >>>>> David > > >>>>> > > >>>>>> Thanks, > > >>>>>> > > >>>>>> Yasumasa > > >>>>>> > > >>>>>> 2014/10/07 15:43 "David Holmes" > >>>>>> >: > > >>>>>> > > >>>>>> Hi Yasumasa, > > >>>>>> > > >>>>>> I'm sorry but I don't understand what you are proposing. When > you > > >>>>>> say > > >>>>>> "treat" do you mean "create"? Otherwise what do you mean by > > >>>>>> "treated"? > > >>>>>> > > >>>>>> Thanks, > > >>>>>> David > > >>>>>> > > >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: > > >>>>>> > I'm in Hackergarten @ JavaOne :-) > > >>>>>> > > > >>>>>> > > > >>>>>> > Hi all, > > >>>>>> > > > >>>>>> > I would like to enhance the messages in hs_err report. > > >>>>>> > Modern Linux kernel can treat core dump with user process > > >>>>>> (e.g. ABRT) > > >>>>>> > However, hs_err report cannot detect it. > > >>>>>> > > > >>>>>> > I think that hs_err report should output messages as below: > > >>>>>> > ------------- > > >>>>>> > Failed to write core dump. 
Core dumps may be treated > with > > >>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s > > %c %p > > >>>>>> %u %g %t e" > > >>>>>> > ------------- > > >>>>>> > > > >>>>>> > I've uploaded webrev of this enhancement. > > >>>>>> > Could you review it? > > >>>>>> > > > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ > > >>>>>> > > > >>>>>> > This patch works fine on Fedora20 x86_64. > > >>>>>> > > > >>>>>> > > > >>>>>> > > > >>>>>> > Thanks, > > >>>>>> > > > >>>>>> > Yasumasa > > >>>>>> > > > >>>>>> > > > > > From vladimir.x.ivanov at oracle.com Wed Nov 26 15:39:42 2014 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 26 Nov 2014 19:39:42 +0400 Subject: [8u40] Bulk backport request: 8058847, 8058148, 8060147 Message-ID: <5475F43E.7050309@oracle.com> This is a bulk request to backport the following changes into 8u40. They were integrated into 9 and cleanly apply to 8u-dev. (1) 8058847: C2: EliminateAutoBox regression after 8042786 https://bugs.openjdk.java.net/browse/JDK-8058847 http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/7723d5b0fca3 (2) 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff should be increased https://bugs.openjdk.java.net/browse/JDK-8058148 http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/7dd010c9fab1 (3) SIGSEGV in Metadata::mark_on_stack() while marking metadata in ciEnv https://bugs.openjdk.java.net/browse/JDK-8060147 http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/c14722c9cda3 Best regards, Vladimir Ivanov From coleen.phillimore at oracle.com Wed Nov 26 17:17:52 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 26 Nov 2014 12:17:52 -0500 Subject: RFR (S) 8008678: JSR 292: constant pool reconstitution must support pseudo strings In-Reply-To: <5475968C.40800@oracle.com> References: <5475968C.40800@oracle.com> Message-ID: <54760B40.9040206@oracle.com> Serguei, I had a quick look at this. 
I was wondering if we could make the pseudo_string_map conditional in ConstantPool and not make all classes pay in footprint for this field? The same thing probably could be done for operands too. There are flags that you can set to conditionally add a pointer to base() in this function. Typical C++ would subclass ConstantPool to add InvokeDynamicConstantPool fields, but this is not typical C++ so the trick we use is like the one in ConstMethod. I think it's worth doing in this case. Thanks, Coleen On 11/26/14, 3:59 AM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8008678 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8008678-JVMTI-pseudo.1/ > > > > Summary: > The pseudo-strings are currently not supported in reconstitution of > constant pool. > > This is an explanation from John Rose about what the pseudo-strings > are: > > "We still need "live" oop constants pre-linked into the constant > pool of bytecodes which > implement some method handles. We use the anonymous class > pseudo-string feature for that. > The relevant code is here: > http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java > > These oops are what "pseudo-strings" are. > The odd name refers to the fact that, even though they are random > oops, they appear in the constant pool > where one would expect (because of class file syntax) to find a > string." > ... > If you really wanted to reconstitute a class file for an anonymous > class, and > if that class has oop patching (pseudo-strings), you would need > either to (a) reconstitute the patches array > handed to Unsafe.defineAnonymousClass, or (b) accept whatever odd > strings were there first, as an approximation. > The "odd strings" are totally insignificant, and are typically > something like "CONSTANT_PLACEHOLDER_42" > (see java/lang/invoke/InvokerBytecodeGenerator.java)." 
> > > Reconstitution of the ConstantPool is needed for both the JVMTI > GetConstantPool() and RetransformClasses(). > Finally, it goes to the ConstantPool::copy_cpool_bytes(). > > The problem is that a pseudo-string is a patched string that does > not have > a reference to the string symbol anymore: > unresolved_string_at(idx) == NULL > > The fix is to create and fill in a map from JVM_CONSTANT_String cp > index to the JVM_CONSTANT_Utf8 cp index > to be able to restore this association in the > JvmtiConstantPoolReconstituter. > > Testing: > Run: > - java/lang/instrument tests > - new jtreg test (see webrev) that was written by Filipp Zhinkin > > > Thanks, > Serguei From christian.thalinger at oracle.com Wed Nov 26 19:09:05 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 26 Nov 2014 11:09:05 -0800 Subject: [8u40] Bulk backport request: 8058847, 8058148, 8060147 In-Reply-To: <5475F43E.7050309@oracle.com> References: <5475F43E.7050309@oracle.com> Message-ID: Good. > On Nov 26, 2014, at 7:39 AM, Vladimir Ivanov wrote: > > This is a bulk request to backport the following changes into 8u40. They were integrated into 9 and cleanly apply to 8u-dev. 
> > (1) 8058847: C2: EliminateAutoBox regression after 8042786 > https://bugs.openjdk.java.net/browse/JDK-8058847 > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/7723d5b0fca3 > > (2) 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff should be increased > https://bugs.openjdk.java.net/browse/JDK-8058148 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/7dd010c9fab1 > > (3) SIGSEGV in Metadata::mark_on_stack() while marking metadata in ciEnv > https://bugs.openjdk.java.net/browse/JDK-8060147 > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/c14722c9cda3 > > Best regards, > Vladimir Ivanov From vladimir.x.ivanov at oracle.com Wed Nov 26 18:40:30 2014 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 26 Nov 2014 22:40:30 +0400 Subject: [8u40] Bulk backport request: 8058847, 8058148, 8060147 In-Reply-To: References: <5475F43E.7050309@oracle.com> Message-ID: <54761E9E.2020704@oracle.com> Thank you, Chris. Best regards, Vladimir Ivanov On 11/26/14, 11:09 PM, Christian Thalinger wrote: > Good. > >> On Nov 26, 2014, at 7:39 AM, Vladimir Ivanov wrote: >> >> This is a bulk request to backport the following changes into 8u40. They were integrated into 9 and cleanly apply to 8u-dev. 
>> >> (1) 8058847: C2: EliminateAutoBox regression after 8042786 >> https://bugs.openjdk.java.net/browse/JDK-8058847 >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/7723d5b0fca3 >> >> (2) 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff should be increased >> https://bugs.openjdk.java.net/browse/JDK-8058148 >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/7dd010c9fab1 >> >> (3) SIGSEGV in Metadata::mark_on_stack() while marking metadata in ciEnv >> https://bugs.openjdk.java.net/browse/JDK-8060147 >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/c14722c9cda3 >> >> Best regards, >> Vladimir Ivanov > From david.r.chase at oracle.com Wed Nov 26 19:40:13 2014 From: david.r.chase at oracle.com (David Chase) Date: Wed, 26 Nov 2014 14:40:13 -0500 Subject: [8u40] RFR: 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <5474E3ED.3060306@oracle.com> References: <5474E3ED.3060306@oracle.com> Message-ID: Not a reviewer, but I worked on a version of this bug, and this patch looks okay to me. David On 2014-11-25, at 3:17 PM, Coleen Phillimore wrote: > > This is a backport of the bug fix for bug https://bugs.openjdk.java.net/browse/JDK-8042593 > > The fix has been in JDK9 for a week with a couple of nights of successful testing. The patch applied cleanly to jdk8u40. > > http://cr.openjdk.java.net/~coleenp/8042235_8u40/ > > Please approve this backport. > > thanks, > Coleen From coleen.phillimore at oracle.com Wed Nov 26 19:44:45 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 26 Nov 2014 14:44:45 -0500 Subject: [8u40] RFR: 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: References: <5474E3ED.3060306@oracle.com> Message-ID: <54762DAD.7080009@oracle.com> Thank you! Serguei reviewed the backport also, but I don't think he replied to the whole list. 
Coleen On 11/26/14, 2:40 PM, David Chase wrote: > Not a reviewer, but I worked on a version of this bug, and this patch looks okay to me. > > David > > On 2014-11-25, at 3:17 PM, Coleen Phillimore wrote: > >> This is a backport of the bug fix for bug https://bugs.openjdk.java.net/browse/JDK-8042593 >> >> The fix has been in JDK9 for a week with a couple of nights of successful testing. The patch applied cleanly to jdk8u40. >> >> http://cr.openjdk.java.net/~coleenp/8042235_8u40/ >> >> Please approve this backport. >> >> thanks, >> Coleen From serguei.spitsyn at oracle.com Wed Nov 26 19:53:45 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 26 Nov 2014 11:53:45 -0800 Subject: RFR (S) 8008678: JSR 292: constant pool reconstitution must support pseudo strings In-Reply-To: <54760B40.9040206@oracle.com> References: <5475968C.40800@oracle.com> <54760B40.9040206@oracle.com> Message-ID: <54762FC9.2010800@oracle.com> Coleen, Thank you for looking at this! I'll check how this can be improved. It is my concern too. Thanks, Serguei On 11/26/14 9:17 AM, Coleen Phillimore wrote: > > Serguei, > I had a quick look at this. I was wondering if we could make the > pseudo_string_map conditional in ConstantPool and not make all classes > pay in footprint for this field? The same thing probably could be > done for operands too. There are flags that you can set to > conditionally add a pointer to base() in this function. > > Typical C++ would subclass ConstantPool to add > InvokeDynamicConstantPool fields, but this is not typical C++ so the > trick we use is like the one in ConstMethod. I think it's worth > doing in this case. 
> > Thanks, > Coleen > > On 11/26/14, 3:59 AM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8008678 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8008678-JVMTI-pseudo.1/ >> >> >> >> Summary: >> The pseudo-strings are currently not supported in reconstitution >> of constant pool. >> >> This is an explanation from John Rose about what the >> pseudo-strings are: >> >> "We still need "live" oop constants pre-linked into the constant >> pool of bytecodes which >> implement some method handles. We use the anonymous class >> pseudo-string feature for that. >> The relevant code is here: >> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java >> >> These oops are what "pseudo-strings" are. >> The odd name refers to the fact that, even though they are random >> oops, they appear in the constant pool >> where one would expect (because of class file syntax) to find a >> string." >> ... >> If you really wanted to reconstitute a class file for an >> anonymous class, and >> if that class has oop patching (pseudo-strings), you would need >> either to (a) reconstitute the patches array >> handed to Unsafe.defineAnonymousClass, or (b) accept whatever odd >> strings were there first, as an approximation. >> The "odd strings" are totally insignificant, and are typically >> something like "CONSTANT_PLACEHOLDER_42" >> (see java/lang/invoke/InvokerBytecodeGenerator.java)." >> >> >> Reconstitution of the ConstantPool is needed for both the JVMTI >> GetConstantPool() and RetransformClasses(). >> Finally, it goes to the ConstantPool::copy_cpool_bytes(). 
>> >> The problem is that a pseudo-string is a patched string that does >> not have >> a reference to the string symbol anymore: >> unresolved_string_at(idx) == NULL >> >> The fix is to create and fill in a map from JVM_CONSTANT_String cp >> index to the JVM_CONSTANT_Utf8 cp index >> to be able to restore this association in the >> JvmtiConstantPoolReconstituter. >> >> Testing: >> Run: >> - java/lang/instrument tests >> - new jtreg test (see webrev) that was written by Filipp Zhinkin >> >> >> Thanks, >> Serguei > From serguei.spitsyn at oracle.com Wed Nov 26 19:59:51 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 26 Nov 2014 11:59:51 -0800 Subject: [8u40] RFR: 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <54762DAD.7080009@oracle.com> References: <5474E3ED.3060306@oracle.com> <54762DAD.7080009@oracle.com> Message-ID: <54763137.3010806@oracle.com> Sorry, Coleen. I replied to David's email but somehow missed the hotspot-dev list. I've compared this with the jdk9 webrev. It is the same, so the fix is good. Thanks, Serguei On 11/26/14 11:44 AM, Coleen Phillimore wrote: > > Thank you! Serguei reviewed the backport also, but I don't think he > replied to the whole list. > > Coleen > > On 11/26/14, 2:40 PM, David Chase wrote: >> Not a reviewer, but I worked on a version of this bug, and this patch >> looks okay to me. >> >> David >> >> On 2014-11-25, at 3:17 PM, Coleen Phillimore >> wrote: >> >>> This is a backport of the bug fix for bug >>> https://bugs.openjdk.java.net/browse/JDK-8042593 >>> >>> The fix has been in JDK9 for a week with a couple of nights of >>> successful testing. The patch applied cleanly to jdk8u40. >>> >>> http://cr.openjdk.java.net/~coleenp/8042235_8u40/ >>> >>> Please approve this backport. 
>>> >>> thanks, >>> Coleen > From coleen.phillimore at oracle.com Wed Nov 26 20:01:35 2014 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 26 Nov 2014 15:01:35 -0500 Subject: [8u40] RFR: 8042235: redefining method used by multiple MethodHandles crashes VM In-Reply-To: <54763137.3010806@oracle.com> References: <5474E3ED.3060306@oracle.com> <54762DAD.7080009@oracle.com> <54763137.3010806@oracle.com> Message-ID: <5476319F.4090300@oracle.com> Thank you, Serguei! Coleen On 11/26/14, 2:59 PM, serguei.spitsyn at oracle.com wrote: > Sorry, Coleen. > > I replied to David's email but somehow missed the hotspot-dev > list. > > I've compared this with the jdk9 webrev. > It is the same, so the fix is good. > > Thanks, > Serguei > > On 11/26/14 11:44 AM, Coleen Phillimore wrote: >> >> Thank you! Serguei reviewed the backport also, but I don't think he >> replied to the whole list. >> >> Coleen >> >> On 11/26/14, 2:40 PM, David Chase wrote: >>> Not a reviewer, but I worked on a version of this bug, and this >>> patch looks okay to me. >>> >>> David >>> >>> On 2014-11-25, at 3:17 PM, Coleen Phillimore >>> wrote: >>> >>>> This is a backport of the bug fix for bug >>>> https://bugs.openjdk.java.net/browse/JDK-8042593 >>>> >>>> The fix has been in JDK9 for a week with a couple of nights of >>>> successful testing. The patch applied cleanly to jdk8u40. >>>> >>>> http://cr.openjdk.java.net/~coleenp/8042235_8u40/ >>>> >>>> Please approve this backport. 
>>>> >>>> thanks, >>>> Coleen >> > From goetz.lindenmaier at sap.com Thu Nov 27 09:45:12 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 27 Nov 2014 09:45:12 +0000 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff References: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> <54752DB1.7070008@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2AB1C@DEWDFEMB12A.global.corp.sap> Hi, could somebody please have a look at this really tiny change? Maybe you, Vladimir I.? 8058148 arrived in jdk8, so this one needs to go there, too, please. It breaks the build without precompiled headers. Best regards, Goetz. -----Original Message----- From: Lindenmaier, Goetz Sent: Mittwoch, 26. November 2014 08:40 To: 'David Holmes'; hotspot-dev at openjdk.java.net; Vladimir Ivanov Subject: RE: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff Hi David If I remember correctly that causes bigger problems because of some cyclic dependencies or the like. But I didn't try it this time. The best thing would be to introduce node.inline.hpp ... But here I just want to fix the build. Best regards, Goetz -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Mittwoch, 26. November 2014 02:33 To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net; Vladimir Ivanov Subject: Re: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff On 26/11/2014 1:06 AM, Lindenmaier, Goetz wrote: > Hi, > > please review and sponsor this tiny fix: > > http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8065915 > > It needs to go to hotspot-comp. > > 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp also must include node.hpp. This breaks the build. 
If compile.hpp uses things from node.hpp then shouldn't it include node.hpp? David > opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined > > Best regards, > Goetz. > From vladimir.x.ivanov at oracle.com Thu Nov 27 09:40:34 2014 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 27 Nov 2014 13:40:34 +0400 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CF2AB1C@DEWDFEMB12A.global.corp.sap> References: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> <54752DB1.7070008@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2AB1C@DEWDFEMB12A.global.corp.sap> Message-ID: <5476F192.2060805@oracle.com> Goetz, looks good to me. Best regards, Vladimir Ivanov On 11/27/14, 1:45 PM, Lindenmaier, Goetz wrote: > Hi, > > could somebody please have a look at this really tiny change? > Maybe you, Vladimir I.? > 8058148 arrived in jdk8, so this one needs to go there, too, please. > > It breaks the build without precompiled headers. > > Best regards, > Goetz. > > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Mittwoch, 26. November 2014 08:40 > To: 'David Holmes'; hotspot-dev at openjdk.java.net; Vladimir Ivanov > Subject: RE: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff > > Hi David > > If I remember correctly that causes bigger problems because of some cyclic dependencies or the like. > But I didn't try it this time. > > The best thing would be to introduce node.inline.hpp ... But here I just want to fix the build. > > Best regards, > Goetz > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Mittwoch, 26. 
November 2014 02:33 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net; Vladimir Ivanov > Subject: Re: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff > > On 26/11/2014 1:06 AM, Lindenmaier, Goetz wrote: >> Hi, >> >> please review and sponsor this tiny fix: >> >> http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8065915 >> >> It needs to go to hotspot-comp. >> >> 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp also must include node.hpp. This breaks the build. > > If compile.hpp uses things from node.hpp then shouldn't it include node.hpp? > > David > >> opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined >> >> Best regards, >> Goetz. >> From goetz.lindenmaier at sap.com Thu Nov 27 15:49:11 2014 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 27 Nov 2014 15:49:11 +0000 Subject: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff In-Reply-To: <5476F192.2060805@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF2A285@DEWDFEMB12A.global.corp.sap> <54752DB1.7070008@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF2AB1C@DEWDFEMB12A.global.corp.sap> <5476F192.2060805@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF2AC32@DEWDFEMB12A.global.corp.sap> Hi Vladimir, thanks for reviewing and pushing the change! Will jdk8u be split into jdk8u40 after ZBB? Or can I still get a change that works for both? Should I send a backport request? Best regards, Goetz. -----Original Message----- From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] Sent: Donnerstag, 27. 
November 2014 10:41 To: Lindenmaier, Goetz; David Holmes; hotspot-dev at openjdk.java.net Subject: Re: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff Goetz, looks good to me. Best regards, Vladimir Ivanov On 11/27/14, 1:45 PM, Lindenmaier, Goetz wrote: > Hi, > > could somebody please have a look at this really tiny change? > Maybe you, Vladimir I.? > 8058148 arrived in jdk8, so this one needs to go there, too, please. > > It breaks the build without precompiled headers. > > Best regards, > Goetz. > > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Mittwoch, 26. November 2014 08:40 > To: 'David Holmes'; hotspot-dev at openjdk.java.net; Vladimir Ivanov > Subject: RE: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff > > Hi David > > If I remember correctly that causes bigger problems because of some cyclic dependencies or the like. > But I didn't try it this time. > > The best thing would be to introduce node.inline.hpp ... But here I just want to fix the build. > > Best regards, > Goetz > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Mittwoch, 26. November 2014 02:33 > To: Lindenmaier, Goetz; hotspot-dev at openjdk.java.net; Vladimir Ivanov > Subject: Re: RFR(XS): 8065915: Fix includes after 8058148: MaxNodeLimit and LiveNodeCountInliningCutoff > > On 26/11/2014 1:06 AM, Lindenmaier, Goetz wrote: >> Hi, >> >> please review and sponsor this tiny fix: >> >> http://cr.openjdk.java.net/~goetz/webrevs/8065915-inclFix/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8065915 >> >> It needs to go to hotspot-comp. >> >> 8058148 includes compile.hpp in ciTypeFlow.cpp. compile.hpp uses locate_node_notes() which is defined inline in node.hpp. Therefore ciTypeFlow.cpp also must include node.hpp. This breaks the build. > > If compile.hpp uses things from node.hpp then shouldn't it include node.hpp? 
> > David > >> opto/compile.hpp:825: warning: inline function 'Node_Notes* Compile::locate_node_notes(GrowableArray*, int, bool)' used but never defined >> >> Best regards, >> Goetz. >> From tatiana.pivovarova at oracle.com Thu Nov 27 15:53:00 2014 From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova) Date: Thu, 27 Nov 2014 18:53:00 +0300 Subject: RFR(S): 8064953: Asserts.assert* should print values Message-ID: <547748DC.1040907@oracle.com> Hi, please review this enhancement patch. bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ Problem: 'assert*' methods that take a message as the 'msg' parameter don't print the compared values. These values must be printed in any case: with or without 'msg' parameter. Solution: This enhancement forces the 'assert*' methods to print the compared values. Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. Thanks, Tatiana From david.holmes at oracle.com Thu Nov 27 22:43:57 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Nov 2014 08:43:57 +1000 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <547748DC.1040907@oracle.com> References: <547748DC.1040907@oracle.com> Message-ID: <5477A92D.50000@oracle.com> Hi Tatiana, This looks okay to me. Thanks, David On 28/11/2014 1:53 AM, Tatiana Pivovarova wrote: > Hi, > > please review this enhancement patch. > > bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 > webrev: > http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ > > Problem: > 'assert*' methods that take a message as the 'msg' parameter don't print > the compared values. These values must be printed in any case: > with or without 'msg' parameter. > > Solution: > This enhancement forces the 'assert*' methods to print the compared values. > > Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. 
> > Thanks, > Tatiana From staffan.larsen at oracle.com Fri Nov 28 06:44:36 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Nov 2014 07:44:36 +0100 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <547748DC.1040907@oracle.com> References: <547748DC.1040907@oracle.com> Message-ID: <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> Tatiana, This looks good but can lead to some strange messages when paired with the messages that are in use for some of the existing assert calls. For example this usage: testlibrary_tests/whitebox/vm_flags/VmFlagTest.java: Asserts.assertEQ(tests.length, results.length, "[TESTBUG] tests.length != results.length"); will now lead to a message looking like this: [TESTBUG] tests.length != results.length Expected that 1 == 2 Could we instead change the output to look more like this?: [TESTBUG] tests.length != results.length (assert failed: 1 == 2) Thanks, /Staffan > On 27 nov 2014, at 16:53, Tatiana Pivovarova wrote: > > Hi, > > please review this enhancement patch. > > bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 > webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ > > Problem: > 'assert*' methods that take a message as the 'msg' parameter don't print the compared values. These values must be printed in any case: with or without 'msg' parameter. > > Solution: > This enhancement forces the 'assert*' methods to print the compared values. > > Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. 
> > Thanks, > Tatiana From tatiana.pivovarova at oracle.com Fri Nov 28 11:27:33 2014 From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova) Date: Fri, 28 Nov 2014 14:27:33 +0300 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> References: <547748DC.1040907@oracle.com> <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> Message-ID: <54785C25.8020800@oracle.com> Hi David, Staffan, Thank you for your review! Staffan, you are right, "(assert failed: ...)" is more readable. I made this change in the code. Here is the new webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.01/ Thanks, Tatiana On 11/28/2014 09:44 AM, Staffan Larsen wrote: > Tatiana, > > This looks good but can lead to some strange messages when paired with the messages that are in use for some of the existing assert calls. > > For example this usage: > > testlibrary_tests/whitebox/vm_flags/VmFlagTest.java: > Asserts.assertEQ(tests.length, results.length, "[TESTBUG] tests.length != results.length"); > > will now lead to a message looking like this: > > [TESTBUG] tests.length != results.length Expected that 1 == 2 > > Could we instead change the output to look more like this?: > > [TESTBUG] tests.length != results.length (assert failed: 1 == 2) > > > Thanks, > /Staffan > > >> On 27 nov 2014, at 16:53, Tatiana Pivovarova wrote: >> >> Hi, >> >> please review this enhancement patch. >> >> bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 >> webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ >> >> Problem: >> 'assert*' methods that take a message as the 'msg' parameter don't print the compared values. These values must be printed in any case: with or without 'msg' parameter. >> >> Solution: >> This enhancement forces the 'assert*' methods to print the compared values. >> >> Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. 
>> >> Thanks, >> Tatiana From staffan.larsen at oracle.com Fri Nov 28 11:31:20 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 28 Nov 2014 12:31:20 +0100 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <54785C25.8020800@oracle.com> References: <547748DC.1040907@oracle.com> <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> <54785C25.8020800@oracle.com> Message-ID: <3B503BDD-04FC-406B-86E4-38B4E46582EA@oracle.com> Looks good! Thanks, /Staffan > On 28 nov 2014, at 12:27, Tatiana Pivovarova wrote: > > Hi David, Staffan, > > Thank you for your review! > Staffan, you are right, "(assert failed: ...)" is more readable. I made this change in the code. > Here is the new webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.01/ > > Thanks, > Tatiana > > On 11/28/2014 09:44 AM, Staffan Larsen wrote: >> Tatiana, >> >> This looks good but can lead to some strange messages when paired with the messages that are in use for some of the existing assert calls. >> >> For example this usage: >> >> testlibrary_tests/whitebox/vm_flags/VmFlagTest.java: >> Asserts.assertEQ(tests.length, results.length, "[TESTBUG] tests.length != results.length"); >> >> will now lead to a message looking like this: >> >> [TESTBUG] tests.length != results.length Expected that 1 == 2 >> >> Could we instead change the output to look more like this?: >> >> [TESTBUG] tests.length != results.length (assert failed: 1 == 2) >> >> >> Thanks, >> /Staffan >> >> >>> On 27 nov 2014, at 16:53, Tatiana Pivovarova wrote: >>> >>> Hi, >>> >>> please review this enhancement patch. >>> >>> bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 >>> webrev: http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ >>> >>> Problem: >>> 'assert*' methods that take a message as the 'msg' parameter don't print the compared values. These values must be printed in any case: with or without 'msg' parameter. 
>>> >>> Solution: >>> This enhancement forces the 'assert*' methods to print the compared values. >>> >>> Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. >>> >>> Thanks, >>> Tatiana > From david.holmes at oracle.com Fri Nov 28 11:42:18 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 28 Nov 2014 21:42:18 +1000 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <54785C25.8020800@oracle.com> References: <547748DC.1040907@oracle.com> <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> <54785C25.8020800@oracle.com> Message-ID: <54785F9A.4010803@oracle.com> On 28/11/2014 9:27 PM, Tatiana Pivovarova wrote: > Hi David, Staffan, > > Thank you for your review! > Staffan, you are right, "(assert failed: ...)" is more readable. I made > this change in the code. > Here is the new webrev: > http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.01/ Still fine by me. Thanks, David > Thanks, > Tatiana > > On 11/28/2014 09:44 AM, Staffan Larsen wrote: >> Tatiana, >> >> This looks good but can lead to some strange messages when paired with >> the messages that are in use for some of the existing assert calls. 
>>> >>> bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 >>> webrev: >>> http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ >>> >>> Problem: >>> 'assert*' methods that take a message as the 'msg' parameter don't print >>> the values of the compared parameters. These values must be printed in any >>> case: with or without the 'msg' parameter. >>> >>> Solution: >>> This enhancement forces 'assert*' methods to print compared values. >>> >>> Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. >>> >>> Thanks, >>> Tatiana > From tatiana.pivovarova at oracle.com Fri Nov 28 11:45:47 2014 From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova) Date: Fri, 28 Nov 2014 14:45:47 +0300 Subject: RFR(S): 8064953: Asserts.assert* should print values In-Reply-To: <54785F9A.4010803@oracle.com> References: <547748DC.1040907@oracle.com> <3220B20F-43EF-449E-AFB3-A134F5CC08D2@oracle.com> <54785C25.8020800@oracle.com> <54785F9A.4010803@oracle.com> Message-ID: <5478606B.90909@oracle.com> Hi David, Staffan, Thank you again for your review! Tatiana On 11/28/2014 02:42 PM, David Holmes wrote: > On 28/11/2014 9:27 PM, Tatiana Pivovarova wrote: >> Hi David, Staffan, >> >> Thank you for your review! >> Staffan, you are right, "(assert failed: ...)" is more readable. I made >> these changes in the code. >> Here is the new webrev: >> http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.01/ > > Still fine by me. > > Thanks, > David > >> Thanks, >> Tatiana >> >> On 11/28/2014 09:44 AM, Staffan Larsen wrote: >>> Tatiana, >>> >>> This looks good but can lead to some strange messages when paired with >>> the messages that are in use for some of the existing assert calls. 
>>> >>> For example this usage: >>> >>> testlibrary_tests/whitebox/vm_flags/VmFlagTest.java: >>> Asserts.assertEQ(tests.length, results.length, "[TESTBUG] tests.length >>> != results.length"); >>> >>> will now lead to a message looking like this: >>> >>> [TESTBUG] tests.length != results.length Expected that 1 == 2 >>> >>> Could we instead change the output to look more like this?: >>> >>> [TESTBUG] tests.length != results.length (assert failed: 1 == 2) >>> >>> >>> Thanks, >>> /Staffan >>> >>> >>>> On 27 nov 2014, at 16:53, Tatiana Pivovarova >>>> wrote: >>>> >>>> Hi, >>>> >>>> please review this enhancement patch. >>>> >>>> bugid: https://bugs.openjdk.java.net/browse/JDK-8064953 >>>> webrev: >>>> http://cr.openjdk.java.net/~iignatyev/tpivovarova/8064953/webrev.00/ >>>> >>>> Problem: >>>> 'assert*' methods that take a message as the 'msg' parameter don't print >>>> the values of the compared parameters. These values must be printed in any >>>> case: with or without the 'msg' parameter. >>>> >>>> Solution: >>>> This enhancement forces 'assert*' methods to print compared values. >>>> >>>> Testing: Manual. I ran all tests in hotspot/test/* on the latest jdk. 
>>>> >>>> Thanks, >>>> Tatiana >> From volker.simonis at gmail.com Fri Nov 28 13:41:03 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 28 Nov 2014 14:41:03 +0100 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: <546F7F42.5090100@oracle.com> References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> <546E2D75.8080900@oracle.com> <546E2F62.4030104@redhat.com> <546F7765.1070907@redhat.com> <546F7F42.5090100@oracle.com> Message-ID: Hi, I think Goetz answered the remaining questions a few days ago: http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2014-November/001855.html http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/016199.html but for some reason his mail doesn't appear in this mail thread. As he wrote, the release store into the card table in graphKit.cpp isn't needed and we've just removed it in our internal version a few weeks ago as well. The first one in memnode.hpp is indeed only needed on IA64 so your solution with AARCH64_ONLY is OK for us. Regards, Volker On Fri, Nov 21, 2014 at 7:06 PM, Vladimir Kozlov wrote: > On 11/21/14 9:33 AM, Andrew Haley wrote: >> >> On 11/20/2014 06:13 PM, Andrew Haley wrote: >>> >>> On 11/20/2014 06:05 PM, Vladimir Kozlov wrote: >>>> >>>> I based the name on your comment: >>>> >>>> + // AArch64 uses store release (which does everything we need to keep >>>> + // the machine in order) but we still need a compiler barrier here. >>> >>> >>> Ah. Okay, I'll have to think of a good name for it, then. >>> >>>> You can name it as you like. 
Our main suggestion is to use such Boolean >>>> constant and normal if() statements instead of ifdef AARCH64 and >>>> AARCH64_ONLY/NOT_AARCH64 macros in C2 code (src/share/vm/opto/* files). >>>> >>>> We already do similar things for PPC64 port which sets >>>> support_IRIW_for_* constant. >>> >>> >>> Okay, >> >> >> I've done something similar but more useful. I've added an >> experimental flag: UseBarriersForVolatile. This defaults to true for >> all targets, but we can override it in the back end. That gives me >> the chance to do some benchmarking on various AArch64 targets to see >> which ones benefit from the new load acquire/store release >> instructions. > > > Okay. > >> >> I have kept AARCH64_ONLY for two hunks: >> >> --- old/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.766963837 >> -0500 >> +++ new/src/share/vm/opto/memnode.hpp 2014-11-21 12:09:22.546983320 >> -0500 >> @@ -503,6 +503,10 @@ >> // Conservatively release stores of object references in order to >> // ensure visibility of object initialization. >> static inline MemOrd release_if_reference(const BasicType t) { >> + // AArch64 doesn't need a release store here because object >> + // initialization contains the necessary barriers. >> + AARCH64_ONLY(return unordered); >> + >> const MemOrd mo = (t == T_ARRAY || >> t == T_ADDRESS || // Might be the address of an >> object reference (`boxing'). >> t == T_OBJECT) ? release : unordered; > > > This could be needed for ppc64 too, not only for IA64. 
> >> >> --- old/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:20.017207376 >> -0500 >> +++ new/src/share/vm/opto/graphKit.cpp 2014-11-21 12:09:19.787227745 >> -0500 >> @@ -3813,7 +3813,8 @@ >> >> // Smash zero into card >> if( !UseConcMarkSweepGC ) { >> - __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release); >> + __ store(__ ctrl(), card_adr, zero, bt, adr_type, >> + NOT_AARCH64(MemNode::release) >> AARCH64_ONLY(MemNode::unordered)); >> } else { >> // Specialized path for CM store barrier >> __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, >> adr_type); > > > Looks like PPC64 needs that. In ppc.ad: > > // Use release_store for card-marking to ensure that previous > // oop-stores are visible before the card-mark change. > enc_class enc_cms_card_mark(memory mem, iRegLdst releaseFieldAddr) %{ > >> >> The first hunk is only required by IA64 as far as I am aware, but I >> am nervous about making it IA64_ONLY. The second hunk is a release >> node which is not as far as I am aware required by any target, and >> should simply be removed. >> >> This isn't a RFA because it's not tested yet, but what do you think? > > > Since it affects ppc64 and ia64 we need to ask Goetz and Co. > I would suggest to put both these places under platform specific flags/bool > constant. > > Thanks, > Vladimir > >> >> Andrew. 
>> > From aph at redhat.com Fri Nov 28 13:51:39 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 28 Nov 2014 13:51:39 +0000 Subject: AARCH64: 8064611: Changes to HotSpot shared code In-Reply-To: References: <54625D3D.4000007@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26C6A@DEWDFEMB12A.global.corp.sap> <54632ABA.5000706@redhat.com> <4295855A5C1DE049A61835A1887419CC2CF26D77@DEWDFEMB12A.global.corp.sap> <54638FA0.8040204@redhat.com> <546572B8.9080005@oracle.com> <546A1EF5.6060607@redhat.com> <546A60C7.1070408@oracle.com> <546BF7F3.5020507@oracle.com> <546C0881.8050905@oracle.com> <546C1264.6090308@oracle.com> <546DF9D8.3090505@redhat.com> <546E2D75.8080900@oracle.com> <546E2F62.4030104@redhat.com> <546F7765.1070907@redhat.com> <546F7F42.5090100@oracle.com> Message-ID: <54787DEB.4080601@redhat.com> On 11/28/2014 01:41 PM, Volker Simonis wrote: > As he wrote, the release store into the card table in graphKit.cpp > isn't needed and we've just removed it in our internal version a few > weeks ago as well. > > The first one in memnode.hpp is indeed only needed on IA64 so your > solution with AARCH64_ONLY is OK for us. Okay, thanks. I never saw the reply. Andrew. From yasuenag at gmail.com Sat Nov 29 15:44:30 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sun, 30 Nov 2014 00:44:30 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com> Message-ID: <5479E9DE.7070703@gmail.com> Hi all, Thank you for checking my patch! 
I've uploaded a new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.03/hotspot.patch David: > The change in: > src/os/aix/vm/os_aix.cpp > src/os/solaris/vm/os_solaris.cpp > > jio_snprintf(buffer, bufferSize, "%s/core or core.%d", current_process_id()); > > has no argument for the %s - presumably p was intended. I've fixed it. Staffan: > src/os/linux/vm/os_linux.cpp: > Could we not simplify this to print a helpful message instead? In most cases on Linux, I think the core image name is "core." . In the other cases, apart from pipe redirection, I guess the user has defined it. Thus I print the string in kernel.core_pattern directly. > src/os/bsd/vm/os_bsd.cpp: > On OS X cores are by default written to /cores/core.. This is configurable with the kern.corefile sysctl variable, although it is rare to do so. Thank you! I changed the path to "/cores/core." . Thomas: > - jio_snprintf() returns -1 on truncation. n+=written may walk backwards. I would probably check for (written >= 0) and also, at the start of the loop, for (n < sizeof(core_path)). > - code is used in error reporting. I would be hesitant to create larger buffers on the stack. malloc may be better. I've fixed them. > - code does not detect truncation of core_path (unlikely but possible) Do you mean the variable name? "core_path" in my patch stores the contents of /proc/sys/kernel/core_pattern . The length of kernel.core_pattern is defined as 128 chars in the Linux kernel documentation: https://www.kernel.org/doc/Documentation/sysctl/kernel.txt Thus the length of core_path (129 chars) is enough. > - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets may be a tiny bit simpler. I changed it to use fgetc() . Thanks, Yasumasa (2014/11/26 23:12), Thomas Stüfe wrote: > Hi Yasumasa, > > I am not a Reviewer. Barring the general decision of the real reviewers, here are some thoughts: > > os_linux.cpp > > - jio_snprintf() returns -1 on truncation. n+=written may walk backwards. 
I would probably check for (written >= 0) and also, at the start of the loop, for (n < sizeof(core_path)). > - code is used in error reporting. I would be hesitant to create larger buffers on the stack. malloc may be better. > - code does not detect truncation of core_path (unlikely but possible) > > the rest is more a matter of taste: > - I would prefer sizeof(core_path) over PATH_MAX at all places where you refer to the size of the buffer. So you could make the buffer very small and test e.g. how your code behaves with truncation. > - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets may be a tiny bit simpler. > > Kind Regards, Thomas > > > > On Wed, Nov 26, 2014 at 4:54 AM, Yasumasa Suenaga > wrote: > > Hi Staffan, > > Thank you for reviewing! > > os_linux.cpp: > I want to print the coredump location correctly in hs_err. So I want to output > whether the coredump is processed by another process or is written to a file. > If os::get_core_path() should be simpler, I will print the raw string in > core_pattern. > > os_bsd.cpp: > I don't have OS X. So I cannot check it. > I am focusing on Linux in this enhancement. Could you file it as another > enhancement if it is needed? > > Thanks, > > Yasumasa > > 2014/11/25 18:15 "Staffan Larsen" >: > > > src/os/linux/vm/os_linux.cpp: > > I'm inclined to think this is too complicated and hard to test and > > maintain (and I see no tests in the webrev). Could we not simplify this to > > print a helpful message instead? Something that prints the core_pattern and > > perhaps some of the values that could be used for substitution, but does > > not do the actual substitution? I think that would go a long way but be a > > lot more maintainable. > > > > src/os/bsd/vm/os_bsd.cpp: > > On OS X cores are by default written to /cores/core.. This is > > configurable with the kern.corefile sysctl variable, although it is rare > > to do so. 
> > > > /Staffan > > > > > On 24 nov 2014, at 14:21, Yasumasa Suenaga > wrote: > > > > > > Hi all, > > > > > > I've uploaded webrev for this issue about a month ago. > > > Could you review it and sponsor it? > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote: > > >> Hi David, > > >> > > >> I've uploaded new webrev: > > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ > > >> > > >> > > >>> I wasn't suggesting that you make such a change though because it is > > large and disruptive. > > >> > > >>> Unfactoring check_or_create_dump is a step backwards in terms of code > > sharing. > > >> > > >> I restored check_or_create_dump() to os_posix.cpp . > > >> And I changed get_core_path() to create message which represents core > > dump path > > >> (including filename) in each OS. > > >> > > >> > > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern > > may be okay (but I don't know enough about it to validate everything). > > >> > > >> I implemented all parameters in Linux kernel documentation: > > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt > > >> > > >> So I think that parameters which are processed are enough. > > >> > > >> > > >> Thanks, > > >> > > >> Yasumasa > > >> > > >> > > >> > > >> (2014/10/15 9:41), David Holmes wrote: > > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote: > > >>>> Hi David, > > >>>> > > >>>> Thank you for comments! > > >>>> I've uploaded new webrev. Could you review it again? > > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/ > > >>>> > > >>>> I am an author of jdk9. So I cannot commit it. > > >>>> Could you be a sponsor for this enhancement? > > >>>> > > >>>> > > >>>>> In which case that should be handled by the linux specific > > >>>>> get_core_path() function. > > >>>> > > >>>> Agree. > > >>>> So I implemented it in os_linux.cpp . 
> > >>>> But part of format characters (%P: global pid, %s: signal, %t dump > > time) > > >>>> are not processed > > >>>> in this function because I think these parameters are difficult to > > >>>> handle in it. > > >>>> > > >>>> %P: I could not find API for this. > > >>>> %s: We have to change arguments of get_core_path() . > > >>>> %t: This parameter means timestamp of coredump. It is decided in > > Kernel. > > >>>> > > >>>> > > >>>>> Fixing this means changing all the os_posix using platforms. But your > > >>>>> patch is not about this part. :) > > >>>> > > >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, > > BSD, > > >>>> Solaris, Linux) . > > >>>> So I can write Linux specific code to check_or_create_dump() . > > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-) > > >>> > > >>> I wasn't suggesting that you make such a change though because it is > > large and disruptive. The simple handling of the | part of core_pattern was > > basically ok. Expanding the get_core_path in os_linux.cpp to handle the > > core_pattern may be okay (but I don't know enough about it to validate > > everything). Unfactoring check_or_create_dump is a step backwards in terms > > of code sharing. > > >>> > > >>> Sorry this has grown too large for me to deal with right now. > > >>> > > >>> David > > >>> ----- > > >>> > > >>>> > > >>>>> Though I'm unclear whether it both invokes the program and creates a > > >>>>> core dump file; or just invokes the program? > > >>>> > > >>>> If '|' is set, Linux kernel will just redirect core image to user > > process. > > >>>> Kernel documentation says as below: > > >>>> ------------ > > >>>> . If the first character of the pattern is a '|', the kernel will > > treat > > >>>> the rest of the pattern as a command to run. The core dump will be > > >>>> written to the standard input of that program instead of to a file. 
> > >>>> ------------ > > >>>> > > >>>> And implementation of coredump (do_coredump()) follows to it. > > >>>> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c > > >>>> > > >>>> > > >>>> In case of ABRT, ABRT dumps core image to default location > > >>>> (/core.) > > >>>> if user set unlimited to resource limit of core (ulimit -c) . > > >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c > > >>>> > > >>>> > > >>>>> A few style nits - you need spaces around keywords and before braces > > >>>>> I also suggest saying "Core dumps may be processed with ..." rather > > >>>>> than "treated". > > >>>>> And as you don't do anything in the non-redirect case I suggest > > >>>>> collapsing this: > > >>>> > > >>>> I've fixed them. > > >>>> > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Yasumasa > > >>>> > > >>>> > > >>>> (2014/10/13 9:41), David Holmes wrote: > > >>>>> Hi Yasumasa, > > >>>>> > > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote: > > >>>>>> Hi David, > > >>>>>> > > >>>>>> Sorry for my English. > > >>>>>> > > >>>>>> I want to propose that JVM should create message according to core > > >>>>>> pattern (/proc/sys/kernel/core_pattern) . > > >>>>>> So I filed it to JBS and created a patch. > > >>>>> > > >>>>> So I've had a quick look at this core_pattern business and it seems > > to > > >>>>> me that there are two aspects to this. > > >>>>> > > >>>>> First, without the leading |, the entry in the core_pattern file is a > > >>>>> naming pattern for the core file. In which case that should be > > handled > > >>>>> by the linux specific get_core_path() function. Though that in itself > > >>>>> can't fully report the expected name, as part of it is provided in > > the > > >>>>> shared code in os::check_or_create_dump. Fixing this means changing > > >>>>> all the os_posix using platforms. But your patch is not about this > > >>>>> part. 
:) > > >>>>> > > >>>>> Second, with a leading | the core_pattern is actually the name of a > > >>>>> program to execute when the program is about to core dump, and that > > is > > >>>>> what you report with your patch. Though I'm unclear whether it both > > >>>>> invokes the program and creates a core dump file; or just invokes the > > >>>>> program? > > >>>>> > > >>>>> So with regards to this second part your patch seems functionally ok. > > >>>>> I do dislike having a big chunk of linux specific code in this > > "posix" > > >>>>> support file but ... > > >>>>> > > >>>>> A few style nits - you need spaces around keywords and before braces > > eg: > > >>>>> > > >>>>> if(x){ > > >>>>> > > >>>>> should be > > >>>>> > > >>>>> if (x) { > > >>>>> > > >>>>> I also suggest saying "Core dumps may be processed with ..." rather > > >>>>> than "treated". > > >>>>> > > >>>>> And as you don't do anything in the non-redirect case I suggest > > >>>>> collapsing this: > > >>>>> > > >>>>> 83 is_redirect = core_pattern[0] == '|'; > > >>>>> 84 } > > >>>>> 85 > > >>>>> 86 if(is_redirect){ > > >>>>> 87 jio_snprintf(buffer, bufferSize, > > >>>>> 88 "Core dumps may be treated with \"%s\"", > > >>>>> &core_pattern[1]); > > >>>>> 89 } > > >>>>> > > >>>>> to just > > >>>>> > > >>>>> 83 if (core_pattern[0] == '|') { // redirect > > >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be > > >>>>> processed with \"%s\"", &core_pattern[1]); > > >>>>> 85 } > > >>>>> 86 } > > >>>>> > > >>>>> Comments from other runtime folk appreciated. > > >>>>> > > >>>>> Thanks, > > >>>>> David > > >>>>> > > >>>>>> Thanks, > > >>>>>> > > >>>>>> Yasumasa > > >>>>>> > > >>>>>> 2014/10/07 15:43 "David Holmes" > > >>>>>> >>: > > >>>>>> > > >>>>>> Hi Yasumasa, > > >>>>>> > > >>>>>> I'm sorry but I don't understand what you are proposing. When you > > >>>>>> say > > >>>>>> "treat" do you mean "create"? Otherwise what do you mean by > > >>>>>> "treated"? 
> > >>>>>> > > >>>>>> Thanks, > > >>>>>> David > > >>>>>> > > >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote: > > >>>>>> > I'm in Hackergarten @ JavaOne :-) > > >>>>>> > > > >>>>>> > > > >>>>>> > Hi all, > > >>>>>> > > > >>>>>> > I would like to enhance the messages in hs_err report. > > >>>>>> > Modern Linux kernel can treat core dump with user process > > >>>>>> (e.g. ABRT) > > >>>>>> > However, hs_err report cannot detect it. > > >>>>>> > > > >>>>>> > I think that hs_err report should output messages as below: > > >>>>>> > ------------- > > >>>>>> > Failed to write core dump. Core dumps may be treated with > > >>>>>> "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s > > %c %p > > >>>>>> %u %g %t e" > > >>>>>> > ------------- > > >>>>>> > > > >>>>>> > I've uploaded webrev of this enhancement. > > >>>>>> > Could you review it? > > >>>>>> > > > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/ > > >>>>>> > > > >>>>>> > This patch works fine on Fedora20 x86_64. > > >>>>>> > > > >>>>>> > > > >>>>>> > > > >>>>>> > Thanks, > > >>>>>> > > > >>>>>> > Yasumasa > > >>>>>> > > > >>>>>> > > > > > >
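For readers following the JDK-8059586 thread above, here is a minimal sketch of the simpler behaviour that was discussed: report the raw core_pattern, and only distinguish the leading '|' (pipe to a user-space handler such as ABRT) from a plain file-name template. This is not the code from any of the webrevs. The function names describe_core_pattern and read_core_pattern are hypothetical, and the message wording only follows the "Core dumps may be processed with ..." suggestion quoted above. It uses std::string rather than the fixed buffers and jio_snprintf that HotSpot error reporting would need.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not HotSpot code): turn a kernel.core_pattern value
// into a human-readable hint for an error report.
std::string describe_core_pattern(const std::string& pattern) {
    if (pattern.empty()) {
        // Nothing could be read, so make no claim about the location.
        return "Core dump location is unknown (empty core_pattern)";
    }
    if (pattern[0] == '|') {
        // A leading '|' means the kernel pipes the core image to this
        // command's standard input instead of writing a file.
        return "Core dumps may be processed with \"" + pattern.substr(1) + "\"";
    }
    // Otherwise the value is a file-name template; print it unexpanded
    // rather than substituting %p, %s, %t and friends.
    return "Core dump file name pattern: \"" + pattern + "\"";
}

// Hypothetical helper: read the first line of the core_pattern file.
// Returns an empty string if the file cannot be opened (e.g. not on Linux).
std::string read_core_pattern(
        const char* path = "/proc/sys/kernel/core_pattern") {
    std::ifstream in(path);
    std::string line;
    if (in) {
        std::getline(in, line);
    }
    return line;
}
```

A caller would print describe_core_pattern(read_core_pattern()) into the report. Leaving the % specifiers unexpanded keeps the helper small and easy to test, which is the maintainability trade-off Staffan raised in the review.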