From alexey.menkov at oracle.com  Thu Mar  1 18:53:09 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 1 Mar 2018 10:53:09 -0800
Subject: RFR: JDK-8193369: post_field_access does not work for some functions, possibly related to fast_getfield
In-Reply-To: <91aadc35-125a-bf74-6cf5-672dc77ffb22@oracle.com>
References: <1fca6b67-c0d1-db03-52ed-f2c6bcc29a5b@oracle.com> <91aadc35-125a-bf74-6cf5-672dc77ffb22@oracle.com>
Message-ID: <3df69fad-c0d8-5667-a61a-f88a83e26d89@oracle.com>

Hi Serguei,

Thank you for the feedback.
Updated webrev:
http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev.01/

See inline for my replies to your notes.

On 02/27/2018 23:08, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> Thank you for taking care about this!
> The fix looks good to me.
>
> Some comments on the test.
>
> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/FieldAccessWatch.java.html
>
> There are some commented lines in the TestResult class.
> A cleanup is needed to delete them.
> I guess, it is already in your plan.

I deleted a couple of lines, keeping the comments for the fields.

> The empty line #135 is not needed.
> An empty line is needed after the L99.

fixed.

> Probably, the intention was to spell "startTest" instead of "initTest"
> below:
>
>  119         if (!startTest(result)) {
>  120             throw new RuntimeException("initTest failed");
>  121         }

fixed.

> I wonder if this sleep is really needed:
>     124 Thread.sleep(500);
>
> The "action.apply()" is executed synchronously, is not it?

But notifications are asynchronous, so this helps to avoid test failures
if some events are delivered a bit later in a loaded environment.
Also this helps to avoid a mess of native and Java logging.

> I'm thinking if moving the test() to native side would simplify things.

To me it's simpler and more flexible to perform the required actions in Java;
the native part only handles notifications.

> An Exception can be thrown from native if the test failed or just a
> boolean status returned.
>
>
> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/libFieldAccessWatch.c.html
>
> I'd suggest to rename currentTestResults to testResultObject,
> so it will be in line with testResultClass.

fixed.

> One concern is that the reportError() does not cause the test to
> fail and does not break the execution.
> Would it be better to throw an exception with the same message as was printed?

Updated several cases (immediate return from callbacks if something went
wrong). Note that reportError is called from native Java methods and from
JVMTI callbacks, so throwing an exception doesn't look right.

> It seems, the function tagAndWatch() adds some complexity to the code.
> Is all this really needed? Could you, please, add some comments.
> It does not seem this function tags anything.

renamed the function, added short function description.

>  168                 (*jvmti)->Deallocate(jvmti, (unsigned char*)sig);
>
>  The sig needs to be cleared after deallocation as it is used and
> checked in a loop.

Moved the variable to the correct scope.

> Missed initializations:
>
>   68     char *name;
>  142         jfieldID* klassFields;
>  143         jint fieldCount;

Fixed.

--alex

> Thanks,
> Serguei
>
>
> On 2/26/18 14:43, Alex Menkov wrote:
>> Hi all,
>>
>> Please review a fix for
>> JDK-8193369: post_field_access does not work for some functions,
>> possibly related to fast_getfield
>>
>> The fix disables "fast" command generation when FieldAccess or
>> FieldModification notifications are requested.
>>
>> jira: https://bugs.openjdk.java.net/browse/JDK-8193369
>> webrev: http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/
>>
>> --alex
>

From daniil.x.titov at oracle.com  Fri Mar  2 01:53:01 2018
From: daniil.x.titov at oracle.com (daniil.x.titov at oracle.com)
Date: Thu, 1 Mar 2018 17:53:01 -0800
Subject: RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris
In-Reply-To:
References: <86bf7cb3-50a2-ddeb-0f88-73b8b2b4b2cf@oracle.com> <18D0CADB-CDB9-4F5D-B77A-DE5A5F6A5C19@oracle.com> <4ba6d0b1-0277-3e5e-1aa9-49e1db636586@oracle.com> <1c404611-6021-96d2-5f5f-1f48735ab5e2@oracle.com>
Message-ID: <4437815e-5ea2-0a80-5a4e-b29259f757ac@oracle.com>

Hi David,

Could you please say are you OK with the answers provided or there is something else that needs to be clarified?

Thanks!

Best regards,
Daniil

On 2/26/18 3:00 PM, daniil.x.titov at oracle.com wrote:
>
>
> On 2/26/18 12:16 PM, Chris Plummer wrote:
>> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>>> Hi David and Sergei,
>>>
>>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi David,
>>>>
>>>>
>>>> On 2/20/18 20:02, David Holmes wrote:
>>>>> Hi Daniil,
>>>>>
>>>>> Good find on this!
>>>>>
>>>>> What does the actual spec say about the length of things and how
>>>>> they may be split across multiple packets? Are we guaranteed that
>>>>> at most two packets will be involved?
>>>
>>> The JDWP spec
>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html)
>>> says nothing about splitting JDWP reply packets at all but the
>>> implementation limits the max number of the sent packets to two
>>> packets max. The implementation is dated back to the initial load
>>> that happened in 2007 and the information about the related Jira
>>> issue is missing.
>>>
>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>>
>>> 836    data = packet->type.cmd.data;
>>> 837    /* Do one send for short packets, two for longer ones */
>>> 838    if (data_len <= MAX_DATA_SIZE) {
>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>>> 840        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + data_len) !=
>>> 841            JDWP_HEADER_SIZE + data_len) {
>>> 842            RETURN_IO_ERROR("send failed");
>>> 843        }
>>> 844    } else {
>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>>> 846        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>>> 848            RETURN_IO_ERROR("send failed");
>>> 849        }
>>> 850        /* Send the remaining data bytes right out of the data area. */
>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>>> 852                       data_len - MAX_DATA_SIZE) != data_len - MAX_DATA_SIZE) {
>>> 853            RETURN_IO_ERROR("send failed");
>>> 854        }
>>> 855    }
>>>
>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has no
>> size limit. What's the point then of splitting it then? Is there a
>> desire to get the header transmitted in a smaller packet.
>>
>> Chris
>
> It looks as the goal was to somehow improve the responsiveness in case
> of the large data but I am not sure about this. I could not locate any
> traces in Jira related to this implementation.
> Probably Serguei has some info what is the history behind this design.
>>>>>
>>>>>   68     protected byte[] readJdwpString(DataInputStream ds) throws IOException {
>>>>>   69         byte[] str = null;
>>>>>   70         int len = ds.readInt();
>>>>>   71         if (len > 0) {
>>>>>   72             str = new byte[len];
>>>>>   73             ds.read(str, 0, len);
>>>>>   74         }
>>>>>
>>>>> might we get a short-read of the string if it is split across
>>>>> multiple packets?
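
The short-read concern in the quoted question can be illustrated with a small sketch. This is not the test's code: the class name JdwpReadSketch, the ChunkedInputStream wrapper and the encode() helper are hypothetical and exist only to simulate a reply that arrives in pieces, and the final return statement is an assumption because the quoted snippet stops at line 74. It relies on DataInputStream.readFully(), which keeps reading until the requested number of bytes has arrived (or throws EOFException), whereas read() may return after a partial read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class JdwpReadSketch {

    /** Hypothetical wrapper: returns at most chunkSize bytes per bulk read, like a reply split across sends. */
    static final class ChunkedInputStream extends FilterInputStream {
        private final int chunkSize;

        ChunkedInputStream(InputStream in, int chunkSize) {
            super(in);
            this.chunkSize = chunkSize;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, chunkSize));
        }
    }

    /** Reads a 4-byte length followed by that many bytes; readFully loops until all bytes are read. */
    static byte[] readJdwpString(DataInputStream ds) throws IOException {
        byte[] str = null;
        int len = ds.readInt();
        if (len > 0) {
            str = new byte[len];
            ds.readFully(str, 0, len);   // a plain ds.read(str, 0, len) could stop early
        }
        return str;                      // assumed; the quoted snippet does not show the return
    }

    /** Hypothetical helper: big-endian length prefix plus UTF-8 bytes. */
    static byte[] encode(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + bytes.length).putInt(bytes.length).put(bytes).array();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode("java.base");
        DataInputStream ds = new DataInputStream(
                new ChunkedInputStream(new ByteArrayInputStream(wire), 3));
        System.out.println(new String(readJdwpString(ds), StandardCharsets.UTF_8));
    }
}
```

With the 3-byte chunk size, a single read() call would fill only the first three bytes of the buffer, while readFully() returns the whole string regardless of how many pieces the data arrives in.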
>>>>
>>> This and all other reads happen not directly from the socket input
>>> stream but rather from the DataInputStream object that is
>>> constructed in JdwpReply.initFromStream(InputStream) method. With
>>> the proposed fix we do ensure that the created DataInputStream
>>> object contains data from both packets in cases when the reply was
>>> split in two packets.
>>>
>>>> Nice catch!
>>>> Even though this fix is enough to resolve this problem now, there
>>>> is a chance,
>>>> it can fail in the future when more modules are added to the platform.
>>>>
>>>>
>>>>> I'm wondering if all these reads should be loops, ensuring we read
>>>>> the expected amount of data.
>>>>>
>>> Since the implementation of the socket transport limits the max
>>> number of packets the reply might be split in to two packets I don't
>>> think we really need it here.
>>>>> One further comment - not sure why we need the print out for when
>>>>> we do read multiple packets?
>>>>> That would seem to be a debugging aid.
>>>>
>>>> Yes, it helps to understand what happens.
>>>> Many tests have a lack of tracing which makes it harder to debug
>>>> and understand failures.
>>> That is correct. This additional tracing was added to help to
>>> understand the possible failures in the future.
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>> Thanks,
>>> Daniil
>>>
>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> A new version of the webrev that has these strings reformatted is
>>>>>> at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Daniil
>>>>>>
>>>>>> *From: *"serguei.spitsyn at oracle.com"
>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>>> *To: *Daniil Titov,
>>>>>> "serviceability-dev at openjdk.java.net"
>>>>>>
>>>>>> *Subject: *Re: RFR 8170541:
>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails
>>>>>> intermittently on Windows and Solaris
>>>>>>
>>>>>> Hi Daniil,
>>>>>>
>>>>>> Interesting issue...
>>>>>> Thank you for finding to the root cause so quickly!
>>>>>>
>>>>>> The fix looks good.
>>>>>> Could I ask you to reformat these lines to make the L54 shorter ?:
>>>>>>
>>>>>>    54                 System.out.println("[" + getClass().getName() + "] Only " + bytesRead + " bytes of " + dataLength +
>>>>>>
>>>>>>    55                         " were read in the first packet. Reading the rest...");
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>>
>>>>>>     Please review the changes that fix intermittent failure of
>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>>
>>>>>>     The problem here is that for a large data the JDWP agent
>>>>>>     (socketTransport_writePacket() method in
>>>>>>     src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>>     sends 2 packets and in some cases only the first packet is received
>>>>>>     at the time when the test reads the reply from the JDWP agent. Since
>>>>>>     the test does not check that all data is received in the first
>>>>>>     packet the correlation between commands and replies became broken
>>>>>>     (the unread second packet is read by the next command and the reply
>>>>>>     for the next command is read by the next after next command and so on).
>>>>>>
>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>>
>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>>
>>>>>>     The tests ran successfully with Mach5.
>>>>>>
>>>>>>     Best regards,
>>>>>>
>>>>>>     Daniil
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>

From david.holmes at oracle.com  Fri Mar  2 02:07:34 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 2 Mar 2018 12:07:34 +1000
Subject: RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris
In-Reply-To: <4437815e-5ea2-0a80-5a4e-b29259f757ac@oracle.com>
References: <86bf7cb3-50a2-ddeb-0f88-73b8b2b4b2cf@oracle.com> <18D0CADB-CDB9-4F5D-B77A-DE5A5F6A5C19@oracle.com> <4ba6d0b1-0277-3e5e-1aa9-49e1db636586@oracle.com> <1c404611-6021-96d2-5f5f-1f48735ab5e2@oracle.com> <4437815e-5ea2-0a80-5a4e-b29259f757ac@oracle.com>
Message-ID: <9e110cdf-2071-f8ed-7562-d7e4a42a9e52@oracle.com>

Hi Daniil,

On 2/03/2018 11:53 AM, daniil.x.titov at oracle.com wrote:
> Hi David,
>
> Could you please say are you OK with the answers provided or there is
> something else that needs to be clarified?

Sorry. Yes the answers are fine - thanks.

David

> Thanks!
>
> Best regards,
> Daniil
>
> On 2/26/18 3:00 PM, daniil.x.titov at oracle.com wrote:
>>
>>
>> On 2/26/18 12:16 PM, Chris Plummer wrote:
>>> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>>>> Hi David and Sergei,
>>>>
>>>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi David,
>>>>>
>>>>>
>>>>> On 2/20/18 20:02, David Holmes wrote:
>>>>>> Hi Daniil,
>>>>>>
>>>>>> Good find on this!
>>>>>>
>>>>>> What does the actual spec say about the length of things and how
>>>>>> they may be split across multiple packets? Are we guaranteed that
>>>>>> at most two packets will be involved?
>>>>
>>>> The JDWP spec
>>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html)
>>>> says nothing about splitting JDWP reply packets at all but the
>>>> implementation limits the max number of the sent packets to two
>>>> packets max. The implementation is dated back to the initial load
>>>> that happened in 2007 and the information about the related Jira
>>>> issue is missing.
>>>>
>>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>>>
>>>> 836    data = packet->type.cmd.data;
>>>> 837    /* Do one send for short packets, two for longer ones */
>>>> 838    if (data_len <= MAX_DATA_SIZE) {
>>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>>>> 840        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + data_len) !=
>>>> 841            JDWP_HEADER_SIZE + data_len) {
>>>> 842            RETURN_IO_ERROR("send failed");
>>>> 843        }
>>>> 844    } else {
>>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>>>> 846        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
>>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>>>> 848            RETURN_IO_ERROR("send failed");
>>>> 849        }
>>>> 850        /* Send the remaining data bytes right out of the data area. */
>>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>>>> 852                       data_len - MAX_DATA_SIZE) != data_len - MAX_DATA_SIZE) {
>>>> 853            RETURN_IO_ERROR("send failed");
>>>> 854        }
>>>> 855    }
>>>>
>>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has no
>>> size limit. What's the point then of splitting it then? Is there a
>>> desire to get the header transmitted in a smaller packet.
>>>
>>> Chris
>>
>> It looks as the goal was to somehow improve the responsiveness in case
>> of the large data but I am not sure about this. I could not locate any
>> traces in Jira related to this implementation.
>> Probably Serguei has some info what is the history behind this design.
>>>>>>
>>>>>>   68     protected byte[] readJdwpString(DataInputStream ds) throws IOException {
>>>>>>   69         byte[] str = null;
>>>>>>   70         int len = ds.readInt();
>>>>>>   71         if (len > 0) {
>>>>>>   72             str = new byte[len];
>>>>>>   73             ds.read(str, 0, len);
>>>>>>   74         }
>>>>>>
>>>>>> might we get a short-read of the string if it is split across
>>>>>> multiple packets?
>>>>>
>>>> This and all other reads happen not directly from the socket input
>>>> stream but rather from the DataInputStream object that is
>>>> constructed in JdwpReply.initFromStream(InputStream) method. With
>>>> the proposed fix we do ensure that the created DataInputStream
>>>> object contains data from both packets in cases when the reply was
>>>> split in two packets.
>>>>
>>>>> Nice catch!
>>>>> Even though this fix is enough to resolve this problem now, there
>>>>> is a chance,
>>>>> it can fail in the future when more modules are added to the platform.
>>>>>
>>>>>
>>>>>> I'm wondering if all these reads should be loops, ensuring we read
>>>>>> the expected amount of data.
>>>>>>
>>>> Since the implementation of the socket transport limits the max
>>>> number of packets the reply might be split in to two packets I don't
>>>> think we really need it here.
>>>>>> One further comment - not sure why we need the print out for when
>>>>>> we do read multiple packets?
>>>>>> That would seem to be a debugging aid.
>>>>>
>>>>> Yes, it helps to understand what happens.
>>>>> Many tests have a lack of tracing which makes it harder to debug
>>>>> and understand failures.
>>>> That is correct. This additional tracing was added to help to
>>>> understand the possible failures in the future.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> A new version of the webrev that has these strings reformatted is
>>>>>>> at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Daniil
>>>>>>>
>>>>>>> *From: *"serguei.spitsyn at oracle.com"
>>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>>>> *To: *Daniil Titov,
>>>>>>> "serviceability-dev at openjdk.java.net"
>>>>>>>
>>>>>>> *Subject: *Re: RFR 8170541:
>>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails
>>>>>>> intermittently on Windows and Solaris
>>>>>>>
>>>>>>> Hi Daniil,
>>>>>>>
>>>>>>> Interesting issue...
>>>>>>> Thank you for finding to the root cause so quickly!
>>>>>>>
>>>>>>> The fix looks good.
>>>>>>> Could I ask you to reformat these lines to make the L54 shorter ?:
>>>>>>>
>>>>>>>    54                 System.out.println("[" + getClass().getName() + "] Only " + bytesRead + " bytes of " + dataLength +
>>>>>>>
>>>>>>>    55                         " were read in the first packet. Reading the rest...");
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>>>
>>>>>>>     Please review the changes that fix intermittent failure of
>>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>>>
>>>>>>>     The problem here is that for a large data the JDWP agent
>>>>>>>     (socketTransport_writePacket() method in
>>>>>>>     src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>>>     sends 2 packets and in some cases only the first packet is received
>>>>>>>     at the time when the test reads the reply from the JDWP agent. Since
>>>>>>>     the test does not check that all data is received in the first
>>>>>>>     packet the correlation between commands and replies became broken
>>>>>>>     (the unread second packet is read by the next command and the reply
>>>>>>>     for the next command is read by the next after next command and so on).
>>>>>>>
>>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>>>
>>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>>>
>>>>>>>     The tests ran successfully with Mach5.
>>>>>>>
>>>>>>>     Best regards,
>>>>>>>
>>>>>>>     Daniil
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>

From christoph.langer at sap.com  Fri Mar  2 14:21:52 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Fri, 2 Mar 2018 14:21:52 +0000
Subject: 8u: RFR(S): 8197943: Unable to use JDWP API in JDK 8 to debug JDK 9 VM
In-Reply-To: <08762177-1fe5-ea67-f485-05824698a26a@oracle.com>
References: <678d140a-bc3c-f2c3-769b-95c35fc9e9ca@oracle.com> <08762177-1fe5-ea67-f485-05824698a26a@oracle.com>
Message-ID: <9070da009d17481cacced1bb09b3b41b@sap.com>

Thank you, Steven. I just took the bug.

Thanks Volker and Chris for reviewing. I just posted in 8u-dev for approval...

Best regards
Christoph

> -----Original Message-----
> From: Stephen Fitch [mailto:Stephen.Fitch at oracle.com]
> Sent: Monday, 26 February 2018 21:31
> To: Chris Plummer; Volker Simonis; Langer, Christoph
> Cc: serviceability-dev at openjdk.java.net
> Subject: Re: 8u: RFR(S): 8197943: Unable to use JDWP API in JDK 8 to debug JDK 9 VM
>
> Happy to see this get done (ahead of when I can get to it) feel
> free to take the JBS backport back into your name Christoph.
>
> s.
>
> On 2/26/18 11:22 AM, Chris Plummer wrote:
> > I'm not sure the old code was doing anything useful by essentially checking
> > for jdwpMajor == 0. When was it ever zero?
> >
> > Chris
> >
> > On 2/26/18 2:28 AM, Volker Simonis wrote:
> >> Hi Christoph,
> >>
> >> I think the new code is wrong for "jdwpMajor == 0", which was
> >> correctly handled before.
> >>
> >> But I'm not sure if that is relevant at all nowadays and taking into
> >> account that this is a verbatim downport from 9 I don't think we have
> >> to do better in 8u.
> >>
> >> Otherwise looks good from my side. Reviewed for 8u.
> >>
> >> Thank you and best regards,
> >> Volker
> >>
> >>
> >> On Thu, Feb 22, 2018 at 11:21 AM, Langer, Christoph
> >> wrote:
> >>> Hi JDK 8 reviewers,
> >>>
> >>> I'd like to propose a fix for a backport of JDI changes that came with
> >>> modularization.
> >>>
> >>> Egor has brought this up in 8u-dev:
> >>> http://mail.openjdk.java.net/pipermail/jdk8u-dev/2018-February/007230.html
> >>>
> >>> The fix is straightforward and is used by Egor in JetBrains, as well as we
> >>> at SAP have this already patched for a while in our JDK8 port.
> >>>
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8197943
> >>>
> >>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8197943.0/
> >>>
> >>> The build went through fine and jtreg tests are running.
> >>>
> >>> @Stephen: In the bug I read that you wanted to backport this. I hope this
> >>> matches your intentions. Otherwise I'd step back and wait for your proposal?

From chris.plummer at oracle.com  Fri Mar  2 22:57:56 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 2 Mar 2018 14:57:56 -0800
Subject: RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris
In-Reply-To: <1b51738d-4eee-f07a-4342-a814bd30eacd@oracle.com>
References: <86bf7cb3-50a2-ddeb-0f88-73b8b2b4b2cf@oracle.com> <18D0CADB-CDB9-4F5D-B77A-DE5A5F6A5C19@oracle.com> <4ba6d0b1-0277-3e5e-1aa9-49e1db636586@oracle.com> <1c404611-6021-96d2-5f5f-1f48735ab5e2@oracle.com> <07e6e62e-5713-fd57-34f6-ad439e8da209@oracle.com> <1b51738d-4eee-f07a-4342-a814bd30eacd@oracle.com>
Message-ID: <61b28e60-be95-66e5-a342-6275dbda5457@oracle.com>

Finally got around to reading up on this. At first I was expecting to
see it originally sent one large packet, but from reading up on the
issue it looks like the problem was it sent a packet for each field of
the JDWP header, resulting in too many small packets, and the fix for
this was to coalesce the header into one packet. So this CR doesn't
explain why two packets are sent (when the data is large) instead of one.

Chris

On 2/26/18 8:20 PM, David Holmes wrote:
> The two-step send came in with:
>
> https://bugs.openjdk.java.net/browse/JDK-6401245
>
> "Small JDWP packets with the socket transport causes slow debugging on
> linux 2.6.15 kernel and newer"
>
> David
> -----
>
> On 27/02/2018 9:29 AM, serguei.spitsyn at oracle.com wrote:
>> On 2/26/18 15:06, Chris Plummer wrote:
>>> On 2/26/18 3:00 PM, daniil.x.titov at oracle.com wrote:
>>>>
>>>>
>>>> On 2/26/18 12:16 PM, Chris Plummer wrote:
>>>>> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>>>>>> Hi David and Sergei,
>>>>>>
>>>>>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>
>>>>>>> On 2/20/18 20:02, David Holmes wrote:
>>>>>>>> Hi Daniil,
>>>>>>>>
>>>>>>>> Good find on this!
>>>>>>>>
>>>>>>>> What does the actual spec say about the length of things and
>>>>>>>> how they may be split across multiple packets? Are we
>>>>>>>> guaranteed that at most two packets will be involved?
>>>>>>
>>>>>> The JDWP spec
>>>>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html)
>>>>>> says nothing about splitting JDWP reply packets at all but the
>>>>>> implementation limits the max number of the sent packets to two
>>>>>> packets max. The implementation is dated back to the initial load
>>>>>> that happened in 2007 and the information about the related Jira
>>>>>> issue is missing.
>>>>>>
>>>>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>>>>>
>>>>>> 836    data = packet->type.cmd.data;
>>>>>> 837    /* Do one send for short packets, two for longer ones */
>>>>>> 838    if (data_len <= MAX_DATA_SIZE) {
>>>>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>>>>>> 840        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + data_len) !=
>>>>>> 841            JDWP_HEADER_SIZE + data_len) {
>>>>>> 842            RETURN_IO_ERROR("send failed");
>>>>>> 843        }
>>>>>> 844    } else {
>>>>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>>>>>> 846        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
>>>>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>>>>>> 848            RETURN_IO_ERROR("send failed");
>>>>>> 849        }
>>>>>> 850        /* Send the remaining data bytes right out of the data area. */
>>>>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>>>>>> 852                       data_len - MAX_DATA_SIZE) != data_len - MAX_DATA_SIZE) {
>>>>>> 853            RETURN_IO_ERROR("send failed");
>>>>>> 854        }
>>>>>> 855    }
>>>>>>
>>>>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has
>>>>> no size limit. What's the point then of splitting it then? Is
>>>>> there a desire to get the header transmitted in a smaller packet.
>>>>>
>>>>> Chris
>>>>
>>>> It looks as the goal was to somehow improve the responsiveness in
>>>> case of the large data but I am not sure about this. I could not
>>>> locate any traces in Jira related to this implementation.
>>> I was thinking it might be something like that too. Get the header
>>> across the wire quickly. Maybe the user just wants the header (with
>>> size info) initially, and will allocate a large buffer for the rest
>>> if necessary.
>>
>> It was my guess too.
>> At least, it is the best explanation for this design that looks
>> reasonable to me.
>>
>>
>>> Chris
>>>> Probably Serguei has some info what is the history behind this design.
>>
>> I don't know the history here.
>> This was implemented in very early days, most likely, before JDK 1.5
>> or even 1.4.2.
>>
>> Thanks,
>> Serguei
>>
>>>>>>>>
>>>>>>>>   68     protected byte[] readJdwpString(DataInputStream ds) throws IOException {
>>>>>>>>   69         byte[] str = null;
>>>>>>>>   70         int len = ds.readInt();
>>>>>>>>   71         if (len > 0) {
>>>>>>>>   72             str = new byte[len];
>>>>>>>>   73             ds.read(str, 0, len);
>>>>>>>>   74         }
>>>>>>>>
>>>>>>>> might we get a short-read of the string if it is split across
>>>>>>>> multiple packets?
>>>>>>>
>>>>>> This and all other reads happen not directly from the socket
>>>>>> input stream but rather from the DataInputStream object that is
>>>>>> constructed in JdwpReply.initFromStream(InputStream) method. With
>>>>>> the proposed fix we do ensure that the created DataInputStream
>>>>>> object contains data from both packets in cases when the reply
>>>>>> was split in two packets.
>>>>>>
>>>>>>> Nice catch!
>>>>>>> Even though this fix is enough to resolve this problem now,
>>>>>>> there is a chance,
>>>>>>> it can fail in the future when more modules are added to the
>>>>>>> platform.
>>>>>>>
>>>>>>>
>>>>>>>> I'm wondering if all these reads should be loops, ensuring we
>>>>>>>> read the expected amount of data.
>>>>>>>>
>>>>>> Since the implementation of the socket transport limits the max
>>>>>> number of packets the reply might be split in to two packets I
>>>>>> don't think we really need it here.
>>>>>>>> One further comment - not sure why we need the print out for
>>>>>>>> when we do read multiple packets?
>>>>>>>> That would seem to be a debugging aid.
>>>>>>>
>>>>>>> Yes, it helps to understand what happens.
>>>>>>> Many tests have a lack of tracing which makes it harder to debug
>>>>>>> and understand failures.
>>>>>> That is correct. This additional tracing was added to help to
>>>>>> understand the possible failures in the future.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>> Thanks,
>>>>>> Daniil
>>>>>>
>>>>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> A new version of the webrev that has these strings reformatted
>>>>>>>>> is at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Daniil
>>>>>>>>>
>>>>>>>>> *From: *"serguei.spitsyn at oracle.com"
>>>>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>>>>>> *To: *Daniil Titov,
>>>>>>>>> "serviceability-dev at openjdk.java.net"
>>>>>>>>>
>>>>>>>>> *Subject: *Re: RFR 8170541:
>>>>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails
>>>>>>>>> intermittently on Windows and Solaris
>>>>>>>>>
>>>>>>>>> Hi Daniil,
>>>>>>>>>
>>>>>>>>> Interesting issue...
>>>>>>>>> Thank you for finding to the root cause so quickly!
>>>>>>>>>
>>>>>>>>> The fix looks good.
>>>>>>>>> Could I ask you to reformat these lines to make the L54
>>>>>>>>> shorter ?:
>>>>>>>>>
>>>>>>>>>    54                 System.out.println("[" + getClass().getName() + "] Only " + bytesRead + " bytes of " + dataLength +
>>>>>>>>>
>>>>>>>>>    55                         " were read in the first packet. Reading the rest...");
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>>>>>
>>>>>>>>>     Please review the changes that fix intermittent failure of
>>>>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>>>>>
>>>>>>>>>     The problem here is that for a large data the JDWP agent
>>>>>>>>>     (socketTransport_writePacket() method in
>>>>>>>>>     src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>>>>>     sends 2 packets and in some cases only the first packet is received
>>>>>>>>>     at the time when the test reads the reply from the JDWP agent. Since
>>>>>>>>>     the test does not check that all data is received in the first
>>>>>>>>>     packet the correlation between commands and replies became broken
>>>>>>>>>     (the unread second packet is read by the next command and the reply
>>>>>>>>>     for the next command is read by the next after next command and so on).
>>>>>>>>>
>>>>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>>>>>
>>>>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>>>>>
>>>>>>>>>     The tests ran successfully with Mach5.
>>>>>>>>>
>>>>>>>>>     Best regards,
>>>>>>>>>
>>>>>>>>>     Daniil
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>

From david.holmes at oracle.com  Sat Mar  3 04:41:42 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 3 Mar 2018 14:41:42 +1000
Subject: RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris
In-Reply-To: <61b28e60-be95-66e5-a342-6275dbda5457@oracle.com>
References: <86bf7cb3-50a2-ddeb-0f88-73b8b2b4b2cf@oracle.com> <18D0CADB-CDB9-4F5D-B77A-DE5A5F6A5C19@oracle.com> <4ba6d0b1-0277-3e5e-1aa9-49e1db636586@oracle.com> <1c404611-6021-96d2-5f5f-1f48735ab5e2@oracle.com> <07e6e62e-5713-fd57-34f6-ad439e8da209@oracle.com> <1b51738d-4eee-f07a-4342-a814bd30eacd@oracle.com> <61b28e60-be95-66e5-a342-6275dbda5457@oracle.com>
Message-ID:

On 3/03/2018 8:57 AM, Chris Plummer wrote:
> Finally got around to reading up on this. At first I was expecting to
> see it originally sent one large packet, but from reading up on the
> issue it looks like the problem was it sent a packet for each field of
> the JDWP header, resulting in too many small packets, and the fix for
> this was to coalesce the header into one packet. So this CR doesn't
> explain why two packets are sent (when the data is large) instead of one.

No there is no verbiage to explain it, but that was the fix that put in
the code:

static jdwpTransportError JNICALL
socketTransport_writePacket(jdwpTransportEnv* env, const jdwpPacket *packet)
{
    jint len, data_len, id;
    /*
     * room for header and up to MAX_DATA_SIZE data bytes
     */
    char header[HEADER_SIZE + MAX_DATA_SIZE];
    jbyte *data;

    /* packet can't be null */
    if (packet == NULL) {
        RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, "packet is NULL");
    }

    len = packet->type.cmd.len;         /* includes header */
    data_len = len - HEADER_SIZE;

    /* bad packet */
    if (data_len < 0) {
        RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, "invalid length");
    }

    /* prepare the header for transmission */
    len = (jint)dbgsysHostToNetworkLong(len);
    id = (jint)dbgsysHostToNetworkLong(packet->type.cmd.id);

    memcpy(header + 0, &len, 4);
    memcpy(header + 4, &id, 4);
    header[8] = packet->type.cmd.flags;
    if (packet->type.cmd.flags & JDWPTRANSPORT_FLAGS_REPLY) {
        jshort errorCode =
            dbgsysHostToNetworkShort(packet->type.reply.errorCode);
        memcpy(header + 9, &errorCode, 2);
    } else {
        header[9] = packet->type.cmd.cmdSet;
        header[10] = packet->type.cmd.cmd;
    }

    data = packet->type.cmd.data;
    /* Do one send for short packets, two for longer ones */
    if (data_len <= MAX_DATA_SIZE) {
        memcpy(header + HEADER_SIZE, data, data_len);
        if (dbgsysSend(socketFD, (char *)&header, HEADER_SIZE + data_len, 0) !=
            HEADER_SIZE + data_len) {
            RETURN_IO_ERROR("send failed");
        }
    } else {
        memcpy(header + HEADER_SIZE, data, MAX_DATA_SIZE);
        if (dbgsysSend(socketFD, (char *)&header, HEADER_SIZE + MAX_DATA_SIZE, 0) !=
            HEADER_SIZE + MAX_DATA_SIZE) {
            RETURN_IO_ERROR("send failed");
        }
        /* Send the remaining data bytes right out of the data area. */
        if (dbgsysSend(socketFD, (char *)data + MAX_DATA_SIZE,
                       data_len - MAX_DATA_SIZE, 0) != data_len - MAX_DATA_SIZE) {
            RETURN_IO_ERROR("send failed");
        }
    }

    return JDWPTRANSPORT_ERROR_NONE;
}

David
------

> Chris
>
> On 2/26/18 8:20 PM, David Holmes wrote:
>> The two-step send came in with:
>>
>> https://bugs.openjdk.java.net/browse/JDK-6401245
>>
>> "Small JDWP packets with the socket transport causes slow debugging on
>> linux 2.6.15 kernel and newer"
>>
>> David
>> -----
>>
>> On 27/02/2018 9:29 AM, serguei.spitsyn at oracle.com wrote:
>>> On 2/26/18 15:06, Chris Plummer wrote:
>>>> On 2/26/18 3:00 PM, daniil.x.titov at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 2/26/18 12:16 PM, Chris Plummer wrote:
>>>>>> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>>>>>>> Hi David and Sergei,
>>>>>>>
>>>>>>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/20/18 20:02, David Holmes wrote:
>>>>>>>>> Hi Daniil,
>>>>>>>>>
>>>>>>>>> Good find on this!
>>>>>>>>>
>>>>>>>>> What does the actual spec say about the length of things and
>>>>>>>>> how they may be split across multiple packets? Are we
>>>>>>>>> guaranteed that at most two packets will be involved?
>>>>>>>
>>>>>>> The JDWP spec
>>>>>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html)
>>>>>>> says nothing about splitting JDWP reply packets at all but the
>>>>>>> implementation limits the max number of the sent packets to two
>>>>>>> packets max. The implementation is dated back to the initial load
>>>>>>> that happened in 2007 and the information about the related Jira
>>>>>>> issue is missing.
>>>>>>>
>>>>>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>>>>>>
>>>>>>> 836    data = packet->type.cmd.data;
>>>>>>> 837    /* Do one send for short packets, two for longer ones */
>>>>>>> 838    if (data_len <= MAX_DATA_SIZE) {
>>>>>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>>>>>>> 840        if (send_fully(socketFD, (char *)&header,
>>>>>>> JDWP_HEADER_SIZE + data_len) !=
>>>>>>> 841            JDWP_HEADER_SIZE + data_len) {
>>>>>>> 842            RETURN_IO_ERROR("send failed");
>>>>>>> 843        }
>>>>>>> 844    } else {
>>>>>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>>>>>>> 846        if (send_fully(socketFD, (char *)&header,
>>>>>>> JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
>>>>>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>>>>>>> 848            RETURN_IO_ERROR("send failed");
>>>>>>> 849        }
>>>>>>> 850        /* Send the remaining data bytes right out of the data
>>>>>>> area. */
>>>>>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>>>>>>> 852                       data_len - MAX_DATA_SIZE) != data_len -
>>>>>>> MAX_DATA_SIZE) {
>>>>>>> 853            RETURN_IO_ERROR("send failed");
>>>>>>> 854        }
>>>>>>> 855    }
>>>>>>>
>>>>>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has
>>>>>> no size limit. What's the point then of splitting it then? Is
>>>>>> there a desire to get the header transmitted in a smaller packet.
>>>>>>
>>>>>> Chris
>>>>>
>>>>> It looks as the goal was to somehow improve the responsiveness in
>>>>> case of the large data but I am not sure about this. I could not
>>>>> locate any traces in Jira related to this implementation.
>>>> I was thinking it might be something like that too. Get the header
>>>> across the wire quickly. Maybe the user just wants the header (with
>>>> size info) initially, and will allocate a large buffer for the rest
>>>> if necessary.
>>>
>>> It was my guess too.
>>> At least, it is the best explanation for this design that looks
>>> reasonable to me.
>>>
>>>
>>>> Chris
>>>>> Probably Serguei has some info what is the history behind this design.
>>>
>>> I don't know the history here.
>>> This was implemented in very early days, most likely, before JDK 1.5
>>> or even 1.4.2.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>>>>>>
>>>>>>>>>   68     protected byte[] readJdwpString(DataInputStream ds)
>>>>>>>>> throws IOException {
>>>>>>>>>   69         byte[] str = null;
>>>>>>>>>   70         int len = ds.readInt();
>>>>>>>>>   71         if (len > 0) {
>>>>>>>>>   72             str = new byte[len];
>>>>>>>>>   73             ds.read(str, 0, len);
>>>>>>>>>   74         }
>>>>>>>>>
>>>>>>>>> might we get a short-read of the string if it is split across
>>>>>>>>> multiple packets?
>>>>>>>>
>>>>>>> This and all other reads happen not directly from the socket
>>>>>>> input stream but rather from the DataInputStream object that is
>>>>>>> constructed in JdwpReply.initFromStream(InputStream) method. With
>>>>>>> the proposed fix we do ensure that the created DataInputStream
>>>>>>> object contains data from both packets in cases when the reply
>>>>>>> was split in two packets.
>>>>>>>
>>>>>>>> Nice catch!
>>>>>>>> Even though this fix is enough to resolve this problem now,
>>>>>>>> there is a chance,
>>>>>>>> it can fail in the future when more modules are added to the
>>>>>>>> platform.
>>>>>>>>
>>>>>>>>
>>>>>>>>> I'm wondering if all these reads should be loops, ensuring we
>>>>>>>>> read the expected amount of data.
>>>>>>>>>
>>>>>>> Since the implementation of the socket transport limits the max
>>>>>>> number of packets the reply might be split in to two packets I
>>>>>>> don't think we really need it here.
>>>>>>>>> One further comment - not sure why we need the print out for
>>>>>>>>> when we do read multiple packets?
>>>>>>>>> That would seem to be a debugging aid.
>>>>>>>>
>>>>>>>> Yes, it helps to understand what happens.
>>>>>>>> Many tests have a lack of tracing which makes it harder to debug
>>>>>>>> and understand failures.
>>>>>>> That is correct. This additional tracing was added to help to
>>>>>>> understand the possible failures in the future.
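[Archive note, not part of the original message.] The suggestion quoted above, that each read should loop until the expected number of bytes has arrived, can be illustrated with a short sketch. It is written in C rather than the Java of the test, read_fully is a hypothetical helper and not a function from the JDK sources, and it assumes a blocking POSIX file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until `len` bytes have been
 * collected, the peer closes the connection, or a real error occurs.
 * Returns the number of bytes read (less than `len` only on end of
 * stream) or -1 on error.
 */
static ssize_t read_fully(int fd, void *buf, size_t len) {
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL || len == 0);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {
            got += (size_t)n;   /* partial read: continue with the rest */
        } else if (n == 0) {
            break;              /* end of stream before `len` bytes */
        } else if (errno != EINTR) {
            return -1;          /* real error; EINTR is simply retried */
        }
    }
    return (ssize_t)got;
}
```

With a loop of this shape the caller does not depend on how many packets the sender used, which is the property being asked about in the quoted question.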
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>> Thanks,
>>>>>>> Daniil
>>>>>>>
>>>>>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> A new version of the webrev that has these strings reformatted
>>>>>>>>>> is at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>>>>>>
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Daniil
>>>>>>>>>>
>>>>>>>>>> *From: *"serguei.spitsyn at oracle.com"
>>>>>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>>>>>>> *To: *Daniil Titov ,
>>>>>>>>>> "serviceability-dev at openjdk.java.net"
>>>>>>>>>>
>>>>>>>>>> *Subject: *Re: RFR 8170541:
>>>>>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails
>>>>>>>>>> intermittently on Windows and Solaris
>>>>>>>>>>
>>>>>>>>>> Hi Daniil,
>>>>>>>>>>
>>>>>>>>>> Interesting issue...
>>>>>>>>>> Thank you for finding to the root cause so quickly!
>>>>>>>>>>
>>>>>>>>>> The fix looks good.
>>>>>>>>>> Could I ask you to reformat these lines to make the L54
>>>>>>>>>> shorter ?:
>>>>>>>>>>
>>>>>>>>>>    54                 System.out.println("[" +
>>>>>>>>>> getClass().getName() + "] Only " + bytesRead + " bytes of " +
>>>>>>>>>> dataLength +
>>>>>>>>>>
>>>>>>>>>>    55                         " were read in the first packet.
>>>>>>>>>> Reading the rest...");
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>>>>>>
>>>>>>>>>>     Please review the changes that fix intermittent failure of
>>>>>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>>>>>>
>>>>>>>>>>     The problem here is that for a large data the JDWP agent
>>>>>>>>>>     (socketTransport_writePacket() method in
>>>>>>>>>> src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>>>>>>     sends 2 packets and in some cases only the first packet is
>>>>>>>>>> received
>>>>>>>>>>     at the time when the test reads the reply from the JDWP
>>>>>>>>>> agent. Since
>>>>>>>>>>     the test does not check that all data is received in the
>>>>>>>>>> first
>>>>>>>>>>     packet the correlation between commands and replies became
>>>>>>>>>> broken
>>>>>>>>>>     (the unread second packet is read by the next command and
>>>>>>>>>> the reply
>>>>>>>>>>     for the next command is read by the next after next
>>>>>>>>>> command and so on).
>>>>>>>>>>
>>>>>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>>>>>>
>>>>>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>>>>>>
>>>>>>>>>>     The tests ran successfully with Mach5.
>>>>>>>>>>
>>>>>>>>>>     Best regards,
>>>>>>>>>>
>>>>>>>>>>     Daniil
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>

From christoph.langer at sap.com Mon Mar 5 09:03:19 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 5 Mar 2018 09:03:19 +0000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
Message-ID:

Hi,

please review a small fix that was identified by a coverity code scan.

In case strlen(name) was the same or larger than name_length_max or resp. strlen(arg) >= arg_length_max, the _name or _arg fields would not get null terminated correctly.

Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/

Thanks
Christoph

From david.holmes at oracle.com Mon Mar 5 11:29:26 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 5 Mar 2018 21:29:26 +1000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To:
References:
Message-ID:

Hi Christoph,

On 5/03/2018 7:03 PM, Langer, Christoph wrote:
> Hi,
>
> please review a small fix that was identified by a coverity code scan.
>
> In case strlen(name) was the same or larger than name_length_max or resp. strlen(arg) >= arg_length_max, the _name or _arg fields would not get null terminated correctly.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/

That looks good to me.

Thanks,
David

> Thanks
> Christoph
>

From thomas.stuefe at gmail.com Mon Mar 5 14:52:32 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 5 Mar 2018 15:52:32 +0100
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To:
References:
Message-ID:

Hi Christoph,

Seeing that truncation is considered assertion worthy, should we really
hide it in release?

Gruß Thomas

On Mar 5, 2018 10:03, "Langer, Christoph" wrote:

> Hi,
>
> please review a small fix that was identified by a coverity code scan.
>
> In case strlen(name) was the same or larger than name_length_max or resp.
> strlen(arg) >= arg_length_max, the _name or _arg fields would not get null
> terminated correctly.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/
>
> Thanks
> Christoph
>
>

From christoph.langer at sap.com Mon Mar 5 15:37:49 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 5 Mar 2018 15:37:49 +0000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To:
References:
Message-ID: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>

Hi Thomas,

well, I think this discussion is beyond the scope of my contribution. Probably one doesn't want the risk of JVM crashes/exits just because someone shoots in a bad attach operation name which is too long.

So, may I consider it reviewed from your end? I'm trying the submission repo right now with this change...

Best regards
Christoph

From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
Sent: Montag, 5. März 2018 15:53
To: Langer, Christoph
Cc: Hotspot dev runtime ; serviceability-dev at openjdk.java.net
Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans

Hi Christoph,

Seeing that truncation is considered assertion worthy, should we really hide it in release?

Gruß Thomas

On Mar 5, 2018 10:03, "Langer, Christoph" wrote:
Hi,

please review a small fix that was identified by a coverity code scan.

In case strlen(name) was the same or larger than name_length_max or resp. strlen(arg) >= arg_length_max, the _name or _arg fields would not get null terminated correctly.

Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/

Thanks
Christoph

From chris.plummer at oracle.com Mon Mar 5 17:18:21 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 5 Mar 2018 09:18:21 -0800
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To: <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
 <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
Message-ID: <12af0630-5952-59bb-b039-96408bab6b0b@oracle.com>

Sorry, meant to address Christoph, not Thomas.

Chris

On 3/5/18 9:17 AM, Chris Plummer wrote:
> Hi Thomas
>
> Asserts imply something that is supposed to never happen, but that you
> want to check for in debug builds to help uncover bugs. Given this,
> either we have a bug (and someone can pass in a name that is too
> long), or coverity is complaining about something that can never
> happen, or the assert is invalid. So the potential fixes are:
>
> -Fix the problem up the call chain where the invalid string can be
> passed in.
> -Tell coverity to clam up because having the string be too long is not
> possible.
> -Leave in your fix but remove the assert.
>
> thanks,
>
> Chris
>
> On 3/5/18 7:37 AM, Langer, Christoph wrote:
>> Hi Thomas,
>>
>> well, I think this discussion is beyond the scope of my contribution.
>> Probably one doesn't want the risk of JVM crashes/exits just
>> because someone shoots in a bad attach operation name which is too long.
>>
>> So, may I consider it reviewed from your end? I'm trying the
>> submission repo right now with this change...
>>
>> Best regards
>> Christoph
>>
>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>> Sent: Montag, 5. März 2018 15:53
>> To: Langer, Christoph
>> Cc: Hotspot dev runtime ;
>> serviceability-dev at openjdk.java.net
>> Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential
>> null termination issue found by coverity scans
>>
>> Hi Christoph,
>>
>> Seeing that truncation is considered assertion worthy, should we
>> really hide it in release?
>>
>> Gruß Thomas
>>
>> On Mar 5, 2018 10:03, "Langer, Christoph" wrote:
>> Hi,
>>
>> please review a small fix that was identified by a coverity code scan.
>>
>> In case strlen(name) was the same or larger than name_length_max or
>> resp. strlen(arg) >= arg_length_max, the _name or _arg fields would
>> not get null terminated correctly.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/
>>
>> Thanks
>> Christoph
>
>
>

From chris.plummer at oracle.com Mon Mar 5 17:17:23 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 5 Mar 2018 09:17:23 -0800
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
Message-ID: <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>

Hi Thomas

Asserts imply something that is supposed to never happen, but that you
want to check for in debug builds to help uncover bugs. Given this,
either we have a bug (and someone can pass in a name that is too long),
or coverity is complaining about something that can never happen, or the
assert is invalid. So the potential fixes are:

-Fix the problem up the call chain where the invalid string can be
passed in.
-Tell coverity to clam up because having the string be too long is not
possible.
-Leave in your fix but remove the assert.

thanks,

Chris

On 3/5/18 7:37 AM, Langer, Christoph wrote:
> Hi Thomas,
>
> well, I think this discussion is beyond the scope of my contribution. Probably one doesn't want the risk of JVM crashes/exits just because someone shoots in a bad attach operation name which is too long.
>
> So, may I consider it reviewed from your end? I'm trying the submission repo right now with this change...
>
> Best regards
> Christoph
>
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Montag, 5. März 2018 15:53
> To: Langer, Christoph
> Cc: Hotspot dev runtime ; serviceability-dev at openjdk.java.net
> Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
>
> Hi Christoph,
>
> Seeing that truncation is considered assertion worthy, should we really hide it in release?
>
> Gruß Thomas
>
> On Mar 5, 2018 10:03, "Langer, Christoph" wrote:
> Hi,
>
> please review a small fix that was identified by a coverity code scan.
>
> In case strlen(name) was the same or larger than name_length_max or resp. strlen(arg) >= arg_length_max, the _name or _arg fields would not get null terminated correctly.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/
>
> Thanks
> Christoph

From serguei.spitsyn at oracle.com Mon Mar 5 17:58:53 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 5 Mar 2018 09:58:53 -0800
Subject: RFR: JDK-8193369: post_field_access does not work for some functions, possibly related to fast_getfield
In-Reply-To: <3df69fad-c0d8-5667-a61a-f88a83e26d89@oracle.com>
References: <1fca6b67-c0d1-db03-52ed-f2c6bcc29a5b@oracle.com>
 <91aadc35-125a-bf74-6cf5-672dc77ffb22@oracle.com>
 <3df69fad-c0d8-5667-a61a-f88a83e26d89@oracle.com>
Message-ID: <74eacea4-a3c0-a35d-047b-1478b7d46c87@oracle.com>

Hi Alex,

It looks good.
Thank you for the update!

Thanks,
Serguei

On 3/1/18 10:53, Alex Menkov wrote:
> Hi Serguei,
>
> Thank you for the feedback.
> Updated webrev:
> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev.01/
>
> See inline for comments for your notes.
>
> On 02/27/2018 23:08, serguei.spitsyn at oracle.com wrote:
>> Hi Alex,
>>
>> Thank you for taking care about this!
>> The fix looks good to me.
>>
>> Some comments on the test.
>>
>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/FieldAccessWatch.java.html
>>
>> There are some commented lines in the TestResult class.
>> A cleanup is needed to delete them.
>> I guess, it is already in your plan.
>
> I deleted couple lines, keeping comment for fields
>
>> The empty line #135 is not needed.
>> An empty line is needed after the L99.
>
> fixed.
>
>> Probably, the intention was to spell "startTest" insted of "initTest"
>> below:
>>
>>   119         if (!startTest(result)) {
>>   120             throw new RuntimeException("initTest failed");
>>   121         }
>
> fixed.
>
>> I wonder if this sleep is really needed:
>>      124 Thread.sleep(500);
>>
>> The "action.apply()" is executed synchronously, is not it?
>
> But notifications are asynchronous, so this helps to avoid test
> failures is some events are delivered a bit later in loaded environment.
> Also this helps to avoid mess of native and java logging
>
>> I'm thinking if moving the test() to native side would simplify things.
>
> To me it's simpler and more flexible to perform required actions in
> Java, native part only handles notifications.
>
>> An Exception can be thrown from native if the test failed or just a
>> boolean status returned.
>>
>>
>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/libFieldAccessWatch.c.html
>>
>> I'd suggest to rename currentTestResults to testResultObject,
>> so it will be in line with testResultClass.
>
> fixed.
>
>> One concern is that that the reportError() does not cause the test to
>> fail and does not break the execution.
>> Would it better to throw an exception with the same message as was
>> printed?
>
> Updated several cases (immediate return from callbacks if something
> went wrong).
> Note that reportError is called from native Java methods and from
> JVMTI callbacks, so throwing an exception doesn't looks right.
>
>> It seems, the function tagAndWatch() adds some complexity to the code.
>> Is all this really needed? Could you, please, add some comments.
>> It does not seem this functions tags anything.
>
> renamed the function, added short function description.
>
>>   168 (*jvmti)->Deallocate(jvmti, (unsigned char*)sig);
>>
>>   The sig needs to be cleared after deallocation as it is used and
>> checked in a loop.
>
> Moved the variable to the correct scope.
>
>> Missed initializations:
>>
>>    68     char *name;
>>   142         jfieldID* klassFields;
>>   143         jint fieldCount;
>
> Fixed.
>
> --alex
>
>> Thanks,
>> Serguei
>>
>>
>> On 2/26/18 14:43, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review a fix for
>>> JDK-8193369: post_field_access does not work for some functions,
>>> possibly related to fast_getfield
>>>
>>> The fix disables "fast" command generation when FieldAccess or
>>> FieldModification notifications are requested.
>>>
>>> jira: https://bugs.openjdk.java.net/browse/JDK-8193369
>>> webrev: http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/
>>>
>>> --alex
>>

From christoph.langer at sap.com Mon Mar 5 21:28:06 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 5 Mar 2018 21:28:06 +0000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To: <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
 <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
Message-ID:

Hi Chris,

> Asserts imply something that is supposed to never happen, but that you
> want to check for in debug builds to help uncover bugs. Given this,
> either we have a bug (and someone can pass in a name that is too long),
> or coverity is complaining about something that can never happen, or the
> assert is invalid. So the potential fixes are:
>
> -Fix the problem up the call chain where the invalid string can be passed in.
> -Tell coverity to clam up because having the string be too long is not
> possible.
> -Leave in your fix but remove the assert.

I believe coverity has a valid point here for the case that strlen(name) == name_length_max or strlen(arg) == arg_length_max. In that case, memcpy would copy exactly the bytes for the strings without the terminating zero. And as zero initialization of c++ members is not guaranteed (as far as I know), the name_length_max + 1 byte or arg_length_max + 1 could theoretically have nonzero values.

Furthermore, I think in this case it makes sense to have an assertion because, as you state, in the debug builds you want to see any potential bug uncovered at the cost of a JVM exit. But in an opt build you want to be rather stable, even in case you get names and arguments passed that are too long. I don't want to go into the details of potential calling paths how that can happen, though... But even in case there are length violations in attach operation names or its arguments, the operations would most likely result in no success which is uncritical to a running VM.

So wouldn't you agree that my change is fine as is?

Submission-repo testing reported no errors.

Best regards
Christoph

From david.holmes at oracle.com Mon Mar 5 22:08:24 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 6 Mar 2018 08:08:24 +1000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To: <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
 <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
Message-ID:

On 6/03/2018 3:17 AM, Chris Plummer wrote:
> Asserts imply something that is supposed to never happen, but that you
> want to check for in debug builds to help uncover bugs. Given this,
> either we have a bug (and someone can pass in a name that is too long),
> or coverity is complaining about something that can never happen, or the
> assert is invalid. So the potential fixes are:
>
> -Fix the problem up the call chain where the invalid string can be passed
> in.
> -Tell coverity to clam up because having the string be too long is not
> possible.
> -Leave in your fix but remove the assert.

I hadn't looked into the calling context for this, but a too long name
should be impossible. The allowed names come from here:

// names must be of length <= AttachOperation::name_length_max
static AttachOperationFunctionInfo funcs[] = {
  { "agentProperties",  get_agent_properties },
  { "datadump",         data_dump },
  { "dumpheap",         dump_heap },
  { "load",             load_agent },
  { "properties",       get_system_properties },
  { "threaddump",       thread_dump },
  { "inspectheap",      heap_inspection },
  { "setflag",          set_flag },
  { "printflag",        print_flag },
  { "jcmd",             jcmd },
  { NULL,               NULL }
};

and name_length_max comes from the longest defined name: agentProperties.

Further, AFAICS, set_name is only actually called on Windows. And we
again check the incoming cmd "name" to ensure it isn't too big.

So the whole copying code seems somewhat overly conservative:
- we've limited the name to below the maximum
- we have an assert just in case someone adds a new name and forgets to
  increase the maximum (there are actually asserts at multiple levels)
- but we also copy as-if the name can be longer than expected

The irony is that the current code was put in place because of coverity!

https://bugs.openjdk.java.net/browse/JDK-8140482

Not sure why it isn't just using strncpy though.

David

> thanks,
>
> Chris
>
> On 3/5/18 7:37 AM, Langer, Christoph wrote:
>> Hi Thomas,
>>
>> well, I think this discussion is beyond the scope of my contribution.
>> Probably one doesn't want the risk of JVM crashes/exits just because
>> someone shoots in a bad attach operation name which is too long.
>>
>> So, may I consider it reviewed from your end? I'm trying the
>> submission repo right now with this change...
>>
>> Best regards
>> Christoph
>>
>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>> Sent: Montag, 5. März 2018 15:53
>> To: Langer, Christoph
>> Cc: Hotspot dev runtime ;
>> serviceability-dev at openjdk.java.net
>> Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null
>> termination issue found by coverity scans
>>
>> Hi Christoph,
>>
>> Seeing that truncation is considered assertion worthy, should we
>> really hide it in release?
>>
>> Gruß Thomas
>>
>> On Mar 5, 2018 10:03, "Langer, Christoph" wrote:
>> Hi,
>>
>> please review a small fix that was identified by a coverity code scan.
>>
>> In case strlen(name) was the same or larger than name_length_max or
>> resp. strlen(arg) >= arg_length_max, the _name or _arg fields would
>> not get null terminated correctly.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199010
>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199010.0/
>>
>> Thanks
>> Christoph
>
>
>

From chris.plummer at oracle.com Mon Mar 5 22:37:42 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 5 Mar 2018 14:37:42 -0800
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To:
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com>
 <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
Message-ID:

On 3/5/18 1:28 PM, Langer, Christoph wrote:
> Hi Chris,
>
>> Asserts imply something that is supposed to never happen, but that you
>> want to check for in debug builds to help uncover bugs. Given this,
>> either we have a bug (and someone can pass in a name that is too long),
>> or coverity is complaining about something that can never happen, or the
>> assert is invalid. So the potential fixes are:
>>
>> -Fix the problem up the call chain where the invalid string can be passed in.
>> -Tell coverity to clam up because having the string be too long is not
>> possible.
>> -Leave in your fix but remove the assert.
> I believe coverity has a valid point here for the case that strlen(name) == name_length_max or strlen(arg) == arg_length_max. In that case, memcpy would copy exactly the bytes for the strings without the terminating zero. And as zero initialization of c++ members is not guaranteed (as far as I know), the name_length_max + 1 byte or arg_length_max + 1 could theoretically have nonzero values.
>
> Furthermore, I think in this case it makes sense to have an assertion because, as you state, in the debug builds you want to see any potential bug uncovered at the cost of a JVM exit. But in an opt build you want to be rather stable, even in case you get names and arguments passed that are too long. I don't want to go into the details of potential calling paths how that can happen, though... But even in case there are length violations in attach operation names or its arguments, the operations would most likely result in no success which is uncritical to a running VM.
>
> So wouldn't you agree that my change is fine as is?
>
> Submission-repo testing reported no errors.
>
> Best regards
> Christoph
>
Hi Christoph,

We don't assert things that we also explicitly defend against. For
example, the following defensive coding should not be accepted:

    // x is never supposed to be less than 0, but just in case it is, set it to 0
    assert(x >= 0);
    if (x < 0) x = 0;

Either it can't be less than zero and we assert this, or we accept the
possibility that it could be less than zero and defend against it, but
no assert in that case.

Given the following initial code (and ignoring what coverity might have
to say about it):

 127     memcpy(_name, name, MIN2(len + 1, (size_t)name_length_max));

I would argue that the bug here is that it should be:

 127     memcpy(_name, name, MIN2(len, (size_t)name_length_max) + 1);

Maybe this would silence coverity, but it also is the correct thing to
do. We weren't null terminating if len == name_length_max because we
were only copying len bytes in that case, not len+1. But, this still has
the issue of defending against something we also assert for, so the
reality is the MIN2 part should not be needed at all. Does coverity
complain if you get rid of it?

 127     memcpy(_name, name, len + 1);

thanks,

Chris

From david at acz.org Mon Mar 5 22:48:58 2018
From: david at acz.org (David Phillips)
Date: Mon, 5 Mar 2018 14:48:58 -0800
Subject: Capturing thread dumps without safepoints
Message-ID:

We often need to take thread dumps, using jstack or similar tools, and
would like to do so without the overhead of safepoints. This is
particularly important when the system is under heavy load, as we've seen
safepoints take 30+ seconds.
Administrators often have the idea to automatically capture a thread dump when the system becomes unhealthy, but this can make the problem worse.

I had the idea to use AsyncGetCallTrace() to implement an asynchronous version of jstack. It works as follows:

* Install a signal handler that captures the stack to a global memory location.
* Register a ThreadStart event which captures the JNI environment and pthread_t.
* Start an agent thread which accepts connections on a socket.
* When a client connects, the agent sends a signal to each thread using pthread_kill() and writes out the stack trace after the signal handler completes.

I would love to have feedback on the code: https://github.com/airlift/astack

Is this a reasonable approach? Is something like this already available in the JVM?

From david.holmes at oracle.com Mon Mar 5 23:31:40 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 6 Mar 2018 09:31:40 +1000
Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans
In-Reply-To: 
References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com>
Message-ID: <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com>

On 6/03/2018 8:37 AM, Chris Plummer wrote:
> On 3/5/18 1:28 PM, Langer, Christoph wrote:
>> Hi Chris,
>>
>>> Asserts imply something that is suppose to never happen, but that you
>>> want to check for in debug builds to help uncover bugs. Given this,
>>> either we have a bug (and someone can pass in a name that is too long),
>>> or coverity is complaining about something that can never happen, or the
>>> assert is invalid. So the potential fixes are:
>>>
>>> -Fix the problem up the call chain were the invalid string can be
>>> passed in.
>>> -Tell coverity to clam up because having the string be too long is not
>>> possible.
>>> -Leave in your fix but remove the assert.
>> I believe coverity has a valid point here for the case that >> strlen(name) == name_length_max or strlen(arg) == arg_length_max. In >> that case, memcpy would copy exactly the bytes for the strings without >> the terminating zero. And as zero initialization of c++ members is not >> guaranteed (as far as I know), the name_length_max + 1 byte or >> arg_legth_max + 1 could theoretically have nonzero values. >> >> Furthermore, I think in this case it makes sense to have an assertion >> because, as you state, in the debug builds you want to see any >> potential bug uncovered at the cost of a JVM exit. But in an opt build >> you want to be rather stable, even in case you get names and arguments >> passed that are too long. I don't want to go into the details of >> potential calling paths how that can happen, though... But even in >> case there are length violations in attach operation names or its >> arguments, the operations would most likely result in no success which >> is uncritical to a running VM. >> >> So wouldn't you agree that my change is fine as is? >> >> Submission-repo testing reported no errors. >> >> Best regards >> Christoph >> > Hi Christoph, > > We don't assert things that we also explicitlydefend against.For > example, the following defensive coding should not be accepted: > > ??? // x is never suppose to be less then 0, but just in case it is, > set it to 0 > ??? assert(x >= 0); > ??? if (x < 0) x = 0; > > Either it can't be less then zero and we assert this, or we accept the > possibility that it could be less then zero and defend against it, but > no assert in that case. > > Given the following initial code (and ignoring what coverity might have > to say about it): > > ?127???? memcpy(_name, name, MIN2(len + 1, (size_t)name_length_max)); > > I would argue that the bug here is that it should be: > > ?127???? memcpy(_name, name, MIN2(len, (size_t)name_length_max) + 1); That fails to copy the '\0' - you need to keep "len + 1". 
Though again why not just use strncpy. David ----- > Maybe this would silence coverity, but it also is the correct thing to > do. We weren't null terminating if len == name_length_max because we > were only copying len bytes in that case, not len+1. But, this still has > the issue of defending against something we also assert for, so the > reality is the MIN2 part should be be needed at all. Does coverity > complain if you get rid of it? > > ?127???? memcpy(_name, name, len + 1); > > thanks, > > Chris > From chris.plummer at oracle.com Mon Mar 5 23:53:21 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 5 Mar 2018 15:53:21 -0800 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> Message-ID: <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> On 3/5/18 3:31 PM, David Holmes wrote: > On 6/03/2018 8:37 AM, Chris Plummer wrote: >> On 3/5/18 1:28 PM, Langer, Christoph wrote: >>> Hi Chris, >>> >>>> Asserts imply something that is suppose to never happen, but that you >>>> want to check for in debug builds to help uncover bugs. Given this, >>>> either we have a bug (and someone can pass in a name that is too >>>> long), >>>> or coverity is complaining about something that can never happen, >>>> or the >>>> assert is invalid. So the potential fixes are: >>>> >>>> -Fix the problem up the call chain were the invalid string can be >>>> passed in. >>>> -Tell coverity to clam up because having the string be too long is not >>>> possible. >>>> -Leave in your fix but remove the assert. >>> I believe coverity has a valid point here for the case that >>> strlen(name) == name_length_max or strlen(arg) == arg_length_max. 
In >>> that case, memcpy would copy exactly the bytes for the strings >>> without the terminating zero. And as zero initialization of c++ >>> members is not guaranteed (as far as I know), the name_length_max + >>> 1 byte or arg_legth_max + 1 could theoretically have nonzero values. >>> >>> Furthermore, I think in this case it makes sense to have an >>> assertion because, as you state, in the debug builds you want to see >>> any potential bug uncovered at the cost of a JVM exit. But in an opt >>> build you want to be rather stable, even in case you get names and >>> arguments passed that are too long. I don't want to go into the >>> details of potential calling paths how that can happen, though... >>> But even in case there are length violations in attach operation >>> names or its arguments, the operations would most likely result in >>> no success which is uncritical to a running VM. >>> >>> So wouldn't you agree that my change is fine as is? >>> >>> Submission-repo testing reported no errors. >>> >>> Best regards >>> Christoph >>> >> Hi Christoph, >> >> We don't assert things that we also explicitlydefend against.For >> example, the following defensive coding should not be accepted: >> >> ???? // x is never suppose to be less then 0, but just in case it is, >> set it to 0 >> ???? assert(x >= 0); >> ???? if (x < 0) x = 0; >> >> Either it can't be less then zero and we assert this, or we accept >> the possibility that it could be less then zero and defend against >> it, but no assert in that case. >> >> Given the following initial code (and ignoring what coverity might >> have to say about it): >> >> ??127???? memcpy(_name, name, MIN2(len + 1, (size_t)name_length_max)); >> >> I would argue that the bug here is that it should be: >> >> ??127???? memcpy(_name, name, MIN2(len, (size_t)name_length_max) + 1); > > That fails to copy the '\0' - you need to keep "len + 1". Though again > why not just use strncpy. > Look closer. 1 is added to the MIN2 result. 
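To make the arithmetic concrete, here is a small sketch of the two length expressions. It is not HotSpot code: `std::min` stands in for `MIN2`, the constant mirrors `name_length_max`, and the helper names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for AttachOperation::name_length_max.
static const std::size_t name_length_max = 16;

// Byte count passed to memcpy by the original expression:
//   MIN2(len + 1, (size_t)name_length_max)
static std::size_t bytes_copied_original(std::size_t len) {
  return std::min(len + 1, name_length_max);
}

// Byte count passed to memcpy by the proposed expression:
//   MIN2(len, (size_t)name_length_max) + 1
static std::size_t bytes_copied_proposed(std::size_t len) {
  return std::min(len, name_length_max) + 1;
}

// The source's terminating '\0' is part of the copy only when
// exactly len + 1 bytes are copied.
static bool copies_terminator(std::size_t bytes, std::size_t len) {
  return bytes == len + 1;
}
```

For len == name_length_max the original expression yields 16 bytes (no terminator) and the proposed one yields 17 (terminator included). For a longer input the proposed form copies 17 bytes, none of which is the terminator, which is why the assert or an explicit terminating write still matters.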
I agree with using strncpy. Chris > David > ----- > >> Maybe this would silence coverity, but it also is the correct thing >> to do. We weren't null terminating if len == name_length_max because >> we were only copying len bytes in that case, not len+1. But, this >> still has the issue of defending against something we also assert >> for, so the reality is the MIN2 part should be be needed at all. Does >> coverity complain if you get rid of it? >> >> ??127???? memcpy(_name, name, len + 1); >> >> thanks, >> >> Chris >> From christoph.langer at sap.com Tue Mar 6 08:50:12 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 6 Mar 2018 08:50:12 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> Message-ID: Hi Chris, David and Thomas, I took a closer look now, too. Funny that the original change was contributed by my colleagues because of coverity and that they didn't do it completely right. ?? As a code comment in our attachListener.hpp suggests, the '0' termination to please coverity was added far earlier than JDK-8140482 was done. So, yes, in fact the input to the "set_name" and "set_arg" methods should never exceed the maximum length values as per the current code in the OpenJDK. These methods are called from the various platform specific attachListener_.cpp files. And in each of these places the length is already checked and violations get handled. So with the assertion we merely guard against new code that doesn't do checking which can potentially come in. So one can argue that the assertions are enough here and we can just do strcpy. 
In that case I would even support Thomas' suggestion to change the assertion into a guarantee as the input coming in from new code is not necessarily static but can be user input (who knows). And we should also turn the knob here to quiesce coverity since it is obviously not considering the possible call paths and the checks in them. But on the other hand, one could be as conservative as it is now - I guess it doesn't bear too much of cost and this place of code is not performance critical. That means do the assertion in dbg builds and for opt effectively do a checked, truncating copy of the input data but avoiding JVM crashes or other errors due to unterminated strings. I personally tend to do the second - but fine if I get overruled. But, if we do the second, I'm still for memcpy as strncpy would do zero padding of the buffer which is not necessary and we have to write a terminating 0 as well to handle the case that inputlength > name_len_max (the case which should not happen but we want to protect against). That would mean my change stays as it is. What shall I do now? Best regards Christoph From david.holmes at oracle.com Tue Mar 6 12:26:01 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 6 Mar 2018 22:26:01 +1000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> Message-ID: On 6/03/2018 6:50 PM, Langer, Christoph wrote: > Hi Chris, David and Thomas, > > I took a closer look now, too. Funny that the original change was contributed by my colleagues because of coverity and that they didn't do it completely right. ?? As a code comment in our attachListener.hpp suggests, the '0' termination to please coverity was added far earlier than JDK-8140482 was done. 
> > So, yes, in fact the input to the "set_name" and "set_arg" methods should never exceed the maximum length values as per the current code in the OpenJDK. These methods are called from the various platform specific attachListener_.cpp files. And in each of these places the length is already checked and violations get handled. So with the assertion we merely guard against new code that doesn't do checking which can potentially come in. > > So one can argue that the assertions are enough here and we can just do strcpy. In that case I would even support Thomas' suggestion to change the assertion into a guarantee as the input coming in from new code is not necessarily static but can be user input (who knows). And we should also turn the knob here to quiesce coverity since it is obviously not considering the possible call paths and the checks in them. > > But on the other hand, one could be as conservative as it is now - I guess it doesn't bear too much of cost and this place of code is not performance critical. That means do the assertion in dbg builds and for opt effectively do a checked, truncating copy of the input data but avoiding JVM crashes or other errors due to unterminated strings. > > I personally tend to do the second - but fine if I get overruled. > > But, if we do the second, I'm still for memcpy as strncpy would do zero padding of the buffer which is not necessary and we have to write a terminating 0 as well to handle the case that inputlength > name_len_max (the case which should not happen but we want to protect against). That would mean my change stays as it is. I don't know why strncpy would do zero padding? 
Personally I view this code as follows: - it is guaranteed that name length can not exceed the expected maximum due to existing checks - the assert guards against new code that might add an unchecked path or a new command name that is longer than current max and doesn't update the max With that in mind then a simple strncpy of len+1 fully suffices. However that doesn't address the coverity issue (and possibly other checking tools). And given this code was already appeasing coverity, I vote for just accepting Christoph's patch. Thanks, David > > What shall I do now? > > Best regards > Christoph > From christoph.langer at sap.com Tue Mar 6 13:47:31 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 6 Mar 2018 13:47:31 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> Message-ID: <43d8dfa3129c47f48452e64af423008e@sap.com> Thanks, David. A colleague just told me that a guarantee would also quiesce Coverity. So that could really be an option then. Let's wait for Chris' opinion... > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Dienstag, 6. M?rz 2018 13:26 > To: Langer, Christoph ; Chris Plummer > ; Thomas St?fe > Cc: serviceability-dev at openjdk.java.net; Hotspot dev runtime runtime-dev at openjdk.java.net> > Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null > termination issue found by coverity scans > > On 6/03/2018 6:50 PM, Langer, Christoph wrote: > > Hi Chris, David and Thomas, > > > > I took a closer look now, too. Funny that the original change was > contributed by my colleagues because of coverity and that they didn't do it > completely right. ?? 
As a code comment in our attachListener.hpp suggests, > the '0' termination to please coverity was added far earlier than JDK-8140482 > was done. > > > > So, yes, in fact the input to the "set_name" and "set_arg" methods should > never exceed the maximum length values as per the current code in the > OpenJDK. These methods are called from the various platform specific > attachListener_.cpp files. And in each of these places the length is > already checked and violations get handled. So with the assertion we merely > guard against new code that doesn't do checking which can potentially come > in. > > > > So one can argue that the assertions are enough here and we can just do > strcpy. In that case I would even support Thomas' suggestion to change the > assertion into a guarantee as the input coming in from new code is not > necessarily static but can be user input (who knows). And we should also > turn the knob here to quiesce coverity since it is obviously not considering > the possible call paths and the checks in them. > > > > But on the other hand, one could be as conservative as it is now - I guess it > doesn't bear too much of cost and this place of code is not performance > critical. That means do the assertion in dbg builds and for opt effectively do a > checked, truncating copy of the input data but avoiding JVM crashes or other > errors due to unterminated strings. > > > > I personally tend to do the second - but fine if I get overruled. > > > > But, if we do the second, I'm still for memcpy as strncpy would do zero > padding of the buffer which is not necessary and we have to write a > terminating 0 as well to handle the case that inputlength > name_len_max > (the case which should not happen but we want to protect against). That > would mean my change stays as it is. > > I don't know why strncpy would do zero padding? 
> > Personally I view this code as follows: > > - it is guaranteed that name length can not exceed the expected maximum > due to existing checks > - the assert guards against new code that might add an unchecked path or > a new command name that is longer than current max and doesn't update > the max > > With that in mind then a simple strncpy of len+1 fully suffices. > > However that doesn't address the coverity issue (and possibly other > checking tools). And given this code was already appeasing coverity, I > vote for just accepting Christoph's patch. > > Thanks, > David > > > > What shall I do now? > > > > Best regards > > Christoph > > From chris.plummer at oracle.com Tue Mar 6 16:53:25 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 6 Mar 2018 08:53:25 -0800 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> Message-ID: <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> On 3/6/18 4:26 AM, David Holmes wrote: > On 6/03/2018 6:50 PM, Langer, Christoph wrote: >> Hi Chris, David and Thomas, >> >> I took a closer look now, too. Funny that the original change was >> contributed by my colleagues because of coverity and that they didn't >> do it completely right. ?? As a code comment in our >> attachListener.hpp suggests, the '0' termination to please coverity >> was added far earlier than JDK-8140482 was done. >> >> So, yes, in fact the input to the "set_name" and "set_arg" methods >> should never exceed the maximum length values as per the current code >> in the OpenJDK. These methods are called from the various platform >> specific attachListener_.cpp files. And in each of these places >> the length is already checked and violations get handled. 
So with the >> assertion we merely guard against new code that doesn't do checking >> which can potentially come in. >> >> So one can argue that the assertions are enough here and we can just >> do strcpy. In that case I would even support Thomas' suggestion to >> change the assertion into a guarantee as the input coming in from new >> code is not necessarily static but can be user input (who knows). And >> we should also turn the knob here to quiesce coverity since it is >> obviously not considering the possible call paths and the checks in >> them. >> >> But on the other hand, one could be as conservative as it is now - I >> guess it doesn't bear too much of cost and this place of code is not >> performance critical. That means do the assertion in dbg builds and >> for opt effectively do a checked, truncating copy of the input data >> but avoiding JVM crashes or other errors due to unterminated strings. >> >> I personally tend to do the second - but fine if I get overruled. >> >> But, if we do the second, I'm still for memcpy as strncpy would do >> zero padding of the buffer which is not necessary and we have to >> write a terminating 0 as well to handle the case that inputlength > >> name_len_max (the case which should not happen but we want to protect >> against). That would mean my change stays as it is. > > I don't know why strncpy would do zero padding? From the man page: ???? The stpncpy() and strncpy() functions copy at most len characters from src into dst.? If src is less than len ???? characters long, the remainder of dst is filled with `\0' characters.? Otherwise, dst is not terminated. > > Personally I view this code as follows: > > - it is guaranteed that name length can not exceed the expected > maximum due to existing checks > - the assert guards against new code that might add an unchecked path > or a new command name that is longer than current max and doesn't > update the max > > With that in mind then a simple strncpy of len+1 fully suffices. 
>
> However that doesn't address the coverity issue (and possibly other
> checking tools). And given this code was already appeasing coverity, I
> vote for just accepting Christoph's patch.

Why don't we do a restart and look at the original problem whose fix introduced the current issue. Here's the original code:

    assert(strlen(name) <= name_length_max, "exceeds maximum name length");
    strcpy(_name, name);

And here's how set_name is used:

class AttachOperation: public CHeapObj {
  enum {
    name_length_max = 16,       // maximum length of name
  };
  char _name[name_length_max+1];
}

  if (strlen(cmd) > AttachOperation::name_length_max) return ATTACH_ERROR_ILLEGALARG;
  op->set_name(cmd);

I don't see why coverity would complain about this. It can statically see that there will be no buffer overflow. Does it think there are other potential callers of set_name() for which no size check is made? I can't find any. The only clue from JDK-8140482 is:

attachListener.hpp:
Do strncpy to not overflow buffer. Don't write more chars than before.

David, you had pointed out that the strcpy complaint seemed erroneous during the review for JDK-8140482. The final word from Goetz was:

> I agree that I can not find another possible issue
> with the strcpy.
> Still I think it's better to have the strncpy, as it would have
> protected against the bug in attachListener_windows.cpp.
> But if you insist I'll just remove it.

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2015-November/016363.html

Dmitry Samersoff later added:

> It might be better to calculate strlen(name) once, then use memcpy.

So we started with what appears to be an invalid and unexplained complaint by coverity about strcpy, changed it to strncpy to appease coverity, changed it to memcpy to avoid two calls to strlen, and now want to fix a different complaint by coverity on that solution.

I say go back to strcpy and see if/why coverity is still complaining.
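
For reference, the three copy strategies from this history can be put side by side. This is a rough sketch, not the actual attachListener.hpp code: `NameHolder` and its method names are hypothetical, the standard `assert` replaces the HotSpot macro, and the memcpy variant is only one plausible shape of a truncating copy, not necessarily what the patch under review does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical miniature of AttachOperation; only the name buffer is modelled.
struct NameHolder {
  enum { name_length_max = 16 };
  char _name[name_length_max + 1];

  // Variant 1: plain strcpy. Safe only because every caller is expected
  // to have checked the length already; the assert documents that contract.
  void set_name_strcpy(const char* name) {
    assert(std::strlen(name) <= (std::size_t)name_length_max);
    std::strcpy(_name, name);
  }

  // Variant 2: strncpy bounded by the maximum length. strncpy does not
  // terminate the destination when the source is at least as long as the
  // bound, so the terminator is written explicitly.
  void set_name_strncpy(const char* name) {
    std::strncpy(_name, name, name_length_max);
    _name[name_length_max] = '\0';
  }

  // Variant 3: a single strlen call, then a truncating memcpy with an
  // explicit terminator.
  void set_name_memcpy(const char* name) {
    std::size_t len = std::strlen(name);
    if (len > (std::size_t)name_length_max) {
      len = name_length_max;
    }
    std::memcpy(_name, name, len);
    _name[len] = '\0';
  }
};
```

Variants 2 and 3 silently truncate an over-long name, which is exactly the "defend against what we also assert" pattern questioned above; variant 1 leaves the length check to the callers.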
thanks, Chris > > Thanks, > David >> >> What shall I do now? >> >> Best regards >> Christoph >> From christoph.langer at sap.com Tue Mar 6 19:03:49 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 6 Mar 2018 19:03:49 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> Message-ID: <233c0e46d1474a2b8017b554fa53a75b@sap.com> Hi Chris, > > I don't know why strncpy would do zero padding? > From the man page: > > ???? The stpncpy() and strncpy() functions copy at most len characters > from src into dst.? If src is less than len > ???? characters long, the remainder of dst is filled with `\0' > characters.? Otherwise, dst is not terminated. Ok, yes, it depends on the length that you specify. I meant if you always specify the full buffer length (name_length_max+1), it would pad. Never mind. > Why don't we do a restart and look at the original problem whose fix > introduced the current issue. Here's the original code: > > ??? assert(strlen(name) <= name_length_max, "exceeds maximum name > length"); > ??? strcpy(_name, name); > > And here's how set_name is used: > > class AttachOperation: public CHeapObj { > ? enum { > ??? name_length_max = 16,?????? // maximum length of? name > ? }; > ? char _name[name_length_max+1]; > } > > ? if (strlen(cmd) > AttachOperation::name_length_max) return > ATTACH_ERROR_ILLEGALARG; > ? op->set_name(cmd); > > I don't see why coverity would complain about this. It can statically > see that there will be no buffer overflow. Does it think there are other > potential callers of set_name() for which no size check is made? I can't > find any. 
The only clue from JDK-8140482 is: > > attachListener.hpp: > Do strncpy to not overflow buffer. Don't write more chars than before. > > David, you had pointed that the strcpy complained seemed erroneous > during the review for JDK-8140482. The final word from Goetz was: > > > I agree that I can not find another possible issue > > with the strcpy. > > Still I think it's better to have the strncpy, as it would have > > protected against the bug in attachListener_windows.cpp. > > But if you insist I'll just remove it. > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2015- > November/016363.html > > Dmitry Samersoff later added: > > It might be better to calculate strlen(name) once, than use memcpy. > So we started with what appears to be an invalid and unexplained > complaint by coverity about strcpy, changed it to strncpy to appease > coverity, changed it to memcpy to avoid two calls to strlen, and now > want to fix a different complaint by coverity on that solution. > > I say go back to strcpy and see if/why coverity is still complaining. Ok, agreed. I'll put the original code in place and see what coverity exactly says. Maybe it wants to see a guarantee instead of an assert. But let's see. Best regards Christoph From thomas.stuefe at gmail.com Tue Mar 6 19:27:55 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 6 Mar 2018 20:27:55 +0100 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <233c0e46d1474a2b8017b554fa53a75b@sap.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <233c0e46d1474a2b8017b554fa53a75b@sap.com> Message-ID: Maybe stupid question, any reason we could not just strdup() those strings? 
And add an ~AttachOperation dtor to clean them up again? Otherwise, I usually prefer using snprintf (or, jio_snprintf or the new os::snprintf()) with "%s" as format string: jio_snprintf(_name, sizeof(_name), "%s", inputstring). This takes care cleanly of zero termination and truncation and one does not have to think too hard about MIN2 expressions. One also saves the strlen() call. strncpy is almost completely useless, usually, because it does not zero terminate cleanly and the \0 padding is just weird. Best Regards, Thomas On Tue, Mar 6, 2018 at 8:03 PM, Langer, Christoph wrote: > Hi Chris, > > > > I don't know why strncpy would do zero padding? > > From the man page: > > > > The stpncpy() and strncpy() functions copy at most len characters > > from src into dst. If src is less than len > > characters long, the remainder of dst is filled with `\0' > > characters. Otherwise, dst is not terminated. > > Ok, yes, it depends on the length that you specify. I meant if you always > specify the full buffer length (name_length_max+1), it would pad. Never > mind. > > > Why don't we do a restart and look at the original problem whose fix > > introduced the current issue. Here's the original code: > > > > assert(strlen(name) <= name_length_max, "exceeds maximum name > > length"); > > strcpy(_name, name); > > > > And here's how set_name is used: > > > > class AttachOperation: public CHeapObj { > > enum { > > name_length_max = 16, // maximum length of name > > }; > > char _name[name_length_max+1]; > > } > > > > if (strlen(cmd) > AttachOperation::name_length_max) return > > ATTACH_ERROR_ILLEGALARG; > > op->set_name(cmd); > > > > I don't see why coverity would complain about this. It can statically > > see that there will be no buffer overflow. Does it think there are other > > potential callers of set_name() for which no size check is made? I can't > > find any. The only clue from JDK-8140482 is: > > > > attachListener.hpp: > > Do strncpy to not overflow buffer. 
Don't write more chars than before. > > > > David, you had pointed that the strcpy complained seemed erroneous > > during the review for JDK-8140482. The final word from Goetz was: > > > > > I agree that I can not find another possible issue > > > with the strcpy. > > > Still I think it's better to have the strncpy, as it would have > > > protected against the bug in attachListener_windows.cpp. > > > But if you insist I'll just remove it. > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2015- > > November/016363.html > > > > Dmitry Samersoff later added: > > > It might be better to calculate strlen(name) once, than use memcpy. > > So we started with what appears to be an invalid and unexplained > > complaint by coverity about strcpy, changed it to strncpy to appease > > coverity, changed it to memcpy to avoid two calls to strlen, and now > > want to fix a different complaint by coverity on that solution. > > > > I say go back to strcpy and see if/why coverity is still complaining. > > Ok, agreed. I'll put the original code in place and see what coverity > exactly says. Maybe it wants to see a guarantee instead of an assert. But > let's see. > > Best regards > Christoph > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Wed Mar 7 12:18:41 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 7 Mar 2018 21:18:41 +0900 Subject: PING: RFR: JDK-8153333: [REDO] STW phases at Concurrent GC should count in PerfCounter In-Reply-To: <9183dfd3-522f-fd73-a681-5ea957f1d717@gmail.com> References: <2f4e0901-1602-6276-e7fd-84e168e7b317@gmail.com> <3210723e-c5f9-4277-a97a-be61e10e6b3d@oracle.com> <89a1a161-874d-4743-e52b-7ab202fd8976@gmail.com> <9799a2ad-6727-cae9-312b-fd9de13c8203@oracle.com> <9183dfd3-522f-fd73-a681-5ea957f1d717@gmail.com> Message-ID: PING: Could you review it? 
http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.08/ JBS: https://bugs.openjdk.java.net/browse/JDK-8153333 CSR: https://bugs.openjdk.java.net/browse/JDK-8196862 This change has passed Mach5 on submit repo. Also it has passed hotspot/jtreg/:hotspot_serviceability and jdk/:jdk_tools jtreg tests. We need one more reviewer. Thanks, Yasumasa On 2018/02/21 21:14, Yasumasa Suenaga wrote: > PING: Could you review it? > >> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.07/ > > JBS: https://bugs.openjdk.java.net/browse/JDK-8153333 > CSR: https://bugs.openjdk.java.net/browse/JDK-8196862 > > > Yasumasa > > > On 2018/02/15 10:23, Yasumasa Suenaga wrote: >> Hi all, >> >> CSR for this issue [1] has been approved. >> This webrev has been reviewed by Stefan, but we need one more >> reviewer. Could you review it? >> >> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.07/ >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8196862 >> >> >> >> 2018-02-06 22:33 GMT+09:00 Yasumasa Suenaga : >>> Hi Stefan, >>> >>>> This looks good to me, will do some more testing while waiting for a >>>> second reviewer and I can sponsor the change once it's ready to go. >>> >>> >>> Thanks! I'm waiting for second reviewer. >>> >>>>> What should I do to get CSR approve? >>>> >>>> In the bug system under "More" you can choose "Create CSR" which is the >>>> first step. More information can be found on the wiki: >>>> https://wiki.openjdk.java.net/display/csr/CSR+FAQs >>> >>> >>> I filed new CSR: >>> ?? https://bugs.openjdk.java.net/browse/JDK-8196862 >>> >>> >>> Yasumasa >>> >>> >>> >>> On 2018/02/06 21:55, Stefan Johansson wrote: >>>> >>>> >>>> >>>> On 2018-02-06 06:10, Yasumasa Suenaga wrote: >>>>> >>>>> Hi Stefan, >>>>> >>>>>> I agree, for G1 this should not be controlled. Maybe I was a bit >>>>>> unclear, I >>>>>> was wondering why we want to control it for CMS. >>>>> >>>>> I said to remove -XX:EnableConcGCPerfCounter in two years ago. 
I've >>>>> missed it :-) >>>>> >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2016-March/017125.html >>>>> >>>>> So I uploaded new webrev. This change includes copyright year updates. >>>>> >>>>> ??? http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.06/ >>>> >>>> Thanks Yasumasa, >>>> >>>> This looks good to me, will do some more testing while waiting for a >>>> second reviewer and I can sponsor the change once it's ready to go. >>>>> >>>>> >>>>> This change passes all tests on submit repo, and >>>>> :hotspot_serviceability :jdk_tools tests on my laptop. >>>>> >>>>> >>>>> http://java.se.oracle.com:10065/mdash/jobs/mach5-one-ysuenaga-JDK-8153333-20180206-0222-10428 >>>>> >>>>> >>>>>> If we do the change for CMS, we should >>>>>> probably also do a CSR, but that should be fairly straight forward. >>>>> >>>>> What should I do to get CSR approve? >>>> >>>> In the bug system under "More" you can choose "Create CSR" which is the >>>> first step. More information can be found on the wiki: >>>> https://wiki.openjdk.java.net/display/csr/CSR+FAQs >>>> >>>> Cheers, >>>> Stefan >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> 2018-02-06 0:33 GMT+09:00 Stefan Johansson : >>>>>> >>>>>> >>>>>> On 2018-02-03 06:40, Yasumasa Suenaga wrote: >>>>>>> >>>>>>> On 2018/02/02 23:38, Stefan Johansson wrote: >>>>>>>> >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> The changes doesn't apply clean on the latest jdk/hs, can you provide >>>>>>>> an >>>>>>>> updated webrev? >>>>>>> >>>>>>> >>>>>>> I uploaded webrev for jdk-hs: >>>>>>> ??? cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.05/ >>>>>>> >>>>>> Thanks, I've kicked off a testing job now to verify nothing unexpected >>>>>> fails. >>>>>>> >>>>>>> >>>>>>>> The testing done by the submit repo doesn't cover the tests you have >>>>>>>> update so I plan to take the change for a spin and make sure the >>>>>>>> correct >>>>>>>> tests are run and verified in Mach 5. 
>>>>>>> >>>>>>> >>>>>>> I've also tested hotspot/jtreg/:hotspot_serviceability and >>>>>>> jdk/:jdk_tools >>>>>>> on my laptop. >>>>>>> I did not see any errors / failures which are related to this change. >>>>>> >>>>>> I also ran some local tests on this and it looks good. >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Also a question about the change. Why do we need a special flag for >>>>>>>> CMS? >>>>>>>> I see that the original bug report refers to the flag as being a way >>>>>>>> to turn >>>>>>>> on and off the feature but the current implementation only consider >>>>>>>> the flag >>>>>>>> for CMS. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2016-March/016774.html >>>>>>> >>>>>>> Originally, STW phases (Remark and Cleanup) at G1 are not counted in >>>>>>> jstat >>>>>>> FGC column. >>>>>>> So I think we need not to control the behavior of PerfCounter for G1. >>>>>>> >>>>>> I agree, for G1 this should not be controlled. Maybe I was a bit >>>>>> unclear, I >>>>>> was wondering why we want to control it for CMS. I think either we >>>>>> should >>>>>> change the behavior without guarding it by a flag or just skip updating >>>>>> CMS >>>>>> (and leave the pauses in FGC). If we do the change for CMS, we should >>>>>> probably also do a CSR, but that should be fairly straight forward. >>>>>> >>>>>> I also found the old review thread where Jon M had the same comment >>>>>> (removing the flag) and it looks like all agreed on that: >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2016-March/017118.html >>>>>> >>>>>> Thanks, >>>>>> Stefan >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Stefan >>>>>>>> >>>>>>>> On 2018-02-01 14:58, Yasumasa Suenaga wrote: >>>>>>>>> >>>>>>>>> PING: Could you review and sponsor it? 
>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.04/ >>>>>>>>> >>>>>>>>> >>>>>>>>> This change has been passed Mach 5 via submit repo: >>>>>>>>> >>>>>>>>> >>>>>>>>> http://java.se.oracle.com:10065/mdash/jobs/mach5-one-ysuenaga-JDK-8153333-20180201-0805-10101 >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2017/11/01 22:02, Yasumasa Suenaga wrote: >>>>>>>>>> >>>>>>>>>> PING: Could you review and sponsor it? >>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.04/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Also I need JPRT results of this change. >>>>>>>>>> Could you cooperate? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2017/09/27 0:08, Yasumasa Suenaga wrote: >>>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I uploaded new webrev to be adapted to jdk10/hs: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.04/ >>>>>>>>>>> >>>>>>>>>>> I want to check this patch via JPRT, but I cannot access it. >>>>>>>>>>> Could you cooperate? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2017/09/21 7:46, Yasumasa Suenaga wrote: >>>>>>>>>>>> >>>>>>>>>>>> PING: >>>>>>>>>>>> >>>>>>>>>>>> Have you checked this issue? >>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/hotspot/ >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/jdk/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2017/07/01 23:44, Yasumasa Suenaga wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> PING: >>>>>>>>>>>>> >>>>>>>>>>>>> Have you checked this issue? 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2017/06/14 13:22, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I changed PerfCounter to show CGC STW phase in jstat in >>>>>>>>>>>>>> JDK-8151674. >>>>>>>>>>>>>> However, it occurred several jtreg test failure, so it was >>>>>>>>>>>>>> back-outed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I want to resume to work for this issue. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/hotspot/ >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.03/jdk/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> These changes are work fine on jtreg test as below: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ???? hotspot/test/serviceability/tmtools/jstat >>>>>>>>>>>>>> ???? jdk/test/sun/tools >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Since JDK 9, default GC algorithm is set to G1. >>>>>>>>>>>>>> So I think this change is useful to watch GC behavior through >>>>>>>>>>>>>> jstat. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I cannot access JPRT. Could you help? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>> >>> From amit.sapre at oracle.com Thu Mar 8 07:27:49 2018 From: amit.sapre at oracle.com (Amit Sapre) Date: Wed, 7 Mar 2018 23:27:49 -0800 (PST) Subject: RFR : JDK-8071367 - JMX: Remove SNMP support Message-ID: Hello, Please review the changes for removing SNMP support. Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 Webrev : http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.00 Thanks, Amit -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alan.Bateman at oracle.com Thu Mar 8 07:46:14 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 8 Mar 2018 07:46:14 +0000 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: Message-ID: On 08/03/2018 07:27, Amit Sapre wrote: > > Hello, > > Please review the changes for removing SNMP support. > > Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 > > Webrev : > http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.00 > > cc'ing compiler-dev for help on javac/resources/legacy.properties. I'm not 100% sure if it is used when compiling to old releases or not. As you are re-wording the class description for jdk.internal.agent.Agent then we might as well get it right. The Agent class is loaded and its static no-arg startAgent method is invoked when a system property starting with "com.sun.management" is specified on the command line. We could expand this to include the case where it is started in a running VM too. build.properties - I assume the empty value for excludes shouldn't have a continuation character now. The rest looks good to me. -Alan. From markus.gronlund at oracle.com Thu Mar 8 09:23:51 2018 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Thu, 8 Mar 2018 01:23:51 -0800 (PST) Subject: RFR: 8196337 Add commit methods that take all event properties as argument In-Reply-To: <495b2840-6031-593d-e6a8-6398e7c3883c@oracle.com> References: <6c57498a-d0cc-e7bd-c54d-bfd4b2acf323@oracle.com> <495b2840-6031-593d-e6a8-6398e7c3883c@oracle.com> Message-ID: <48f9fb9e-8d7c-4f2a-bd27-cc4d074b0a44@default> Hi Leo, Looks good, thanks for doing this. Sorry for the delay.
Cheers Markus -----Original Message----- From: Leo Korinth Sent: den 13 februari 2018 12:26 To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net Subject: Re: RFR: 8196337 Add commit methods that take all event properties as argument Hi, I would be happy for reviews! Thanks, Leo On 31/01/18 11:00, Leo Korinth wrote: > Hi, > > I am adding commit methods that take all event properties as argument. > > For instant events (without start and stop times) a static commit > method is created (taking all properties). > > For non-instant events, a non static commit method is created (taking > all properties). Also a static commit method (with additional > startTicks/endTicks) is created. > > Also an extra constructor is created (taking all properties). An auto > commit destructor was considered (that would auto commit if the > constructor with all properties was used) but has not yet been implemented. > > Enhancement: > https://bugs.openjdk.java.net/browse/JDK-8196337 > > Webrev: > http://cr.openjdk.java.net/~lkorinth/8196337/00/ (open) > > Testing: > - hs-tier1, hs-tier2 > > Thanks, > Leo From christoph.langer at sap.com Thu Mar 8 11:38:34 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 8 Mar 2018 11:38:34 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> Message-ID: <18e077534cd940539cd1cbb93943e9c8@sap.com> Hi Chris, I went back to the original code in coverity and checked what it complains about. 
This is the original code: assert(strlen(name) <= name_length_max, "exceeds maximum name length"); strcpy(_name, name); Coverity has various issues with this. First, it generally considers strcpy as risky and doesn't like it at all. It says: secure_coding: [VERY RISKY]. Using "strcpy(char *, char const *)" can cause a buffer overflow when done incorrectly. If the destination string of a strcpy() is not large enough then anything might happen. Use strncpy() instead. Secondly, it doesn't accept the assert as length check and complains: fixed_size_dest: You might overrun the 17-character fixed-size string this->_name by copying name without checking the length. And, 3rd, it considers the risk as elevated: parameter_as_source: Note: This defect has an elevated risk because the source argument is a parameter of the current function. In my opinion the points are valid, because in opt builds there would be no length check. Though we can see from the current code base that an overrun is virtually impossible and coverity might also be smarter in terms of checking call paths that lead to calls of set_name or set_arg, in the end there is no 100% guarantee that this method would never be called with some bad parameter during runtime, e.g. after someone changes code. I really think it would be easiest to go to my proposed patch. And it doesn't come with much cost and the place probably isn't performance relevant. Best regards Christoph From yasuenag at gmail.com Thu Mar 8 13:21:35 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 8 Mar 2018 22:21:35 +0900 Subject: RFR: 8199323: hsdis could not be loaded which are located on long path Message-ID: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> Hi all, Could you review and sponsor it? 
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8199323 Mach5 test result on submit repo: mach5-one-ysuenaga-JDK-8199323-20180308-1027-13701 I encountered a DebuggerException when hsdis is located on a long path, as below: Location of hsdis: /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/jre/lib/amd64/hsdis-amd64.so Exception: sun.jvm.hotspot.debugger.DebuggerException: /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/j: cannot open shared object file: No such file or directory In Java_sun_jvm_hotspot_asm_Disassembler_load_1library(), the buffer used for the library path is defined as below, so the path in the exception above is cut off after 127 characters: ``` char buffer[128]; ``` I copied the JVM_MAXPATHLEN related code to sadis.c from os/posix/include/jvm_md.h and os/windows/include/jvm_md.h. I added the noreg-hard label on this ticket because this issue occurs when disassembling on a core dump. Thanks, Yasumasa From chris.plummer at oracle.com Thu Mar 8 16:47:32 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 8 Mar 2018 08:47:32 -0800 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <18e077534cd940539cd1cbb93943e9c8@sap.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <18e077534cd940539cd1cbb93943e9c8@sap.com> Message-ID: Hi Christoph, On 3/8/18 3:38 AM, Langer, Christoph wrote: > Hi Chris, > > I went back to the original code in coverity and checked what it complains about.
> > This is the original code: > assert(strlen(name) <= name_length_max, "exceeds maximum name length"); > strcpy(_name, name); > > Coverity has various issues with this. > > First, it generally considers strcpy as risky and doesn't like it at all. It says: > secure_coding: [VERY RISKY]. Using "strcpy(char *, char const *)" can cause a buffer overflow when done incorrectly. If the destination string of a strcpy() is not large enough then anything might happen. Use strncpy() instead. Generally speaking this is true, but not in this case since the caller guards it with a length check. > Secondly, it doesn't accept the assert as length check and complains: > fixed_size_dest: You might overrun the 17-character fixed-size string this->_name by copying name without checking the length. Agreed that the assert is not a length check in product builds. However, the only caller has a length check. Have you tried moving this length check into set_name() to see if the problem goes away? Although I don't suggest that as a fix. Just curious as to what the result would be. Maybe coverity is concerned that the cmd could be changed between the length check and the call to set_name. BTW, I just realized I had been ignoring the set_arg() changes all this time and focused on set_name(). So if any of the complaints are unique to set_arg() please let me know. > And, 3rd, it considers the risk as elevated: > parameter_as_source: Note: This defect has an elevated risk because the source argument is a parameter of the current function. Is this a complaint about "name" being a source argument to strcpy()? If so, I don't get this one. How are you going to copy "name" without specifying it as an argument to something (strcpy, strncpy, memcpy, etc.)? Besides, it is being passed to strcpy as a const argument. Makes me wonder if adding const to the parameter declarations for both set_name() and enqueue() would help.
> > In my opinion the points are valid, because in opt builds there would be no length check. But there is a length check in the caller. Does coverity not see checks up the call chain? > Though we can see from the current code base that an overrun is virtually impossible and coverity might also be smarter in terms of checking call paths that lead to calls of set_name or set_arg, in the end there is no 100% guarantee that this method would never be called with some bad parameter during runtime, e.g. after someone changes code. But if someone did that, we'd still have a bug, right? > > I really think it would be easiest to go to my proposed patch. And it doesn't come with much cost and the place probably isn't performance relevant. I'm not worried about performance. To me it has more to do with taking easy-to-read code and changing it into something that someone would stare at for a bit before figuring out what it's doing, and then ask "Why so complicated?". Coverity is supposed to help us make our code better. I don't see that being the case here. If in the end your changes are the simplest approach to quieting coverity, then I guess that's what we should go with. However, I'm still not convinced we really fully understand why coverity is not happy with a strcpy that can be statically shown to be safe. Is it a coverity bug? Is there a call path we are missing? Something else that makes it hard for coverity to statically check this? That's one reason I'd like to see what happens if a check is put directly in set_name.
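For illustration only, the experiment suggested above (a length check placed directly inside set_name()) might look roughly like the sketch below. The class name `OperationSketch`, the `bool` return value, and the 16-character limit (inferred from the "17-character fixed-size string" in the Coverity report) are assumptions made for this example; this is not the actual attachListener.hpp code.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the attach operation class discussed in the thread.
class OperationSketch {
 public:
  // Assumed limit: a 17-character buffer holds 16 characters plus '\0'.
  static constexpr std::size_t name_length_max = 16;

  OperationSketch() { _name[0] = '\0'; }

  // Copies name only if it fits. The check exists in every build flavour,
  // unlike an assert that is compiled out of product builds.
  bool set_name(const char* name) {
    if (name == nullptr) {
      return false;
    }
    std::size_t len = std::strlen(name);
    if (len > name_length_max) {
      return false;                     // reject instead of overrunning _name
    }
    std::memcpy(_name, name, len + 1);  // len + 1 also copies the terminating '\0'
    return true;
  }

  const char* name() const { return _name; }

 private:
  char _name[name_length_max + 1];
};
```

With a check like this the destination size is visibly tied to the copied length in the same function, which is the property a static analyzer would need to see; whether Coverity actually accepts it is exactly the open question in the thread.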
cheers, Chris > > Best regards > Christoph > From mandy.chung at oracle.com Thu Mar 8 20:02:52 2018 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 8 Mar 2018 12:02:52 -0800 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: Message-ID: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> On 3/7/18 11:46 PM, Alan Bateman wrote: > On 08/03/2018 07:27, Amit Sapre wrote: >> >> Hello, >> >> Please review the changes for removing SNMP support. >> >> Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 >> >> Webrev : >> http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.00 >> >> > cc'ing compiler-dev for help on javac/resources/legacy.properties. I'm > not 100% sure if it is used when compiling to old releases or not. > I think legacy.properties is for compiling for older releases prior to 9 and it should not be changed. Let's get the compiler team to confirm. > As you are re-wording the class description for > jdk.internal.agent.Agent then we might as well get it right. The Agent > class is loaded and its static no-arg startAgent method is invoked when a > system property starting with "com.sun.management" is specified on the > command line. We could expand this to include the case where it is > started in a running VM too. > Good suggestion. > build.properties - I assume the empty value for excludes shouldn't > have a continuation character now. > > The rest looks good to me. > I looked through the webrev. No other comments besides those called out above. I created https://bugs.openjdk.java.net/browse/JDK-8199358 to track the docs update. Mandy -------------- next part -------------- An HTML attachment was scrubbed...
URL: From serguei.spitsyn at oracle.com Thu Mar 8 21:48:03 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 8 Mar 2018 13:48:03 -0800 Subject: PING: Re: RFR: JDK-8193369: post_field_access does not work for some functions, possibly related to fast_getfield In-Reply-To: <74eacea4-a3c0-a35d-047b-1478b7d46c87@oracle.com> References: <1fca6b67-c0d1-db03-52ed-f2c6bcc29a5b@oracle.com> <91aadc35-125a-bf74-6cf5-672dc77ffb22@oracle.com> <3df69fad-c0d8-5667-a61a-f88a83e26d89@oracle.com> <74eacea4-a3c0-a35d-047b-1478b7d46c87@oracle.com> Message-ID: <72cd98ab-1034-c3c4-80cc-d32d8a512fef@oracle.com> Hey guys, One more review is needed for this fix! Thanks, Serguei On 3/5/18 09:58, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It looks good. > Thank you for the update! > > Thanks, > Serguei > > On 3/1/18 10:53, Alex Menkov wrote: >> Hi Serguei, >> >> Thank you for the feedback. >> Updated webrev: >> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev.01/ >> >> See inline for comments for your notes. >> >> On 02/27/2018 23:08, serguei.spitsyn at oracle.com wrote: >>> Hi Alex, >>> >>> Thank you for taking care about this! >>> The fix looks good to me. >>> >>> Some comments on the test. >>> >>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/FieldAccessWatch.java.html >>> >>> There are some commented lines in the TestResult class. >>> A cleanup is needed to delete them. >>> I guess, it is already in your plan. >> >> I deleted couple lines, keeping comment for fields >> >>> The empty line #135 is not needed. >>> An empty line is needed after the L99. >> >> fixed. >> >>> Probably, the intention was to spell "startTest" insted of >>> "initTest" below: >>> >>> ??119???????? if (!startTest(result)) { >>> ??120???????????? throw new RuntimeException("initTest failed"); >>> ??121???????? } >> >> fixed. >> >>> I wonder if this sleep is really needed: >>> ???? 
124 Thread.sleep(500); >>> >>> The "action.apply()" is executed synchronously, is not it? >> >> But notifications are asynchronous, so this helps to avoid test >> failures is some events are delivered a bit later in loaded environment. >> Also this helps to avoid mess of native and java logging >> >>> I'm thinking if moving the test() to native side would simplify things. >> >> To me it's simpler and more flexible to perform required actions in >> Java, native part only handles notifications. >> >>> An Exception can be thrown from native if the test failed or just a >>> boolean status returned. >>> >>> >>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/libFieldAccessWatch.c.html >>> >>> I'd suggest to rename currentTestResults to testResultObject, >>> so it will be in line with testResultClass. >> >> fixed. >> >>> One concern is that that the reportError() does not cause the test >>> to fail and does not break the execution. >>> Would it better to throw an exception with the same message as was >>> printed? >> >> Updated several cases (immediate return from callbacks if something >> went wrong). >> Note that reportError is called from native Java methods and from >> JVMTI callbacks, so throwing an exception doesn't looks right. >> >>> It seems, the function tagAndWatch() adds some complexity to the code. >>> Is all this really needed? Could you, please, add some comments. >>> It does not seem this functions tags anything. >> >> renamed the function, added short function description. >> >>> ??168 (*jvmti)->Deallocate(jvmti, (unsigned char*)sig); >>> >>> ??The sig needs to be cleared after deallocation as it is used and >>> checked in a loop. >> >> Moved the variable to the correct scope. >> >>> Missed initializations: >>> >>> ?? 68???? char *name; >>> ??142???????? jfieldID* klassFields; >>> ??143???????? jint fieldCount; >> >> Fixed. 
>> >> --alex >> >>> Thanks, >>> Serguei >>> >>> >>> On 2/26/18 14:43, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review a fix for >>>> JDK-8193369: post_field_access does not work for some functions, >>>> possibly related to fast_getfield >>>> >>>> The fix disables "fast" command generation when FieldAccess or >>>> FieldModification notifications are requested. >>>> >>>> jira: https://bugs.openjdk.java.net/browse/JDK-8193369 >>>> webrev: http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/ >>>> >>>> --alex >>> > From jcbeyler at google.com Fri Mar 9 00:00:17 2018 From: jcbeyler at google.com (JC Beyler) Date: Fri, 09 Mar 2018 00:00:17 +0000 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: References: <5A819F10.8040201@oracle.com> <5A8414AC.3020209@oracle.com> Message-ID: Hi all, I apologize for the delay but I wanted to add an event system and that took a bit longer than expected and I also reworked the code to take into account the deprecation of FastTLABRefill. This update has four parts: A) I moved the implementation from Thread to ThreadHeapSampler inside of Thread. Would you prefer it as a pointer inside of Thread or like this works for you? Second question would be would you rather have an association outside of Thread altogether that tries to remember when threads are live and then we would have something like: ThreadHeapSampler::get_sampling_size(this_thread); I worry about the overhead of this but perhaps it is not too too bad? B) I also have been working on the Allocation event system that sends out a notification at each sampled event. This will be practical when wanting to do something at the allocation point. I'm also looking at if the whole heapMonitoring code could not reside in the agent code and not in the JDK. 
I'm not convinced but I'm talking to Serguei about it to see/assess :) - Also added two tests for the new event subsystem C) Removed the slow_path fields inside the TLAB code now that FastTLABRefill is deprecated D) Updated the JVMTI documentation and specification for the methods. So the incremental webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.09_10/ and the full webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.10 I believe I have updated the various JIRA issues that track this :) Thanks for your input, Jc On Wed, Feb 14, 2018 at 10:34 PM, JC Beyler wrote: > Hi Erik, > > I inlined my answers, the last of which seems to answer Robbin's concerns > about the same thing (adding things to Thread). > > On Wed, Feb 14, 2018 at 2:51 AM, Erik Österlund > wrote: > >> Hi JC, >> >> Comments are inlined below. >> >> >> On 2018-02-13 06:18, JC Beyler wrote: >> >> Hi Erik, >> >> Thanks for your answers, I've now inlined my own answers/comments. >> >> I've done a new webrev here: >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ >> >> The incremental is here: >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ >> >> Note to all: >> - I've been integrating changes from Erin/Serguei/David comments so >> this webrev incremental is a bit of an answer to all comments in one. I >> apologize for that :) >> >> >> On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund < >> erik.osterlund at oracle.com> wrote: >> >>> Hi JC, >>> >>> Sorry for the delayed reply. >>> >>> Inlined answers: >>> >>> >>> On 2018-02-06 00:04, JC Beyler wrote: >>> >>>> Hi Erik, >>>> >>>> (Renaming this to be folded into the newly renamed thread :)) >>>> >>>> First off, thanks a lot for reviewing the webrev! I appreciate it!
>>>> >>>> I updated the webrev to: >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ >>>> >>>> And the incremental one is here: >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ >>>> >>>> It contains: >>>> - The change for since from 9 to 11 for the jvmti.xml >>>> - The use of the OrderAccess for initialized >>>> - Clearing the oop >>>> >>>> I also have inlined my answers to your comments. The biggest question >>>> will come from the multiple *_end variables. A bit of the logic there >>>> is due to handling the slow path refill vs fast path refill and >>>> checking that the rug was not pulled underneath the slowpath. I >>>> believe that a previous comment was that TlabFastRefill was going to >>>> be deprecated. >>>> >>>> If this is true, we could revert this code a bit and just do a : if >>>> TlabFastRefill is enabled, disable this. And then deprecate that when >>>> TlabFastRefill is deprecated. >>>> >>>> This might simplify this webrev and I can work on a follow-up that >>>> either: removes TlabFastRefill if Robbin does not have the time to do >>>> it or add the support to the assembly side to handle this correctly. >>>> What do you think? >>>> >>> >>> I support removing TlabFastRefill, but I think it is good to not depend >>> on that happening first. >>> >>> >> >> I'm slowly pushing on the FastTLABRefill ( >> >> https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping >> both separate for now though so that we can think of both differently >> >> >> >>> Now, below, inlined are my answers: >>>> >>>> On Fri, Feb 2, 2018 at 8:44 AM, Erik ?sterlund >>>> wrote: >>>> >>>>> Hi JC, >>>>> >>>>> Hope I am reviewing the right version of your work. Here goes... 
>>>>> >>>>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>>>> >>>>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, >>>>> size * >>>>> HeapWordSize, THREAD); >>>>> 160 >>>>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>>>> 162 return result; >>>>> 163 } >>>>> >>>>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>>>> >>>>> Done! >>>> >>> >>> More about this later. >>> >>> >>> >>>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>>>> >>>>> So first of all, there seems to quite a few ends. There is an "end", a >>>>> "hard >>>>> end", a "slow path end", and an "actual end". Moreover, it seems like >>>>> the >>>>> "hard end" is actually further away than the "actual end". So the >>>>> "hard end" >>>>> seems like more of a "really definitely actual end" or something. I >>>>> don't >>>>> know about you, but I think it looks kind of messy. In particular, I >>>>> don't >>>>> feel like the name "actual end" reflects what it represents, >>>>> especially when >>>>> there is another end that is behind the "actual end". >>>>> >>>>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>>>> 414 // Did a fast TLAB refill occur? >>>>> 415 if (_slow_path_end != _end) { >>>>> 416 // Fix up the actual end to be now the end of this TLAB. >>>>> 417 _slow_path_end = _end; >>>>> 418 _actual_end = _end; >>>>> 419 } >>>>> 420 >>>>> 421 return _actual_end + alignment_reserve(); >>>>> 422 } >>>>> >>>>> I really do not like making getters unexpectedly have these kind of >>>>> side >>>>> effects. It is not expected that when you ask for the "hard end", you >>>>> implicitly update the "slow path end" and "actual end" to new values. >>>>> >>>>> As I said, a lot of this is due to the FastTlabRefill. If I make this >>>> not supporting FastTlabRefill, this goes away. The reason the system >>>> needs to update itself at the get is that you only know at that get if >>>> things have shifted underneath the tlab slow path. 
I am not sure of >>>> really better names (naming is hard!), perhaps we could do these >>>> names: >>>> >>>> - current_tlab_end // Either the allocated tlab end or a sampling >>>> point >>>> - last_allocation_address // The end of the tlab allocation >>>> - last_slowpath_allocated_end // In case a fast refill occurred the >>>> end might have changed, this is to remember slow vs fast past refills >>>> >>>> the hard_end method can be renamed to something like: >>>> tlab_end_pointer() // The end of the lab including a bit of >>>> alignment reserved bytes >>>> >>> >>> Those names sound better to me. Could you please provide a mapping from >>> the old names to the new names so I understand which one is which please? >>> >>> This is my current guess of what you are proposing: >>> >>> end -> current_tlab_end >>> actual_end -> last_allocation_address >>> slow_path_end -> last_slowpath_allocated_end >>> hard_end -> tlab_end_pointer >>> >>> >> Yes that is correct, that was what I was proposing. >> >> >>> I would prefer this naming: >>> >>> end -> slow_path_end // the end for taking a slow path; either due to >>> sampling or refilling >>> actual_end -> allocation_end // the end for allocations >>> slow_path_end -> last_slow_path_end // last address for slow_path_end >>> (as opposed to allocation_end) >>> hard_end -> reserved_end // the end of the reserved space of the TLAB >>> >>> About setting things in the getter... that still seems like a very >>> unpleasant thing to me. It would be better to inspect the call hierarchy >>> and explicitly update the ends where they need updating, and assert in the >>> getter that they are in sync, rather than implicitly setting various ends >>> as a surprising side effect in a getter. It looks like the call hierarchy >>> is very small. 
With my new naming convention, reserved_end() would >>> presumably return _allocation_end + alignment_reserve(), and have an assert >>> checking that _allocation_end == _last_slow_path_allocation_end, >>> complaining that this invariant must hold, and that a caller to this >>> function, such as make_parsable(), must first explicitly synchronize the >>> ends as required, to honor that invariant. >>> >>> >> >> I've renamed the variables to how you preferred it except for the _end >> one. I did: >> current_end >> last_allocation_address >> tlab_end_ptr >> >> The reason is that the architecture dependent code use the thread.hpp API >> and it already has tlab included into the name so it becomes >> tlab_current_end (which is better that tlab_current_tlab_end in my opinion). >> >> I also moved the update into a separate method with a TODO that says to >> remove it when FastTLABRefill is deprecated >> >> >> This looks a lot better now. Thanks. >> >> Note that the following comment now needs updating accordingly in >> threadLocalAllocBuffer.hpp: >> >> 41 // Heap sampling is performed via the end/actual_end fields. 42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. >> >> There might be other comments too, I have not looked in detail. >> > > This was the only spot that still had an actual_end, I fixed it now. I'll > do a sweep to double check other comments. > > >> >> >> >> >> >>> >>> Not sure it's better but before updating the webrev, I wanted to try >>>> to get input/consensus :) >>>> >>>> (Note hard_end was always further off than end). >>>> >>>> src/hotspot/share/prims/jvmti.xml: >>>>> >>>>> 10357 >>>>> 10358 >>>>> 10359 Can sample the heap. >>>>> 10360 If this capability is enabled then the heap sampling >>>>> methods >>>>> can be called. 
>>>>> 10361 >>>>> 10362 >>>>> >>>>> Looks like this capability should not be "since 9" if it gets >>>>> integrated >>>>> now. >>>>> >>>> Updated now to 11, crossing my fingers :) >>>> >>>> >>>> src/hotspot/share/runtime/heapMonitoring.cpp: >>>>> >>>>> 448 if (is_alive->do_object_b(value)) { >>>>> 449 // Update the oop to point to the new object if it is >>>>> still >>>>> alive. >>>>> 450 f->do_oop(&(trace.obj)); >>>>> 451 >>>>> 452 // Copy the old trace, if it is still live. >>>>> 453 _allocated_traces->at_put(curr_pos++, trace); >>>>> 454 >>>>> 455 // Store the live trace in a cache, to be served up on >>>>> /heapz. >>>>> 456 _traces_on_last_full_gc->append(trace); >>>>> 457 >>>>> 458 count++; >>>>> 459 } else { >>>>> 460 // If the old trace is no longer live, add it to the >>>>> list of >>>>> 461 // recently collected garbage. >>>>> 462 store_garbage_trace(trace); >>>>> 463 } >>>>> >>>>> In the case where the oop was not live, I would like it to be >>>>> explicitly >>>>> cleared. >>>>> >>>> Done I think how you wanted it. Let me know because I'm not familiar >>>> with the RootAccess API. I'm unclear if I'm doing this right or not so >>>> reviews of these parts are highly appreciated. Robbin had talked of >>>> perhaps later pushing this all into a OopStorage, should I do this now >>>> do you think? Or can that wait a second webrev later down the road? >>>> >>> >>> I think using handles can and should be done later. You can use the >>> Access API now. >>> I noticed that you are missing an #include "oops/access.inline.hpp" in >>> your heapMonitoring.cpp file. >>> >>> >> The missing header is there for me so I don't know, I made sure it is >> present in the latest webrev. Sorry about that. >> >> >> >>> + Did I clear it the way you wanted me to or were you thinking of >>>> something else? >>>> >>> >>> That is precisely how I wanted it to be cleared. Thanks. 
>>> >>> + Final question here, seems like if I were to want to not do the >>>> f->do_oop directly on the trace.obj, I'd need to do something like: >>>> >>>> f->do_oop(&value); >>>> ... >>>> trace->store_oop(value); >>>> >>>> to update the oop internally. Is that right/is that one of the >>>> advantages of going to the Oopstorage sooner than later? >>>> >>> >>> I think you really want to do the do_oop on the root directly. Is there >>> a particular reason why you would not want to do that? >>> Otherwise, yes - the benefit with using the handle approach is that you >>> do not need to call do_oop explicitly in your code. >>> >>> >> There is no reason except that now we have a load_oop and a get_oop_addr, >> I was not sure what you would think of that. >> >> >> That's fine. >> >> >> >>> >>>> Also I see a lot of concurrent-looking use of the following field: >>>>> 267 volatile bool _initialized; >>>>> >>>>> Please note that the "volatile" qualifier does not help with reordering >>>>> here. Reordering between volatile and non-volatile fields is >>>>> completely free >>>>> for both compiler and hardware, except for windows with MSVC, where >>>>> volatile >>>>> semantics is defined to use acquire/release semantics, and the >>>>> hardware is >>>>> TSO. But for the general case, I would expect this field to be stored >>>>> with >>>>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>>>> Otherwise it is not thread safe. >>>>> >>>> Because everything is behind a mutex, I wasn't really worried about >>>> this. I have a test that has multiple threads trying to hit this >>>> corner case and it passes. >>>> >>>> However, to be paranoid, I updated it to using the OrderAccess API >>>> now, thanks! Let me know what you think there too! 
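The acquire/release pairing requested for _initialized can be sketched in isolation. This is only a minimal illustration, assuming standard C++11 std::atomic as a stand-in for HotSpot's OrderAccess::release_store and OrderAccess::load_acquire; Storage, payload, initialize and try_read are hypothetical names, not code from the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the trace storage; this is not HotSpot code.
struct Storage {
  int payload = 0;                       // plain data guarded by the flag
  std::atomic<bool> initialized{false};  // plays the role of _initialized

  // Writer side: fill in the data first, then publish with release.
  void initialize(int value) {
    payload = value;
    initialized.store(true, std::memory_order_release);
  }

  // Reader side: the acquire load pairs with the release store, so a
  // reader that observes 'true' is also guaranteed to see the payload write.
  bool try_read(int* out) const {
    if (!initialized.load(std::memory_order_acquire)) {
      return false;
    }
    *out = payload;
    return true;
  }
};
```

In standard C++ a plain volatile bool gives no such ordering guarantee between the flag and payload, which is the reviewer's point about volatile not helping with reordering.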
>>>> >>> >>> If it is indeed always supposed to be read and written under a mutex, >>> then I would strongly prefer to have it accessed as a normal non-volatile >>> member, and have an assertion that given lock is held or we are in a >>> safepoint, as we do in many other places. Something like this: >>> >>> assert(HeapMonitorStorage_lock->owned_by_self() || >>> (SafepointSynchronize::is_at_safepoint() && >>> Thread::current()->is_VM_thread()), "this should not be accessed >>> concurrently"); >>> >>> It would be confusing to people reading the code if there are uses of >>> OrderAccess that are actually always protected under a mutex. >>> >>> >> Thank you for the exact example to be put in the code! I put it around >> each access/assignment of the _initialized method and found one case where >> yes you can touch it and not have the lock. It actually is "ok" because you >> don't act on the storage until later and only when you really want to >> modify the storage (see the object_alloc_do_sample method which calls the >> add_trace method). >> >> But, because of this, I'm going to put the OrderAccess here, I'll do some >> performance numbers later and if there are issues, I might add a "unsafe" >> read and a "safe" one to make it explicit to the reader. But I don't think >> it will come to that. >> >> >> Okay. This double return in heapMonitoring.cpp looks wrong: >> >> 283 bool initialized() { >> 284 return OrderAccess::load_acquire(&_initialized) != 0; >> 285 return _initialized; >> 286 } >> >> Since you said object_alloc_do_sample() is the only place where you do >> not hold the mutex while reading initialized(), I had a closer look at >> that. It looks like in its current shape, the lack of a mutex may lead to a >> memory leak. In particular, it first checks if (initialized()). Let's >> assume this is now true. It then allocates a bunch of stuff, and checks if >> the number of frames were over 0. 
If they were, it calls
>> StackTraceStorage::storage()->add_trace() seemingly hoping that after
>> grabbing the lock in there, initialized() will still return true. But it
>> could now return false and skip doing anything, in which case the allocated
>> stuff will never be freed.
>>
>
> I fixed this now by making add_trace return a boolean and checking for
> that. It will be in the next webrev. Thanks, the truth is that in our
> implementation the system is always on or off, so this never really occurs
> :). In this version though, that is not true and it's important to handle
> so thanks again!
>
>
>
>>
>> So the analysis seems to be that _initialized is only used outside of the
>> mutex in one instance, where it is used to perform double-checked locking,
>> that actually causes a memory leak.
>>
>> I am not proposing how to fix that, just raising the issue. If you still
>> want to perform this double-checked locking somehow, then the use of
>> acquire/release still seems odd. Because the memory ordering restrictions
>> of it never come into play in this particular case. If it ever did, then
>> the use of destroy_stuff(); release_store(_initialized, 0) would be broken
>> anyway as that would imply that whatever concurrent reader there ever was
>> would after reading _initialized with load_acquire() could *never* read the
>> data that is concurrently destroyed anyway. I would be biased to think that
>> RawAccess::load/store looks like a more appropriate solution,
>> given that the memory leak issue is resolved. I do not know how painful it
>> would be to not perform this double-checked locking.
>>
>
> So I agree with this entirely. I looked also a bit more and the difference
> and code really stems from our internal version. In this version however,
> there are actually a lot of things going on that I did not go entirely
> through in my head but this comment made me ponder a bit more on it.
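The leak and the "add_trace returns a boolean" fix described above can be sketched outside HotSpot. This is a minimal illustration under stated assumptions, not the code in the webrev: TraceStorage, Trace, set_initialized and sample are hypothetical names, and std::mutex with std::unique_ptr stands in for HeapMonitorStorage_lock and the manually allocated trace data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical trace record; the real one holds stack frames and more.
struct Trace {
  std::vector<int> frames;
};

class TraceStorage {
 public:
  void set_initialized(bool value) {
    std::lock_guard<std::mutex> guard(lock_);
    initialized_ = value;
    if (!value) {
      traces_.clear();  // tearing down frees everything already stored
    }
  }

  // Takes ownership of 'trace' only on success. Returns false when the
  // storage is off, leaving ownership with the caller.
  bool add_trace(std::unique_ptr<Trace>& trace) {
    std::lock_guard<std::mutex> guard(lock_);
    if (!initialized_) {
      return false;
    }
    traces_.push_back(std::move(trace));
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(lock_);
    return traces_.size();
  }

 private:
  std::mutex lock_;
  bool initialized_ = false;  // plain bool: only read or written under lock_
  std::vector<std::unique_ptr<Trace>> traces_;
};

// Sampling path: allocate first, then let add_trace decide under the lock.
bool sample(TraceStorage& storage, int frame) {
  std::unique_ptr<Trace> trace(new Trace());
  trace->frames.push_back(frame);
  // On failure 'trace' is still owned here and is freed on return, so
  // nothing leaks even if the storage was disabled in the meantime.
  return storage.add_trace(trace);
}
```

Because the only check of the flag happens under the lock, the flag needs neither volatile nor acquire/release ordering in this shape.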
>
> Since every object_alloc_do_sample is protected by a check to
> HeapMonitoring::enabled(), there is only a small chance that the call is
> happening when things have been disabled. So there is no real need to do a
> first check on the initialized, it is a rare occurrence that a call happens
> to object_alloc_do_sample and the initialized of the storage returns false.
>
> (By the way, even if you did call object_alloc_do_sample without looking
> at HeapMonitoring::enabled(), that would be ok too. You would gather the
> stacktrace and get nowhere at the add_trace call, which would return false;
> so though not optimal performance wise, nothing would break).
>
> Furthermore, the add_trace is really the moment of no return and we have
> the mutex lock and then the initialized check. So, in the end, I did two
> things: I removed that first check and then I removed the OrderAccess for
> the storage initialized. I think now I have a better grasp and
> understanding why it was done in our code and why it is not needed here.
> Thanks for pointing it out :). This now still passes my JTREG tests,
> especially the threaded one.
>
>
>
>
>
>>
>>
>>
>>
>>
>>> As a kind of meta comment, I wonder if it would make sense to add
>>>>> sampling
>>>>> for non-TLAB allocations. Seems like if someone is rapidly allocating a
>>>>> whole bunch of 1 MB objects that never fit in a TLAB, I might still be
>>>>> interested in seeing that in my traces, and not get surprised that the
>>>>> allocation rate is very high yet not showing up in any profiles.
>>>>>
>>>>> That is handled by the handle_sample where you wanted me to put a
>>>> UseTLAB because you hit that case if the allocation is too big.
>>>>
>>>
>>> I see. It was not obvious to me that non-TLAB sampling is done in the
>>> TLAB class. That seems like an abstraction crime.
>>> What I wanted in my previous comment was that we do not call into the
>>> TLAB when we are not using TLABs.
If there is sampling logic in the TLAB >>> that is used for something else than TLABs, then it seems like that logic >>> simply does not belong inside of the TLAB. It should be moved out of the >>> TLAB, and instead have the TLAB call this common abstraction that makes >>> sense. >>> >>> >> So in the incremental version: >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is >> still a "crime". The reason is that the system has to have the >> bytes_until_sample on a per-thread level and it made "sense" to have it >> with the TLAB implementation. Also, I was not sure how people felt about >> adding something to the thread instance instead. >> >> Do you think it fits better at the Thread level? I can see how difficult >> it is to make it happen there and add some logic there. Let me know what >> you think. >> >> >> We have an unfortunate situation where everyone that has some fields that >> are thread local tend to dump them right into Thread, making the size and >> complexity of Thread grow as it becomes tightly coupled with various >> unrelated subsystems. It would be desirable to have a separate class for >> this instead that encapsulates the sampling logic. That class could >> possibly reside in Thread though as a value object of Thread. >> > > I imagined that would be the case but was not sure. I will look at the > example that Robbin is talking about (ThreadSMR) and will see how to > refactor my code to use that. > > Thanks again for your help, > Jc > > >> >> >> >> >> >>> Hope I have answered your questions and that my feedback makes sense to >>> you. >>> >>> >> You have and thank you for them, I think we are getting to a cleaner >> implementation and things are getting better and more readable :) >> >> >> Yes it is getting better. >> >> Thanks, >> /Erik >> >> >> Thanks for your help! 
>> Jc >> >> >> >>> Thanks, >>> /Erik >>> >>> >>> I double checked by changing the test >>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >>>> >>>> to use a smaller Tlab (2048) and made the object bigger and it goes >>>> through that and passes. >>>> >>>> Thanks again for your review and I look forward to your pointers for >>>> the questions I now have raised! >>>> Jc >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> Thanks, >>>>> /Erik >>>>> >>>>> >>>>> On 2018-01-26 06:45, JC Beyler wrote: >>>>> >>>>>> Thanks Robbin for the reviews :) >>>>>> >>>>>> The new full webrev is here: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>>>> The incremental webrev is here: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>>>> >>>>>> I inlined my answers: >>>>>> >>>>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn < >>>>>> robbin.ehn at oracle.com> wrote: >>>>>> >>>>>>> Hi JC, great to see another revision! >>>>>>> >>>>>>> #### >>>>>>> heapMonitoring.cpp >>>>>>> >>>>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>>>> When StackTraceData is moved from _allocated_traces: >>>>>>> L452 store_garbage_trace(trace); >>>>>>> it contains a dead oop. >>>>>>> _allocated_traces could instead be a tupel of oop and StackTraceData >>>>>>> thus >>>>>>> dead oops are not kept. >>>>>>> >>>>>> Done I used inheritance to make the copier work regardless but the >>>>>> idea is the same. >>>>>> >>>>>> You should use the new Access API for loading the oop, something like >>>>>>> this: >>>>>>> RootAccess::load(...) >>>>>>> I don't think you need to use Access API for clearing the oop, but it >>>>>>> would >>>>>>> look nicer. 
And you shouldn't probably be using: >>>>>>> Universe::heap()->is_in_reserved(value) >>>>>>> >>>>>> I am unfamiliar with this but I think I did do it like you wanted me >>>>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>>>> oop exactly, is there somewhere that does that, which I can use to do >>>>>> the same? >>>>>> >>>>>> I removed the is_in_reserved, this came from our internal version, I >>>>>> don't know why it was there but my tests work without so I removed it >>>>>> :) >>>>>> >>>>>> >>>>>> The lock: >>>>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>>>> Is not needed as far as I can see. >>>>>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen >>>>>>> and >>>>>>> JVMTI thread can't access these data-structures. Is there something >>>>>>> more >>>>>>> to >>>>>>> this lock that I'm missing? >>>>>>> >>>>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>>>> ones), it can get to the point of trying to copying the >>>>>> _allocated_traces. I imagine it is possible that this is happening >>>>>> during a GC or that it can be started and a GC happens afterwards. >>>>>> Therefore, it seems to me that you want this protected, no? >>>>>> >>>>>> >>>>>> #### >>>>>>> You have 6 files without any changes in them (any more): >>>>>>> g1CollectedHeap.cpp >>>>>>> psMarkSweep.cpp >>>>>>> psParallelCompact.cpp >>>>>>> genCollectedHeap.cpp >>>>>>> referenceProcessor.cpp >>>>>>> thread.hpp >>>>>>> >>>>>>> Done. >>>>>> >>>>>> #### >>>>>>> I have not looked closely, but is it possible to hide heap sampling >>>>>>> in >>>>>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>>>>> >>>>>>> I am imagining that you are saying to move the code that does the >>>>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>>>> etc.) into the AllocTracer code itself? 
I think that is right and I'll
>>>>>> look if that is possible and prepare a webrev to show what would be
>>>>>> needed to make that happen.
>>>>>>
>>>>>> ####
>>>>>>> Minor nit, when declaring pointer there is a little mix of having the
>>>>>>> pointer adjacent by type name and data name. (Most hotspot code is by
>>>>>>> type
>>>>>>> name)
>>>>>>> E.g.
>>>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = ....
>>>>>>> heapMonitoring.cpp:733 Method* m = vfst.method();
>>>>>>> (not just this file)
>>>>>>>
>>>>>>> Done!
>>>>>>
>>>>>> ####
>>>>>>> HeapMonitorThreadOnOffTest.java:77
>>>>>>> I would make g_tmp volatile, otherwise the assignment in loop may
>>>>>>> theoretically be skipped.
>>>>>>>
>>>>>>> Also done!
>>>>>>
>>>>>> Thanks again!
>>>>>> Jc
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
>

From thomas.schatzl at oracle.com Fri Mar 9 08:41:08 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 09 Mar 2018 09:41:08 +0100
Subject: PING: RFR: JDK-8153333: [REDO] STW phases at Concurrent GC should count in PerfCounter
In-Reply-To:
References: <2f4e0901-1602-6276-e7fd-84e168e7b317@gmail.com>
 <3210723e-c5f9-4277-a97a-be61e10e6b3d@oracle.com>
 <89a1a161-874d-4743-e52b-7ab202fd8976@gmail.com>
 <9799a2ad-6727-cae9-312b-fd9de13c8203@oracle.com>
 <9183dfd3-522f-fd73-a681-5ea957f1d717@gmail.com>
Message-ID: <1520584868.3872.6.camel@oracle.com>

Hi Yasumasa,

On Wed, 2018-03-07 at 21:18 +0900, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.08/
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8153333
> CSR: https://bugs.openjdk.java.net/browse/JDK-8196862
>
> This change has passed Mach5 on submit repo.
> Also it has passed hotspot/jtreg/:hotspot_serviceability and
> jdk/:jdk_tools jtreg tests.
>
> We need one more reviewer.

- one really minor issue I think: in the description in
JstatGcCauseResults.java, the descriptions of the new concurrent
collections, between "GCT" and "Total Garbage collection time." there
should probably be one space less to align with the "main" gc phase
times.

- in gcCapacityOutput1.awk, gcNewCapacityOutput1.awk, there are some
additional newlines (line 10/11).

No need for re-review from me, and for the change in
JstatGcCauseResults.java I am not completely sure my suggestion is
good.

Stefan Johansson already mentioned he will sponsor.

Thanks,
  Thomas

From jini.george at oracle.com Fri Mar 9 09:29:02 2018
From: jini.george at oracle.com (Jini George)
Date: Fri, 9 Mar 2018 14:59:02 +0530
Subject: RFR: JDK-8175312: SA: clhsdb: Provide an improved heap summary for 'universe' for G1GC
In-Reply-To:
References: <38d71740-0b66-3ce8-26ed-a0f2b9f9e91c@oracle.com>
 <5e8c582e-b32f-daf7-0e0c-1e6606ceaf3a@oracle.com>
Message-ID:

Here is the revised webrev:

http://cr.openjdk.java.net/~jgeorge/8175312/webrev.02/

I have made modifications to have the 'universe' command display details like:

hsdb> universe
Heap Parameters:
garbage-first heap [0x0000000725200000, 0x00000007c0000000] region size 1024K
G1 Heap:
   regions = 2478
   capacity = 2598371328 (2478.0MB)
   used = 5242880 (5.0MB)
   free = 2593128448 (2473.0MB)
   0.20177562550443906% used
G1 Young Generation:
Eden Space:
   regions = 5
   capacity = 8388608 (8.0MB)
   used = 5242880 (5.0MB)
   free = 3145728 (3.0MB)
   62.5% used
Survivor Space:
   regions = 0
   capacity = 0 (0.0MB)
   used = 0 (0.0MB)
   free = 0 (0.0MB)
   0.0% used
G1 Old Generation:
   regions = 0
   capacity = 155189248 (148.0MB)
   used = 0 (0.0MB)
   free = 155189248 (148.0MB)
   0.0% used

I did not add the metaspace details since that did not seem to be in line with the 'universe' output for other GCs. I have added a new command "g1regiondetails" to display the region details, and have modified the tests accordingly.

hsdb> g1regiondetails
Region Details:
Region: 0x0000000725200000,0x0000000725200000,0x0000000725300000:Free
Region: 0x0000000725300000,0x0000000725300000,0x0000000725400000:Free
Region: 0x0000000725400000,0x0000000725400000,0x0000000725500000:Free
Region: 0x0000000725500000,0x0000000725500000,0x0000000725600000:Free
Region: 0x0000000725600000,0x0000000725600000,0x0000000725700000:Free
Region: 0x0000000725700000,0x0000000725700000,0x0000000725800000:Free
...

Thanks,
Jini.

On 2/28/2018 12:56 PM, Jini George wrote:
> Thank you very much, Stefan. My answers inline.
>
> On 2/27/2018 3:30 PM, Stefan Johansson wrote:
>> Hi Jini,
>
>>>> JIRA ID:https://bugs.openjdk.java.net/browse/JDK-8175312
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.00/index.html
>>>>
>> It looks like a file is missing, did you forget to add it to the
>> changeset?
>
> Indeed, I had missed that! I added the missing file in the following
> webrev:
>
> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.01/
>
>> ---
>> open/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1CollectedHeap.java:36:
>> error: cannot find symbol
>> import sun.jvm.hotspot.gc.shared.PrintRegionClosure;
>> ---
>>
>> Otherwise the change looks good, but I would like to see the output
>> live. For a big heap this will print a lot of data, just wondering if
>> the universe command is the correct choice for this kind of output. I
>> like having the possibility to print all regions, so I want the change
>> but maybe it should be a different command and 'universe' just prints
>> a little more than before. Something like our logging heap-summary at
>> shutdown:
>> garbage-first heap   total 16384K, used 3072K [0x00000000ff000000,
>> 0x0000000100000000)
>>   region size 1024K, 4 young (4096K), 0 survivors (0K)
>> Metaspace       used 6731K, capacity 6825K, committed 7040K, reserved
>> 1056768K
>>   class space    used 559K, capacity 594K, committed 640K, reserved
>> 1048576K
>
> Ok, will add this, and could probably have the region details displayed
> under a new command called "g1regiondetails", or some such, and send out
> a new webrev.
>
> Thanks,
> Jini.
>
>>
>> Thanks,
>> Stefan
>>>> Modifications have been made to display the regions like:
>>>>
>>>> ...
>>>> Region: 0x00000005c5400000,0x00000005c5600000,0x00000005c5600000:Old
>>>> Region: 0x00000005c5600000,0x00000005c5800000,0x00000005c5800000:Old
>>>> Region: 0x00000005c5800000,0x00000005c5a00000,0x00000005c5a00000:Old
>>>> Region: 0x00000005c5a00000,0x00000005c5c00000,0x00000005c5c00000:Old
>>>> Region: 0x00000005c5c00000,0x00000005c5c00000,0x00000005c5e00000:Free
>>>> Region: 0x00000005c5e00000,0x00000005c5e00000,0x00000005c6000000:Free
>>>> Region: 0x00000005c6000000,0x00000005c6200000,0x00000005c6200000:Old
>>>> ...
>>>>
>>>> The jtreg test at this point does not include any testing for the
>>>> display of archived or pinned regions. The testing for this will be
>>>> added once JDK-8174994 is resolved.
>>>>
>>>> The SA tests pass with jprt and Mach5.
>>>>
>>>> Thanks,
>>>> Jini.
>>

From yasuenag at gmail.com Fri Mar 9 12:12:42 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Fri, 9 Mar 2018 21:12:42 +0900
Subject: PING: RFR: JDK-8153333: [REDO] STW phases at Concurrent GC should count in PerfCounter
In-Reply-To: <1520584868.3872.6.camel@oracle.com>
References: <2f4e0901-1602-6276-e7fd-84e168e7b317@gmail.com>
 <3210723e-c5f9-4277-a97a-be61e10e6b3d@oracle.com>
 <89a1a161-874d-4743-e52b-7ab202fd8976@gmail.com>
 <9799a2ad-6727-cae9-312b-fd9de13c8203@oracle.com>
 <9183dfd3-522f-fd73-a681-5ea957f1d717@gmail.com>
 <1520584868.3872.6.camel@oracle.com>
Message-ID:

Hi Thomas,

> - one really minor issue I think: in the description in
> JstatGcCauseResults.java, the descriptions of the new concurrent
> collections, between "GCT" and "Total Garbage collection time." there
> should probably be one space less to align with the "main" gc phase
> times.
>
> - in gcCapacityOutput1.awk, gcNewCapacityOutput1.awk, there are some
> additional newlines (line 10/11).

Thanks!
I will fix them.

Yasumasa

On 2018/03/09 17:41, Thomas Schatzl wrote:
> Hi Yasumasa,
>
> On Wed, 2018-03-07 at 21:18 +0900, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8153333/webrev.08/
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8153333
>> CSR: https://bugs.openjdk.java.net/browse/JDK-8196862
>>
>> This change has passed Mach5 on submit repo.
>> Also it has passed hotspot/jtreg/:hotspot_serviceability and
>> jdk/:jdk_tools jtreg tests.
>>
>> We need one more reviewer.
>
> - one really minor issue I think: in the description in
> JstatGcCauseResults.java, the descriptions of the new concurrent
> collections, between "GCT" and "Total Garbage collection time." there
> should probably be one space less to align with the "main" gc phase
> times.
>
> - in gcCapacityOutput1.awk, gcNewCapacityOutput1.awk, there are some
> additional newlines (line 10/11).
>
> No need for re-review from me, and for the change in
> JstatGcCauseResults.java I am not completely sure my suggestion is
> good.
>
> Stefan Johansson already mentioned he will sponsor.
> > Thanks, > Thomas > From christoph.langer at sap.com Fri Mar 9 12:50:19 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Fri, 9 Mar 2018 12:50:19 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <18e077534cd940539cd1cbb93943e9c8@sap.com> Message-ID: <6d9ec2ef8f5d400c8ea4a794388fd1c8@sap.com> Hi Chris, > > Secondly, it doesn't accept the assert as length check and complains: > > fixed_size_dest: You might overrun the 17-character fixed-size string this- > >_name by copying name without checking the length. > Agreed that the assert is not a length check in product builds. However, > the only caller has a length check. Have you tried moving this length > check into set_name() and see if the problem goes away? Although I don't > suggest that as a fix. Just curious as to what the result would be. When doing a length check in set_name(), coverity would be pleased. But still we'd have to handle length violations by either guaranteeing or returning some error return code, or quietly truncating. But you say you don't suggest it as fix anyway... > BTW, I just realized I had been ignoring the set_arg() changes all this > time and focused on set_name(). So if any of the complaints are unique > to set_arg() please let me know. No, nothing unique. > > And, 3rd, it considers the risk as elevated: > > parameter_as_source: Note: This defect has an elevated risk because the > source argument is a parameter of the current function. > Is this a complaint about "name" being a source argument to strcpy(). If > so, I don't get this one. 
How are you going to copy "name" without
> specifying it as an argument to something (strcpy, strncpy, memcpy,
> etc). Besides, it is being passed to strcpy as a const argument. Makes
> me wonder if adding const to the parameter declarations for both
> set_name() and enqueue() would help.

I think coverity just considers this finding as elevated because the input data isn't something static from inside the method but comes in as argument.

> > In my opinion the points are valid, because in opt builds there would be no length check.
> But there is a length check in the caller. Does coverity not see checks up the call chain?

Obviously not.

> > I really think it would be easiest to go to my proposed patch. And it doesn't
>> come with much cost and the place probably isn't performance relevant.
> I'm not worried about performance. To me it has more to do with taking
> easy to read code and changing it into something that someone would
> stare at for a bit before figuring out what it's doing, and then ask
> "Why so complicated?". Coverity is supposed to help us make our code
> better. I don't see that being the case here. If in the end your changes
> are the simplest approach to quieting coverity, then I guess that's what
> we should go with. However, I'm still not convinced we really fully understand why
> coverity is not happy with a strcpy that can be statically shown to be
> safe. Is it a coverity bug? Is there a call path we are missing?
> Something else that makes it hard for coverity to statically check this?
> That's one reason I'd like to see what happens if a check is put
> directly in set_name.

OK, so let me summarize: The code as it is right now has a little issue - which isn't obvious at a quick glance by the way. It can be fixed like I suggested. This would add two lines of code at each place and one can argue about how easy it is to understand. To me it seems as understandable as it was before - but I'm probably a bit concerned here.
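For readers without access to the webrevs, the general shape of a length-checked copy into a fixed-size buffer can be sketched as follows. This is not either of the proposed patches and not the attachListener.hpp code: Operation, name_length_max and the choice to report an error rather than truncate are assumptions made for illustration. Only the 17-character buffer size is taken from the coverity message quoted in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the operation object discussed in this thread.
class Operation {
 public:
  // Assumed limit: 16 characters plus the terminating NUL gives the
  // 17-character buffer mentioned in the coverity finding.
  static constexpr std::size_t name_length_max = 16;

  Operation() { _name[0] = '\0'; }

  // Copies 'name' only if it fits; otherwise leaves an empty name and
  // reports failure, so the check also exists in product builds.
  bool set_name(const char* name) {
    std::size_t len = std::strlen(name);
    if (len > name_length_max) {
      _name[0] = '\0';
      return false;
    }
    std::memcpy(_name, name, len + 1);  // len + 1 copies the NUL as well
    return true;
  }

  const char* name() const { return _name; }

 private:
  char _name[name_length_max + 1];
};
```

Returning an error is only one of the options mentioned above; guaranteeing or quietly truncating would keep the same check and change what happens in the failing branch.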
I can suggest an alternative which might be easier to read: http://cr.openjdk.java.net/~clanger/webrevs/8199010.1/
It comes at the cost of 2 calls to strlen() in dbg builds but it has one line of code less and might be more straightforward to understand.

All larger refactoring of set_name() and set_arg() is beyond the scope of my change.

Now I'd really like if you could accept one of my 2 proposals, given that also Thomas and David think it's ok. I want to get this done now. Maybe you can even sponsor it...

Thanks & Best regards
Christoph

From leo.korinth at oracle.com Fri Mar 9 13:34:16 2018
From: leo.korinth at oracle.com (Leo Korinth)
Date: Fri, 9 Mar 2018 14:34:16 +0100
Subject: RFR: 8196337 Add commit methods that take all event properties as argument
In-Reply-To: <48f9fb9e-8d7c-4f2a-bd27-cc4d074b0a44@default>
References: <6c57498a-d0cc-e7bd-c54d-bfd4b2acf323@oracle.com>
 <495b2840-6031-593d-e6a8-6398e7c3883c@oracle.com>
 <48f9fb9e-8d7c-4f2a-bd27-cc4d074b0a44@default>
Message-ID: <369d0429-83af-48ec-42b4-34c8f98f349f@oracle.com>

Hi Markus,

I removed a few newlines (no new webrev). I will use the modified change unless you disagree.

I still need another reviewer (and a sponsor).

Thanks,
Leo

On 08/03/18 10:23, Markus Gronlund wrote:
> Hi Leo,
>
> Looks good, thanks for doing this.
>
> Sorry for the delay.
>
> Cheers
> Markus
>
> -----Original Message-----
> From: Leo Korinth
> Sent: den 13 februari 2018 12:26
> To: hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8196337 Add commit methods that take all event properties as argument
>
> Hi,
>
> I would be happy for reviews!
>
> Thanks,
> Leo
>
> On 31/01/18 11:00, Leo Korinth wrote:
>> Hi,
>>
>> I am adding commit methods that take all event properties as argument.
>>
>> For instant events (without start and stop times) a static commit
>> method is created (taking all properties).
>> >> For non-instant events, a non static commit method is created (taking >> all properties). Also a static commit method (with additional >> startTicks/endTicks) is created. >> >> Also an extra constructor is created (taking all properties). An auto >> commit destructor was considered (that would auto commit if the >> constructor with all properties was used) but has not yet been implemented. >> >> Enhancement: >> https://bugs.openjdk.java.net/browse/JDK-8196337 >> >> Webrev: >> http://cr.openjdk.java.net/~lkorinth/8196337/00/ (open) >> >> Testing: >> - hs-tier1, hs-tier2 >> >> Thanks, >> Leo From erik.helin at oracle.com Fri Mar 9 14:19:55 2018 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 9 Mar 2018 15:19:55 +0100 Subject: RFR: 8196337 Add commit methods that take all event properties as argument In-Reply-To: <369d0429-83af-48ec-42b4-34c8f98f349f@oracle.com> References: <6c57498a-d0cc-e7bd-c54d-bfd4b2acf323@oracle.com> <495b2840-6031-593d-e6a8-6398e7c3883c@oracle.com> <48f9fb9e-8d7c-4f2a-bd27-cc4d074b0a44@default> <369d0429-83af-48ec-42b4-34c8f98f349f@oracle.com> Message-ID: <5c849db7-d25c-f446-28a4-e7a4f0014770@oracle.com> On 03/09/2018 02:34 PM, Leo Korinth wrote: > I still need another reviewer (and a sponsor). Looks good (well XSLT combined with C++ is awful, but the patch is good :D), Reviewed. I can sponsor this patch. Thanks, Erik > Thanks, > Leo > > On 08/03/18 10:23, Markus Gronlund wrote: >> Hi Leo, >> >> Looks good, thanks for doing this. >> >> Sorry for the delay. >> >> Cheers >> Markus >> >> -----Original Message----- >> From: Leo Korinth >> Sent: den 13 februari 2018 12:26 >> To: hotspot-runtime-dev at openjdk.java.net; >> serviceability-dev at openjdk.java.net >> Subject: Re: RFR: 8196337 Add commit methods that take all event >> properties as argument >> >> Hi, >> >> I would be happy for reviews! 
>> >> Thanks, >> Leo >> >> On 31/01/18 11:00, Leo Korinth wrote: >>> Hi, >>> >>> I am adding commit methods that take all event properties as argument. >>> >>> For instant events (without start and stop times) a static commit >>> method is created (taking all properties). >>> >>> For non-instant events, a non static commit method is created (taking >>> all properties). Also a static commit method (with additional >>> startTicks/endTicks) is created. >>> >>> Also an extra constructor is created (taking all properties). An auto >>> commit destructor was considered (that would auto commit if the >>> constructor with all properties was used) but has not yet been >>> implemented. >>> >>> Enhancement: >>> https://bugs.openjdk.java.net/browse/JDK-8196337 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~lkorinth/8196337/00/ (open) >>> >>> Testing: >>> - hs-tier1, hs-tier2 >>> >>> Thanks, >>> Leo From leo.korinth at oracle.com Fri Mar 9 14:45:23 2018 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 9 Mar 2018 15:45:23 +0100 Subject: RFR: 8196337 Add commit methods that take all event properties as argument In-Reply-To: <5c849db7-d25c-f446-28a4-e7a4f0014770@oracle.com> References: <6c57498a-d0cc-e7bd-c54d-bfd4b2acf323@oracle.com> <495b2840-6031-593d-e6a8-6398e7c3883c@oracle.com> <48f9fb9e-8d7c-4f2a-bd27-cc4d074b0a44@default> <369d0429-83af-48ec-42b4-34c8f98f349f@oracle.com> <5c849db7-d25c-f446-28a4-e7a4f0014770@oracle.com> Message-ID: On 09/03/18 15:19, Erik Helin wrote: > On 03/09/2018 02:34 PM, Leo Korinth wrote: >> I still need another reviewer (and a sponsor). > > Looks good (well XSLT combined with C++ is awful, but the patch is good > :D), Reviewed. I can sponsor this patch. Great! Thanks, Leo > > Thanks, > Erik > >> Thanks, >> Leo >> >> On 08/03/18 10:23, Markus Gronlund wrote: >>> Hi Leo, >>> >>> Looks good, thanks for doing this. >>> >>> Sorry for the delay. 
>>> >>> Cheers >>> Markus >>> >>> -----Original Message----- >>> From: Leo Korinth >>> Sent: den 13 februari 2018 12:26 >>> To: hotspot-runtime-dev at openjdk.java.net; >>> serviceability-dev at openjdk.java.net >>> Subject: Re: RFR: 8196337 Add commit methods that take all event >>> properties as argument >>> >>> Hi, >>> >>> I would be happy for reviews! >>> >>> Thanks, >>> Leo >>> >>> On 31/01/18 11:00, Leo Korinth wrote: >>>> Hi, >>>> >>>> I am adding commit methods that take all event properties as argument. >>>> >>>> For instant events (without start and stop times) a static commit >>>> method is created (taking all properties). >>>> >>>> For non-instant events, a non static commit method is created (taking >>>> all properties). Also a static commit method (with additional >>>> startTicks/endTicks) is created. >>>> >>>> Also an extra constructor is created (taking all properties). An auto >>>> commit destructor was considered (that would auto commit if the >>>> constructor with all properties was used) but has not yet been >>>> implemented. 
>>>> >>>> Enhancement: >>>> https://bugs.openjdk.java.net/browse/JDK-8196337 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~lkorinth/8196337/00/ (open) >>>> >>>> Testing: >>>> - hs-tier1, hs-tier2 >>>> >>>> Thanks, >>>> Leo From chris.plummer at oracle.com Fri Mar 9 16:01:51 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 9 Mar 2018 08:01:51 -0800 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <6d9ec2ef8f5d400c8ea4a794388fd1c8@sap.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <18e077534cd940539cd1cbb93943e9c8@sap.com> <6d9ec2ef8f5d400c8ea4a794388fd1c8@sap.com> Message-ID: <5933353c-8f49-85cc-3e01-6e80982a3d7f@oracle.com> On 3/9/18 4:50 AM, Langer, Christoph wrote: > Hi Chris, > >>> Secondly, it doesn't accept the assert as length check and complains: >>> fixed_size_dest: You might overrun the 17-character fixed-size string this- >>> _name by copying name without checking the length. >> Agreed that the assert is not a length check in product builds. However, >> the only caller has a length check. Have you tried moving this length >> check into set_name() and see if the problem goes away? Although I don't >> suggest that as a fix. Just curious as to what the result would be. > When doing a length check in set_name(), coverity would be pleased. But still we'd have to handle length violations by either guaranteeing or returning some error return code, or quietly truncating. But you say you don't suggest it as fix anyway... > >> BTW, I just realized I had been ignoring the set_arg() changes all this >> time and focused on set_name(). So if any of the complaints are unique >> to set_arg() please let me know. > No, nothing unique. 
> >>> And, 3rd, it considers the risk as elevated: >>> parameter_as_source: Note: This defect has an elevated risk because the >> source argument is a parameter of the current function. >> Is this a complaint about "name" being a source argument to strcpy(). If >> so, I don't get this one. How are you going to copy "name" without >> specifying it as an argument to something (strcpy, strncpy, memcpy, >> etc). Besides, it is being passed to strcpy as a const argument. Makes >> me wonder if adding const to the parameter declarations for both >> set_name() and enqueue() would help. > I think coverity just considers this finding as elevated because the input data isn't something static from inside the method but comes in as argument. > >>> In my opinion the points are valid, because in opt builds there would be no length check. >> But there is a length check in the caller. Does coverity not see checks up the call chain? > Obviously not. > >>> I really think it would be easiest to go to my proposed patch. And it doesn't >>> come with much cost and the place probably isn't performance relevant. >> I'm not worried about performance. To me it has more to do with taking >> easily to read code and changing it into something that someone would >> stare at for a bit before figuring out what it's doing, and then ask >> "Why so complicated?". Coverity is suppose to help us make our code >> better. I don't see that being the case here. If in the end your changes >> are the simplest approach to quieting coverity, then I guess that's what >> we should go with. However, I'm still not convinced we really fully why >> converity is not happy with a strcpy that can be statically shown to be >> safe. Is is a coverity bug? Is there a call path we are missing? >> Something else that makes it hard for coverity to statically check this? >> That's one reason I'd like to see what happens if a check is put >> directly in set_name. 
> OK, so let me summarize: > The code as it is right now has a little issue - which isn't obvious at a quick glance by the way. > It can be fixed like I suggested. This would add two lines of code at each place and one can argue about how easy it is to understand. To me it seems as understandable as it was before - but I'm probably a bit concerned here. In terms of readability, I was referring back to the original code that just had the strlen. It was the original coverity fix to that code that introduced the readability issue. You aren't really doing much to make it less readable. > I can suggest an alternative which might be easier to read: http://cr.openjdk.java.net/~clanger/webrevs/8199010.1/ It comes at the cost of 2 calls to strlen() in dbg builds but it has one line of code less and might be more straightforward to understand. > All larger refactoring of set_name() and set_arg() is beyond the scope of my change. I like this version better, although it doesn't change my opinion that this is still all jumping through hoops to get coverity to stop complaining about something that is perfectly fine. > > Now I'd really like if you could accept one of my 2 proposals, given that also Thomas and David think it's ok. I want to get this done now. Maybe you can even sponsor it... Yeah, I'm ok with the change. I've said my piece and just don't want to get in the way of a simple fix. Yes, I can also sponsor it for you. cheers, Chris > > Thanks & Best regards > Christoph > From chris.plummer at oracle.com Fri Mar 9 20:46:48 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 9 Mar 2018 12:46:48 -0800 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr Message-ID: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Hello, Please help review the following: https://bugs.openjdk.java.net/browse/JDK-8198655 http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ In the end there were two issues. 
The first was that the pb.redirectError() call was redirecting the LingeredApp's stderr to the console, which we don't want. The second was that nothing was capturing the LingeredApp's output and sending it to the driver app's output (jtr file). These changes make all the LingeredApp's output end up in the jtr file. Tested by running all tests that use LingeredApp. thanks, Chris From david.holmes at oracle.com Mon Mar 12 02:52:53 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 12 Mar 2018 12:52:53 +1000 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Message-ID: Hi Chris, On 10/03/2018 6:46 AM, Chris Plummer wrote: > Hello, > > Please help review the following: > > https://bugs.openjdk.java.net/browse/JDK-8198655 > http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ > > In the end there were two issues. The first was that the > pb.redirectError() call was redirecting the LingeredApp's stderr to the > console, which we don't want. The second was that nothing was capturing > the LingeredApp's output and sending it to the driver app's output (jtr > file). These changes make all the LingeredApp's output end up in the jtr > file. It isn't clear to me how the interleaving of the two streams by the two threads is handled in the copy routine. Are we guaranteed to get complete lines of output from each stream before writing to System.out? Thanks, David ----- > Tested by running all tests that use LingeredApp. 
> > thanks, > > Chris From david.holmes at oracle.com Mon Mar 12 04:13:47 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 12 Mar 2018 14:13:47 +1000 Subject: RFR: 8199323: hsdis could not be loaded which are located on long path In-Reply-To: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> Message-ID: Hi Yasumasa, On 8/03/2018 11:21 PM, Yasumasa Suenaga wrote: > Hi all, > > Could you review and sponsor it? > > ????????????????????????? webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.00/ > ???????????????????????????? JBS: > https://bugs.openjdk.java.net/browse/JDK-8199323 > Mach5 test result on submit repo: > mach5-one-ysuenaga-JDK-8199323-20180308-1027-13701 > > I encountered DebuggerException when hsdis is located on long path as > below: > > Location of hsdis: > /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/jre/lib/amd64/hsdis-amd64.so > > > Exception: > sun.jvm.hotspot.debugger.DebuggerException: > /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/j: > cannot open shared object file: No such file or directory > > In Java_sun_jvm_hotspot_asm_Disassembler_load_1library(), buffer which > uses for library path is defined as below: > > ``` > char buffer[128]; > ``` > > I copied JVM_MAXPATHLEN related code to sadis.c from > os/posix/include/jvm_md.h and os/windows/include/jvm_md.h . I don't think this code has the same concern that the code in jvm_md.h claims** to have, so a simple use of MAXPATHLEN should be fine on all non-windows platforms. ** The posix jvm_md.h code is historical and I don't think we have to be concerned either about a 4095 definition of MAXPATHLEN or that the VM and libraries may have been compiled on different Linux versions! 
My only concern with the current change is whether a 4K on stack buffer might cause any issues? Thanks, David ----- > > I added noreg-hard label on this ticket because this issue is available > when disassembling on coredump. > > > Thanks, > > Yasumasa From harsha.wardhana.b at oracle.com Mon Mar 12 05:02:36 2018 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Mon, 12 Mar 2018 10:32:36 +0530 Subject: RFR : JDK-8196744 : JMX: Not enough JDP packets received before timeout In-Reply-To: <1dcedaa6-51ed-eb38-e490-3ef3cf1b0817@oracle.com> References: <2eb1146c-f089-716d-8d83-70504cde24f8@oracle.com> <4a870972-15c1-695f-2912-6ca5fe92ea61@oracle.com> <1dcedaa6-51ed-eb38-e490-3ef3cf1b0817@oracle.com> Message-ID: Ping! Can I have one more review for the below fix? Thanks Harsha On Monday 26 February 2018 10:42 AM, Harsha Wardhana B wrote: > Hello All, > > Requesting for review from one more reviewer. > > Thanks > Harsha > > On Wednesday 21 February 2018 10:01 AM, Chris Plummer wrote: >> Hi Harsha, >> >> Not a review, but just a request that you add the explanation of the >> problem to the CR so we have a record of it. Also, the copyright >> needs to be updated. >> >> thanks, >> >> Chris >> >> On 2/20/18 3:30 AM, Harsha Wardhana B wrote: >>> Hi All, >>> >>> Please find the fix below for the Jdp test-case. >>> >>> issue: https://bugs.openjdk.java.net/browse/JDK-8196028 >>> webrev : http://cr.openjdk.java.net/~hb/8196028/webrev.00/ >>> >>> Fix details : The test was receiving JDP packets from other VM and >>> hence the multi-cast socket was not timing-out. The default timeout >>> handler was causing test to fail. Added a shutdown method that >>> passes the test in case of timeout. 
>>> >>> Thanks >>> Harsha >> >> > From serguei.spitsyn at oracle.com Mon Mar 12 09:32:25 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 12 Mar 2018 02:32:25 -0700 Subject: tt Message-ID: From christoph.langer at sap.com Mon Mar 12 10:27:14 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Mon, 12 Mar 2018 10:27:14 +0000 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Message-ID: Hi Chris, > Hi Chris, > > On 10/03/2018 6:46 AM, Chris Plummer wrote: > > Hello, > > > > Please help review the following: > > > > https://bugs.openjdk.java.net/browse/JDK-8198655 > > http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ > > > > In the end there were two issues. The first was that the > > pb.redirectError() call was redirecting the LingeredApp's stderr to the > > console, which we don't want. The second was that nothing was capturing > > the LingeredApp's output and sending it to the driver app's output (jtr > > file). These changes make all the LingeredApp's output end up in the jtr > > file. > > It isn't clear to me how the interleaving of the two streams by the two > threads is handled in the copy routine. Are we guaranteed to get > complete lines of output from each stream before writing to System.out? Would perhaps the use of a BufferedReader in this place be appropriate, using readLine()? Another small remark: The indentation of line 361" } catch (IOException e) {" seems too deep. 
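To make the readLine() suggestion concrete, here is a minimal sketch of a line-oriented copy. It is an illustration only: the names LinePump and pump are hypothetical and are not taken from the webrev, and the real LingeredApp change may well be structured differently. Each pump thread reads complete lines through a BufferedReader and writes each line while holding a lock on the shared target stream, so output from the child's stdout and stderr can interleave only at line boundaries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK test library.
final class LinePump {
    // Starts a thread that copies 'in' to 'out' one complete line at a time.
    // The caller should join() the returned thread before reading the result.
    static Thread pump(InputStream in, PrintStream out, String prefix) {
        Thread t = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null at end of stream; a final line
                // without a trailing newline is still returned.
                while ((line = reader.readLine()) != null) {
                    // Hold the lock for the whole line so that two pumps
                    // sharing 'out' never split each other's lines.
                    synchronized (out) {
                        out.println(prefix + line);
                    }
                }
            } catch (IOException e) {
                synchronized (out) {
                    out.println(prefix + "copy failed: " + e);
                }
            }
        });
        t.start();
        return t;
    }
}
```

With a Process p, this would be used as pump(p.getInputStream(), System.out, "[stdout] ") and pump(p.getErrorStream(), System.out, "[stderr] "). The prefixes are an assumption added here to keep the two sources distinguishable in the jtr file.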
Best regards Christoph From christoph.langer at sap.com Mon Mar 12 13:24:22 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Mon, 12 Mar 2018 13:24:22 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <5933353c-8f49-85cc-3e01-6e80982a3d7f@oracle.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <18e077534cd940539cd1cbb93943e9c8@sap.com> <6d9ec2ef8f5d400c8ea4a794388fd1c8@sap.com> <5933353c-8f49-85cc-3e01-6e80982a3d7f@oracle.com> Message-ID: <95d10e87dc5f4531bd576965a9ab43c2@sap.com> Hi, here is the final webrev for pushing: http://cr.openjdk.java.net/~clanger/webrevs/8199010.2/ I also did a little sorting in the include files (alphabetical order). The tests at SAP went fine and the coverity build was satisfied, too. Thanks in advance, Chris, for sponsoring. Best regards Christoph > -----Original Message----- > From: Chris Plummer [mailto:chris.plummer at oracle.com] > Sent: Friday, 9 March 2018 17:02 > To: Langer, Christoph > Cc: serviceability-dev at openjdk.java.net; Hotspot dev runtime runtime-dev at openjdk.java.net>; David Holmes > ; Thomas Stüfe > Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null > termination issue found by coverity scans > > On 3/9/18 4:50 AM, Langer, Christoph wrote: > > Hi Chris, > > > >>> Secondly, it doesn't accept the assert as length check and complains: > >>> fixed_size_dest: You might overrun the 17-character fixed-size string > this- > >>> _name by copying name without checking the length. > >> Agreed that the assert is not a length check in product builds. However, > >> the only caller has a length check. Have you tried moving this length > >> check into set_name() and see if the problem goes away? 
Although I > don't > >> suggest that as a fix. Just curious as to what the result would be. > > When doing a length check in set_name(), coverity would be pleased. But > still we'd have to handle length violations by either guaranteeing or returning > some error return code, or quietly truncating. But you say you don't suggest > it as fix anyway... > > > >> BTW, I just realized I had been ignoring the set_arg() changes all this > >> time and focused on set_name(). So if any of the complaints are unique > >> to set_arg() please let me know. > > No, nothing unique. > > > >>> And, 3rd, it considers the risk as elevated: > >>> parameter_as_source: Note: This defect has an elevated risk because > the > >> source argument is a parameter of the current function. > >> Is this a complaint about "name" being a source argument to strcpy(). If > >> so, I don't get this one. How are you going to copy "name" without > >> specifying it as an argument to something (strcpy, strncpy, memcpy, > >> etc). Besides, it is being passed to strcpy as a const argument. Makes > >> me wonder if adding const to the parameter declarations for both > >> set_name() and enqueue() would help. > > I think coverity just considers this finding as elevated because the input > data isn't something static from inside the method but comes in as argument. > > > >>> In my opinion the points are valid, because in opt builds there would be > no length check. > >> But there is a length check in the caller. Does coverity not see checks up > the call chain? > > Obviously not. > > > >>> I really think it would be easiest to go to my proposed patch. And it > doesn't > >>> come with much cost and the place probably isn't performance relevant. > >> I'm not worried about performance. To me it has more to do with taking > >> easily to read code and changing it into something that someone would > >> stare at for a bit before figuring out what it's doing, and then ask > >> "Why so complicated?". 
Coverity is suppose to help us make our code > >> better. I don't see that being the case here. If in the end your changes > >> are the simplest approach to quieting coverity, then I guess that's what > >> we should go with. However, I'm still not convinced we really fully why > >> converity is not happy with a strcpy that can be statically shown to be > >> safe. Is is a coverity bug? Is there a call path we are missing? > >> Something else that makes it hard for coverity to statically check this? > >> That's one reason I'd like to see what happens if a check is put > >> directly in set_name. > > OK, so let me summarize: > > The code as it is right now has a little issue - which isn't obvious at a quick > glance by the way. > > It can be fixed like I suggested. This would add two lines of code at each > place and one can argue about how easy it is to understand. To me it seems > as understandable as it was before - but I'm probably a bit concerned here. > In terms of readability, I was referring back to the original code that > just had the strlen. It was the original coverity fix to that code that > introduced readability issue. You aren't really doing much to make it > less readable. > > I can suggest an alternative which might be easier to read: > http://cr.openjdk.java.net/~clanger/webrevs/8199010.1/ It comes at the > cost of 2 calls to strlen() in dbg builds but it has one line of code less and > might be more straightforward to understand. > > All larger refactoring of set_name() and set_arg() is beyond the scope of > my change. > I like this version better, although it doesn't change my opinion that > this is still all jumping through hoops to get coverity to stop > complaining about something that is perfectly fine. > > > > Now I'd really like if you could accept one of my 2 proposals, given that also > Thomas and David think it's ok. I want to get this done now. ?? Maybe you can > even sponsor it... > Yeah, I'm ok with the change. 
I've said my piece and just don't want to > get in the way of a simple fix. Yes, I can also sponsor it for you. > > cheers, > > Chris > > > > Thanks & Best regards > > Christoph > > > From yasuenag at gmail.com Mon Mar 12 14:25:18 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 12 Mar 2018 23:25:18 +0900 Subject: RFR: 8199323: hsdis could not be loaded which are located on long path In-Reply-To: References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> Message-ID: <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> Hi David, > I don't think this code has the same concern that the code in jvm_md.h claims** to have, so a simple use of MAXPATHLEN should be fine on all non-windows platforms. It sounds good to me. I updated the webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/ > My only concern with the current change is whether a 4K on stack buffer might cause any issues? In the case of HotSpot for x64 Linux, the stack size is 1MB, so I think the risk of stack overflow is very low. In fact, my environment (Fedora 27 x64) works fine with this change. Thanks, Yasumasa On 2018/03/12 13:13, David Holmes wrote: > Hi Yasumasa, > > On 8/03/2018 11:21 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> Could you review and sponsor it? >> >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.00/ >> 
JBS: https://bugs.openjdk.java.net/browse/JDK-8199323 >> Mach5 test result on submit repo: mach5-one-ysuenaga-JDK-8199323-20180308-1027-13701 >> >> I encountered DebuggerException when hsdis is located on long path as below: >> >> Location of hsdis: >> /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/jre/lib/amd64/hsdis-amd64.so >> >> Exception: >> sun.jvm.hotspot.debugger.DebuggerException: /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/j: cannot open shared object file: No such file or directory >> >> In Java_sun_jvm_hotspot_asm_Disassembler_load_1library(), buffer which uses for library path is defined as below: >> >> ``` >> char buffer[128]; >> ``` >> >> I copied JVM_MAXPATHLEN related code to sadis.c from os/posix/include/jvm_md.h and os/windows/include/jvm_md.h . > > I don't think this code has the same concern that the code in jvm_md.h claims** to have, so a simple use of MAXPATHLEN should be fine on all non-windows platforms. > > ** The posix jvm_md.h code is historical and I don't think we have to be concerned either about a 4095 definition of MAXPATHLEN or that the VM and libraries may have been compiled on different Linux versions! > > My only concern with the current change is whether a 4K on stack buffer might cause any issues? > > Thanks, > David > ----- > >> >> I added noreg-hard label on this ticket because this issue is available when disassembling on coredump. 
>> >> >> Thanks, >> >> Yasumasa From stefan.johansson at oracle.com Mon Mar 12 15:22:33 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 12 Mar 2018 16:22:33 +0100 Subject: RFR: JDK-8175312: SA: clhsdb: Provide an improved heap summary for 'universe' for G1GC In-Reply-To: References: <38d71740-0b66-3ce8-26ed-a0f2b9f9e91c@oracle.com> <5e8c582e-b32f-daf7-0e0c-1e6606ceaf3a@oracle.com> Message-ID: <59c7ec4f-f9dc-7c25-e7bf-f3fc304b5c60@oracle.com> Hi Jini, This looks good. I'm totally fine with skipping metaspace if that isn't displayed for the other GCs. Cheers, Stefan On 2018-03-09 10:29, Jini George wrote: > Here is the revised webrev: > > http://cr.openjdk.java.net/~jgeorge/8175312/webrev.02/ > > I have made modifications to have the 'universe' command display > details like: > > hsdb> universe > Heap Parameters: > garbage-first heap [0x0000000725200000, 0x00000007c0000000] region > size 1024K > G1 Heap: > regions = 2478 > capacity = 2598371328 (2478.0MB) > used = 5242880 (5.0MB) > free = 2593128448 (2473.0MB) > 0.20177562550443906% used > G1 Young Generation: > Eden Space: > regions = 5 > capacity = 8388608 (8.0MB) > used = 5242880 (5.0MB) > free = 3145728 (3.0MB) > 62.5% used > Survivor Space: > regions = 0 > capacity = 0 (0.0MB) > used = 0 (0.0MB) > free = 0 (0.0MB) > 0.0% used > G1 Old Generation: > regions = 0 > capacity = 155189248 (148.0MB) > used = 0 (0.0MB) > free = 155189248 (148.0MB) > 0.0% used > > > I did not add the metaspace details since that did not seem to be in > line with the 'universe' output for other GCs. I have added a new > command "g1regiondetails" to display the region details, and have > modified the tests accordingly. 
> > hsdb> g1regiondetails > Region Details: > Region: 0x0000000725200000,0x0000000725200000,0x0000000725300000:Free > Region: 0x0000000725300000,0x0000000725300000,0x0000000725400000:Free > Region: 0x0000000725400000,0x0000000725400000,0x0000000725500000:Free > Region: 0x0000000725500000,0x0000000725500000,0x0000000725600000:Free > Region: 0x0000000725600000,0x0000000725600000,0x0000000725700000:Free > Region: 0x0000000725700000,0x0000000725700000,0x0000000725800000:Free > ... > > Thanks, > Jini. > > > On 2/28/2018 12:56 PM, Jini George wrote: >> Thank you very much, Stefan. My answers inline. >> >> On 2/27/2018 3:30 PM, Stefan Johansson wrote: >>> Hi Jini, >> >>>>> JIRA ID:https://bugs.openjdk.java.net/browse/JDK-8175312 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.00/index.html >>>>> >>> It looks like a file is missing, did you forget to add it to the >>> changeset? >> >> Indeed, I had missed that! I added the missing file in the following >> webrev: >> >> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.01/ >> >>> --- >>> open/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1CollectedHeap.java:36: >>> error: cannot find symbol >>> import sun.jvm.hotspot.gc.shared.PrintRegionClosure; >>> --- >>> >>> Otherwise the change looks good, but I would like to see the output >>> live. For a big heap this will print a lot of data, just wondering >>> if the universe command is the correct choice for this kind of >>> output. I like having the possibility to print all regions, so I >>> want the change but maybe it should be a different command and >>> 'universe' just prints a little more than before. Something like our >>> logging heap-summary at shutdown: >>> garbage-first heap?? total 16384K, used 3072K [0x00000000ff000000, >>> 0x0000000100000000) >>> ??region size 1024K, 4 young (4096K), 0 survivors (0K) >>> Metaspace?????? used 6731K, capacity 6825K, committed 7040K, >>> reserved 1056768K >>> ??class space??? 
used 559K, capacity 594K, committed 640K, reserved >>> 1048576K >> >> Ok, will add this, and could probably have the region details >> displayed under a new command called "g1regiondetails", or some such, >> and send out a new webrev. >> >> Thanks, >> Jini. >> >>> >>> Thanks, >>> Stefan >>>>> Modifications have been made to display the regions like: >>>>> >>>>> ... >>>>> Region: 0x00000005c5400000,0x00000005c5600000,0x00000005c5600000:Old >>>>> Region: 0x00000005c5600000,0x00000005c5800000,0x00000005c5800000:Old >>>>> Region: 0x00000005c5800000,0x00000005c5a00000,0x00000005c5a00000:Old >>>>> Region: 0x00000005c5a00000,0x00000005c5c00000,0x00000005c5c00000:Old >>>>> Region: 0x00000005c5c00000,0x00000005c5c00000,0x00000005c5e00000:Free >>>>> Region: 0x00000005c5e00000,0x00000005c5e00000,0x00000005c6000000:Free >>>>> Region: 0x00000005c6000000,0x00000005c6200000,0x00000005c6200000:Old >>>>> ... >>>>> >>>>> The jtreg test at this point does not include any testing for the >>>>> display of archived or pinned regions. The testing for this will >>>>> be added once JDK-8174994 is resolved. >>>>> >>>>> The SA tests pass with jprt and Mach5. >>>>> >>>>> Thanks, >>>>> Jini. >>> From christoph.langer at sap.com Mon Mar 12 15:41:47 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Mon, 12 Mar 2018 15:41:47 +0000 Subject: Another question regarding Thread dumps Message-ID: <540c219708ee4dc9b15e6454c79e5bba@sap.com> Hi, I have another question regarding thread dumping code. At the places where thread dumps get generated (attachListener.cpp, diagnosticCommand.cpp, os.cpp), there's always a series of 3 VM operations: VM_PrintThreads, VM_PrintJNI and VM_FindDeadlocks. I'm wondering if it would make sense to do this altogether in one VM operation? Then probably the picture could be more consistent. However, I can imagine the risk that the safepoint takes too long. Are there other pros and cons I'm missing? 
I'm asking because in our JVM codebase I can find places where some of these VM ops had been combined and I'm wondering what might be the reasoning behind that and whether it makes sense to revert to the OpenJDK way of doing things or whether the changes are smart and even worth contributing. What do you think? Thanks Christoph From jini.george at oracle.com Mon Mar 12 15:52:09 2018 From: jini.george at oracle.com (Jini George) Date: Mon, 12 Mar 2018 21:22:09 +0530 Subject: RFR: JDK-8175312: SA: clhsdb: Provide an improved heap summary for 'universe' for G1GC In-Reply-To: <59c7ec4f-f9dc-7c25-e7bf-f3fc304b5c60@oracle.com> References: <38d71740-0b66-3ce8-26ed-a0f2b9f9e91c@oracle.com> <5e8c582e-b32f-daf7-0e0c-1e6606ceaf3a@oracle.com> <59c7ec4f-f9dc-7c25-e7bf-f3fc304b5c60@oracle.com> Message-ID: Thank you very much, Stefan. Could one more reviewer please take a look at it? - Jini. On 3/12/2018 8:52 PM, Stefan Johansson wrote: > Hi Jini, > > This looks good. I'm totally fine with skipping metaspace if that isn't > displayed for the other GCs. > > Cheers, > Stefan > > On 2018-03-09 10:29, Jini George wrote: >> Here is the revised webrev: >> >> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.02/ >> >> I have made modifications to have the 'universe' command display >> details like: >> >> hsdb> universe >> Heap Parameters: >> garbage-first heap [0x0000000725200000, 0x00000007c0000000] region >> size 1024K >> G1 Heap: >> regions = 2478 >> capacity = 2598371328 (2478.0MB) >> used = 5242880 (5.0MB) >> free = 2593128448 (2473.0MB) >> 0.20177562550443906% used >> G1 Young Generation: >> Eden Space: >> regions = 5 >> capacity = 8388608 (8.0MB) >> used = 5242880 (5.0MB) >> free = 3145728 (3.0MB) >> 62.5% used >> Survivor Space: >> regions = 0 >> capacity = 0 (0.0MB) >> used = 0 (0.0MB) >> free = 0 (0.0MB) >> 
0.0% used >> G1 Old Generation: >> ?? regions? = 0 >> ?? capacity = 155189248 (148.0MB) >> ?? used???? = 0 (0.0MB) >> ?? free???? = 155189248 (148.0MB) >> ?? 0.0% used >> >> >> I did not add the metaspace details since that did not seem to be in >> line with the 'universe' output for other GCs. I have added a new >> command "g1regiondetails" to display the region details, and have >> modified the tests accordingly. >> >> hsdb> g1regiondetails >> Region Details: >> Region: 0x0000000725200000,0x0000000725200000,0x0000000725300000:Free >> Region: 0x0000000725300000,0x0000000725300000,0x0000000725400000:Free >> Region: 0x0000000725400000,0x0000000725400000,0x0000000725500000:Free >> Region: 0x0000000725500000,0x0000000725500000,0x0000000725600000:Free >> Region: 0x0000000725600000,0x0000000725600000,0x0000000725700000:Free >> Region: 0x0000000725700000,0x0000000725700000,0x0000000725800000:Free >> ... >> >> Thanks, >> Jini. >> >> >> On 2/28/2018 12:56 PM, Jini George wrote: >>> Thank you very much, Stefan. My answers inline. >>> >>> On 2/27/2018 3:30 PM, Stefan Johansson wrote: >>>> Hi Jini, >>> >>>>>> JIRA ID:https://bugs.openjdk.java.net/browse/JDK-8175312 >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.00/index.html >>>>>> >>>> It looks like a file is missing, did you forget to add it to the >>>> changeset? >>> >>> Indeed, I had missed that! I added the missing file in the following >>> webrev: >>> >>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.01/ >>> >>>> --- >>>> open/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1CollectedHeap.java:36: >>>> error: cannot find symbol >>>> import sun.jvm.hotspot.gc.shared.PrintRegionClosure; >>>> --- >>>> >>>> Otherwise the change looks good, but I would like to see the output >>>> live. For a big heap this will print a lot of data, just wondering >>>> if the universe command is the correct choice for this kind of >>>> output. 
I like having the possibility to print all regions, so I >>>> want the change but maybe it should be a different command and >>>> 'universe' just prints a little more than before. Something like our >>>> logging heap-summary at shutdown: >>>> garbage-first heap?? total 16384K, used 3072K [0x00000000ff000000, >>>> 0x0000000100000000) >>>> ??region size 1024K, 4 young (4096K), 0 survivors (0K) >>>> Metaspace?????? used 6731K, capacity 6825K, committed 7040K, >>>> reserved 1056768K >>>> ??class space??? used 559K, capacity 594K, committed 640K, reserved >>>> 1048576K >>> >>> Ok, will add this, and could probably have the region details >>> displayed under a new command called "g1regiondetails", or some such, >>> and send out a new webrev. >>> >>> Thanks, >>> Jini. >>> >>>> >>>> Thanks, >>>> Stefan >>>>>> Modifications have been made to display the regions like: >>>>>> >>>>>> ... >>>>>> Region: 0x00000005c5400000,0x00000005c5600000,0x00000005c5600000:Old >>>>>> Region: 0x00000005c5600000,0x00000005c5800000,0x00000005c5800000:Old >>>>>> Region: 0x00000005c5800000,0x00000005c5a00000,0x00000005c5a00000:Old >>>>>> Region: 0x00000005c5a00000,0x00000005c5c00000,0x00000005c5c00000:Old >>>>>> Region: 0x00000005c5c00000,0x00000005c5c00000,0x00000005c5e00000:Free >>>>>> Region: 0x00000005c5e00000,0x00000005c5e00000,0x00000005c6000000:Free >>>>>> Region: 0x00000005c6000000,0x00000005c6200000,0x00000005c6200000:Old >>>>>> ... >>>>>> >>>>>> The jtreg test at this point does not include any testing for the >>>>>> display of archived or pinned regions. The testing for this will >>>>>> be added once JDK-8174994 is resolved. >>>>>> >>>>>> The SA tests pass with jprt and Mach5. >>>>>> >>>>>> Thanks, >>>>>> Jini. 
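
For readers who want to post-process the region lines quoted in this thread, here is a minimal sketch of a parser. It is not part of the webrev. The class name `RegionLine` and its methods are hypothetical, and the field order (bottom, top, end) is an assumption inferred from the samples, where Free regions have the first two addresses equal and full Old regions have the last two equal.

```java
// Hypothetical helper, not SA code: reads one "Region: ..." line of the sample output.
public class RegionLine {
    public final long bottom;  // assumed: start address of the region
    public final long top;     // assumed: current allocation pointer
    public final long end;     // assumed: end address of the region
    public final String type;  // text after the last ':', e.g. "Old" or "Free"

    private RegionLine(long bottom, long top, long end, String type) {
        this.bottom = bottom;
        this.top = top;
        this.end = end;
        this.type = type;
    }

    // Parses a line of the form "Region: 0x...,0x...,0x...:Type".
    public static RegionLine parse(String line) {
        String body = line.trim();
        if (!body.startsWith("Region:")) {
            throw new IllegalArgumentException("not a region line: " + line);
        }
        body = body.substring("Region:".length()).trim();
        int colon = body.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing region type: " + line);
        }
        String[] addrs = body.substring(0, colon).split(",");
        if (addrs.length != 3) {
            throw new IllegalArgumentException("expected three addresses: " + line);
        }
        return new RegionLine(hex(addrs[0]), hex(addrs[1]), hex(addrs[2]),
                              body.substring(colon + 1).trim());
    }

    // Converts "0x..." text to a long, treating the value as unsigned.
    private static long hex(String s) {
        s = s.trim();
        if (s.startsWith("0x") || s.startsWith("0X")) {
            s = s.substring(2);
        }
        return Long.parseUnsignedLong(s, 16);
    }

    public long usedBytes() {
        return top - bottom;
    }

    public long capacityBytes() {
        return end - bottom;
    }
}
```

Under that assumption, the Free lines above report zero used bytes and the Old lines report a fully used region.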
>>>> > From chris.plummer at oracle.com Mon Mar 12 15:56:35 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 12 Mar 2018 08:56:35 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Message-ID: <865d412a-f05f-8e58-a6b2-11d8accf00d3@oracle.com> On 3/12/18 3:27 AM, Langer, Christoph wrote: > Hi Chris, > >> Hi Chris, >> >> On 10/03/2018 6:46 AM, Chris Plummer wrote: >>> Hello, >>> >>> Please help review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>> >>> In the end there were two issues. The first was that the >>> pb.redirectError() call was redirecting the LingeredApp's stderr to the >>> console, which we don't want. The second was that nothing was capturing >>> the LingeredApp's output and sending it to the driver app's output (jtr >>> file). These changes make all the LingeredApp's output end up in the jtr >>> file. >> It isn't clear to me how the interleaving of the two streams by the two >> threads is handled in the copy routine. Are we guaranteed to get >> complete lines of output from each stream before writing to System.out? > Would perhaps the use of a BufferedReader in this place be appropriate, using readLine()? Hi Christoph, That would be an improvement to the interleaving of stderr and stdout, although there could still be issues. For example, if the test intentionally left out newlines as it built a long line that might take a while to construct (think of printing a "." each second or something like that). Also, if the last line was missing a newline, it would never be printed. thanks, Chris > > Another small remark: The indentation of line 361" } catch (IOException e) {" seems too deep. 
> > Best regards > Christoph From chris.plummer at oracle.com Mon Mar 12 15:53:16 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 12 Mar 2018 08:53:16 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Message-ID: On 3/11/18 7:52 PM, David Holmes wrote: > Hi Chris, > > On 10/03/2018 6:46 AM, Chris Plummer wrote: >> Hello, >> >> Please help review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8198655 >> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >> >> In the end there were two issues. The first was that the >> pb.redirectError() call was redirecting the LingeredApp's stderr to >> the console, which we don't want. The second was that nothing was >> capturing the LingeredApp's output and sending it to the driver app's >> output (jtr file). These changes make all the LingeredApp's output >> end up in the jtr file. > > It isn't clear to me how the interleaving of the two streams by the > two threads is handled in the copy routine. Are we guaranteed to get > complete lines of output from each stream before writing to System.out? Hi David, I'm hoping Igor will chime in here, since this is just cloned from some closed code he wrote, and he recommended this fix. Perhaps we are just doing something a bit non standard here. When spawning a separate test process, don't we normally just dump stdout and stderr separately via OutputAnalyzer.reportDiagnosticSummary() after the test completes, and then only if there is an error. I'm not sure why Igor felt LingeredApp tests should be handled differently. thanks, Chris > > Thanks, > David > ----- > >> Tested by running all tests that use LingeredApp. 
>>
>> thanks,
>>
>> Chris

From chris.plummer at oracle.com Mon Mar 12 16:52:15 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 12 Mar 2018 09:52:15 -0700
Subject: PING: Re: RFR: JDK-8193369: post_field_access does not work for some functions, possibly related to fast_getfield
In-Reply-To: <72cd98ab-1034-c3c4-80cc-d32d8a512fef@oracle.com>
References: <1fca6b67-c0d1-db03-52ed-f2c6bcc29a5b@oracle.com> <91aadc35-125a-bf74-6cf5-672dc77ffb22@oracle.com> <3df69fad-c0d8-5667-a61a-f88a83e26d89@oracle.com> <74eacea4-a3c0-a35d-047b-1478b7d46c87@oracle.com> <72cd98ab-1034-c3c4-80cc-d32d8a512fef@oracle.com>
Message-ID: <077a8d3f-badf-fd68-b889-5604d4340891@oracle.com>

Hi Alex,

Please update the copyright date in jvmtiManageCapabilities.cpp.

The following is where you added your fix:

 315   if (avail.can_generate_breakpoint_events
 316        || avail.can_generate_field_access_events
 317        || avail.can_generate_field_modification_events)
 318   {
 319     RewriteFrequentPairs = false;
 320   }

Although this addresses the problem, in general I think this approach is error prone since it requires knowledge of which bytecode pairs might be rewritten, and the impact they may have on JVMTI. But that's a pre-existing issue with this code, not something I'd expect you to fix with this CR, so looks good.

The test case also looks good.

thanks,

Chris

On 3/8/18 1:48 PM, serguei.spitsyn at oracle.com wrote:
> Hey guys,
>
> One more review is needed for this fix!
>
> Thanks,
> Serguei
>
>
> On 3/5/18 09:58, serguei.spitsyn at oracle.com wrote:
>> Hi Alex,
>>
>> It looks good.
>> Thank you for the update!
>>
>> Thanks,
>> Serguei
>>
>> On 3/1/18 10:53, Alex Menkov wrote:
>>> Hi Serguei,
>>>
>>> Thank you for the feedback.
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev.01/
>>>
>>> See inline for comments for your notes.
>>>
>>> On 02/27/2018 23:08, serguei.spitsyn at oracle.com wrote:
>>>> Hi Alex,
>>>>
>>>> Thank you for taking care about this!
>>>> The fix looks good to me.
>>>>
>>>> Some comments on the test.
>>>>
>>>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/FieldAccessWatch.java.html
>>>>
>>>> There are some commented lines in the TestResult class.
>>>> A cleanup is needed to delete them.
>>>> I guess, it is already in your plan.
>>>
>>> I deleted couple lines, keeping comment for fields
>>>> The empty line #135 is not needed.
>>>> An empty line is needed after the L99.
>>>
>>> fixed.
>>>> Probably, the intention was to spell "startTest" instead of
>>>> "initTest" below:
>>>>
>>>>   119         if (!startTest(result)) {
>>>>   120             throw new RuntimeException("initTest failed");
>>>>   121         }
>>>
>>> fixed.
>>>> I wonder if this sleep is really needed:
>>>>   124 Thread.sleep(500);
>>>>
>>>> The "action.apply()" is executed synchronously, is not it?
>>>
>>> But notifications are asynchronous, so this helps to avoid test
>>> failures if some events are delivered a bit later in loaded
>>> environment.
>>> Also this helps to avoid mess of native and java logging
>>>> I'm thinking if moving the test() to native side would simplify
>>>> things.
>>>
>>> To me it's simpler and more flexible to perform required actions in
>>> Java, native part only handles notifications.
>>>> An Exception can be thrown from native if the test failed or just a
>>>> boolean status returned.
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/test/hotspot/jtreg/serviceability/jvmti/FieldAccessWatch/libFieldAccessWatch.c.html
>>>>
>>>> I'd suggest to rename currentTestResults to testResultObject,
>>>> so it will be in line with testResultClass.
>>>
>>> fixed.
>>>> One concern is that the reportError() does not cause the test
>>>> to fail and does not break the execution.
>>>> Would it be better to throw an exception with the same message as was
>>>> printed?
>>>
>>> Updated several cases (immediate return from callbacks if something
>>> went wrong).
>>> Note that reportError is called from native Java methods and from
>>> JVMTI callbacks, so throwing an exception doesn't look right.
>>>> It seems, the function tagAndWatch() adds some complexity to the code.
>>>> Is all this really needed? Could you, please, add some comments.
>>>> It does not seem this function tags anything.
>>>
>>> renamed the function, added short function description.
>>>>   168 (*jvmti)->Deallocate(jvmti, (unsigned char*)sig);
>>>>
>>>>   The sig needs to be cleared after deallocation as it is used and
>>>> checked in a loop.
>>>
>>> Moved the variable to the correct scope.
>>>> Missed initializations:
>>>>
>>>>    68     char *name;
>>>>   142         jfieldID* klassFields;
>>>>   143         jint fieldCount;
>>>
>>> Fixed.
>>>
>>> --alex
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 2/26/18 14:43, Alex Menkov wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review a fix for
>>>>> JDK-8193369: post_field_access does not work for some functions,
>>>>> possibly related to fast_getfield
>>>>>
>>>>> The fix disables "fast" command generation when FieldAccess or
>>>>> FieldModification notifications are requested.
>>>>> >>>>> jira: https://bugs.openjdk.java.net/browse/JDK-8193369 >>>>> webrev: http://cr.openjdk.java.net/~amenkov/fast_field_access/webrev/ >>>>> >>>>> --alex > From igor.ignatyev at oracle.com Mon Mar 12 20:26:24 2018 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 12 Mar 2018 13:26:24 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> Message-ID: <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> > On Mar 12, 2018, at 8:53 AM, Chris Plummer wrote: > > On 3/11/18 7:52 PM, David Holmes wrote: >> Hi Chris, >> >> On 10/03/2018 6:46 AM, Chris Plummer wrote: >>> Hello, >>> >>> Please help review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>> >>> In the end there were two issues. The first was that the pb.redirectError() call was redirecting the LingeredApp's stderr to the console, which we don't want. The second was that nothing was capturing the LingeredApp's output and sending it to the driver app's output (jtr file). These changes make all the LingeredApp's output end up in the jtr file. >> >> It isn't clear to me how the interleaving of the two streams by the two threads is handled in the copy routine. Are we guaranteed to get complete lines of output from each stream before writing to System.out? > Hi David, > > I'm hoping Igor will chime in here, since this is just cloned from some closed code he wrote, and he recommended this fix. Perhaps we are just doing something a bit non standard here. When spawning a separate test process, don't we normally just dump stdout and stderr separately via OutputAnalyzer.reportDiagnosticSummary() after the test completes, and then only if there is an error. I'm not sure why Igor felt LingeredApp tests should be handled differently. 
I recommended this fix as one of the possibilities and never claimed it's the best solution ;) I don't know much about the LingeredApp tests, so I just suggested a patch which only solves the problem I noticed (LingeredApp's cerr being printed into the jtreg agent's cerr).

OutputAnalyzer might not fit the use case of LingeredApp because it blocks till the process is finished. Again, I don't know much about LingeredApp itself and the tests which use it.

Answering David's question: the copy routine handles interleaving of the two streams the same way the printf routine does, that is, it does not handle it at all. We write to System.out as we read data; the only guarantee we have is that all the bytes we read into the buffer will be written together (which might mean 1 byte at a time), with no guarantees about lines. The behavior is pretty much what you would expect from an interactive shell with both cout and cerr being printed on the console.

Thanks,
-- Igor

>
> thanks,
>
> Chris
>>
>> Thanks,
>> David
>> -----
>>
>>> Tested by running all tests that use LingeredApp.
>>>
>>> thanks,
>>>
>>> Chris
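
The line-interleaving question raised in this thread can be illustrated with a minimal sketch. It is not the copy routine from the webrev. The class name `LinePump`, its `pump` and `await` methods, and the prefix parameter are hypothetical. The sketch assumes one daemon thread per stream, each reading with `BufferedReader.readLine()` and writing whole lines under a shared lock, so stdout and stderr never mix within a line. Relative ordering between the two streams is still not guaranteed, and a partial line is held back until a line terminator or end of stream arrives.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the code under review in the webrev.
public class LinePump {
    // Shared lock so that two pumps never write into the middle of each other's line.
    private static final Object LOCK = new Object();

    // Starts a daemon thread that copies 'in' to 'out' one complete line at a time.
    public static Thread pump(InputStream in, PrintStream out, String prefix) {
        Thread t = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns only at a line terminator or at end of stream,
                // so a partial line is held back until one of those arrives.
                while ((line = reader.readLine()) != null) {
                    synchronized (LOCK) {
                        out.println(prefix + line);
                    }
                }
            } catch (IOException e) {
                // The stream was closed (for example the child exited); stop copying.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Waits for the given pump threads to reach end of stream.
    public static void await(Thread... pumps) {
        for (Thread t : pumps) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

With a spawned `Process p`, one might call `LinePump.pump(p.getInputStream(), System.out, "[stdout] ")` and `LinePump.pump(p.getErrorStream(), System.out, "[stderr] ")`. The prefix is only there to make the source of each line visible; whether line buffering is acceptable depends on the caveats about missing newlines mentioned earlier in the thread.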

From chris.plummer at oracle.com Mon Mar 12 20:51:12 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 12 Mar 2018 13:51:12 -0700
Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr
In-Reply-To: <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com>
References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com>
Message-ID: <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com>

Hi Igor,

On 3/12/18 1:26 PM, Igor Ignatyev wrote:
>
>
>> On Mar 12, 2018, at 8:53 AM, Chris Plummer wrote:
>>
>> On 3/11/18 7:52 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> On 10/03/2018 6:46 AM, Chris Plummer wrote:
>>>> Hello,
>>>>
>>>> Please help review the following:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8198655
>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/
>>>>
>>>> In the end there were two issues. The first was that the
>>>> pb.redirectError() call was redirecting the LingeredApp's stderr to
>>>> the console, which we don't want. The second was that nothing was
>>>> capturing the LingeredApp's output and sending it to the driver
>>>> app's output (jtr file). These changes make all the LingeredApp's
>>>> output end up in the jtr file.
>>>
>>> It isn't clear to me how the interleaving of the two streams by the
>>> two threads is handled in the copy routine. Are we guaranteed to get
>>> complete lines of output from each stream before writing to System.out?
>> Hi David,
>>
>> I'm hoping Igor will chime in here, since this is just cloned from
>> some closed code he wrote, and he recommended this fix. Perhaps we
>> are just doing something a bit non standard here. When spawning a
>> separate test process, don't we normally just dump stdout and stderr
>> separately via OutputAnalyzer.reportDiagnosticSummary() after the
>> test completes, and then only if there is an error. I'm not sure why
>> Igor felt LingeredApp tests should be handled differently.
>
> I recommended this fix as one of possibilities and never claimed it's
> the best solution ;) I don't know much of LingeredApp tests, so I
> just suggested the patch which only solves the problem I noticed
> (LingeredApp's cerr being printed into jtreg agent's cerr).

If by "into jtreg agent's cerr" you are referring to the presence of JDWP "ERROR" messages in the jtreg console, that is fixed simply by removing the following:

   pb.redirectError(ProcessBuilder.Redirect.INHERIT);

And that is already part of this fix. But it actually makes the ERROR messages completely disappear. The copy() part of the fix makes all the LingeredApp output appear in the .jtr file (including the JDWP "ERROR" messages).

> OutputAnalyzer might not fit the use case of LingeredApp b/c it blocks
> till the process is finished. again, I don't know much about
> LingeredApp itself and the tests which use it.

My point was that other jtreg tests that use ProcessBuilder and OutputAnalyzer don't print out anything from the spawned process/app until it is done, and even then usually only if there was a test failure. Why is there a need here to print out messages as they are generated?

>
> answering to David's question, copy routine handles interleaving of
> two streams similarly to printf routine, it does not do that at all.
> we are writing to System.out as we read data, the only guarantee we
> have is all the bytes we read into buffer will be written together
> (which might mean 1 byte at a time), no guarantees about lines. the
> behavior is pretty much the same as you expect to get from an
> interactive shell w/ both cout and cerr are being printed on the console.

Not quite. If a single threaded app is sending to both System.out and System.err (and/or stdout and stderr) and does something to ensure flushing after each print or each line, then the output should appear cleanly in the order executed.
By having two different threads read these two streams, order might be changed, and possibly even interleaved within lines. cheers, Chris > > Thanks, > -- Igor >> >> thanks, >> >> Chris >>> >>> Thanks, >>> David >>> ----- >>> >>>> Tested by running all tests that use LingeredApp. >>>> >>>> thanks, >>>> >>>> Chris > From yumin.qi at gmail.com Tue Mar 13 00:54:44 2018 From: yumin.qi at gmail.com (yumin qi) Date: Mon, 12 Mar 2018 17:54:44 -0700 Subject: RFR: JDK-8175312: SA: clhsdb: Provide an improved heap summary for 'universe' for G1GC In-Reply-To: References: <38d71740-0b66-3ce8-26ed-a0f2b9f9e91c@oracle.com> <5e8c582e-b32f-daf7-0e0c-1e6606ceaf3a@oracle.com> <59c7ec4f-f9dc-7c25-e7bf-f3fc304b5c60@oracle.com> Message-ID: Jini, Looks good. One minor comment: + public void printG1HeapSummary(G1CollectedHeap heap) {+ G1CollectedHeap g1h = (G1CollectedHeap) heap; 'heap' has been cast to 'G1CollectedHeap' at call site, so seems no need to convert here again. Thanks Yumin On Mon, Mar 12, 2018 at 8:52 AM, Jini George wrote: > Thank you very much, Stefan. Could one more reviewer please take a look at > it ? > > - Jini. > > > On 3/12/2018 8:52 PM, Stefan Johansson wrote: > >> Hi Jini, >> >> This looks good. I'm totally fine with skipping metaspace if that isn't >> displayed for the other GCs. 
>> >> Cheers, >> Stefan >> >> On 2018-03-09 10:29, Jini George wrote: >> >>> Here is the revised webrev: >>> >>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.02/ >>> >>> I have made modifications to have the 'universe' command display details >>> like: >>> >>> hsdb> universe >>> Heap Parameters: >>> garbage-first heap [0x0000000725200000, 0x00000007c0000000] region size >>> 1024K >>> G1 Heap: >>> regions = 2478 >>> capacity = 2598371328 (2478.0MB) >>> used = 5242880 (5.0MB) >>> free = 2593128448 (2473.0MB) >>> 0.20177562550443906% used >>> G1 Young Generation: >>> Eden Space: >>> regions = 5 >>> capacity = 8388608 (8.0MB) >>> used = 5242880 (5.0MB) >>> free = 3145728 (3.0MB) >>> 62.5% used >>> Survivor Space: >>> regions = 0 >>> capacity = 0 (0.0MB) >>> used = 0 (0.0MB) >>> free = 0 (0.0MB) >>> 0.0% used >>> G1 Old Generation: >>> regions = 0 >>> capacity = 155189248 (148.0MB) >>> used = 0 (0.0MB) >>> free = 155189248 (148.0MB) >>> 0.0% used >>> >>> >>> I did not add the metaspace details since that did not seem to be in >>> line with the 'universe' output for other GCs. I have added a new command >>> "g1regiondetails" to display the region details, and have modified the >>> tests accordingly. >>> >>> hsdb> g1regiondetails >>> Region Details: >>> Region: 0x0000000725200000,0x0000000725200000,0x0000000725300000:Free >>> Region: 0x0000000725300000,0x0000000725300000,0x0000000725400000:Free >>> Region: 0x0000000725400000,0x0000000725400000,0x0000000725500000:Free >>> Region: 0x0000000725500000,0x0000000725500000,0x0000000725600000:Free >>> Region: 0x0000000725600000,0x0000000725600000,0x0000000725700000:Free >>> Region: 0x0000000725700000,0x0000000725700000,0x0000000725800000:Free >>> ... >>> >>> Thanks, >>> Jini. >>> >>> >>> On 2/28/2018 12:56 PM, Jini George wrote: >>> >>>> Thank you very much, Stefan. My answers inline. 
>>>> >>>> On 2/27/2018 3:30 PM, Stefan Johansson wrote: >>>> >>>>> Hi Jini, >>>>> >>>> >>>> JIRA ID:https://bugs.openjdk.java.net/browse/JDK-8175312 >>>>>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8175312/webrev.00/index. >>>>>>> html >>>>>>> >>>>>>> It looks like a file is missing, did you forget to add it to the >>>>> changeset? >>>>> >>>> >>>> Indeed, I had missed that! I added the missing file in the following >>>> webrev: >>>> >>>> http://cr.openjdk.java.net/~jgeorge/8175312/webrev.01/ >>>> >>>> --- >>>>> open/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1CollectedHeap.java:36: >>>>> error: cannot find symbol >>>>> import sun.jvm.hotspot.gc.shared.PrintRegionClosure; >>>>> --- >>>>> >>>>> Otherwise the change looks good, but I would like to see the output >>>>> live. For a big heap this will print a lot of data, just wondering if the >>>>> universe command is the correct choice for this kind of output. I like >>>>> having the possibility to print all regions, so I want the change but maybe >>>>> it should be a different command and 'universe' just prints a little more >>>>> than before. Something like our logging heap-summary at shutdown: >>>>> garbage-first heap total 16384K, used 3072K [0x00000000ff000000, >>>>> 0x0000000100000000) >>>>> region size 1024K, 4 young (4096K), 0 survivors (0K) >>>>> Metaspace used 6731K, capacity 6825K, committed 7040K, reserved >>>>> 1056768K >>>>> class space used 559K, capacity 594K, committed 640K, reserved >>>>> 1048576K >>>>> >>>> >>>> Ok, will add this, and could probably have the region details displayed >>>> under a new command called "g1regiondetails", or some such, and send out a >>>> new webrev. >>>> >>>> Thanks, >>>> Jini. >>>> >>>> >>>>> Thanks, >>>>> Stefan >>>>> >>>>>> Modifications have been made to display the regions like: >>>>>>> >>>>>>> ... 
>>>>>>> Region: 0x00000005c5400000,0x00000005c5600000,0x00000005c5600000:Old >>>>>>> Region: 0x00000005c5600000,0x00000005c5800000,0x00000005c5800000:Old >>>>>>> Region: 0x00000005c5800000,0x00000005c5a00000,0x00000005c5a00000:Old >>>>>>> Region: 0x00000005c5a00000,0x00000005c5c00000,0x00000005c5c00000:Old >>>>>>> Region: 0x00000005c5c00000,0x00000005c5c00000,0x00000005c5e00000: >>>>>>> Free >>>>>>> Region: 0x00000005c5e00000,0x00000005c5e00000,0x00000005c6000000: >>>>>>> Free >>>>>>> Region: 0x00000005c6000000,0x00000005c6200000,0x00000005c6200000:Old >>>>>>> ... >>>>>>> >>>>>>> The jtreg test at this point does not include any testing for the >>>>>>> display of archived or pinned regions. The testing for this will be added >>>>>>> once JDK-8174994 is resolved. >>>>>>> >>>>>>> The SA tests pass with jprt and Mach5. >>>>>>> >>>>>>> Thanks, >>>>>>> Jini. >>>>>>> >>>>>> >>>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Tue Mar 13 04:36:05 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 13 Mar 2018 14:36:05 +1000 Subject: Another question regarding Thread dumps In-Reply-To: <540c219708ee4dc9b15e6454c79e5bba@sap.com> References: <540c219708ee4dc9b15e6454c79e5bba@sap.com> Message-ID: On 13/03/2018 1:41 AM, Langer, Christoph wrote: > Hi, > > I have another question regarding thread dumping code. > > At the places where thread dumps get generated (attachListener.cpp, diagnosticCommand.cpp, os.cpp), there's always a series of 3 VM operations: VM_PrintThreads, VM_PrintJNI and VM_FindDeadlocks. I'm wondering if it would make sense to do this altogether in one VM operation? Then probably the picture could be more consistent. However, I can imagine the risk that the safepoint takes too long. Are there other pros and cons I'm missing? 
>
> I'm asking because in our JVM codebase I can find places where some of these VM ops had been combined and I'm wondering what might be the reasoning behind that and whether it makes sense to revert to the OpenJDK way of doing things or whether the changes are smart and even worth contributing. What do you think?

VM_FindDeadlocks is also used stand-alone in jmm_FindDeadlockedThreads.

I think they are logically three distinct operations. And one really long safepoint could be quite problematic. You'd need extensive real-life benchmarking of the impact on real apps that use active monitoring before being able to make this change. This seems to me like a "if it ain't broke ..." situation.

Cheers,
David

> Thanks
> Christoph
>

From david.holmes at oracle.com Tue Mar 13 07:38:08 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 13 Mar 2018 17:38:08 +1000
Subject: RFR: 8199323: hsdis could not be loaded which are located on long path
In-Reply-To: <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com>
References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com>
Message-ID: <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>

Looks fine to me. Just need a second review.

And if you use the new submit-hs repo [1] to do pre-push testing you can push this yourself.

Thanks,
David

[1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-March/030656.html

On 13/03/2018 12:25 AM, Yasumasa Suenaga wrote:
> Hi David,
>
>> I don't think this code has the same concern that the code in jvm_md.h
>> claims** to have, so a simple use of MAXPATHLEN should be fine on all
>> non-windows platforms.
>
> It sounds good to me. I updated webrev:
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/
>
>
>> My only concern with the current change is whether a 4K on stack
>> buffer might cause any issues?
>
> In case of HotSpot for x64 Linux, stack size is 1MB. So I think the risk
> of stack overflow is very low.
> In fact, my environment (Fedora 27 x64) works fine with this change.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/03/12 13:13, David Holmes wrote:
>> Hi Yasumasa,
>>
>> On 8/03/2018 11:21 PM, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Could you review and sponsor it?
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.00/
>>> JBS:
>>> https://bugs.openjdk.java.net/browse/JDK-8199323
>>> Mach5 test result on submit repo:
>>> mach5-one-ysuenaga-JDK-8199323-20180308-1027-13701
>>>
>>> I encountered DebuggerException when hsdis is located on long path as
>>> below:
>>>
>>> Location of hsdis:
>>> /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/jre/lib/amd64/hsdis-amd64.so
>>>
>>>
>>> Exception:
>>> sun.jvm.hotspot.debugger.DebuggerException:
>>> /home/yasuenag/work/xxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxxxx/workspace/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/j:
>>> cannot open shared object file: No such file or directory
>>>
>>> In Java_sun_jvm_hotspot_asm_Disassembler_load_1library(), buffer
>>> which uses for library path is defined as below:
>>>
>>> ```
>>> char buffer[128];
>>> ```
>>>
>>> I copied JVM_MAXPATHLEN related code to sadis.c from
>>> os/posix/include/jvm_md.h and os/windows/include/jvm_md.h .
>>
>> I don't think this code has the same concern that the code in jvm_md.h
>> claims** to have, so a simple use of MAXPATHLEN should be fine on all
>> non-windows platforms.
>>
>> ** The posix jvm_md.h code is historical and I don't think we have to
>> be concerned either about a 4095 definition of MAXPATHLEN or that the
>> VM and libraries may have been compiled on different Linux versions!
>>
>> My only concern with the current change is whether a 4K on stack
>> buffer might cause any issues?
>> >> Thanks, >> David >> ----- >> >>> >>> I added noreg-hard label on this ticket because this issue is >>> available when disassembling on coredump. >>> >>> >>> Thanks, >>> >>> Yasumasa From yasuenag at gmail.com Tue Mar 13 08:26:36 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 13 Mar 2018 17:26:36 +0900 Subject: RFR: 8199323: hsdis could not be loaded which are located on long path In-Reply-To: <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com> References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com> Message-ID: Thanks David! I've run test on submit-hs repo, but I received 1 failure: mach5-one-ysuenaga-JDK-8199323-20180313-0429-14193 java/lang/invoke/condy/CondyInterfaceWithOverpassMethods.java windows-x64 Error: failed to clean up files after test I guess the failure does not relate to this change. After getting second reviewer, I re-run the test before pushing. Yasumasa 2018-03-13 16:38 GMT+09:00 David Holmes : > Looks fine to me. Just need a second review. > > And if you use the new submit-hs repo [1] to do pre-push testing you can > push this yourself. > > Thanks, > David > > [1] > http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-March/030656.html > > > On 13/03/2018 12:25 AM, Yasumasa Suenaga wrote: >> >> Hi David, >> >>> I don't think this code has the same concern that the code in jvm_md.h >>> claims** to have, so a simple use of MAXPATHLEN should be fine on all >>> non-windows platforms. >> >> >> It sounds good to me. I updated webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/ >> >> >>> My only concern with the current change is whether a 4K on stack buffer >>> might cause any issues? >> >> >> In case of HotSpot for x64 Linux, stack size is 1MB. So I think the risk >> of stack overflow is very low. >> In fact, my environment (Fedora 27 x64) works fine with this change. 
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa

From thomas.stuefe at gmail.com Tue Mar 13 08:34:25 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 13 Mar 2018 09:34:25 +0100
Subject: RFR: 8199323: hsdis could not be loaded which are located on long path
In-Reply-To: <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>
References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>
Message-ID:

Hi Yasumasa,

looks fine to me too. Thank you for fixing.

Like David, not a big fan of the array allocation on the stack, but it will
probably be okay. Let's hope no one changes JVM_MAXPATHLEN.

Best Regards, Thomas

On Tue, Mar 13, 2018 at 8:38 AM, David Holmes wrote:

> Looks fine to me. Just need a second review.
>
> And if you use the new submit-hs repo [1] to do pre-push testing you can
> push this yourself.
>
> Thanks,
> David
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-March/030656.html
>
>
> On 13/03/2018 12:25 AM, Yasumasa Suenaga wrote:
>
>> Hi David,
>>
>>> I don't think this code has the same concern that the code in jvm_md.h
>>> claims** to have, so a simple use of MAXPATHLEN should be fine on all
>>> non-windows platforms.
>>
>> It sounds good to me. I updated webrev:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/
>>
>>> My only concern with the current change is whether a 4K on stack buffer
>>> might cause any issues?
>>
>> In case of HotSpot for x64 Linux, stack size is 1MB. So I think the risk
>> of stack overflow is very low.
>> In fact, my environment (Fedora 27 x64) works fine with this change.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2018/03/12 13:13, David Holmes wrote:
>>
>>> Hi Yasumasa,
>>>
>>> On 8/03/2018 11:21 PM, Yasumasa Suenaga wrote:
>>>
>>>> Hi all,
>>>>
>>>> Could you review and sponsor it?
From yasuenag at gmail.com Tue Mar 13 08:49:42 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Tue, 13 Mar 2018 17:49:42 +0900
Subject: RFR: 8199323: hsdis could not be loaded which are located on long path
In-Reply-To:
References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>
Message-ID:

Thanks Thomas!

I did not understand cause of failure in CondyInterfaceWithOverpassMethods.java.
Can you share Mach5 report?

I want to know I can push this change now.


Yasumasa


2018-03-13 17:34 GMT+09:00 Thomas Stüfe:
> Hi Yasumasa,
> looks fine to me too. Thank you for fixing.
>
> Like David, not a big fan of the array allocation on the stack, but it will
> probably be okay. Let's hope no one changes JVM_MAXPATHLEN.
>
> Best Regards, Thomas
>
> On Tue, Mar 13, 2018 at 8:38 AM, David Holmes
> wrote:
>>
>> Looks fine to me. Just need a second review.
>>
>> And if you use the new submit-hs repo [1] to do pre-push testing you can
>> push this yourself.
>>
>> Thanks,
>> David
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-March/030656.html
>>
>>
>> On 13/03/2018 12:25 AM, Yasumasa Suenaga wrote:
>>>
>>> Hi David,
>>>
>>>> I don't think this code has the same concern that the code in jvm_md.h
>>>> claims** to have, so a simple use of MAXPATHLEN should be fine on all
>>>> non-windows platforms.
>>>
>>>
>>> It sounds good to me. I updated webrev:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/
>>>
>>>
>>>> My only concern with the current change is whether a 4K on stack buffer
>>>> might cause any issues?
>>>
>>>
>>> In case of HotSpot for x64 Linux, stack size is 1MB. So I think the risk
>>> of stack overflow is very low.
>>> In fact, my environment (Fedora 27 x64) works fine with this change.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>
>>>>> I added noreg-hard label on this ticket because this issue is available
>>>>> when disassembling on coredump.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>
>

From thomas.stuefe at gmail.com Tue Mar 13 08:59:23 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 13 Mar 2018 09:59:23 +0100
Subject: RFR: 8199323: hsdis could not be loaded which are located on long path
In-Reply-To:
References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>
Message-ID:

Hi Yasumasa,

On Tue, Mar 13, 2018 at 9:49 AM, Yasumasa Suenaga wrote:

> Thanks Thomas!
>
> I did not understand cause of failure in CondyInterfaceWithOverpassMethods.java.
> Can you share Mach5 report?
>

Sorry, unfortunately no. I'm a reviewer, but not from Oracle :) You could
send a mail to ops at openjdk.java.net. In general, getting detailed test
error information from these new submit repos has been very difficult.

> I want to know I can push this change now.
>
>
> Yasumasa
>

Best Regards, Thomas

>
> 2018-03-13 17:34 GMT+09:00 Thomas Stüfe:
> > Hi Yasumasa,
> > looks fine to me too. Thank you for fixing.
> >
> > Like David, not a big fan of the array allocation on the stack, but it will
> > probably be okay. Let's hope no one changes JVM_MAXPATHLEN.
> >
> > Best Regards, Thomas
> >
> > On Tue, Mar 13, 2018 at 8:38 AM, David Holmes
> > wrote:
> >>
> >> Looks fine to me. Just need a second review.
> >>
> >> And if you use the new submit-hs repo [1] to do pre-push testing you can
> >> push this yourself.
From david.holmes at oracle.com Tue Mar 13 11:22:17 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 13 Mar 2018 21:22:17 +1000
Subject: RFR: 8199323: hsdis could not be loaded which are located on long path
In-Reply-To:
References: <3fc35990-88b5-475c-f89e-226b862cab08@gmail.com> <5c53013a-2225-e1a1-0ab8-eb4638dfbd7d@gmail.com> <95bd7747-b8ee-1450-c2ef-a8afde5c0b71@oracle.com>
Message-ID:

On 13/03/2018 6:26 PM, Yasumasa Suenaga wrote:
> Thanks David!
>
> I've run test on submit-hs repo, but I received 1 failure:
>
> mach5-one-ysuenaga-JDK-8199323-20180313-0429-14193
> java/lang/invoke/condy/CondyInterfaceWithOverpassMethods.java
> windows-x64
> Error: failed to clean up files after test
>
> I guess the failure does not relate to this change.

No. "failed to clean up files after test" is something we tend to see a
bit on windows.

> After getting second reviewer, I re-run the test before pushing.

Ok.

David

>
> Yasumasa
>
>
>
> 2018-03-13 16:38 GMT+09:00 David Holmes:
>> Looks fine to me. Just need a second review.
>>
>> And if you use the new submit-hs repo [1] to do pre-push testing you can
>> push this yourself.
>>
>> Thanks,
>> David
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-March/030656.html
>>
>>
>> On 13/03/2018 12:25 AM, Yasumasa Suenaga wrote:
>>>
>>> Hi David,
>>>
>>>> I don't think this code has the same concern that the code in jvm_md.h
>>>> claims** to have, so a simple use of MAXPATHLEN should be fine on all
>>>> non-windows platforms.
>>>
>>>
>>> It sounds good to me. I updated webrev:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199323/webrev.01/
>>>
>>>
>>>> My only concern with the current change is whether a 4K on stack buffer
>>>> might cause any issues?
>>>
>>>
>>> In case of HotSpot for x64 Linux, stack size is 1MB. So I think the risk
>>> of stack overflow is very low.
>>> In fact, my environment (Fedora 27 x64) works fine with this change.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>
>>>>> I added noreg-hard label on this ticket because this issue is available
>>>>> when disassembling on coredump.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa

From jcbeyler at google.com Tue Mar 13 17:37:11 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 13 Mar 2018 17:37:11 +0000
Subject: RFR: Two line change in documentation
Message-ID:

Hi all,

I saw an error in the SetEventNotificationMode method where the parameter
is called event_thread but the documentation was referring to it as thread.
I then went and did a quick scan of the documentation and found one typo:
"couse" instead of "course".

Here is the diff, not sure it was worth doing a webrev for it but let me
know:
diff -r 2d1d0c66966b src/hotspot/share/prims/jvmti.xml
--- a/src/hotspot/share/prims/jvmti.xml Mon Mar 12 14:11:54 2018 -0700
+++ b/src/hotspot/share/prims/jvmti.xml Tue Mar 13 10:35:03 2018 -0700
@@ -693,7 +693,7 @@
     mechanism causes the unload (an unload mechanism is not specified in this document)
     or the library is (in effect) unloaded by the termination of the VM whether through
     normal termination or VM failure, including start-up failure.
-    Uncontrolled shutdown is, of couse, an exception to this rule.
+    Uncontrolled shutdown is, of course, an exception to this rule.
     Note the distinction between this function and the
     VM Death event: for the VM Death event
     to be sent, the VM must have run at least to the point of initialization and a valid
@@ -9405,7 +9405,7 @@
     the event will be disabled
- If thread is NULL,
+ If event_thread is NULL,
  the event is enabled or disabled globally; otherwise, it is
  enabled or disabled for a particular thread.
  An event is generated for

Thanks,
Jc
From serguei.spitsyn at oracle.com Tue Mar 13 21:32:42 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 13 Mar 2018 14:32:42 -0700
Subject: RFR: Two line change in documentation
In-Reply-To:
References:
Message-ID:

Hi Jc,

Yes, these are typos.
Thank you for fixing them!

Is it a formal review request (RFR)?
If so, then a bug number is needed.
I've filed one:
  https://bugs.openjdk.java.net/browse/JDK-8199561

The fix looks good.
I think this can be fixed under a trivial fix rule with just one review.
I'll sponsor it for you.

Thanks,
Serguei


On 3/13/18 10:37, JC Beyler wrote:
> Hi all,
>
> I saw an error in the SetEventNotificationMode method where the
> parameter is called event_thread but the documentation was referring
> to it as thread. I then went and did a quick scan of the documentation
> and found one typo: "couse" instead of "course".
>
> Here is the diff, not sure it was worth doing a webrev for it but let
> me know:
> diff -r 2d1d0c66966b src/hotspot/share/prims/jvmti.xml
> --- a/src/hotspot/share/prims/jvmti.xml Mon Mar 12 14:11:54 2018 -0700
> +++ b/src/hotspot/share/prims/jvmti.xml Tue Mar 13 10:35:03 2018 -0700
> @@ -693,7 +693,7 @@
>      mechanism causes the unload (an unload mechanism is not specified in this document)
>      or the library is (in effect) unloaded by the termination of the VM whether through
>      normal termination or VM failure, including start-up failure.
> -    Uncontrolled shutdown is, of couse, an exception to this rule.
> +    Uncontrolled shutdown is, of course, an exception to this rule.
>      Note the distinction between this function and the
>      VM Death event: for the VM Death event
>      to be sent, the VM must have run at least to the point of initialization and a valid
> @@ -9405,7 +9405,7 @@
>      the event will be disabled
>
>
> -If thread is NULL,
> +If event_thread is NULL,
> the event is enabled or disabled globally; otherwise, it is
> enabled or disabled for a particular thread.
> An event is generated for
>
> Thanks,
> Jc

From alexey.menkov at oracle.com Tue Mar 13 23:14:19 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 13 Mar 2018 16:14:19 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
Message-ID: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>

Hi all,

Please review a small fix for
https://bugs.openjdk.java.net/browse/JDK-8049695
webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/

Root cause of the issue is jdb hangs as a result of the buffer overflow.

In the beginning of the shmemBase.c:

#define MAX_IPC_PREFIX 50   /* user-specified or generated name for */
                            /* shared memory seg and prefix for other IPC */
#define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other IPC names */
#define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)

buffer (char prefix[]) in function createStream is used to generate base
name for mutex/events, so MAX_IPC_PREFIX is not big enough.

--alex

From david.holmes at oracle.com Wed Mar 14 00:46:53 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 14 Mar 2018 10:46:53 +1000
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
Message-ID: <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>

Hi Alex,

On 14/03/2018 9:14 AM, Alex Menkov wrote:
> Hi all,
>
> Please review a small fix for
> https://bugs.openjdk.java.net/browse/JDK-8049695
> webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/
>
> Root cause of the issue is jdb hangs as a result of the buffer overflow.
>
> In the beginning of the shmemBase.c:
>
> #define MAX_IPC_PREFIX 50   /* user-specified or generated name for */
>                             /* shared memory seg and prefix for other IPC */
> #define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other IPC names */
> #define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)
>
> buffer (char prefix[]) in function createStream is used to generate base
> name for mutex/events, so MAX_IPC_PREFIX is not big enough.

Good catch! But overall this code seems to be missing bounds checks
everywhere. You made the "prefix" (poor name?) buffer bigger
(MAX_IPC_NAME) but do we know the incoming name plus the appended
descriptive string will fit in it?

Looking at createTransport for example, it also has:

    char prefix[MAX_IPC_PREFIX];

and it produces an error if

    strlen(address) >= MAX_IPC_PREFIX

but otherwise copies it across:

    strcpy(transport->name, address);

and then later does:

    sprintf(prefix, "%s.mutex", transport->name);

so we may have overflowed again by adding ".mutex"! The same goes for the
subsequent sprintf's.

So I think there is more work to do to ensure this code is immune from
buffer overflows.

Thanks,
David
-----

> --alex

From david.holmes at oracle.com Wed Mar 14 00:55:53 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 14 Mar 2018 10:55:53 +1000
Subject: RFR: Two line change in documentation
In-Reply-To:
References:
Message-ID: <6d75ab02-630e-a423-3d40-c5be9750ea27@oracle.com>

On 14/03/2018 7:32 AM, serguei.spitsyn at oracle.com wrote:
> Hi Jc,
>
> Yes, these are typos.
> Thank you for fixing them!
>
> Is it a formal review request (RFR) ?
> If so, then a bug number is needed.
> I've filed one:
>   https://bugs.openjdk.java.net/browse/JDK-8199561
>
> The fix looks good.
> I think, this can be fixed under a trivial fix rule with just one review.

I agree. No need for CSR either as these are just obvious typos. :)

Thanks,
David

> I'll sponsor it for you.
>
> Thanks,
> Serguei
>
>
> On 3/13/18 10:37, JC Beyler wrote:
>> Hi all,
>>
>> I saw an error in the SetEventNotificationMode method where the
>> parameter is called event_thread but the documentation was referring
>> to it as thread. I then went and did a quick scan of the documentation
>> and found one typo: "couse" instead of "course".
>>
>> Here is the diff, not sure it was worth doing a webrev for it but let
>> me know:
>> diff -r 2d1d0c66966b src/hotspot/share/prims/jvmti.xml
>> --- a/src/hotspot/share/prims/jvmti.xml Mon Mar 12 14:11:54 2018 -0700
>> +++ b/src/hotspot/share/prims/jvmti.xml Tue Mar 13 10:35:03 2018 -0700
>> @@ -693,7 +693,7 @@
>>      mechanism causes the unload (an unload mechanism is not specified in this document)
>>      or the library is (in effect) unloaded by the termination of the VM whether through
>>      normal termination or VM failure, including start-up failure.
>> -    Uncontrolled shutdown is, of couse, an exception to this rule.
>> +    Uncontrolled shutdown is, of course, an exception to this rule.
>>      Note the distinction between this function and the
>>      VM Death event: for the VM Death event
>>      to be sent, the VM must have run at least to the point of initialization and a valid
>> @@ -9405,7 +9405,7 @@
>>      the event will be disabled
>>
>>
>> -If thread is NULL,
>> +If event_thread is NULL,
>> the event is enabled or disabled globally; otherwise, it is
>> enabled or disabled for a particular thread.
>> An event is generated for
>>
>> Thanks,
>> Jc
>

From daniil.x.titov at oracle.com Wed Mar 14 05:26:50 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 13 Mar 2018 22:26:50 -0700
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
Message-ID: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>

Please review the changes that fix intermittent timeout failure of serviceability/dcmd/framework/* tests.

The problem here is that these tests invoke jcmd in different ways, and one such way is passing a main class to jcmd as the VM identifier. The main class for a jtreg test is com.sun.javatest.regtest.agent.MainWrapper, and in some cases more than one test is running in parallel, so there are multiple Java processes with com.sun.javatest.regtest.agent.MainWrapper as the main class. When that happens, jcmd iterates over all Java processes that match the condition (the main class equals com.sun.javatest.regtest.agent.MainWrapper) and executes the command for each of them. As a result, jcmd invokes the given command multiple times and attaches to Java processes not related to the current test.

The fix makes serviceability/dcmd/framework/* tests non-concurrent to ensure that they don't interact with other tests.

Bug: https://bugs.openjdk.java.net/browse/JDK-8166642
Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01

The tests ran successfully with Mach5.

Best regards,
Daniil

From chris.plummer at oracle.com Wed Mar 14 05:50:54 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 13 Mar 2018 22:50:54 -0700
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
In-Reply-To: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
Message-ID: <8525f5af-f4ac-4eef-da11-4477954a3279@oracle.com>

Hi Daniil,

The fix looks good. Were you able to reproduce this problem, and then after the fix run the tests enough times to be confident this really resolves the issue? Are you going to close JDK-8194057 as a dup?

thanks,

Chris

On 3/13/18 10:26 PM, Daniil Titov wrote:
> Please review the changes that fix intermittent timeout failure of serviceability/dcmd/framework/* tests.
>
> The problem here is that these tests invoke jcmd in different ways and one of such ways is when a main class is passed to the jcmd as a VM identifier.
> The main class for jtreg test is com.sun.javatest.regtest.agent.MainWrapper and in some cases more than one test are running in parallel and there are multiple Java processes with com.sun.javatest.regtest.agent.MainWrapper as a main class . When it happens jcmd iterates over all Java processes that match the condition (the main class equals to com.sun.javatest.regtest.agent.MainWrapper) and executes the command for each of them. That results in the jcmd invokes the given command multiple times and attaches to Java processes not related to the current test.
>
> The fix makes serviceability/dcmd/framework/* tests non-concurrent to ensure that they don't interact with other tests.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642
> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01
>
> The tests ran successfully with Mach5.
>
> Best regards,
> Daniil
>
>

From david.holmes at oracle.com Wed Mar 14 05:54:39 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 14 Mar 2018 15:54:39 +1000
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
In-Reply-To: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
Message-ID:

Hi Daniil,

On 14/03/2018 3:26 PM, Daniil Titov wrote:
> Please review the changes that fix intermittent timeout failure of serviceability/dcmd/framework/* tests.
>
> The problem here is that these tests invoke jcmd in different ways and one of such ways is when a main class is passed to the jcmd as a VM identifier. The main class for jtreg test is com.sun.javatest.regtest.agent.MainWrapper and in some cases more than one test are running in parallel and there are multiple Java processes with com.sun.javatest.regtest.agent.MainWrapper as a main class . When it happens jcmd iterates over all Java processes that match the condition (the main class equals to com.sun.javatest.regtest.agent.MainWrapper) and executes the command for each of them.
> That results in the jcmd invokes the given command multiple times and attaches to Java processes not related to the current test.

It's good to finally find the root cause of the problem!

> The fix makes serviceability/dcmd/framework/* tests non-concurrent to ensure that they don't interact with other tests.

This seems more of a workaround than a fix - though I don't know whether
there is a way to distinguish multiple VMs all running what appears to be
the same main class.

My concern with the fix is how long it will take to run these tests
sequentially, as they are run in tier2 and as part of the CI test job?

Thanks,
David

> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642
> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01
>
> The tests ran successfully with Mach5.
>
> Best regards,
> Daniil
>
>

From chris.plummer at oracle.com Wed Mar 14 06:49:48 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 13 Mar 2018 23:49:48 -0700
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
In-Reply-To:
References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
Message-ID: <70f178cf-ecb4-bd69-3f8d-9daf3cdbe045@oracle.com>

On 3/13/18 10:54 PM, David Holmes wrote:
> Hi Daniil,
>
> On 14/03/2018 3:26 PM, Daniil Titov wrote:
>> Please review the changes that fix intermittent timeout failure of
>> serviceability/dcmd/framework/* tests.
>>
>> The problem here is that these tests invoke jcmd in different ways
>> and one of such ways is when a main class is passed to the jcmd as a
>> VM identifier. The main class for jtreg test is
>> com.sun.javatest.regtest.agent.MainWrapper and in some cases more
>> than one test are running in parallel and there are multiple Java
>> processes with com.sun.javatest.regtest.agent.MainWrapper as a main
>> class . When it happens jcmd iterates over all Java processes that
>> match the condition (the main class equals to
>> com.sun.javatest.regtest.agent.MainWrapper) and executes the command
>> for each of them.
That results in the jcmd invokes the given command >> multiple times and attaches to Java processes not related to the >> current test. > > It's good to finally find the root cause of the problem! > >> The fix makes serviceability/dcmd/framework/* tests non-concurrent to >> ensure that they don't interact with other tests. > > This seems more of a workaround than a fix - though I don't know > whether there is a way to distinguish multiple VMs all running what > appears to be the same main class. The three tests are all intentionally testing the "jcmd " functionality. A better fix (but not worth it IMHO) would be to spawn a separate test process rather than using the main test process, which is always going to share the main class name with other concurrently running tests. > > My concern with the fix is how long it will take to run these tests > sequentially, as they are run in tier2 and as part of the CI test job? They are quick tests and there are only 3 of them. They appear to take less than 5 seconds each. Chris > > Thanks, > David > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642 >> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01 >> >> The tests ran successfully with Mach5. >> >> Best regards, >> Daniil >> >> From david.holmes at oracle.com Wed Mar 14 07:38:02 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 14 Mar 2018 17:38:02 +1000 Subject: RFR 8166642: serviceability/dcmd/framework/* timeout In-Reply-To: <70f178cf-ecb4-bd69-3f8d-9daf3cdbe045@oracle.com> References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com> <70f178cf-ecb4-bd69-3f8d-9daf3cdbe045@oracle.com> Message-ID: <376bf36e-88db-7ec4-868c-b1adf5cbf562@oracle.com> On 14/03/2018 4:49 PM, Chris Plummer wrote: > On 3/13/18 10:54 PM, David Holmes wrote: >> Hi Daniil, >> >> On 14/03/2018 3:26 PM, Daniil Titov wrote: >>> Please review the changes that fix intermittent timeout failure of >>> serviceability/dcmd/framework/* tests. 
>>> >>> The problem here is that these tests invoke jcmd in different ways >>> and one of such ways is when a main class is passed to the jcmd as a >>> VM identifier. The main class for jtreg test is >>> com.sun.javatest.regtest.agent.MainWrapper and in some cases more >>> than one test are running in parallel and there are multiple Java >>> processes with com.sun.javatest.regtest.agent.MainWrapper as a main >>> class . When it happens jcmd iterates over all Java processes that >>> match the condition (the main class equals to >>> com.sun.javatest.regtest.agent.MainWrapper) and executes the command >>> for each of them. That results in the jcmd invokes the given command >>> multiple times and attaches to Java processes not related to the >>> current test. >> >> It's good to finally find the root cause of the problem! >> >>> The fix makes serviceability/dcmd/framework/* tests non-concurrent to >>> ensure that they don't interact with other tests. >> >> This seems more of a workaround than a fix - though I don't know >> whether there is a way to distinguish multiple VMs all running what >> appears to be the same main class. > The three tests are all intentionally testing the "jcmd " > functionality. A better fix (but not worth it IMHO) would be to spawn a > separate test process rather than using the main test process, which is > always going to share the main class name with other concurrently > running tests. Yes uniquely named main classes would be better. I'll defer to you on the "worth it" part. >> >> My concern with the fix is how long it will take to run these tests >> sequentially, as they are run in tier2 and as part of the CI test job? > They are quick tests and there are only 3 of them. They appear to take > less than 5 seconds each. Great! Thanks Chris. 
David

> Chris
>>
>> Thanks,
>> David
>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642
>>> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01
>>>
>>> The tests ran successfully with Mach5.
>>>
>>> Best regards,
>>> Daniil
>>>
>>>
>

From christoph.langer at sap.com Wed Mar 14 16:04:27 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Wed, 14 Mar 2018 16:04:27 +0000
Subject: Another question regarding Thread dumps
In-Reply-To:
References: <540c219708ee4dc9b15e6454c79e5bba@sap.com>
Message-ID: <6e2ff19f042d4b1083847a33faad23f7@sap.com>

Thanks David for your comments. I have decided to adapt our code to what is currently used in OpenJDK. I'm not aware of any issues either way, so I prefer to have common code.

Best regards
Christoph

> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Tuesday, 13 March 2018 05:36
> To: Langer, Christoph ; serviceability-dev at openjdk.java.net; Hotspot dev runtime dev at openjdk.java.net>
> Subject: Re: Another question regarding Thread dumps
>
> On 13/03/2018 1:41 AM, Langer, Christoph wrote:
> > Hi,
> >
> > I have another question regarding thread dumping code.
> >
> > At the places where thread dumps get generated (attachListener.cpp, diagnosticCommand.cpp, os.cpp), there's always a series of 3 VM operations: VM_PrintThreads, VM_PrintJNI and VM_FindDeadlocks. I'm wondering if it would make sense to do this altogether in one VM operation? Then probably the picture could be more consistent. However, I can imagine the risk that the safepoint takes too long. Are there other pros and cons I'm missing?
> >
> > I'm asking because in our JVM codebase I can find places where some of these VM ops had been combined and I'm wondering what might be the reasoning behind that and whether it makes sense to revert to the OpenJDK way of doing things or whether the changes are smart and even worth contributing. What do you think?
>
> VM_FindDeadlocks is also used stand-alone in jmm_FindDeadlockedThreads.
>
> I think they are logically three distinct operations. And one really
> long safepoint could be quite problematic. You'd need extensive
> real-life benchmarking of the impact on real apps that use active
> monitoring before being able to make this change. This seems to me like
> an "if it ain't broke ..." situation.
>
> Cheers,
> David
>
> > Thanks
> > Christoph
> >

From alexey.menkov at oracle.com Wed Mar 14 16:45:34 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 14 Mar 2018 09:45:34 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
Message-ID:

Hi David,

On 03/13/2018 17:46, David Holmes wrote:
> Hi Alex,
>
> On 14/03/2018 9:14 AM, Alex Menkov wrote:
>> Hi all,
>>
>> Please review a small fix for
>> https://bugs.openjdk.java.net/browse/JDK-8049695
>> webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/
>>
>> Root cause of the issue is that jdb hangs as a result of a buffer overflow.
>>
>> At the beginning of shmemBase.c:
>>
>> #define MAX_IPC_PREFIX 50   /* user-specified or generated name for */
>>                             /* shared memory seg and prefix for other IPC */
>> #define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other IPC names */
>> #define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)
>>
>> The buffer (char prefix[]) in function createStream is used to generate
>> the base name for mutex/events, so MAX_IPC_PREFIX is not big enough.
>
> Good catch! But overall this code seems to be missing bounds checks
> everywhere. You made the "prefix" (poor name?)
buffer bigger
> (MAX_IPC_NAME) but do we know the incoming name plus the appended
> descriptive string will fit in it?

Yes, the possible values that can be added to the shmem name (which is restricted to 49 chars) are:
".mutex"
".hasData"
".hasSpace"
".accept"
".attach"
"." (pid is a 64-bit value, max len IIRC is 19 symbols)
So the extra MAX_IPC_SUFFIX (25 symbols) is enough.

> Looking at createTransport for example, it also has:
>
> char prefix[MAX_IPC_PREFIX];
>
> and it produces an error if
>
> strlen(address) >= MAX_IPC_PREFIX
>
> but otherwise copies it across:
>
> strcpy(transport->name, address);
>
> and then later does:
>
>  sprintf(prefix, "%s.mutex", transport->name);
>
> so we may have overflowed again by adding ".mutex"! The same goes for
> the subsequent sprintf's.

Thank you for the catch!
I looked through the file for other similar issues, but somehow overlooked this case.
Will fix it.
Also will change the confusing "prefix" name to "base_name".

--alex

>
> So I think there is more work to do to ensure this code is immune from
> buffer overflows.
>
> Thanks,
> David
> -----
>
>> --alex

From daniil.x.titov at oracle.com Wed Mar 14 16:47:17 2018
From: daniil.x.titov at oracle.com (daniil.x.titov at oracle.com)
Date: Wed, 14 Mar 2018 09:47:17 -0700
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
In-Reply-To: <8525f5af-f4ac-4eef-da11-4477954a3279@oracle.com>
References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com> <8525f5af-f4ac-4eef-da11-4477954a3279@oracle.com>
Message-ID: <34884a03-e6e0-a462-28e8-fb56f1a1c536@oracle.com>

Hi Chris and David,

Could you please say whether anything else is required, or are you OK with these changes?

As Chris already replied, there are only 3 tests that will be affected and each of them takes less than 5 seconds to complete.

On 3/13/18 10:50 PM, Chris Plummer wrote:
> Hi Daniil,
>
> The fix looks good. Were you able to reproduce this problem, and then
> after the fix run the tests enough times to be confident this really
> resolves the issue?
> I was able to reproduce this problem with Mach5 . There were about 1-3 failures per 100 runs of hotspot_serviceability suite. After the fix the tests were run more then 1000 times without failures. > Are you going to close JDK-8194057 as a dup? > Yes. I plan to close JDK-8194057 as a duplicate. > thanks, > > Chris > Thanks! Best regards, Daniil On 3/13/18 10:26 PM, Daniil Titov wrote: >> Please review the changes that fix intermittent timeout failure of >> serviceability/dcmd/framework/* tests. >> >> The problem here is that these tests invoke jcmd in different ways >> and one of such ways is when a main class is passed to the jcmd as a >> VM identifier. The main class for jtreg test is >> com.sun.javatest.regtest.agent.MainWrapper and in some cases more >> than one test are running in parallel and there are multiple Java >> processes with com.sun.javatest.regtest.agent.MainWrapper as a main >> class . When it happens jcmd iterates over all Java processes that >> match the condition (the main class equals to >> com.sun.javatest.regtest.agent.MainWrapper) and executes the command >> for each of them. That results in the jcmd invokes the given command >> multiple times and attaches to Java processes not related to the >> current test. >> >> The fix makes serviceability/dcmd/framework/* tests non-concurrent to >> ensure that they don't interact with other tests. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642 >> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01 >> >> The tests ran successfully with Mach5. 
>> >> Best regards, >> Daniil >> >> > From jcbeyler at google.com Wed Mar 14 16:59:59 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 14 Mar 2018 16:59:59 +0000 Subject: RFR: Two line change in documentation In-Reply-To: <6d75ab02-630e-a423-3d40-c5be9750ea27@oracle.com> References: <6d75ab02-630e-a423-3d40-c5be9750ea27@oracle.com> Message-ID: Hi David and Serguei, Sorry, I was not sure what would be the process for a small fix like this. Did it make sense to make a bug/webrev and all that or not was not clear to me. I'll know for next time. Anyway, I assigned the bug to myself, and put the fix here: http://cr.openjdk.java.net/~jcbeyler/8199561/ Let me know if I need to do anything else, Jc On Tue, Mar 13, 2018 at 5:56 PM David Holmes wrote: > On 14/03/2018 7:32 AM, serguei.spitsyn at oracle.com wrote: > > Hi Jc, > > > > Yes, these are typos. > > Thank you for fixing them! > > > > Is it a formal review request (RFR) ? > > If so, then a bug number is needed. > > I've filed one: > > https://bugs.openjdk.java.net/browse/JDK-8199561 > > > > The fix looks good. > > I think, this can be fixed under a trivial fix rule with just one review. > > I agree. No need for CSR either as these are just obvious typos. :) > > Thanks, > David > > > I'll sponsor it for you. > > > > Thanks, > > Serguei > > > > > > On 3/13/18 10:37, JC Beyler wrote: > >> Hi all, > >> > >> I saw an error in the SetEventNotificationMode method where the > >> parameter is called event_thread but the documentation was referring > >> to it as thread. I then went and did a quick scan of the documentation > >> and found one type of "couse" instead of "course". 
> >> > >> Here is the diff, not sure it was worth doing a webrev for it but let > >> me know: > >> diff -r 2d1d0c66966b src/hotspot/share/prims/jvmti.xml > >> --- a/src/hotspot/share/prims/jvmti.xmlMon Mar 12 14:11:54 2018 -0700 > >> +++ b/src/hotspot/share/prims/jvmti.xmlTue Mar 13 10:35:03 2018 -0700 > >> @@ -693,7 +693,7 @@ > >> mechanism causes the unload (an unload mechanism is not specified > >> in this document) > >> or the library is (in effect) unloaded by the termination of the > >> VM whether through > >> normal termination or VM failure, including start-up failure. > >> - Uncontrolled shutdown is, of couse, an exception to this rule. > >> + Uncontrolled shutdown is, of course, an exception to this rule. > >> Note the distinction between this function and the > >> VM Death event: for the VM > >> Death event > >> to be sent, the VM must have run at least to the point of > >> initialization and a valid > >> @@ -9405,7 +9405,7 @@ > >> the event will be disabled > >> > >> > >> -If thread is NULL, > >> +If event_thread is NULL, > >> the event is enabled or disabled globally; otherwise, it is > >> enabled or disabled for a particular thread. > >> An event is generated for > >> > >> Thanks, > >> Jc > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Mar 14 17:07:06 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 14 Mar 2018 10:07:06 -0700 Subject: RFR: Two line change in documentation In-Reply-To: References: <6d75ab02-630e-a423-3d40-c5be9750ea27@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From jcbeyler at google.com Wed Mar 14 17:11:56 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 14 Mar 2018 17:11:56 +0000 Subject: RFR: Two line change in documentation In-Reply-To: References: <6d75ab02-630e-a423-3d40-c5be9750ea27@oracle.com> Message-ID: Thanks Serguei! 
Jc On Wed, Mar 14, 2018 at 10:07 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > I forgot to ask you for a patch yesterday. > Thank you for webrev, it has to be enough as it has a patch in it. > I'll push it now. > > Thanks, > Serguei > > > On 3/14/18 09:59, JC Beyler wrote: > > Hi David and Serguei, > > Sorry, I was not sure what would be the process for a small fix like this. > Did it make sense to make a bug/webrev and all that or not was not clear to > me. I'll know for next time. Anyway, I assigned the bug to myself, and put > the fix here: > http://cr.openjdk.java.net/~jcbeyler/8199561/ > > Let me know if I need to do anything else, > Jc > > > > On Tue, Mar 13, 2018 at 5:56 PM David Holmes > wrote: > >> On 14/03/2018 7:32 AM, serguei.spitsyn at oracle.com wrote: >> > Hi Jc, >> > >> > Yes, these are typos. >> > Thank you for fixing them! >> > >> > Is it a formal review request (RFR) ? >> > If so, then a bug number is needed. >> > I've filed one: >> > https://bugs.openjdk.java.net/browse/JDK-8199561 >> > >> > The fix looks good. >> > I think, this can be fixed under a trivial fix rule with just one >> review. >> >> I agree. No need for CSR either as these are just obvious typos. :) >> >> Thanks, >> David >> >> > I'll sponsor it for you. >> > >> > Thanks, >> > Serguei >> > >> > >> > On 3/13/18 10:37, JC Beyler wrote: >> >> Hi all, >> >> >> >> I saw an error in the SetEventNotificationMode method where the >> >> parameter is called event_thread but the documentation was referring >> >> to it as thread. I then went and did a quick scan of the documentation >> >> and found one type of "couse" instead of "course". 
>> >> >> >> Here is the diff, not sure it was worth doing a webrev for it but let >> >> me know: >> >> diff -r 2d1d0c66966b src/hotspot/share/prims/jvmti.xml >> >> --- a/src/hotspot/share/prims/jvmti.xmlMon Mar 12 14:11:54 2018 -0700 >> >> +++ b/src/hotspot/share/prims/jvmti.xmlTue Mar 13 10:35:03 2018 -0700 >> >> @@ -693,7 +693,7 @@ >> >> mechanism causes the unload (an unload mechanism is not specified >> >> in this document) >> >> or the library is (in effect) unloaded by the termination of the >> >> VM whether through >> >> normal termination or VM failure, including start-up failure. >> >> - Uncontrolled shutdown is, of couse, an exception to this rule. >> >> + Uncontrolled shutdown is, of course, an exception to this rule. >> >> Note the distinction between this function and the >> >> VM Death event: for the VM >> >> Death event >> >> to be sent, the VM must have run at least to the point of >> >> initialization and a valid >> >> @@ -9405,7 +9405,7 @@ >> >> the event will be disabled >> >> >> >> >> >> -If thread is NULL, >> >> +If event_thread is NULL, >> >> the event is enabled or disabled globally; otherwise, it is >> >> enabled or disabled for a particular thread. >> >> An event is generated for >> >> >> >> Thanks, >> >> Jc >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Wed Mar 14 17:24:16 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 14 Mar 2018 10:24:16 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> Message-ID: On 3/12/18 1:51 PM, Chris Plummer wrote: > Hi Igor, > > On 3/12/18 1:26 PM, Igor Ignatyev wrote: >> >> >>> On Mar 12, 2018, at 8:53 AM, Chris Plummer >> > wrote: >>> >>> On 3/11/18 7:52 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 10/03/2018 6:46 AM, Chris Plummer wrote: >>>>> Hello, >>>>> >>>>> Please help review the following: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>> >>>>> In the end there were two issues. The first was that the >>>>> pb.redirectError() call was redirecting the LingeredApp's stderr >>>>> to the console, which we don't want. The second was that nothing >>>>> was capturing the LingeredApp's output and sending it to the >>>>> driver app's output (jtr file). These changes make all the >>>>> LingeredApp's output end up in the jtr file. >>>> >>>> It isn't clear to me how the interleaving of the two streams by the >>>> two threads is handled in the copy routine. Are we guaranteed to >>>> get complete lines of output from each stream before writing to >>>> System.out? >>> Hi David, >>> >>> I'm hoping Igor will chime in here, since this is just cloned from >>> some closed code he wrote, and he recommended this fix. Perhaps we >>> are just doing something a bit non standard here. 
When spawning a
>>> separate test process, don't we normally just dump stdout and stderr
>>> separately via OutputAnalyzer.reportDiagnosticSummary() after the
>>> test completes, and then only if there is an error. I'm not sure why
>>> Igor felt LingeredApp tests should be handled differently.
>>
>> I recommended this fix as one of possibilities and never claimed it's
>> the best solution ;) I don't know much of LingeredApp tests, so I
>> just suggested the patch which only solves the problem I noticed
>> (LingeredApp's cerr being printed into jtreg agent's cerr).
> If by "into jtreg agent's cerr" you are referring to the presence of
> JDWP "ERROR" messages in the jtreg console, that is fixed simply by
> removing the following:
>
>    pb.redirectError(ProcessBuilder.Redirect.INHERIT);
>
> And that is already part of this fix. But it actually makes the ERROR
> messages completely disappear. The copy() part of the fix makes all
> the LingeredApp output appear in the .jtr file (including the JDWP
> "ERROR" messages).
>> OutputAnalyzer might not fit the use case of LingeredApp b/c it
>> blocks till the process is finished. again, I don't know much about
>> LingeredApp itself and the tests which use it.
> My point was that other jtreg tests that use ProcessBuilder and
> OutputAnalyzer don't print out anything from the spawned process/app
> until it is done, and even then usually only if there was a test
> failure. Why is there a need here to print out messages as they are
> generated.

I played around with OutputAnalyzer a bit to see if we can do something more like our other uses of ProcessBuilder. Unfortunately it did not work:

        appProcess = pb.start();
        OutputAnalyzer output = new OutputAnalyzer(appProcess);

The problem is that OutputAnalyzer(appProcess) will not return until appProcess exits, but the test requires interaction with appProcess before it will exit, and this is done from the same thread.
So unless appProcess aborts unexpectedly, we block in OutputAnalyzer(appProcess).

So next I tried getting the InputGobbler output by calling LingeredApp.getOutput(). This was missing the stderr output. So I created a separate gobbler for LingeredApp.getErrorStream(). It only contained the following:

 [Debugger failed to attach: ERROR: Peer not allowed to connect: 127.0.0.1, ]------------

Yet the console (without disabling the pb.redirectError(ProcessBuilder.Redirect.INHERIT) code) normally would contain:

[2018-03-14 09:49:38,354] Agent[1]: stderr: Debugger failed to attach: ERROR: Peer not allowed to connect: 127.0.0.1
[2018-03-14 09:49:38,355] Agent[1]: stderr:
[2018-03-14 09:49:39,075] Agent[1]: stderr: Error in allow option: '127.0.0.1;192.168.0.0/24'
[2018-03-14 09:49:39,075] Agent[1]: stderr: ERROR: transport error 103: invalid IP address in allow option
[2018-03-14 09:49:39,075] Agent[1]: stderr: ERROR: JDWP Transport dt_socket failed to initialize, TRANSPORT_INIT(510)
[2018-03-14 09:49:39,075] Agent[1]: stderr: JDWP exit error AGENT_ERROR_TRANSPORT_INIT(197): No transports initialized [:732]
[2018-03-14 09:49:40,064] Agent[1]: stderr: ERROR: JDWP option syntax error: -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:50103,allow=
[2018-03-14 09:49:41,289] Agent[1]: stderr: Error in allow option: '*+allow=127.0.0.1'
[2018-03-14 09:49:41,289] Agent[1]: stderr: ERROR: transport error 103: allow option '*' cannot be expanded
[2018-03-14 09:49:41,289] Agent[1]: stderr: ERROR: JDWP Transport dt_socket failed to initialize, TRANSPORT_INIT(510)
[2018-03-14 09:49:41,289] Agent[1]: stderr: JDWP exit error AGENT_ERROR_TRANSPORT_INIT(197): No transports initialized [:732]
[2018-03-14 09:49:42,418] Agent[1]: stderr: Error in allow option: 'allow=127.0.0.1+*'
[2018-03-14 09:49:42,418] Agent[1]: stderr: ERROR: transport error 103: invalid IP address in allow option
[2018-03-14 09:49:42,418] Agent[1]: stderr: ERROR: JDWP Transport dt_socket failed to initialize, TRANSPORT_INIT(510)
[2018-03-14 09:49:42,418] Agent[1]: stderr: JDWP exit error AGENT_ERROR_TRANSPORT_INIT(197): No transports initialized [:732]

Just to summarize, if I don't disable pb.redirectError(ProcessBuilder.Redirect.INHERIT), all the above appears in the console. If I disable it, none of the above is in the console. It is just lost. If I disable it and try to capture the output from LingeredApp.getErrorStream() using an InputGobbler, I only get the first ERROR, and the rest seem to be lost. (Note the copy() approach in the original fix does capture all the ERROR messages.)

I looked a bit closer at what InputGobbler does vs Igor's copy() approach. They are both very similar in that they spawn a separate thread to read from the InputStream. The difference is that InputGobbler uses BufferedReader.readLine() whereas Igor's copy() directly uses InputStream.read(). So it seems the InputGobbler buffering is preventing us from seeing all the stderr output.

So one conclusion is that there probably is not much point in having two approaches to getting the LingeredApp's output. I think we should just create an InputGobbler for both stdout and stderr (currently we only create one for stdout), and then fix the buffering issue with InputGobbler on stderr.

That still leaves the question of whether we should always dump stdout and stderr. I think that should be left to the test to decide, and probably only done if there is an error.

Chris

>>
>> answering to David's question, copy routine handles interleaving of
>> two streams similarly to printf routine, it does not do that at all.
>> we are writing to System.out as we read data, the only guarantee we
>> have is all the bytes we read into buffer will be written together
>> (which might mean 1 byte at a time), no guarantees about lines. the
>> behavior is pretty much the same as you expect to get from an
>> interactive shell w/ both cout and cerr are being printed on the
>> console.
> Not quite.
If a single threaded app is sending to both System.out and > System.err (and/or stdout and stderr) and does something to ensure > flushing after each print or each line, then the output should appear > cleanly in the order executed. By having two different threads read > these two streams, order might be changed, and possibly even > interleaved within lines. > > cheers, > > Chris >> >> Thanks, >> -- Igor >>> >>> thanks, >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Tested by running all tests that use LingeredApp. >>>>> >>>>> thanks, >>>>> >>>>> Chris >> > > From chris.plummer at oracle.com Wed Mar 14 17:26:42 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 14 Mar 2018 10:26:42 -0700 Subject: RFR 8166642: serviceability/dcmd/framework/* timeout In-Reply-To: <34884a03-e6e0-a462-28e8-fb56f1a1c536@oracle.com> References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com> <8525f5af-f4ac-4eef-da11-4477954a3279@oracle.com> <34884a03-e6e0-a462-28e8-fb56f1a1c536@oracle.com> Message-ID: Hi Daniil, I believe David said he would defer to me on whether you should take the approach of spawning a different app to avoid this issue. I'm fine with it the way it is, so unless you feel compelled to take the more complicated approach, I think you're good to go. thanks, Chris On 3/14/18 9:47 AM, daniil.x.titov at oracle.com wrote: > Hi Chris and David, > > Could you please say is anything else required or you are OK with > these changes? > > As Chris already replied there are only 3 tests that will be affected > and each of them takes less than 5 seconds to complete. > > > On 3/13/18 10:50 PM, Chris Plummer wrote: >> Hi Danill, >> >> The fix looks good. Were you able to reproduce this problem, and then >> after the fix run the tests enough times to be confident this really >> resolves the issue? >> > I was able to reproduce this problem with Mach5 . There were about 1-3 > failures per 100 runs of hotspot_serviceability suite. 
After the fix > the tests were run more then 1000 times without failures. >> Are you going to close JDK-8194057 as a dup? >> > Yes. I plan to close JDK-8194057 as a duplicate. >> thanks, >> >> Chris >> > > Thanks! > > Best regards, > Daniil > > > On 3/13/18 10:26 PM, Daniil Titov wrote: >>> Please review the changes that fix intermittent timeout failure of >>> serviceability/dcmd/framework/* tests. >>> >>> The problem here is that these tests invoke jcmd in different ways >>> and one of such ways is when a main class is passed to the jcmd as a >>> VM identifier. The main class for jtreg test is >>> com.sun.javatest.regtest.agent.MainWrapper and in some cases more >>> than one test are running in parallel and there are multiple Java >>> processes with com.sun.javatest.regtest.agent.MainWrapper as a main >>> class . When it happens jcmd iterates over all Java processes that >>> match the condition (the main class equals to >>> com.sun.javatest.regtest.agent.MainWrapper) and executes the command >>> for each of them. That results in the jcmd invokes the given command >>> multiple times and attaches to Java processes not related to the >>> current test. >>> >>> The fix makes serviceability/dcmd/framework/* tests non-concurrent >>> to ensure that they don't interact with other tests. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642 >>> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01 >>> >>> The tests ran successfully with Mach5. 
>>>
>>> Best regards,
>>> Daniil
>>>
>>>
>>
>

From alexey.menkov at oracle.com Wed Mar 14 19:43:32 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 14 Mar 2018 12:43:32 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To:
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
Message-ID:

Updated fix:
http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.02/

The changes:
- createTransport function is fixed;
- "prefix" variable is renamed to "baseName".

--alex

On 03/14/2018 09:45, Alex Menkov wrote:
> Hi David,
>
>
> On 03/13/2018 17:46, David Holmes wrote:
>> Hi Alex,
>>
>> On 14/03/2018 9:14 AM, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review a small fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8049695
>>> webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/
>>>
>>> Root cause of the issue is that jdb hangs as a result of a buffer overflow.
>>>
>>> At the beginning of shmemBase.c:
>>>
>>> #define MAX_IPC_PREFIX 50   /* user-specified or generated name for */
>>>                             /* shared memory seg and prefix for other IPC */
>>> #define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other IPC names */
>>> #define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)
>>>
>>> The buffer (char prefix[]) in function createStream is used to generate
>>> the base name for mutex/events, so MAX_IPC_PREFIX is not big enough.
>>
>> Good catch! But overall this code seems to be missing bounds checks
>> everywhere. You made the "prefix" (poor name?) buffer bigger
>> (MAX_IPC_NAME) but do we know the incoming name plus the appended
>> descriptive string will fit in it?
>
> Yes, the possible values that can be added to the shmem name (which is
> restricted to 49 chars) are:
> ".mutex"
> ".hasData"
> ".hasSpace"
> ".accept"
> ".attach"
> "."
(pid is a 64-bit value, max len IIRC is 19 symbols)
> So the extra MAX_IPC_SUFFIX (25 symbols) is enough.
>
>> Looking at createTransport for example, it also has:
>>
>> char prefix[MAX_IPC_PREFIX];
>>
>> and it produces an error if
>>
>> strlen(address) >= MAX_IPC_PREFIX
>>
>> but otherwise copies it across:
>>
>> strcpy(transport->name, address);
>>
>> and then later does:
>>
>>   sprintf(prefix, "%s.mutex", transport->name);
>>
>> so we may have overflowed again by adding ".mutex"! The same goes for
>> the subsequent sprintf's.
>
> Thank you for the catch!
> I looked through the file for other similar issues, but somehow overlooked this
> case.
> Will fix it.
> Also will change the confusing "prefix" name to "base_name".
>
> --alex
>
>>
>> So I think there is more work to do to ensure this code is immune from
>> buffer overflows.
>>
>> Thanks,
>> David
>> -----
>>
>>> --alex

From chris.plummer at oracle.com Wed Mar 14 20:42:06 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 14 Mar 2018 13:42:06 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To:
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
Message-ID: <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com>

Hi Alex,

I don't think prefix -> basename is what David had in mind. Those basically mean the same thing. The buffer is being used for the full name, which is why neither is really appropriate. So maybe just call it fullname, or even just name. createConnection() has a similar prefix reference that should be fixed.

thanks,

Chris

On 3/14/18 12:43 PM, Alex Menkov wrote:
>
> Updated fix:
> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.02/
>
> The changes:
> - createTransport function is fixed;
> - "prefix" variable is renamed to "baseName".
> > --alex > > On 03/14/2018 09:45, Alex Menkov wrote: >> Hi David, >> >> >> On 03/13/2018 17:46, David Holmes wrote: >>> Hi Alex, >>> >>> On 14/03/2018 9:14 AM, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review a small fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>> >>>> Root cause of the issue is jbd hungs as a result of the buffer >>>> overflow. >>>> >>>> In the beginning of the shmemBase.c: >>>> >>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name for */ >>>> ???????????????????????????? /* shared memory seg and prefix for >>>> other IPC */ >>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other IPC >>>> names */ >>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>> >>>> buffer (char prefix[]) in function createStream is used to generate >>>> base name for mutex/events, so MAX_IPC_PREFIX is not big enough. >>> >>> Good catch! But overall this code seems to be missing bounds checks >>> everywhere. You made the "prefix" (poor name?) buffer bigger >>> (MAX_IPC_NAME) but do we know the incoming name plus the appended >>> descriptive string will fit in it? >> >> Yes, the possible values can be added to the shmem name (which is >> restricted by 49 chars): >> ".mutex" >> ".hasData" >> ".hasSpace" >> ".accept" >> ".attach" >> "." (pid is 64bit value, max len IIRC is 19 symbols) >> So extra MAX_IPC_SUFFIX (25 symbols) is enough >> >>> Looking at createTransport for example, it also has: >>> >>> char prefix[MAX_IPC_PREFIX]; >>> >>> and it produces an error if >>> >>> strlen(address) >= MAX_IPC_PREFIX >>> >>> but otherwise copies it across: >>> >>> strcpy(transport->name, address); >>> >>> and then later does: >>> >>> ??sprintf(prefix, "%s.mutex", transport->name); >>> >>> so we may have overflowed again by adding ".mutex"! The same goes >>> for the subsequent sprintf's. >> >> Thank you for the catch! 
>> I looked the file for other similar issues, but somehow overlokked >> this case. >> Will fix it. >> Also will change confusing "prefix" name to "base_name". >> >> --alex >> >>> >>> So I think there is more work to do to ensure this code is immune >>> from buffer overflows. >>> >>> Thanks, >>> David >>> ----- >>> >>>> --alex From alexey.menkov at oracle.com Wed Mar 14 23:28:08 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 14 Mar 2018 16:28:08 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com> <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com> Message-ID: <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com> Hi Chris, On 03/14/2018 13:42, Chris Plummer wrote: > Hi Alex, > > I don't think prefix -> basename is what David had in mind. Those > basically mean the same thing. The buffer is being used for the full > name, which is why neither is really appropriate. So maybe just call it > fullname, or even just name. createConnection() has a similar prefix > reference that should be fixed. Ok, I don't like "fullname", "name" is already used there, so I made them "objectName" (for mutex/event names) and streamName (for stream name in createConnection()). updated webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.03/ --alex > > thanks, > > Chris > > On 3/14/18 12:43 PM, Alex Menkov wrote: >> >> Updated fix: >> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.02/ >> >> The changes: >> - createTransport function is fixed; >> - "prefix" variable is renamed to "baseName". 
>> >> --alex >> >> On 03/14/2018 09:45, Alex Menkov wrote: >>> Hi David, >>> >>> >>> On 03/13/2018 17:46, David Holmes wrote: >>>> Hi Alex, >>>> >>>> On 14/03/2018 9:14 AM, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review a small fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>> >>>>> Root cause of the issue is jbd hungs as a result of the buffer >>>>> overflow. >>>>> >>>>> In the beginning of the shmemBase.c: >>>>> >>>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name for */ >>>>> ???????????????????????????? /* shared memory seg and prefix for >>>>> other IPC */ >>>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other IPC >>>>> names */ >>>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>> >>>>> buffer (char prefix[]) in function createStream is used to generate >>>>> base name for mutex/events, so MAX_IPC_PREFIX is not big enough. >>>> >>>> Good catch! But overall this code seems to be missing bounds checks >>>> everywhere. You made the "prefix" (poor name?) buffer bigger >>>> (MAX_IPC_NAME) but do we know the incoming name plus the appended >>>> descriptive string will fit in it? >>> >>> Yes, the possible values can be added to the shmem name (which is >>> restricted by 49 chars): >>> ".mutex" >>> ".hasData" >>> ".hasSpace" >>> ".accept" >>> ".attach" >>> "." (pid is 64bit value, max len IIRC is 19 symbols) >>> So extra MAX_IPC_SUFFIX (25 symbols) is enough >>> >>>> Looking at createTransport for example, it also has: >>>> >>>> char prefix[MAX_IPC_PREFIX]; >>>> >>>> and it produces an error if >>>> >>>> strlen(address) >= MAX_IPC_PREFIX >>>> >>>> but otherwise copies it across: >>>> >>>> strcpy(transport->name, address); >>>> >>>> and then later does: >>>> >>>> ??sprintf(prefix, "%s.mutex", transport->name); >>>> >>>> so we may have overflowed again by adding ".mutex"! 
The same goes >>>> for the subsequent sprintf's. >>> >>> Thank you for the catch! >>> I looked the file for other similar issues, but somehow overlokked >>> this case. >>> Will fix it. >>> Also will change confusing "prefix" name to "base_name". >>> >>> --alex >>> >>>> >>>> So I think there is more work to do to ensure this code is immune >>>> from buffer overflows. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> --alex > > > From chris.plummer at oracle.com Wed Mar 14 23:33:07 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 14 Mar 2018 16:33:07 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com> <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com> <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com> Message-ID: On 3/14/18 4:28 PM, Alex Menkov wrote: > Hi Chris, > > On 03/14/2018 13:42, Chris Plummer wrote: >> Hi Alex, >> >> I don't think prefix -> basename is what David had in mind. Those >> basically mean the same thing. The buffer is being used for the full >> name, which is why neither is really appropriate. So maybe just call >> it fullname, or even just name. createConnection() has a similar >> prefix reference that should be fixed. > > Ok, I don't like "fullname", "name" is already used there, so I made > them "objectName" (for mutex/event names) and streamName (for stream > name in createConnection()). > > updated webrev: > http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.03/ Hi Alex, Just noticed the copyright needs updating, but otherwise looks good. No need for another webrev. 
thanks,

Chris

From serguei.spitsyn at oracle.com Thu Mar 15 00:33:07 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 14 Mar 2018 17:33:07 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
 <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
 <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com>
 <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
Message-ID: <28126f92-019e-89ef-76ee-e8a59da7bba3@oracle.com>

Hi Alex,

Sorry for being late to this party.
Thank you for getting to the root cause of this issue and for extra
updates.
All such issues are important for test stabilization, so now there will
be one problem less!

This looks pretty good to me.
I'd replace "objectName" with "streamName" to keep it unified.
But I understand why you are trying to avoid using "streamName" in this
particular case.
It is because we already have the argument "name" for the stream, so
there can be confusion about why we also have "streamName" when the
argument already took this role.
A better name for the argument would be "baseName" (or "prefix") to avoid
this confusion.
But I think this confusion is not that big, so the "streamName" should
be fine.

I leave it up to you and other reviewers.

Thanks,
Serguei

From alexey.menkov at oracle.com Thu Mar 15 00:44:32 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 14 Mar 2018 17:44:32 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <28126f92-019e-89ef-76ee-e8a59da7bba3@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
 <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
 <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com>
 <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
 <28126f92-019e-89ef-76ee-e8a59da7bba3@oracle.com>
Message-ID: <321a3a2f-c3bd-51a4-eba4-9f9b4481c5d8@oracle.com>

Hi Serguei,

On 03/14/2018 17:33, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> Sorry for being late to this party.
> Thank you for getting to the root cause of this issue and for extra
> updates.
> All such issues are important for test stabilization, so now there will
> be one problem less!
>
> This looks pretty good to me.
> I'd replace "objectName" with "streamName" to keep it unified.

In createStream() and createTransport() the buffer is used to generate
names for mutex and events (kind of "Windows objects").
In createConnection() the buffer is used to generate stream names
(client->server & server->client streams).

--alex

From serguei.spitsyn at oracle.com Thu Mar 15 00:54:09 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 14 Mar 2018 17:54:09 -0700
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <321a3a2f-c3bd-51a4-eba4-9f9b4481c5d8@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
 <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
 <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com>
 <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
 <28126f92-019e-89ef-76ee-e8a59da7bba3@oracle.com>
 <321a3a2f-c3bd-51a4-eba4-9f9b4481c5d8@oracle.com>
Message-ID: <000d187e-4c04-4d22-975a-207290fe38ac@oracle.com>

On 3/14/18 17:44, Alex Menkov wrote:
> In createStream() and createTransport() the buffer is used to generate
> names for mutex and events (kind of "Windows objects").
> In createConnection() the buffer is used to generate stream names
> (client->server & server->client streams).

Right, I missed this.
Sorry for the noise. :)

Thanks,
Serguei

From david.holmes at oracle.com Thu Mar 15 00:58:55 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 15 Mar 2018 10:58:55 +1000
Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds"
In-Reply-To: <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com>
 <4212eecf-7c08-15af-2e2f-8b63c4483b95@oracle.com>
 <081c7cb8-7671-7a2d-7608-318cfb292ec5@oracle.com>
 <70679c6d-30a8-0a43-1d91-64393dd12e9a@oracle.com>
Message-ID:

Hi Alex,

First:

>> Yes, the possible values can be added to the shmem name (which is
>> restricted by 49 chars):

where is this 49 char limit enforced?

Otherwise latest version seems okay. (I am somewhat surprised one of the
static analysis tools hasn't flagged this code for potential buffer
overflows.)

Thanks,
David

From david.holmes at oracle.com Thu Mar 15 01:35:26 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 15 Mar 2018 11:35:26 +1000
Subject: RFR 8166642: serviceability/dcmd/framework/* timeout
In-Reply-To:
References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com>
 <8525f5af-f4ac-4eef-da11-4477954a3279@oracle.com>
 <34884a03-e6e0-a462-28e8-fb56f1a1c536@oracle.com>
Message-ID: <3dc3e945-b41c-bdc0-91ab-8e004b3eef83@oracle.com>

Confirming good to go.

David

On 15/03/2018 3:26 AM, Chris Plummer wrote:
> Hi Daniil,
>
> I believe David said he would defer to me on whether you should take the
> approach of spawning a different app to avoid this issue. I'm fine with
> it the way it is, so unless you feel compelled to take the more
> complicated approach, I think you're good to go.
>
> thanks,
>
> Chris
>
> On 3/14/18 9:47 AM, daniil.x.titov at oracle.com wrote:
>> Hi Chris and David,
>>
>> Could you please say whether anything else is required or you are OK
>> with these changes?
>>
>> As Chris already replied there are only 3 tests that will be affected
>> and each of them takes less than 5 seconds to complete.
>>
>>
>> On 3/13/18 10:50 PM, Chris Plummer wrote:
>>> Hi Daniil,
>>>
>>> The fix looks good. Were you able to reproduce this problem, and then
>>> after the fix run the tests enough times to be confident this really
>>> resolves the issue?
>>>
>> I was able to reproduce this problem with Mach5. There were about 1-3
>> failures per 100 runs of hotspot_serviceability suite. After the fix
>> the tests were run more than 1000 times without failures.
>>> Are you going to close JDK-8194057 as a dup?
>>>
>> Yes. I plan to close JDK-8194057 as a duplicate.
>>> thanks,
>>>
>>> Chris
>>>
>>
>> Thanks!
>>
>> Best regards,
>> Daniil
>>
>>
>> On 3/13/18 10:26 PM, Daniil Titov wrote:
>>>> Please review the changes that fix intermittent timeout failure of
>>>> serviceability/dcmd/framework/* tests.
>>>>
>>>> The problem here is that these tests invoke jcmd in different ways
>>>> and one of such ways is when a main class is passed to the jcmd as a
>>>> VM identifier. The main class for a jtreg test is
>>>> com.sun.javatest.regtest.agent.MainWrapper and in some cases more
>>>> than one test is running in parallel and there are multiple Java
>>>> processes with com.sun.javatest.regtest.agent.MainWrapper as a main
>>>> class. When it happens jcmd iterates over all Java processes that
>>>> match the condition (the main class equals
>>>> com.sun.javatest.regtest.agent.MainWrapper) and executes the command
>>>> for each of them. As a result, jcmd invokes the given command
>>>> multiple times and attaches to Java processes not related to the
>>>> current test.
>>>>
>>>> The fix makes serviceability/dcmd/framework/* tests non-concurrent
>>>> to ensure that they don't interact with other tests.
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166642 >>>> Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01 >>>> >>>> The tests ran successfully with Mach5. >>>> >>>> Best regards, >>>> Daniil >>>> >>>> >>> >> > From jini.george at oracle.com Thu Mar 15 04:21:13 2018 From: jini.george at oracle.com (Jini George) Date: Thu, 15 Mar 2018 09:51:13 +0530 Subject: RFR: JDK-8175312: SA: clhsdb: Provide an improved heap summary for 'universe' for G1GC In-Reply-To: References: <38d71740-0b66-3ce8-26ed-a0f2b9f9e91c@oracle.com> <5e8c582e-b32f-daf7-0e0c-1e6606ceaf3a@oracle.com> <59c7ec4f-f9dc-7c25-e7bf-f3fc304b5c60@oracle.com> Message-ID: Thank you very much, Yumin. The modified webrev (after incorporating your comment) is at: http://cr.openjdk.java.net/~jgeorge/8175312/webrev.03/index.html Thanks, Jini. On 3/13/2018 6:24 AM, yumin qi wrote: > Jini, > > Looks good. One minor comment: > > + public void printG1HeapSummary(G1CollectedHeap heap) { > + G1CollectedHeap g1h = (G1CollectedHeap) heap; > > 'heap' has already been cast to 'G1CollectedHeap' at the call site, so there seems to be no > need to convert it here again. > > Thanks > Yumin > > On Mon, Mar 12, 2018 at 8:52 AM, Jini George > wrote: > > Thank you very much, Stefan. Could one more reviewer please take a > look at it? > > - Jini. > > > On 3/12/2018 8:52 PM, Stefan Johansson wrote: > > Hi Jini, > > This looks good. I'm totally fine with skipping metaspace if > that isn't displayed for the other GCs. > > Cheers, > Stefan > > On 2018-03-09 10:29, Jini George wrote: > > Here is the revised webrev: > > http://cr.openjdk.java.net/~jgeorge/8175312/webrev.02/ > > > I have made modifications to have the 'universe' command > display details like: > > hsdb> universe > Heap Parameters: > garbage-first heap [0x0000000725200000, 0x00000007c0000000] > region size 1024K > G1 Heap: >    regions  = 2478 >    capacity = 2598371328 (2478.0MB) >    used     = 5242880 (5.0MB) >    free     = 2593128448 (2473.0MB) >    
0.20177562550443906% used > G1 Young Generation: > Eden Space: >    regions  = 5 >    capacity = 8388608 (8.0MB) >    used     = 5242880 (5.0MB) >    free     = 3145728 (3.0MB) >    62.5% used > Survivor Space: >    regions  = 0 >    capacity = 0 (0.0MB) >    used     = 0 (0.0MB) >    free     = 0 (0.0MB) >    0.0% used > G1 Old Generation: >    regions  = 0 >    capacity = 155189248 (148.0MB) >    used     = 0 (0.0MB) >    free     = 155189248 (148.0MB) >    0.0% used > > > I did not add the metaspace details since that did not seem > to be in line with the 'universe' output for other GCs. I > have added a new command "g1regiondetails" to display the > region details, and have modified the tests accordingly. > > hsdb> g1regiondetails > Region Details: > Region: > 0x0000000725200000,0x0000000725200000,0x0000000725300000:Free > Region: > 0x0000000725300000,0x0000000725300000,0x0000000725400000:Free > Region: > 0x0000000725400000,0x0000000725400000,0x0000000725500000:Free > Region: > 0x0000000725500000,0x0000000725500000,0x0000000725600000:Free > Region: > 0x0000000725600000,0x0000000725600000,0x0000000725700000:Free > Region: > 0x0000000725700000,0x0000000725700000,0x0000000725800000:Free > ... > > Thanks, > Jini. > > > On 2/28/2018 12:56 PM, Jini George wrote: > > Thank you very much, Stefan. My answers inline. > > On 2/27/2018 3:30 PM, Stefan Johansson wrote: > > Hi Jini, > > > JIRA > ID: https://bugs.openjdk.java.net/browse/JDK-8175312 > > Webrev: > http://cr.openjdk.java.net/~jgeorge/8175312/webrev.00/index.html > > > It looks like a file is missing, did you forget to > add it to the changeset? > > > Indeed, I had missed that!
 I added the missing file in > the following webrev: > > http://cr.openjdk.java.net/~jgeorge/8175312/webrev.01/ > > > --- > open/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1CollectedHeap.java:36: > error: cannot find symbol > import sun.jvm.hotspot.gc.shared.PrintRegionClosure; > --- > > Otherwise the change looks good, but I would like to > see the output live. For a big heap this will print > a lot of data, just wondering if the universe > command is the correct choice for this kind of > output. I like having the possibility to print all > regions, so I want the change but maybe it should be > a different command and 'universe' just prints a > little more than before. Something like our logging > heap-summary at shutdown: > garbage-first heap   total 16384K, used 3072K > [0x00000000ff000000, 0x0000000100000000) >   region size 1024K, 4 young (4096K), 0 survivors (0K) > Metaspace       used 6731K, capacity 6825K, > committed 7040K, reserved 1056768K >   class space    used 559K, capacity 594K, > committed 640K, reserved 1048576K > > > Ok, will add this, and could probably have the region > details displayed under a new command called > "g1regiondetails", or some such, and send out a new webrev. > > Thanks, > Jini. > > > Thanks, > Stefan > > Modifications have been made to display the > regions like: > > ... > Region: > 0x00000005c5400000,0x00000005c5600000,0x00000005c5600000:Old > Region: > 0x00000005c5600000,0x00000005c5800000,0x00000005c5800000:Old > Region: > 0x00000005c5800000,0x00000005c5a00000,0x00000005c5a00000:Old > Region: > 0x00000005c5a00000,0x00000005c5c00000,0x00000005c5c00000:Old > Region: > 0x00000005c5c00000,0x00000005c5c00000,0x00000005c5e00000:Free > Region: > 0x00000005c5e00000,0x00000005c5e00000,0x00000005c6000000:Free > Region: > 0x00000005c6000000,0x00000005c6200000,0x00000005c6200000:Old > ... > > The jtreg test at this point does not > include any testing for the display of > archived or pinned regions.
 The testing for > this will be added once JDK-8174994 is resolved. > > The SA tests pass with jprt and Mach5. > > Thanks, > Jini. > > > > From serguei.spitsyn at oracle.com Thu Mar 15 04:48:21 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 14 Mar 2018 21:48:21 -0700 Subject: RFR 8166642: serviceability/dcmd/framework/* timeout In-Reply-To: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com> References: <2F3BE7C8-AB3A-461E-80F8-82FA77C25DCE@oracle.com> Message-ID: Hi Daniil, It looks good. I've consulted with Igor Ignatyev, and he suggested exactly the same approach. Thanks, Serguei On 3/13/18 22:26, Daniil Titov wrote: > Please review the changes that fix the intermittent timeout failures of serviceability/dcmd/framework/* tests. > > The problem here is that these tests invoke jcmd in different ways, and one such way is to pass a main class to jcmd as a VM identifier. The main class for a jtreg test is com.sun.javatest.regtest.agent.MainWrapper, and in some cases more than one test is running in parallel, so there are multiple Java processes with com.sun.javatest.regtest.agent.MainWrapper as their main class. When this happens, jcmd iterates over all Java processes that match the condition (the main class equals com.sun.javatest.regtest.agent.MainWrapper) and executes the command for each of them. As a result, jcmd invokes the given command multiple times and attaches to Java processes not related to the current test. > > The fix makes serviceability/dcmd/framework/* tests non-concurrent to ensure that they don't interact with other tests. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8166642 > Webrev: http://cr.openjdk.java.net/~dtitov/8166642/webrev.01 > > The tests ran successfully with Mach5. 
> > Best regards, > Daniil > > From Alan.Bateman at oracle.com Thu Mar 15 11:20:06 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 15 Mar 2018 11:20:06 +0000 Subject: RFR of JDK-8199215: Re-examine getFreePort method in test infrastructure library In-Reply-To: References: <5a52026a-b3a3-8e16-2a85-1c2bd239e266@oracle.com> Message-ID: <067292b0-a167-8877-df46-73a081c99301@oracle.com> On 15/03/2018 08:43, Hamlin Li wrote: > : > > Hi Alan, > > Thank you for reviewing, I have updated the webrev in place. (cc'ing serviceability-dev and net-dev as these are the other areas that use the getFreePort method in the test library. For context, the patch that we are discussing is: http://cr.openjdk.java.net/~mli/8199215/webrev.00/ ) The new implementation of getFreePort looks good but it no longer throws InterruptedException and so might need some of the usages (esp. in the serviceability tests) to be updated. Also the comment "The function will spin ..." is no longer relevant and can be removed. Moving refusingEndpoint() from the NIO test to Utils looks okay. The "it's much more stable ..." in the method description looks a bit inconsistent with the other wording. An alternative is "This method is better choice than getFreePort for tests that need an endpoint that refuses connections". The updates to the tests look okay to me. 
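For readers without the webrev at hand, the two helpers under discussion can be approximated as follows. This is a minimal sketch under stated assumptions, not the code in the patch: the class name `FreePortSketch`, the retry limit and the connect timeout are made up for illustration; only the method names `getFreePort` and `refusingEndpoint` come from the thread.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for the test library helpers; not the patch itself.
final class FreePortSketch {
    private static final int MAX_ATTEMPTS = 10;       // assumed retry limit
    private static final int CONNECT_TIMEOUT_MS = 1000; // assumed timeout

    /** Asks the OS for an ephemeral port by binding to port 0, then releases it. */
    static int getFreePort() throws IOException {
        try (ServerSocket ss = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return ss.getLocalPort();
        }
    }

    /** Returns a loopback address on which a connection attempt was actually refused. */
    static InetSocketAddress refusingEndpoint() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            InetSocketAddress candidate = new InetSocketAddress(loopback, getFreePort());
            try (Socket s = new Socket()) {
                s.connect(candidate, CONNECT_TIMEOUT_MS);
                // Connected: another process is listening on the port, so try again.
            } catch (ConnectException refused) {
                return candidate;
            }
        }
        throw new IOException("no refusing endpoint found after " + MAX_ATTEMPTS + " attempts");
    }

    public static void main(String[] args) throws IOException {
        System.out.println("free port: " + getFreePort());
        System.out.println("refusing endpoint: " + refusingEndpoint());
    }
}
```

Note that a port returned by a helper of this shape is only free at the moment the server socket is closed; another process can bind it before the caller does. Tests that need "nothing is listening here" are therefore better served by checking for a refused connection than by trusting a port number alone.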
 -Alan From christoph.langer at sap.com Thu Mar 15 14:42:13 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 15 Mar 2018 14:42:13 +0000 Subject: RFR (XS): 8199010: attachListener.hpp: Fix potential null termination issue found by coverity scans In-Reply-To: <95d10e87dc5f4531bd576965a9ab43c2@sap.com> References: <3d24bae5a92348b7ba766bd84b99a15b@sap.com> <8f5085d2-756f-e397-0741-3fcfd8ea687e@oracle.com> <22c65f6a-8df3-713d-fb75-6038bac9b345@oracle.com> <15110e01-1bfe-9cfc-15fe-d722d15bb2b7@oracle.com> <7ab7c0b7-69ad-5b94-dd98-19790288bebb@oracle.com> <18e077534cd940539cd1cbb93943e9c8@sap.com> <6d9ec2ef8f5d400c8ea4a794388fd1c8@sap.com> <5933353c-8f49-85cc-3e01-6e80982a3d7f@oracle.com> <95d10e87dc5f4531bd576965a9ab43c2@sap.com> Message-ID: <3eb5a42129b94191af337e6e66b822e2@sap.com> Hi, I just pushed it after successfully running it through the hs submit repo: http://hg.openjdk.java.net/jdk/hs/rev/f654b37c58a1 Thanks Christoph > -----Original Message----- > From: Langer, Christoph > Sent: Montag, 12. März 2018 14:24 > To: 'Chris Plummer' > Cc: serviceability-dev at openjdk.java.net; Hotspot dev runtime runtime-dev at openjdk.java.net>; David Holmes > ; Thomas Stüfe > Subject: RE: RFR (XS): 8199010: attachListener.hpp: Fix potential null > termination issue found by coverity scans > > Hi, > > here is the final webrev for pushing: > http://cr.openjdk.java.net/~clanger/webrevs/8199010.2/ I also did a little > sorting in the include files (alphabetical order). The tests at SAP went fine > and the coverity build was satisfied, too. > > Thanks in advance, Chris, for sponsoring. > > Best regards > Christoph > > > -----Original Message----- > > From: Chris Plummer [mailto:chris.plummer at oracle.com] > > Sent: Freitag, 9.
 März 2018 17:02 > > To: Langer, Christoph > > Cc: serviceability-dev at openjdk.java.net; Hotspot dev runtime > runtime-dev at openjdk.java.net>; David Holmes > > ; Thomas Stüfe > > Subject: Re: RFR (XS): 8199010: attachListener.hpp: Fix potential null > > termination issue found by coverity scans > > > > On 3/9/18 4:50 AM, Langer, Christoph wrote: > > > Hi Chris, > > > > > >>> Secondly, it doesn't accept the assert as length check and complains: > > >>> fixed_size_dest: You might overrun the 17-character fixed-size string > > this- > > >>> _name by copying name without checking the length. > > >> Agreed that the assert is not a length check in product builds. However, > > >> the only caller has a length check. Have you tried moving this length > > >> check into set_name() and see if the problem goes away? Although I > > don't > > >> suggest that as a fix. Just curious as to what the result would be. > > > When doing a length check in set_name(), coverity would be pleased. But > > still we'd have to handle length violations by either guaranteeing or > returning > > some error return code, or quietly truncating. But you say you don't > suggest > > it as fix anyway... > > > > > >> BTW, I just realized I had been ignoring the set_arg() changes all this > > >> time and focused on set_name(). So if any of the complaints are unique > > >> to set_arg() please let me know. > > > No, nothing unique. > > > > > >>> And, 3rd, it considers the risk as elevated: > > >>> parameter_as_source: Note: This defect has an elevated risk because > > the > > >> source argument is a parameter of the current function. > > >> Is this a complaint about "name" being a source argument to strcpy()? If > > >> so, I don't get this one. How are you going to copy "name" without > > >> specifying it as an argument to something (strcpy, strncpy, memcpy, > > >> etc). Besides, it is being passed to strcpy as a const argument.
 Makes > > >> me wonder if adding const to the parameter declarations for both > > >> set_name() and enqueue() would help. > > > I think coverity just considers this finding as elevated because the input > > data isn't something static from inside the method but comes in as an > argument. > > > > > >>> In my opinion the points are valid, because in opt builds there would > be > > no length check. > > >> But there is a length check in the caller. Does coverity not see checks up > > the call chain? > > > Obviously not. > > > > > >>> I really think it would be easiest to go to my proposed patch. And it > > doesn't > > >>> come with much cost and the place probably isn't performance > relevant. > > >> I'm not worried about performance. To me it has more to do with taking > > >> easy-to-read code and changing it into something that someone would > > >> stare at for a bit before figuring out what it's doing, and then ask > > >> "Why so complicated?". Coverity is supposed to help us make our code > > >> better. I don't see that being the case here. If in the end your changes > > >> are the simplest approach to quieting coverity, then I guess that's what > > >> we should go with. However, I'm still not convinced we really fully understand why > > >> coverity is not happy with a strcpy that can be statically shown to be > > >> safe. Is it a coverity bug? Is there a call path we are missing? > > >> Something else that makes it hard for coverity to statically check this? > > >> That's one reason I'd like to see what happens if a check is put > > >> directly in set_name. > > > OK, so let me summarize: > > > The code as it is right now has a little issue - which isn't obvious at a quick > > glance by the way. > > > It can be fixed like I suggested. This would add two lines of code at each > > place and one can argue about how easy it is to understand. To me it seems > > as understandable as it was before - but I'm probably a bit concerned here.
> > In terms of readability, I was referring back to the original code that > > just had the strlen. It was the original coverity fix to that code that > > introduced the readability issue. You aren't really doing much to make it > > less readable. > > > I can suggest an alternative which might be easier to read: > > http://cr.openjdk.java.net/~clanger/webrevs/8199010.1/ It comes at the > > cost of 2 calls to strlen() in dbg builds but it has one line of code less and > > might be more straightforward to understand. > > > All larger refactoring of set_name() and set_arg() is beyond the scope of > > my change. > > I like this version better, although it doesn't change my opinion that > > this is still all jumping through hoops to get coverity to stop > > complaining about something that is perfectly fine. > > > > > > Now I'd really like if you could accept one of my 2 proposals, given that > also > > Thomas and David think it's ok. I want to get this done now. Maybe you > can > > even sponsor it... > > Yeah, I'm ok with the change. I've said my piece and just don't want to > > get in the way of a simple fix. Yes, I can also sponsor it for you. > > > > cheers, > > > > Chris > > > > > > Thanks & Best regards > > > Christoph > > > > > From magnus.ihse.bursie at oracle.com Thu Mar 15 18:22:17 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 15 Mar 2018 19:22:17 +0100 Subject: RFR: JDK-8199682 Clean up building the saproc library Message-ID: The saproc library has historically been built in quite odd ways on almost all platforms. When the old build system was converted, this was not changed. However, now the time has come to streamline this and build this library just as any other. The most visible change, perhaps, is that the library is now named saproc on all platforms, even Windows. Other changes include: * Don't set flags that are already set by the default flags. * Don't set flags that do not have any effect.
* Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly okay to have it. * Don't set CXX linker on solaris -- this was not needed so no reason to do it. * Cleaned up some old hooks for closed code that is no longer needed. I have verified this using COMPARE_BUILD. This shows only the expected differences: * On all platforms: class file changes for WindbgDebuggerLocal.java. * On solaris: some minor symbol differences, since the linker now uses C framework functions instead of C++. (And with symbol changes always comes disasm changes.) * On linux: a binary difference for libsaproc.so, but no size/symbol/deps/disasm change. * On macosx: no changes at all. * On windows: sawindbg.dll is renamed to saproc.dll. When I made a manual comparison between the two files, I found no significant differences. Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 WebRev: http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 /Magnus From erik.joelsson at oracle.com Thu Mar 15 18:39:42 2018 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Thu, 15 Mar 2018 11:39:42 -0700 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: References: Message-ID: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> Looks good to me. The removed source files, are those some kind of tests? /Erik On 2018-03-15 11:22, Magnus Ihse Bursie wrote: > The saproc library has historically been built in quite odd ways on > almost all platforms. When the old build system was converted, this > was not changed. > > However, now the time has come to streamline this and build this > library just as any other. > > The most visible change, perhaps, is that the library is now named > saproc on all platforms, even Windows. Other changes include: > * Don't set flags that is already set by the default flags. > * Don't set flags that do not have anny effect. > * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly > okay to have it. 
> * Don't set CXX linker on solaris -- this was not needed so no reason > to do it. > * Cleaned up some old hooks for closed code that is no longer needed. > > I have verified this using COMPARE_BUILD. This shows only the expected > differences: > * On all platforms: class file changes for WindbgDebuggerLocal.java. > * On solaris: some minor symbol differences, since the linker now uses > C framework functions instead of C++. (And with symbol changes always > comes disasm changes.) > * On linux: a binary difference for libsaproc.so, but no > size/symbol/deps/disasm change. > * On macosx: no changes at all. > * On windows: sawindbg.dll is renamed to saproc.dll. When I made a > manual comparison between the two files, I found no significant > differences. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 > > /Magnus > From magnus.ihse.bursie at oracle.com Thu Mar 15 18:49:25 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 15 Mar 2018 19:49:25 +0100 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> References: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> Message-ID: On 2018-03-15 19:39, Erik Joelsson wrote: > Looks good to me. > > The removed source files, are those some kind of tests? I don't really know; they have been excluded from the build for all time. My guess is that the Bsd* stuff is, like in the case of the sound libraries, bsd-based stuff that arrived with the mac port (but disabled). The test.c is a trivial main() method which looks more like a left-over adhoc testing from the initial developer. Perhaps someone wants to turn it into a proper test, but it seems like it's not much even to start with. (And hopefully we have much better real test coverage of this now.) 
/Magnus > > /Erik > > > On 2018-03-15 11:22, Magnus Ihse Bursie wrote: >> The saproc library has historically been built in quite odd ways on >> almost all platforms. When the old build system was converted, this >> was not changed. >> >> However, now the time has come to streamline this and build this >> library just as any other. >> >> The most visible change, perhaps, is that the library is now named >> saproc on all platforms, even Windows. Other changes include: >> * Don't set flags that is already set by the default flags. >> * Don't set flags that do not have anny effect. >> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly >> okay to have it. >> * Don't set CXX linker on solaris -- this was not needed so no reason >> to do it. >> * Cleaned up some old hooks for closed code that is no longer needed. >> >> I have verified this using COMPARE_BUILD. This shows only the >> expected differences: >> * On all platforms: class file changes for WindbgDebuggerLocal.java. >> * On solaris: some minor symbol differences, since the linker now >> uses C framework functions instead of C++. (And with symbol changes >> always comes disasm changes.) >> * On linux: a binary difference for libsaproc.so, but no >> size/symbol/deps/disasm change. >> * On macosx: no changes at all. >> * On windows: sawindbg.dll is renamed to saproc.dll. When I made a >> manual comparison between the two files, I found no significant >> differences. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >> >> /Magnus >> > From david.holmes at oracle.com Fri Mar 16 03:13:00 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 16 Mar 2018 13:13:00 +1000 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: References: Message-ID: <927e4b35-3174-90aa-857f-d27d2f90ad3c@oracle.com> Hi Magnus, Overall this seems okay. 
On 16/03/2018 4:22 AM, Magnus Ihse Bursie wrote: > The saproc library has historically been built in quite odd ways on > almost all platforms. When the old build system was converted, this was > not changed. > > However, now the time has come to streamline this and build this library > just as any other. > > The most visible change, perhaps, is that the library is now named > saproc on all platforms, even Windows. Other changes include: That could have repercussions elsewhere. sawindbg.dll is probably a well known name for deployment systems. > * Don't set flags that is already set by the default flags. > * Don't set flags that do not have anny effect. > * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly > okay to have it. > * Don't set CXX linker on solaris -- this was not needed so no reason to > do it. > * Cleaned up some old hooks for closed code that is no longer needed. Right - we could have deleted that when our ARM ports went open. > I have verified this using COMPARE_BUILD. This shows only the expected > differences: > * On all platforms: class file changes for WindbgDebuggerLocal.java. > * On solaris: some minor symbol differences, since the linker now uses C > framework functions instead of C++. (And with symbol changes always > comes disasm changes.) > * On linux: a binary difference for libsaproc.so, but no > size/symbol/deps/disasm change. > * On macosx: no changes at all. > * On windows: sawindbg.dll is renamed to saproc.dll. When I made a > manual comparison between the two files, I found no significant > differences. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 The deleted OSX files seem okay. This just seems like another case where the original port copied every Linux file across to the bsd directory. Not sure about the Solaris saproc_audit.cpp or the test.c files ?? 
Thanks, David > /Magnus > From huaming.li at oracle.com Fri Mar 16 01:54:48 2018 From: huaming.li at oracle.com (Hamlin Li) Date: Fri, 16 Mar 2018 09:54:48 +0800 Subject: RFR of JDK-8199215: Re-examine getFreePort method in test infrastructure library In-Reply-To: <067292b0-a167-8877-df46-73a081c99301@oracle.com> References: <5a52026a-b3a3-8e16-2a85-1c2bd239e266@oracle.com> <067292b0-a167-8877-df46-73a081c99301@oracle.com> Message-ID: On 15/03/2018 7:20 PM, Alan Bateman wrote: > On 15/03/2018 08:43, Hamlin Li wrote: >> : >> >> Hi Alan, >> >> Thank you for reviewing, I have updated the webrev in place. > ( cc'ing serviceability-dev and net-dev as these are the other areas > that use the getFreePort method in the test library. For context, the > patch that we are discussing is: > ??? http://cr.openjdk.java.net/~mli/8199215/webrev.00/ ) > > The new implementation of getFreePort looks good but it no longer > throws InterruptedException and so might need some of the usages (esp. > in the serviceability tests) to be updated. Also the comment "The > function will spin ..." is no longer relevant and can be removed. > > Moving refusingEndpoint() from the NIO test to Utils looks okay. The > "it's much more stable ..." in the method description looks a it > inconsistent with the other wording. An alternative is "This method is > better choice than getFreePort for tests that need an endpoint that > refuses connections". > > The update to the tests look okay to me. Hi Alan, Thank you for detailed reviewing. I have updated the webrev in place. 
(http://cr.openjdk.java.net/~mli/8199215/webrev.00/) Thank you -Hamlin > > -Alan > From sundararajan.athijegannathan at oracle.com Fri Mar 16 05:14:57 2018 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Fri, 16 Mar 2018 10:44:57 +0530 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: References: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> Message-ID: <5AAB52D1.6040707@oracle.com> Renaming sawindbg as saproc sounds odd. For Linux, Solaris/Unix, we either use /proc & libproc, so calling saproc for those makes sense. But Windows? We have a separate debugger class to load platform specific native library. What is the reason for uniform naming? -Sundar On 16/03/18, 12:19 AM, Magnus Ihse Bursie wrote: > > > On 2018-03-15 19:39, Erik Joelsson wrote: >> Looks good to me. >> >> The removed source files, are those some kind of tests? > I don't really know; they have been excluded from the build for all > time. My guess is that the Bsd* stuff is, like in the case of the > sound libraries, bsd-based stuff that arrived with the mac port (but > disabled). The test.c is a trivial main() method which looks more like > a left-over adhoc testing from the initial developer. Perhaps someone > wants to turn it into a proper test, but it seems like it's not much > even to start with. (And hopefully we have much better real test > coverage of this now.) > > /Magnus >> >> /Erik >> >> >> On 2018-03-15 11:22, Magnus Ihse Bursie wrote: >>> The saproc library has historically been built in quite odd ways on >>> almost all platforms. When the old build system was converted, this >>> was not changed. >>> >>> However, now the time has come to streamline this and build this >>> library just as any other. >>> >>> The most visible change, perhaps, is that the library is now named >>> saproc on all platforms, even Windows. Other changes include: >>> * Don't set flags that is already set by the default flags. 
>>> * Don't set flags that do not have anny effect. >>> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's >>> perfectly okay to have it. >>> * Don't set CXX linker on solaris -- this was not needed so no >>> reason to do it. >>> * Cleaned up some old hooks for closed code that is no longer needed. >>> >>> I have verified this using COMPARE_BUILD. This shows only the >>> expected differences: >>> * On all platforms: class file changes for WindbgDebuggerLocal.java. >>> * On solaris: some minor symbol differences, since the linker now >>> uses C framework functions instead of C++. (And with symbol changes >>> always comes disasm changes.) >>> * On linux: a binary difference for libsaproc.so, but no >>> size/symbol/deps/disasm change. >>> * On macosx: no changes at all. >>> * On windows: sawindbg.dll is renamed to saproc.dll. When I made a >>> manual comparison between the two files, I found no significant >>> differences. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >>> WebRev: >>> http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >>> >>> /Magnus >>> >> > From Alan.Bateman at oracle.com Fri Mar 16 08:05:48 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 16 Mar 2018 08:05:48 +0000 Subject: RFR of JDK-8199215: Re-examine getFreePort method in test infrastructure library In-Reply-To: References: <5a52026a-b3a3-8e16-2a85-1c2bd239e266@oracle.com> <067292b0-a167-8877-df46-73a081c99301@oracle.com> Message-ID: <9ef60dff-7d40-f772-2ecf-74eb13e79391@oracle.com> On 16/03/2018 01:54, Hamlin Li wrote: > : > > Hi Alan, > Thank you for detailed reviewing. I have updated the webrev in place. > (http://cr.openjdk.java.net/~mli/8199215/webrev.00/) Looks good, just a minor typo "is better choice" -> "is a better choice". Just to confirm, have you run the serviceability and http client tests to make sure that they compile with this change? 
-Alan From huaming.li at oracle.com Fri Mar 16 09:00:42 2018 From: huaming.li at oracle.com (Hamlin Li) Date: Fri, 16 Mar 2018 17:00:42 +0800 Subject: RFR of JDK-8199215: Re-examine getFreePort method in test infrastructure library In-Reply-To: <9ef60dff-7d40-f772-2ecf-74eb13e79391@oracle.com> References: <5a52026a-b3a3-8e16-2a85-1c2bd239e266@oracle.com> <067292b0-a167-8877-df46-73a081c99301@oracle.com> <9ef60dff-7d40-f772-2ecf-74eb13e79391@oracle.com> Message-ID: <668c3c63-ff56-2e00-dc60-5e85c478fd48@oracle.com> On 16/03/2018 4:05 PM, Alan Bateman wrote: > On 16/03/2018 01:54, Hamlin Li wrote: >> : >> >> Hi Alan, >> Thank you for detailed reviewing. I have updated the webrev in place. >> (http://cr.openjdk.java.net/~mli/8199215/webrev.00/) > Looks good, just a minor typo "is better choice" -> "is a better choice". Hi Alan, Thank you, I will modify it when push. > > Just to confirm, have you run the serviceability and http client tests > to make sure that they compile with this change? Yes, I ran tier1,tier2,tier3 tests(I think it includes httpclient tests), and also specific tests using Utils.getFreePort in svc area. I think I'm good to push? Thank you -Hamlin > > -Alan From Alan.Bateman at oracle.com Fri Mar 16 09:23:13 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 16 Mar 2018 09:23:13 +0000 Subject: RFR of JDK-8199215: Re-examine getFreePort method in test infrastructure library In-Reply-To: <668c3c63-ff56-2e00-dc60-5e85c478fd48@oracle.com> References: <5a52026a-b3a3-8e16-2a85-1c2bd239e266@oracle.com> <067292b0-a167-8877-df46-73a081c99301@oracle.com> <9ef60dff-7d40-f772-2ecf-74eb13e79391@oracle.com> <668c3c63-ff56-2e00-dc60-5e85c478fd48@oracle.com> Message-ID: On 16/03/2018 09:00, Hamlin Li wrote: > : >> >> Just to confirm, have you run the serviceability and http client >> tests to make sure that they compile with this change? 
> Yes, I ran tier1,tier2,tier3 tests(I think it includes httpclient > tests), and also specific tests using Utils.getFreePort in svc area. > I think I'm good to push? Thanks for confirming, I was just surprised that none of the tests in the other areas needed changes.? So yes, go ahead! -Alan From magnus.ihse.bursie at oracle.com Fri Mar 16 11:49:05 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 16 Mar 2018 12:49:05 +0100 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: <927e4b35-3174-90aa-857f-d27d2f90ad3c@oracle.com> References: <927e4b35-3174-90aa-857f-d27d2f90ad3c@oracle.com> Message-ID: On 2018-03-16 04:13, David Holmes wrote: > Hi Magnus, > > Overall this seems okay. Thanks! > > On 16/03/2018 4:22 AM, Magnus Ihse Bursie wrote: >> The saproc library has historically been built in quite odd ways on >> almost all platforms. When the old build system was converted, this >> was not changed. >> >> However, now the time has come to streamline this and build this >> library just as any other. >> >> The most visible change, perhaps, is that the library is now named >> saproc on all platforms, even Windows. Other changes include: > > That could have repercussions elsewhere. sawindbg.dll is probably a > well known name for deployment systems. You mean other classes than WindbgDebuggerLocal.java, out in the wild, might load sawindbg.dll directly and call into it? If they do so, they must also be prepared that this is not an exported interface and can change at any time. > >> * Don't set flags that is already set by the default flags. >> * Don't set flags that do not have anny effect. >> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly >> okay to have it. >> * Don't set CXX linker on solaris -- this was not needed so no reason >> to do it. >> * Cleaned up some old hooks for closed code that is no longer needed. > > Right - we could have deleted that when our ARM ports went open. 
> >> I have verified this using COMPARE_BUILD. This shows only the >> expected differences: >> * On all platforms: class file changes for WindbgDebuggerLocal.java. >> * On solaris: some minor symbol differences, since the linker now >> uses C framework functions instead of C++. (And with symbol changes >> always comes disasm changes.) >> * On linux: a binary difference for libsaproc.so, but no >> size/symbol/deps/disasm change. >> * On macosx: no changes at all. >> * On windows: sawindbg.dll is renamed to saproc.dll. When I made a >> manual comparison between the two files, I found no significant >> differences. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 > > The deleted OSX files seem okay. This just seems like another case > where the original port copied every Linux file across to the bsd > directory. > > Not sure about the Solaris saproc_audit.cpp or the test.c files ?? I don't know either. :) As I said to Erik, the test files looked like stupid adhoc testing just left in place. The saproc_audit.cpp looks legit, but has not been compiled for years. Someone must have "removed" the file by excluding it from compilation, rather than deleting it. Could have happened back in the bad old days when "solaris" didn't mean solaris but "unix", and nobody understood the consequences of deleting files there. As always, the file is still in the repository, if someone wants to revive it. 
/Magnus > > Thanks, > David > >> /Magnus >> From david.holmes at oracle.com Fri Mar 16 12:47:20 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 16 Mar 2018 22:47:20 +1000 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: References: <927e4b35-3174-90aa-857f-d27d2f90ad3c@oracle.com> Message-ID: <8306d8e1-c0a1-c41d-56f7-073a0899a630@oracle.com> On 16/03/2018 9:49 PM, Magnus Ihse Bursie wrote: > On 2018-03-16 04:13, David Holmes wrote: >> Hi Magnus, >> >> Overall this seems okay. > Thanks! > >> >> On 16/03/2018 4:22 AM, Magnus Ihse Bursie wrote: >>> The saproc library has historically been built in quite odd ways on >>> almost all platforms. When the old build system was converted, this >>> was not changed. >>> >>> However, now the time has come to streamline this and build this >>> library just as any other. >>> >>> The most visible change, perhaps, is that the library is now named >>> saproc on all platforms, even Windows. Other changes include: >> >> That could have repercussions elsewhere. sawindbg.dll is probably a >> well known name for deployment systems. > You mean other classes than WindbgDebuggerLocal.java, out in the wild, > might load sawindbg.dll directly and call into it? If they do so, they > must also be prepared that this is not an exported interface and can > change at any time. No I mean deployment systems, like an upstream RPM manager, or Oracle's own installer process, may know the name of the file and have to be modified if the name changes. Though as Sundar said "proc" isn't really the right name. David ----- >> >>> * Don't set flags that is already set by the default flags. >>> * Don't set flags that do not have anny effect. >>> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly >>> okay to have it. >>> * Don't set CXX linker on solaris -- this was not needed so no reason >>> to do it. >>> * Cleaned up some old hooks for closed code that is no longer needed. 
>> >> Right - we could have deleted that when our ARM ports went open. >> >>> I have verified this using COMPARE_BUILD. This shows only the >>> expected differences: >>> * On all platforms: class file changes for WindbgDebuggerLocal.java. >>> * On solaris: some minor symbol differences, since the linker now >>> uses C framework functions instead of C++. (And with symbol changes >>> always comes disasm changes.) >>> * On linux: a binary difference for libsaproc.so, but no >>> size/symbol/deps/disasm change. >>> * On macosx: no changes at all. >>> * On windows: sawindbg.dll is renamed to saproc.dll. When I made a >>> manual comparison between the two files, I found no significant >>> differences. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >>> WebRev: >>> http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >> >> The deleted OSX files seem okay. This just seems like another case >> where the original port copied every Linux file across to the bsd >> directory. >> >> Not sure about the Solaris saproc_audit.cpp or the test.c files ?? > I don't know either. :) As I said to Erik, the test files looked like > stupid adhoc testing just left in place. The saproc_audit.cpp looks > legit, but has not been compiled for years. Someone must have "removed" > the file by excluding it from compilation, rather than deleting it. > Could have happened back in the bad old days when "solaris" didn't mean > solaris but "unix", and nobody understood the consequences of deleting > files there. > > As always, the file is still in the repository, if someone wants to > revive it. 
> > /Magnus > >> >> Thanks, >> David >> >>> /Magnus >>> > From magnus.ihse.bursie at oracle.com Fri Mar 16 18:12:08 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 16 Mar 2018 19:12:08 +0100 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: <5AAB52D1.6040707@oracle.com> References: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> <5AAB52D1.6040707@oracle.com> Message-ID: Hi Sundar, I almost missed your mail, since you removed both me and build-dev from the cc list... > On 16 March 2018, at 06:14, Sundararajan Athijegannathan wrote: > > Renaming sawindbg as saproc sounds odd. For Linux, Solaris/Unix, we either use /proc & libproc, so calling saproc for those makes sense. But Windows? We have a separate debugger class to load platform specific native library. What is the reason for uniform naming? This is the only library in the JDK that has a different name on different platforms. This clashes with the design of the build system, and requires a clunky workaround. For the upcoming changes in the build system, this goes from an annoyance to a blocker. No other components have their names based on the OS functionality they use, even if they use vastly different APIs on different platforms; rather they are named after the services they provide to the JDK. My assumption was that "saproc" meant "serviceability agent process handling", and that this was a reasonable name for all platforms. Also, the source code for all platforms resides in the "libsaproc" directory, which is consistent with the JDK standard for matching source code to native library. But if you believe this is an inappropriate name, let's work together to find a name that works for all platforms. This of course will lead to new names for the current libsaproc.* libraries, and the source code directories. /Magnus > > -Sundar > > On 16/03/18, 12:19 AM, Magnus Ihse Bursie wrote: >> >> >> On 2018-03-15 19:39, Erik Joelsson wrote: >>> Looks good to me.
>>> >>> The removed source files, are those some kind of tests? >> I don't really know; they have been excluded from the build for all time. My guess is that the Bsd* stuff is, like in the case of the sound libraries, bsd-based stuff that arrived with the mac port (but disabled). The test.c is a trivial main() method which looks more like a left-over adhoc testing from the initial developer. Perhaps someone wants to turn it into a proper test, but it seems like it's not much even to start with. (And hopefully we have much better real test coverage of this now.) >> >> /Magnus >>> >>> /Erik >>> >>> >>> On 2018-03-15 11:22, Magnus Ihse Bursie wrote: >>>> The saproc library has historically been built in quite odd ways on almost all platforms. When the old build system was converted, this was not changed. >>>> >>>> However, now the time has come to streamline this and build this library just as any other. >>>> >>>> The most visible change, perhaps, is that the library is now named saproc on all platforms, even Windows. Other changes include: >>>> * Don't set flags that is already set by the default flags. >>>> * Don't set flags that do not have anny effect. >>>> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly okay to have it. >>>> * Don't set CXX linker on solaris -- this was not needed so no reason to do it. >>>> * Cleaned up some old hooks for closed code that is no longer needed. >>>> >>>> I have verified this using COMPARE_BUILD. This shows only the expected differences: >>>> * On all platforms: class file changes for WindbgDebuggerLocal.java. >>>> * On solaris: some minor symbol differences, since the linker now uses C framework functions instead of C++. (And with symbol changes always comes disasm changes.) >>>> * On linux: a binary difference for libsaproc.so, but no size/symbol/deps/disasm change. >>>> * On macosx: no changes at all. >>>> * On windows: sawindbg.dll is renamed to saproc.dll. 
When I made a manual comparison between the two files, I found no significant differences. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >>>> WebRev: http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >>>> >>>> /Magnus >>>> >>> >> From chris.plummer at oracle.com Fri Mar 16 18:20:39 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 16 Mar 2018 11:20:39 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> Message-ID: <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> Hi, I've resolved the issues I had before with not seeing all the stderr output when I tried to capture it. What I'd like to do now is have us decide how the output should be handled from the perspective of a LingeredApp user (driver app). Currently all LingeredApp stdout is captured and gets returned to the driver app by calling app.getAppOutput(). It does not appear in the .jtr file, but the test would have the option of dumping it there if it cared to. Only one test uses app.getAppOutput(). Currently all the LingeredApp stderr is redirected to the console, so it does not appear in the .jtr file. So how do we want this changed? Some possibilities are: (1) capture stderr just like stdout currently is, and leave it up to the driver app to decide if it wants to display it (after the app terminates). (2) capture stderr just like stdout currently is, but have LingeredApp automatically send captured output to driver app's stdout and stderr (after the app terminates). (3) send the LingeredApp's stdout and stderr to the driver app's stdout as it is being captured (this was the original fix Igor suggested and the webrev supported). A minor alternative to this is to keep the two streams separated instead of sending both to stdout.
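A minimal sketch of what option (2) might look like, assuming plain java.lang.Process plumbing rather than the actual LingeredApp internals. The names DeferredOutputSketch, Drainer, drain and runAndReplay are hypothetical and not part of the test library. Each stream gets its own reader thread so the child cannot stall on a full pipe, and the captured text is replayed to separate streams only after the process has exited.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the LingeredApp implementation.
final class DeferredOutputSketch {

    /** Reads one stream to EOF on its own daemon thread and keeps the bytes. */
    static final class Drainer extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

        Drainer(InputStream in) {
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] chunk = new byte[4096];
            int n;
            try {
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
            } catch (IOException e) {
                // Stream closed early; keep whatever was read so far.
            }
        }

        /** Waits for EOF, then returns everything that was captured. */
        String result() {
            try {
                join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Starts draining the given stream in the background. */
    static Drainer drain(InputStream in) {
        Drainer d = new Drainer(in);
        d.start();
        return d;
    }

    /** Runs the child, then replays its stdout to out and its stderr to err. */
    static int runAndReplay(ProcessBuilder pb, PrintStream out, PrintStream err)
            throws IOException, InterruptedException {
        Process p = pb.start();
        Drainer stdout = drain(p.getInputStream());
        Drainer stderr = drain(p.getErrorStream());
        int exitCode = p.waitFor();
        out.print(stdout.result());
        err.print(stderr.result());
        return exitCode;
    }
}
```

A driver could call runAndReplay(pb, System.out, System.err) so that both streams end up in the test's own output, still separated, without the test having to ask for them.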
Let me know what you think. I'm inclined to go with 2, especially since normally there is little to no output from the LingeredApp. BTW, here's the CR and original webrev for reference: https://bugs.openjdk.java.net/browse/JDK-8198655 http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ thanks, Chris From serguei.spitsyn at oracle.com Fri Mar 16 20:25:29 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 16 Mar 2018 13:25:29 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> Message-ID: <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> Hi Chris, Thank you for taking care of this issue! On 3/16/18 11:20, Chris Plummer wrote: > Hi, > > I've resolved the issues I had before with not seeing all the stderr > output when I tried to capture it. What I'd like to do now is have us > decide how the output should be handled from the perspective a > LingeredApp user (driver app). Currently all LingeredApp stdout is > captured and gets be returned the the driver app by calling > app.getAppOutput(). It does not appear in the .jtr file, but the test > would have the option of dumping it there it it cared to. Only one > test uses app.getAppOutput(). Currently all the LingeredApp stderr is > redirected to the console, so it does not appear in the .jtr file. Just a general comment to make sure I understand it and ensure we are in sync. It seems much safer to always have both stdout and stderr output present in the .jtr file automatically, independently of what the test does. > So how do we want this changed?
Some possibilities are: > > (1) capture stderr just like stdout currently is, and leave is up the > the driver app to decide if it wants to display it (after the app > terminates). It does not look good to me (see above) but maybe I'm missing something important here. > (2) capture stderr just like stdout currently is, but have LingeredApp > automatically send captured output to driver app's stdout and stderr > (after the app terminates). The stdout and stderr will be separated in this case, right? Do you have a webrev for this? > (3) send the LingeredApp's stdout and stderr to the driver app's > stdout as it is being captured (this was the original fix Igor > suggested and the webrev supported). A minor alternative to this is to > keep the two streams separated instead of sending both to stdout. > > Let me know what you think. I'm inclined to go with 2, especially > since normally there is little to no output from the LingeredApp. The choice (2) looks good enough. Not sure it is that important to have output from stdout and stderr sync'ed but it is important to have the stderr present in the .jtr automatically. The choice (3) looks even better if it is going to work well. Not sure it is really necessary.
Thanks, Serguei > > BTW, here's the CR and original webrev for reference: > > https://bugs.openjdk.java.net/browse/JDK-8198655 > http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ > > thanks, > > Chris > From chris.plummer at oracle.com Fri Mar 16 21:48:25 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 16 Mar 2018 14:48:25 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> Message-ID: <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > Thank you for taking care about this issue! > > On 3/16/18 11:20, Chris Plummer wrote: >> Hi, >> >> I've resolved the issues I had before with not seeing all the stderr >> output when I tried to capture it. What I'd like to do now is have us >> decide how the output should be handled from the perspective a >> LingeredApp user (driver app). Currently all LingeredApp stdout is >> captured and gets be returned the the driver app by calling >> app.getAppOutput(). It does not appear in the .jtr file, but the test >> would have the option of dumping it there it it cared to. Only one >> test uses app.getAppOutput(). Currently all the LingeredApp stderr is >> redirected to the console, so it does not appear in the .jtr file. > > Just a general comment to make sure I understand it and ensure we are > in sync. > It seems much more safe to always have both stdout and stderr outputs > present in the .jtr automatically file independently of of what the > test does. > > >> So how do we want this changed? 
Some possibilities are: >> >> (1) capture stderr just like stdout currently is, and leave is up the >> the driver app to decide if it wants to display it (after the app >> terminates). > > It does not look good to me (see above) but maybe I'm missing > something important here. > >> (2) capture stderr just like stdout currently is, but have >> LingeredApp automatically send captured output to driver app's stdout >> and stderr (after the app terminates). > > The stdout and std err will be separated in this case, right? > Do you have a webrev for this? I currently have it working like this, although I need to fix LingeredApp.getAppOutput(). I had to make it return a single String instead of a List of Strings, so this breaks the one test that uses this API. It's easily fixed. Just haven't gotten around to it yet. > > >> (3) send the LingeredApp's stdout and stderr to the driver app's >> stdout as it is being captured (this was the original fix Igor >> suggested and the webrev supported). A minor alternative to this is >> to keep the two streams separated instead of sending both to stdout. >> >> Let me know what you think. I'm inclined to go with 2, especially >> since normally there is little to no output from the LingeredApp. > > The choice (2) looks good enough. > Not sure it is that important to have output from stdout and stderr > sync'ed > but is is important to have the stderr present in the .jtr automatically. > > The choice (3) looks even better if it is going to work well. This is basically what the original webrev did. It sent LingeredApp's stderr and stdout to the driver app's stdout. It's a one-word change to make it send stderr to stderr. I think it has a bug though that did not manifest itself. It seems the new copy() code that is capturing stdout would be contending with the existing InputGobbler code that is doing the same. I would need to fix this to make sure LingeredApp.getAppOutput() still returns all the app's stdout output.
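One way to avoid two consumers competing for the same stream is to have a single reader that both records each line and forwards it as it arrives. The following is only a sketch under that assumption; TeeReaderSketch and tee are hypothetical names and do not reflect the code in the webrev or in LingeredApp.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the LingeredApp implementation.
final class TeeReaderSketch {

    /**
     * Reads lines until EOF, echoing each one to forward as it arrives and
     * returning all of them so the caller can still inspect the output.
     */
    static List<String> tee(InputStream in, PrintStream forward) {
        List<String> captured = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                captured.add(line);      // kept for a getAppOutput()-style accessor
                forward.println(line);   // shown in the driver's own output
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return captured;
    }
}
```

Because the stream has exactly one reader, nothing is lost to a second consumer. In a driver, tee would have to run on its own thread per stream so that stdout and stderr are drained concurrently.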
Chris > Not sure, it is really necessary. > > Thanks, > Serguei > > >> >> BTW, here's the CR and original webrev for reference: >> >> https://bugs.openjdk.java.net/browse/JDK-8198655 >> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >> >> thanks, >> >> Chris >> > From david.holmes at oracle.com Sat Mar 17 07:11:34 2018 From: david.holmes at oracle.com (David Holmes) Date: Sat, 17 Mar 2018 17:11:34 +1000 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> Message-ID: <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> I'm afraid I'm losing track of this change. The key thing is that we should not have a test that launches any other process for which we can not see the output of that process. David On 17/03/2018 7:48 AM, Chris Plummer wrote: > On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> Thank you for taking care about this issue! >> >> On 3/16/18 11:20, Chris Plummer wrote: >>> Hi, >>> >>> I've resolved the issues I had before with not seeing all the stderr >>> output when I tried to capture it. What I'd like to do now is have us >>> decide how the output should be handled from the perspective a >>> LingeredApp user (driver app). Currently all LingeredApp stdout is >>> captured and gets be returned the the driver app by calling >>> app.getAppOutput(). It does not appear in the .jtr file, but the test >>> would have the option of dumping it there it it cared to. Only one >>> test uses app.getAppOutput(). Currently all the LingeredApp stderr is >>> redirected to the console, so it does not appear in the .jtr file. 
>> >> Just a general comment to make sure I understand it and ensure we are >> in sync. >> It seems much more safe to always have both stdout and stderr outputs >> present in the .jtr automatically file independently of of what the >> test does. >> >> >>> So how do we want this changed? Some possibilities are: >>> >>> (1) capture stderr just like stdout currently is, and leave is up the >>> the driver app to decide if it wants to display it (after the app >>> terminates). >> >> It does not look good to me (see above) but maybe I'm missing >> something important here. >> >>> (2) capture stderr just like stdout currently is, but have >>> LingeredApp automatically send captured output to driver app's stdout >>> and stderr (after the app terminates). >> >> The stdout and std err will be separated in this case, right? >> Do you have a webrev for this? > I currently have it working like this, although I need to fix > LingeredApp.getAppOutput(). I had to make it return a single String > instead of a List of Strings, so this breaks the one test that uses this > API. It's easily fixed. Just haven't gotten around to it yet. >> >> >>> (3) send the LingeredApp's stdout and stderr to the driver app's >>> stdout as it is being captured (this was the original fix Igor >>> suggested and the webrev supported). A minor alternative to this is >>> to keep the two streams separated instead of sending both to stdout. >>> >>> Let me know what you think. I'm inclined to go with 2, especially >>> since normally there is little to no output from the LingeredApp. >> >> The choice (2) looks good enough. >> Not sure it is that important to have output from stdout and stderr >> sync'ed >> but is is important to have the stderr present in the .jtr automatically. >> >> The choice (3) looks even better if it is going to work well. > This is basically what the original webrev did. It sent LingeredApp's > stderr and stdout to the the driver apps stdout. 
It's a 1 word change to > make it send stderr to stderr. I think it has a bug though that did not > manifest itself. It seems the new copy() code that is capturing stdout > would be contending with the existing InputGlobbler code that is doing > the same. I would need to fix this to make sure > LingeredApp.getAppOutput() still returns all the apps stdout output. > > Chris >> Not sure, it is really necessary. >> >> Thanks, >> Serguei >> >> >>> >>> BTW, here's the CR and original webrev for reference: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>> >>> thanks, >>> >>> Chris >>> >> > From chris.plummer at oracle.com Mon Mar 19 16:39:15 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 19 Mar 2018 09:39:15 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> Message-ID: <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> Hi David, Just to clarify one point, most of the tests that use OutputAnalyzer do not display process output unless there is an error. So part of the decision here with LingeredApp is when to display the output. Currently the stdout is captured, but not displayed, unless the tests does the work to display it, which none do. Currently stderr goes to the console. Note that some negative tests actually cause some expected stderr output, although the tests don't check for it. One thought I just had is to create an async option for OutputAnalyzer so it doesn't block until the process exits. 
Basically that means splitting ProcessTools.getOutput() so it doesn't block. What I currently have is essentially doing that. It copies ProcessTools.getOutput(), splitting it into two parts. But all this logic is in LingeredApp, and of course doesn't have any of the output error checking support that OutputAnalyzer has, which might be useful for LingeredApp. For example, the negative tests only test that launching the app failed. They could be improved by checking for specific error output. Chris On 3/17/18 12:11 AM, David Holmes wrote: > I'm afraid I'm losing track of this change. > > The key thing is that we should not have a test that launches any > other process for which we can not see the output of that process. > > David > > On 17/03/2018 7:48 AM, Chris Plummer wrote: >> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Thank you for taking care about this issue! >>> >>> On 3/16/18 11:20, Chris Plummer wrote: >>>> Hi, >>>> >>>> I've resolved the issues I had before with not seeing all the >>>> stderr output when I tried to capture it. What I'd like to do now >>>> is have us decide how the output should be handled from the >>>> perspective a LingeredApp user (driver app). Currently all >>>> LingeredApp stdout is captured and gets be returned the the driver >>>> app by calling app.getAppOutput(). It does not appear in the .jtr >>>> file, but the test would have the option of dumping it there it it >>>> cared to. Only one test uses app.getAppOutput(). Currently all the >>>> LingeredApp stderr is redirected to the console, so it does not >>>> appear in the .jtr file. >>> >>> Just a general comment to make sure I understand it and ensure we >>> are in sync. >>> It seems much more safe to always have both stdout and stderr >>> outputs present in the .jtr automatically file independently of of >>> what the test does. >>> >>> >>>> So how do we want this changed?
Some possibilities are: >>>> >>>> (1) capture stderr just like stdout currently is, and leave is up >>>> the the driver app to decide if it wants to display it (after the >>>> app terminates). >>> >>> It does not look good to me (see above) but maybe I'm missing >>> something important here. >>> >>>> (2) capture stderr just like stdout currently is, but have >>>> LingeredApp automatically send captured output to driver app's >>>> stdout and stderr (after the app terminates). >>> >>> The stdout and std err will be separated in this case, right? >>> Do you have a webrev for this? >> I currently have it working like this, although I need to fix >> LingeredApp.getAppOutput(). I had to make it return a single String >> instead of a List of Strings, so this breaks the one test that uses >> this API. It's easily fixed. Just haven't gotten around to it yet. >>> >>> >>>> (3) send the LingeredApp's stdout and stderr to the driver app's >>>> stdout as it is being captured (this was the original fix Igor >>>> suggested and the webrev supported). A minor alternative to this is >>>> to keep the two streams separated instead of sending both to stdout. >>>> >>>> Let me know what you think. I'm inclined to go with 2, especially >>>> since normally there is little to no output from the LingeredApp. >>> >>> The choice (2) looks good enough. >>> Not sure it is that important to have output from stdout and stderr >>> sync'ed >>> but is is important to have the stderr present in the .jtr >>> automatically. >>> >>> The choice (3) looks even better if it is going to work well. >> This is basically what the original webrev did. It sent LingeredApp's >> stderr and stdout to the the driver apps stdout. It's a 1 word change >> to make it send stderr to stderr. I think it has a bug though that >> did not manifest itself. It seems the new copy() code that is >> capturing stdout would be contending with the existing InputGlobbler >> code that is doing the same. 
I would need to fix this to make sure >> LingeredApp.getAppOutput() still returns all the apps stdout output. >> >> Chris >>> Not sure, it is really necessary. >>> >>> Thanks, >>> Serguei >>> >>> >>>> >>>> BTW, here's the CR and original webrev for reference: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>> >>>> thanks, >>>> >>>> Chris >>>> >>> >> From david.holmes at oracle.com Mon Mar 19 20:43:56 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 20 Mar 2018 06:43:56 +1000 Subject: RFR : JDK-8196744 : JMX: Not enough JDP packets received before timeout In-Reply-To: <59657246-1590-45c4-9bb2-a650fcd8cdd4@default> References: <85f7cf50-2791-6a91-20e8-81a98c6239ab@oracle.com> <59657246-1590-45c4-9bb2-a650fcd8cdd4@default> Message-ID: Hi Harsha, Given the negative nature of the test this approach seems quite reasonable. Thanks, David > Harsha Wardhana B > > Ping! Can I have one more review for the below fix? > > Thanks > > Harsha > > On Monday 26 February 2018 10:42 AM, Harsha Wardhana B wrote: > >> Hello All, > >> > >> Requesting for review from one more reviewer. > >> > >> Thanks > >> Harsha > >> > >> On Wednesday 21 February 2018 10:01 AM, Chris Plummer wrote: > >>> Hi Harsha, > >>> > >>> Not a review, but just a request that you add the explanation of the > >>> problem to the CR so we have a record of it. Also, the copyright > >>> needs to be updated. > >>> > >>> thanks, > >>> > >>> Chris > >>> > >>> On 2/20/18 3:30 AM, Harsha Wardhana B wrote: > >>>> Hi All, > >>>> > >>>> Please find the fix below for the Jdp test-case. > >>>> > >>>> issue: https://bugs.openjdk.java.net/browse/JDK-8196028 > >>>> webrev : http://cr.openjdk.java.net/~hb/8196028/webrev.00/ > >>>> > >>>> Fix details : The test was receiving JDP packets from other VM and > >>>> hence the multi-cast socket was not timing-out. The default timeout > >>>> handler was causing test to fail. 
Added a shutdown method that > >>>> passes the test in case of timeout. > >>>> > >>>> Thanks > >>>> Harsha > >>> > >>> > >> > From jcbeyler at google.com Mon Mar 19 21:06:22 2018 From: jcbeyler at google.com (JC Beyler) Date: Mon, 19 Mar 2018 21:06:22 +0000 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: References: <5A819F10.8040201@oracle.com> <5A8414AC.3020209@oracle.com> Message-ID: Hi all, The incremental webrev update is here: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event4_5/ The full webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/ Major change here is: - I've removed the heapMonitoring.cpp code in favor of just having the sampling events as per Serguei's request; I still have to do some overhead measurements but the tests prove the concept can work - Most of the tlab code is unchanged, the only major part is that now things get sent off to event collectors when used and enabled. - Added the interpreter collectors to handle interpreter execution - Updated the name from SetTlabHeapSampling to SetHeapSampling to be more generic - Added a mutex for the thread sampling so that we can initialize an internal static array safely - Ported the tests from the old system to this new one I've also updated the JEP and CSR to reflect these changes: https://bugs.openjdk.java.net/browse/JDK-8194905 https://bugs.openjdk.java.net/browse/JDK-8171119 In order to make this have some forward progress, I've removed the heap sampling code entirely and now rely entirely on the event sampling system. The tests reflect this by using a simplified implementation of what an agent could do: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitor.c (Search for anything mentioning event_storage). I have not taken the time to port the whole code we had originally in heapMonitoring to this. 
I hesitate only because that code was in C++, I'd have to port it to C and this is for tests so perhaps what I have now is good enough? As far as testing goes, I've ported all the relevant tests and then added a few: - Turning the system on/off - Testing using various GCs - Testing using the interpreter - Testing the sampling rate - Testing with objects and arrays - Testing with various threads Finally, as overhead goes, I have the numbers of the system off vs a clean build and I have 0% overhead, which is what we'd want. This was using the Dacapo benchmarks. I am now preparing to run a version with the events on using dacapo and will report back here. Any comments are welcome :) Jc On Thu, Mar 8, 2018 at 4:00 PM JC Beyler wrote: > Hi all, > > I apologize for the delay but I wanted to add an event system and that > took a bit longer than expected and I also reworked the code to take into > account the deprecation of FastTLABRefill. > > This update has four parts: > > A) I moved the implementation from Thread to ThreadHeapSampler inside of > Thread. Would you prefer it as a pointer inside of Thread or like this > works for you? Second question would be would you rather have an > association outside of Thread altogether that tries to remember when > threads are live and then we would have something like: > ThreadHeapSampler::get_sampling_size(this_thread); > > I worry about the overhead of this but perhaps it is not too too bad? > > B) I also have been working on the Allocation event system that sends out > a notification at each sampled event. This will be practical when wanting > to do something at the allocation point. I'm also looking at if the whole > heapMonitoring code could not reside in the agent code and not in the JDK. 
> I'm not convinced but I'm talking to Serguei about it to see/assess :) > - Also added two tests for the new event subsystem > > C) Removed the slow_path fields inside the TLAB code since now > FastTLABRefill is deprecated > > D) Updated the JVMTI documentation and specification for the methods. > > So the incremental webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.09_10/ > > and the full webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.10 > > I believe I have updated the various JIRA issues that track this :) > > Thanks for your input, > Jc > > > On Wed, Feb 14, 2018 at 10:34 PM, JC Beyler wrote: > >> Hi Erik, >> >> I inlined my answers, which the last one seems to answer Robbin's >> concerns about the same thing (adding things to Thread). >> >> On Wed, Feb 14, 2018 at 2:51 AM, Erik Österlund < >> erik.osterlund at oracle.com> wrote: >> >>> Hi JC, >>> >>> Comments are inlined below. >>> >>> >>> On 2018-02-13 06:18, JC Beyler wrote: >>> >>> Hi Erik, >>> >>> Thanks for your answers, I've now inlined my own answers/comments. >>> >>> I've done a new webrev here: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ >>> >>> The incremental is here: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ >>> >>> Note to all: >>> - I've been integrating changes from Erin/Serguei/David comments so >>> this webrev incremental is a bit an answer to all comments in one. I >>> apologize for that :) >>> >>> >>> On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund < >>> erik.osterlund at oracle.com> wrote: >>> >>>> Hi JC, >>>> >>>> Sorry for the delayed reply. >>>> >>>> Inlined answers: >>>> >>>> >>>> On 2018-02-06 00:04, JC Beyler wrote: >>>> >>>>> Hi Erik, >>>>> >>>>> (Renaming this to be folded into the newly renamed thread :)) >>>>> >>>>> First off, thanks a lot for reviewing the webrev! I appreciate it!
>>>>> >>>>> I updated the webrev to: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ >>>>> >>>>> And the incremental one is here: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ >>>>> >>>>> It contains: >>>>> - The change for since from 9 to 11 for the jvmti.xml >>>>> - The use of the OrderAccess for initialized >>>>> - Clearing the oop >>>>> >>>>> I also have inlined my answers to your comments. The biggest question >>>>> will come from the multiple *_end variables. A bit of the logic there >>>>> is due to handling the slow path refill vs fast path refill and >>>>> checking that the rug was not pulled underneath the slowpath. I >>>>> believe that a previous comment was that TlabFastRefill was going to >>>>> be deprecated. >>>>> >>>>> If this is true, we could revert this code a bit and just do a : if >>>>> TlabFastRefill is enabled, disable this. And then deprecate that when >>>>> TlabFastRefill is deprecated. >>>>> >>>>> This might simplify this webrev and I can work on a follow-up that >>>>> either: removes TlabFastRefill if Robbin does not have the time to do >>>>> it or add the support to the assembly side to handle this correctly. >>>>> What do you think? >>>>> >>>> >>>> I support removing TlabFastRefill, but I think it is good to not depend >>>> on that happening first. >>>> >>>> >>> >>> I'm slowly pushing on the FastTLABRefill ( >>> https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping >>> both separate for now though so that we can think of both differently >>> >>> >>> >>>> Now, below, inlined are my answers: >>>>> >>>>> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund >>>>> wrote: >>>>> >>>>>> Hi JC, >>>>>> >>>>>> Hope I am reviewing the right version of your work. Here goes...
>>>>>> >>>>>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>>>>> >>>>>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, >>>>>> size * >>>>>> HeapWordSize, THREAD); >>>>>> 160 >>>>>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>>>>> 162 return result; >>>>>> 163 } >>>>>> >>>>>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>>>>> >>>>>> Done! >>>>> >>>> >>>> More about this later. >>>> >>>> >>>> >>>>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>>>>> >>>>>> So first of all, there seems to quite a few ends. There is an "end", >>>>>> a "hard >>>>>> end", a "slow path end", and an "actual end". Moreover, it seems like >>>>>> the >>>>>> "hard end" is actually further away than the "actual end". So the >>>>>> "hard end" >>>>>> seems like more of a "really definitely actual end" or something. I >>>>>> don't >>>>>> know about you, but I think it looks kind of messy. In particular, I >>>>>> don't >>>>>> feel like the name "actual end" reflects what it represents, >>>>>> especially when >>>>>> there is another end that is behind the "actual end". >>>>>> >>>>>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>>>>> 414 // Did a fast TLAB refill occur? >>>>>> 415 if (_slow_path_end != _end) { >>>>>> 416 // Fix up the actual end to be now the end of this TLAB. >>>>>> 417 _slow_path_end = _end; >>>>>> 418 _actual_end = _end; >>>>>> 419 } >>>>>> 420 >>>>>> 421 return _actual_end + alignment_reserve(); >>>>>> 422 } >>>>>> >>>>>> I really do not like making getters unexpectedly have these kind of >>>>>> side >>>>>> effects. It is not expected that when you ask for the "hard end", you >>>>>> implicitly update the "slow path end" and "actual end" to new values. >>>>>> >>>>>> As I said, a lot of this is due to the FastTlabRefill. If I make this >>>>> not supporting FastTlabRefill, this goes away. 
The reason the system >>>>> needs to update itself at the get is that you only know at that get if >>>>> things have shifted underneath the tlab slow path. I am not sure of >>>>> really better names (naming is hard!), perhaps we could do these >>>>> names: >>>>> >>>>> - current_tlab_end // Either the allocated tlab end or a >>>>> sampling point >>>>> - last_allocation_address // The end of the tlab allocation >>>>> - last_slowpath_allocated_end // In case a fast refill occurred the >>>>> end might have changed, this is to remember slow vs fast past refills >>>>> >>>>> the hard_end method can be renamed to something like: >>>>> tlab_end_pointer() // The end of the lab including a bit of >>>>> alignment reserved bytes >>>>> >>>> >>>> Those names sound better to me. Could you please provide a mapping from >>>> the old names to the new names so I understand which one is which please? >>>> >>>> This is my current guess of what you are proposing: >>>> >>>> end -> current_tlab_end >>>> actual_end -> last_allocation_address >>>> slow_path_end -> last_slowpath_allocated_end >>>> hard_end -> tlab_end_pointer >>>> >>>> >>> Yes that is correct, that was what I was proposing. >>> >>> >>>> I would prefer this naming: >>>> >>>> end -> slow_path_end // the end for taking a slow path; either due to >>>> sampling or refilling >>>> actual_end -> allocation_end // the end for allocations >>>> slow_path_end -> last_slow_path_end // last address for slow_path_end >>>> (as opposed to allocation_end) >>>> hard_end -> reserved_end // the end of the reserved space of the TLAB >>>> >>>> About setting things in the getter... that still seems like a very >>>> unpleasant thing to me. It would be better to inspect the call hierarchy >>>> and explicitly update the ends where they need updating, and assert in the >>>> getter that they are in sync, rather than implicitly setting various ends >>>> as a surprising side effect in a getter. It looks like the call hierarchy >>>> is very small. 
With my new naming convention, reserved_end() would >>>> presumably return _allocation_end + alignment_reserve(), and have an assert >>>> checking that _allocation_end == _last_slow_path_allocation_end, >>>> complaining that this invariant must hold, and that a caller to this >>>> function, such as make_parsable(), must first explicitly synchronize the >>>> ends as required, to honor that invariant. >>>> >>>> >>> >>> I've renamed the variables to how you preferred it except for the _end >>> one. I did: >>> current_end >>> last_allocation_address >>> tlab_end_ptr >>> >>> The reason is that the architecture dependent code use the thread.hpp >>> API and it already has tlab included into the name so it becomes >>> tlab_current_end (which is better that tlab_current_tlab_end in my opinion). >>> >>> I also moved the update into a separate method with a TODO that says to >>> remove it when FastTLABRefill is deprecated >>> >>> >>> This looks a lot better now. Thanks. >>> >>> Note that the following comment now needs updating accordingly in >>> threadLocalAllocBuffer.hpp: >>> >>> 41 // Heap sampling is performed via the end/actual_end fields. 42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. >>> >>> There might be other comments too, I have not looked in detail. >>> >> >> This was the only spot that still had an actual_end, I fixed it now. I'll >> do a sweep to double check other comments. >> >> >>> >>> >>> >>> >>> >>>> >>>> Not sure it's better but before updating the webrev, I wanted to try >>>>> to get input/consensus :) >>>>> >>>>> (Note hard_end was always further off than end). >>>>> >>>>> src/hotspot/share/prims/jvmti.xml: >>>>>> >>>>>> 10357 >>>>>> 10358 >>>>>> 10359 Can sample the heap. 
>>>>>> 10360 If this capability is enabled then the heap sampling >>>>>> methods >>>>>> can be called. >>>>>> 10361 >>>>>> 10362 >>>>>> >>>>>> Looks like this capability should not be "since 9" if it gets >>>>>> integrated >>>>>> now. >>>>>> >>>>> Updated now to 11, crossing my fingers :) >>>>> >>>>> >>>>> src/hotspot/share/runtime/heapMonitoring.cpp: >>>>>> >>>>>> 448 if (is_alive->do_object_b(value)) { >>>>>> 449 // Update the oop to point to the new object if it is >>>>>> still >>>>>> alive. >>>>>> 450 f->do_oop(&(trace.obj)); >>>>>> 451 >>>>>> 452 // Copy the old trace, if it is still live. >>>>>> 453 _allocated_traces->at_put(curr_pos++, trace); >>>>>> 454 >>>>>> 455 // Store the live trace in a cache, to be served up on >>>>>> /heapz. >>>>>> 456 _traces_on_last_full_gc->append(trace); >>>>>> 457 >>>>>> 458 count++; >>>>>> 459 } else { >>>>>> 460 // If the old trace is no longer live, add it to the >>>>>> list of >>>>>> 461 // recently collected garbage. >>>>>> 462 store_garbage_trace(trace); >>>>>> 463 } >>>>>> >>>>>> In the case where the oop was not live, I would like it to be >>>>>> explicitly >>>>>> cleared. >>>>>> >>>>> Done I think how you wanted it. Let me know because I'm not familiar >>>>> with the RootAccess API. I'm unclear if I'm doing this right or not so >>>>> reviews of these parts are highly appreciated. Robbin had talked of >>>>> perhaps later pushing this all into a OopStorage, should I do this now >>>>> do you think? Or can that wait a second webrev later down the road? >>>>> >>>> >>>> I think using handles can and should be done later. You can use the >>>> Access API now. >>>> I noticed that you are missing an #include "oops/access.inline.hpp" in >>>> your heapMonitoring.cpp file. >>>> >>>> >>> The missing header is there for me so I don't know, I made sure it is >>> present in the latest webrev. Sorry about that. >>> >>> >>> >>>> + Did I clear it the way you wanted me to or were you thinking of >>>>> something else? 
>>>>> >>>> >>>> That is precisely how I wanted it to be cleared. Thanks. >>>> >>>> + Final question here, seems like if I were to want to not do the >>>>> f->do_oop directly on the trace.obj, I'd need to do something like: >>>>> >>>>> f->do_oop(&value); >>>>> ... >>>>> trace->store_oop(value); >>>>> >>>>> to update the oop internally. Is that right/is that one of the >>>>> advantages of going to the Oopstorage sooner than later? >>>>> >>>> >>>> I think you really want to do the do_oop on the root directly. Is there >>>> a particular reason why you would not want to do that? >>>> Otherwise, yes - the benefit with using the handle approach is that you >>>> do not need to call do_oop explicitly in your code. >>>> >>>> >>> There is no reason except that now we have a load_oop and a >>> get_oop_addr, I was not sure what you would think of that. >>> >>> >>> That's fine. >>> >>> >>> >>>> >>>>> Also I see a lot of concurrent-looking use of the following field: >>>>>> 267 volatile bool _initialized; >>>>>> >>>>>> Please note that the "volatile" qualifier does not help with >>>>>> reordering >>>>>> here. Reordering between volatile and non-volatile fields is >>>>>> completely free >>>>>> for both compiler and hardware, except for windows with MSVC, where >>>>>> volatile >>>>>> semantics is defined to use acquire/release semantics, and the >>>>>> hardware is >>>>>> TSO. But for the general case, I would expect this field to be stored >>>>>> with >>>>>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>>>>> Otherwise it is not thread safe. >>>>>> >>>>> Because everything is behind a mutex, I wasn't really worried about >>>>> this. I have a test that has multiple threads trying to hit this >>>>> corner case and it passes. >>>>> >>>>> However, to be paranoid, I updated it to using the OrderAccess API >>>>> now, thanks! Let me know what you think there too! 
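Editorial sketch (not taken from the webrev): the release/acquire pairing described in the comment on _initialized can be illustrated with standard C++ atomics. This assumes std::atomic as a stand-in for HotSpot's OrderAccess::release_store and OrderAccess::load_acquire; SampleStorage and its _max_entries field are hypothetical names invented for the illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the storage discussed in the thread; not HotSpot code.
class SampleStorage {
 public:
  // Writer: fill in the data first, then publish the flag with a release store.
  void initialize(int max_entries) {
    assert(max_entries > 0);
    _max_entries = max_entries;                           // plain write
    _initialized.store(true, std::memory_order_release);  // publish
  }

  // Reader: the acquire load pairs with the release store above, so a reader
  // that observes true also observes the earlier write to _max_entries.
  bool initialized() const {
    return _initialized.load(std::memory_order_acquire);
  }

  // Returns -1 while the storage has not been published yet.
  int max_entries() const {
    return initialized() ? _max_entries : -1;
  }

 private:
  int _max_entries = 0;
  std::atomic<bool> _initialized{false};
};
```

A plain volatile bool would give neither of these ordering guarantees in portable C++, which is the point of the review comment. If every access really happens under a mutex, the atomic is unnecessary and an ownership assertion is the clearer choice.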
>>>>> >>>> >>>> If it is indeed always supposed to be read and written under a mutex, >>>> then I would strongly prefer to have it accessed as a normal non-volatile >>>> member, and have an assertion that given lock is held or we are in a >>>> safepoint, as we do in many other places. Something like this: >>>> >>>> assert(HeapMonitorStorage_lock->owned_by_self() || >>>> (SafepointSynchronize::is_at_safepoint() && >>>> Thread::current()->is_VM_thread()), "this should not be accessed >>>> concurrently"); >>>> >>>> It would be confusing to people reading the code if there are uses of >>>> OrderAccess that are actually always protected under a mutex. >>>> >>>> >>> Thank you for the exact example to be put in the code! I put it around >>> each access/assignment of the _initialized method and found one case where >>> yes you can touch it and not have the lock. It actually is "ok" because you >>> don't act on the storage until later and only when you really want to >>> modify the storage (see the object_alloc_do_sample method which calls the >>> add_trace method). >>> >>> But, because of this, I'm going to put the OrderAccess here, I'll do >>> some performance numbers later and if there are issues, I might add a >>> "unsafe" read and a "safe" one to make it explicit to the reader. But I >>> don't think it will come to that. >>> >>> >>> Okay. This double return in heapMonitoring.cpp looks wrong: >>> >>> 283 bool initialized() { >>> 284 return OrderAccess::load_acquire(&_initialized) != 0; >>> 285 return _initialized; >>> 286 } >>> >>> Since you said object_alloc_do_sample() is the only place where you do >>> not hold the mutex while reading initialized(), I had a closer look at >>> that. It looks like in its current shape, the lack of a mutex may lead to a >>> memory leak. In particular, it first checks if (initialized()). Let's >>> assume this is now true. It then allocates a bunch of stuff, and checks if >>> the number of frames were over 0. 
If they were, it calls >>> StackTraceStorage::storage()->add_trace() seemingly hoping that after >>> grabbing the lock in there, initialized() will still return true. But it >>> could now return false and skip doing anything, in which case the allocated >>> stuff will never be freed. >>> >> >> I fixed this now by making add_trace return a boolean and checking for >> that. It will be in the next webrev. Thanks, the truth is that in our >> implementation the system is always on or off, so this never really occurs >> :). In this version though, that is not true and it's important to handle >> so thanks again! >> >> >> >>> >>> So the analysis seems to be that _initialized is only used outside of >>> the mutex in once instance, where it is used to perform double-checked >>> locking, that actually causes a memory leak. >>> >>> I am not proposing how to fix that, just raising the issue. If you still >>> want to perform this double-checked locking somehow, then the use of >>> acquire/release still seems odd. Because the memory ordering restrictions >>> of it never comes into play in this particular case. If it ever did, then >>> the use of destroy_stuff(); release_store(_initialized, 0) would be broken >>> anyway as that would imply that whatever concurrent reader there ever was >>> would after reading _initialized with load_acquire() could *never* read the >>> data that is concurrently destroyed anyway. I would be biased to think that >>> RawAccess::load/store looks like a more appropriate solution, >>> given that the memory leak issue is resolved. I do not know how painful it >>> would be to not perform this double-checked locking. >>> >> >> So I agree with this entirely. I looked also a bit more and the >> difference and code really stems from our internal version. In this version >> however, there are actually a lot of things going on that I did not go >> entirely through in my head but this comment made me ponder a bit more on >> it. 
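Editorial sketch (not the code in the webrev): one way to read the add_trace fix mentioned above is that the initialized check happens only under the lock, and the caller keeps ownership of whatever it allocated unless add_trace reports success. Trace, TraceStorage and record_sample are hypothetical names, and std::mutex stands in for HeapMonitorStorage_lock.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical trace record; the real StackTraceData holds more than frames.
struct Trace {
  std::vector<int> frames;
};

class TraceStorage {
 public:
  void set_initialized(bool value) {
    std::lock_guard<std::mutex> guard(_lock);
    _initialized = value;
  }

  // Takes ownership of the trace only on success. The initialized check is
  // made while holding the lock, so no unlocked pre-check is needed.
  bool add_trace(std::unique_ptr<Trace>& trace) {
    assert(trace != nullptr);
    std::lock_guard<std::mutex> guard(_lock);
    if (!_initialized) {
      return false;  // caller still owns the trace
    }
    _traces.push_back(std::move(trace));
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(_lock);
    return _traces.size();
  }

 private:
  std::mutex _lock;
  bool _initialized = false;
  std::vector<std::unique_ptr<Trace>> _traces;
};

// Caller side: if add_trace() declines, the unique_ptr frees the allocation
// when it goes out of scope, so nothing leaks.
bool record_sample(TraceStorage& storage, std::vector<int> frames) {
  if (frames.empty()) {
    return false;
  }
  std::unique_ptr<Trace> trace(new Trace());
  trace->frames = std::move(frames);
  return storage.add_trace(trace);
}
```

The ownership transfer is what removes the leak: a failed add_trace no longer strands the allocation, whether or not sampling was switched off between the sample and the store.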
>> >> Since every object_alloc_do_sample is protected by a check to >> HeapMonitoring::enabled(), there is only a small chance that the call is >> happening when things have been disabled. So there is no real need to do a >> first check on the initialized, it is a rare occurence that a call happens >> to object_alloc_do_sample and the initialized of the storage returns false. >> >> (By the way, even if you did call object_alloc_do_sample without looking >> at HeapMonitoring::enabled(), that would be ok too. You would gather the >> stacktrace and get nowhere at the add_trace call, which would return false; >> so though not optimal performance wise, nothing would break). >> >> Furthermore, the add_trace is really the moment of no return and we have >> the mutex lock and then the initialized check. So, in the end, I did two >> things: I removed that first check and then I removed the OrderAccess for >> the storage initialized. I think now I have a better grasp and >> understanding why it was done in our code and why it is not needed here. >> Thanks for pointing it out :). This now still passes my JTREG tests, >> especially the threaded one. >> >> >> >> >> >>> >>> >>> >>> >>> >>>> As a kind of meta comment, I wonder if it would make sense to add >>>>>> sampling >>>>>> for non-TLAB allocations. Seems like if someone is rapidly allocating >>>>>> a >>>>>> whole bunch of 1 MB objects that never fit in a TLAB, I might still be >>>>>> interested in seeing that in my traces, and not get surprised that the >>>>>> allocation rate is very high yet not showing up in any profiles. >>>>>> >>>>>> That is handled by the handle_sample where you wanted me to put a >>>>> UseTlab because you hit that case if the allocation is too big. >>>>> >>>> >>>> I see. It was not obvious to me that non-TLAB sampling is done in the >>>> TLAB class. That seems like an abstraction crime. >>>> What I wanted in my previous comment was that we do not call into the >>>> TLAB when we are not using TLABs. 
If there is sampling logic in the TLAB >>>> that is used for something else than TLABs, then it seems like that logic >>>> simply does not belong inside of the TLAB. It should be moved out of the >>>> TLAB, and instead have the TLAB call this common abstraction that makes >>>> sense. >>>> >>>> >>> So in the incremental version: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is >>> still a "crime". The reason is that the system has to have the >>> bytes_until_sample on a per-thread level and it made "sense" to have it >>> with the TLAB implementation. Also, I was not sure how people felt about >>> adding something to the thread instance instead. >>> >>> Do you think it fits better at the Thread level? I can see how difficult >>> it is to make it happen there and add some logic there. Let me know what >>> you think. >>> >>> >>> We have an unfortunate situation where everyone that has some fields >>> that are thread local tend to dump them right into Thread, making the size >>> and complexity of Thread grow as it becomes tightly coupled with various >>> unrelated subsystems. It would be desirable to have a separate class for >>> this instead that encapsulates the sampling logic. That class could >>> possibly reside in Thread though as a value object of Thread. >>> >> >> I imagined that would be the case but was not sure. I will look at the >> example that Robbin is talking about (ThreadSMR) and will see how to >> refactor my code to use that. >> >> Thanks again for your help, >> Jc >> >> >>> >>> >>> >>> >>> >>>> Hope I have answered your questions and that my feedback makes sense to >>>> you. >>>> >>>> >>> You have and thank you for them, I think we are getting to a cleaner >>> implementation and things are getting better and more readable :) >>> >>> >>> Yes it is getting better. >>> >>> Thanks, >>> /Erik >>> >>> >>> Thanks for your help! 
>>> Jc >>> >>> >>> >>>> Thanks, >>>> /Erik >>>> >>>> >>>> I double checked by changing the test >>>>> >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >>>>> >>>>> to use a smaller Tlab (2048) and made the object bigger and it goes >>>>> through that and passes. >>>>> >>>>> Thanks again for your review and I look forward to your pointers for >>>>> the questions I now have raised! >>>>> Jc >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>>> /Erik >>>>>> >>>>>> >>>>>> On 2018-01-26 06:45, JC Beyler wrote: >>>>>> >>>>>>> Thanks Robbin for the reviews :) >>>>>>> >>>>>>> The new full webrev is here: >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>>>>> The incremental webrev is here: >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>>>>> >>>>>>> I inlined my answers: >>>>>>> >>>>>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn < >>>>>>> robbin.ehn at oracle.com> wrote: >>>>>>> >>>>>>>> Hi JC, great to see another revision! >>>>>>>> >>>>>>>> #### >>>>>>>> heapMonitoring.cpp >>>>>>>> >>>>>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>>>>> When StackTraceData is moved from _allocated_traces: >>>>>>>> L452 store_garbage_trace(trace); >>>>>>>> it contains a dead oop. >>>>>>>> _allocated_traces could instead be a tupel of oop and >>>>>>>> StackTraceData thus >>>>>>>> dead oops are not kept. >>>>>>>> >>>>>>> Done I used inheritance to make the copier work regardless but the >>>>>>> idea is the same. >>>>>>> >>>>>>> You should use the new Access API for loading the oop, something like >>>>>>>> this: >>>>>>>> RootAccess::load(...) >>>>>>>> I don't think you need to use Access API for clearing the oop, but >>>>>>>> it >>>>>>>> would >>>>>>>> look nicer. 
And you shouldn't probably be using: >>>>>>>> Universe::heap()->is_in_reserved(value) >>>>>>>> >>>>>>> I am unfamiliar with this but I think I did do it like you wanted me >>>>>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>>>>> oop exactly, is there somewhere that does that, which I can use to do >>>>>>> the same? >>>>>>> >>>>>>> I removed the is_in_reserved, this came from our internal version, I >>>>>>> don't know why it was there but my tests work without so I removed it >>>>>>> :) >>>>>>> >>>>>>> >>>>>>> The lock: >>>>>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>>>>> Is not needed as far as I can see. >>>>>>>> weak_oops_do is called in a safepoint, no TLAB allocation can >>>>>>>> happen and >>>>>>>> JVMTI thread can't access these data-structures. Is there something >>>>>>>> more >>>>>>>> to >>>>>>>> this lock that I'm missing? >>>>>>>> >>>>>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>>>>> ones), it can get to the point of trying to copying the >>>>>>> _allocated_traces. I imagine it is possible that this is happening >>>>>>> during a GC or that it can be started and a GC happens afterwards. >>>>>>> Therefore, it seems to me that you want this protected, no? >>>>>>> >>>>>>> >>>>>>> #### >>>>>>>> You have 6 files without any changes in them (any more): >>>>>>>> g1CollectedHeap.cpp >>>>>>>> psMarkSweep.cpp >>>>>>>> psParallelCompact.cpp >>>>>>>> genCollectedHeap.cpp >>>>>>>> referenceProcessor.cpp >>>>>>>> thread.hpp >>>>>>>> >>>>>>>> Done. >>>>>>> >>>>>>> #### >>>>>>>> I have not looked closely, but is it possible to hide heap sampling >>>>>>>> in >>>>>>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>>>>>> >>>>>>>> I am imagining that you are saying to move the code that does the >>>>>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>>>>> etc.) into the AllocTracer code itself? 
>>>>>>> I think that is right and I'll look if that is possible and prepare a webrev to show what would be >>>>>>> needed to make that happen. >>>>>>> >>>>>>> #### >>>>>>>> Minor nit, when declaring pointer there is a little mix of having the >>>>>>>> pointer adjacent by type name and data name. (Most hotspot code is by type name) >>>>>>>> E.g. >>>>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>>>>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>>>>>> (not just this file) >>>>>>>> >>>>>>>> Done! >>>>>>> >>>>>>> #### >>>>>>>> HeapMonitorThreadOnOffTest.java:77 >>>>>>>> I would make g_tmp volatile, otherwise the assignment in loop may >>>>>>>> theoretically be skipped. >>>>>>>> >>>>>>>> Also done! >>>>>>> >>>>>>> Thanks again! >>>>>>> Jc >>>>>>> >>>>>> >>>>>> >>>> >>> >>> >> > From chris.plummer at oracle.com Mon Mar 19 22:50:28 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 19 Mar 2018 15:50:28 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> Message-ID: I looked into modifying OutputAnalyzer (actually ended up being ProcessTools that needed all the changes) to be more flexible so it could support LingeredApp. The problem I ran into is that ProcessTools is all static, but I needed to create and return a context.
It ended up being too much disruption, so I instead have the ProcessTools.getOutput() code as part of LingeredApp. Another thing I discovered is that you can use OutputAnalyzer with already generated output, so this option is still available to users of LingeredApp. You just need to do something like: OutputAnalyzer out = new OutputAnalyzer(lingeredApp.getOutput().getStdout(), lingeredApp.getOutput().getStderr()); I didn't change any test to take advantage of this, but it's there if someone wants it. I've included another webrev below (completely different from the original). In the end, all LingeredApp stdout and stderr are dumped after the app exits. The old way of storing away the stdout using an InputGobbler is gone. Since getAppOutput() depended on this, and the new way of saving stdout saves it as one big string rather than a List of lines, getAppOutput() needed some changes to convert to the List form. http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03 thanks, Chris On 3/19/18 9:39 AM, Chris Plummer wrote: > Hi David, > > Just to clarify one point, most of the tests that use OutputAnalyzer > do not display process output unless there is an error. So part of the > decision here with LingeredApp is when to display the output. > Currently the stdout is captured, but not displayed, unless the test > does the work to display it, which none do. Currently stderr goes to > the console. Note that some negative tests actually cause some > expected stderr output, although the tests don't check for it. > > One thought I just had is to create an async option for OutputAnalyzer > so it doesn't block until the process exits. Basically that means > splitting ProcessTools.getOutput() so it doesn't block. What I > currently have is essentially doing that. It copies > ProcessTools.getOutput(), splitting it into two parts.
But all this > logic is in LingeredApp, and of course doesn't have any of the output > error checking support that OutputAnalyzer, which might be useful for > LingeredApp. For example, the negative tests only test that launching > the app failed. They could be improved by checking for specific error > output. > > Chris > > On 3/17/18 12:11 AM, David Holmes wrote: >> I'm afraid I'm losing track of this change. >> >> The key thing is that we should not have a test that launches any >> other process for which we can not see the output of that process. >> >> David >> >> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> Thank you for taking care about this issue! >>>> >>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>> Hi, >>>>> >>>>> I've resolved the issues I had before with not seeing all the >>>>> stderr output when I tried to capture it. What I'd like to do now >>>>> is have us decide how the output should be handled from the >>>>> perspective a LingeredApp user (driver app). Currently all >>>>> LingeredApp stdout is captured and gets be returned the the driver >>>>> app by calling app.getAppOutput(). It does not appear in the .jtr >>>>> file, but the test would have the option of dumping it there it it >>>>> cared to. Only one test uses app.getAppOutput(). Currently all the >>>>> LingeredApp stderr is redirected to the console, so it does not >>>>> appear in the .jtr file. >>>> >>>> Just a general comment to make sure I understand it and ensure we >>>> are in sync. >>>> It seems much more safe to always have both stdout and stderr >>>> outputs present in the .jtr automatically file independently of of >>>> what the test does. >>>> >>>> >>>>> So how do we want this changed? Some possibilities are: >>>>> >>>>> (1) capture stderr just like stdout currently is, and leave is up >>>>> the the driver app to decide if it wants to display it (after the >>>>> app terminates). 
>>>> >>>> It does not look good to me (see above) but maybe I'm missing >>>> something important here. >>>> >>>>> (2) capture stderr just like stdout currently is, but have >>>>> LingeredApp automatically send captured output to driver app's >>>>> stdout and stderr (after the app terminates). >>>> >>>> The stdout and std err will be separated in this case, right? >>>> Do you have a webrev for this? >>> I currently have it working like this, although I need to fix >>> LingeredApp.getAppOutput(). I had to make it return a single String >>> instead of a List of Strings, so this breaks the one test that uses >>> this API. It's easily fixed. Just haven't gotten around to it yet. >>>> >>>> >>>>> (3) send the LingeredApp's stdout and stderr to the driver app's >>>>> stdout as it is being captured (this was the original fix Igor >>>>> suggested and the webrev supported). A minor alternative to this >>>>> is to keep the two streams separated instead of sending both to >>>>> stdout. >>>>> >>>>> Let me know what you think. I'm inclined to go with 2, especially >>>>> since normally there is little to no output from the LingeredApp. >>>> >>>> The choice (2) looks good enough. >>>> Not sure it is that important to have output from stdout and stderr >>>> sync'ed >>>> but is is important to have the stderr present in the .jtr >>>> automatically. >>>> >>>> The choice (3) looks even better if it is going to work well. >>> This is basically what the original webrev did. It sent >>> LingeredApp's stderr and stdout to the the driver apps stdout. It's >>> a 1 word change to make it send stderr to stderr. I think it has a >>> bug though that did not manifest itself. It seems the new copy() >>> code that is capturing stdout would be contending with the existing >>> InputGlobbler code that is doing the same. I would need to fix this >>> to make sure LingeredApp.getAppOutput() still returns all the apps >>> stdout output. >>> >>> Chris >>>> Not sure, it is really necessary. 
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> >>>>> BTW, here's the CR and original webrev for reference: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>> >>> > From alexey.menkov at oracle.com Tue Mar 20 00:28:09 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 19 Mar 2018 17:28:09 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> Message-ID: <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> Hi guys, please re-review the fix. A regression test has been added for the issue. webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ --alex On 03/13/2018 16:14, Alex Menkov wrote: > Hi all, > > Please review a small fix for > https://bugs.openjdk.java.net/browse/JDK-8049695 > webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ > > The root cause of the issue is that jdb hangs as a result of a buffer overflow. > > At the beginning of shmemBase.c: > > #define MAX_IPC_PREFIX 50 /* user-specified or generated name for */ > /* shared memory seg and prefix for other IPC */ > #define MAX_IPC_SUFFIX 25 /* suffix to shmem name for other IPC names */ > #define MAX_IPC_NAME (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) > > The buffer (char prefix[]) in function createStream is used to generate the base > name for mutexes/events, so MAX_IPC_PREFIX is not big enough.
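To make the sizing problem concrete, here is a minimal C sketch. It is not the actual shmemBase.c code: it reuses the three limits quoted above, `make_ipc_name` is a hypothetical helper, and it assumes the user-supplied address is at most MAX_IPC_PREFIX - 1 characters. The only point is that a name built as address plus suffix needs a MAX_IPC_NAME-sized buffer rather than a MAX_IPC_PREFIX-sized one.

```c
#include <stdio.h>
#include <string.h>

/* Limits as quoted in the thread (top of shmemBase.c). */
#define MAX_IPC_PREFIX 50   /* user-specified or generated base name, incl. NUL */
#define MAX_IPC_SUFFIX 25   /* suffix appended for mutex/event names, incl. NUL */
#define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)

/*
 * Hypothetical helper (an illustration, not part of shmemBase.c):
 * builds "<address><suffix>" into out, which must hold MAX_IPC_NAME bytes.
 * Returns 0 on success, -1 if either input exceeds its limit.
 */
static int make_ipc_name(char out[MAX_IPC_NAME],
                         const char *address, const char *suffix)
{
    int n;

    if (strlen(address) >= MAX_IPC_PREFIX || strlen(suffix) >= MAX_IPC_SUFFIX) {
        return -1;               /* reject instead of overflowing */
    }
    /* The combined name can be longer than MAX_IPC_PREFIX, so the
     * destination is sized with MAX_IPC_NAME. */
    n = snprintf(out, MAX_IPC_NAME, "%s%s", address, suffix);
    return (n >= 0 && n < MAX_IPC_NAME) ? 0 : -1;
}
```

With a 49-character address and a suffix such as ".hasSpace123", the resulting name is 61 characters long, which would not fit into a 50-byte buffer.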
> > --alex From chris.plummer at oracle.com Tue Mar 20 00:48:15 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 19 Mar 2018 17:48:15 -0700 Subject: RFR(S): 8195109: ServiceUtil::visible_oop is not needed anymore Message-ID: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8195109 http://cr.openjdk.java.net/~cjplummer/8195109/webrev.00/index.html The assert I added to make sure this is safe has been in place in jdk/jdk for almost 3 weeks with no issues (longer in jdk/hs). The webrev is missing the copyright update for threadService.hpp. I fixed it after noticing that. Testing is in progress. Running hs tiers 1, 2, and 3, and jdk tiers 1 and 2. Also making sure all serviceability tests are run. thanks, Chris From david.holmes at oracle.com Tue Mar 20 01:10:48 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 20 Mar 2018 11:10:48 +1000 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> Message-ID: <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> Hi Alex, On 20/03/2018 10:28 AM, Alex Menkov wrote: > Hi guys, > > please re-review the fix. I still have an unanswered question about where the max of 49 is enforced. I see it for the "address" but not names in general. ?? > Reg.test is added the the issue. I don't quite follow the test. I see you try to set the name with a value that is too long, and if that doesn't cause an overflow and we don't crash that is good. But I'd expect you to read back the name and check it matches the truncated name with 49 characters. 
Thanks, David > webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ > > --alex > > On 03/13/2018 16:14, Alex Menkov wrote: >> Hi all, >> >> Please review a small fix for >> https://bugs.openjdk.java.net/browse/JDK-8049695 >> webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >> >> Root cause of the issue is jbd hungs as a result of the buffer overflow. >> >> In the beginning of the shmemBase.c: >> >> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name for */ >> ???????????????????????????? /* shared memory seg and prefix for other >> IPC */ >> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other IPC >> names */ >> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >> >> buffer (char prefix[]) in function createStream is used to generate >> base name for mutex/events, so MAX_IPC_PREFIX is not big enough. >> >> --alex From harsha.wardhana.b at oracle.com Tue Mar 20 05:21:08 2018 From: harsha.wardhana.b at oracle.com (Harsha Wardhana B) Date: Tue, 20 Mar 2018 10:51:08 +0530 Subject: RFR : JDK-8196744 : JMX: Not enough JDP packets received before timeout In-Reply-To: References: <85f7cf50-2791-6a91-20e8-81a98c6239ab@oracle.com> <59657246-1590-45c4-9bb2-a650fcd8cdd4@default> Message-ID: Thanks David. -Harsha On Tuesday 20 March 2018 02:13 AM, David Holmes wrote: > Hi Harsha, > > Given the negative nature of the test this approach seems quite > reasonable. > > Thanks, > David > >> Harsha Wardhana B >> Ping! Can I have one more review for the below fix? >> >> Thanks >> >> Harsha >> >> On Monday 26 February 2018 10:42 AM, Harsha Wardhana B wrote: >> >>> Hello All, >> >>> >> >>> Requesting for review from one more reviewer. >> >>> >> >>> Thanks >> >>> Harsha >> >>> >> >>> On Wednesday 21 February 2018 10:01 AM, Chris Plummer wrote: >> >>>> Hi Harsha, >> >>>> >> >>>> Not a review, but just a request that you add the explanation of the >> >>>> problem to the CR so we have a record of it. 
Also, the copyright >> >>>> needs to be updated. >> >>>> >> >>>> thanks, >> >>>> >> >>>> Chris >> >>>> >> >>>> On 2/20/18 3:30 AM, Harsha Wardhana B wrote: >> >>>>> Hi All, >> >>>>> >> >>>>> Please find the fix below for the Jdp test-case. >> >>>>> >> >>>>> issue: https://bugs.openjdk.java.net/browse/JDK-8196028 >> >>>>> webrev : http://cr.openjdk.java.net/~hb/8196028/webrev.00/ >> >>>>> >> >>>>> Fix details : The test was receiving JDP packets from other VM and >> >>>>> hence the multi-cast socket was not timing-out. The default timeout >> >>>>> handler was causing test to fail. Added a shutdown method that >> >>>>> passes the test in case of timeout. >> >>>>> >> >>>>> Thanks >> >>>>> Harsha >> >>>> >> >>>> >> >>> >> From stefan.karlsson at oracle.com Tue Mar 20 08:05:48 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 20 Mar 2018 09:05:48 +0100 Subject: RFR(S): 8195109: ServiceUtil::visible_oop is not needed anymore In-Reply-To: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> References: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> Message-ID: Looks good to me. StefanK On 2018-03-20 01:48, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8195109 > http://cr.openjdk.java.net/~cjplummer/8195109/webrev.00/index.html > > The assert I added to make sure this is safe has been in place in > jdk/jdk for almost 3 weeks with no issues (longer in jdk/hs). > > The webrev is missing the copyright update for threadService.hpp. I > fixed it after noticing that. > > Testing is in progress. Running hs tiers 1, 2, and 3, and jdk tiers 1 > and 2. Also making sure all serviceability tests are run. 
> > thanks, > > Chris From magnus.ihse.bursie at oracle.com Tue Mar 20 11:24:39 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 20 Mar 2018 12:24:39 +0100 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: References: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> <5AAB52D1.6040707@oracle.com> Message-ID: <2d26e786-5394-7bdf-a0f5-95ea7b36247b@oracle.com> On 2018-03-16 19:12, Magnus Ihse Bursie wrote: > Hi Sundar, > > I almost missed your mail, since you removed both me and build-dev from the cc list... > >> On 16 March 2018 at 06:14, Sundararajan Athijegannathan wrote: >> >> Renaming sawindbg as saproc sounds odd. For Linux, Solaris/Unix, we either use /proc & libproc, so calling saproc for those makes sense. But Windows? We have a separate debugger class to load platform specific native library. What is the reason for uniform naming? > This is the only library in the JDK that has a different name on different platforms. This clashes with the design of the build system, and requires a clunky workaround. For the upcoming changes in the build system, this goes from an annoyance to a blocker. > > No other components have their names based on the OS functionality they use, even if they use vastly different APIs on different platforms; rather they are named after the services they provide to the JDK. > > My assumption was that "saproc" meant "serviceability agent process handling", and that this was a reasonable name for all platforms. Also, the source code for all platforms resides in the "libsaproc" directory, which is consistent with the JDK standard for matching source code to native library. > > But if you believe this is an inappropriate name, let's work together to find a name that works for all platforms. This of course will lead to new names for the current libsaproc.* libraries, and the source code directories.
Hi Sundar, Are you okay with this rationale for changing to saproc, or do you want to discuss this further? /Magnus > > /Magnus > >> -Sundar >> >> On 16/03/18, 12:19 AM, Magnus Ihse Bursie wrote: >>> >>> On 2018-03-15 19:39, Erik Joelsson wrote: >>>> Looks good to me. >>>> >>>> The removed source files, are those some kind of tests? >>> I don't really know; they have been excluded from the build for all time. My guess is that the Bsd* stuff is, like in the case of the sound libraries, bsd-based stuff that arrived with the mac port (but disabled). The test.c is a trivial main() method which looks more like a left-over adhoc testing from the initial developer. Perhaps someone wants to turn it into a proper test, but it seems like it's not much even to start with. (And hopefully we have much better real test coverage of this now.) >>> >>> /Magnus >>>> /Erik >>>> >>>> >>>> On 2018-03-15 11:22, Magnus Ihse Bursie wrote: >>>>> The saproc library has historically been built in quite odd ways on almost all platforms. When the old build system was converted, this was not changed. >>>>> >>>>> However, now the time has come to streamline this and build this library just as any other. >>>>> >>>>> The most visible change, perhaps, is that the library is now named saproc on all platforms, even Windows. Other changes include: >>>>> * Don't set flags that is already set by the default flags. >>>>> * Don't set flags that do not have anny effect. >>>>> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's perfectly okay to have it. >>>>> * Don't set CXX linker on solaris -- this was not needed so no reason to do it. >>>>> * Cleaned up some old hooks for closed code that is no longer needed. >>>>> >>>>> I have verified this using COMPARE_BUILD. This shows only the expected differences: >>>>> * On all platforms: class file changes for WindbgDebuggerLocal.java. >>>>> * On solaris: some minor symbol differences, since the linker now uses C framework functions instead of C++. 
(And with symbol changes always comes disasm changes.) >>>>> * On linux: a binary difference for libsaproc.so, but no size/symbol/deps/disasm change. >>>>> * On macosx: no changes at all. >>>>> * On windows: sawindbg.dll is renamed to saproc.dll. When I made a manual comparison between the two files, I found no significant differences. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >>>>> WebRev: http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >>>>> >>>>> /Magnus >>>>> From sundararajan.athijegannathan at oracle.com Tue Mar 20 15:09:47 2018 From: sundararajan.athijegannathan at oracle.com (Sundararajan Athijegannathan) Date: Tue, 20 Mar 2018 20:39:47 +0530 Subject: RFR: JDK-8199682 Clean up building the saproc library In-Reply-To: <2d26e786-5394-7bdf-a0f5-95ea7b36247b@oracle.com> References: <39fbd44c-030d-582f-f4f9-9bc9f3e53c42@oracle.com> <5AAB52D1.6040707@oracle.com> <2d26e786-5394-7bdf-a0f5-95ea7b36247b@oracle.com> Message-ID: <5AB1243B.1020807@oracle.com> Hi, Sounds good - so long as we don't have scripts that depend on the old name. Or if those could be fixed... -Sundar On 20/03/18, 4:54 PM, Magnus Ihse Bursie wrote: > > On 2018-03-16 19:12, Magnus Ihse Bursie wrote: >> Hi Sundar, >> >> I almost missed your mail, since you removed both me and build-dev >> from the cc list... >> >>> 16 mars 2018 kl. 06:14 skrev Sundararajan Athijegannathan >>> : >>> >>> Renaming sawindbg as saproc sounds odd. For Linux, Solaris/Unix, we >>> either use /proc & libproc, so calling saproc for those makes sense. >>> But Windows? We have a separate debugger class to load platform >>> specific native library. What is the reason for uniform naming? >> This is the only library in the JDK that has a different name on >> different platform. This clashes with the design of the build system, >> and requires a clunky workaround. For the upcoming changes in the >> build system, this goes from an annoyance to a blocker. 
>> >> No other components have their names based on the OS functionality >> they use, even if they use vastly different APIs on different >> platforms; rather they are named after the services they provide to >> the JDK. >> >> My assumption was that ?saproc? meant ?serviceability agent process >> handling?, and that this was a reasonable name for all platforms. >> Also, the source code for all platforms reside in the ?libsaproc? >> directory, which is consistent with the JDK standard for matching >> source code to native library. >> >> But if you believe this is an inappropriate name, let?s work together >> to find a name that works for all platforms. This of course will lead >> to new names for the current libsaproc.* libraries, and the source >> code directories. > Hi Sundar, > > Are you okay with this rationale for changing to saproc, or do you > want to discuss this further? > > /Magnus >> >> /Magnus >> >>> -Sundar >>> >>> On 16/03/18, 12:19 AM, Magnus Ihse Bursie wrote: >>>> >>>> On 2018-03-15 19:39, Erik Joelsson wrote: >>>>> Looks good to me. >>>>> >>>>> The removed source files, are those some kind of tests? >>>> I don't really know; they have been excluded from the build for all >>>> time. My guess is that the Bsd* stuff is, like in the case of the >>>> sound libraries, bsd-based stuff that arrived with the mac port >>>> (but disabled). The test.c is a trivial main() method which looks >>>> more like a left-over adhoc testing from the initial developer. >>>> Perhaps someone wants to turn it into a proper test, but it seems >>>> like it's not much even to start with. (And hopefully we have much >>>> better real test coverage of this now.) >>>> >>>> /Magnus >>>>> /Erik >>>>> >>>>> >>>>> On 2018-03-15 11:22, Magnus Ihse Bursie wrote: >>>>>> The saproc library has historically been built in quite odd ways >>>>>> on almost all platforms. When the old build system was converted, >>>>>> this was not changed. 
>>>>>> >>>>>> However, now the time has come to streamline this and build this >>>>>> library just as any other. >>>>>> >>>>>> The most visible change, perhaps, is that the library is now >>>>>> named saproc on all platforms, even Windows. Other changes include: >>>>>> * Don't set flags that is already set by the default flags. >>>>>> * Don't set flags that do not have anny effect. >>>>>> * Don't subst away the WIN32_LEAN_AND_MEAN definition, it's >>>>>> perfectly okay to have it. >>>>>> * Don't set CXX linker on solaris -- this was not needed so no >>>>>> reason to do it. >>>>>> * Cleaned up some old hooks for closed code that is no longer >>>>>> needed. >>>>>> >>>>>> I have verified this using COMPARE_BUILD. This shows only the >>>>>> expected differences: >>>>>> * On all platforms: class file changes for WindbgDebuggerLocal.java. >>>>>> * On solaris: some minor symbol differences, since the linker now >>>>>> uses C framework functions instead of C++. (And with symbol >>>>>> changes always comes disasm changes.) >>>>>> * On linux: a binary difference for libsaproc.so, but no >>>>>> size/symbol/deps/disasm change. >>>>>> * On macosx: no changes at all. >>>>>> * On windows: sawindbg.dll is renamed to saproc.dll. When I made >>>>>> a manual comparison between the two files, I found no significant >>>>>> differences. 
>>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199682 >>>>>> WebRev: >>>>>> http://cr.openjdk.java.net/~ihse/JDK-8199682-clean-up-saproc/webrev.01 >>>>>> >>>>>> >>>>>> /Magnus >>>>>> > From alexey.menkov at oracle.com Tue Mar 20 17:25:51 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 20 Mar 2018 10:25:51 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> Message-ID: <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> Hi David, On 03/19/2018 18:10, David Holmes wrote: > Hi Alex, > > On 20/03/2018 10:28 AM, Alex Menkov wrote: >> Hi guys, >> >> please re-review the fix. > > I still have an unanswered question about where the max of 49 is > enforced. I see it for the "address" but not names in general. ?? for shmem the "channel name" is the address (it's checked in createTransport/openTransport). Names for mutexes/events are generated by appending some strings to the adddress and length of the added parts are supposed to be less than MAX_IPC_SUFFIX (25 symbols): ".mutex" (+ up to 3 symbols) ".hasData" (+ up to 3 symbols) ".hasSpace" (+ up to 3 symbols) ".ctos" ".stoc" ".accept" (+ up to 3 symbols) ".attach" (+ up to 3 symbols) "." (pid is a DWORD) > >> Reg.test is added the the issue. > > I don't quite follow the test. I see you try to set the name with a > value that is too long, and if that doesn't cause an overflow and we > don't crash that is good. But I'd expect you to read back the name and > check it matches the truncated name with 49 characters. The test specifies the maximum length supported (49 symbols) (if longer name is specified, "address strings longer than 50 characters are invalid" error reported). 
As far as I see there is no way to read back the name used to create the transport. --alex > > Thanks, > David > >> webrev: >> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >> >> --alex >> >> On 03/13/2018 16:14, Alex Menkov wrote: >>> Hi all, >>> >>> Please review a small fix for >>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>> webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>> >>> The root cause of the issue is that jdb hangs as a result of a buffer overflow. >>> >>> At the beginning of shmemBase.c: >>> >>> #define MAX_IPC_PREFIX 50 /* user-specified or generated name for */ >>> /* shared memory seg and prefix for other IPC */ >>> #define MAX_IPC_SUFFIX 25 /* suffix to shmem name for other IPC names */ >>> #define MAX_IPC_NAME (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>> >>> The buffer (char prefix[]) in function createStream is used to generate >>> the base name for mutexes/events, so MAX_IPC_PREFIX is not big enough. >>> >>> --alex From chris.plummer at oracle.com Tue Mar 20 19:39:47 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 20 Mar 2018 12:39:47 -0700 Subject: RFR(S): 8195109: ServiceUtil::visible_oop is not needed anymore In-Reply-To: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> References: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> Message-ID: <1f2774ce-b289-9ced-bc79-601e3d3e4bc3@oracle.com> Hi, New webrev: http://cr.openjdk.java.net/~cjplummer/8195109/webrev.01/index.html There was a build failure on solaris-sparc in threadSMR.cpp. References to the Copy class were producing "unresolved symbol" errors. threadSMR.cpp includes threadService.hpp, which no longer includes serviceUtil.hpp (because it was removed). It looks like serviceUtil.hpp indirectly included "utilities/copy.hpp", so now I include it directly in threadSMR.cpp.
The problem was only on solaris-sparc, so I assume on other platforms there was platform dependent code indirectly pulling in copy.hpp. In any case, it's now directly pulled in on all platforms. thanks, Chris On 3/19/18 5:48 PM, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8195109 > http://cr.openjdk.java.net/~cjplummer/8195109/webrev.00/index.html > > The assert I added to make sure this is safe has been in place in > jdk/jdk for almost 3 weeks with no issues (longer in jdk/hs). > > The webrev is missing the copyright update for threadService.hpp. I > fixed it after noticing that. > > Testing is in progress. Running hs tiers 1, 2, and 3, and jdk tiers 1 > and 2. Also making sure all serviceability tests are run. > > thanks, > > Chris From david.holmes at oracle.com Wed Mar 21 04:51:12 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 21 Mar 2018 14:51:12 +1000 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> Message-ID: <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Hi Alex, On 21/03/2018 3:25 AM, Alex Menkov wrote: > Hi David, > > On 03/19/2018 18:10, David Holmes wrote: >> Hi Alex, >> >> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>> Hi guys, >>> >>> please re-review the fix. >> >> I still have an unanswered question about where the max of 49 is >> enforced. I see it for the "address" but not names in general. ?? > > for shmem the "channel name" is the address (it's checked in > createTransport/openTransport). 
> Names for mutexes/events are generated by appending some strings to the > adddress and length of the added parts are supposed to be less than > MAX_IPC_SUFFIX (25 symbols): > ".mutex" (+ up to 3 symbols) > ".hasData" (+ up to 3 symbols) > ".hasSpace" (+ up to 3 symbols) > ".ctos" > ".stoc" > ".accept" (+ up to 3 symbols) > ".attach" (+ up to 3 symbols) > "." (pid is a DWORD) Okay so ... the code in shmemBase.c is very unclear as to which "names" can come in from an external source and which are only ever derived from other "names". If the "address" (which seems a very bad description in this case!) is the only external source for a name, and it is limited to a length of 49 then that is okay. >> >>> Reg.test is added the the issue. >> >> I don't quite follow the test. I see you try to set the name with a >> value that is too long, and if that doesn't cause an overflow and we >> don't crash that is good. But I'd expect you to read back the name and >> check it matches the truncated name with 49 characters. > > The test specifies the maximum length supported (49 symbols) > (if longer name is specified, "address strings longer than 50 characters > are invalid" error reported). I missed the substring that simply causes the name to be the maximum supported length. That would trigger the overflow and so suffices as a regression test for this fix. Is there another test that already passes a too-long name and verifies the error gets thrown? > As far as I see there is no way to read back the name used to create the > transport. Ok. 
Thanks, David ----- > --alex > >> >> Thanks, >> David >> >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>> >>> --alex >>> >>> On 03/13/2018 16:14, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review a small fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>> >>>> The root cause of the issue is that jdb hangs as a result of a buffer >>>> overflow. >>>> >>>> At the beginning of shmemBase.c: >>>> >>>> #define MAX_IPC_PREFIX 50 /* user-specified or generated name for */ >>>> /* shared memory seg and prefix for other IPC */ >>>> #define MAX_IPC_SUFFIX 25 /* suffix to shmem name for other IPC names */ >>>> #define MAX_IPC_NAME (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>> >>>> The buffer (char prefix[]) in function createStream is used to generate >>>> the base name for mutexes/events, so MAX_IPC_PREFIX is not big enough. >>>> >>>> --alex From christoph.langer at sap.com Wed Mar 21 08:10:42 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 21 Mar 2018 08:10:42 +0000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations Message-ID: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> Hi, may I please ask for reviews of the following small fix? Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 If one passes null arguments to the varargs of attach operations, they get swallowed on Solaris and the following arguments shift to lower positions. Other platform implementations handle this correctly, for instance Linux: http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178 Thanks Christoph
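The failure mode described above can be illustrated with a small C sketch. It is not the Solaris attach code; `pack_args` and its buffer layout are hypothetical and chosen only for illustration. It shows the general idea of writing an empty string for a null argument, so that later arguments keep their positions instead of shifting down.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the Solaris attach implementation: packs argc
 * arguments into buf as consecutive NUL-terminated strings. A NULL
 * argument is written as an empty string, so argument i always stays at
 * position i. Returns the number of bytes written, or 0 if buf is too
 * small (0 is also returned when argc is 0).
 */
static size_t pack_args(char *buf, size_t cap,
                        const char *const *args, size_t argc)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < argc; i++) {
        const char *s = (args[i] != NULL) ? args[i] : "";
        size_t len = strlen(s) + 1;      /* include the terminator */

        if (len > cap - used) {
            return 0;                    /* would not fit */
        }
        memcpy(buf + used, s, len);
        used += len;
    }
    return used;
}
```

Skipping the NULL entry instead of writing "" would place the third argument where the second one is expected, which is the shift reported in the bug.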
From david.holmes at oracle.com Wed Mar 21 09:19:58 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 21 Mar 2018 19:19:58 +1000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> Message-ID: <9decad59-366b-a5bc-607f-88625169468d@oracle.com> Hi Christoph, On 21/03/2018 6:10 PM, Langer, Christoph wrote: > Hi, > > may I please ask for reviews of the following small fix. > > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 > > If one passes null arguments to the varargs of attach operations, they > get swallowed on Solaris and following arguments will shift to lower > positions. > > Other platform implementations handle this correctly, for instance > linux: > http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178 Wouldn't it be simpler to just handle this at the Java level and substitute "" for null in the args array? We're only looking at a maximum of three possible entries. Thanks, David > Thanks > > Christoph > From stefan.karlsson at oracle.com Wed Mar 21 10:41:28 2018 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 21 Mar 2018 11:41:28 +0100 Subject: RFR(S): 8195109: ServiceUtil::visible_oop is not needed anymore In-Reply-To: <1f2774ce-b289-9ced-bc79-601e3d3e4bc3@oracle.com> References: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> <1f2774ce-b289-9ced-bc79-601e3d3e4bc3@oracle.com> Message-ID: Looks good. StefanK On 2018-03-20 20:39, Chris Plummer wrote: > Hi, > > New webrev: > > http://cr.openjdk.java.net/~cjplummer/8195109/webrev.01/index.html > > There was a build failure on solaris-sparc in threadSMR.cpp. References > to the Copy class were producing "unresolved symbol" errors.
> threadSMR.cpp includes threadService.hpp, which no longer includes > serviceUtil.hpp (because it was removed). It looks like serviceUtil.hpp > indirectly included "utilities/copy.hpp", so now I include it directly > in threadSMR.cpp. The problem was only on solaris-sparc, so I assume on > other platforms there was platform dependent code indirectly pulling in > copy.hpp. In any case, it's now directly pulled in on all platforms. > > thanks, > > Chris > > On 3/19/18 5:48 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8195109 >> http://cr.openjdk.java.net/~cjplummer/8195109/webrev.00/index.html >> >> The assert I added to make sure this is safe has been in place in >> jdk/jdk for almost 3 weeks with no issues (longer in jdk/hs). >> >> The webrev is missing the copyright update for threadService.hpp. I >> fixed it after noticing that. >> >> Testing is in progress. Running hs tiers 1, 2, and 3, and jdk tiers 1 >> and 2. Also making sure all serviceability tests are run. >> >> thanks, >> >> Chris > > From christoph.langer at sap.com Wed Mar 21 12:45:29 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 21 Mar 2018 12:45:29 +0000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <9decad59-366b-a5bc-607f-88625169468d@oracle.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> Message-ID: <94b75a874b5c4a3cb1cb109a51220c16@sap.com> Hi David, thanks for looking at this. I currently have no emotions whether to fix it in C or in Java - I'll check it out... Best regards Christoph > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Mittwoch, 21. 
März 2018 10:20 > To: Langer, Christoph ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of > attach operations > > Hi Christoph, > > On 21/03/2018 6:10 PM, Langer, Christoph wrote: > > Hi, > > > > may I please ask for reviews of the following small fix. > > > > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 > > > > If one passes null arguments to the varargs of attach operations, they > > get swallowed on Solaris and following arguments will shift to lower > > positions. > > > > Other platform implementations handle this correctly, for instance > > linux: > > http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178 > > Wouldn't it be simpler to just handle this at the Java level and > substitute "" for null in the args array? We're only looking at a > maximum of three possible entries. > > Thanks, > David > > > Thanks > > > > Christoph > > From daniel.daugherty at oracle.com Wed Mar 21 13:58:57 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 21 Mar 2018 09:58:57 -0400 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <94b75a874b5c4a3cb1cb109a51220c16@sap.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> Message-ID: <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> Hmmm... shouldn't the inconsistency in the Solaris backend also be addressed? Dan On 3/21/18 8:45 AM, Langer, Christoph wrote: > Hi David, > > thanks for looking at this. I currently have no emotions whether to fix it in C or in Java - I'll check it out... > > Best regards > Christoph > >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Mittwoch, 21.
M?rz 2018 10:20 >> To: Langer, Christoph ; serviceability- >> dev at openjdk.java.net >> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of >> attach operations >> >> Hi Christoph, >> >> On 21/03/2018 6:10 PM, Langer, Christoph wrote: >>> Hi, >>> >>> may I please ask for reviews of the following small fix. >>> >>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 >>> >>> If one passes null arguments to the varargs of attach operations, they >>> get swallowed on Solaris and following arguments will shift to lower >>> positions. >>> >>> Other platform implementations handle this correctly, for instance >>> linux: >>> >> http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/cl >> asses/sun/tools/attach/VirtualMachineImpl.java#l178 >> >> Wouldn't it be simpler to just handle this at the Java level and >> substitute "" for null in the args array? We're only looking at a >> maximum of three possible entries. >> >> Thanks, >> David >> >>> Thanks >>> >>> Christoph >>> From christoph.langer at sap.com Wed Mar 21 14:00:36 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 21 Mar 2018 14:00:36 +0000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> Message-ID: <693b2894c45244b1aaf914d6a00fc12b@sap.com> Hi Dan, that is, you mean the C-code? My original change? Best regards Christoph > -----Original Message----- > From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com] > Sent: Mittwoch, 21. 
März 2018 14:59 > To: Langer, Christoph ; David Holmes > ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of > attach operations > > Hmmm... shouldn't the inconsistency in the Solaris backend also be > addressed? > > Dan > > > On 3/21/18 8:45 AM, Langer, Christoph wrote: > > Hi David, > > > > thanks for looking at this. I currently have no emotions whether to fix it in C > or in Java - I'll check it out... > > > > Best regards > > Christoph > > > >> -----Original Message----- > >> From: David Holmes [mailto:david.holmes at oracle.com] > >> Sent: Mittwoch, 21. März 2018 10:20 > >> To: Langer, Christoph ; serviceability-dev at openjdk.java.net > >> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of > >> attach operations > >> > >> Hi Christoph, > >> > >> On 21/03/2018 6:10 PM, Langer, Christoph wrote: > >>> Hi, > >>> > >>> may I please ask for reviews of the following small fix. > >>> > >>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ > >>> > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 > >>> > >>> If one passes null arguments to the varargs of attach operations, they > >>> get swallowed on Solaris and following arguments will shift to lower > >>> positions. > >>> > >>> Other platform implementations handle this correctly, for instance > >>> linux: > >>> > >> > http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178 > >> > >> Wouldn't it be simpler to just handle this at the Java level and > >> substitute "" for null in the args array? We're only looking at a > >> maximum of three possible entries. > >> > >> Thanks, > >> David > >> > >>> Thanks > >>> > >>> Christoph > >>> From daniel.daugherty at oracle.com Wed Mar 21 14:23:16 2018 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Wed, 21 Mar 2018 10:23:16 -0400 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <693b2894c45244b1aaf914d6a00fc12b@sap.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> Message-ID: <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> On 3/21/18 10:00 AM, Langer, Christoph wrote: > Hi Dan, > > that is, you mean the C-code? My original change? Hmmm... I think I confused myself before I drank enough coffee... Looking again... Dan > > Best regards > Christoph > >> -----Original Message----- >> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com] >> Sent: Mittwoch, 21. März 2018 14:59 >> To: Langer, Christoph ; David Holmes >> ; serviceability-dev at openjdk.java.net >> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of >> attach operations >> >> Hmmm... shouldn't the inconsistency in the Solaris backend also be >> addressed? >> >> Dan >> >> >> On 3/21/18 8:45 AM, Langer, Christoph wrote: >>> Hi David, >>> >>> thanks for looking at this. I currently have no emotions whether to fix it in C >> or in Java - I'll check it out... >>> Best regards >>> Christoph >>> >>>> -----Original Message----- >>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>> Sent: Mittwoch, 21. März 2018 10:20 >>>> To: Langer, Christoph ; serviceability-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of >>>> attach operations >>>> >>>> Hi Christoph, >>>> >>>> On 21/03/2018 6:10 PM, Langer, Christoph wrote: >>>>> Hi, >>>>> >>>>> may I please ask for reviews of the following small fix.
>>>>> >>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 >>>>> >>>>> If one passes null arguments to the varargs of attach operations, they >>>>> get swallowed on Solaris and following arguments will shift to lower >>>>> positions. >>>>> >>>>> Other platform implementations handle this correctly, for instance >>>>> linux: >>>>> >> http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/cl >>>> asses/sun/tools/attach/VirtualMachineImpl.java#l178 >>>> >>>> Wouldn't it be simpler to just handle this at the Java level and >>>> substitute "" for null in the args array? We're only looking at a >>>> maximum of three possible entries. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks >>>>> >>>>> Christoph >>>>> From daniel.daugherty at oracle.com Wed Mar 21 14:51:58 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 21 Mar 2018 10:51:58 -0400 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> Message-ID: <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> On 3/21/18 10:23 AM, Daniel D. Daugherty wrote: > On 3/21/18 10:00 AM, Langer, Christoph wrote: >> Hi Dan, >> >> that is, you mean the C-code? My original change? > > Hmmm... I think I confused myself before I drank enough coffee... > Looking again... Okay I definitely confused myself... and I clearly don't remember the attach-on-demand code as well as I used to... sigh... 
I think you should keep your original fix since it now properly handles null arguments at the same attach-on-demand layer as the Linux code that you quoted. Handling this in args array processing would also be possible as David suggests, but it would bother me that Linux and Solaris lower attach-on-demand layers would have different behaviors. Hope this is more clear. Dan > > Dan > > >> >> Best regards >> Christoph >> >>> -----Original Message----- >>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com] >>> Sent: Mittwoch, 21. März 2018 14:59 >>> To: Langer, Christoph ; David Holmes >>> ; serviceability-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null >>> arguments of >>> attach operations >>> >>> Hmmm... shouldn't the inconsistency in the Solaris backend also be >>> addressed? >>> >>> Dan >>> >>> >>> On 3/21/18 8:45 AM, Langer, Christoph wrote: >>>> Hi David, >>>> >>>> thanks for looking at this. I currently have no emotions whether to >>>> fix it in C >>> or in Java - I'll check it out... >>>> Best regards >>>> Christoph >>>> >>>>> -----Original Message----- >>>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>>> Sent: Mittwoch, 21. März 2018 10:20 >>>>> To: Langer, Christoph ; serviceability-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null >>>>> arguments of >>>>> attach operations >>>>> >>>>> Hi Christoph, >>>>> >>>>> On 21/03/2018 6:10 PM, Langer, Christoph wrote: >>>>>> Hi, >>>>>> >>>>>> may I please ask for reviews of the following small fix. >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 >>>>>> >>>>>> If one passes null arguments to the varargs of attach operations, >>>>>> they >>>>>> get swallowed on Solaris and following arguments will shift to lower >>>>>> positions.
>>>>>> >>>>>> Other platform implementations handle this correctly, for instance >>>>>> linux: >>>>>> >>> http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178 >>>>> >>>>> Wouldn't it be simpler to just handle this at the Java level and >>>>> substitute "" for null in the args array? We're only looking at a >>>>> maximum of three possible entries. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks >>>>>> >>>>>> Christoph >>>>>> > > From daniel.daugherty at oracle.com Wed Mar 21 14:59:37 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 21 Mar 2018 10:59:37 -0400 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> Message-ID: <2bb110c3-a2b2-d940-ebfe-4ebcb84922f5@oracle.com> Forgot to make it clear that I did review the change... > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ src/jdk.attach/solaris/native/libattach/VirtualMachineImpl.c No comments. Thumbs up. Dan On 3/21/18 10:51 AM, Daniel D. Daugherty wrote: > On 3/21/18 10:23 AM, Daniel D. Daugherty wrote: >> On 3/21/18 10:00 AM, Langer, Christoph wrote: >>> Hi Dan, >>> >>> that is, you mean the C-code? My original change? >> >> Hmmm... I think I confused myself before I drank enough coffee... >> Looking again... > > Okay I definitely confused myself... and I clearly don't remember > the attach-on-demand code as well as I used to... sigh...
> > I think you should keep your original fix since it now properly > handles null arguments at the same attach-on-demand layer as the > Linux code that you quoted. > > Handling this in args array processing would also be possible > as David suggests, but it would bother me that Linux and Solaris > lower attach-on-demand layers would have different behaviors. > > Hope this is more clear. > > Dan > > >> >> Dan >> >> >>> >>> Best regards >>> Christoph >>> >>>> -----Original Message----- >>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com] >>>> Sent: Mittwoch, 21. März 2018 14:59 >>>> To: Langer, Christoph ; David Holmes >>>> ; serviceability-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null >>>> arguments of >>>> attach operations >>>> >>>> Hmmm... shouldn't the inconsistency in the Solaris backend also be >>>> addressed? >>>> >>>> Dan >>>> >>>> >>>> On 3/21/18 8:45 AM, Langer, Christoph wrote: >>>>> Hi David, >>>>> >>>>> thanks for looking at this. I currently have no emotions whether >>>>> to fix it in C >>>> or in Java - I'll check it out... >>>>> Best regards >>>>> Christoph >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>>>> Sent: Mittwoch, 21. März 2018 10:20 >>>>>> To: Langer, Christoph ; serviceability-dev at openjdk.java.net >>>>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null >>>>>> arguments of >>>>>> attach operations >>>>>> >>>>>> Hi Christoph, >>>>>> >>>>>> On 21/03/2018 6:10 PM, Langer, Christoph wrote: >>>>>>> Hi, >>>>>>> >>>>>>> may I please ask for reviews of the following small fix.
>>>>>>> >>>>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/ >>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924 >>>>>>> >>>>>>> If one passes null arguments to the varargs of attach >>>>>>> operations, they >>>>>>> get swallowed on Solaris and following arguments will shift to >>>>>>> lower >>>>>>> positions. >>>>>>> >>>>>>> Other platform implementations handle this correctly, for instance >>>>>>> linux: >>>>>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/cl >>>> >>>>>> asses/sun/tools/attach/VirtualMachineImpl.java#l178 >>>>>> >>>>>> Wouldn't it be simpler to just handle this at the Java level and >>>>>> substitute "" for null in the args array? We're only looking at a >>>>>> maximum of three possible entries. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Christoph >>>>>>> >> >> > > From chris.plummer at oracle.com Wed Mar 21 16:31:49 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 21 Mar 2018 09:31:49 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> Message-ID: <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com> Ping. I still need a couple of reviews for this. thanks, Chris On 3/19/18 3:50 PM, Chris Plummer wrote: > I looked into modifying OutputAnalyzer (actually ended up being > ProcessTools that needed all the changes) to be more flexible so it > could support LingeredApp. The problem I ran into is that ProcessTools > is all static, but I needed to create and return a context. 
It ended > up being too much disruption, so I instead have the > ProcessTools.getOutput() code as part of LingeredApp. > > Another thing I discovered is that you can use OutputAnalyzer with > already generated output, so this option is still available to users > of LingeredApp. You just need to do something like: > > OutputAnalyzer out = new > OutputAnalyzer(lingeredApp.getOutput().getStdout(), > lingeredApp.getOutput().getStderr()); > > I didn't change any test to take advantage of this, but it's there if > someone wants it. > > I've included another webrev below (completely different from the > original). In the end, all LingeredApp stdout and stderr is dumped > after the app exits. The old way of storing away the stdout using an > InputGobbler is gone. Since getAppOutput() depended on this, and the > new way of saving stdout saves it as one big string rather than a List > of lines, getAppOutput() needed some changes to convert to the List form. > > http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03 > > thanks, > > Chris > > On 3/19/18 9:39 AM, Chris Plummer wrote: >> Hi David, >> >> Just to clarify one point, most of the tests that use OutputAnalyzer >> do not display process output unless there is an error. So part of >> the decision here with LingeredApp is when to display the output. >> Currently the stdout is captured, but not displayed, unless the tests >> does the work to display it, which none do. Currently stderr goes to >> the console. Note that some negative tests actually cause some >> expected stderr output, although the tests don't check for it. >> >> One thought I just had is to create an async option for >> OutputAnalyzer so it doesn't block until the process exits. Basically >> that means splitting ProcessTools.getOutput() so it doesn't block. >> What I currently have is essentially doing that. It copies >> ProcessTools.getOutput(), splitting it into two parts.
But all this >> logic is in LingeredApp, and of course doesn't have any of the output >> error checking support that OutputAnalyzer, which might be useful for >> LingeredApp. For example, the negative tests only test that launching >> the app failed. They could be improved by checking for specific error >> output. >> >> Chris >> >> On 3/17/18 12:11 AM, David Holmes wrote: >>> I'm afraid I'm losing track of this change. >>> >>> The key thing is that we should not have a test that launches any >>> other process for which we can not see the output of that process. >>> >>> David >>> >>> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chris, >>>>> >>>>> Thank you for taking care about this issue! >>>>> >>>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>>> Hi, >>>>>> >>>>>> I've resolved the issues I had before with not seeing all the >>>>>> stderr output when I tried to capture it. What I'd like to do now >>>>>> is have us decide how the output should be handled from the >>>>>> perspective a LingeredApp user (driver app). Currently all >>>>>> LingeredApp stdout is captured and gets be returned the the >>>>>> driver app by calling app.getAppOutput(). It does not appear in >>>>>> the .jtr file, but the test would have the option of dumping it >>>>>> there it it cared to. Only one test uses app.getAppOutput(). >>>>>> Currently all the LingeredApp stderr is redirected to the >>>>>> console, so it does not appear in the .jtr file. >>>>> >>>>> Just a general comment to make sure I understand it and ensure we >>>>> are in sync. >>>>> It seems much more safe to always have both stdout and stderr >>>>> outputs present in the .jtr automatically file independently of of >>>>> what the test does. >>>>> >>>>> >>>>>> So how do we want this changed? 
Some possibilities are: >>>>>> >>>>>> (1) capture stderr just like stdout currently is, and leave is up >>>>>> the the driver app to decide if it wants to display it (after the >>>>>> app terminates). >>>>> >>>>> It does not look good to me (see above) but maybe I'm missing >>>>> something important here. >>>>> >>>>>> (2) capture stderr just like stdout currently is, but have >>>>>> LingeredApp automatically send captured output to driver app's >>>>>> stdout and stderr (after the app terminates). >>>>> >>>>> The stdout and std err will be separated in this case, right? >>>>> Do you have a webrev for this? >>>> I currently have it working like this, although I need to fix >>>> LingeredApp.getAppOutput(). I had to make it return a single String >>>> instead of a List of Strings, so this breaks the one test that uses >>>> this API. It's easily fixed. Just haven't gotten around to it yet. >>>>> >>>>> >>>>>> (3) send the LingeredApp's stdout and stderr to the driver app's >>>>>> stdout as it is being captured (this was the original fix Igor >>>>>> suggested and the webrev supported). A minor alternative to this >>>>>> is to keep the two streams separated instead of sending both to >>>>>> stdout. >>>>>> >>>>>> Let me know what you think. I'm inclined to go with 2, especially >>>>>> since normally there is little to no output from the LingeredApp. >>>>> >>>>> The choice (2) looks good enough. >>>>> Not sure it is that important to have output from stdout and >>>>> stderr sync'ed >>>>> but is is important to have the stderr present in the .jtr >>>>> automatically. >>>>> >>>>> The choice (3) looks even better if it is going to work well. >>>> This is basically what the original webrev did. It sent >>>> LingeredApp's stderr and stdout to the the driver apps stdout. It's >>>> a 1 word change to make it send stderr to stderr. I think it has a >>>> bug though that did not manifest itself. 
It seems the new copy() >>>> code that is capturing stdout would be contending with the existing >>>> InputGlobbler code that is doing the same. I would need to fix this >>>> to make sure LingeredApp.getAppOutput() still returns all the apps >>>> stdout output. >>>> >>>> Chris >>>>> Not sure, it is really necessary. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>>> >>>>>> BTW, here's the CR and original webrev for reference: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>> >>>> >> > > From alexey.menkov at oracle.com Wed Mar 21 16:41:21 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 21 Mar 2018 09:41:21 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Message-ID: Hi David, On 03/20/2018 21:51, David Holmes wrote: > Hi Alex, > > On 21/03/2018 3:25 AM, Alex Menkov wrote: >> Hi David, >> >> On 03/19/2018 18:10, David Holmes wrote: >>> Hi Alex, >>> >>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>> Hi guys, >>>> >>>> please re-review the fix. >>> >>> I still have an unanswered question about where the max of 49 is >>> enforced. I see it for the "address" but not names in general. ?? >> >> for shmem the "channel name" is the address (it's checked in >> createTransport/openTransport). 
>> Names for mutexes/events are generated by appending some strings to >> the address, and the lengths of the added parts are supposed to be less >> than MAX_IPC_SUFFIX (25 symbols): >> ".mutex" (+ up to 3 symbols) >> ".hasData" (+ up to 3 symbols) >> ".hasSpace" (+ up to 3 symbols) >> ".ctos" >> ".stoc" >> ".accept" (+ up to 3 symbols) >> ".attach" (+ up to 3 symbols) >> "." (pid is a DWORD) > > Okay so ... the code in shmemBase.c is very unclear as to which "names" > can come in from an external source and which are only ever derived from > other "names". If the "address" (which seems a very bad description in > this case!) is the only external source for a name, and it is limited to > a length of 49 then that is okay. Yes, the "address" is the only external arg, all other names are constructed from it. I believe it's "address" because it comes from the "address" parameter: -Xrunjdwp:transport=dt_shmem,address= > >>> >>>> A regression test has been added for the issue. >>> >>> I don't quite follow the test. I see you try to set the name with a >>> value that is too long, and if that doesn't cause an overflow and we >>> don't crash that is good. But I'd expect you to read back the name >>> and check it matches the truncated name with 49 characters. >> >> The test specifies the maximum length supported (49 symbols) >> (if a longer name is specified, an "address strings longer than 50 >> characters are invalid" error is reported). > > I missed the substring that simply causes the name to be the maximum > supported length. That would trigger the overflow and so suffices as a > regression test for this fix. > > Is there another test that already passes a too-long name and verifies > the error gets thrown? Do you mean name >= 50 symbols? No, there is no such test. I don't think it makes much sense (testing an arbitrary implementation-specific restriction), but I can add the case to the test. --alex > >> As far as I see there is no way to read back the name used to create >> the transport. > > Ok.
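The length arithmetic discussed above can be illustrated with a small sketch. This is not the code from shmemBase.c: the three constants are the ones quoted in this thread, while the helper `make_ipc_name` and its return convention are hypothetical and exist only to show why a buffer holding address plus suffix must be sized MAX_IPC_NAME rather than MAX_IPC_PREFIX.

```c
#include <stdio.h>
#include <string.h>

#define MAX_IPC_PREFIX 50   /* user-specified address, including the NUL */
#define MAX_IPC_SUFFIX 25   /* room for a suffix such as ".mutex" + digits */
#define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)

/*
 * Hypothetical helper (an assumption, not an existing function):
 * writes "<address><suffix>" into out, which must have room for
 * MAX_IPC_NAME bytes. Returns 0 on success, -1 if either part is
 * too long for its budget.
 */
static int make_ipc_name(char *out, const char *address, const char *suffix)
{
    int written;

    if (strlen(address) >= MAX_IPC_PREFIX) {
        return -1;              /* address may be at most 49 characters */
    }
    if (strlen(suffix) >= MAX_IPC_SUFFIX) {
        return -1;              /* suffix may be at most 24 characters */
    }
    /* The combined name needs MAX_IPC_NAME bytes, not MAX_IPC_PREFIX. */
    written = snprintf(out, MAX_IPC_NAME, "%s%s", address, suffix);
    return (written < 0 || written >= MAX_IPC_NAME) ? -1 : 0;
}
```

With a 49-character address and a suffix like ".mutex123", the result is 58 characters long, which fits in MAX_IPC_NAME but would overflow a buffer of only MAX_IPC_PREFIX bytes.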
> > Thanks, > David > ----- > >> --alex >> >>> >>> Thanks, >>> David >>> >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>> >>>> --alex >>>> >>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review a small fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>> >>>>> The root cause of the issue is that jdb hangs as a result of a buffer >>>>> overflow. >>>>> >>>>> At the beginning of shmemBase.c: >>>>> >>>>> #define MAX_IPC_PREFIX 50 /* user-specified or generated name for */ >>>>> /* shared memory seg and prefix for >>>>> other IPC */ >>>>> #define MAX_IPC_SUFFIX 25 /* suffix to shmem name for other IPC >>>>> names */ >>>>> #define MAX_IPC_NAME (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>> >>>>> The buffer (char prefix[]) in the function createStream is used to generate >>>>> the base name for mutexes/events, so MAX_IPC_PREFIX is not big enough. >>>>> >>>>> --alex From serguei.spitsyn at oracle.com Wed Mar 21 18:08:31 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 21 Mar 2018 11:08:31 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com> References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com> Message-ID: Hi Chris, It looks good to me. It is a little bit more complicated than one would expect but reasonable.
Thanks, Serguei On 3/21/18 09:31, Chris Plummer wrote: > Ping. I still need a couple of reviews for this. > > thanks, > > Chris > > On 3/19/18 3:50 PM, Chris Plummer wrote: >> I looked into modifying OutputAnalyzer (actually ended up being >> ProcessTools that needed all the changes) to be more flexible so it >> could support LingeredApp. The problem I ran into is that >> ProcessTools is all static, but I needed to create and return a >> context. It ended up being too much disruption, so I instead have the >> ProcessTools.getOutput() code as part of LingeredApp. >> >> Another thing I discovered is that you can use OutputAnalyzer with >> already generated output, so this option is still available to users >> of LingeredApp. You just need to do something like: >> >> OutputAnalyzer out = new >> OutputAnalyzer(lingeredApp.getOutput().getStdout(), >> lingeredApp.getOutput().getStderr()); >> >> I didn't change any test to take advantage of this, but it's there if >> someone wants it. >> >> I've included another webrev below (completely different from the >> original). In the end, all LingeredApp stdout and stderr is dumped >> after the app exits. The old way of storing away the stdout using an >> InputGobbler is gone. Since getAppOutput() depended on this, and the >> new way of saving stdout saves it as one big string rather than a >> List of lines, getAppOutput() needed some changes to convert to the >> List form. >> >> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03 >> >> thanks, >> >> Chris >> >> On 3/19/18 9:39 AM, Chris Plummer wrote: >>> Hi David, >>> >>> Just to clarify one point, most of the tests that use OutputAnalyzer >>> do not display process output unless there is an error. So part of >>> the decision here with LingeredApp is when to display the output. >>> Currently the stdout is captured, but not displayed, unless the >>> tests does the work to display it, which none do. Currently stderr >>> goes to the console.
Note that some negative tests actually cause >>> some expected stderr output, although the tests don't check for it. >>> >>> One thought I just had is to create an async option for >>> OutputAnalyzer so it doesn't block until the process exits. >>> Basically that means splitting ProcessTools.getOutput() so it >>> doesn't block. What I currently have is essentially doing that. It >>> copies ProcessTools.getOutput(), splitting it into two parts. But >>> all this logic is in LingeredApp, and of course doesn't have any of >>> the output error checking support that OutputAnalyzer, which might >>> be useful for LingeredApp. For example, the negative tests only test >>> that launching the app failed. They could be improved by checking >>> for specific error output. >>> >>> Chris >>> >>> On 3/17/18 12:11 AM, David Holmes wrote: >>>> I'm afraid I'm losing track of this change. >>>> >>>> The key thing is that we should not have a test that launches any >>>> other process for which we can not see the output of that process. >>>> >>>> David >>>> >>>> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>>>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thank you for taking care about this issue! >>>>>> >>>>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I've resolved the issues I had before with not seeing all the >>>>>>> stderr output when I tried to capture it. What I'd like to do >>>>>>> now is have us decide how the output should be handled from the >>>>>>> perspective a LingeredApp user (driver app). Currently all >>>>>>> LingeredApp stdout is captured and gets be returned the the >>>>>>> driver app by calling app.getAppOutput(). It does not appear in >>>>>>> the .jtr file, but the test would have the option of dumping it >>>>>>> there it it cared to. Only one test uses app.getAppOutput(). >>>>>>> Currently all the LingeredApp stderr is redirected to the >>>>>>> console, so it does not appear in the .jtr file. 
>>>>>> >>>>>> Just a general comment to make sure I understand it and ensure we >>>>>> are in sync. >>>>>> It seems much more safe to always have both stdout and stderr >>>>>> outputs present in the .jtr automatically file independently of >>>>>> of what the test does. >>>>>> >>>>>> >>>>>>> So how do we want this changed? Some possibilities are: >>>>>>> >>>>>>> (1) capture stderr just like stdout currently is, and leave is >>>>>>> up the the driver app to decide if it wants to display it (after >>>>>>> the app terminates). >>>>>> >>>>>> It does not look good to me (see above) but maybe I'm missing >>>>>> something important here. >>>>>> >>>>>>> (2) capture stderr just like stdout currently is, but have >>>>>>> LingeredApp automatically send captured output to driver app's >>>>>>> stdout and stderr (after the app terminates). >>>>>> >>>>>> The stdout and std err will be separated in this case, right? >>>>>> Do you have a webrev for this? >>>>> I currently have it working like this, although I need to fix >>>>> LingeredApp.getAppOutput(). I had to make it return a single >>>>> String instead of a List of Strings, so this breaks the one test >>>>> that uses this API. It's easily fixed. Just haven't gotten around >>>>> to it yet. >>>>>> >>>>>> >>>>>>> (3) send the LingeredApp's stdout and stderr to the driver app's >>>>>>> stdout as it is being captured (this was the original fix Igor >>>>>>> suggested and the webrev supported). A minor alternative to this >>>>>>> is to keep the two streams separated instead of sending both to >>>>>>> stdout. >>>>>>> >>>>>>> Let me know what you think. I'm inclined to go with 2, >>>>>>> especially since normally there is little to no output from the >>>>>>> LingeredApp. >>>>>> >>>>>> The choice (2) looks good enough. >>>>>> Not sure it is that important to have output from stdout and >>>>>> stderr sync'ed >>>>>> but is is important to have the stderr present in the .jtr >>>>>> automatically. 
>>>>>> >>>>>> The choice (3) looks even better if it is going to work well. >>>>> This is basically what the original webrev did. It sent >>>>> LingeredApp's stderr and stdout to the the driver apps stdout. >>>>> It's a 1 word change to make it send stderr to stderr. I think it >>>>> has a bug though that did not manifest itself. It seems the new >>>>> copy() code that is capturing stdout would be contending with the >>>>> existing InputGlobbler code that is doing the same. I would need >>>>> to fix this to make sure LingeredApp.getAppOutput() still returns >>>>> all the apps stdout output. >>>>> >>>>> Chris >>>>>> Not sure, it is really necessary. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>>> >>>>>>> BTW, here's the CR and original webrev for reference: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>> >>>>> >>> >> >> > > From Stephen.Fitch at oracle.com Wed Mar 21 18:14:02 2018 From: Stephen.Fitch at oracle.com (Stephen Fitch) Date: Wed, 21 Mar 2018 11:14:02 -0700 Subject: HotSpot Serviceability Agent (SA) Survey Message-ID: Hi, The HotSpot Serviceability Agent (SA) is a set of APIs and tools for debugging HotSpot Virtual Machine and has been a part of the JVM/JDK for a long time, however we don't have a lot of data about how it is used in practice, especially outside of Oracle. Therefore, we have created an initial survey to gather more information and help us evaluate and understand how others are using it. If you have used, or have (support) processes that utilize the Serviceability Agent or related APIs, then we would definitely appreciate if you would complete this survey: https://www.surveymonkey.com/r/CF3MYDL We are specifically interested in your use-cases and how SA is effective for you in resolving JVM issues. The survey will remain open through March 31st. 
The results of the survey will be made public after the survey closes.

Regards,
Stephen
Java Platform Group - JVM - Sustaining Engineering

From chris.plummer at oracle.com Wed Mar 21 18:24:55 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 21 Mar 2018 11:24:55 -0700
Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr
In-Reply-To:
References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com>
 <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com>
 <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com>
 <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com>
 <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com>
 <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com>
 <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com>
 <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com>
 <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com>
Message-ID:

Yeah, this was all new to me. Before this I didn't know anything about
jtreg IO other than the use of OutputAnalyzer for capture and verification.

Thanks for reviewing.

Chris

On 3/21/18 11:08 AM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> It looks good to me.
> It is a little bit more complicated than one would expect but reasonable.
>
> Thanks,
> Serguei
>
>
> On 3/21/18 09:31, Chris Plummer wrote:
>> Ping. I still need a couple of reviews for this.
>>
>> thanks,
>>
>> Chris
>>
>> On 3/19/18 3:50 PM, Chris Plummer wrote:
>>> I looked into modifying OutputAnalyzer (actually ended up being
>>> ProcessTools that needed all the changes) to be more flexible so it
>>> could support LingeredApp. The problem I ran into is that
>>> ProcessTools is all static, but I needed to create and return a
>>> context. It ended up being too much disruption, so I instead have
>>> the ProcessTools.getOutput() code as part of LingeredApp.
>>>
>>> Another thing I discovered is that you can use OutputAnalyzer with
>>> already generated output, so this option is still available to users
>>> of LingeredApp. You just need to do something like:
>>>
>>>     OutputAnalyzer out = new OutputAnalyzer(lingeredApp.getOutput().getStdout(),
>>>             lingeredApp.getOutput().getStderr());
>>>
>>> I didn't change any test to take advantage of this, but it's there
>>> if someone wants it.
>>>
>>> I've included another webrev below (completely different from the
>>> original). In the end, all LingeredApp stdout and stderr is dumped
>>> after the app exits. The old way of storing away the stdout using an
>>> InputGobbler is gone. Since getAppOutput() depended on this, and the
>>> new way of saving stdout saves it as one big string rather than a
>>> List of lines, getAppOutput() needed some changes to convert to the
>>> List form.
>>>
>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/19/18 9:39 AM, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> Just to clarify one point, most of the tests that use
>>>> OutputAnalyzer do not display process output unless there is an
>>>> error. So part of the decision here with LingeredApp is when to
>>>> display the output. Currently the stdout is captured, but not
>>>> displayed, unless the test does the work to display it, which none
>>>> do. Currently stderr goes to the console. Note that some negative
>>>> tests actually cause some expected stderr output, although the
>>>> tests don't check for it.
>>>>
>>>> One thought I just had is to create an async option for
>>>> OutputAnalyzer so it doesn't block until the process exits.
>>>> Basically that means splitting ProcessTools.getOutput() so it
>>>> doesn't block. What I currently have is essentially doing that. It
>>>> copies ProcessTools.getOutput(), splitting it into two parts. But
>>>> all this logic is in LingeredApp, and of course doesn't have any of
>>>> the output error checking support that OutputAnalyzer has, which might
>>>> be useful for LingeredApp. For example, the negative tests only
>>>> test that launching the app failed.
They could be improved by >>>> checking for specific error output. >>>> >>>> Chris >>>> >>>> On 3/17/18 12:11 AM, David Holmes wrote: >>>>> I'm afraid I'm losing track of this change. >>>>> >>>>> The key thing is that we should not have a test that launches any >>>>> other process for which we can not see the output of that process. >>>>> >>>>> David >>>>> >>>>> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>>>>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thank you for taking care about this issue! >>>>>>> >>>>>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I've resolved the issues I had before with not seeing all the >>>>>>>> stderr output when I tried to capture it. What I'd like to do >>>>>>>> now is have us decide how the output should be handled from the >>>>>>>> perspective a LingeredApp user (driver app). Currently all >>>>>>>> LingeredApp stdout is captured and gets be returned the the >>>>>>>> driver app by calling app.getAppOutput(). It does not appear in >>>>>>>> the .jtr file, but the test would have the option of dumping it >>>>>>>> there it it cared to. Only one test uses app.getAppOutput(). >>>>>>>> Currently all the LingeredApp stderr is redirected to the >>>>>>>> console, so it does not appear in the .jtr file. >>>>>>> >>>>>>> Just a general comment to make sure I understand it and ensure >>>>>>> we are in sync. >>>>>>> It seems much more safe to always have both stdout and stderr >>>>>>> outputs present in the .jtr automatically file independently of >>>>>>> of what the test does. >>>>>>> >>>>>>> >>>>>>>> So how do we want this changed? Some possibilities are: >>>>>>>> >>>>>>>> (1) capture stderr just like stdout currently is, and leave is >>>>>>>> up the the driver app to decide if it wants to display it >>>>>>>> (after the app terminates). >>>>>>> >>>>>>> It does not look good to me (see above) but maybe I'm missing >>>>>>> something important here. 
>>>>>>> >>>>>>>> (2) capture stderr just like stdout currently is, but have >>>>>>>> LingeredApp automatically send captured output to driver app's >>>>>>>> stdout and stderr (after the app terminates). >>>>>>> >>>>>>> The stdout and std err will be separated in this case, right? >>>>>>> Do you have a webrev for this? >>>>>> I currently have it working like this, although I need to fix >>>>>> LingeredApp.getAppOutput(). I had to make it return a single >>>>>> String instead of a List of Strings, so this breaks the one test >>>>>> that uses this API. It's easily fixed. Just haven't gotten around >>>>>> to it yet. >>>>>>> >>>>>>> >>>>>>>> (3) send the LingeredApp's stdout and stderr to the driver >>>>>>>> app's stdout as it is being captured (this was the original fix >>>>>>>> Igor suggested and the webrev supported). A minor alternative >>>>>>>> to this is to keep the two streams separated instead of sending >>>>>>>> both to stdout. >>>>>>>> >>>>>>>> Let me know what you think. I'm inclined to go with 2, >>>>>>>> especially since normally there is little to no output from the >>>>>>>> LingeredApp. >>>>>>> >>>>>>> The choice (2) looks good enough. >>>>>>> Not sure it is that important to have output from stdout and >>>>>>> stderr sync'ed >>>>>>> but is is important to have the stderr present in the .jtr >>>>>>> automatically. >>>>>>> >>>>>>> The choice (3) looks even better if it is going to work well. >>>>>> This is basically what the original webrev did. It sent >>>>>> LingeredApp's stderr and stdout to the the driver apps stdout. >>>>>> It's a 1 word change to make it send stderr to stderr. I think it >>>>>> has a bug though that did not manifest itself. It seems the new >>>>>> copy() code that is capturing stdout would be contending with the >>>>>> existing InputGlobbler code that is doing the same. I would need >>>>>> to fix this to make sure LingeredApp.getAppOutput() still returns >>>>>> all the apps stdout output. 
>>>>>> >>>>>> Chris >>>>>>> Not sure, it is really necessary. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> BTW, here's the CR and original webrev for reference: >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>> >>>>>> >>>> >>> >>> >> >> > From serguei.spitsyn at oracle.com Wed Mar 21 18:54:34 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 21 Mar 2018 11:54:34 -0700 Subject: RFR(S): 8195109: ServiceUtil::visible_oop is not needed anymore In-Reply-To: <1f2774ce-b289-9ced-bc79-601e3d3e4bc3@oracle.com> References: <09b9d4e3-1db1-f0dc-6eee-73cfe33124f5@oracle.com> <1f2774ce-b289-9ced-bc79-601e3d3e4bc3@oracle.com> Message-ID: <754fc246-b7b8-9caf-d0ae-2546bb25acd5@oracle.com> Hi Chris, It looks good. Thanks, Serguei On 3/20/18 12:39, Chris Plummer wrote: > Hi, > > New webrev: > > http://cr.openjdk.java.net/~cjplummer/8195109/webrev.01/index.html > > There was a build failure on solaris-sparc in threadSMR.cpp. > References to the Copy class were producing "unresolved symbol" > errors. threadSMR.cpp includes threadService.hpp, which no longer > includes serviceUtil.hpp (because it was removed). It looks like > serviceUtil.hpp indirectly included "utilities/copy.hpp", so now I > include it directly in threadSMR.cpp. The problem was only on > solaris-sparc, so I assume on other platforms there was platform > dependent code indirectly pulling in copy.hpp. In any case, it's now > directly pulled in on all platforms. 
>
> thanks,
>
> Chris
>
> On 3/19/18 5:48 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8195109
>> http://cr.openjdk.java.net/~cjplummer/8195109/webrev.00/index.html
>>
>> The assert I added to make sure this is safe has been in place in
>> jdk/jdk for almost 3 weeks with no issues (longer in jdk/hs).
>>
>> The webrev is missing the copyright update for threadService.hpp. I
>> fixed it after noticing that.
>>
>> Testing is in progress. Running hs tiers 1, 2, and 3, and jdk tiers 1
>> and 2. Also making sure all serviceability tests are run.
>>
>> thanks,
>>
>> Chris
>
>

From Roger.Riggs at Oracle.com Wed Mar 21 19:13:52 2018
From: Roger.Riggs at Oracle.com (Roger Riggs)
Date: Wed, 21 Mar 2018 15:13:52 -0400
Subject: RFR 8199467 Compilation Errors in libinstrument Reentrancy.c with VS2017
Message-ID:

Please review a small change to avoid sign extension in libinstrument/
Reentrancy.c
to correct a compilation warning with vs2017.

diff --git a/src/java.instrument/share/native/libinstrument/Reentrancy.c b/src/java.instrument/share/native/libinstrument/Reentrancy.c
--- a/src/java.instrument/share/native/libinstrument/Reentrancy.c
+++ b/src/java.instrument/share/native/libinstrument/Reentrancy.c
@@ -90,7 +90,7 @@ assertTLSValue( jvmtiEnv *      jvmtienv
                 jthread         thread,
                 const void *    expected) {
     jvmtiError  error;
-    void *      test = (void *) 0x99999999;
+    void *      test = (void *) 0x99999999ul;
 
     /* now check if we do a fetch we get what we wrote */
     error = (*jvmtienv)->GetThreadLocalStorage(

Issue:
  https://bugs.openjdk.java.net/browse/JDK-8199467

Thanks, Roger

From serguei.spitsyn at oracle.com Wed Mar 21 19:17:15 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 21 Mar 2018 12:17:15 -0700
Subject: RFR 8199467 Compilation Errors in libinstrument Reentrancy.c with VS2017
In-Reply-To:
References:
Message-ID: <1ab47282-de3e-2681-34f3-a0546e258e22@oracle.com>

Hi Roger,

It looks good to me.
Thank you for taking care of this!

Thanks,
Serguei

On 3/21/18 12:13, Roger Riggs wrote:
> Please review a small change to avoid sign extension in libinstrument/
> Reentrancy.c
> to correct a compilation warning with vs2017.
>
> diff --git
> a/src/java.instrument/share/native/libinstrument/Reentrancy.c
> b/src/java.instrument/share/native/libinstrument/Reentrancy.c
> --- a/src/java.instrument/share/native/libinstrument/Reentrancy.c
> +++ b/src/java.instrument/share/native/libinstrument/Reentrancy.c
> @@ -90,7 +90,7 @@ assertTLSValue( jvmtiEnv *      jvmtienv
>                  jthread         thread,
>                  const void *    expected) {
>      jvmtiError  error;
> -    void *      test = (void *) 0x99999999;
> +    void *      test = (void *) 0x99999999ul;
>
>      /* now check if we do a fetch we get what we wrote */
>      error = (*jvmtienv)->GetThreadLocalStorage(
>
> Issue:
>   https://bugs.openjdk.java.net/browse/JDK-8199467
>
> Thanks, Roger
>

From martinrb at google.com Wed Mar 21 19:31:04 2018
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 21 Mar 2018 12:31:04 -0700
Subject: RFR 8199467 Compilation Errors in libinstrument Reentrancy.c with VS2017
In-Reply-To:
References:
Message-ID:

On Wed, Mar 21, 2018 at 12:13 PM, Roger Riggs wrote:

>
> -    void *      test = (void *) 0x99999999;
> +    void *      test = (void *) 0x99999999ul;
>

Martin's 15th law: Never use "l" in a numeric constant unless the constant
is 0xCafeBabel, so 0x99999999UL

From david.holmes at oracle.com Wed Mar 21 22:23:45 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 22 Mar 2018 08:23:45 +1000
Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations
In-Reply-To: <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com>
References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com>
 <9decad59-366b-a5bc-607f-88625169468d@oracle.com>
 <94b75a874b5c4a3cb1cb109a51220c16@sap.com>
 <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com>
 <693b2894c45244b1aaf914d6a00fc12b@sap.com>
 <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com>
 <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com>
Message-ID:

On 22/03/2018 12:51 AM, Daniel D. Daugherty wrote:
> On 3/21/18 10:23 AM, Daniel D. Daugherty wrote:
>> On 3/21/18 10:00 AM, Langer, Christoph wrote:
>>> Hi Dan,
>>>
>>> that is, you mean the C-code? My original change?
>>
>> Hmmm... I think I confused myself before I drank enough coffee...
>> Looking again...
>
> Okay I definitely confused myself... and I clearly don't remember
> the attach-on-demand code as well as I used to... sigh...
>
> I think you should keep your original fix since it now properly
> handles null arguments at the same attach-on-demand layer as the
> Linux code that you quoted.
>
> Handling this in args array processing would also be possible
> as David suggests, but it would bother me that Linux and Solaris
> lower attach-on-demand layers would have different behaviors.

They already do have completely different behaviours. Linux handles NULL
at the Java layer by inserting empty strings!

David

> Hope this is more clear.
>
> Dan
>
>
>>
>> Dan
>>
>>
>>>
>>> Best regards
>>> Christoph
>>>
>>>> -----Original Message-----
>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>> Sent: Mittwoch, 21. März 2018 14:59
>>>> To: Langer, Christoph ; David Holmes
>>>> ; serviceability-dev at openjdk.java.net
>>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations
>>>>
>>>> Hmmm... shouldn't the inconsistency in the Solaris backend also be
>>>> addressed?
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 3/21/18 8:45 AM, Langer, Christoph wrote:
>>>>> Hi David,
>>>>>
>>>>> thanks for looking at this. I currently have no emotions whether to
>>>>> fix it in C or in Java - I'll check it out...
>>>>> Best regards
>>>>> Christoph
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>> Sent: Mittwoch, 21. März 2018 10:20
>>>>>> To: Langer, Christoph ; serviceability-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations
>>>>>>
>>>>>> Hi Christoph,
>>>>>>
>>>>>> On 21/03/2018 6:10 PM, Langer, Christoph wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> may I please ask for reviews of the following small fix.
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8199924.0/
>>>>>>>
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8199924
>>>>>>>
>>>>>>> If one passes null arguments to the varargs of attach operations, they
>>>>>>> get swallowed on Solaris and following arguments will shift to lower
>>>>>>> positions.
>>>>>>>
>>>>>>> Other platform implementations handle this correctly, for instance
>>>>>>> linux:
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/f6ad4d73c834/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java#l178
>>>>>>
>>>>>> Wouldn't it be simpler to just handle this at the Java level and
>>>>>> substitute "" for null in the args array? We're only looking at a
>>>>>> maximum of three possible entries.
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Christoph >>>>>>> >> >> > From david.holmes at oracle.com Wed Mar 21 22:26:02 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 22 Mar 2018 08:26:02 +1000 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com> Message-ID: Sorry Chris I just don't have time to try and figure this one out. If it works uses it. David On 22/03/2018 4:24 AM, Chris Plummer wrote: > Yeah, this was all new to me. Before this I didn't know anything about > jtreg IO other than the use of OutputAnalyzer for capture and verification. > > Thanks for reviewing. > > Chris > > On 3/21/18 11:08 AM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> It looks good to me. >> It is a little bit more complicated than one would expect but reasonable. >> >> Thanks, >> Serguei >> >> >> On 3/21/18 09:31, Chris Plummer wrote: >>> Ping. I still need a couple of reviews for this. >>> >>> thanks, >>> >>> Chris >>> >>> On 3/19/18 3:50 PM, Chris Plummer wrote: >>>> I looked into modifying OutputAnalyzer (actually ended up being >>>> ProcessTools that needed all the changes) to be more flexible so it >>>> could support LingeredApp. The problem I ran into is that >>>> ProcessTools is all static, but I needed to create and return a >>>> context. It ended up being too much disruption, so I instead have >>>> the ProcessTools.getOutput() code as part of LingeredApp. 
>>>> >>>> Another thing I discovered is that you can use OutputAnalyzer with >>>> already generated output, so this option is still available to users >>>> of LingeredApp. You just need to do something like: >>>> >>>> ??? OutputAnalyzer out = new >>>> OutputAnalyzer(lingeredApp.getOutput().getStdout(), >>>> lingeredApp.getOutput().getStderr()); >>>> >>>> I didn't change any test to take advantage of this, but it's there >>>> if someone wants it. >>>> >>>> I've included another webrev below (completely different from the >>>> original). In the end, all LingeredApp stdout and stderr is dumped >>>> after the app exits. The old way of storing away the stdout using an >>>> InputGobbler is gone. Since getAppOutput() depended on this, and the >>>> new way of saving stdout saves it as one big string rather than a >>>> List of lines, getAppOutput() needed some changes to convert to the >>>> List form. >>>> >>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03 >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 3/19/18 9:39 AM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Just to clarify one point, most of the tests that use >>>>> OutputAnalyzer do not display process output unless there is an >>>>> error. So part of the decision here with LingeredApp is when to >>>>> display the output. Currently the stdout is captured, but not >>>>> displayed, unless the tests does the work to display it, which none >>>>> do. Currently stderr goes to the console. Note that some negative >>>>> tests actually cause some expected stderr output, although the >>>>> tests don't check for it. >>>>> >>>>> One thought I just had is to create an async option for >>>>> OutputAnalyzer so it doesn't block until the process exits. >>>>> Basically that means splitting ProcessTools.getOutput() so it >>>>> doesn't block. What I currently have is essentially doing that. It >>>>> copies ProcessTools.getOutput(), splitting it into two parts. 
But >>>>> all this logic is in LingeredApp, and of course doesn't have any of >>>>> the output error checking support that OutputAnalyzer, which might >>>>> be useful for LingeredApp. For example, the negative tests only >>>>> test that launching the app failed. They could be improved by >>>>> checking for specific error output. >>>>> >>>>> Chris >>>>> >>>>> On 3/17/18 12:11 AM, David Holmes wrote: >>>>>> I'm afraid I'm losing track of this change. >>>>>> >>>>>> The key thing is that we should not have a test that launches any >>>>>> other process for which we can not see the output of that process. >>>>>> >>>>>> David >>>>>> >>>>>> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>>>>>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thank you for taking care about this issue! >>>>>>>> >>>>>>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I've resolved the issues I had before with not seeing all the >>>>>>>>> stderr output when I tried to capture it. What I'd like to do >>>>>>>>> now is have us decide how the output should be handled from the >>>>>>>>> perspective a LingeredApp user (driver app). Currently all >>>>>>>>> LingeredApp stdout is captured and gets be returned the the >>>>>>>>> driver app by calling app.getAppOutput(). It does not appear in >>>>>>>>> the .jtr file, but the test would have the option of dumping it >>>>>>>>> there it it cared to. Only one test uses app.getAppOutput(). >>>>>>>>> Currently all the LingeredApp stderr is redirected to the >>>>>>>>> console, so it does not appear in the .jtr file. >>>>>>>> >>>>>>>> Just a general comment to make sure I understand it and ensure >>>>>>>> we are in sync. >>>>>>>> It seems much more safe to always have both stdout and stderr >>>>>>>> outputs present in the .jtr automatically file independently of >>>>>>>> of what the test does. >>>>>>>> >>>>>>>> >>>>>>>>> So how do we want this changed? 
Some possibilities are: >>>>>>>>> >>>>>>>>> (1) capture stderr just like stdout currently is, and leave is >>>>>>>>> up the the driver app to decide if it wants to display it >>>>>>>>> (after the app terminates). >>>>>>>> >>>>>>>> It does not look good to me (see above) but maybe I'm missing >>>>>>>> something important here. >>>>>>>> >>>>>>>>> (2) capture stderr just like stdout currently is, but have >>>>>>>>> LingeredApp automatically send captured output to driver app's >>>>>>>>> stdout and stderr (after the app terminates). >>>>>>>> >>>>>>>> The stdout and std err will be separated in this case, right? >>>>>>>> Do you have a webrev for this? >>>>>>> I currently have it working like this, although I need to fix >>>>>>> LingeredApp.getAppOutput(). I had to make it return a single >>>>>>> String instead of a List of Strings, so this breaks the one test >>>>>>> that uses this API. It's easily fixed. Just haven't gotten around >>>>>>> to it yet. >>>>>>>> >>>>>>>> >>>>>>>>> (3) send the LingeredApp's stdout and stderr to the driver >>>>>>>>> app's stdout as it is being captured (this was the original fix >>>>>>>>> Igor suggested and the webrev supported). A minor alternative >>>>>>>>> to this is to keep the two streams separated instead of sending >>>>>>>>> both to stdout. >>>>>>>>> >>>>>>>>> Let me know what you think. I'm inclined to go with 2, >>>>>>>>> especially since normally there is little to no output from the >>>>>>>>> LingeredApp. >>>>>>>> >>>>>>>> The choice (2) looks good enough. >>>>>>>> Not sure it is that important to have output from stdout and >>>>>>>> stderr sync'ed >>>>>>>> but is is important to have the stderr present in the .jtr >>>>>>>> automatically. >>>>>>>> >>>>>>>> The choice (3) looks even better if it is going to work well. >>>>>>> This is basically what the original webrev did. It sent >>>>>>> LingeredApp's stderr and stdout to the the driver apps stdout. >>>>>>> It's a 1 word change to make it send stderr to stderr. 
I think it >>>>>>> has a bug though that did not manifest itself. It seems the new >>>>>>> copy() code that is capturing stdout would be contending with the >>>>>>> existing InputGlobbler code that is doing the same. I would need >>>>>>> to fix this to make sure LingeredApp.getAppOutput() still returns >>>>>>> all the apps stdout output. >>>>>>> >>>>>>> Chris >>>>>>>> Not sure, it is really necessary. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> BTW, here's the CR and original webrev for reference: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >>>> >>> >>> >> > > From david.holmes at oracle.com Wed Mar 21 22:39:23 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 22 Mar 2018 08:39:23 +1000 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Message-ID: On 22/03/2018 2:41 AM, Alex Menkov wrote: > Hi David, > > On 03/20/2018 21:51, David Holmes wrote: >> Hi Alex, >> >> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>> Hi David, >>> >>> On 03/19/2018 18:10, David Holmes wrote: >>>> Hi Alex, >>>> >>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>> Hi guys, >>>>> >>>>> please re-review the fix. >>>> >>>> I still have an unanswered question about where the max of 49 is >>>> enforced. I see it for the "address" but not names in general. ?? >>> >>> for shmem the "channel name" is the address (it's checked in >>> createTransport/openTransport). 
>>> Names for mutexes/events are generated by appending some strings to
>>> the address, and the lengths of the added parts are supposed to be less
>>> than MAX_IPC_SUFFIX (25 symbols):
>>> ".mutex" (+ up to 3 symbols)
>>> ".hasData" (+ up to 3 symbols)
>>> ".hasSpace" (+ up to 3 symbols)
>>> ".ctos"
>>> ".stoc"
>>> ".accept" (+ up to 3 symbols)
>>> ".attach" (+ up to 3 symbols)
>>> "." (pid is a DWORD)
>>
>> Okay so ... the code in shmemBase.c is very unclear as to which
>> "names" can come in from an external source and which are only ever
>> derived from other "names". If the "address" (which seems a very bad
>> description in this case!) is the only external source for a name, and
>> it is limited to a length of 49 then that is okay.
>
> Yes, the "address" is the only external arg, all other names are
> constructed from it.
> I believe it's "address" because it comes from the "address" parameter:
> -Xrunjdwp:transport=dt_shmem,address=
>
>>
>>>>
>>>>> Reg. test is added to the issue.
>>>>
>>>> I don't quite follow the test. I see you try to set the name with a
>>>> value that is too long, and if that doesn't cause an overflow and we
>>>> don't crash that is good. But I'd expect you to read back the name
>>>> and check it matches the truncated name with 49 characters.
>>>
>>> The test specifies the maximum length supported (49 symbols)
>>> (if a longer name is specified, the "address strings longer than 50
>>> characters are invalid" error is reported).
>>
>> I missed the substring that simply causes the name to be the maximum
>> supported length. That would trigger the overflow and so suffices as a
>> regression test for this fix.
>>
>> Is there another test that already passes a too-long name and verifies
>> the error gets thrown?
>
> Do you mean name >= 50 symbols?
> No, there is no such test.
> I don't think it makes much sense (test an arbitrary
> implementation-specific restriction), but I can add the case to the test.

It ensures that using a too-long name fails gracefully.

Thanks,
David

> --alex
>
>>
>>> As far as I see there is no way to read back the name used to create
>>> the transport.
>>
>> Ok.
>>
>> Thanks,
>> David
>> -----
>>
>>> --alex
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/
>>>>>
>>>>> --alex
>>>>>
>>>>> On 03/13/2018 16:14, Alex Menkov wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review a small fix for
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/
>>>>>>
>>>>>> Root cause of the issue is that jdb hangs as a result of the buffer
>>>>>> overflow.
>>>>>>
>>>>>> In the beginning of the shmemBase.c:
>>>>>>
>>>>>> #define MAX_IPC_PREFIX 50   /* user-specified or generated name for */
>>>>>>                             /* shared memory seg and prefix for other IPC */
>>>>>> #define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other IPC names */
>>>>>> #define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)
>>>>>>
>>>>>> buffer (char prefix[]) in function createStream is used to
>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not big
>>>>>> enough.
>>>>>> >>>>>> --alex From igor.ignatyev at oracle.com Thu Mar 22 04:02:54 2018 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 21 Mar 2018 21:02:54 -0700 Subject: RFR(S): 8198655: test/lib/jdk/test/lib/apps/LingeredApp shouldn't inherit cout/cerr In-Reply-To: References: <0620299e-6127-17d8-a11d-29c62f536c0f@oracle.com> <0C33B71D-A2D5-40F3-9DF4-6FD1333E3DC5@oracle.com> <0aa48be4-f549-bb4a-61e6-bb9bdbcca31c@oracle.com> <4f7f3d57-890c-87eb-abea-ed0f04d58ee6@oracle.com> <0abb9c09-bee9-7c7b-db26-99bad154e7fc@oracle.com> <2fd4bb59-7b5f-3598-10c3-da3a2ab957af@oracle.com> <4bef98cf-e422-572a-508c-b0d4c9aff8f4@oracle.com> <05d6de94-c52a-b3f2-7a41-293f7e490475@oracle.com> <81748aaf-b7f7-75ce-27fa-e54de754607e@oracle.com> Message-ID: Hi Chris, the changeset looks reasonable, reviewed. Thanks, -- Igor > On Mar 21, 2018, at 3:26 PM, David Holmes wrote: > > Sorry Chris I just don't have time to try and figure this one out. If it works uses it. > > David > > On 22/03/2018 4:24 AM, Chris Plummer wrote: >> Yeah, this was all new to me. Before this I didn't know anything about jtreg IO other than the use of OutputAnalyzer for capture and verification. >> Thanks for reviewing. >> Chris >> On 3/21/18 11:08 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> It looks good to me. >>> It is a little bit more complicated than one would expect but reasonable. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 3/21/18 09:31, Chris Plummer wrote: >>>> Ping. I still need a couple of reviews for this. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 3/19/18 3:50 PM, Chris Plummer wrote: >>>>> I looked into modifying OutputAnalyzer (actually ended up being ProcessTools that needed all the changes) to be more flexible so it could support LingeredApp. The problem I ran into is that ProcessTools is all static, but I needed to create and return a context. It ended up being too much disruption, so I instead have the ProcessTools.getOutput() code as part of LingeredApp. 
>>>>> >>>>> Another thing I discovered is that you can use OutputAnalyzer with already generated output, so this option is still available to users of LingeredApp. You just need to do something like: >>>>> >>>>> OutputAnalyzer out = new OutputAnalyzer(lingeredApp.getOutput().getStdout(), lingeredApp.getOutput().getStderr()); >>>>> >>>>> I didn't change any test to take advantage of this, but it's there if someone wants it. >>>>> >>>>> I've included another webrev below (completely different from the original). In the end, all LingeredApp stdout and stderr is dumped after the app exits. The old way of storing away the stdout using an InputGobbler is gone. Since getAppOutput() depended on this, and the new way of saving stdout saves it as one big string rather than a List of lines, getAppOutput() needed some changes to convert to the List form. >>>>> >>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.03 >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 3/19/18 9:39 AM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Just to clarify one point, most of the tests that use OutputAnalyzer do not display process output unless there is an error. So part of the decision here with LingeredApp is when to display the output. Currently the stdout is captured, but not displayed, unless the tests does the work to display it, which none do. Currently stderr goes to the console. Note that some negative tests actually cause some expected stderr output, although the tests don't check for it. >>>>>> >>>>>> One thought I just had is to create an async option for OutputAnalyzer so it doesn't block until the process exits. Basically that means splitting ProcessTools.getOutput() so it doesn't block. What I currently have is essentially doing that. It copies ProcessTools.getOutput(), splitting it into two parts. But all this logic is in LingeredApp, and of course doesn't have any of the output error checking support that OutputAnalyzer, which might be useful for LingeredApp. 
For example, the negative tests only test that launching the app failed. They could be improved by checking for specific error output. >>>>>> >>>>>> Chris >>>>>> >>>>>> On 3/17/18 12:11 AM, David Holmes wrote: >>>>>>> I'm afraid I'm losing track of this change. >>>>>>> >>>>>>> The key thing is that we should not have a test that launches any other process for which we can not see the output of that process. >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 17/03/2018 7:48 AM, Chris Plummer wrote: >>>>>>>> On 3/16/18 1:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thank you for taking care of this issue! >>>>>>>>> >>>>>>>>> On 3/16/18 11:20, Chris Plummer wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I've resolved the issues I had before with not seeing all the stderr output when I tried to capture it. What I'd like to do now is have us decide how the output should be handled from the perspective of a LingeredApp user (driver app). Currently all LingeredApp stdout is captured and gets returned to the driver app by calling app.getAppOutput(). It does not appear in the .jtr file, but the test would have the option of dumping it there if it cared to. Only one test uses app.getAppOutput(). Currently all the LingeredApp stderr is redirected to the console, so it does not appear in the .jtr file. >>>>>>>>> >>>>>>>>> Just a general comment to make sure I understand it and ensure we are in sync. >>>>>>>>> It seems much safer to always have both stdout and stderr outputs present in the .jtr file automatically, independently of what the test does. >>>>>>>>> >>>>>>>>> >>>>>>>>>> So how do we want this changed? Some possibilities are: >>>>>>>>>> >>>>>>>>>> (1) capture stderr just like stdout currently is, and leave it up to the driver app to decide if it wants to display it (after the app terminates). >>>>>>>>> >>>>>>>>> It does not look good to me (see above) but maybe I'm missing something important here.
>>>>>>>>> >>>>>>>>>> (2) capture stderr just like stdout currently is, but have LingeredApp automatically send captured output to driver app's stdout and stderr (after the app terminates). >>>>>>>>> >>>>>>>>> The stdout and stderr will be separated in this case, right? >>>>>>>>> Do you have a webrev for this? >>>>>>>> I currently have it working like this, although I need to fix LingeredApp.getAppOutput(). I had to make it return a single String instead of a List of Strings, so this breaks the one test that uses this API. It's easily fixed. Just haven't gotten around to it yet. >>>>>>>>> >>>>>>>>> >>>>>>>>>> (3) send the LingeredApp's stdout and stderr to the driver app's stdout as it is being captured (this was the original fix Igor suggested and the webrev supported). A minor alternative to this is to keep the two streams separated instead of sending both to stdout. >>>>>>>>>> >>>>>>>>>> Let me know what you think. I'm inclined to go with 2, especially since normally there is little to no output from the LingeredApp. >>>>>>>>> >>>>>>>>> The choice (2) looks good enough. >>>>>>>>> Not sure it is that important to have output from stdout and stderr sync'ed >>>>>>>>> but it is important to have the stderr present in the .jtr automatically. >>>>>>>>> >>>>>>>>> The choice (3) looks even better if it is going to work well. >>>>>>>> This is basically what the original webrev did. It sent LingeredApp's stderr and stdout to the driver app's stdout. It's a one-word change to make it send stderr to stderr. I think it has a bug though that did not manifest itself. It seems the new copy() code that is capturing stdout would be contending with the existing InputGobbler code that is doing the same. I would need to fix this to make sure LingeredApp.getAppOutput() still returns all the app's stdout output. >>>>>>>> >>>>>>>> Chris >>>>>>>>> Not sure, it is really necessary.
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> BTW, here's the CR and original webrev for reference: >>>>>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8198655 >>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8198655/webrev.00/webrev/ >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>>> >>>> >>>> >>> From amit.sapre at oracle.com Thu Mar 22 10:03:42 2018 From: amit.sapre at oracle.com (Amit Sapre) Date: Thu, 22 Mar 2018 03:03:42 -0700 (PDT) Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> Message-ID: Thanks Alan and Mandy for inputs. This webrev : http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.01/ addresses your comments. I reverted changes for legacy.properties and also rebased the patch for jdk/jdk repo. Thanks, Amit From: mandy chung Sent: Friday, March 09, 2018 1:33 AM To: Alan Bateman; Amit Sapre Cc: serviceability-dev at openjdk.java.net; compiler-dev at openjdk.java.net Subject: Re: RFR : JDK-8071367 - JMX: Remove SNMP support On 3/7/18 11:46 PM, Alan Bateman wrote: On 08/03/2018 07:27, Amit Sapre wrote: Hello, Please review the changes for removing SNMP support. Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 Webrev : http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.00 cc'ing compiler-dev for help on javac/resources/legacy.properties. I'm not 100% sure if it is used when compiling to old releases or not. I think legacy.properties is for compiling for older releases prior to 9 and it should not be changed. Let's get the compiler team to confirm. As you are re-wording the class description for jdk.internal.agent.Agent then we might as well get it right.
The Agent class is loaded and its static no-arg startAgent method is invoked when a system property starting with "com.sun.management" is specified on the command line. We could expand this to include the case where it is started in a running VM too. Good suggestion. build.properties - I assume the empty value for excludes shouldn't have a continuation character now. The rest looks good to me. I looked through the webrev. No other comments besides those called out above. I created https://bugs.openjdk.java.net/browse/JDK-8199358 to track the docs update. Mandy From yasuenag at gmail.com Thu Mar 22 10:35:11 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 22 Mar 2018 19:35:11 +0900 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" Message-ID: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> Hi all, Please review this change: JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ After JDK-8153333, some jstat tests fail because GCT in the jstat output is a dash (-) if the garbage collector is not a concurrent collector, e.g. Serial GC. I fixed it so that GCT is calculated correctly. This change has been tested on Mach5 by Stefan. Thanks, Yasumasa From Alan.Bateman at oracle.com Thu Mar 22 13:12:23 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 22 Mar 2018 13:12:23 +0000 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> Message-ID: On 22/03/2018 10:03, Amit Sapre wrote: > > Thanks Alan and Mandy for inputs. > > This webrev : > http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.01/ > > addresses your comments. > > I reverted changes for "legacy.properties" and also rebased the patch > for jdk/jdk repo.
> > This looks good, just a few typos in Agent's class description "This class also provide entrypoints for jcmd tool ..." => "This class also provides entry points for the jcmd tool ...". -Alan. From Roger.Riggs at Oracle.com Thu Mar 22 13:46:16 2018 From: Roger.Riggs at Oracle.com (Roger Riggs) Date: Thu, 22 Mar 2018 09:46:16 -0400 Subject: RFR 8199467 Compilation Errors in libinstrument Reentrancy.c with VS2017 In-Reply-To: References: Message-ID: Hi Martin, Good recommendation; pushed. Thanks, Roger On 3/21/2018 3:31 PM, Martin Buchholz wrote: > > > On Wed, Mar 21, 2018 at 12:13 PM, Roger Riggs > wrote: > > > - void * test = (void *) 0x99999999; > + void * test = (void *) 0x99999999ul; > > > Martin's 15th law: Never use "l" in a numeric constant unless the > constant is 0xCafeBabel, > > so 0x99999999UL From mandy.chung at oracle.com Thu Mar 22 17:45:10 2018 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 22 Mar 2018 10:45:10 -0700 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> Message-ID: On 3/22/18 6:12 AM, Alan Bateman wrote: > > > On 22/03/2018 10:03, Amit Sapre wrote: >> >> Thanks Alan and Mandy for inputs. >> >> This webrev : >> http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.01/ >> >> addresses your comments. >> >> I reverted changes for "legacy.properties" and also rebased the patch >> for jdk/jdk repo. >> >> > This looks good, just a few typos in Agent's class description > > "This class also provide entrypoints for jcmd tool ..." => "This class > also provides entry points for the jcmd tool ...". > Looks good in general. Since you are cleaning up the comment, some suggestions: Replace line 60-67 with: This class provides the methods to start the management agent. 1.
{@link #startAgent} method is invoked by the VM if -Dcom.sun.management.* is set 2. {@link #startLocalManagementAgent} or {@link #startRemoteManagementAgent} is invoked to start the management agent after the VM starts via jcmd ManagementAgent.start and start_local command. line 309-310 can be replaced with /* * Starts the local management agent. * This method is invoked by either startAgent method or * by the VM directly via jcmd ManagementAgent.start_local command. */ line 332-335 can be replaced with /* * This method is invoked by the VM to start the remote management agent * via jcmd ManagementAgent.start command. */ Add the javadoc to startAgent() /* * This method is invoked by the VM to start the management agent * when -Dcom.sun.management.* is set during startup. */ Mandy From christoph.langer at sap.com Thu Mar 22 19:15:46 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 22 Mar 2018 19:15:46 +0000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> Message-ID: <1d59458be1ab4210bd05e592ed68a578@sap.com> Hi David, > > I think you should keep your original fix since it now properly > > handles null arguments at the same attach-on-demand layer as the > > Linux code that you quoted. > > > > Handling this in args array processing would also be possible > > as David suggests, but it would bother me that Linux and Solaris > > lower attach-on-demand layers would have different behaviors.
> > They already do have completely different behaviours. Linux handles NULL > at the Java layer by inserting empty strings! I had another look at the implementations on the various platforms. You are right, Linux, AIX and Mac write empty strings at the Java layer, but the enqueue mechanisms of these platforms look quite different from Solaris. However, for Windows, where the implementation is different again, the handling of null params happens in the native C code. For Solaris, I also see the native code as the best place. So if you don't mind I would keep my change and annotate you and Dan as reviewers, ok? Thanks Christoph From alexey.menkov at oracle.com Thu Mar 22 20:43:42 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 22 Mar 2018 13:43:42 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Message-ID: Updated webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ The test was updated to ensure that a shmem name longer than 49 symbols causes java to fail. --alex On 03/21/2018 15:39, David Holmes wrote: > On 22/03/2018 2:41 AM, Alex Menkov wrote: >> Hi David, >> >> On 03/20/2018 21:51, David Holmes wrote: >>> Hi Alex, >>> >>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>> Hi David, >>>> >>>> On 03/19/2018 18:10, David Holmes wrote: >>>>> Hi Alex, >>>>> >>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>> Hi guys, >>>>>> >>>>>> please re-review the fix. >>>>> >>>>> I still have an unanswered question about where the max of 49 is >>>>> enforced. I see it for the "address" but not names in general. ??
>>>> >>>> for shmem the "channel name" is the address (it's checked in >>>> createTransport/openTransport). >>>> Names for mutexes/events are generated by appending some strings to >>>> the adddress and length of the added parts are supposed to be less >>>> than MAX_IPC_SUFFIX (25 symbols): >>>> ".mutex" (+ up to 3 symbols) >>>> ".hasData" (+ up to 3 symbols) >>>> ".hasSpace" (+ up to 3 symbols) >>>> ".ctos" >>>> ".stoc" >>>> ".accept" (+ up to 3 symbols) >>>> ".attach" (+ up to 3 symbols) >>>> "." (pid is a DWORD) >>> >>> Okay so ... the code in shmemBase.c is very unclear as to which >>> "names" can come in from an external source and which are only ever >>> derived from other "names". If the "address" (which seems a very bad >>> description in this case!) is the only external source for a name, >>> and it is limited to a length of 49 then that is okay. >> >> Yes, the "address" is the only external arg, all other names are >> constructed from it. >> I believe it's "address" because it comes from "address" parameter: >> -Xrunjdwp:transport=st_shmem,address= >> >>> >>>>> >>>>>> Reg.test is added the the issue. >>>>> >>>>> I don't quite follow the test. I see you try to set the name with a >>>>> value that is too long, and if that doesn't cause an overflow and >>>>> we don't crash that is good. But I'd expect you to read back the >>>>> name and check it matches the truncated name with 49 characters. >>>> >>>> The test specifies the maximum length supported (49 symbols) >>>> (if longer name is specified, "address strings longer than 50 >>>> characters are invalid" error reported). >>> >>> I missed the substring that simply causes the name to be the maximum >>> supported length. That would trigger the overflow and so suffices as >>> a regression test for this fix. >>> >>> Is there another test that already passes a too-long name and >>> verifies the error gets thrown? >> >> Do you mean name >= 50 symbols? >> No, there is no such test. 
>> I don't think it make much sense (test an arbitrary >> implementation-specific restriction), but I can add the case to the test. > > It ensures that using a too-long name fails gracefully. > > Thanks, > David > >> --alex >> >>> >>>> As far as I see there is no way to read back the name used to create >>>> the transport. >>> >>> Ok. >>> >>> Thanks, >>> David >>> ----- >>> >>>> --alex >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>> >>>>>> --alex >>>>>> >>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review a small fix for >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>> >>>>>>> Root cause of the issue is jbd hungs as a result of the buffer >>>>>>> overflow. >>>>>>> >>>>>>> In the beginning of the shmemBase.c: >>>>>>> >>>>>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name >>>>>>> for */ >>>>>>> ???????????????????????????? /* shared memory seg and prefix for >>>>>>> other IPC */ >>>>>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other IPC >>>>>>> names */ >>>>>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>> >>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not big >>>>>>> enough. 
>>>>>>> >>>>>>> --alex From david.holmes at oracle.com Thu Mar 22 21:24:25 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 23 Mar 2018 07:24:25 +1000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: <1d59458be1ab4210bd05e592ed68a578@sap.com> References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> <1d59458be1ab4210bd05e592ed68a578@sap.com> Message-ID: On 23/03/2018 5:15 AM, Langer, Christoph wrote: > Hi David, > >>> I think you should keep your original fix since it now properly >>> handles null arguments at the same attach-on-demand layer as the >>> Linux code that you quoted. >>> >>> Handling this in args array processing would also be possible >>> as David suggests, but it would bother me that Linux and Solaris >>> lower attach-on-demand layers would have different behaviors. >> >> They already do have completely different behaviours. Linux handles NULL >> at the Java layer by inserting empty strings! > > I had another look at the implementations on the various platforms. > > You are right, linux, aix and mac would write empty strings on java layer - but the enque mechanisms of these platforms looks quite different to Solaris. However, for Windows, where the implementation is different again, the handling of null params happens in the c-native code. For Solaris I would see the best place in the native code as well. The native layer does differ across platforms - Solaris uses "doors", which other platforms don't have. It would make most sense to me if the Java level for each platform basically worked the same, particularly in the handling of nulls. But that should have been the case from day one. 
> So if you don't mind I would keep my change and annotate you and Dan as reviewers, ok? Fine. The code seemed okay - but harder to judge versus the trivial changes to java code. David ----- > Thanks > Christoph > From david.holmes at oracle.com Thu Mar 22 21:32:31 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 23 Mar 2018 07:32:31 +1000 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Message-ID: On 23/03/2018 6:43 AM, Alex Menkov wrote: > > Updated webrev: > http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ > > The test was updated to ensure shmem name longer than 49 symbols causes > java failure. This doesn't ensure it failed gracefully: 81 // extra test: ensure using of too-long name fails gracefully 82 // (shmemName + "X") is expected to be "too long". 83 ProcessTools.executeProcess(getTarget(shmemName + "X")) 84 .shouldNotHaveExitValue(0); It may have crashed. What exactly is the failure mode? return code 1? Exception message that we can check for in outputAnalyzer ? David > --alex > > > On 03/21/2018 15:39, David Holmes wrote: >> On 22/03/2018 2:41 AM, Alex Menkov wrote: >>> Hi David, >>> >>> On 03/20/2018 21:51, David Holmes wrote: >>>> Hi Alex, >>>> >>>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>>> Hi David, >>>>> >>>>> On 03/19/2018 18:10, David Holmes wrote: >>>>>> Hi Alex, >>>>>> >>>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>>> Hi guys, >>>>>>> >>>>>>> please re-review the fix. >>>>>> >>>>>> I still have an unanswered question about where the max of 49 is >>>>>> enforced. I see it for the "address" but not names in general. ?? 
>>>>> >>>>> for shmem the "channel name" is the address (it's checked in >>>>> createTransport/openTransport). >>>>> Names for mutexes/events are generated by appending some strings to >>>>> the adddress and length of the added parts are supposed to be less >>>>> than MAX_IPC_SUFFIX (25 symbols): >>>>> ".mutex" (+ up to 3 symbols) >>>>> ".hasData" (+ up to 3 symbols) >>>>> ".hasSpace" (+ up to 3 symbols) >>>>> ".ctos" >>>>> ".stoc" >>>>> ".accept" (+ up to 3 symbols) >>>>> ".attach" (+ up to 3 symbols) >>>>> "." (pid is a DWORD) >>>> >>>> Okay so ... the code in shmemBase.c is very unclear as to which >>>> "names" can come in from an external source and which are only ever >>>> derived from other "names". If the "address" (which seems a very bad >>>> description in this case!) is the only external source for a name, >>>> and it is limited to a length of 49 then that is okay. >>> >>> Yes, the "address" is the only external arg, all other names are >>> constructed from it. >>> I believe it's "address" because it comes from "address" parameter: >>> -Xrunjdwp:transport=st_shmem,address= >>> >>>> >>>>>> >>>>>>> Reg.test is added the the issue. >>>>>> >>>>>> I don't quite follow the test. I see you try to set the name with >>>>>> a value that is too long, and if that doesn't cause an overflow >>>>>> and we don't crash that is good. But I'd expect you to read back >>>>>> the name and check it matches the truncated name with 49 characters. >>>>> >>>>> The test specifies the maximum length supported (49 symbols) >>>>> (if longer name is specified, "address strings longer than 50 >>>>> characters are invalid" error reported). >>>> >>>> I missed the substring that simply causes the name to be the maximum >>>> supported length. That would trigger the overflow and so suffices as >>>> a regression test for this fix. >>>> >>>> Is there another test that already passes a too-long name and >>>> verifies the error gets thrown? >>> >>> Do you mean name >= 50 symbols? 
>>> No, there is no such test. >>> I don't think it make much sense (test an arbitrary >>> implementation-specific restriction), but I can add the case to the >>> test. >> >> It ensures that using a too-long name fails gracefully. >> >> Thanks, >> David >> >>> --alex >>> >>>> >>>>> As far as I see there is no way to read back the name used to >>>>> create the transport. >>>> >>>> Ok. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> --alex >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>>> >>>>>>> --alex >>>>>>> >>>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review a small fix for >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>>> >>>>>>>> Root cause of the issue is jbd hungs as a result of the buffer >>>>>>>> overflow. >>>>>>>> >>>>>>>> In the beginning of the shmemBase.c: >>>>>>>> >>>>>>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name >>>>>>>> for */ >>>>>>>> ???????????????????????????? /* shared memory seg and prefix for >>>>>>>> other IPC */ >>>>>>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other >>>>>>>> IPC names */ >>>>>>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>>> >>>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not >>>>>>>> big enough. 
>>>>>>>> >>>>>>>> --alex From alexey.menkov at oracle.com Thu Mar 22 23:18:11 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 22 Mar 2018 16:18:11 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array Message-ID: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> Hi all, Please take a look at a simple fix for https://bugs.openjdk.java.net/browse/JDK-8198393 webrev: http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ --alex From alexey.menkov at oracle.com Thu Mar 22 23:28:15 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 22 Mar 2018 16:28:15 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> Message-ID: <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> Hi David, With too-long shmem name java reports: ERROR: transport error 202: failed to create shared memory listener: Error: address strings longer than 50 characters are invalid and ret.code is 2 I added checks for both ret.code and presence of "address strings longer than" text in the output. webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.06/ --alex On 03/22/2018 14:32, David Holmes wrote: > On 23/03/2018 6:43 AM, Alex Menkov wrote: >> >> Updated webrev: >> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ >> >> The test was updated to ensure shmem name longer than 49 symbols >> causes java failure. > > This doesn't ensure it failed gracefully: > > 81???????? // extra test: ensure using of too-long name fails gracefully > 82???????? 
// (shmemName + "X") is expected to be "too long". > 83???????? ProcessTools.executeProcess(getTarget(shmemName + "X")) > 84???????????????? .shouldNotHaveExitValue(0); > > It may have crashed. What exactly is the failure mode? return code 1? > Exception message that we can check for in outputAnalyzer ? > > David > >> --alex >> >> >> On 03/21/2018 15:39, David Holmes wrote: >>> On 22/03/2018 2:41 AM, Alex Menkov wrote: >>>> Hi David, >>>> >>>> On 03/20/2018 21:51, David Holmes wrote: >>>>> Hi Alex, >>>>> >>>>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>>>> Hi David, >>>>>> >>>>>> On 03/19/2018 18:10, David Holmes wrote: >>>>>>> Hi Alex, >>>>>>> >>>>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>>>> Hi guys, >>>>>>>> >>>>>>>> please re-review the fix. >>>>>>> >>>>>>> I still have an unanswered question about where the max of 49 is >>>>>>> enforced. I see it for the "address" but not names in general. ?? >>>>>> >>>>>> for shmem the "channel name" is the address (it's checked in >>>>>> createTransport/openTransport). >>>>>> Names for mutexes/events are generated by appending some strings >>>>>> to the adddress and length of the added parts are supposed to be >>>>>> less than MAX_IPC_SUFFIX (25 symbols): >>>>>> ".mutex" (+ up to 3 symbols) >>>>>> ".hasData" (+ up to 3 symbols) >>>>>> ".hasSpace" (+ up to 3 symbols) >>>>>> ".ctos" >>>>>> ".stoc" >>>>>> ".accept" (+ up to 3 symbols) >>>>>> ".attach" (+ up to 3 symbols) >>>>>> "." (pid is a DWORD) >>>>> >>>>> Okay so ... the code in shmemBase.c is very unclear as to which >>>>> "names" can come in from an external source and which are only ever >>>>> derived from other "names". If the "address" (which seems a very >>>>> bad description in this case!) is the only external source for a >>>>> name, and it is limited to a length of 49 then that is okay. >>>> >>>> Yes, the "address" is the only external arg, all other names are >>>> constructed from it. 
>>>> I believe it's "address" because it comes from "address" parameter: >>>> -Xrunjdwp:transport=st_shmem,address= >>>> >>>>> >>>>>>> >>>>>>>> Reg.test is added the the issue. >>>>>>> >>>>>>> I don't quite follow the test. I see you try to set the name with >>>>>>> a value that is too long, and if that doesn't cause an overflow >>>>>>> and we don't crash that is good. But I'd expect you to read back >>>>>>> the name and check it matches the truncated name with 49 characters. >>>>>> >>>>>> The test specifies the maximum length supported (49 symbols) >>>>>> (if longer name is specified, "address strings longer than 50 >>>>>> characters are invalid" error reported). >>>>> >>>>> I missed the substring that simply causes the name to be the >>>>> maximum supported length. That would trigger the overflow and so >>>>> suffices as a regression test for this fix. >>>>> >>>>> Is there another test that already passes a too-long name and >>>>> verifies the error gets thrown? >>>> >>>> Do you mean name >= 50 symbols? >>>> No, there is no such test. >>>> I don't think it make much sense (test an arbitrary >>>> implementation-specific restriction), but I can add the case to the >>>> test. >>> >>> It ensures that using a too-long name fails gracefully. >>> >>> Thanks, >>> David >>> >>>> --alex >>>> >>>>> >>>>>> As far as I see there is no way to read back the name used to >>>>>> create the transport. >>>>> >>>>> Ok. 
>>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> --alex >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>>>> >>>>>>>> --alex >>>>>>>> >>>>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review a small fix for >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>>>> >>>>>>>>> Root cause of the issue is jbd hungs as a result of the buffer >>>>>>>>> overflow. >>>>>>>>> >>>>>>>>> In the beginning of the shmemBase.c: >>>>>>>>> >>>>>>>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated name >>>>>>>>> for */ >>>>>>>>> ???????????????????????????? /* shared memory seg and prefix >>>>>>>>> for other IPC */ >>>>>>>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for other >>>>>>>>> IPC names */ >>>>>>>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>>>> >>>>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not >>>>>>>>> big enough. >>>>>>>>> >>>>>>>>> --alex From david.holmes at oracle.com Thu Mar 22 23:50:09 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 23 Mar 2018 09:50:09 +1000 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> Message-ID: <12f15c12-be21-ffd6-0172-7ca3be1f38f8@oracle.com> Thanks Alex! Looks good. 
David On 23/03/2018 9:28 AM, Alex Menkov wrote: > Hi David, > > With too-long shmem name java reports: > ERROR: transport error 202: failed to create shared memory listener: > Error: address strings longer than 50 characters are invalid > and ret.code is 2 > > I added checks for both ret.code and presence of "address strings longer > than" text in the output. > webrev: http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.06/ > > --alex > > On 03/22/2018 14:32, David Holmes wrote: >> On 23/03/2018 6:43 AM, Alex Menkov wrote: >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ >>> >>> The test was updated to ensure shmem name longer than 49 symbols >>> causes java failure. >> >> This doesn't ensure it failed gracefully: >> >> 81???????? // extra test: ensure using of too-long name fails gracefully >> 82???????? // (shmemName + "X") is expected to be "too long". >> 83???????? ProcessTools.executeProcess(getTarget(shmemName + "X")) >> 84???????????????? .shouldNotHaveExitValue(0); >> >> It may have crashed. What exactly is the failure mode? return code 1? >> Exception message that we can check for in outputAnalyzer ? >> >> David >> >>> --alex >>> >>> >>> On 03/21/2018 15:39, David Holmes wrote: >>>> On 22/03/2018 2:41 AM, Alex Menkov wrote: >>>>> Hi David, >>>>> >>>>> On 03/20/2018 21:51, David Holmes wrote: >>>>>> Hi Alex, >>>>>> >>>>>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> On 03/19/2018 18:10, David Holmes wrote: >>>>>>>> Hi Alex, >>>>>>>> >>>>>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>>>>> Hi guys, >>>>>>>>> >>>>>>>>> please re-review the fix. >>>>>>>> >>>>>>>> I still have an unanswered question about where the max of 49 is >>>>>>>> enforced. I see it for the "address" but not names in general. ?? >>>>>>> >>>>>>> for shmem the "channel name" is the address (it's checked in >>>>>>> createTransport/openTransport). 
>>>>>>> Names for mutexes/events are generated by appending some strings >>>>>>> to the address and length of the added parts are supposed to be >>>>>>> less than MAX_IPC_SUFFIX (25 symbols): >>>>>>> ".mutex" (+ up to 3 symbols) >>>>>>> ".hasData" (+ up to 3 symbols) >>>>>>> ".hasSpace" (+ up to 3 symbols) >>>>>>> ".ctos" >>>>>>> ".stoc" >>>>>>> ".accept" (+ up to 3 symbols) >>>>>>> ".attach" (+ up to 3 symbols) >>>>>>> "." (pid is a DWORD) >>>>>> >>>>>> Okay so ... the code in shmemBase.c is very unclear as to which >>>>>> "names" can come in from an external source and which are only >>>>>> ever derived from other "names". If the "address" (which seems a >>>>>> very bad description in this case!) is the only external source >>>>>> for a name, and it is limited to a length of 49 then that is okay. >>>>> >>>>> Yes, the "address" is the only external arg, all other names are >>>>> constructed from it. >>>>> I believe it's "address" because it comes from "address" parameter: >>>>> -Xrunjdwp:transport=dt_shmem,address= >>>>> >>>>>> >>>>>>>> >>>>>>>>> Reg.test is added for the issue. >>>>>>>> >>>>>>>> I don't quite follow the test. I see you try to set the name >>>>>>>> with a value that is too long, and if that doesn't cause an >>>>>>>> overflow and we don't crash that is good. But I'd expect you to >>>>>>>> read back the name and check it matches the truncated name with >>>>>>>> 49 characters. >>>>>>> >>>>>>> The test specifies the maximum length supported (49 symbols) >>>>>>> (if longer name is specified, "address strings longer than 50 >>>>>>> characters are invalid" error reported). >>>>>> >>>>>> I missed the substring that simply causes the name to be the >>>>>> maximum supported length. That would trigger the overflow and so >>>>>> suffices as a regression test for this fix. >>>>>> >>>>>> Is there another test that already passes a too-long name and >>>>>> verifies the error gets thrown? >>>>> >>>>> Do you mean name >= 50 symbols?
>>>>> No, there is no such test. >>>>> I don't think it makes much sense (test an arbitrary >>>>> implementation-specific restriction), but I can add the case to the >>>>> test. >>>> >>>> It ensures that using a too-long name fails gracefully. >>>> >>>> Thanks, >>>> David >>>> >>>>> --alex >>>>> >>>>>> >>>>>>> As far as I see there is no way to read back the name used to >>>>>>> create the transport. >>>>>> >>>>>> Ok. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> --alex >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>>>>> >>>>>>>>> >>>>>>>>> --alex >>>>>>>>> >>>>>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Please review a small fix for >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>>>>> >>>>>>>>>> Root cause of the issue is that jdb hangs as a result of the buffer >>>>>>>>>> overflow. >>>>>>>>>> >>>>>>>>>> In the beginning of the shmemBase.c: >>>>>>>>>> >>>>>>>>>> #define MAX_IPC_PREFIX 50 /* user-specified or generated >>>>>>>>>> name for */ >>>>>>>>>> /* shared memory seg and prefix >>>>>>>>>> for other IPC */ >>>>>>>>>> #define MAX_IPC_SUFFIX 25 /* suffix to shmem name for other >>>>>>>>>> IPC names */ >>>>>>>>>> #define MAX_IPC_NAME (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>>>>> >>>>>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not >>>>>>>>>> big enough.
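The sizing problem described above can be sketched in C. This is only an illustration under stated assumptions, not the code from the webrev: the three macros are the ones quoted from shmemBase.c, while `build_ipc_name` and its parameters are hypothetical names introduced here. A 49-character address followed by ".mutex" needs 56 bytes including the terminator, which does not fit in a buffer of MAX_IPC_PREFIX (50) bytes but does fit in one of MAX_IPC_NAME (75) bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sizes as quoted from shmemBase.c in the message above. */
#define MAX_IPC_PREFIX 50   /* user-specified or generated name */
#define MAX_IPC_SUFFIX 25   /* suffix appended for other IPC names */
#define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)

/*
 * Hypothetical helper (not taken from the JDK sources): writes
 * "<address><suffix>" into out, which must hold MAX_IPC_NAME bytes.
 * Returns 0 on success, -1 if the address or the combined name is too long.
 */
static int build_ipc_name(char *out, const char *address, const char *suffix)
{
    int written;

    assert(out != NULL && address != NULL && suffix != NULL);

    /* The address alone must fit in MAX_IPC_PREFIX, terminator included. */
    if (strlen(address) >= MAX_IPC_PREFIX) {
        return -1;
    }

    /* snprintf never writes more than MAX_IPC_NAME bytes into out. */
    written = snprintf(out, MAX_IPC_NAME, "%s%s", address, suffix);
    if (written < 0 || written >= MAX_IPC_NAME) {
        return -1;  /* formatting error or truncated result */
    }
    return 0;
}
```

Sizing the destination by MAX_IPC_NAME rather than MAX_IPC_PREFIX is one way to leave room for the suffix; rejecting over-long input instead of truncating it matches the "fails gracefully" behaviour discussed in the thread.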
>>>>>>>>>> >>>>>>>>>> --alex From martinrb at google.com Fri Mar 23 00:36:52 2018 From: martinrb at google.com (Martin Buchholz) Date: Thu, 22 Mar 2018 17:36:52 -0700 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: Message-ID: On Wed, Mar 7, 2018 at 11:27 PM, Amit Sapre wrote: > > Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 > Is there a reason this bug is not visible? From mandy.chung at oracle.com Fri Mar 23 04:38:46 2018 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 22 Mar 2018 21:38:46 -0700 Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: Message-ID: <8aadc8c8-a83c-6f73-390e-e6a1e92b4179@oracle.com> On 3/22/18 5:36 PM, Martin Buchholz wrote: > > > On Wed, Mar 7, 2018 at 11:27 PM, Amit Sapre > wrote: > > > Bug ID : https://bugs.openjdk.java.net/browse/JDK-8071367 > > > > Is there a reason this bug is not visible? I don't see any issue opening this JBS issue and so I went ahead and changed it. Note that this is an Oracle JDK module and not part of the OpenJDK. Mandy
From christoph.langer at sap.com Fri Mar 23 08:37:57 2018 From: christoph.langer at sap.com (Langer, Christoph) Date: Fri, 23 Mar 2018 08:37:57 +0000 Subject: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of attach operations In-Reply-To: References: <6a45a94ca3bd4fc1a6371846314f1ff4@sap.com> <9decad59-366b-a5bc-607f-88625169468d@oracle.com> <94b75a874b5c4a3cb1cb109a51220c16@sap.com> <94e35233-005a-3e2c-1db2-bf5625dca863@oracle.com> <693b2894c45244b1aaf914d6a00fc12b@sap.com> <3fdcb090-3e74-5d22-4188-8af625f8a217@oracle.com> <1173f3a8-b2cd-776c-e640-1b83d84a6777@oracle.com> <1d59458be1ab4210bd05e592ed68a578@sap.com> Message-ID: <843af6133cd24243ab872106c6800aed@sap.com> Hi, I pushed it after running the "com/sun/tools/attach" jtreg tests on Solaris: http://hg.openjdk.java.net/jdk/jdk/rev/6e2d71029781 Thanks Christoph > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Thursday, 22 March 2018 22:24 > To: Langer, Christoph ; > daniel.daugherty at oracle.com; serviceability-dev at openjdk.java.net > Subject: Re: RFR(S) 8199924: Solaris: Correctly enqueue null arguments of > attach operations > > On 23/03/2018 5:15 AM, Langer, Christoph wrote: > > Hi David, > > > >>> I think you should keep your original fix since it now properly > >>> handles null arguments at the same attach-on-demand layer as the > >>> Linux code that you quoted. > >>> > >>> Handling this in args array processing would also be possible > >>> as David suggests, but it would bother me that Linux and Solaris > >>> lower attach-on-demand layers would have different behaviors. > >> > >> They already do have completely different behaviours. Linux handles > NULL > >> at the Java layer by inserting empty strings! > > > > I had another look at the implementations on the various platforms.
> > > > You are right, linux, aix and mac would write empty strings on java layer - > but the enqueue mechanisms of these platforms look quite different to > Solaris. However, for Windows, where the implementation is different again, > the handling of null params happens in the c-native code. For Solaris I would > see the best place in the native code as well. > > The native layer does differ across platforms - Solaris uses "doors", > which other platforms don't have. > > It would make most sense to me if the Java level for each platform > basically worked the same, particularly in the handling of nulls. But > that should have been the case from day one. > > > So if you don't mind I would keep my change and annotate you and Dan as > reviewers, ok? > > Fine. The code seemed okay - but harder to judge versus the trivial > changes to java code. > > David > ----- > > > Thanks > > Christoph > > From amit.sapre at oracle.com Fri Mar 23 10:43:38 2018 From: amit.sapre at oracle.com (Amit Sapre) Date: Fri, 23 Mar 2018 03:43:38 -0700 (PDT) Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> Message-ID: <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default> Thanks all for the inputs. This webrev addresses the inputs : http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.02/ Amit From: mandy chung Sent: Thursday, March 22, 2018 11:15 PM To: Alan Bateman; Amit Sapre Cc: serviceability-dev at openjdk.java.net; compiler-dev at openjdk.java.net Subject: Re: RFR : JDK-8071367 - JMX: Remove SNMP support On 3/22/18 6:12 AM, Alan Bateman wrote: On 22/03/2018 10:03, Amit Sapre wrote: Thanks Alan and Mandy for inputs. This webrev : http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.01/ addresses your comments. I reverted changes for legacy.properties and also rebased the patch for jdk/jdk repo.
This looks good, just a few typos in Agent's class description "This class also provide entrypoints for jcmd tool ..." => "This class also provides entry points for the jcmd tool ...". Looks good in general. Since you are cleaning up the comment, some suggestions: Replace line 60-67 with: This class provides the methods to start the management agent. 1. {@link #startAgent} method is invoked by the VM if -Dcom.sun.management.* is set 2. {@link #startLocalManagementAgent} or {@link #startRemoteManagementAgent} is invoked to start the management agent after the VM starts via jcmd ManagementAgent.start and start_local command. line 309-310 can be replaced with /* * Starts the local management agent. * This method is invoked by either startAgent method or * by the VM directly via jcmd ManagementAgent.start_local command. */ line 332-335 can be replaced with /* * This method is invoked by the VM to start the remote management agent * via jcmd ManagementAgent.start command. */ Add the javadoc to startAgent() /* * This method is invoked by the VM to start the management agent * when -Dcom.sun.management.* is set during startup. */ Mandy From magnus.ihse.bursie at oracle.com Fri Mar 23 13:56:34 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 23 Mar 2018 14:56:34 +0100 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries Message-ID: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> With modern compilers, we can use compiler directives (such as __attribute__((visibility("default"))), or __declspec(dllexport)) to control symbol visibility, directly in the source code. This has historically not been present on all compilers, so we had to resort to using mapfiles (also known as linker scripts). This is no longer the case. Now all compilers we use support symbol visibility directives, in one form or another. We should start using this.
Since this has been the only way to control symbol visibility on Windows, for most of the shared code, we already have proper JNIEXPORT decorations in place. If we fix the remaining platform-specific files to have proper JNIEXPORT tagging, then we can finally get rid of mapfiles. This fix removes mapfiles for all JDK libraries. It does not touch hotspot libraries nor JDK executables; they will have to wait for a future fix -- this was complex enough. This change will not have any impact on macosx, since we do not use mapfiles there, but instead export all symbols. (This is not a good idea, but I'll address that separately.) This change will also have a minimal impact on Windows. The only reason Windows is impacted at all, is that some changes needed by Solaris and Linux were simpler to fix for all platforms. I have strived for this change to have no impact on the actual generated code. Unfortunately, this was not possible to fully achieve. I do not believe that these changes will have any actual impact on the product, though. I will present the differences more in detail further down. Those who are not interested can probably skip that. The patch has passed tier1 testing and is currently running tier2 and tier3. Since the running code is more or less (see caveat below) unmodified, I don't expect any testing issues. Bug: https://bugs.openjdk.java.net/browse/JDK-8200178 WebRev: http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01 Details on changes: Most of the source code changes are (unsurprisingly) in java.base and java.desktop. Remaining changes are in jdk.crypto.ucrypto, jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent. The source code changes consist almost 100% of decorating an exported function with JNIEXPORT. I have also followed the long-standing convention of adding JNICALL.
This is a no-op on non-Windows platforms, so for most of the changes this is purely cosmetic (and possibly adding in robustness, should the function ever be used on Windows in the future). I have also followed the stylistic convention of putting "JNIEXPORT JNICALL" on a separate line. For some functions, however, this might cause a change in calling convention on Windows. Since this can not apply to exported functions on Windows (otherwise they would already have had JNIEXPORT), I do not think this matters. A few libraries did not have a mapfile, on Linux and/or Solaris. This actually meant that all symbols were exported. It is highly unclear if this was known and intended by the original make rule writer. I have emulated this by adding the flag $(EXPORT_ALL_SYMBOLS) to these libraries. Hopefully, we can remove this flag and fix proper exported symbols in the future. I have run the complete build using COMPARE_BUILD, and made a thorough analysis of the differences for Linux and Solaris. All native libraries have symbol differences, but most of them are trivial and/or harmless. As a result, most libraries have disasm differences as well, but these too seem trivial and harmless. The differences in symbols that are common to all libraries include: * Internal symbols such as __bss_start, _edata, _end and _fini are now global. (They are imported as such from the compiler libraries/archives, and we have no linker script to override this behavior). * The versioning tag SUNWprivate_1.1 is not included, and thus neither the .gnu.version_d symbol. * There are a few differences in the symbol and/or mangling of some local functions. I'm not sure what's causing this, but it's unlikely to have any effect on the product. Another common source for change in symbols is due to previous platform differences. For instance, if we had "JNIEXPORT int JNICALL do_foo() { ... }", but do_foo was not in the mapfile, the symbol was exported on Windows but not on Linux and Solaris.
(Presumably since it was not needed there, even though it was compiled for those platforms as well.) Now, with the mapfiles gone, do_foo() will be exported on all platforms. Conversely, functions that are compiled on all platforms, and were exported in mapfiles, but now have gotten a JNIEXPORT decoration, will now be visible even on Windows. (This accounts for half of the noticed symbol differences on Windows.) I could have made the JNIEXPORT conditional on OS, but I didn't think the mess in source code was worth keeping binary confidence with the old build. A third common source for change in symbols is due to exported functions "leaking" across library borders. For instance, some functions in java.desktop are compiled in both libawt_xawt and libawt_headless, but they were previously only included in the mapfile for one of these libraries. Now, since the visibility is determined by the source code itself, they get exported in both libraries. A variant of this is when a library depends on another JDK library, and includes the header file from that other library, which in turn declares a function as JNIEXPORT. This will cause the including library to also export the function. This accounts for the other half of the changes on Windows. A typical example of this is that multiple libraries now re-export hotspot symbols from libjvm.so, like jio_fprintf. (I have not listed the libjvm re-exports below.) Note that Java_java_io_FileOutputStream_close0 in java.base/unix/native/libjava/FileOutputStream_md.c is no longer exported, and can probably be removed. Here is a detailed table showing and accounting for all the remaining differences found on Linux and Solaris: java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0 is now also exported on unix platforms due to JNIEXPORT.
java.base/jspawnlauncher: On solaris, we also include libjava/childproc.o, which now exports fewer functions than it used to (it used to export all functions, now it is compiled with visibility=hidden). java.base/java(w).exe: Is now also exporting the following symbols due to added JNIEXPORT in libjli on Windows: (Yes, executables can export symbols on Windows. Confusing, I know.) JLI_AddArgsFromEnvVar JLI_CmdToArgs JLI_GetAppArgIndex JLI_GetStdArgc JLI_GetStdArgs JLI_InitArgProcessing JLI_Launch JLI_List_add JLI_List_new JLI_ManifestIterate JLI_MemAlloc JLI_MemFree JLI_PreprocessArg JLI_ReportErrorMessage JLI_ReportErrorMessageSys JLI_ReportExceptionDescription JLI_ReportMessage JLI_SetTraceLauncher JLI_StringDup java.desktop:/libawt_xawt: The following symbols are now also exported on linux and solaris due to JNIEXPORT: awt_DrawingSurface_FreeDrawingSurfaceInfo awt_DrawingSurface_GetDrawingSurfaceInfo awt_DrawingSurface_Lock awt_DrawingSurface_Unlock awt_GetColor The following symbols are now also exported on linux and solaris due to JNIEXPORT (they were previously exported only in libawt): Java_sun_awt_DebugSettings_setCTracingOn__Z Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2 Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I Java_sun_awt_X11GraphicsConfig_getNumColors java.desktop:/libawt_headless: The following symbols are now also exported due to JNIEXPORT (they were previously exported only in libawt_xawt and/or libawt): Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable X11SurfaceData_GetOps java.desktop/libawt: The following symbols are now also exported on Windows, due to added JNIEXPORT: SurfaceData_InitOps mul8table div8table doDrawPath doFillPath g_CMpDataID initInverseGrayLut make_dither_arrays
make_uns_ordered_dither_array path2DFloatCoordsID path2DNumTypesID path2DTypesID path2DWindingRuleID sg2dStrokeHintID std_img_oda_blue std_img_oda_green std_img_oda_red std_odas_computed sunHints_INTVAL_STROKE_PURE java.desktop/libawt on solaris: A number of "#pragma weak" directives were previously overridden by the mapfile. Now these directives are respected, so these symbols are now weak instead of local: ByteGrayToIntArgbPreConvert_F ByteGrayToIntArgbPreScaleConvert_F IntArgbBmToFourByteAbgrPreScaleXparOver_F IntArgbToIntRgbXorBlit_F IntBgrToIntBgrAlphaMaskBlit_F java.desktop/libawt on solaris: These are now also exported due to JNIEXPORT in libmlib_image. j2d_mlib_ImageCreate j2d_mlib_ImageCreateStruct j2d_mlib_ImageDelete java.desktop/libawt on solaris: These are now also exported due to JNIEXPORT: GrPrim_CompGetXorColor SurfaceData_GetOpsNoSetup SurfaceData_IntersectBoundsXYWH SurfaceData_SetOps Transform_GetInfo Transform_transform java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux and solaris due to JNIEXPORT. libsplashscreen also had JNIEXPORT (actually a pure __declspec(dllexport)) but no JNICALL, which I added as a part of converting to JNIEXPORT. The same goes for libmlib_image. jdk.sctp/libsctp: handleSocketError is now exported on linux and solaris due to JNIEXPORT in libnio. java.instrument:/libinstrument: Agent_OnUnload is now also exported on linux and solaris platforms due to JNIEXPORT. JLI_ManifestIterate is now also exported on Windows, due to added JNIEXPORT in libjli. jdk.management/libmanagement_ext: Java_com_sun_management_internal_Flag_setDoubleValue is now also exported on linux and solaris platforms due to JNIEXPORT.
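The decoration pattern this change relies on can be sketched as follows. This is a simplified illustration, not the JDK's actual headers: `SKETCH_EXPORT` and `SKETCH_CALL` are hypothetical stand-ins for JNIEXPORT and JNICALL (whose real definitions come from the platform's jni_md.h and differ in detail), and `do_foo` and `clamp_to_byte` are made-up functions.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for JNIEXPORT/JNICALL. The visibility is stated
 * in the source itself, so no mapfile is needed to select exported symbols.
 */
#if defined(_WIN32)
  #define SKETCH_EXPORT __declspec(dllexport)
  #define SKETCH_CALL   __stdcall
#elif defined(__GNUC__)
  #define SKETCH_EXPORT __attribute__((visibility("default")))
  #define SKETCH_CALL   /* no-op on non-Windows platforms */
#else
  #define SKETCH_EXPORT
  #define SKETCH_CALL
#endif

/* File-local helper: 'static' gives it internal linkage, so it is never exported. */
static int clamp_to_byte(int value)
{
    if (value < 0) {
        return 0;
    }
    return value > 255 ? 255 : value;
}

/* Decorated entry point: exported from the shared library on every platform. */
SKETCH_EXPORT int SKETCH_CALL do_foo(int value)
{
    int result = clamp_to_byte(value);
    assert(result >= 0 && result <= 255);
    return result;
}
```

When a file like this is compiled with GCC's -fvisibility=hidden, only the decorated function should remain visible outside the library; any header that declares a function with the export macro has the same effect in every library that compiles that function, which is the "leaking" described above.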
/Magnus From erik.joelsson at oracle.com Fri Mar 23 14:30:14 2018 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 23 Mar 2018 07:30:14 -0700 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> Message-ID: <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> I have looked at the build changes and they look good. Will you file followups for each component team to look over their exported symbols, at least for the libraries with $(EXPORT_ALL_SYMBOLS)? It sure looks like there is some technical debt lying around here. /Erik On 2018-03-23 06:56, Magnus Ihse Bursie wrote: > With modern compilers, we can use compiler directives (such as > __attribute__((visibility("default"))), or __declspec(dllexport)) to > control symbol visibility, directly in the source code. This has > historically not been present on all compilers, so we had to resort to > using mapfiles (also known as linker scripts). > > This is no longer the case. Now all compilers we use support symbol > visibility directives, in one form or another. We should start using > this. Since this has been the only way to control symbol visibility on > Windows, for most of the shared code, we already have proper JNIEXPORT > decorations in place. > > If we fix the remaining platform-specific files to have proper > JNIEXPORT tagging, then we can finally get rid of mapfiles. > > This fix removes mapfiles for all JDK libraries. It does not touch > hotspot libraries nor JDK executables; they will have to wait for a > future fix -- this was complex enough. This change will not have any > impact on macosx, since we do not use mapfiles there, but instead > export all symbols. (This is not a good idea, but I'll address that > separately.) This change will also have a minimal impact on Windows.
> The only reason Windows is impacted at all, is that some changes > needed by Solaris and Linux were simpler to fix for all platforms. > > I have strived for this change to have no impact on the actual > generated code. Unfortunately, this was not possible to fully achieve. > I do not believe that these changes will have any actual impact on the > product, though. I will present the differences more in detail further > down. Those who are not interested can probably skip that. > > The patch has passed tier1 testing and is currently running tier2 and > tier3. Since the running code is more or less (see caveat below) > unmodified, I don't expect any testing issues. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8200178 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01 > > Details on changes: > Most of the source code changes are (unsurprisingly) in java.base and > java.desktop. Remaining changes are in jdk.crypto.ucrypto, > jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent. > > The source code changes consist almost 100% of decorating an > exported function with JNIEXPORT. I have also followed the > long-standing convention of adding JNICALL. This is a no-op on > non-Windows platforms, so for most of the changes this is purely > cosmetic (and possibly adding in robustness, should the function ever > be used on Windows in the future). I have also followed the stylistic > convention of putting "JNIEXPORT JNICALL" on a separate > line. For some functions, however, this might cause a change in > calling convention on Windows. Since this can not apply to exported > functions on Windows (otherwise they would already have had > JNIEXPORT), I do not think this matters. > > A few libraries did not have a mapfile, on Linux and/or Solaris. This > actually meant that all symbols were exported. It is highly unclear if > this was known and intended by the original make rule writer.
I have > emulated this by adding the flag $(EXPORT_ALL_SYMBOLS) to these > libraries. Hopefully, we can remove this flag and fix proper exported > symbols in the future. > > I have run the complete build using COMPARE_BUILD, and made a > thorough analysis of the differences for Linux and Solaris. All > native libraries have symbol differences, but most of them are trivial > and/or harmless. As a result, most libraries have disasm differences > as well, but these too seem trivial and harmless. The differences in > symbols that are common to all libraries include: > * Internal symbols such as __bss_start, _edata, _end and _fini are > now global. (They are imported as such from the compiler > libraries/archives, and we have no linker script to override this > behavior). > * The versioning tag SUNWprivate_1.1 is not included, and thus > neither the .gnu.version_d symbol. > * There are a few differences in the symbol and/or mangling of some > local functions. I'm not sure what's causing this, > but it's unlikely to have any effect on the product. > > Another common source for change in symbols is due to previous > platform differences. For instance, if we had "JNIEXPORT int JNICALL > do_foo() { ... }", but do_foo was not in the mapfile, the symbol was > exported on Windows but not on Linux and Solaris. (Presumably since it > was not needed there, even though it was compiled for those platforms > as well.) Now, with the mapfiles gone, do_foo() will be exported on > all platforms. Conversely, functions that are compiled on all > platforms, and were exported in mapfiles, but now have gotten a > JNIEXPORT decoration, will now be visible even on Windows. (This > accounts for half of the noticed symbol differences on Windows.) I > could have made the JNIEXPORT conditional on OS, but I didn't think > the mess in source code was worth keeping binary confidence > with the old build.
> > A third common source for change in symbols is due to exported > functions "leaking" across library borders. For instance, some > functions in java.desktop are compiled in both libawt_xawt and > libawt_headless, but they were previously only included in the mapfile > for one of these libraries. Now, since the visibility is determined by > the source code itself, they get exported in both libraries. A variant > of this is when a library depends on another JDK library, and includes > the header file from that other library, which in turn declares a > function as JNIEXPORT. This will cause the including library to also > export the function. This accounts for the other half of the changes > on Windows. A typical example of this is that multiple libraries now > re-export hotspot symbols from libjvm.so, like jio_fprintf. (I have > not listed the libjvm re-exports below.) > > Note that Java_java_io_FileOutputStream_close0 in > java.base/unix/native/libjava/FileOutputStream_md.c is no longer > exported, > and can probably be removed. > > Here is a detailed table showing and accounting for all the remaining > differences found on Linux and Solaris: > java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0 is > now also exported on unix platforms due to JNIEXPORT. > > java.base/jspawnlauncher: On solaris, we also include > libjava/childproc.o, which > now exports fewer functions than it used to (it used to export all > functions, now it is compiled with visibility=hidden). > > java.base/java(w).exe: Is now also exporting the following symbols due > to added JNIEXPORT in libjli on Windows: > (Yes, executables can export symbols on Windows. Confusing, I know.)
> JLI_AddArgsFromEnvVar > JLI_CmdToArgs > JLI_GetAppArgIndex > JLI_GetStdArgc > JLI_GetStdArgs > JLI_InitArgProcessing > JLI_Launch > JLI_List_add > JLI_List_new > JLI_ManifestIterate > JLI_MemAlloc > JLI_MemFree > JLI_PreprocessArg > JLI_ReportErrorMessage > JLI_ReportErrorMessageSys > JLI_ReportExceptionDescription > JLI_ReportMessage > JLI_SetTraceLauncher > JLI_StringDup > > java.desktop:/libawt_xawt: The following symbols are now also exported > on linux and solaris due to JNIEXPORT: > awt_DrawingSurface_FreeDrawingSurfaceInfo > awt_DrawingSurface_GetDrawingSurfaceInfo > awt_DrawingSurface_Lock > awt_DrawingSurface_Unlock > awt_GetColor > > The following symbols are now also exported on linux and solaris due > to JNIEXPORT (they were previously > exported only in libawt): > Java_sun_awt_DebugSettings_setCTracingOn__Z > Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2 > Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I > Java_sun_awt_X11GraphicsConfig_getNumColors > > java.desktop:/libawt_headless: The following symbols are now also > exported due to JNIEXPORT (they were previously > exported only in libawt_xawt and/or libawt): > Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo > Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities > Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask > Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable > X11SurfaceData_GetOps > > java.desktop/libawt: The following symbols are now also exported on > Windows, due to added > JNIEXPORT: > SurfaceData_InitOps > mul8table > div8table > doDrawPath > doFillPath > g_CMpDataID > initInverseGrayLut > make_dither_arrays > make_uns_ordered_dither_array > path2DFloatCoordsID > path2DNumTypesID > path2DTypesID > path2DWindingRuleID > sg2dStrokeHintID > std_img_oda_blue > std_img_oda_green > std_img_oda_red > std_odas_computed > sunHints_INTVAL_STROKE_PURE > > java.desktop/libawt on solaris: > A
number of "#pragma weak" directives were previously overridden by the > mapfile. > Now these directives are respected, so these symbols are now weak > instead of local: > ByteGrayToIntArgbPreConvert_F > ByteGrayToIntArgbPreScaleConvert_F > IntArgbBmToFourByteAbgrPreScaleXparOver_F > IntArgbToIntRgbXorBlit_F > IntBgrToIntBgrAlphaMaskBlit_F > > java.desktop/libawt on solaris: These are now also exported due to > JNIEXPORT in libmlib_image. > j2d_mlib_ImageCreate > j2d_mlib_ImageCreateStruct > j2d_mlib_ImageDelete > > java.desktop/libawt on solaris: These are now also exported due to > JNIEXPORT: > GrPrim_CompGetXorColor > SurfaceData_GetOpsNoSetup > SurfaceData_IntersectBoundsXYWH > SurfaceData_SetOps > Transform_GetInfo > Transform_transform > > java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux and > solaris due to JNIEXPORT. > libsplashscreen also had JNIEXPORT (actually a pure > __declspec(dllexport)) but no JNICALL, which I added as > a part of converting to JNIEXPORT. The same goes for libmlib_image. > > jdk.sctp/libsctp: handleSocketError is now exported on linux and > solaris due to JNIEXPORT in libnio. > > java.instrument:/libinstrument: Agent_OnUnload is now also exported on > linux and solaris platforms due to JNIEXPORT. > JLI_ManifestIterate is now also exported on Windows, due to added > JNIEXPORT in libjli. > > jdk.management/libmanagement_ext: > Java_com_sun_management_internal_Flag_setDoubleValue is now also > exported on linux and solaris platforms due to JNIEXPORT.
> > /Magnus > > From Alan.Bateman at oracle.com Fri Mar 23 14:45:09 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 23 Mar 2018 14:45:09 +0000 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> Message-ID: <16419c9d-df6a-c6cb-8357-470643d99cb3@oracle.com> On 23/03/2018 13:56, Magnus Ihse Bursie wrote: > With modern compilers, we can use compiler directives (such as > __attribute__((visibility("default"))), or __declspec(dllexport)) to > control symbol visibility, directly in the source code. This has > historically not been present on all compilers, so we had to resort to > using mapfiles (also known as linker scripts). > > This is no longer the case. Now all compilers we use support symbol > visibility directives, in one form or another. We should start using > this. Since this has been the only way to control symbol visibility on > Windows, for most of the shared code, we already have proper JNIEXPORT > decorations in place. > > If we fix the remaining platform-specific files to have proper > JNIEXPORT tagging, then we can finally get rid of mapfiles. This seems like a great cleanup as the mapfiles have always been a pain to maintain. Also shines a light on some technical debt too. handleSocketError in libnio is a surprise, this should not be exported and should not have been in the map file. I suspect the issue is that jdk.sctp is missing a function prototype from its header file (it has its own handleSocketError in SctpNet.c). NET_Wait in libnet is another one, I can't tell why this was listed in the map file. I'm also surprised with java.dll exporting handleRead, winHandleRead, and handleLSeek. I didn't see them mentioned in your mail so I'm curious what might be using those.
-Alan

From mandy.chung at oracle.com  Fri Mar 23 15:08:17 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 23 Mar 2018 08:08:17 -0700
Subject: RFR : JDK-8071367 - JMX: Remove SNMP support
In-Reply-To: <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default>
References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com>
 <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default>
Message-ID: <7c68e8c6-aa6b-1cb7-482d-80df0a867346@oracle.com>

On 3/23/18 3:43 AM, Amit Sapre wrote:
>
> Thanks all for the inputs.
>
> This webrev addresses the inputs :
> http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.02/
>
>
>
Looks good.

Mandy

From magnus.ihse.bursie at oracle.com  Fri Mar 23 15:15:35 2018
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Fri, 23 Mar 2018 16:15:35 +0100
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To: <16419c9d-df6a-c6cb-8357-470643d99cb3@oracle.com>
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
 <16419c9d-df6a-c6cb-8357-470643d99cb3@oracle.com>
Message-ID:

> On 23 March 2018 at 15:45, Alan Bateman wrote:
>
>> On 23/03/2018 13:56, Magnus Ihse Bursie wrote:
>> With modern compilers, we can use compiler directives (such as __attribute__((visibility("default"))), or __declspec(dllexport)) to control symbol visibility, directly in the source code. This has historically not been present on all compilers, so we had to resort to using mapfiles (also known as linker scripts).
>>
>> This is no longer the case. Now all compilers we use support symbol visibility directives, in one form or another. We should start using this. Since this has been the only way to control symbol visibility on Windows, for most of the shared code, we already have proper JNIEXPORT decorations in place.
>>
>> If we fix the remaining platform-specific files to have proper JNIEXPORT tagging, then we can finally get rid of mapfiles.
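
For readers following the thread, here is a minimal sketch of the kind of source-level decoration the quoted text describes. It is an illustration under stated assumptions, not code from the webrev: DEMO_EXPORT, DEMO_CALL, do_foo and internal_helper are hypothetical names, and the real JNIEXPORT/JNICALL macros come from the JDK's platform-specific jni_md.h and may be defined differently.

```c
/* Hypothetical stand-ins for JNIEXPORT / JNICALL (assumed names, not the
 * JDK's actual definitions). Each branch wraps the compiler-specific
 * directive mentioned in the thread. */
#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#  define DEMO_CALL   __stdcall
#elif defined(__GNUC__)
#  define DEMO_EXPORT __attribute__((visibility("default")))
#  define DEMO_CALL
#else
#  define DEMO_EXPORT
#  define DEMO_CALL
#endif

/* Decorated: stays visible to other libraries even when the library is
 * compiled with -fvisibility=hidden (GCC/Clang), and is exported from a
 * DLL on Windows. */
DEMO_EXPORT int DEMO_CALL do_foo(void)
{
    return 42;
}

/* Undecorated: hidden under -fvisibility=hidden, and not exported from a
 * Windows DLL unless the linker is told to export it some other way. */
int internal_helper(void)
{
    return do_foo() + 1;
}
```

With GCC or Clang, building this as a shared library with -fvisibility=hidden should leave do_foo as the only one of the two functions in the dynamic symbol table (nm -D on the resulting library can confirm that). The decoration in the source then carries the information a mapfile entry used to carry.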
> This seems like a great cleanup as the mapfiles have always been a pain to maintain. Also shines a light on some technical debt too.

Very much so, yes. I've found a lot of dubious exports, everything from global variables (yuck!) to functions that do not seem to be used anymore, to lots of strange exports.

> handleSocketError in libnio is a surprise, this should not be exported and should not have been in the map file. I suspect the issue is that jdk.sctp is missing a function prototype from its header file (it has its own handleSocketError in SctpNet.c).

That might be so, yes.

> NET_Wait in libnet is another one, I can't tell why this was listed in the map file.

Neither can I. :-) Once again, my goal with this patch was to keep the produced binaries as similar as possible to what the mapfiles produced. I'll be happy to file follow-up bugs listing all the suspicious symbol handling I've encountered, but I'd rather not change anything about that in this patch.

> I'm also surprised with java.dll exporting handleRead, winHandleRead, and handleLSeek. I didn't see them mentioned in your mail so I'm curious what might be using those.

They were previously exported using -export: on the command line for the Microsoft linker. This was the case for a couple of other libraries as well. Yeah, I forgot to write about that, sorry. :( It's been a lot to keep track of, and it went away when I cleaned up my notes.

Can I consider this a review?

/Magnus

>
> -Alan

From Alan.Bateman at oracle.com  Fri Mar 23 16:04:12 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 23 Mar 2018 16:04:12 +0000
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To:
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
 <16419c9d-df6a-c6cb-8357-470643d99cb3@oracle.com>
Message-ID: <810a6225-15af-6858-8bea-2c3122ac1c5d@oracle.com>

On 23/03/2018 15:15, Magnus Ihse Bursie wrote:
> :
> Very much so, yes. I've found a lot of dubious exports, everything from global variables (yuck!) to functions that do not seem to be used anymore, to lots of strange exports.
The changes look good to me and I think we should follow this up with a few JIRA issues (as you suggested) for the symbols that don't make sense to export.

-Alan

From mandy.chung at oracle.com  Fri Mar 23 16:05:23 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 23 Mar 2018 09:05:23 -0700
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To: <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com>
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
 <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com>
Message-ID: <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com>

This is a very good change and no more mapfiles to maintain!!

Please do file JBS issues for the component teams to clean up their exports.

Mandy

On 3/23/18 7:30 AM, Erik Joelsson wrote:
> I have looked at the build changes and they look good.
>
> Will you file followups for each component team to look over their
> exported symbols, at least for the libraries with
> $(EXPORT_ALL_SYMBOLS)? It sure looks like there is some technical debt
> lying around here.
>
> /Erik
>
>
> On 2018-03-23 06:56, Magnus Ihse Bursie wrote:
>> With modern compilers, we can use compiler directives (such as
>> __attribute__((visibility("default"))), or __declspec(dllexport)) to
>> control symbol visibility, directly in the source code. This has
>> historically not been present on all compilers, so we had to resort
>> to using mapfiles (also known as linker scripts).
>>
>> This is no longer the case. Now all compilers we use support symbol
>> visibility directives, in one form or another. We should start using
>> this. Since this has been the only way to control symbol visibility
>> on Windows, for most of the shared code, we already have proper
>> JNIEXPORT decorations in place.
>>
>> If we fix the remaining platform-specific files to have proper
>> JNIEXPORT tagging, then we can finally get rid of mapfiles.

From Alan.Bateman at oracle.com  Fri Mar 23 16:23:37 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 23 Mar 2018 16:23:37 +0000
Subject: RFR : JDK-8071367 - JMX: Remove SNMP support
In-Reply-To: <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default>
References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com>
 <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default>
Message-ID: <559041cd-e695-3bc8-29a2-ae2d49797292@oracle.com>

On 23/03/2018 10:43, Amit Sapre wrote:
>
> Thanks all for the inputs.
>
> This webrev addresses the inputs :
> http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.02/
>
>
>
I think you need to put {@code ... } around "-Dcom.sun.management.*", otherwise looks good to me.

-Alan

From volker.simonis at gmail.com  Fri Mar 23 17:24:24 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 23 Mar 2018 18:24:24 +0100
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To: <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com>
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
 <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com>
 <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com>
Message-ID:

Hi Magnus,

thanks for addressing this long standing issue! I haven't looked at
the changes, but just want to share some general and historical notes:

- Compiling with "-fvisibility=hidden", which hides all symbols except
  the ones explicitly exported with
  "__attribute__((visibility("default")))", was requested by SAP
  back in 2007, even before we had OpenJDK (see "Use -fvisibility=hidden
  for gcc compiles" https://bugs.openjdk.java.net/browse/JDK-6588413),
  and was finally pushed into the OpenJDK around 2010.

- "-fvisibility=hidden" gave us performance improvements of about 5%
  (JBB2005) and 2% (JVM98) on Linux/IA64 and 1.5% (JBB2005) and 0.5%
  (JVM98) on Linux/PPC64 because the compiler could use faster calls for
  non-exported symbols.
  This improvement was only very small on x86, though.

- "-fvisibility=hidden"/"__attribute__((visibility("default")))"
  applies BEFORE using the map files in the linking step (i.e. hidden
  symbols can't be exported any more even if mentioned in the map file).

- Because of the performance improvements we got by using
  "-fvisibility=hidden", it was worthwhile using it even though we had
  the mapfiles at the end of the process.

Then we had several mail threads (which you probably remember because
you were involved :) where we discussed whether to remove the map files
completely or instead generate them automatically during the build:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-February/thread.html#12412
http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-February/thread.html#12628

The main arguments against removing the map files at that time were:

1. the danger of re-exporting all symbols of statically linked libraries
   (notably libstdc++ at that time)
2. losing exports of compiler-generated symbols like vtables, which are
   required by the Serviceability Agent

Point 1 is not a problem today, because I don't think we do any static
linking any more. If we still do it under some circumstances, this
problem should be re-evaluated.

Point 2 is only relevant for HotSpot. But because of "8034065: GCC 4.3
and later doesn't export vtable symbols any more which seem to be
needed by SA" (https://bugs.openjdk.java.net/browse/JDK-8034065),
exporting such symbols through a map file doesn't work any more
anyway. So this isn't a problem either.

So to cut a long story short - I think the time is ripe to get rid of
the map files. Thumbs up from me (meant as moral support, not as a
concrete review :)

Regards,
Volker


On Fri, Mar 23, 2018 at 5:05 PM, mandy chung wrote:
> This is a very good change and no more mapfiles to maintain!!
>
> Please do file JBS issues for the component teams to clean up their exports.
>
> Mandy
>
>
> On 3/23/18 7:30 AM, Erik Joelsson wrote:
>>
>> I have looked at the build changes and they look good.
>>
>> Will you file followups for each component team to look over their
>> exported symbols, at least for the libraries with $(EXPORT_ALL_SYMBOLS)? It
>> sure looks like there is some technical debt lying around here.
>>
>> /Erik
>>
>>
>> On 2018-03-23 06:56, Magnus Ihse Bursie wrote:
>>>
>>> With modern compilers, we can use compiler directives (such as
>>> __attribute__((visibility("default"))), or __declspec(dllexport)) to control
>>> symbol visibility, directly in the source code. This has historically not
>>> been present on all compilers, so we had to resort to using mapfiles (also
>>> known as linker scripts).
>>>
>>> This is no longer the case. Now all compilers we use support symbol
>>> visibility directives, in one form or another. We should start using this.
>>> Since this has been the only way to control symbol visibility on Windows,
>>> for most of the shared code, we already have proper JNIEXPORT decorations in
>>> place.
>>>
>>> If we fix the remaining platform-specific files to have proper JNIEXPORT
>>> tagging, then we can finally get rid of mapfiles.
>>>
>>> This fix removed mapfiles for all JDK libraries. It does not touch
>>> hotspot libraries nor JDK executables; they will have to wait for a future
>>> fix -- this was complex enough. This change will not have any impact on
>>> macosx, since we do not use mapfiles there, but instead export all symbols.
>>> (This is not a good idea, but I'll address that separately.) This change
>>> will also have a minimal impact on Windows. The only reason Windows is
>>> impacted at all, is that some changes needed by Solaris and Linux were
>>> simpler to fix for all platforms.
>>>
>>> I have strived for this change to have no impact on the actual generated
>>> code. Unfortunately, this was not possible to fully achieve.
>>>
>>> /Magnus
>>>
>>>
>>
>

From philip.race at oracle.com  Fri Mar 23 17:33:55 2018
From: philip.race at oracle.com (Phil Race)
Date: Fri, 23 Mar 2018 10:33:55 -0700
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com>
Message-ID: <359e997c-6f49-218b-a1e6-81888135242b@oracle.com>

There are a lot of changes in the desktop libraries.
Doing mach5 tier1/2/3 testing is not nearly sufficient to cover those,
since only tier3 has any UI tests and it barely uses anything that's
touched here.

Since testing seems to be wise, I think you should do a jtreg
desktop group run on Linux & Windows. You can probably skip Mac since
it is unaffected, and I think Linux will cover Solaris here.
You should also do some headless testing.

It could take some time to review this properly and decide what changes
are OK, what changes are something we should clean up later, and what
changes are something that ought to be addressed now ..

I think I'd be mainly concerned that something fails due to a missing
symbol, or that we end up with duplicate symbols as a result of the
newly exported symbols. The results of a test run will add confidence here.

BTW I don't think you are right that

java.desktop:/libawt_headless: The following symbols are now also exported due to JNIEXPORT (they were previously
..
X11SurfaceData_GetOps

It looks to me like it was previously exported.

-phil.

On 03/23/2018 06:56 AM, Magnus Ihse Bursie wrote:
> With modern compilers, we can use compiler directives (such as
> __attribute__((visibility("default"))), or __declspec(dllexport)) to
> control symbol visibility, directly in the source code. This has
> historically not been present on all compilers, so we had to resort to
> using mapfiles (also known as linker scripts).
>
> This is no longer the case. Now all compilers we use support symbol
> visibility directives, in one form or another. We should start using
> this. Since this has been the only way to control symbol visibility on
> Windows, for most of the shared code, we already have proper JNIEXPORT
> decorations in place.
>
> If we fix the remaining platform-specific files to have proper
> JNIEXPORT tagging, then we can finally get rid of mapfiles.
>
> This fix removed mapfiles for all JDK libraries. It does not touch
> hotspot libraries nor JDK executables; they will have to wait for a
> future fix -- this was complex enough. This change will not have any
> impact on macosx, since we do not use mapfiles there, but instead
> export all symbols. (This is not a good idea, but I'll address that
> separately.) This change will also have a minimal impact on Windows.
> The only reason Windows is impacted at all, is that some changes
> needed by Solaris and Linux were simpler to fix for all platforms.
>
> I have strived for this change to have no impact on the actual
> generated code. Unfortunately, this was not possible to fully achieve.
> I do not believe that these changes will have any actual impact on the
> product, though. I will present the differences more in detail further
> down. Those who are not interested can probably skip that.
>
> The patch has passed tier1 testing and is currently running tier2 and
> tier3. Since the running code is more or less (see caveat below)
> unmodified, I don't expect any testing issues.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8200178
> WebRev:
> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01
>
> Details on changes:
> Most of the source code changes are (unsurprisingly) in java.base and
> java.desktop. Remaining changes are in jdk.crypto.ucrypto,
> jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent.
>
> Source code changes consist almost 100% of decorating an
> exported function with JNIEXPORT. I have also followed the
> long-standing convention of adding JNICALL. This is a no-op on
> non-Windows platforms, so for most of the changes this is purely
> cosmetic (and possibly adding in robustness, should the function ever
> be used on Windows in the future). I have also followed the stylistic
> convention of putting "JNIEXPORT <return type> JNICALL" on a separate
> line. For some functions, however, this might cause a change in
> calling convention on Windows. Since this can not apply to exported
> functions on Windows (otherwise they would already have had
> JNIEXPORT), I do not think this matters.
>
> A few libraries did not have a mapfile, on Linux and/or Solaris. This
> actually meant that all symbols were exported. It is highly unclear if
> this was known and intended by the original make rule writer. I have
> emulated this by adding the flag $(EXPORT_ALL_SYMBOLS) to these
> libraries. Hopefully, we can remove this flag and fix proper exported
> symbols in the future.
>
> I have run the complete build using COMPARE_BUILD, and made a
> thorough analysis of the differences for Linux and Solaris. All
> native libraries have symbol differences, but most of them are trivial
> and/or harmless. As a result, most libraries have disasm differences
> as well, but these too seem trivial and harmless. The differences in
> symbols that are common to all libraries include:
>  * Internal symbols such as __bss_start, _edata, _end and _fini are
> now global. (They are imported as such from the compiler
> libraries/archives, and we have no linker script to override this
> behavior).
>  * The versioning tag SUNWprivate_1.1 is not included, and thus
> neither the .gnu.version_d symbol.
>  * There are a few differences in the symbol and/or mangling of some
> local functions. I'm not sure what's causing this,
> but it's unlikely to have any effect on the product.
>
> Another common source for change in symbols is due to previous
> platform differences.
For instance, if we had "JNIEXPORT int JNICALL > do_foo() { ... }", but do_foo was not in the mapfile, the symbol was > exported on Windows but not on Linux and Solaris. (Presumably since it > was not needed there, even though it was compiled for those platforms > as well.) Now, with the mapfiles gone, do_foo() will be exported on > all platforms. Conversely, functions that are compiled on all > platforms, and were exported in mapfiles, but now have gotten a > JNIEXPORT decoration, will now be visible even on Windows. (This > accounts for half of the noticed symbol differences on Windows.) I > could have made the JNIEXPORT conditional on OS, but I didn't think > the mess in source code was worth the keeping of binary confidence > with the old build. > > A third common source for change in symbols is due to exported > functions "leaking" across library borders. For instance, some > functions in java.desktop are compiled in both libawt_xawt and > libawt_headless, but they were previously only included in the mapfile > for one of these libraries. Now, since the visibility is determined by > the source code itself, they get exported in both libraries. A variant > of this is when a library depends on another JDK library, and includes > the header file from that other library, which in turn declares a > function as JNIEXPORT. This will cause the including library to also > export the function. This accounts for the other half of the changes > on Windows. A typical example of this is that multiple libraries now > re-export hotspot symbols from libjvm.so, like jio_fprintf. (I have > not listed the libjvm re-exports below.) > > Note that Java_java_io_FileOutputStream_close0 in > java.base/unix/native/libjava/FileOutputStream_md.c is no longer > exported, > and can probably be removed. 
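For readers following the do_foo() discussion above, here is a minimal sketch of source-level export control. DEMO_EXPORT and DEMO_CALL are hypothetical stand-ins for JNIEXPORT and JNICALL, invented for this illustration; the real definitions live in the JDK's jni_md.h and may differ. internal_helper is likewise made up. The sketch assumes a GCC-compatible compiler on Linux/Solaris or MSVC on Windows.

```c
#include <stdio.h>

/*
 * DEMO_EXPORT / DEMO_CALL are hypothetical macros for illustration only;
 * they are NOT the JDK's JNIEXPORT / JNICALL definitions.
 */
#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#  define DEMO_CALL   __stdcall
#elif defined(__GNUC__)
#  define DEMO_EXPORT __attribute__((visibility("default")))
#  define DEMO_CALL   /* no-op outside Windows */
#else
#  define DEMO_EXPORT
#  define DEMO_CALL
#endif

/*
 * Decorated in the source, so this symbol is exported on every platform
 * without needing an entry in a mapfile.
 */
DEMO_EXPORT int DEMO_CALL
do_foo(void)
{
    return 42;
}

/*
 * Not decorated: when the library is built with -fvisibility=hidden
 * (GCC-compatible compilers), this symbol stays local to the library.
 */
int
internal_helper(void)
{
    return do_foo() + 1;
}
```

Building this as a shared library with something like "gcc -shared -fPIC -fvisibility=hidden" and inspecting it with "nm -D" should list do_foo but not internal_helper, which is roughly the effect the mapfile used to provide.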
> > Here is a detailed table showing and accounting for all the remaining > differences found on Linux and Solaris: > java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0 is > now also exported on unix platforms due to JNIEXPORT. > > java.base/jspawnlauncher: On solaris, we also include > libjava/childproc.o, which > now exports less functions than it used to (it used to export all > functions, now it is compiled with visibility=hidden). > > java.base/java(w).exe: Is now also exporting the following symbols due > to added JNIEXPORT in libjli on Windows: > (Yes, executables can export symbols on Windows. Confusing, I know.) > JLI_AddArgsFromEnvVar > JLI_CmdToArgs > JLI_GetAppArgIndex > JLI_GetStdArgc > JLI_GetStdArgs > JLI_InitArgProcessing > JLI_Launch > JLI_List_add > JLI_List_new > JLI_ManifestIterate > JLI_MemAlloc > JLI_MemFree > JLI_PreprocessArg > JLI_ReportErrorMessage > JLI_ReportErrorMessageSys > JLI_ReportExceptionDescription > JLI_ReportMessage > JLI_SetTraceLauncher > JLI_StringDup > > java.desktop:/libawt_xawt: The following symbols are now also exported > on linux and solaris due to JNIEXPORT: > awt_DrawingSurface_FreeDrawingSurfaceInfo > awt_DrawingSurface_GetDrawingSurfaceInfo > awt_DrawingSurface_Lock > awt_DrawingSurface_Unlock > awt_GetColor > > The following symbols are now also exported on linux and solaris due > to JNIEXPORT (they were previously > exported only in libawt): > Java_sun_awt_DebugSettings_setCTracingOn__Z > Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2 > Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I > Java_sun_awt_X11GraphicsConfig_getNumColors > > java.desktop:/libawt_headless: The following symbols are now also > exported due to JNIEXPORT (they were previously > exported only in libawt_xawt and/or libawt): > Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo > Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities > Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask > 
Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable > X11SurfaceData_GetOps > > java.desktop/libawt: The following symbols are now also exported on > Windows, due to added > JNIEXPORT: > SurfaceData_InitOps > mul8table > div8table > doDrawPath > doFillPath > g_CMpDataID > initInverseGrayLut > make_dither_arrays > make_uns_ordered_dither_array > path2DFloatCoordsID > path2DNumTypesID > path2DTypesID > path2DWindingRuleID > sg2dStrokeHintID > std_img_oda_blue > std_img_oda_green > std_img_oda_red > std_odas_computed > sunHints_INTVAL_STROKE_PURE > > java.desktop/libawt on solaris: > A number of "#pragma weak" directives were previously overridden by the > mapfile. > Now these directives are respected, so these symbols are now weak > instead of local: > ByteGrayToIntArgbPreConvert_F > ByteGrayToIntArgbPreScaleConvert_F > IntArgbBmToFourByteAbgrPreScaleXparOver_F > IntArgbToIntRgbXorBlit_F > IntBgrToIntBgrAlphaMaskBlit_F > > java.desktop/libawt on solaris: These are now also exported due to > JNIEXPORT in libmlib_image. > j2d_mlib_ImageCreate > j2d_mlib_ImageCreateStruct > j2d_mlib_ImageDelete > > java.desktop/libawt on solaris: These are now also exported due to > JNIEXPORT: > GrPrim_CompGetXorColor > SurfaceData_GetOpsNoSetup > SurfaceData_IntersectBoundsXYWH > SurfaceData_SetOps > Transform_GetInfo > Transform_transform > > java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux and > solaris due to JNIEXPORT. > libsplashscreen also had JNIEXPORT (actually a pure > __declspec(dllexport)) but no JNICALL, which I added as > a part of converting to JNIEXPORT. The same goes for libmlib_image. > > jdk.sctp/libsctp: handleSocketError is now exported on linux and > solaris due to JNIEXPORT in libnio. > > java.instrument:/libinstrument: Agent_OnUnload is now also exported on > linux and solaris platforms due to JNIEXPORT. > JLI_ManifestIterate is now also exported on Windows, due to added > JNIEXPORT in libjli. 
> > jdk.management/libmanagement_ext: > Java_com_sun_management_internal_Flag_setDoubleValue is now also > exported on linux and solaris platforms due to JNIEXPORT. > > /Magnus > > From philip.race at oracle.com Fri Mar 23 18:01:34 2018 From: philip.race at oracle.com (Phil Race) Date: Fri, 23 Mar 2018 11:01:34 -0700 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <359e997c-6f49-218b-a1e6-81888135242b@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <359e997c-6f49-218b-a1e6-81888135242b@oracle.com> Message-ID: <96a6b8d6-d979-9cc9-6d51-7c6f0926bf97@oracle.com> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01/src/java.desktop/share/native/libmlib_image/mlib_image_proto.h.udiff.html The variable definitions here are now misaligned. ..and added 2d-dev since many of these native changes are in 2d. -phil. On 03/23/2018 10:33 AM, Phil Race wrote: > There are a lot of changes in the desktop libraries. > Doing mach5 tier1/2/3 testing is not nearly sufficient to cover those > since > only tier3 has any UI tests and it barely uses anything that's touched > here. > So since testing seems to be wise, then I think you should do a > jtreg desktop group run on Linux & Windows. > You can probably skip Mac since it is unaffected and I think Linux > will cover Solaris here. > You should also do some headless testing. > > It could take some time to review this properly and decide what > changes are OK, > what changes are something we should clean up later, and what changes > are something > that ought to be addressed now .. > > I think I'd be mainly concerned that something fails due to a missing > symbol, or > that for newly exported symbols if we ended up with duplicate symbols > as a result. > > The results of a test run will add confidence here. 
> > BTW I don't think you are right that > java.desktop:/libawt_headless: The following symbols are now also > exported due to JNIEXPORT (they were previously > .. > X11SurfaceData_GetOps > > It looks to me like it was previously exported. > > > -phil. > > > > On 03/23/2018 06:56 AM, Magnus Ihse Bursie wrote: >> With modern compilers, we can use compiler directives (such as >> _attribute__((visibility("default"))), or __declspec(dllexport)) to >> control symbol visibility, directly in the source code. This has >> historically not been present on all compilers, so we had to resort >> to using mapfiles (also known as linker scripts). >> >> This is no longer the case. Now all compilers we use support symbol >> visibility directives, in one form or another. We should start using >> this. Since this has been the only way to control symbol visibility >> on Windows, for most of the shared code, we already have proper >> JNIEXPORT decorations in place. >> >> If we fix the remaining platform-specific files to have proper >> JNIEXPORT tagging, then we can finally get rid of mapfiles. >> >> This fix removed mapfiles for all JDK libraries. It does not touch >> hotspot libraries nor JDK executables; they will have to wait for a >> future fix -- this was complex enough. This change will not have any >> impact on macosx, since we do not use mapfiles there, but instead >> export all symbols. (This is not a good idea, but I'll address that >> separately.) This change will also have a minimal impact on Windows. >> The only reason Windows is impacted at all, is that some changes >> needed by Solaris and Linux were simpler to fix for all platforms. >> >> I have strived for this change to have no impact on the actual >> generated code. Unfortunately, this was not possible to fully >> achieve. I do not believe that these changes will have any actual >> impact on the product, though. I will present the differences more in >> detail further down. 
>> >> /Magnus >> >> > From gary.adams at oracle.com Fri Mar 23 18:33:55 2018 From: gary.adams at oracle.com (Gary Adams) Date: Fri, 23 Mar 2018 14:33:55 -0400 Subject: RFR: JDK-8031445: Attach on windows can fail with java.io.IOException: All pipe instances are busy In-Reply-To: References: <4eb6ff53-0b9a-e4fb-60d1-1a5e0b82c10c@oracle.com> Message-ID: <5AB54893.5040301@oracle.com> I finally found some time for some additional testing with the fix below. I could not force the attach error with windows-x64, but did finally see some failures with windows-x86. - I plan to include the ProblemList.txt change for JDK-8057732, because this change removed the timeout observed with that test. - JDK-8037274 provided an interim change to include GetLastError() in the exception message. That explains why sightings were reporting different exception messages. Chris, if you're OK sponsoring this push, I'll send you a cleaned up rebased patch on Mon. On 2/7/18, 3:34 PM, gary.adams at oracle.com wrote: > On 2/7/18 3:19 PM, gary.adams at oracle.com wrote: >> Hi Gary, >> >> I don't think you intended to include the ProblemList.txt changes in >> your webrev. > You are right. I was also looking at JDK-8057732 in the same workspace. > I believe there may have been a windows-x86 issue that may no longer > be an issue. >> I think your changes address the "java.io.IOException: CreateNamedPipe >> failed" failures if a name collision is the cause. This failure mode was >> extremely rare (only 3 sightings), and if due to a collision, a single >> retry should suffice in making it not appear again in our lifetime. >> However, I don't think this addresses the "java.io.IOException: All pipe >> instances are busy" issue, which seems to the more common failures mode, >> although also very rare. Have you looked into its potential cause? > Unfortunately, we no longer have the stack traces from the earlier > test failures. 
> The one stack trace we do have comes from this same native call to > createNamedPipe. > > I have not been able to reproduce any of the original reported errors, > yet. > If this is a question of a heavily loaded system contending for a > limited number > of named pipes, the retry should address a number of those race > conditions. > We could also introduce a delay before the retry in case an older > process is exiting > and not getting enough cycles to complete. > > Since we're talking about attach operations, I don't think we'll see this > issue failing in real life situations. >> thanks, >> >> Chris >> >> On 2/7/18 8:51 AM, gary.adams at oracle.com wrote: >> > The IOException that is observed when creating a new named pipe >> > when the pipe already exists and is in use, recommends to retry >> > the operation later. Since we are already using a random number >> > to generate a unique pipe name, it makes sense to simply >> > retry the operation with a new pipe name. >> > >> > Here is a proposed fix. Testing in progress. >> > >> > Issue: https://bugs.openjdk.java.net/browse/JDK-8031445 >> > Webrev: http://cr.openjdk.java.net/~gadams/8031445/ >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Fri Mar 23 19:12:08 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 23 Mar 2018 12:12:08 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> Message-ID: Hi Alex, It looks good to me. Thank you for the update! Thanks, Serguei On 3/22/18 16:28, Alex Menkov wrote: > Hi David, > > With too-long shmem name java reports: > ERROR: transport error 202: failed to create shared memory listener: > Error: address strings longer than 50 characters are invalid > and ret.code is 2 > > I added checks for both ret.code and presence of "address strings > longer than" text in the output. > webrev: > http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.06/ > > --alex > > On 03/22/2018 14:32, David Holmes wrote: >> On 23/03/2018 6:43 AM, Alex Menkov wrote: >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ >>> >>> The test was updated to ensure shmem name longer than 49 symbols >>> causes java failure. >> >> This doesn't ensure it failed gracefully: >> >> 81         // extra test: ensure using of too-long name fails gracefully >> 82         // (shmemName + "X") is expected to be "too long". >> 83         ProcessTools.executeProcess(getTarget(shmemName + "X")) >> 84                 .shouldNotHaveExitValue(0); >> >> It may have crashed. What exactly is the failure mode? return code 1? >> Exception message that we can check for in outputAnalyzer ? 
>> >> David >> >>> --alex >>> >>> >>> On 03/21/2018 15:39, David Holmes wrote: >>>> On 22/03/2018 2:41 AM, Alex Menkov wrote: >>>>> Hi David, >>>>> >>>>> On 03/20/2018 21:51, David Holmes wrote: >>>>>> Hi Alex, >>>>>> >>>>>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> On 03/19/2018 18:10, David Holmes wrote: >>>>>>>> Hi Alex, >>>>>>>> >>>>>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>>>>> Hi guys, >>>>>>>>> >>>>>>>>> please re-review the fix. >>>>>>>> >>>>>>>> I still have an unanswered question about where the max of 49 >>>>>>>> is enforced. I see it for the "address" but not names in >>>>>>>> general. ?? >>>>>>> >>>>>>> for shmem the "channel name" is the address (it's checked in >>>>>>> createTransport/openTransport). >>>>>>> Names for mutexes/events are generated by appending some strings >>>>>>> to the adddress and length of the added parts are supposed to be >>>>>>> less than MAX_IPC_SUFFIX (25 symbols): >>>>>>> ".mutex" (+ up to 3 symbols) >>>>>>> ".hasData" (+ up to 3 symbols) >>>>>>> ".hasSpace" (+ up to 3 symbols) >>>>>>> ".ctos" >>>>>>> ".stoc" >>>>>>> ".accept" (+ up to 3 symbols) >>>>>>> ".attach" (+ up to 3 symbols) >>>>>>> "." (pid is a DWORD) >>>>>> >>>>>> Okay so ... the code in shmemBase.c is very unclear as to which >>>>>> "names" can come in from an external source and which are only >>>>>> ever derived from other "names". If the "address" (which seems a >>>>>> very bad description in this case!) is the only external source >>>>>> for a name, and it is limited to a length of 49 then that is okay. >>>>> >>>>> Yes, the "address" is the only external arg, all other names are >>>>> constructed from it. >>>>> I believe it's "address" because it comes from "address" parameter: >>>>> -Xrunjdwp:transport=st_shmem,address= >>>>> >>>>>> >>>>>>>> >>>>>>>>> Reg.test is added the the issue. >>>>>>>> >>>>>>>> I don't quite follow the test. 
I see you try to set the name >>>>>>>> with a value that is too long, and if that doesn't cause an >>>>>>>> overflow and we don't crash that is good. But I'd expect you to >>>>>>>> read back the name and check it matches the truncated name with >>>>>>>> 49 characters. >>>>>>> >>>>>>> The test specifies the maximum length supported (49 symbols) >>>>>>> (if longer name is specified, "address strings longer than 50 >>>>>>> characters are invalid" error reported). >>>>>> >>>>>> I missed the substring that simply causes the name to be the >>>>>> maximum supported length. That would trigger the overflow and so >>>>>> suffices as a regression test for this fix. >>>>>> >>>>>> Is there another test that already passes a too-long name and >>>>>> verifies the error gets thrown? >>>>> >>>>> Do you mean name >= 50 symbols? >>>>> No, there is no such test. >>>>> I don't think it makes much sense (test an arbitrary >>>>> implementation-specific restriction), but I can add the case to >>>>> the test. >>>> >>>> It ensures that using a too-long name fails gracefully. >>>> >>>> Thanks, >>>> David >>>> >>>>> --alex >>>>> >>>>>> >>>>>>> As far as I see there is no way to read back the name used to >>>>>>> create the transport. >>>>>> >>>>>> Ok. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> --alex >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>>>>> >>>>>>>>> >>>>>>>>> --alex >>>>>>>>> >>>>>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Please review a small fix for >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>>>>> >>>>>>>>>> Root cause of the issue is jdb hangs as a result of the >>>>>>>>>> buffer overflow. 
>>>>>>>>>> >>>>>>>>>> In the beginning of the shmemBase.c: >>>>>>>>>> >>>>>>>>>> #define MAX_IPC_PREFIX 50   /* user-specified or generated >>>>>>>>>> name for */ >>>>>>>>>>                             /* shared memory seg and prefix >>>>>>>>>> for other IPC */ >>>>>>>>>> #define MAX_IPC_SUFFIX 25   /* suffix to shmem name for other >>>>>>>>>> IPC names */ >>>>>>>>>> #define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>>>>> >>>>>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is not >>>>>>>>>> big enough. >>>>>>>>>> >>>>>>>>>> --alex From chris.plummer at oracle.com Fri Mar 23 20:15:44 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 23 Mar 2018 13:15:44 -0700 Subject: RFR: JDK-8031445: Attach on windows can fail with java.io.IOException: All pipe instances are busy In-Reply-To: <5AB54893.5040301@oracle.com> References: <4eb6ff53-0b9a-e4fb-60d1-1a5e0b82c10c@oracle.com> <5AB54893.5040301@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Fri Mar 23 20:17:57 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 23 Mar 2018 13:17:57 -0700 Subject: RFR: 8049695: nsk/jdb/options/connect/connect003 fails with "Launched jdb could not attach to debuggee during 300000 milliseconds" In-Reply-To: <12f15c12-be21-ffd6-0172-7ca3be1f38f8@oracle.com> References: <8096d37d-dea9-d6fe-7abf-1c09b128d220@oracle.com> <5d3b8c12-d091-aa51-d587-e77c6c4b5b90@oracle.com> <3b9731df-3ea9-59ec-2c0c-af72a554e69a@oracle.com> <7402a238-0c43-ac32-e0b0-4951361c1760@oracle.com> <81f49bb9-fd37-b114-ad56-94e9141ec6e1@oracle.com> <5a8b8fc3-7b92-7e74-a2f7-5b7e077ce7b8@oracle.com> <12f15c12-be21-ffd6-0172-7ca3be1f38f8@oracle.com> Message-ID: +1 On 3/22/18 4:50 PM, David Holmes wrote: > Thanks Alex! Looks good. 
> > David > > On 23/03/2018 9:28 AM, Alex Menkov wrote: >> Hi David, >> >> With too-long shmem name java reports: >> ERROR: transport error 202: failed to create shared memory listener: >> Error: address strings longer than 50 characters are invalid >> and ret.code is 2 >> >> I added checks for both ret.code and presence of "address strings >> longer than" text in the output. >> webrev: >> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.06/ >> >> --alex >> >> On 03/22/2018 14:32, David Holmes wrote: >>> On 23/03/2018 6:43 AM, Alex Menkov wrote: >>>> >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.05/ >>>> >>>> The test was updated to ensure shmem name longer than 49 symbols >>>> causes java failure. >>> >>> This doesn't ensure it failed gracefully: >>> >>> 81???????? // extra test: ensure using of too-long name fails >>> gracefully >>> 82???????? // (shmemName + "X") is expected to be "too long". >>> 83???????? ProcessTools.executeProcess(getTarget(shmemName + "X")) >>> 84???????????????? .shouldNotHaveExitValue(0); >>> >>> It may have crashed. What exactly is the failure mode? return code >>> 1? Exception message that we can check for in outputAnalyzer ? >>> >>> David >>> >>>> --alex >>>> >>>> >>>> On 03/21/2018 15:39, David Holmes wrote: >>>>> On 22/03/2018 2:41 AM, Alex Menkov wrote: >>>>>> Hi David, >>>>>> >>>>>> On 03/20/2018 21:51, David Holmes wrote: >>>>>>> Hi Alex, >>>>>>> >>>>>>> On 21/03/2018 3:25 AM, Alex Menkov wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 03/19/2018 18:10, David Holmes wrote: >>>>>>>>> Hi Alex, >>>>>>>>> >>>>>>>>> On 20/03/2018 10:28 AM, Alex Menkov wrote: >>>>>>>>>> Hi guys, >>>>>>>>>> >>>>>>>>>> please re-review the fix. >>>>>>>>> >>>>>>>>> I still have an unanswered question about where the max of 49 >>>>>>>>> is enforced. I see it for the "address" but not names in >>>>>>>>> general. ?? 
>>>>>>>> >>>>>>>> for shmem the "channel name" is the address (it's checked in >>>>>>>> createTransport/openTransport). >>>>>>>> Names for mutexes/events are generated by appending some >>>>>>>> strings to the adddress and length of the added parts are >>>>>>>> supposed to be less than MAX_IPC_SUFFIX (25 symbols): >>>>>>>> ".mutex" (+ up to 3 symbols) >>>>>>>> ".hasData" (+ up to 3 symbols) >>>>>>>> ".hasSpace" (+ up to 3 symbols) >>>>>>>> ".ctos" >>>>>>>> ".stoc" >>>>>>>> ".accept" (+ up to 3 symbols) >>>>>>>> ".attach" (+ up to 3 symbols) >>>>>>>> "." (pid is a DWORD) >>>>>>> >>>>>>> Okay so ... the code in shmemBase.c is very unclear as to which >>>>>>> "names" can come in from an external source and which are only >>>>>>> ever derived from other "names". If the "address" (which seems a >>>>>>> very bad description in this case!) is the only external source >>>>>>> for a name, and it is limited to a length of 49 then that is okay. >>>>>> >>>>>> Yes, the "address" is the only external arg, all other names are >>>>>> constructed from it. >>>>>> I believe it's "address" because it comes from "address" parameter: >>>>>> -Xrunjdwp:transport=st_shmem,address= >>>>>> >>>>>>> >>>>>>>>> >>>>>>>>>> Reg.test is added the the issue. >>>>>>>>> >>>>>>>>> I don't quite follow the test. I see you try to set the name >>>>>>>>> with a value that is too long, and if that doesn't cause an >>>>>>>>> overflow and we don't crash that is good. But I'd expect you >>>>>>>>> to read back the name and check it matches the truncated name >>>>>>>>> with 49 characters. >>>>>>>> >>>>>>>> The test specifies the maximum length supported (49 symbols) >>>>>>>> (if longer name is specified, "address strings longer than 50 >>>>>>>> characters are invalid" error reported). >>>>>>> >>>>>>> I missed the substring that simply causes the name to be the >>>>>>> maximum supported length. That would trigger the overflow and so >>>>>>> suffices as a regression test for this fix. 
>>>>>>> >>>>>>> Is there another test that already passes a too-long name and >>>>>>> verifies the error gets thrown? >>>>>> >>>>>> Do you mean name >= 50 symbols? >>>>>> No, there is no such test. >>>>>> I don't think it make much sense (test an arbitrary >>>>>> implementation-specific restriction), but I can add the case to >>>>>> the test. >>>>> >>>>> It ensures that using a too-long name fails gracefully. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> --alex >>>>>> >>>>>>> >>>>>>>> As far as I see there is no way to read back the name used to >>>>>>>> create the transport. >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> --alex >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open.04/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> --alex >>>>>>>>>> >>>>>>>>>> On 03/13/2018 16:14, Alex Menkov wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> Please review a small fix for >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8049695 >>>>>>>>>>> webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~amenkov/shmem_long_name/webrev_open/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Root cause of the issue is jbd hungs as a result of the >>>>>>>>>>> buffer overflow. >>>>>>>>>>> >>>>>>>>>>> In the beginning of the shmemBase.c: >>>>>>>>>>> >>>>>>>>>>> #define MAX_IPC_PREFIX 50?? /* user-specified or generated >>>>>>>>>>> name for */ >>>>>>>>>>> ???????????????????????????? /* shared memory seg and prefix >>>>>>>>>>> for other IPC */ >>>>>>>>>>> #define MAX_IPC_SUFFIX 25?? /* suffix to shmem name for >>>>>>>>>>> other IPC names */ >>>>>>>>>>> #define MAX_IPC_NAME?? (MAX_IPC_PREFIX + MAX_IPC_SUFFIX) >>>>>>>>>>> >>>>>>>>>>> buffer (char prefix[]) in function createStream is used to >>>>>>>>>>> generate base name for mutex/events, so MAX_IPC_PREFIX is >>>>>>>>>>> not big enough. 
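A minimal sketch of the sizing problem described above, in isolation. The three macros are copied from the quoted shmemBase.c excerpt; build_ipc_name is a hypothetical helper written only for this illustration and is not the code from the webrev. It assumes a C99 snprintf, which reports the length it wanted to write, so a too-small buffer can be detected instead of overflowed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_IPC_PREFIX 50   /* user-specified or generated name */
#define MAX_IPC_SUFFIX 25   /* suffix appended for other IPC names */
#define MAX_IPC_NAME   (MAX_IPC_PREFIX + MAX_IPC_SUFFIX)

/*
 * Hypothetical helper (not from shmemBase.c): writes base + suffix into out.
 * Returns 0 on success, -1 if the combined name does not fit in outSize bytes.
 */
static int build_ipc_name(char *out, size_t outSize,
                          const char *base, const char *suffix)
{
    int n;

    assert(out != NULL && outSize > 0);
    n = snprintf(out, outSize, "%s%s", base, suffix);
    if (n < 0 || (size_t)n >= outSize) {
        out[0] = '\0';      /* never hand back a truncated name */
        return -1;
    }
    return 0;
}
```

With a 49-character address and the ".mutex" suffix the combined name is 55 characters long, so a buffer of MAX_IPC_PREFIX + 1 bytes is rejected, while one of MAX_IPC_NAME + 1 bytes is accepted.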
>>>>>>>>>>>
>>>>>>>>>>> --alex

From magnus.ihse.bursie at oracle.com Fri Mar 23 20:36:29 2018
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Fri, 23 Mar 2018 21:36:29 +0100
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To: <96a6b8d6-d979-9cc9-6d51-7c6f0926bf97@oracle.com>
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <359e997c-6f49-218b-a1e6-81888135242b@oracle.com> <96a6b8d6-d979-9cc9-6d51-7c6f0926bf97@oracle.com>
Message-ID: <2db2ec37-b561-3f93-623b-14ddbc615381@oracle.com>

On 2018-03-23 19:01, Phil Race wrote:
>
> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01/src/java.desktop/share/native/libmlib_image/mlib_image_proto.h.udiff.html
>
> The variable definitions here are now misaligned.

No, they are not. That's just an artifact of webrev, which filters out whitespace changes in the diff view. :-( To see the proper changes, including whitespace, you need to download the patch file. I've gone to great pains to mimic the existing style in the source code I've made changes to.

/Magnus

>
> ..and added 2d-dev since many of these native changes are in 2d.
>
> -phil.
>
> On 03/23/2018 10:33 AM, Phil Race wrote:
>> There are a lot of changes in the desktop libraries.
>> Doing mach5 tier1/2/3 testing is not nearly sufficient to cover those since
>> only tier3 has any UI tests and it barely uses anything that's touched here.
>> So since testing seems to be wise, then I think you should do a
>> jtreg desktop group run on Linux & Windows.
>> You can probably skip Mac since it is unaffected and I think Linux
>> will cover Solaris here.
>> You should also do some headless testing.
>>
>> It could take some time to review this properly and decide what changes are OK,
>> what changes are something we should clean up later, and what changes are something
>> that ought to be addressed now ..
>> >> I think I'd be mainly concerned that something fails due to a missing >> symbol, or >> that for newly exported symbols if we ended up with duplicate symbols >> as a result. >> >> The results of a test run will add confidence here. >> >> BTW I don't think you are right that >> java.desktop:/libawt_headless: The following symbols are now also >> exported due to JNIEXPORT (they were previously >> .. >> ?X11SurfaceData_GetOps >> >> It looks to me like it was previously exported. >> >> >> -phil. >> >> >> >> On 03/23/2018 06:56 AM, Magnus Ihse Bursie wrote: >>> With modern compilers, we can use compiler directives (such as >>> _attribute__((visibility("default"))), or __declspec(dllexport)) to >>> control symbol visibility, directly in the source code. This has >>> historically not been present on all compilers, so we had to resort >>> to using mapfiles (also known as linker scripts). >>> >>> This is no longer the case. Now all compilers we use support symbol >>> visibility directives, in one form or another. We should start using >>> this. Since this has been the only way to control symbol visibility >>> on Windows, for most of the shared code, we already have proper >>> JNIEXPORT decorations in place. >>> >>> If we fix the remaining platform-specific files to have proper >>> JNIEXPORT tagging, then we can finally get rid of mapfiles. >>> >>> This fix removed mapfiles for all JDK libraries. It does not touch >>> hotspot libraries nor JDK executables; they will have to wait for a >>> future fix -- this was complex enough. This change will not have any >>> impact on macosx, since we do not use mapfiles there, but instead >>> export all symbols. (This is not a good idea, but I'll address that >>> separately.) This change will also have a minimal impact on Windows. >>> The only reason Windows is impacted at all, is that some changes >>> needed by Solaris and Linux were simpler to fix for all platforms. 
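To illustrate the source-level approach described above, here is a minimal sketch. DEMO_EXPORT, demo_helper and demo_do_foo are hypothetical names invented for this example and do not come from the JDK sources; in the JDK the corresponding decoration is JNIEXPORT. The sketch assumes either a Windows compiler that understands __declspec(dllexport) or a GCC-compatible compiler, where the library would typically be built with -fvisibility=hidden so that only decorated symbols are exported.

```c
/* Hypothetical export macro for illustration; the JDK uses JNIEXPORT. */
#if defined(_WIN32)
  #define DEMO_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
  #define DEMO_EXPORT __attribute__((visibility("default")))
#else
  #define DEMO_EXPORT   /* unknown compiler: fall back to default visibility */
#endif

/* Internal helper: static linkage, so it is never part of the export table. */
static int demo_helper(int x)
{
    return x * 2;
}

/* Exported entry point: visible to users of the shared library even when
 * everything else is hidden, without needing an entry in a mapfile. */
DEMO_EXPORT int demo_do_foo(int x)
{
    return demo_helper(x) + 1;
}
```

The point of the design is that the decision about what is exported lives next to the function definition, so it cannot drift out of sync with a separately maintained mapfile.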
>>> >>> I have strived for this change to have no impact on the actual >>> generated code. Unfortunately, this was not possible to fully >>> achieve. I do not believe that these changes will have any actual >>> impact on the product, though. I will present the differences more >>> in detail further down. Those who are not interested can probably >>> skip that. >>> >>> The patch has passed tier1 testing and is currently running tier2 >>> and tier3. Since the running code is more or less (see caveat below) >>> unmodified, I don't expect any testing issues. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8200178 >>> WebRev: >>> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01 >>> >>> Details on changes: >>> Most of the source code changes are (unsurprisingly) in java.base >>> and java.desktop. Remaining changes are in jdk.crypto.ucrypto, >>> jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent. >>> >>> Source code changes does almost to 100% consists in decorating an >>> exported function with JNIEXPORT. I have also followed the >>> long-standing convention of adding JNICALL. This is a no-op on >>> non-Windows platforms, so for most of the changes this is purely >>> cosmetic (and possibly adding in robustness, should the function >>> ever be used on Windows in the future). I have also followed the >>> stylistic convention of putting "JNIEXPORT JNICALL" on >>> a separate line. For some functions, however, this might cause a >>> change in calling convention on Windows. Since this can not apply to >>> exported functions on Windows (otherwise they would already have had >>> JNIEXPORT), I do not think this matters anything. >>> >>> A few libraries did not have a mapfile, on Linux and/or Solaris. >>> This actually meant that all symbols were exported. It is highly >>> unclear if this was known and intended by the original make rule >>> writer. I have emulated this by adding the flag >>> $(EXPORT_ALL_SYMBOLS) to these libraries. 
Hopefully, we can remove >>> this flag and fix proper exported symbols in the future. >>> >>> I have run the complete build using COMPARE_BUILD, and made a >>> thourough analysis of the differences for Linux and Solaris. All >>> native libraries have symbol differences, but most of them are >>> trivial and/or harmless. As a result, most libraries have disasm >>> differences as well, but these too seem trivial and harmless. The >>> differences in symbols that are common to all libraries include: >>> ?* Internal symbols such as __bss_start, _edata, _end and _fini are >>> now global. (They are imported as such from the compiler >>> libraries/archives, and we have no linker script to override this >>> behavior). >>> ?* The versioning tag SUNWprivate_1.1 is not included, and thus >>> neither the .gnu.version_d symbol. >>> ?* There are a few differences in the symbol and/or mangling of some >>> local functions. I'm not sure what's causing this, >>> but it's unlikely to have any effect on the product. >>> >>> Another common source for change in symbols is due to previous >>> platform differences. For instance, if we had "JNIEXPORT int JNICALL >>> do_foo() { ... }", but do_foo was not in the mapfile, the symbol was >>> exported on Windows but not on Linux and Solaris. (Presumable since >>> it was not needed there, even though it was compiled for those >>> platforms as well.) Now, with the mapfiles gone, do_foo() will be >>> exported on all platforms. And contrary, functions that are compiled >>> on all platforms, and were exported in mapfiles, but now have gotten >>> an JNIEXPORT decoration, will now be visible even on Windows. (This >>> accounts for half of the noticed symbol differences on Windows.) I >>> could have made the JNIEXPORT conditional on OS, but I didn't think >>> the mess in source code were worth the keeping of binary confidence >>> with the old build. 
>>> >>> A third common source for change in symbols is due to exported >>> functions "leaking" across library borders. For instance, some >>> functions in java.desktop is compiled in both libawt_xawt and >>> libawt_headless, but they were previously only included in the >>> mapfile for one of these libraries. Now, since the visibility is >>> determined by the source code itself, it gets exported in both >>> libraries. A variant of this is when a library depends on another >>> JDK library, and includes the header file from that other library, >>> which in turn declares a function as JNIEXPORT. This will cause the >>> including library to also export the function. This accounts for the >>> other half of the changes on Windows. A typical example of this is >>> that multiple libraries now re-export hotspot symbols from >>> libjvm.so, like jio_fprintf. (I have not listed the libjvm >>> re-exports below.) >>> >>> Note that? Java_java_io_FileOutputStream_close0 in >>> java.base/unix/native/libjava/FileOutputStream_md.c is no longer >>> exported, >>> and can probably be removed. >>> >>> Here is a detailed table showing and accounting for all the >>> remaining differences found on Linux and Solaris: >>> java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0 >>> is now also exported on unix platforms due to JNIEXPORT. >>> >>> java.base/jspawnlauncher: On solaris, we also include >>> libjava/childproc.o, which >>> now exports less functions than it used to (it used to export all >>> functions, now it is compiled with visibility=hidden). >>> >>> java.base/java(w).exe: Is now also exporting the following symbols >>> due to added JNIEXPORT in libjli on Windows: >>> (Yes, executables can export symbols on Windows. Confusing, I know.) 
>>>  JLI_AddArgsFromEnvVar
>>>  JLI_CmdToArgs
>>>  JLI_GetAppArgIndex
>>>  JLI_GetStdArgc
>>>  JLI_GetStdArgs
>>>  JLI_InitArgProcessing
>>>  JLI_Launch
>>>  JLI_List_add
>>>  JLI_List_new
>>>  JLI_ManifestIterate
>>>  JLI_MemAlloc
>>>  JLI_MemFree
>>>  JLI_PreprocessArg
>>>  JLI_ReportErrorMessage
>>>  JLI_ReportErrorMessageSys
>>>  JLI_ReportExceptionDescription
>>>  JLI_ReportMessage
>>>  JLI_SetTraceLauncher
>>>  JLI_StringDup
>>>
>>> java.desktop:/libawt_xawt: The following symbols are now also
>>> exported on linux and solaris due to JNIEXPORT:
>>>  awt_DrawingSurface_FreeDrawingSurfaceInfo
>>>  awt_DrawingSurface_GetDrawingSurfaceInfo
>>>  awt_DrawingSurface_Lock
>>>  awt_DrawingSurface_Unlock
>>>  awt_GetColor
>>>
>>> The following symbols are now also exported on linux and solaris due
>>> to JNIEXPORT (they were previously exported only in libawt):
>>>  Java_sun_awt_DebugSettings_setCTracingOn__Z
>>>  Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2
>>>  Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I
>>>  Java_sun_awt_X11GraphicsConfig_getNumColors
>>>
>>> java.desktop:/libawt_headless: The following symbols are now also
>>> exported due to JNIEXPORT (they were previously exported only in
>>> libawt_xawt and/or libawt):
>>>  Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo
>>>  Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities
>>>  Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask
>>>  Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable
>>>  X11SurfaceData_GetOps
>>>
>>> java.desktop/libawt: The following symbols are now also exported on
>>> Windows, due to added JNIEXPORT:
>>>  SurfaceData_InitOps
>>>  mul8table
>>>  div8table
>>>  doDrawPath
>>>  doFillPath
>>>  g_CMpDataID
>>>  initInverseGrayLut
>>>  make_dither_arrays
>>>  make_uns_ordered_dither_array
>>>  path2DFloatCoordsID
>>>  path2DNumTypesID
>>>  path2DTypesID
>>>  path2DWindingRuleID
>>>  sg2dStrokeHintID
>>>  std_img_oda_blue
>>>  std_img_oda_green
>>>  std_img_oda_red
>>>  std_odas_computed
>>>  sunHints_INTVAL_STROKE_PURE
>>>
>>> java.desktop/libawt on solaris:
>>> A number of "#pragma weak" directives was previously overridden by
>>> the mapfile.
>>> Now these directives are respected, so these symbols are now weak
>>> instead of local:
>>>  ByteGrayToIntArgbPreConvert_F
>>>  ByteGrayToIntArgbPreScaleConvert_F
>>>  IntArgbBmToFourByteAbgrPreScaleXparOver_F
>>>  IntArgbToIntRgbXorBlit_F
>>>  IntBgrToIntBgrAlphaMaskBlit_F
>>>
>>> java.desktop/libawt on solaris: These are now also exported due to
>>> JNIEXPORT in libmlib_image.
>>>  j2d_mlib_ImageCreate
>>>  j2d_mlib_ImageCreateStruct
>>>  j2d_mlib_ImageDelete
>>>
>>> java.desktop/libawt on solaris: This is now also exported due to
>>> JNIEXPORT:
>>>  GrPrim_CompGetXorColor
>>>  SurfaceData_GetOpsNoSetup
>>>  SurfaceData_IntersectBoundsXYWH
>>>  SurfaceData_SetOps
>>>  Transform_GetInfo
>>>  Transform_transform
>>>
>>> java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux
>>> and solaris due to JNIEXPORT.
>>> libsplashscreen also had JNIEXPORT (actually a pure
>>> __declspec(dllexport)) but no JNICALL, which I added as
>>> a part of converting to JNIEXPORT. The same goes for libmlib_image.
>>>
>>> jdk.sctp/libsctp: handleSocketError is now exported on linux and
>>> solaris due to JNIEXPORT in libnio.
>>>
>>> java.instrument:/libinstrument: Agent_OnUnload is now also exported
>>> on linux and solaris platforms due to JNIEXPORT.
>>> JLI_ManifestIterate is now also exported on Windows, due to added
>>> JNIEXPORT in libjli.
>>>
>>> jdk.management/libmanagement_ext:
>>> Java_com_sun_management_internal_Flag_setDoubleValue is now also
>>> exported on linux and solaris platforms due to JNIEXPORT.
>>> >>> /Magnus >>> >>> >> > From magnus.ihse.bursie at oracle.com Fri Mar 23 21:08:35 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 23 Mar 2018 22:08:35 +0100 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <359e997c-6f49-218b-a1e6-81888135242b@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <359e997c-6f49-218b-a1e6-81888135242b@oracle.com> Message-ID: On 2018-03-23 18:33, Phil Race wrote: > There are a lot of changes in the desktop libraries. Well, yes and no. While there are multiple touched files, the resulting native shared libraries that are built have very minimal changes in them. (That's the view point from the build guy, you know :)) > Doing mach5 tier1/2/3 testing is not nearly sufficient to cover those > since > only tier3 has any UI tests and it barely uses anything that's touched > here. > So since testing seems to be wise, then I think you should do a > jtreg desktop group run on Linux & Windows. There is next to *no* difference for java.desktop on Windows. The only, very subtle, difference, is that awt.dll now exports 18 more functions (totalling 800, instead of 782). I can't even begin to imagine how anything could fail due to this additional exporting. Not even the disassembly of the machine code of awt.dll is different from before, not a single byte. So I don't buy it that I need to do extensive client testing on Windows. > You can probably skip Mac since it is unaffected and I think Linux > will cover Solaris here. > You should also do some headless testing. I agree that it seems prudent to do some Linux/Solaris testing, since changes there are more wide spread. Could you please point me to some guidance on how to run these tests? (You can do it off list) > It could take some time to review this properly and decide what > changes are OK, > what changes are something we should clean up later, and what changes > are something > that ought to be addressed now .. 
As I said, I am going to file follow-up bugs for suspicious handling of exported symbols. These follow-up bugs will be separated per component team, unlike this fix, which by necessity addresses all JDK libraries at once. So you will get plenty of time to consider ways of cleaning up any exports handling that you do not like. It would be a pity if this entire checkin was delayed since the client team could not accept the changes needed in client libraries. :-(

And frankly, I believe the java.desktop libs need some serious refactoring to get to grips with the exported symbols situation. The major cause of problems is, I believe, rooted in a non-optimal split of functionality between libawt, libawt_xawt and libawt_headless. This is not likely something that can be addressed in this change.

> I think I'd be mainly concerned that something fails due to a missing symbol, or
> that for newly exported symbols if we ended up with duplicate symbols
> as a result.

Once again, I've run the COMPARE_BUILD script on this patch. Let me explain in a bit more detail what it does, since that might be known only to us in the build team. This script analyses the build result, the jmods, the lib*.so files, etc.

The basic idea here is that a change in the build system which does not produce a change in the build result is "transparent" to the product. There is e.g. no reason to run any further testing, since we're in effect testing the same bits. For many changes in the build system, we hold this as the gold standard.

For this particular change, achieving this kind of fidelity would have come at too high a price in code complexity, so I have allowed certain small deviations. These are really minimal, and should in most cases be undetectable by the product.

The changes on Linux and Solaris that have occurred are those that I listed in my review mail. Basically, for some libraries, additional symbols are exported. I could fix this, but only at the expense of more complex code.
While it's a good thing to minimize the functions exported, a handful of extra symbols is not a disaster. (We have more important issues to address in our native libraries.)

For the AWT libraries, most of the duplicates come from the source code that is shared between libraries, in java.desktop/share/native/common. This means that the same function is compiled into -- and now also exported from -- multiple libraries. This is not a big deal. Even if we were to link with two libraries defining the same symbol, the dynamic linker will arbitrarily choose one of them, but since they are identical, it does not matter. (It's another thing if they implement different functions, as you noted yourself in the bugs about linking with awt_xawt vs doing a runtime linking to awt_headless.)

Also, I guarantee you that in no way are there missing symbols in the refactored build. I've checked, double-checked and triple-checked that.

> The results of a test run will add confidence here.
>
> BTW I don't think you are right that
> java.desktop:/libawt_headless: The following symbols are now also
> exported due to JNIEXPORT (they were previously
> ..
>  X11SurfaceData_GetOps
>
> It looks to me like it was previously exported.

You are correct, it was previously exported in libawt_headless. I meant that it is now also exported for libawt_xawt due to the JNIEXPORT. Sorry for mixing this up.

/Magnus

>
>
> -phil.
>
>
>
> On 03/23/2018 06:56 AM, Magnus Ihse Bursie wrote:
>> With modern compilers, we can use compiler directives (such as
>> __attribute__((visibility("default"))), or __declspec(dllexport)) to
>> control symbol visibility, directly in the source code. This has
>> historically not been present on all compilers, so we had to resort
>> to using mapfiles (also known as linker scripts).
>>
>> This is no longer the case. Now all compilers we use support symbol
>> visibility directives, in one form or another. We should start using
>> this.
Since this has been the only way to control symbol visibility >> on Windows, for most of the shared code, we already have proper >> JNIEXPORT decorations in place. >> >> If we fix the remaining platform-specific files to have proper >> JNIEXPORT tagging, then we can finally get rid of mapfiles. >> >> This fix removed mapfiles for all JDK libraries. It does not touch >> hotspot libraries nor JDK executables; they will have to wait for a >> future fix -- this was complex enough. This change will not have any >> impact on macosx, since we do not use mapfiles there, but instead >> export all symbols. (This is not a good idea, but I'll address that >> separately.) This change will also have a minimal impact on Windows. >> The only reason Windows is impacted at all, is that some changes >> needed by Solaris and Linux were simpler to fix for all platforms. >> >> I have strived for this change to have no impact on the actual >> generated code. Unfortunately, this was not possible to fully >> achieve. I do not believe that these changes will have any actual >> impact on the product, though. I will present the differences more in >> detail further down. Those who are not interested can probably skip >> that. >> >> The patch has passed tier1 testing and is currently running tier2 and >> tier3. Since the running code is more or less (see caveat below) >> unmodified, I don't expect any testing issues. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8200178 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01 >> >> Details on changes: >> Most of the source code changes are (unsurprisingly) in java.base and >> java.desktop. Remaining changes are in jdk.crypto.ucrypto, >> jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent. >> >> Source code changes does almost to 100% consists in decorating an >> exported function with JNIEXPORT. I have also followed the >> long-standing convention of adding JNICALL. 
This is a no-op on >> non-Windows platforms, so for most of the changes this is purely >> cosmetic (and possibly adding in robustness, should the function ever >> be used on Windows in the future). I have also followed the stylistic >> convention of putting "JNIEXPORT JNICALL" on a separate >> line. For some functions, however, this might cause a change in >> calling convention on Windows. Since this can not apply to exported >> functions on Windows (otherwise they would already have had >> JNIEXPORT), I do not think this matters anything. >> >> A few libraries did not have a mapfile, on Linux and/or Solaris. This >> actually meant that all symbols were exported. It is highly unclear >> if this was known and intended by the original make rule writer. I >> have emulated this by adding the flag $(EXPORT_ALL_SYMBOLS) to these >> libraries. Hopefully, we can remove this flag and fix proper exported >> symbols in the future. >> >> I have run the complete build using COMPARE_BUILD, and made a >> thourough analysis of the differences for Linux and Solaris. All >> native libraries have symbol differences, but most of them are >> trivial and/or harmless. As a result, most libraries have disasm >> differences as well, but these too seem trivial and harmless. The >> differences in symbols that are common to all libraries include: >> ?* Internal symbols such as __bss_start, _edata, _end and _fini are >> now global. (They are imported as such from the compiler >> libraries/archives, and we have no linker script to override this >> behavior). >> ?* The versioning tag SUNWprivate_1.1 is not included, and thus >> neither the .gnu.version_d symbol. >> ?* There are a few differences in the symbol and/or mangling of some >> local functions. I'm not sure what's causing this, >> but it's unlikely to have any effect on the product. >> >> Another common source for change in symbols is due to previous >> platform differences. 
For instance, if we had "JNIEXPORT int JNICALL >> do_foo() { ... }", but do_foo was not in the mapfile, the symbol was >> exported on Windows but not on Linux and Solaris. (Presumable since >> it was not needed there, even though it was compiled for those >> platforms as well.) Now, with the mapfiles gone, do_foo() will be >> exported on all platforms. And contrary, functions that are compiled >> on all platforms, and were exported in mapfiles, but now have gotten >> an JNIEXPORT decoration, will now be visible even on Windows. (This >> accounts for half of the noticed symbol differences on Windows.) I >> could have made the JNIEXPORT conditional on OS, but I didn't think >> the mess in source code were worth the keeping of binary confidence >> with the old build. >> >> A third common source for change in symbols is due to exported >> functions "leaking" across library borders. For instance, some >> functions in java.desktop is compiled in both libawt_xawt and >> libawt_headless, but they were previously only included in the >> mapfile for one of these libraries. Now, since the visibility is >> determined by the source code itself, it gets exported in both >> libraries. A variant of this is when a library depends on another JDK >> library, and includes the header file from that other library, which >> in turn declares a function as JNIEXPORT. This will cause the >> including library to also export the function. This accounts for the >> other half of the changes on Windows. A typical example of this is >> that multiple libraries now re-export hotspot symbols from libjvm.so, >> like jio_fprintf. (I have not listed the libjvm re-exports below.) >> >> Note that? Java_java_io_FileOutputStream_close0 in >> java.base/unix/native/libjava/FileOutputStream_md.c is no longer >> exported, >> and can probably be removed. 
>>
>> Here is a detailed table showing and accounting for all the remaining
>> differences found on Linux and Solaris:
>> java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0
>> is now also exported on unix platforms due to JNIEXPORT.
>>
>> java.base/jspawnlauncher: On solaris, we also include
>> libjava/childproc.o, which now exports less functions than it used to
>> (it used to export all functions, now it is compiled with
>> visibility=hidden).
>>
>> java.base/java(w).exe: Is now also exporting the following symbols
>> due to added JNIEXPORT in libjli on Windows:
>> (Yes, executables can export symbols on Windows. Confusing, I know.)
>>  JLI_AddArgsFromEnvVar
>>  JLI_CmdToArgs
>>  JLI_GetAppArgIndex
>>  JLI_GetStdArgc
>>  JLI_GetStdArgs
>>  JLI_InitArgProcessing
>>  JLI_Launch
>>  JLI_List_add
>>  JLI_List_new
>>  JLI_ManifestIterate
>>  JLI_MemAlloc
>>  JLI_MemFree
>>  JLI_PreprocessArg
>>  JLI_ReportErrorMessage
>>  JLI_ReportErrorMessageSys
>>  JLI_ReportExceptionDescription
>>  JLI_ReportMessage
>>  JLI_SetTraceLauncher
>>  JLI_StringDup
>>
>> java.desktop:/libawt_xawt: The following symbols are now also
>> exported on linux and solaris due to JNIEXPORT:
>>  awt_DrawingSurface_FreeDrawingSurfaceInfo
>>  awt_DrawingSurface_GetDrawingSurfaceInfo
>>  awt_DrawingSurface_Lock
>>  awt_DrawingSurface_Unlock
>>  awt_GetColor
>>
>> The following symbols are now also exported on linux and solaris due
>> to JNIEXPORT (they were previously exported only in libawt):
>>  Java_sun_awt_DebugSettings_setCTracingOn__Z
>>  Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2
>>  Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I
>>  Java_sun_awt_X11GraphicsConfig_getNumColors
>>
>> java.desktop:/libawt_headless: The following symbols are now also
>> exported due to JNIEXPORT (they were previously exported only in
>> libawt_xawt and/or libawt):
>>  Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo
?Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities >> ?Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask >> ?Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable >> ?X11SurfaceData_GetOps >> >> java.desktop/libawt: The following symbols are now also exported on >> Windows, due to added >> JNIEXPORT: >> ?SurfaceData_InitOps >> ?mul8table >> ?div8table >> ?doDrawPath >> ?doFillPath >> ?g_CMpDataID >> ?initInverseGrayLut >> ?make_dither_arrays >> ?make_uns_ordered_dither_array >> ?path2DFloatCoordsID >> ?path2DNumTypesID >> ?path2DTypesID >> ?path2DWindingRuleID >> ?sg2dStrokeHintID >> ?std_img_oda_blue >> ?std_img_oda_green >> ?std_img_oda_red >> ?std_odas_computed >> ?sunHints_INTVAL_STROKE_PURE >> >> java.desktop/libawt on solaris: >> A number of "#pragma weak" directives was previously overridden by >> the mapfile. >> Now these directives are respected, so these symbols are now weak >> instead of local: >> ?ByteGrayToIntArgbPreConvert_F >> ?ByteGrayToIntArgbPreScaleConvert_F >> ?IntArgbBmToFourByteAbgrPreScaleXparOver_F >> ?IntArgbToIntRgbXorBlit_F >> ?IntBgrToIntBgrAlphaMaskBlit_F >> >> java.desktop/libawt on solaris: These are now also exported due to >> JNIEXPORT in libmlib_image. >> ?j2d_mlib_ImageCreate >> ?j2d_mlib_ImageCreateStruct >> ?j2d_mlib_ImageDelete >> >> java.desktop/libawt on solaris: This is now also exported due to >> JNIEXPORT: >> ?GrPrim_CompGetXorColor >> ?SurfaceData_GetOpsNoSetup >> ?SurfaceData_IntersectBoundsXYWH >> ?SurfaceData_SetOps >> ?Transform_GetInfo >> ?Transform_transform >> >> java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux and >> solaris due to JNIEXPORT. >> libspashscreen also had JNIEXPORT (actually a pure >> _declspec(dllexport)) but no JNICALL, which I added as >> a part of converting to JNIEXPORT. The same goes for libmlib_image . >> >> jdk.sctp/libsctp: handleSocketError is now exported on linux and >> solaris due to JNIEXPORT in libnio. 
>> >> java.instrument:/libinstrument: Agent_OnUnload is now also exported >> on linux and solaris platforms due to JNIEXPORT. >> JLI_ManifestIterate is now also exported on Windows, due to added >> JNIEXPORT in libjli. >> >> jdk.management/libmanagement_ext: >> Java_com_sun_management_internal_Flag_setDoubleValue is now also >> exported on linux and solaris platforms due to JNIEXPORT. >> >> /Magnus >> >> > From magnus.ihse.bursie at oracle.com Fri Mar 23 21:31:36 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 23 Mar 2018 22:31:36 +0100 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> Message-ID: <278836b5-b49b-2409-f1b5-e05fdc0d2b30@oracle.com> On 2018-03-23 18:24, Volker Simonis wrote: > Hi Magnus, > > thanks for addressing this long standing issue! I haven't looked at > the changes, but just want to share some general and historical notes: > > - Compiling with "-fvisibility=hidden", which hides all symbols except > the ones explicitly exported with > "__attribute__((visibility("default")))", was requested by SAP > back in 2007 even before we had OpenJDK (see "Use -fvisibility=hidden > for gcc compiles" https://bugs.openjdk.java.net/browse/JDK-6588413) > and finally pushed into the OpenJDK around 2010. > - "-fvisibility=hidden" gave us performance improvements of about 5% > (JBB2005) and 2% (JVM98) on Linux/IA64 and 1.5% (JBB2005) and 0.5% > (JVM98) on Linux/PPC64 because the compiler could use faster calls for > non exported symbols. This improvement was only very small on x86 > though. That's a nice side effect! Although my main purpose here is maintainability, I won't say no to a performance gain. :) > - "-fvisibility=hidden"/"__attribute__((visibility("default")))" > applies BEFORE using the map files in the linking step (i.e. 
hidden > symbols can't be exported any more even if mentioned in the map file) > - because of the performance improvements we got by using > "-fvisibility=hidden" it was worthwhile using it even though we had > the mapfiles at the end of the process. > > Then we had several mail threads (which you probably remember because > you were involved :) where we discussed whether to remove the map files > completely or instead generate them automatically during the build: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-February/thread.html#12412 > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-February/thread.html#12628 > > The main arguments against removing the map files at that time were: > > 1. the danger of re-exporting all symbols of statically linked libraries > (notably libstdc++ at that time) > 2. losing exports of compiler generated symbols like vtables which > are required by the Serviceability Agent > > Point 1 is not a problem today, because I don't think we do any static > linking any more. If we still do it under some circumstances, this > problem should be re-evaluated. Well, we do static linking with libstdc++ on linux, in certain circumstances. See "--with-stdc++lib=,,". Fortunately, this is not a problem. The linker can be told not to include symbols from statically linked libraries, which is exactly what I do with LDFLAGS_JDKLIB += -Wl,--exclude-libs,ALL. The corresponding feature does not exist for the solstudio linker, but fortunately we do not use statically linked libraries there. > Point 2 is only relevant for HotSpot. But because of "8034065: GCC 4.3 > and later doesn't export vtable symbols any more which seem to be > needed by SA" (https://bugs.openjdk.java.net/browse/JDK-8034065), > exporting such symbols through a map file doesn't work any more > anyway. So this isn't a problem either. In any case, that's a question for another day. 
:) There were reasons I left Hotspot out of this fix, and the question about the SA agent is one of them. :) As you say, I think they do not apply anymore, but I'll return to consider Hotspot later on. > So to cut a long story short - I think the time is ripe to get rid of > the map files. Thumbs up from me (meant as moral support, not as a > concrete review :) Thanks for the kind words! /Magnus > > Regards, > Volker > > On Fri, Mar 23, 2018 at 5:05 PM, mandy chung wrote: >> This is a very good change and no more mapfile to maintain!! >> >> Please do file JBS issues for the component teams to clean up their exports. >> >> Mandy >> >> >> On 3/23/18 7:30 AM, Erik Joelsson wrote: >>> I have looked at the build changes and they look good. >>> >>> Will you file followups for each component team to look over their >>> exported symbols, at least for the libraries with $(EXPORT_ALL_SYMBOLS)? It >>> sure looks like there is some technical debt laying around here. >>> >>> /Erik 
From magnus.ihse.bursie at oracle.com Fri Mar 23 22:03:59 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 23 Mar 2018 23:03:59 +0100 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> Message-ID: <32db4add-8a59-03db-0074-e7df0aed14b8@oracle.com> On 2018-03-23 17:05, mandy chung wrote: > This is a very good change and no more mapfile to maintain!! Thank you! > > Please do file JBS issues for the component teams to clean up their > exports. I have now filed: https://bugs.openjdk.java.net/browse/JDK-8200191 -- for java.base https://bugs.openjdk.java.net/browse/JDK-8200192 -- for java.desktop https://bugs.openjdk.java.net/browse/JDK-8200193 -- for jdk.security.auth /Magnus > > Mandy > > On 3/23/18 7:30 AM, Erik Joelsson wrote: >> I have looked at the build changes and they look good. >> >> Will you file followups for each component team to look over their >> exported symbols, at least for the libraries with >> $(EXPORT_ALL_SYMBOLS)? It sure looks like there is some technical >> debt laying around here. >> >> /Erik >> >> >> On 2018-03-23 06:56, Magnus Ihse Bursie wrote: >>> With modern compilers, we can use compiler directives (such as >>> __attribute__((visibility("default"))), or __declspec(dllexport)) to >>> control symbol visibility, directly in the source code. This has >>> historically not been present on all compilers, so we had to resort >>> to using mapfiles (also known as linker scripts). 
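As a rough illustration of the source-level directives mentioned above, here is a minimal C sketch of an export macro that picks the visibility directive per compiler. The names MY_EXPORT, MY_CALL, helper and do_foo are made up for this example; the JDK's real JNIEXPORT and JNICALL definitions live in its jni_md.h headers and may differ in detail.

```c
/* Hypothetical stand-ins for JNIEXPORT/JNICALL (assumption: the real JDK
 * macros are defined per platform in jni_md.h and may look different). */
#if defined(_WIN32)
  #define MY_EXPORT __declspec(dllexport)
  #define MY_CALL   __stdcall
#elif defined(__GNUC__)
  #define MY_EXPORT __attribute__((visibility("default")))
  #define MY_CALL
#else
  #define MY_EXPORT
  #define MY_CALL
#endif

/* Not decorated: when compiled with -fvisibility=hidden this function
 * gets hidden visibility and stays internal to the shared library. */
int helper(int x)
{
    return x * 2;
}

/* Decorated: remains visible to users of the library without needing
 * an entry in a mapfile. */
MY_EXPORT int MY_CALL
do_foo(int x)
{
    return helper(x) + 1;
}
```

Built as a shared library with something like `gcc -shared -fPIC -fvisibility=hidden`, do_foo should show up in the dynamic symbol table while helper should not, which is the effect a mapfile was previously needed for.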
>>> >>> This is no longer the case. Now all compilers we use support symbol >>> visibility directives, in one form or another. We should start using >>> this. Since this has been the only way to control symbol visibility >>> on Windows, for most of the shared code, we already have proper >>> JNIEXPORT decorations in place. >>> >>> If we fix the remaining platform-specific files to have proper >>> JNIEXPORT tagging, then we can finally get rid of mapfiles. >>> >>> This fix removed mapfiles for all JDK libraries. It does not touch >>> hotspot libraries nor JDK executables; they will have to wait for a >>> future fix -- this was complex enough. This change will not have any >>> impact on macosx, since we do not use mapfiles there, but instead >>> export all symbols. (This is not a good idea, but I'll address that >>> separately.) This change will also have a minimal impact on Windows. >>> The only reason Windows is impacted at all, is that some changes >>> needed by Solaris and Linux were simpler to fix for all platforms. >>> >>> I have strived for this change to have no impact on the actual >>> generated code. Unfortunately, this was not possible to fully >>> achieve. I do not believe that these changes will have any actual >>> impact on the product, though. I will present the differences more >>> in detail further down. Those who are not interested can probably >>> skip that. >>> >>> The patch has passed tier1 testing and is currently running tier2 >>> and tier3. Since the running code is more or less (see caveat below) >>> unmodified, I don't expect any testing issues. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8200178 >>> WebRev: >>> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01 >>> >>> Details on changes: >>> Most of the source code changes are (unsurprisingly) in java.base >>> and java.desktop. Remaining changes are in jdk.crypto.ucrypto, >>> jdk.hotspot.agent, jdk.jdi and jdk.jdwp.agent. 
>>> >>> The source code changes consist almost entirely of decorating >>> exported functions with JNIEXPORT. I have also followed the >>> long-standing convention of adding JNICALL. This is a no-op on >>> non-Windows platforms, so for most of the changes this is purely >>> cosmetic (and possibly adds robustness, should the function >>> ever be used on Windows in the future). I have also followed the >>> stylistic convention of putting "JNIEXPORT <return type> JNICALL" on >>> a separate line. For some functions, however, this might cause a >>> change in calling convention on Windows. Since this cannot apply to >>> exported functions on Windows (otherwise they would already have had >>> JNIEXPORT), I do not think this matters. >>> >>> A few libraries did not have a mapfile, on Linux and/or Solaris. >>> This actually meant that all symbols were exported. It is highly >>> unclear if this was known and intended by the original make rule >>> writer. I have emulated this by adding the flag >>> $(EXPORT_ALL_SYMBOLS) to these libraries. Hopefully, we can remove >>> this flag and fix proper exported symbols in the future. >>> >>> I have run the complete build using COMPARE_BUILD, and made a >>> thorough analysis of the differences for Linux and Solaris. All >>> native libraries have symbol differences, but most of them are >>> trivial and/or harmless. As a result, most libraries have disasm >>> differences as well, but these too seem trivial and harmless. The >>> differences in symbols that are common to all libraries include: >>> * Internal symbols such as __bss_start, _edata, _end and _fini are >>> now global. (They are imported as such from the compiler >>> libraries/archives, and we have no linker script to override this >>> behavior). >>> * The versioning tag SUNWprivate_1.1 is not included, and thus >>> neither is the .gnu.version_d symbol. >>> * There are a few differences in the symbol and/or mangling of some >>> local functions. 
I'm not sure what's causing this, >>> but it's unlikely to have any effect on the product. >>> >>> Another common source of change in symbols is previous >>> platform differences. For instance, if we had "JNIEXPORT int JNICALL >>> do_foo() { ... }", but do_foo was not in the mapfile, the symbol was >>> exported on Windows but not on Linux and Solaris. (Presumably since >>> it was not needed there, even though it was compiled for those >>> platforms as well.) Now, with the mapfiles gone, do_foo() will be >>> exported on all platforms. Conversely, functions that are compiled >>> on all platforms, and were exported in mapfiles, but have now gotten >>> a JNIEXPORT decoration, will now be visible even on Windows. (This >>> accounts for half of the noticed symbol differences on Windows.) I >>> could have made the JNIEXPORT conditional on OS, but I didn't think >>> the mess in the source code was worth it just to keep binary confidence >>> with the old build. >>> >>> A third common source of change in symbols is exported >>> functions "leaking" across library borders. For instance, some >>> functions in java.desktop are compiled in both libawt_xawt and >>> libawt_headless, but they were previously only included in the >>> mapfile for one of these libraries. Now, since the visibility is >>> determined by the source code itself, they get exported in both >>> libraries. A variant of this is when a library depends on another >>> JDK library, and includes the header file from that other library, >>> which in turn declares a function as JNIEXPORT. This will cause the >>> including library to also export the function. This accounts for the >>> other half of the changes on Windows. A typical example of this is >>> that multiple libraries now re-export hotspot symbols from >>> libjvm.so, like jio_fprintf. (I have not listed the libjvm >>> re-exports below.) >>> >>> Note that 
Java_java_io_FileOutputStream_close0 in >>> java.base/unix/native/libjava/FileOutputStream_md.c is no longer >>> exported, >>> and can probably be removed. >>> >>> Here is a detailed table showing and accounting for all the >>> remaining differences found on Linux and Solaris: >>> java.base/unix/native/libjava: Java_java_io_FileOutputStream_close0 >>> is now also exported on unix platforms due to JNIEXPORT. >>> >>> java.base/jspawnlauncher: On solaris, we also include >>> libjava/childproc.o, which >>> now exports fewer functions than it used to (it used to export all >>> functions, now it is compiled with visibility=hidden). >>> >>> java.base/java(w).exe: Is now also exporting the following symbols >>> due to added JNIEXPORT in libjli on Windows: >>> (Yes, executables can export symbols on Windows. Confusing, I know.) >>> JLI_AddArgsFromEnvVar >>> JLI_CmdToArgs >>> JLI_GetAppArgIndex >>> JLI_GetStdArgc >>> JLI_GetStdArgs >>> JLI_InitArgProcessing >>> JLI_Launch >>> JLI_List_add >>> JLI_List_new >>> JLI_ManifestIterate >>> JLI_MemAlloc >>> JLI_MemFree >>> JLI_PreprocessArg >>> JLI_ReportErrorMessage >>> JLI_ReportErrorMessageSys >>> JLI_ReportExceptionDescription >>> JLI_ReportMessage >>> JLI_SetTraceLauncher >>> JLI_StringDup >>> >>> java.desktop:/libawt_xawt: The following symbols are now also >>> exported on linux and solaris due to JNIEXPORT: >>> awt_DrawingSurface_FreeDrawingSurfaceInfo >>> awt_DrawingSurface_GetDrawingSurfaceInfo >>> awt_DrawingSurface_Lock >>> awt_DrawingSurface_Unlock >>> awt_GetColor >>> >>> The following symbols are now also exported on linux and solaris due >>> to JNIEXPORT (they were previously >>> exported only in libawt): >>> Java_sun_awt_DebugSettings_setCTracingOn__Z >>> Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2 >>> Java_sun_awt_DebugSettings_setCTracingOn__ZLjava_lang_String_2I >>> Java_sun_awt_X11GraphicsConfig_getNumColors >>> >>> java.desktop:/libawt_headless: The following 
symbols are now also >>> exported due to JNIEXPORT (they were previously >>> exported only in libawt_xawt and/or libawt): >>> Java_sun_java2d_opengl_GLXGraphicsConfig_getGLXConfigInfo >>> Java_sun_java2d_opengl_GLXGraphicsConfig_getOGLCapabilities >>> Java_sun_java2d_x11_X11PMBlitLoops_updateBitmask >>> Java_sun_java2d_x11_X11SurfaceData_isShmPMAvailable >>> X11SurfaceData_GetOps >>> >>> java.desktop/libawt: The following symbols are now also exported on >>> Windows, due to added >>> JNIEXPORT: >>> SurfaceData_InitOps >>> mul8table >>> div8table >>> doDrawPath >>> doFillPath >>> g_CMpDataID >>> initInverseGrayLut >>> make_dither_arrays >>> make_uns_ordered_dither_array >>> path2DFloatCoordsID >>> path2DNumTypesID >>> path2DTypesID >>> path2DWindingRuleID >>> sg2dStrokeHintID >>> std_img_oda_blue >>> std_img_oda_green >>> std_img_oda_red >>> std_odas_computed >>> sunHints_INTVAL_STROKE_PURE >>> >>> java.desktop/libawt on solaris: >>> A number of "#pragma weak" directives were previously overridden by >>> the mapfile. >>> Now these directives are respected, so these symbols are now weak >>> instead of local: >>> ByteGrayToIntArgbPreConvert_F >>> ByteGrayToIntArgbPreScaleConvert_F >>> IntArgbBmToFourByteAbgrPreScaleXparOver_F >>> IntArgbToIntRgbXorBlit_F >>> IntBgrToIntBgrAlphaMaskBlit_F >>> >>> java.desktop/libawt on solaris: These are now also exported due to >>> JNIEXPORT in libmlib_image. >>> j2d_mlib_ImageCreate >>> j2d_mlib_ImageCreateStruct >>> j2d_mlib_ImageDelete >>> >>> java.desktop/libawt on solaris: These are now also exported due to >>> JNIEXPORT: >>> GrPrim_CompGetXorColor >>> SurfaceData_GetOpsNoSetup >>> SurfaceData_IntersectBoundsXYWH >>> SurfaceData_SetOps >>> Transform_GetInfo >>> Transform_transform >>> >>> java.desktop/libsplashscreen: JNI_OnLoad is now exported on linux >>> and solaris due to JNIEXPORT. 
>>> libsplashscreen also had JNIEXPORT (actually a pure >>> _declspec(dllexport)) but no JNICALL, which I added as >>> a part of converting to JNIEXPORT. The same goes for libmlib_image. >>> >>> jdk.sctp/libsctp: handleSocketError is now exported on linux and >>> solaris due to JNIEXPORT in libnio. >>> >>> java.instrument:/libinstrument: Agent_OnUnload is now also exported >>> on linux and solaris platforms due to JNIEXPORT. >>> JLI_ManifestIterate is now also exported on Windows, due to added >>> JNIEXPORT in libjli. >>> >>> jdk.management/libmanagement_ext: >>> Java_com_sun_management_internal_Flag_setDoubleValue is now also >>> exported on linux and solaris platforms due to JNIEXPORT. >>> >>> /Magnus >>> >>> >> > From leonid.mesnik at oracle.com Fri Mar 23 23:31:11 2018 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Fri, 23 Mar 2018 16:31:11 -0700 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 Message-ID: Hi Could you please review the following fix, which excludes the following tests from tier1 testing: serviceability/sa/ClhsdbScanOops.java serviceability/sa/TestHeapDumpForLargeArray.java gc/g1/ihop/TestIHOPErgo.java Each of them takes more than 5 minutes to complete and significantly increases the overall time to complete tier1. Please let me know if there are any reasons to run these tests in tier1 despite their execution time. webrev: http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8200187 Leonid 
From yasuenag at gmail.com Sun Mar 25 13:16:39 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sun, 25 Mar 2018 22:16:39 +0900 Subject: jhsdb jstack does not work with AppCDS Message-ID: <2c6c8678-5023-1b16-8159-e0c7e3a3c397@gmail.com> Hi all, I tried to get a stack dump with jhsdb jstack from an AppCDS-enabled process, but it failed as shown below: --------------- [ysuenaga@fc27 jdk-hs]$ /usr/local/jdk-10/bin/jhsdb jstack --pid 5614 Attaching to process ID 5614, please wait... Debugger attached successfully. Server compiler detected. JVM version is 10+46 Deadlock Detection: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x0000000800098690 at jdk.hotspot.agent/sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62) at jdk.hotspot.agent/sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(VirtualBaseConstructor.java:109) at jdk.hotspot.agent/sun.jvm.hotspot.oops.Metadata.instantiateWrapperFor(Metadata.java:73) at jdk.hotspot.agent/sun.jvm.hotspot.oops.Oop.getKlassForOopHandle(Oop.java:210) at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.newOop(ObjectHeap.java:252) at jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getThreadObj(JavaThread.java:359) at jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getCurrentParkBlocker(JavaThread.java:411) at jdk.hotspot.agent/sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:82) --------------- I tried changing SA to refer to Klass::_archived_mirror, but I got the same exception. How can I avoid this error? 
Thanks, Yasumasa From david.holmes at oracle.com Sun Mar 25 22:48:35 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 26 Mar 2018 08:48:35 +1000 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 In-Reply-To: References: Message-ID: Hi Leonid, On 24/03/2018 9:31 AM, Leonid Mesnik wrote: > Hi > > Could you please review the following fix, which excludes the following tests from > tier1 testing: > serviceability/sa/ClhsdbScanOops.java > serviceability/sa/TestHeapDumpForLargeArray.java > gc/g1/ihop/TestIHOPErgo.java > > Each of them takes more than 5 minutes to complete and significantly > increases the overall time to complete tier1. I'd need to see a much more detailed analysis of all the tests run, the order they run and the execution times to ascertain what impact this actually has on overall test execution time. But assuming 5 minutes is too long for tier1, this seems okay. But the tests must still be run in some other tier, and I'm not clear where that would be now? I would expect them to move to tier 3 perhaps, depending on what the time criteria for tier 3 are. Thanks, David > Please let me know if there are any reasons to run these tests in tier1 > despite their execution time. > > webrev: http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8200187 > > Leonid From yasuenag at gmail.com Mon Mar 26 00:46:34 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 26 Mar 2018 09:46:34 +0900 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable Message-ID: Hi all, Please review this change. JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 JDK-8134448 says SharedArchiveConfigFile accepts the output of `jcmd VM.stringtable -verbose`, but it cannot, because JDK-8059510 changed the version number to 1.1.
I think we should accept the version 1.1 stringtable. Thanks, Yasumasa From jini.george at oracle.com Mon Mar 26 01:39:56 2018 From: jini.george at oracle.com (Jini George) Date: Mon, 26 Mar 2018 07:09:56 +0530 Subject: jhsdb jstack does not work with AppCDS In-Reply-To: <2c6c8678-5023-1b16-8159-e0c7e3a3c397@gmail.com> References: <2c6c8678-5023-1b16-8159-e0c7e3a3c397@gmail.com> Message-ID: <75c4feb2-5fdd-7d4d-f2b5-1520ef821611@oracle.com> Hi Yasumasa, This is likely to be due to JDK-8174994 (for which I am working on a fix right now, and which I hope to send for review in a day or two). Thank you, Jini. On 3/25/2018 6:46 PM, Yasumasa Suenaga wrote: > Hi all, > > I tried to get jstack via jhsdb from an AppCDS-enabled process, but it > failed as shown below: > > --------------- > [ysuenaga at fc27 jdk-hs]$ /usr/local/jdk-10/bin/jhsdb jstack --pid 5614 > Attaching to process ID 5614, please wait... > Debugger attached successfully. > Server compiler detected. > JVM version is 10+46 > Deadlock Detection: > > sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of > address 0x0000000800098690 > at > jdk.hotspot.agent/sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62) > > at > jdk.hotspot.agent/sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(VirtualBaseConstructor.java:109) > > at > jdk.hotspot.agent/sun.jvm.hotspot.oops.Metadata.instantiateWrapperFor(Metadata.java:73) > > at > jdk.hotspot.agent/sun.jvm.hotspot.oops.Oop.getKlassForOopHandle(Oop.java:210) > > at > jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.newOop(ObjectHeap.java:252) > > at > jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getThreadObj(JavaThread.java:359) > > at > jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getCurrentParkBlocker(JavaThread.java:411) > >
at > jdk.hotspot.agent/sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:82) > > --------------- > > I attempted to change SA to refer to Klass::_archived_mirror, but I got the same > exception. > How can I avoid this error? > > > Thanks, > > Yasumasa > From yasuenag at gmail.com Mon Mar 26 04:08:40 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 26 Mar 2018 13:08:40 +0900 Subject: jhsdb jstack does not work with AppCDS In-Reply-To: <75c4feb2-5fdd-7d4d-f2b5-1520ef821611@oracle.com> References: <2c6c8678-5023-1b16-8159-e0c7e3a3c397@gmail.com> <75c4feb2-5fdd-7d4d-f2b5-1520ef821611@oracle.com> Message-ID: Thanks Jini, I will watch JDK-8174994. Yasumasa On 2018/03/26 10:39, Jini George wrote: > Hi Yasumasa, > > This is likely to be due to JDK-8174994 (for which I am working on a fix > right now, and which I hope to send for review in a day or two). > > Thank you, > Jini. > > > On 3/25/2018 6:46 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> I tried to get jstack via jhsdb from an AppCDS-enabled process, but it >> failed as shown below: >> >> --------------- >> [ysuenaga at fc27 jdk-hs]$ /usr/local/jdk-10/bin/jhsdb jstack --pid 5614 >> Attaching to process ID 5614, please wait... >> Debugger attached successfully. >> Server compiler detected. >> JVM version is 10+46 >> Deadlock Detection: >> >> sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of >> address 0x0000000800098690 >> at >> jdk.hotspot.agent/sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(VirtualBaseConstructor.java:109) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.oops.Metadata.instantiateWrapperFor(Metadata.java:73) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.oops.Oop.getKlassForOopHandle(Oop.java:210) >> >>
at >> jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.newOop(ObjectHeap.java:252) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getThreadObj(JavaThread.java:359) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.runtime.JavaThread.getCurrentParkBlocker(JavaThread.java:411) >> >> at >> jdk.hotspot.agent/sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:82) >> >> --------------- >> >> I attempted to change SA to refer to Klass::_archived_mirror, but I got the same >> exception. >> How can I avoid this error? >> >> >> Thanks, >> >> Yasumasa >> From ioi.lam at oracle.com Mon Mar 26 04:39:39 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 25 Mar 2018 21:39:39 -0700 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: References: Message-ID: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> Hi Yasumasa, The word "VERSION" actually means different things in different places. That's the confusing part. "jcmd VM.stringtable -verbose" prints out the version of the "string listing". However, the VERSION in SharedArchiveConfigFile means the "version of the config file". The current version is 1.0. The format of this file is: VERSION: 1.0 @SECTION: Symbol ....contents of "jcmd VM.symboltable -verbose" (**) @SECTION: String ....contents of "jcmd VM.stringtable -verbose" (**) (**) The first two lines of jcmd output (pid and VERSION) should be skipped. So the creation of the config file is somewhat manual -- you need to cut out the process id anyway (maybe we should add an option to jcmd to not print the process ID). I think a proper fix should clarify which VERSION we are looking for. We need a mechanism to ensure that the @SECTIONs for Symbol and String are in the correct format as expected by the JVM. How about changing the config file format to this: VERSION: 1.1 @SECTION: Symbol VERSION: 1.0
....contents of "jcmd VM.symboltable -verbose" (**) @SECTION: String VERSION: 1.1 ....contents of "jcmd VM.stringtable -verbose" (**) So we have 3 kinds of VERSIONS - for the config file, for the symbol section, and for the string section. What do you think? Thanks - Ioi On 3/25/18 5:46 PM, Yasumasa Suenaga wrote: > Hi all, > > Please review this change. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ > submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 > > > JDK-8134448 says SharedArchiveConfigFile accepts the output of `jcmd > VM.stringtable -verbose`, but it cannot, because JDK-8059510 > changed the version number to 1.1. > > I think we should accept the version 1.1 stringtable. > > > Thanks, > > Yasumasa From amit.sapre at oracle.com Mon Mar 26 10:28:46 2018 From: amit.sapre at oracle.com (Amit Sapre) Date: Mon, 26 Mar 2018 03:28:46 -0700 (PDT) Subject: RFR : JDK-8071367 - JMX: Remove SNMP support In-Reply-To: <559041cd-e695-3bc8-29a2-ae2d49797292@oracle.com> References: <5fed6078-624f-2c4b-b92c-5022fd07925c@oracle.com> <1dcc4fe3-8463-49c6-947d-ef31b92d51db@default> <559041cd-e695-3bc8-29a2-ae2d49797292@oracle.com> Message-ID: <325a2e31-5559-4819-a9f7-c7e0dadc59d5@default> Thanks Alan & Mandy for the reviews. Amit From: Alan Bateman Sent: Friday, March 23, 2018 9:54 PM To: Amit Sapre; Mandy Chung; serviceability-dev at openjdk.java.net; compiler-dev at openjdk.java.net Subject: Re: RFR : JDK-8071367 - JMX: Remove SNMP support On 23/03/2018 10:43, Amit Sapre wrote: Thanks all for the inputs. This webrev addresses the inputs: http://cr.openjdk.java.net/~asapre/webrev/2018/JDK-8071367/webrev.02/ I think you need to put {@code ... } around "-Dcom.sun.management.*", otherwise looks good to me. -Alan
From yasuenag at gmail.com Mon Mar 26 13:21:45 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 26 Mar 2018 22:21:45 +0900 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> References: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> Message-ID: <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> Hi Ioi, > I think a proper fix should clarify which VERSION we are looking for. I agree with you, but I cannot agree with the new format because it is difficult to understand two different "VERSION" meanings. IMHO, we can change the format as below: 1. Define the same VERSION for all @SECTIONs. This is the same as the current behavior. ---------------- VERSION: 1.0 @SECTION: Symbol ....contents of "jcmd VM.symboltable -verbose" (**) @SECTION: String ....contents of "jcmd VM.stringtable -verbose" (**) ---------------- 2. Define the same VERSION for all @SECTIONs except "String". ---------------- VERSION: 1.0 @SECTION: Symbol ....contents of "jcmd VM.symboltable -verbose" (**) @SECTION: String VERSION: 1.1 ....contents of "jcmd VM.stringtable -verbose" (**) ---------------- 3. Define a VERSION in each @SECTION. ---------------- @SECTION: Symbol VERSION: 1.0 ....contents of "jcmd VM.symboltable -verbose" (**) @SECTION: String VERSION: 1.1 ....contents of "jcmd VM.stringtable -verbose" (**) ---------------- How about this? Thanks, Yasumasa On 2018/03/26 13:39, Ioi Lam wrote: > Hi Yasumasa, > > The word "VERSION" actually means different things in different places. > That's the confusing part. > > "jcmd VM.stringtable -verbose" prints out the version of the > "string listing". > > However, > > the VERSION in SharedArchiveConfigFile means the "version of the config > file". The current version is 1.0. The format of this file is: > > VERSION: 1.0 > @SECTION: Symbol > ....contents of "jcmd VM.symboltable -verbose" (**) > @SECTION: String >
....contents of "jcmd VM.stringtable -verbose" (**) > > (**) The first two lines of jcmd output (pid and VERSION) should be skipped. > > > So the creation of the config file is somewhat manual -- you need to cut > out the process id anyway (maybe we should add an option to jcmd to not > print the process ID). > > I think a proper fix should clarify which VERSION we are looking for. We > need a mechanism to ensure that the @SECTIONs for Symbol and String are > in the correct format as expected by the JVM. > > How about changing the config file format to this: > > VERSION: 1.1 > @SECTION: Symbol > VERSION: 1.0 > ....contents of "jcmd VM.symboltable -verbose" (**) > @SECTION: String > VERSION: 1.1 > ....contents of "jcmd VM.stringtable -verbose" (**) > > > So we have 3 kinds of VERSIONS - for the config file, for the symbol > section, and for the string section. > > What do you think? > > Thanks > - Ioi > > > > > On 3/25/18 5:46 PM, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this change. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ >> submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 >> >> >> JDK-8134448 says SharedArchiveConfigFile accepts the output of `jcmd >> VM.stringtable -verbose`, but it cannot, because JDK-8059510 >> changed the version number to 1.1. >> >> I think we should accept the version 1.1 stringtable.
>> >> >> Thanks, >> >> Yasumasa From stefan.johansson at oracle.com Mon Mar 26 15:03:49 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 26 Mar 2018 17:03:49 +0200 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> Message-ID: <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> Hi Yasumasa, On 2018-03-22 11:35, Yasumasa Suenaga wrote: > Hi all, > > Please review this change: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 > webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ The fix seems to make things work as expected. I tested it manually and Mach5 also looks good. I have some comments regarding the patch. I think 'forcibly' should be renamed to something more descriptive. Naming is never easy but I think 'required' would be better, as in, this column is required and not allowed to print '-'. That would also render the code in ExpressionResolver.java to be: return new Literal(isRequired ? 0.0d : Double.NaN); I think that also better explains why we return 0 instead of NaN. I would also like to see the forcibly/required state moved into the Expression itself, that way we don't have to pass it around but can instead do: return new Literal(e.isRequired() ? 0.0d : Double.NaN); Thanks, Stefan > > After JDK-8153333, some jstat tests fail because GCT in the jstat > output is a dash (-) if the garbage collector is not a concurrent collector, > e.g. Serial GC. > I fixed it so that GCT can be calculated correctly. > > This change has been tested on Mach5 by Stefan.
> > > Thanks, > > Yasumasa From ioi.lam at oracle.com Mon Mar 26 20:17:45 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 26 Mar 2018 13:17:45 -0700 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> References: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> Message-ID: <3ea7c7cc-b7d6-afb9-b143-3d41a81fa6ee@oracle.com> On 3/26/18 6:21 AM, Yasumasa Suenaga wrote: > Hi Ioi, > >> I think a proper fix should clarify which VERSION we are looking for. > > I agree with you, but I cannot agree with the new format because it is > difficult to understand two different "VERSION" meanings. > > IMHO, we can change the format as below: > > > 1. Define the same VERSION for all @SECTIONs. This is the same as the current behavior. > ---------------- > VERSION: 1.0 > @SECTION: Symbol > ....contents of "jcmd VM.symboltable -verbose" (**) > @SECTION: String > ....contents of "jcmd VM.stringtable -verbose" (**) > ---------------- > > 2. Define the same VERSION for all @SECTIONs except "String". > ---------------- > VERSION: 1.0 > @SECTION: Symbol > ....contents of "jcmd VM.symboltable -verbose" (**) > @SECTION: String > VERSION: 1.1 > ....contents of "jcmd VM.stringtable -verbose" (**) > ---------------- > > 3. Define a VERSION in each @SECTION. > ---------------- > @SECTION: Symbol > VERSION: 1.0 > ....contents of "jcmd VM.symboltable -verbose" (**) > @SECTION: String > VERSION: 1.1 > ....contents of "jcmd VM.stringtable -verbose" (**) > ---------------- > > > How about this? > Maybe we should just keep the current behavior, and stick with 1.0 for the config file version. That way we don't need to make any code changes, and just need to clarify the user documentation. Thanks - Ioi > Thanks, > Yasumasa > > > > On 2018/03/26 13:39, Ioi Lam wrote: >> Hi Yasumasa, >> >> The word "VERSION" actually means different things in different places.
>> That's the confusing part. >> >> "jcmd VM.stringtable -verbose" prints out the version of the >> "string listing". >> >> However, >> >> the VERSION in SharedArchiveConfigFile means the "version of the config >> file". The current version is 1.0. The format of this file is: >> >> VERSION: 1.0 >> @SECTION: Symbol >> ....contents of "jcmd VM.symboltable -verbose" (**) >> @SECTION: String >> ....contents of "jcmd VM.stringtable -verbose" (**) >> >> (**) The first two lines of jcmd output (pid and VERSION) should be >> skipped. >> >> >> So the creation of the config file is somewhat manual -- you need to cut >> out the process id anyway (maybe we should add an option to jcmd to not >> print the process ID). >> >> I think a proper fix should clarify which VERSION we are looking for. We >> need a mechanism to ensure that the @SECTIONs for Symbol and String are >> in the correct format as expected by the JVM. >> >> How about changing the config file format to this: >> >> VERSION: 1.1 >> @SECTION: Symbol >> VERSION: 1.0 >> ....contents of "jcmd VM.symboltable -verbose" (**) >> @SECTION: String >> VERSION: 1.1 >> ....contents of "jcmd VM.stringtable -verbose" (**) >> >> >> So we have 3 kinds of VERSIONS - for the config file, for the symbol >> section, and for the string section. >> >> What do you think? >> >> Thanks >> - Ioi >> >> >> >> >> On 3/25/18 5:46 PM, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ >>> submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 >>> >>> >>> JDK-8134448 says SharedArchiveConfigFile accepts the output of `jcmd >>> VM.stringtable -verbose`, but it cannot, because JDK-8059510 >>> changed the version number to 1.1. >>> >>> I think we should accept the version 1.1 stringtable.
>>> >>> >>> Thanks, >>> >>> Yasumasa From serguei.spitsyn at oracle.com Mon Mar 26 21:31:15 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Mar 2018 14:31:15 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array In-Reply-To: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> References: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> Message-ID: <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> Hi Alex, It looks good to me. A couple of questions: - How does the test fail with the unfixed code? - It seems the following imports in the test are not needed: 34 import java.io.IOException; . . . 43 import java.util.Arrays; . . . 45 import jdk.test.lib.Utils; 46 import jdk.test.lib.process.ExitCode; 47 import jdk.test.lib.process.OutputAnalyzer; Thanks, Serguei On 3/22/18 16:18, Alex Menkov wrote: > Hi all, > > Please take a look at a simple fix for > https://bugs.openjdk.java.net/browse/JDK-8198393 > webrev: > http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ > > --alex From serguei.spitsyn at oracle.com Mon Mar 26 21:33:34 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Mar 2018 14:33:34 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array In-Reply-To: <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> References: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> Message-ID: <265a3662-cec8-ac75-f9c9-95a07365610b@oracle.com> Forgot to mention that the copyright comment in InstrumentationImpl.java needs an update. Thanks, Serguei On 3/26/18 14:31, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It looks good to me. > > A couple of questions: > > - How does the test fail with the unfixed code?
> > - It seems the following imports in the test are not needed: > > 34 import java.io.IOException; > . . . > 43 import java.util.Arrays; > . . . > 45 import jdk.test.lib.Utils; > 46 import jdk.test.lib.process.ExitCode; > 47 import jdk.test.lib.process.OutputAnalyzer; > > Thanks, > Serguei > > > On 3/22/18 16:18, Alex Menkov wrote: >> Hi all, >> >> Please take a look at a simple fix for >> https://bugs.openjdk.java.net/browse/JDK-8198393 >> webrev: >> http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ >> >> --alex > From leonid.mesnik at oracle.com Mon Mar 26 23:32:01 2018 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Mon, 26 Mar 2018 16:32:01 -0700 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 In-Reply-To: References: Message-ID: Hi There are no strict rules or time budgets for hotspot tiers. It is assumed that faster tests should be executed in earlier tiers. The order and total impact of these tests depend on how they are executed. Here are the execution times of the tier1 groups with and without the excluded tests: The whole :tier1 time reduced from 40 to 30 minutes (on dedicated HW with 32 cores) Time for :tier1_serviceability reduced from 15 to 6 min (in HS CI) Time for :tier1_gc_1 reduced from 12 to 8-10 min (in HS CI) The benefit of excluding gc/g1/ihop/TestIHOPErgo.java is not as significant. However, it is the only GC stress test executed in tier1, while the intention is not to run stress testing in tier1. Tests serviceability/sa/ClhsdbScanOops.java serviceability/sa/TestHeapDumpForLargeArray.java are now in hotspot_tier3_runtime, which includes all hotspot_serviceability tests that are not part of tier1. They are now executed as part of tier3, as are most serviceability tests. Test gc/g1/ihop/TestIHOPErgo.java is now in group :hotspot_gc only and is also marked as a "stress" test.
So it is executed with all other GC stress tests. Leonid > On Mar 25, 2018, at 3:48 PM, David Holmes wrote: > > Hi Leonid, > > On 24/03/2018 9:31 AM, Leonid Mesnik wrote: >> Hi >> Could you please review the following fix, which excludes the following tests from tier1 testing: >> serviceability/sa/ClhsdbScanOops.java >> serviceability/sa/TestHeapDumpForLargeArray.java >> gc/g1/ihop/TestIHOPErgo.java >> Each of them takes more than 5 minutes to complete and significantly increases the overall time to complete tier1. > > I'd need to see a much more detailed analysis of all the tests run, the order they run and the execution times to ascertain what impact this actually has on overall test execution time. > > But assuming 5 minutes is too long for tier1, this seems okay. But the tests must still be run in some other tier, and I'm not clear where that would be now? I would expect them to move to tier 3 perhaps, depending on what the time criteria for tier 3 are. > > Thanks, > David > >> Please let me know if there are any reasons to run these tests in tier1 despite their execution time. >> webrev: http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8200187 >> Leonid From alexey.menkov at oracle.com Mon Mar 26 23:36:20 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 26 Mar 2018 16:36:20 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array In-Reply-To: <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> References: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> Message-ID: Hi Serguei, updated webrev: http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev.02/ - updated copyright in InstrumentationImpl.java - removed unused imports in the test On 03/26/2018 14:31, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It looks good to me.
> > A couple of questions: > > - How does the test fail with the unfixed code? As described in the jira issue: stdout: [FATAL ERROR in native method: processing of -javaagent failed ]; stderr: [java.lang.reflect.InvocationTargetException at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513) at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:525) Caused by: java.lang.NullPointerException at java.instrument/sun.instrument.InstrumentationImpl.retransformClasses0(Native Method) at java.instrument/sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:167) at RetransformClassesZeroLength$Agent.premain(RetransformClassesZeroLength.java:77) ... 6 more *** java.lang.instrument ASSERTION FAILED ***: "numClasses != 0" at line: 1146 --alex > > - It seems the following imports in the test are not needed: > > 34 import java.io.IOException; > . . . > 43 import java.util.Arrays; > . . . > 45 import jdk.test.lib.Utils; > 46 import jdk.test.lib.process.ExitCode; >
47 import jdk.test.lib.process.OutputAnalyzer; > > Thanks, > Serguei > > > On 3/22/18 16:18, Alex Menkov wrote: >> Hi all, >> >> Please take a look at a simple fix for >> https://bugs.openjdk.java.net/browse/JDK-8198393 >> webrev: >> http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ >> >> --alex > From serguei.spitsyn at oracle.com Mon Mar 26 23:44:17 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Mar 2018 16:44:17 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array In-Reply-To: References: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> Message-ID: <5de0e6cd-3185-f172-1cf2-4b4b146e7ff3@oracle.com> Hi Alex, On 3/26/18 16:36, Alex Menkov wrote: > Hi Serguei, > > updated webrev: > http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev.02/ > > > - updated copyright in InstrumentationImpl.java > - removed unused imports in the test Thank you for the update! > On 03/26/2018 14:31, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> It looks good to me. >> >> A couple of questions: >> >> - How does the test fail with the unfixed code? > > As described in the jira issue: > stdout: [FATAL ERROR in native method: processing of -javaagent failed > ]; > stderr: [java.lang.reflect.InvocationTargetException > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at > java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513) >
at > java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:525) > Caused by: java.lang.NullPointerException > at > java.instrument/sun.instrument.InstrumentationImpl.retransformClasses0(Native > Method) > at > java.instrument/sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:167) > at > RetransformClassesZeroLength$Agent.premain(RetransformClassesZeroLength.java:77) > ... 6 more > *** java.lang.instrument ASSERTION FAILED ***: "numClasses != 0" at > line: 1146 Great. Reviewed. Thanks, Serguei > > --alex > >> >> - It seems the following imports in the test are not needed: >> >> 34 import java.io.IOException; >> . . . >> 43 import java.util.Arrays; >> . . . >> 45 import jdk.test.lib.Utils; >> 46 import jdk.test.lib.process.ExitCode; >> 47 import jdk.test.lib.process.OutputAnalyzer; >> >> Thanks, >> Serguei >> >> >> On 3/22/18 16:18, Alex Menkov wrote: >>> Hi all, >>> >>> Please take a look at a simple fix for >>> https://bugs.openjdk.java.net/browse/JDK-8198393 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ >>> >>> >>> --alex >> From chris.plummer at oracle.com Tue Mar 27 00:41:44 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 26 Mar 2018 17:41:44 -0700 Subject: RFR: JDK-8198393: Instrumentation.retransformClasses() throws NullPointerException when handling a zero-length array In-Reply-To: References: <880930e8-d4f2-63ab-a6ed-126e70277f23@oracle.com> <53ab97fe-7ec8-a321-012c-baf5e8c2c628@oracle.com> Message-ID: <40407803-fc5c-0618-2843-95ed30db29dc@oracle.com> Looks good.
thanks, Chris On 3/26/18 4:36 PM, Alex Menkov wrote: > Hi Serguei, > > updated webrev: > http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev.02/ > > > - updated copyright in InstrumentationImpl.java > - removed unused imports in the test > > > On 03/26/2018 14:31, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> It looks good to me. >> >> A couple of questions: >> >> - How does the test fail with the unfixed code? > > As described in the jira issue: > stdout: [FATAL ERROR in native method: processing of -javaagent failed > ]; > stderr: [java.lang.reflect.InvocationTargetException > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at > java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513) > at > java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:525) > Caused by: java.lang.NullPointerException > at > java.instrument/sun.instrument.InstrumentationImpl.retransformClasses0(Native > Method) > at > java.instrument/sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:167) > at > RetransformClassesZeroLength$Agent.premain(RetransformClassesZeroLength.java:77) > ... 6 more > *** java.lang.instrument ASSERTION FAILED ***: "numClasses != 0" at > line: 1146 > > --alex > >> >> - It seems the following imports in the test are not needed: >> >> 34 import java.io.IOException; >> . . . >> 43 import java.util.Arrays; >> . . . >> 45 import jdk.test.lib.Utils; >> 46 import jdk.test.lib.process.ExitCode; >>
47 import jdk.test.lib.process.OutputAnalyzer; >> >> Thanks, >> Serguei >> >> >> On 3/22/18 16:18, Alex Menkov wrote: >>> Hi all, >>> >>> Please take a look at a simple fix for >>> https://bugs.openjdk.java.net/browse/JDK-8198393 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/retransformClassesZeroLength/webrev/ >>> >>> >>> --alex >> From chris.plummer at oracle.com Tue Mar 27 00:48:51 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 26 Mar 2018 17:48:51 -0700 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 In-Reply-To: References: Message-ID: <95567282-45e1-7c8c-9404-43da3eb6f9d4@oracle.com> Hi Leonid, The exclusion of these 3 tests from tier1 looks ok to me. It did lead me to some other questions however. The first was that if you exclude them from tier1, how do they get included in a later tier. The answer, for the SA tests, is that all of serviceability is included in tier2. So the next question was whether or not we want to repeat testing in higher tiers. I suppose since higher tiers include new platforms and build options, the answer is yes, but I hope the upper tiers are constructed well enough that we aren't wasting too much time repeating the same test(s) on the same platforms with the same options. thanks, Chris On 3/26/18 4:32 PM, Leonid Mesnik wrote: > Hi > > There are no strict rules or time budgets for hotspot tiers. It is assumed that faster tests should be executed in earlier tiers. The order and total impact of these tests depend on how they are executed. > > Here are the execution times of the tier1 groups with and without the excluded tests: > The whole :tier1 time reduced from 40 to 30 minutes (on dedicated HW with 32 cores) > Time for :tier1_serviceability reduced from 15 to 6 min (in HS CI) > Time for :tier1_gc_1 reduced from 12 to 8-10 min (in HS CI) > > The benefit of excluding gc/g1/ihop/TestIHOPErgo.java is not as significant.
> However, it is the only GC stress test executed in tier1, while the intention is not to run stress testing in tier1.
>
> Tests
> serviceability/sa/ClhsdbScanOops.java
> serviceability/sa/TestHeapDumpForLargeArray.java
> are now in hotspot_tier3_runtime, which includes all hotspot_serviceability tests which are not a part of tier1. They are now executed as a part of tier3, as are most of the serviceability tests.
>
> Test gc/g1/ihop/TestIHOPErgo.java is now in group :hotspot_gc only and is also marked as a "stress" test. So it is executed with all other GC stress tests.
>
> Leonid
>
>> On Mar 25, 2018, at 3:48 PM, David Holmes wrote:
>>
>> Hi Leonid,
>>
>> On 24/03/2018 9:31 AM, Leonid Mesnik wrote:
>>> Hi
>>> Could you please review the following fix, which excludes the following tests from tier1 testing:
>>> serviceability/sa/ClhsdbScanOops.java
>>> serviceability/sa/TestHeapDumpForLargeArray.java
>>> gc/g1/ihop/TestIHOPErgo.java
>>> Each of them takes more than 5 minutes to complete and significantly increases the overall time to complete tier1.
>> I'd need to see a much more detailed analysis of all the tests run, the order they run and the execution times to ascertain what impact this actually has on overall test execution time.
>>
>> But assuming 5 minutes is too long for tier1, this seems okay. But the tests must still be run in some other tier, and I'm not clear where that would be now? I would expect them to move to tier 3 perhaps, depending on what the time criteria for tier 3 is.
>>
>> Thanks,
>> David
>>
>>> Please let me know if there are any reasons to run these tests in tier1 despite their execution time.
>>> webrev:http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8200187 >>> Leonid From david.holmes at oracle.com Tue Mar 27 00:53:32 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Mar 2018 10:53:32 +1000 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 In-Reply-To: References: Message-ID: Thanks for that Leonid. As long as they are still included somewhere that is fine. David On 27/03/2018 9:32 AM, Leonid Mesnik wrote: > Hi > > There is no strict rules and time budgets for hotspot tiers. It is assumed that faster tests should be executed earlier tiers. The order and total impact of of these test depends on how they are executed. > > Here is time of execution of tier1 groups with and without these excluded tests: > The whole :tier1 time reduced from 40 to 30 minutes (on dedicated HW with 32 core) > Time for :tier1_serviceability reduced from 15 to 6 min (in HS CI) > Time for :tier1_gc_1 reduced from 12 to 8-10 min (in HS CI) > > The benefits of exclusion of gc/g1/ihop/TestIHOPErgo.java are not so significant. However it is the only one GC stress test executed in tier1 while intention is to don?t run stress testing in tier1. > > Tests > serviceability/sa/ClhsdbScanOops.java > serviceability/sa/TestHeapDumpForLargeArray.java > are now hotspot_tier3_runtime which includes all hotspot_serviceability tests which are not a part of tier1. They are executed as a part of tier3 now as well as most of serviceability tests. > > Test gc/g1/ihop/TestIHOPErgo.java in now in group :hotspot_gc only and also it is marked as ?stress? test. 
So it is executed with all other GC stress tests > > Leonid > >> On Mar 25, 2018, at 3:48 PM, David Holmes wrote: >> >> Hi Leonid, >> >> On 24/03/2018 9:31 AM, Leonid Mesnik wrote: >>> Hi >>> Could you please review following fix which exclude following tests from tier1 testing: >>> serviceability/sa/ClhsdbScanOops.java >>> serviceability/sa/TestHeapDumpForLargeArray.java >>> gc/g1/ihop/TestIHOPErgo.java >>> Each of them takes more then 5 minutes to complete and significantly increase overall time to complete tier1. >> >> I'd need to see a much more detailed analysis of all the tests run, the order they run and the execution times to ascertain what impact this actually has on overall test execution time. >> >> But assuming 5 minutes is too long for tier1, this seems okay. But the tests must still be run in some other tier, and I'm not clear where that would be now? I would expect them to move to tier 3 perhaps, depending on what the time criteria for tier 3 is. >> >> Thanks, >> David >> >>> Please let me know if there are any reasons to run these tests in tier1 despite on their execution time. >>> webrev:http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8200187 >>> Leonid > From yasuenag at gmail.com Tue Mar 27 08:56:37 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 27 Mar 2018 17:56:37 +0900 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> Message-ID: Hi Stefan, Thank you for your comment. 
I updated webrev: webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 Thanks, Yasumasa 2018-03-27 0:03 GMT+09:00 Stefan Johansson : > Hi Yasumasa, > > On 2018-03-22 11:35, Yasumasa Suenaga wrote: >> >> Hi all, >> >> Please review this change: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 >> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ > > The fix seems to make things to work as expected. Manually tested it and > Mach5 also looks good. > > I have some comments regarding the patch. I think 'forcibly' should be > rename to something more descriptive. Naming is never easy but I think > 'required' would be better, as in, this column is required and not allowed > to print '-'. That would also render the code in ExpressionResolver.java to > be: > return new Literal(isRequired ? 0.0d : Double.NaN); > I think that also better explains why we return 0 instead of NaN. > > I would also like to see the forcibly/required state moved into the > Expression it self, that way we don't have to pass it around but can instead > do: > return new Literal(e.isRequired() ? 0.0d : Double.NaN); > > Thanks, > Stefan > > >> >> After JDK-8153333, some jstat tests are failed because GCT in jstat output >> is dash (-) if garbage collector is not concurrent collector e.g. Serial GC. >> I fixed that GCT can be calculated correctly. >> >> This change has been tested on Mach5 by Stefan. 
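Editor's note: Stefan's suggestion above (keep the "required" state on the Expression and fall back to 0.0 instead of NaN) can be sketched as follows. This is only an illustration under stated assumptions, not the code in the webrev: the class RequiredColumnSketch, its nested Expression stand-in, and the helper names missingCounterValue and render are hypothetical, and the rendering of NaN as "-" is inferred from the review comment that a required column is "not allowed to print '-'".

```java
public class RequiredColumnSketch {
    /** Hypothetical stand-in for jstat's Expression; only the flag matters here. */
    static class Expression {
        private boolean required;

        boolean isRequired() {
            return required;
        }

        void setRequired(boolean required) {
            this.required = required;
        }
    }

    /** Fallback value when the counter behind an expression does not exist. */
    static double missingCounterValue(Expression e) {
        // A required column falls back to 0.0 so it can still take part in sums
        // such as GCT; an optional column stays NaN.
        return e.isRequired() ? 0.0d : Double.NaN;
    }

    /** Assumed rendering rule: NaN is shown as a dash, anything else as a number. */
    static String render(double value) {
        return Double.isNaN(value)
                ? "-"
                : String.format(java.util.Locale.ROOT, "%.3f", value);
    }

    public static void main(String[] args) {
        Expression optional = new Expression();
        Expression required = new Expression();
        required.setRequired(true);
        System.out.println(render(missingCounterValue(optional)));
        System.out.println(render(missingCounterValue(required)));
    }
}
```

Keeping the flag on the expression means callers do not have to pass a separate boolean around, which is the design point made in the review.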
>> >> >> Thanks, >> >> Yasumasa > > From yasuenag at gmail.com Tue Mar 27 08:59:00 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 27 Mar 2018 17:59:00 +0900 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: <3ea7c7cc-b7d6-afb9-b143-3d41a81fa6ee@oracle.com> References: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> <3ea7c7cc-b7d6-afb9-b143-3d41a81fa6ee@oracle.com> Message-ID: Hi Ioi, If my suggestion (1. in my previous email) is not accepted, I think it should be documented. Should I close this JBS ticket? Thanks, Yasumasa 2018-03-27 5:17 GMT+09:00 Ioi Lam : > > > On 3/26/18 6:21 AM, Yasumasa Suenaga wrote: >> >> Hi Ioi, >> >>> I think a proper fix should clarify which VERSION we are looking for. >> >> >> I agree with you, but I cannot agree with new format because it is >> difficult to understand two different "VERSION" meanings. >> >> IMHO, we can change the format as below: >> >> >> 1. Define same VERSION to all @SECTION. It is same of current behavior. >> ---------------- >> VERSION: 1.0 >> @SECTION: Symbol >> ....contents of "jcmd VM.symboltable -verbose" (**) >> @SECTION: String >> ....contents of "jcmd VM.stringtable -verbose"(**) >> ---------------- >> >> 2. Define same VERSION to all @SECTION except "String". >> ---------------- >> VERSION: 1.0 >> @SECTION: Symbol >> ....contents of "jcmd VM.symboltable -verbose" (**) >> @SECTION: String >> VERSION: 1.1 >> ....contents of "jcmd VM.stringtable -verbose"(**) >> ---------------- >> >> 3. Define VERSIONs in each @SECTIONs. >> ---------------- >> @SECTION: Symbol >> VERSION: 1.0 >> ....contents of "jcmd VM.symboltable -verbose" (**) >> @SECTION: String >> VERSION: 1.1 >> ....contents of "jcmd VM.stringtable -verbose"(**) >> ---------------- >> >> >> How about this? >> > Maybe we should just keep the current behavior, and stick with 1.0 for the > config file version. 
> That way we don't need to make any code changes, and
> just need to clarify the user documentation.
>
> Thanks
> - Ioi
>
>
>> Thanks,
>> Yasumasa
>>
>>
>>
>> On 2018/03/26 13:39, Ioi Lam wrote:
>>>
>>> Hi Yasumasa,
>>>
>>> The word "VERSION" actually means different things in different places.
>>> That's the confusing part.
>>>
>>> "jcmd VM.stringtable -verbose" prints out the version of the
>>> "string listing".
>>>
>>> However,
>>>
>>> The VERSION in SharedArchiveConfigFile means the "version of the config
>>> file". The current version is 1.0. The format of this file is:
>>>
>>>     VERSION: 1.0
>>>     @SECTION: Symbol
>>>     ....contents of "jcmd VM.symboltable -verbose" (**)
>>>     @SECTION: String
>>>     ....contents of "jcmd VM.stringtable -verbose"(**)
>>>
>>> (**) The first two lines of jcmd output (pid and VERSION) should be
>>> skipped.
>>>
>>>
>>> So the creation of the config file is somewhat manual -- you need to cut
>>> out the process id anyway (maybe we should add an option to jcmd to not
>>> print the process ID).
>>>
>>> I think a proper fix should clarify which VERSION we are looking for. We
>>> need a mechanism to ensure that the @SECTIONs for Symbol and String are
>>> in the correct format as expected by the JVM.
>>>
>>> How about changing the config file format to this:
>>>
>>>     VERSION: 1.1
>>>     @SECTION: Symbol
>>>     VERSION: 1.0
>>>     ....contents of "jcmd VM.symboltable -verbose" (**)
>>>     @SECTION: String
>>>     VERSION: 1.1
>>>     ....contents of "jcmd VM.stringtable -verbose" (**)
>>>
>>>
>>> So we have 3 kinds of VERSIONS - for the config file, for the symbol
>>> section, and for the string section.
>>>
>>> What do you think?
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>
>>>
>>> On 3/25/18 5:46 PM, Yasumasa Suenaga wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please review this change.
>>>>
>>>>       JBS: https://bugs.openjdk.java.net/browse/JDK-8200204
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ >>>> submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 >>>> >>>> >>>> JDK-8134448 says SharedArchiveConfigFile accepts output of `jcmd >>>> VM.stringtable -verbose` , but it could not because JDK-8059510 has >>>> changed version number to 1.1 . >>>> >>>> I think we should accept version 1.1 stringtable. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa > > From stefan.johansson at oracle.com Tue Mar 27 13:45:02 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 27 Mar 2018 15:45:02 +0200 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> Message-ID: <85853429-a520-1782-40e4-e05776aa639d@oracle.com> Hi Yasumasa, On 2018-03-27 10:56, Yasumasa Suenaga wrote: > Hi Stefan, > > Thank you for your comment. > I updated webrev: > > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ I think the usage of Optional in Expression.setRequired(bool) is a bit unnecessary. It will create temporary objects and there is no benefit from just doing two simple if-statements. I also ran this patch (and the one using forcibly) on my single core VM and realized that this fix will have to include some awk-file updates to make the test in test/jdk/sun/tools/jstat pass when Serial in chosen as the default collector. The tests in test/jdk/sun/tools/jstatd/ are fine. 
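Editor's note: the failure named in the subject line (NumberFormatException: Unparseable number: "-") arises when a consumer of jstat output treats every column as a number. The jstat tests themselves are driven by awk files, so the following Java sketch is not the awk change under review; it only illustrates, under stated assumptions, what "handle '-' as a legal field" means. The class JstatFieldSketch and the method parseField are hypothetical names, and the accepted number format (digits with an optional fraction) is an assumption.

```java
import java.util.OptionalDouble;
import java.util.regex.Pattern;

public class JstatFieldSketch {
    // Assumed shape of a numeric jstat field: digits with an optional fraction.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    /**
     * Parses one whitespace-separated field of a jstat row.
     * A lone dash means "no value for this collector" and yields an empty result
     * instead of a NumberFormatException.
     */
    static OptionalDouble parseField(String field) {
        if (field.equals("-")) {
            return OptionalDouble.empty();
        }
        if (!NUMBER.matcher(field).matches()) {
            throw new IllegalArgumentException("unexpected jstat field: " + field);
        }
        return OptionalDouble.of(Double.parseDouble(field));
    }

    public static void main(String[] args) {
        for (String field : new String[] {"2", "0.007", "-"}) {
            System.out.println(field + " -> " + parseField(field));
        }
    }
}
```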
Thanks, Stefan > submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 > > > Thanks, > > Yasumasa > > > > 2018-03-27 0:03 GMT+09:00 Stefan Johansson : >> Hi Yasumasa, >> >> On 2018-03-22 11:35, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 >>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ >> The fix seems to make things to work as expected. Manually tested it and >> Mach5 also looks good. >> >> I have some comments regarding the patch. I think 'forcibly' should be >> rename to something more descriptive. Naming is never easy but I think >> 'required' would be better, as in, this column is required and not allowed >> to print '-'. That would also render the code in ExpressionResolver.java to >> be: >> return new Literal(isRequired ? 0.0d : Double.NaN); >> I think that also better explains why we return 0 instead of NaN. >> >> I would also like to see the forcibly/required state moved into the >> Expression it self, that way we don't have to pass it around but can instead >> do: >> return new Literal(e.isRequired() ? 0.0d : Double.NaN); >> >> Thanks, >> Stefan >> >> >>> After JDK-8153333, some jstat tests are failed because GCT in jstat output >>> is dash (-) if garbage collector is not concurrent collector e.g. Serial GC. >>> I fixed that GCT can be calculated correctly. >>> >>> This change has been tested on Mach5 by Stefan. 
>>> >>> >>> Thanks, >>> >>> Yasumasa >> From yasuenag at gmail.com Tue Mar 27 14:44:15 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 27 Mar 2018 23:44:15 +0900 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: <85853429-a520-1782-40e4-e05776aa639d@oracle.com> References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> Message-ID: <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> Hi Stefan, On 2018/03/27 22:45, Stefan Johansson wrote: > Hi Yasumasa, > > On 2018-03-27 10:56, Yasumasa Suenaga wrote: >> Hi Stefan, >> >> Thank you for your comment. >> I updated webrev: >> >> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ > I think the usage of Optional in Expression.setRequired(bool) is a bit unnecessary. It will create temporary objects and there is no benefit from just doing two simple if-statements. I fixed it in new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ > I also ran this patch (and the one using forcibly) on my single core VM and realized that this fix will have to include some awk-file updates to make the test in test/jdk/sun/tools/jstat pass when Serial in chosen as the default collector. The tests in test/jdk/sun/tools/jstatd/ are fine. Can you share the failure report? If it occurs in jstatClassloadOutput1.sh, it relates to JDK-8173942. Thanks, Yasumasa > Thanks, > Stefan >> ?? submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 >> >> >> Thanks, >> >> Yasumasa >> >> >> >> 2018-03-27 0:03 GMT+09:00 Stefan Johansson : >>> Hi Yasumasa, >>> >>> On 2018-03-22 11:35, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> Please review this change: >>>> >>>> ??? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 >>>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ >>> The fix seems to make things to work as expected. Manually tested it and >>> Mach5 also looks good. >>> >>> I have some comments regarding the patch. I think 'forcibly' should be >>> rename to something more descriptive. Naming is never easy but I think >>> 'required' would be better, as in, this column is required and not allowed >>> to print '-'. That would also render the code in ExpressionResolver.java to >>> be: >>> ?? return new Literal(isRequired ? 0.0d : Double.NaN); >>> I think that also better explains why we return 0 instead of NaN. >>> >>> I would also like to see the forcibly/required state moved into the >>> Expression it self, that way we don't have to pass it around but can instead >>> do: >>> ?? return new Literal(e.isRequired() ? 0.0d : Double.NaN); >>> >>> Thanks, >>> Stefan >>> >>> >>>> After JDK-8153333, some jstat tests are failed because GCT in jstat output >>>> is dash (-) if garbage collector is not concurrent collector e.g. Serial GC. >>>> I fixed that GCT can be calculated correctly. >>>> >>>> This change has been tested on Mach5 by Stefan. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>> > From stefan.johansson at oracle.com Tue Mar 27 15:29:39 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 27 Mar 2018 17:29:39 +0200 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> Message-ID: On 2018-03-27 16:44, Yasumasa Suenaga wrote: > Hi Stefan, > > On 2018/03/27 22:45, Stefan Johansson wrote: >> Hi Yasumasa, >> >> On 2018-03-27 10:56, Yasumasa Suenaga wrote: >>> Hi Stefan, >>> >>> Thank you for your comment. >>> I updated webrev: >>> >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ >> I think the usage of Optional in Expression.setRequired(bool) is a >> bit unnecessary. It will create temporary objects and there is no >> benefit from just doing two simple if-statements. > > I fixed it in new webrev: > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ > > >> I also ran this patch (and the one using forcibly) on my single core >> VM and realized that this fix will have to include some awk-file >> updates to make the test in test/jdk/sun/tools/jstat pass when Serial >> in chosen as the default collector. The tests in >> test/jdk/sun/tools/jstatd/ are fine. > > Can you share the failure report? It relates to all tests that display the the CGC and the CGCT columns, for example in jstatGCOutput1.sh: ?S0C??? S1C??? S0U??? S1U????? EC?????? EU OC???????? OU?????? MC???? MU??? CCSC?? CCSU?? YGC???? YGCT FGC??? FGCT??? CGC??? CGCT???? GCT 256.0? 256.0? 254.0?? 0.0??? 2176.0?? 1025.0??? 5504.0 920.5??? 7168.0 6839.7 768.0? 602.8?????? 2??? 0.007?? 0 0.000?? -????????? -??? 
0.007

The awk regex needs to be updated to handle '-' for these tests:

test: sun/tools/jstat/jstatGcCapacityOutput1.sh
Failed. Execution failed: exit code 1

test: sun/tools/jstat/jstatGcMetaCapacityOutput1.sh
Failed. Execution failed: exit code 1

test: sun/tools/jstat/jstatGcNewCapacityOutput1.sh
Failed. Execution failed: exit code 1

test: sun/tools/jstat/jstatGcOldCapacityOutput1.sh
Failed. Execution failed: exit code 1

test: sun/tools/jstat/jstatGcOldOutput1.sh
Failed. Execution failed: exit code 1

test: sun/tools/jstat/jstatGcOutput1.sh
Failed. Execution failed: exit code 1

> If it occurs in jstatClassloadOutput1.sh, it relates to JDK-8173942.
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Stefan
>>>    submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>
>>> 2018-03-27 0:03 GMT+09:00 Stefan Johansson :
>>>> Hi Yasumasa,
>>>>
>>>> On 2018-03-22 11:35, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this change:
>>>>>
>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8199519
>>>>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/
>>>> The fix seems to make things work as expected. Manually tested
>>>> it and
>>>> Mach5 also looks good.
>>>>
>>>> I have some comments regarding the patch. I think 'forcibly' should be
>>>> renamed to something more descriptive. Naming is never easy but I think
>>>> 'required' would be better, as in, this column is required and not
>>>> allowed
>>>> to print '-'. That would also render the code in
>>>> ExpressionResolver.java to
>>>> be:
>>>>    return new Literal(isRequired ? 0.0d : Double.NaN);
>>>> I think that also better explains why we return 0 instead of NaN.
>>>>
>>>> I would also like to see the forcibly/required state moved into the
>>>> Expression itself, that way we don't have to pass it around but
>>>> can instead
>>>> do:
>>>>    return new Literal(e.isRequired() ?
0.0d : Double.NaN); >>>> >>>> Thanks, >>>> Stefan >>>> >>>> >>>>> After JDK-8153333, some jstat tests are failed because GCT in >>>>> jstat output >>>>> is dash (-) if garbage collector is not concurrent collector e.g. >>>>> Serial GC. >>>>> I fixed that GCT can be calculated correctly. >>>>> >>>>> This change has been tested on Mach5 by Stefan. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>> >> From weijun.wang at oracle.com Tue Mar 27 23:52:40 2018 From: weijun.wang at oracle.com (Weijun Wang) Date: Wed, 28 Mar 2018 07:52:40 +0800 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <32db4add-8a59-03db-0074-e7df0aed14b8@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> <32db4add-8a59-03db-0074-e7df0aed14b8@oracle.com> Message-ID: > On Mar 24, 2018, at 6:03 AM, Magnus Ihse Bursie wrote: > > https://bugs.openjdk.java.net/browse/JDK-8200193 -- for jdk.security.auth There is only one function to export and it already has JNIEXPORT, so you can just remove the new $(LIBJAAS_CFLAGS) [1]. Are you going to update your webrev? Thanks Max [1] http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01/make/lib/Lib-jdk.security.auth.gmk.sdiff.html From leonid.mesnik at oracle.com Wed Mar 28 00:22:43 2018 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 27 Mar 2018 17:22:43 -0700 Subject: RFR(XS): 8200187: Exclude 3 long-running tests from tier1 In-Reply-To: <95567282-45e1-7c8c-9404-43da3eb6f9d4@oracle.com> References: <95567282-45e1-7c8c-9404-43da3eb6f9d4@oracle.com> Message-ID: <5B7068DF-7FFC-41A8-AFC5-4D8DFDFE2B3D@oracle.com> Chris, David Thank you for review. > On Mar 26, 2018, at 5:48 PM, Chris Plummer wrote: > > Hi Leonid, > > The exclusion of these 3 tests from tier1 looks ok to me. It did lead me to some other questions however. 
> The first was that if you exclude them from tier1, how do they get included in a later tier. The answer, for the SA tests, is that all of serviceability is included in tier2. So the next question was whether or not we want to repeat testing in higher tiers. I suppose since higher tiers include new platforms and build options, the answer is yes, but I hope the upper tiers are constructed well enough that we aren't wasting too much time repeating the same test(s) on the same platforms with the same options.
>

It depends on the component. For Runtime, the groups are organized by excluding previous tiers, like:

hotspot_tier2_runtime = \
  runtime/ \
  serviceability/ \
 -runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java \
 -runtime/Thread/TestThreadDumpMonitorContention.java \
 -runtime/containers/ \
 -:tier1_runtime \
 -:tier1_serviceability \
 -:hotspot_tier2_runtime_platform_agnostic
...
hotspot_tier3_runtime = \
  runtime/ \
  serviceability/ \
 -runtime/containers/ \
 -:tier1_runtime \
 -:tier1_serviceability \
 -:hotspot_tier2_runtime_platform_agnostic \
 -:hotspot_tier2_runtime
...

So once a test is excluded from tier1, it is automatically executed as a part of tier 2/3, while these tiers don't execute tests from tier1.

The other teams organize their testing slightly differently. However, the whole idea is that a test excluded from tier1 should be executed in later tiers, and a test should not be re-executed in fast tiers once it has already been run. However, during testing like PIT we might just run the whole hotspot_[component] team even if some subset was always run in CI.

Leonid

> thanks,
>
> Chris
>
> On 3/26/18 4:32 PM, Leonid Mesnik wrote:
>> Hi
>>
>> There are no strict rules or time budgets for hotspot tiers. It is assumed that faster tests should be executed in earlier tiers. The order and total impact of these tests depend on how they are executed.
>> >> Here is time of execution of tier1 groups with and without these excluded tests: >> The whole :tier1 time reduced from 40 to 30 minutes (on dedicated HW with 32 core) >> Time for :tier1_serviceability reduced from 15 to 6 min (in HS CI) >> Time for :tier1_gc_1 reduced from 12 to 8-10 min (in HS CI) >> >> The benefits of exclusion of gc/g1/ihop/TestIHOPErgo.java are not so significant. However it is the only one GC stress test executed in tier1 while intention is to don?t run stress testing in tier1. >> >> Tests >> serviceability/sa/ClhsdbScanOops.java >> serviceability/sa/TestHeapDumpForLargeArray.java >> are now hotspot_tier3_runtime which includes all hotspot_serviceability tests which are not a part of tier1. They are executed as a part of tier3 now as well as most of serviceability tests. >> >> Test gc/g1/ihop/TestIHOPErgo.java in now in group :hotspot_gc only and also it is marked as ?stress? test. So it is executed with all other GC stress tests >> >> Leonid >> >>> On Mar 25, 2018, at 3:48 PM, David Holmes wrote: >>> >>> Hi Leonid, >>> >>> On 24/03/2018 9:31 AM, Leonid Mesnik wrote: >>>> Hi >>>> Could you please review following fix which exclude following tests from tier1 testing: >>>> serviceability/sa/ClhsdbScanOops.java >>>> serviceability/sa/TestHeapDumpForLargeArray.java >>>> gc/g1/ihop/TestIHOPErgo.java >>>> Each of them takes more then 5 minutes to complete and significantly increase overall time to complete tier1. >>> I'd need to see a much more detailed analysis of all the tests run, the order they run and the execution times to ascertain what impact this actually has on overall test execution time. >>> >>> But assuming 5 minutes is too long for tier1, this seems okay. But the tests must still be run in some other tier, and I'm not clear where that would be now? I would expect them to move to tier 3 perhaps, depending on what the time criteria for tier 3 is. 
>>> >>> Thanks, >>> David >>> >>>> Please let me know if there are any reasons to run these tests in tier1 despite on their execution time. >>>> webrev:http://cr.openjdk.java.net/~lmesnik/8200187/webrev.00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8200187 >>>> Leonid > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Wed Mar 28 04:04:47 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 28 Mar 2018 13:04:47 +0900 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> Message-ID: Hi Stefan, Thank you for sharing your report! I could reproduce them on my VM. I've fixed them in new webrev, and it works fine on my environment. Could you check again? http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.03/ Thanks, Yasumasa 2018-03-28 0:29 GMT+09:00 Stefan Johansson : > > > On 2018-03-27 16:44, Yasumasa Suenaga wrote: >> >> Hi Stefan, >> >> On 2018/03/27 22:45, Stefan Johansson wrote: >>> >>> Hi Yasumasa, >>> >>> On 2018-03-27 10:56, Yasumasa Suenaga wrote: >>>> >>>> Hi Stefan, >>>> >>>> Thank you for your comment. >>>> I updated webrev: >>>> >>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ >>> >>> I think the usage of Optional in Expression.setRequired(bool) is a bit >>> unnecessary. It will create temporary objects and there is no benefit from >>> just doing two simple if-statements. 
>> >> >> I fixed it in new webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ >> >> >>> I also ran this patch (and the one using forcibly) on my single core VM >>> and realized that this fix will have to include some awk-file updates to >>> make the test in test/jdk/sun/tools/jstat pass when Serial in chosen as the >>> default collector. The tests in test/jdk/sun/tools/jstatd/ are fine. >> >> >> Can you share the failure report? > > It relates to all tests that display the the CGC and the CGCT columns, for > example in jstatGCOutput1.sh: > S0C S1C S0U S1U EC EU OC OU MC MU > CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT > 256.0 256.0 254.0 0.0 2176.0 1025.0 5504.0 920.5 7168.0 > 6839.7 768.0 602.8 2 0.007 0 0.000 - - 0.007 > > The awk regex needs to be updated to handle '-' for these tests: > test: sun/tools/jstat/jstatGcCapacityOutput1.sh > Failed. Execution failed: exit code 1 > > test: sun/tools/jstat/jstatGcMetaCapacityOutput1.sh > Failed. Execution failed: exit code 1 > > test: sun/tools/jstat/jstatGcNewCapacityOutput1.sh > Failed. Execution failed: exit code 1 > > test: sun/tools/jstat/jstatGcOldCapacityOutput1.sh > Failed. Execution failed: exit code 1 > > test: sun/tools/jstat/jstatGcOldOutput1.sh > Failed. Execution failed: exit code 1 > > test: sun/tools/jstat/jstatGcOutput1.sh > Failed. Execution failed: exit code 1 > > >> If it occurs in jstatClassloadOutput1.sh, it relates to JDK-8173942. 
>> >> >> Thanks, >> >> Yasumasa >> >> >>> Thanks, >>> Stefan >>>> >>>> submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> >>>> 2018-03-27 0:03 GMT+09:00 Stefan Johansson >>>> : >>>>> >>>>> Hi Yasumasa, >>>>> >>>>> On 2018-03-22 11:35, Yasumasa Suenaga wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Please review this change: >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 >>>>>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ >>>>> >>>>> The fix seems to make things to work as expected. Manually tested it >>>>> and >>>>> Mach5 also looks good. >>>>> >>>>> I have some comments regarding the patch. I think 'forcibly' should be >>>>> rename to something more descriptive. Naming is never easy but I think >>>>> 'required' would be better, as in, this column is required and not >>>>> allowed >>>>> to print '-'. That would also render the code in >>>>> ExpressionResolver.java to >>>>> be: >>>>> return new Literal(isRequired ? 0.0d : Double.NaN); >>>>> I think that also better explains why we return 0 instead of NaN. >>>>> >>>>> I would also like to see the forcibly/required state moved into the >>>>> Expression it self, that way we don't have to pass it around but can >>>>> instead >>>>> do: >>>>> return new Literal(e.isRequired() ? 0.0d : Double.NaN); >>>>> >>>>> Thanks, >>>>> Stefan >>>>> >>>>> >>>>>> After JDK-8153333, some jstat tests are failed because GCT in jstat >>>>>> output >>>>>> is dash (-) if garbage collector is not concurrent collector e.g. >>>>>> Serial GC. >>>>>> I fixed that GCT can be calculated correctly. >>>>>> >>>>>> This change has been tested on Mach5 by Stefan. 
>>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>> >>>>> >>> > From ioi.lam at oracle.com Wed Mar 28 05:54:27 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 27 Mar 2018 22:54:27 -0700 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: References: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> <3ea7c7cc-b7d6-afb9-b143-3d41a81fa6ee@oracle.com> Message-ID: <57cea0ea-8a3c-874e-fd0e-ccbfad92295a@oracle.com> Hi Yasumasa, I have filed JDK-8200348 to clarify the JDK documentation. I would recommend closing this issue (JDK-8200204) as not a bug. Thanks - Ioi On 3/27/18 1:59 AM, Yasumasa Suenaga wrote: > Hi Ioi, > > If my suggestion (1. in my previous email) is not accepted, I think it > should be documented. > Should I close this JBS ticket? > > > Thanks, > > Yasumasa > > > > 2018-03-27 5:17 GMT+09:00 Ioi Lam : >> >> On 3/26/18 6:21 AM, Yasumasa Suenaga wrote: >>> Hi Ioi, >>> >>>> I think a proper fix should clarify which VERSION we are looking for. >>> >>> I agree with you, but I cannot agree with new format because it is >>> difficult to understand two different "VERSION" meanings. >>> >>> IMHO, we can change the format as below: >>> >>> >>> 1. Define same VERSION to all @SECTION. It is same of current behavior. >>> ---------------- >>> VERSION: 1.0 >>> @SECTION: Symbol >>> ....contents of "jcmd VM.symboltable -verbose" (**) >>> @SECTION: String >>> ....contents of "jcmd VM.stringtable -verbose"(**) >>> ---------------- >>> >>> 2. Define same VERSION to all @SECTION except "String". >>> ---------------- >>> VERSION: 1.0 >>> @SECTION: Symbol >>> ....contents of "jcmd VM.symboltable -verbose" (**) >>> @SECTION: String >>> VERSION: 1.1 >>> ....contents of "jcmd VM.stringtable -verbose"(**) >>> ---------------- >>> >>> 3. Define VERSIONs in each @SECTIONs. 
>>> ---------------- >>> @SECTION: Symbol >>> VERSION: 1.0 >>> ....contents of "jcmd VM.symboltable -verbose" (**) >>> @SECTION: String >>> VERSION: 1.1 >>> ....contents of "jcmd VM.stringtable -verbose"(**) >>> ---------------- >>> >>> >>> How about this? >>> >> Maybe we should just keep the current behavior, and stick with 1.0 for the >> config file version. That way we don't need to make any code changes, and >> just need to clarify the user documentation. >> >> Thanks >> - Ioi >> >> >>> Thanks, >>> Yasumasa >>> >>> >>> >>> On 2018/03/26 13:39, Ioi Lam wrote: >>>> Hi Yasumasa, >>>> >>>> The word "VERSION" actually means different things in different places. >>>> That's the confusing part. >>>> >>>> "jcmd VM.stringtable -verbose" prints out the version of the >>>> "string listing". >>>> >>>> However, >>>> >>>> The VERSION in SharedArchiveConfigFile means the "version of the config >>>> file". The current version is 1.0. The format of this file is: >>>> >>>> ??? VERSION: 1.0 >>>> ??? @SECTION: Symbol >>>> ??? ....contents of "jcmd VM.symboltable -verbose" (**) >>>> ??? @SECTION: String >>>> ??? ....contents of "jcmd VM.stringtable -verbose"(**) >>>> >>>> (**) The first two lines of jcmd output (pid and VERSION) should be >>>> skipped. >>>> >>>> >>>> So the creation of the config file is somewhat manual -- you need to cut >>>> out the process id anyway (maybe we should add an option to jcmd to not >>>> print the process ID). >>>> >>>> I think a proper fix should clarify which VERSION we are looking for. We >>>> need a mechanism to ensure that the @SECTIONs for Symbol and String are >>>> in the correct format as expected by the JVM. >>>> >>>> How about changing the config file format to this: >>>> >>>> ? ? VERSION: 1.1 >>>> ??? @SECTION: Symbol >>>> ??? VERSION: 1.0 >>>> ??? ....contents of "jcmd VM.symboltable -verbose" (**) >>>> ??? @SECTION: String >>>> ??? VERSION: 1.1 >>>> ??? 
....contents of "jcmd VM.stringtable -verbose" (**) >>>> >>>> >>>> So we have 3 kinds of VERSIONS - for the config file, for the symbol >>>> section, and for the string section. >>>> >>>> What do you think? >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>> >>>> >>>> On 3/25/18 5:46 PM, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> Please review this change. >>>>> >>>>> ????? JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ >>>>> submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 >>>>> >>>>> >>>>> JDK-8134448 says SharedArchiveConfigFile accepts output of `jcmd >>>>> VM.stringtable -verbose` , but it could not because JDK-8059510 has >>>>> changed version number to 1.1 . >>>>> >>>>> I think we should accept version 1.1 stringtable. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >> From yasuenag at gmail.com Wed Mar 28 06:09:36 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 28 Mar 2018 15:09:36 +0900 Subject: RFR: 8200204: SharedArchiveConfigFile cannot accept output of VM.stringtable In-Reply-To: <57cea0ea-8a3c-874e-fd0e-ccbfad92295a@oracle.com> References: <51adccc1-8660-f189-863c-62fcefc42fa9@oracle.com> <6cacc9fc-d133-1483-5f18-02c095a9fee3@gmail.com> <3ea7c7cc-b7d6-afb9-b143-3d41a81fa6ee@oracle.com> <57cea0ea-8a3c-874e-fd0e-ccbfad92295a@oracle.com> Message-ID: Hi Ioi, I closed JDK-8200204 as "Not an Issue". Thanks, Yasumasa 2018-03-28 14:54 GMT+09:00 Ioi Lam : > Hi Yasumasa, > > I have filed JDK-8200348 to clarify the JDK documentation. > > I would recommend closing this issue (JDK-8200204) as not a bug. > > Thanks > > - Ioi > > > > On 3/27/18 1:59 AM, Yasumasa Suenaga wrote: >> >> Hi Ioi, >> >> If my suggestion (1. in my previous email) is not accepted, I think it >> should be documented. >> Should I close this JBS ticket? 
>> >> >> Thanks, >> >> Yasumasa >> >> >> >> 2018-03-27 5:17 GMT+09:00 Ioi Lam : >>> >>> >>> On 3/26/18 6:21 AM, Yasumasa Suenaga wrote: >>>> >>>> Hi Ioi, >>>> >>>>> I think a proper fix should clarify which VERSION we are looking for. >>>> >>>> >>>> I agree with you, but I cannot agree with new format because it is >>>> difficult to understand two different "VERSION" meanings. >>>> >>>> IMHO, we can change the format as below: >>>> >>>> >>>> 1. Define same VERSION to all @SECTION. It is same of current behavior. >>>> ---------------- >>>> VERSION: 1.0 >>>> @SECTION: Symbol >>>> ....contents of "jcmd VM.symboltable -verbose" (**) >>>> @SECTION: String >>>> ....contents of "jcmd VM.stringtable -verbose"(**) >>>> ---------------- >>>> >>>> 2. Define same VERSION to all @SECTION except "String". >>>> ---------------- >>>> VERSION: 1.0 >>>> @SECTION: Symbol >>>> ....contents of "jcmd VM.symboltable -verbose" (**) >>>> @SECTION: String >>>> VERSION: 1.1 >>>> ....contents of "jcmd VM.stringtable -verbose"(**) >>>> ---------------- >>>> >>>> 3. Define VERSIONs in each @SECTIONs. >>>> ---------------- >>>> @SECTION: Symbol >>>> VERSION: 1.0 >>>> ....contents of "jcmd VM.symboltable -verbose" (**) >>>> @SECTION: String >>>> VERSION: 1.1 >>>> ....contents of "jcmd VM.stringtable -verbose"(**) >>>> ---------------- >>>> >>>> >>>> How about this? >>>> >>> Maybe we should just keep the current behavior, and stick with 1.0 for >>> the >>> config file version. That way we don't need to make any code changes, and >>> just need to clarify the user documentation. >>> >>> Thanks >>> - Ioi >>> >>> >>>> Thanks, >>>> Yasumasa >>>> >>>> >>>> >>>> On 2018/03/26 13:39, Ioi Lam wrote: >>>>> >>>>> Hi Yasumasa, >>>>> >>>>> The word "VERSION" actually means different things in different places. >>>>> That's the confusing part. >>>>> >>>>> "jcmd VM.stringtable -verbose" prints out the version of the >>>>> "string listing". 
>>>>> >>>>> However, >>>>> >>>>> The VERSION in SharedArchiveConfigFile means the "version of the config >>>>> file". The current version is 1.0. The format of this file is: >>>>> >>>>> ??? VERSION: 1.0 >>>>> ??? @SECTION: Symbol >>>>> ??? ....contents of "jcmd VM.symboltable -verbose" (**) >>>>> ??? @SECTION: String >>>>> ??? ....contents of "jcmd VM.stringtable -verbose"(**) >>>>> >>>>> (**) The first two lines of jcmd output (pid and VERSION) should be >>>>> skipped. >>>>> >>>>> >>>>> So the creation of the config file is somewhat manual -- you need to >>>>> cut >>>>> out the process id anyway (maybe we should add an option to jcmd to not >>>>> print the process ID). >>>>> >>>>> I think a proper fix should clarify which VERSION we are looking for. >>>>> We >>>>> need a mechanism to ensure that the @SECTIONs for Symbol and String are >>>>> in the correct format as expected by the JVM. >>>>> >>>>> How about changing the config file format to this: >>>>> >>>>> ? ? VERSION: 1.1 >>>>> ??? @SECTION: Symbol >>>>> ??? VERSION: 1.0 >>>>> ??? ....contents of "jcmd VM.symboltable -verbose" (**) >>>>> ??? @SECTION: String >>>>> ??? VERSION: 1.1 >>>>> ??? ....contents of "jcmd VM.stringtable -verbose" (**) >>>>> >>>>> >>>>> So we have 3 kinds of VERSIONS - for the config file, for the symbol >>>>> section, and for the string section. >>>>> >>>>> What do you think? >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> >>>>> >>>>> >>>>> On 3/25/18 5:46 PM, Yasumasa Suenaga wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Please review this change. >>>>>> >>>>>> ????? JBS: https://bugs.openjdk.java.net/browse/JDK-8200204 >>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8200204/webrev.00/ >>>>>> submit-hs: mach5-one-ysuenaga-JDK-8200204-20180325-1440-16057 >>>>>> >>>>>> >>>>>> JDK-8134448 says SharedArchiveConfigFile accepts output of `jcmd >>>>>> VM.stringtable -verbose` , but it could not because JDK-8059510 has >>>>>> changed version number to 1.1 . 
>>>>>> >>>>>> I think we should accept version 1.1 stringtable. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>> >>> > From stefan.johansson at oracle.com Wed Mar 28 09:36:45 2018 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 28 Mar 2018 11:36:45 +0200 Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-" In-Reply-To: References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> Message-ID: <93a1ffeb-4959-3bdb-cbe3-510c258129b6@oracle.com> Hi Yasumasa, Local testing looks good and I've kicked of some additional Mach5 testing that will include these tests on all platforms. Cheers, Stefan On 2018-03-28 06:04, Yasumasa Suenaga wrote: > Hi Stefan, > > Thank you for sharing your report! > I could reproduce them on my VM. > > I've fixed them in new webrev, and it works fine on my environment. > Could you check again? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.03/ > > > Thanks, > > Yasumasa > > > > 2018-03-28 0:29 GMT+09:00 Stefan Johansson : >> >> On 2018-03-27 16:44, Yasumasa Suenaga wrote: >>> Hi Stefan, >>> >>> On 2018/03/27 22:45, Stefan Johansson wrote: >>>> Hi Yasumasa, >>>> >>>> On 2018-03-27 10:56, Yasumasa Suenaga wrote: >>>>> Hi Stefan, >>>>> >>>>> Thank you for your comment. >>>>> I updated webrev: >>>>> >>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ >>>> I think the usage of Optional in Expression.setRequired(bool) is a bit >>>> unnecessary. It will create temporary objects and there is no benefit from >>>> just doing two simple if-statements. 
>>> >>> I fixed it in new webrev: >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ >>> >>> >>>> I also ran this patch (and the one using forcibly) on my single core VM >>>> and realized that this fix will have to include some awk-file updates to >>>> make the test in test/jdk/sun/tools/jstat pass when Serial in chosen as the >>>> default collector. The tests in test/jdk/sun/tools/jstatd/ are fine. >>> >>> Can you share the failure report? >> It relates to all tests that display the the CGC and the CGCT columns, for >> example in jstatGCOutput1.sh: >> S0C S1C S0U S1U EC EU OC OU MC MU >> CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT >> 256.0 256.0 254.0 0.0 2176.0 1025.0 5504.0 920.5 7168.0 >> 6839.7 768.0 602.8 2 0.007 0 0.000 - - 0.007 >> >> The awk regex needs to be updated to handle '-' for these tests: >> test: sun/tools/jstat/jstatGcCapacityOutput1.sh >> Failed. Execution failed: exit code 1 >> >> test: sun/tools/jstat/jstatGcMetaCapacityOutput1.sh >> Failed. Execution failed: exit code 1 >> >> test: sun/tools/jstat/jstatGcNewCapacityOutput1.sh >> Failed. Execution failed: exit code 1 >> >> test: sun/tools/jstat/jstatGcOldCapacityOutput1.sh >> Failed. Execution failed: exit code 1 >> >> test: sun/tools/jstat/jstatGcOldOutput1.sh >> Failed. Execution failed: exit code 1 >> >> test: sun/tools/jstat/jstatGcOutput1.sh >> Failed. Execution failed: exit code 1 >> >> >>> If it occurs in jstatClassloadOutput1.sh, it relates to JDK-8173942. 
>>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> Thanks, >>>> Stefan >>>>> submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> >>>>> 2018-03-27 0:03 GMT+09:00 Stefan Johansson >>>>> : >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 2018-03-22 11:35, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this change: >>>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 >>>>>>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ >>>>>> The fix seems to make things to work as expected. Manually tested it >>>>>> and >>>>>> Mach5 also looks good. >>>>>> >>>>>> I have some comments regarding the patch. I think 'forcibly' should be >>>>>> rename to something more descriptive. Naming is never easy but I think >>>>>> 'required' would be better, as in, this column is required and not >>>>>> allowed >>>>>> to print '-'. That would also render the code in >>>>>> ExpressionResolver.java to >>>>>> be: >>>>>> return new Literal(isRequired ? 0.0d : Double.NaN); >>>>>> I think that also better explains why we return 0 instead of NaN. >>>>>> >>>>>> I would also like to see the forcibly/required state moved into the >>>>>> Expression it self, that way we don't have to pass it around but can >>>>>> instead >>>>>> do: >>>>>> return new Literal(e.isRequired() ? 0.0d : Double.NaN); >>>>>> >>>>>> Thanks, >>>>>> Stefan >>>>>> >>>>>> >>>>>>> After JDK-8153333, some jstat tests are failed because GCT in jstat >>>>>>> output >>>>>>> is dash (-) if garbage collector is not concurrent collector e.g. >>>>>>> Serial GC. >>>>>>> I fixed that GCT can be calculated correctly. >>>>>>> >>>>>>> This change has been tested on Mach5 by Stefan. 
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>

From magnus.ihse.bursie at oracle.com Wed Mar 28 10:31:49 2018
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 28 Mar 2018 12:31:49 +0200
Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries
In-Reply-To:
References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> <32db4add-8a59-03db-0074-e7df0aed14b8@oracle.com>
Message-ID: <3fce1a22-39e4-b8ba-0713-03cd013a6709@oracle.com>

On 2018-03-28 01:52, Weijun Wang wrote:
>
>> On Mar 24, 2018, at 6:03 AM, Magnus Ihse Bursie wrote:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8200193 -- for jdk.security.auth
> There is only one function to export and it already has JNIEXPORT, so you can just remove the new $(LIBJAAS_CFLAGS) [1].

Ok, thanks Max!

> Are you going to update your webrev?

Here is a new webrev. It includes your recommended change in Lib-jdk.security.auth.gmk.

It is also updated to keep track of changes in shared native libraries that have happened in the mainline since my first webrev. Most notable is the addition of libjsig. For now, I have just added the JNIEXPORT markers for the platforms that need it. Hopefully we can unify libjsig across all platforms, but that seems to be more complicated than I thought, so that'll have to wait.

I have also received word from Phil Race that there were no testing issues for client, so he's happy as well.
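For readers less familiar with what replacing a mapfile with JNIEXPORT markers amounts to, here is a minimal sketch. It assumes a GCC/Clang-style visibility attribute (or MSVC's `dllexport`); `SKETCH_EXPORT`, `sketch_double` and `sketch_public_entry` are hypothetical names used only for illustration and are not taken from the webrev.

```c
/* Hypothetical stand-in for JNIEXPORT so this sketch builds without jni.h. */
#if defined(_WIN32)
#  define SKETCH_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
#  define SKETCH_EXPORT __attribute__((visibility("default")))
#else
#  define SKETCH_EXPORT
#endif

/* Internal helper: 'static' keeps it out of the library's exported symbols. */
static int sketch_double(int x) {
    return 2 * x;
}

/* Public entry point: the export decision lives in the source, not in a mapfile. */
SKETCH_EXPORT int sketch_public_entry(int x) {
    return sketch_double(x) + 1;
}
```

When a shared library is compiled with hidden default visibility (for example with GCC's `-fvisibility=hidden`), only the functions carrying the export marker should remain visible to users of the library, which is the job a mapfile would otherwise do.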
Updated webrev:
http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.03

/Magnus

>
> Thanks
> Max
>
> [1] http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01/make/lib/Lib-jdk.security.auth.gmk.sdiff.html

From yasuenag at gmail.com Wed Mar 28 11:32:18 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 28 Mar 2018 11:32:18 +0000
Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-"
In-Reply-To: <93a1ffeb-4959-3bdb-cbe3-510c258129b6@oracle.com>
References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> <93a1ffeb-4959-3bdb-cbe3-510c258129b6@oracle.com>
Message-ID:

Thanks Stefan,
I'm waiting for second reviewer.

Yasumasa

2018-03-28 (Wed) 18:36 Stefan Johansson:
> Hi Yasumasa,
>
> Local testing looks good and I've kicked of some additional Mach5
> testing that will include these tests on all platforms.
>
> Cheers,
> Stefan
>
> On 2018-03-28 06:04, Yasumasa Suenaga wrote:
> > Hi Stefan,
> >
> > Thank you for sharing your report!
> > I could reproduce them on my VM.
> >
> > I've fixed them in new webrev, and it works fine on my environment.
> > Could you check again?
> >
> > http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.03/
> >
> >
> > Thanks,
> >
> > Yasumasa
> >
> >
> >
> > 2018-03-28 0:29 GMT+09:00 Stefan Johansson:
> >>
> >> On 2018-03-27 16:44, Yasumasa Suenaga wrote:
> >>> Hi Stefan,
> >>>
> >>> On 2018/03/27 22:45, Stefan Johansson wrote:
> >>>> Hi Yasumasa,
> >>>>
> >>>> On 2018-03-27 10:56, Yasumasa Suenaga wrote:
> >>>>> Hi Stefan,
> >>>>>
> >>>>> Thank you for your comment.
> >>>>> I updated webrev:
> >>>>>
> >>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/
> >>>> I think the usage of Optional in Expression.setRequired(bool) is a bit
> >>>> unnecessary.
It will create temporary objects and there is no benefit > from > >>>> just doing two simple if-statements. > >>> > >>> I fixed it in new webrev: > >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ > >>> > >>> > >>>> I also ran this patch (and the one using forcibly) on my single core > VM > >>>> and realized that this fix will have to include some awk-file updates > to > >>>> make the test in test/jdk/sun/tools/jstat pass when Serial in chosen > as the > >>>> default collector. The tests in test/jdk/sun/tools/jstatd/ are fine. > >>> > >>> Can you share the failure report? > >> It relates to all tests that display the the CGC and the CGCT columns, > for > >> example in jstatGCOutput1.sh: > >> S0C S1C S0U S1U EC EU OC OU MC > MU > >> CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT > >> 256.0 256.0 254.0 0.0 2176.0 1025.0 5504.0 920.5 7168.0 > >> 6839.7 768.0 602.8 2 0.007 0 0.000 - - 0.007 > >> > >> The awk regex needs to be updated to handle '-' for these tests: > >> test: sun/tools/jstat/jstatGcCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcMetaCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcNewCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOldCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOldOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> > >>> If it occurs in jstatClassloadOutput1.sh, it relates to JDK-8173942. 
> >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>>> Thanks, > >>>> Stefan > >>>>> submit-hs: mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Yasumasa > >>>>> > >>>>> > >>>>> > >>>>> 2018-03-27 0:03 GMT+09:00 Stefan Johansson > >>>>> : > >>>>>> Hi Yasumasa, > >>>>>> > >>>>>> On 2018-03-22 11:35, Yasumasa Suenaga wrote: > >>>>>>> Hi all, > >>>>>>> > >>>>>>> Please review this change: > >>>>>>> > >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 > >>>>>>> webrev: cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ > >>>>>> The fix seems to make things to work as expected. Manually tested it > >>>>>> and > >>>>>> Mach5 also looks good. > >>>>>> > >>>>>> I have some comments regarding the patch. I think 'forcibly' should > be > >>>>>> rename to something more descriptive. Naming is never easy but I > think > >>>>>> 'required' would be better, as in, this column is required and not > >>>>>> allowed > >>>>>> to print '-'. That would also render the code in > >>>>>> ExpressionResolver.java to > >>>>>> be: > >>>>>> return new Literal(isRequired ? 0.0d : Double.NaN); > >>>>>> I think that also better explains why we return 0 instead of NaN. > >>>>>> > >>>>>> I would also like to see the forcibly/required state moved into the > >>>>>> Expression it self, that way we don't have to pass it around but can > >>>>>> instead > >>>>>> do: > >>>>>> return new Literal(e.isRequired() ? 0.0d : Double.NaN); > >>>>>> > >>>>>> Thanks, > >>>>>> Stefan > >>>>>> > >>>>>> > >>>>>>> After JDK-8153333, some jstat tests are failed because GCT in jstat > >>>>>>> output > >>>>>>> is dash (-) if garbage collector is not concurrent collector e.g. > >>>>>>> Serial GC. > >>>>>>> I fixed that GCT can be calculated correctly. > >>>>>>> > >>>>>>> This change has been tested on Mach5 by Stefan. 
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Yasumasa
> >>>>>>
>

From stefan.johansson at oracle.com Wed Mar 28 13:38:01 2018
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 28 Mar 2018 15:38:01 +0200
Subject: RFR: 8199519: Several GC tests fails with: java.lang.NumberFormatException: Unparseable number: "-"
In-Reply-To:
References: <6755303f-a1a0-da4f-e1e0-a1bcb0c72efd@gmail.com> <7809552d-dfa0-5f26-bd82-c13df7f45f5f@oracle.com> <85853429-a520-1782-40e4-e05776aa639d@oracle.com> <40b04f2e-1d6c-524e-ea4a-08c42fd41ee6@gmail.com> <93a1ffeb-4959-3bdb-cbe3-510c258129b6@oracle.com>
Message-ID: <5c1975cd-1080-652e-c23a-abd693cc0095@oracle.com>

Mach5 testing looks good. Can someone in the serviceability team do the second review?

Cheers,
Stefan

On 2018-03-28 13:32, Yasumasa Suenaga wrote:
> Thanks Stefan,
> I'm waiting for second reviewer.
>
>
> Yasumasa
>
>
> 2018-03-28 (Wed) 18:36 Stefan Johansson:
>
> Hi Yasumasa,
>
> Local testing looks good and I've kicked of some additional Mach5
> testing that will include these tests on all platforms.
>
> Cheers,
> Stefan
>
> On 2018-03-28 06:04, Yasumasa Suenaga wrote:
> > Hi Stefan,
> >
> > Thank you for sharing your report!
> > I could reproduce them on my VM.
> >
> > I've fixed them in new webrev, and it works fine on my environment.
> > Could you check again?
> >
> > http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.03/
> >
> >
> > Thanks,
> >
> > Yasumasa
> >
> >
> >
> > 2018-03-28 0:29 GMT+09:00 Stefan Johansson:
> >>
> >> On 2018-03-27 16:44, Yasumasa Suenaga wrote:
> >>> Hi Stefan,
> >>>
> >>> On 2018/03/27 22:45, Stefan Johansson wrote:
> >>>> Hi Yasumasa,
> >>>>
> >>>> On 2018-03-27 10:56, Yasumasa Suenaga wrote:
> >>>>> Hi Stefan,
> >>>>>
> >>>>> Thank you for your comment.
> >>>>> I updated webrev:
> >>>>>
> >>>>>
?webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.01/ > > >>>> I think the usage of Optional in Expression.setRequired(bool) > is a bit > >>>> unnecessary. It will create temporary objects and there is no > benefit from > >>>> just doing two simple if-statements. > >>> > >>> I fixed it in new webrev: > >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.02/ > > >>> > >>> > >>>> I also ran this patch (and the one using forcibly) on my > single core VM > >>>> and realized that this fix will have to include some awk-file > updates to > >>>> make the test in test/jdk/sun/tools/jstat pass when Serial in > chosen as the > >>>> default collector. The tests in test/jdk/sun/tools/jstatd/ > are fine. > >>> > >>> Can you share the failure report? > >> It relates to all tests that display the the CGC and the CGCT > columns, for > >> example in jstatGCOutput1.sh: > >>? ?S0C? ? S1C? ? S0U? ? S1U? ? ? EC? ? ? ?EU OC ?OU? ? ? ?MC? ? ?MU > >> CCSC? ?CCSU? ?YGC? ? ?YGCT FGC? ? FGCT? ? CGC CGCT? ? ?GCT > >> 256.0? 256.0? 254.0? ?0.0? ? 2176.0? ?1025.0 5504.0 920.5? ? 7168.0 > >> 6839.7 768.0? 602.8? ? ? ?2? ? 0.007? ?0 0.000? ?- ? ? ? -? ? 0.007 > >> > >> The awk regex needs to be updated to handle '-' for these tests: > >> test: sun/tools/jstat/jstatGcCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcMetaCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcNewCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOldCapacityOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOldOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> test: sun/tools/jstat/jstatGcOutput1.sh > >> Failed. Execution failed: exit code 1 > >> > >> > >>> If it occurs in jstatClassloadOutput1.sh, it relates to > JDK-8173942. 
> >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>>> Thanks, > >>>> Stefan > >>>>>? ? ?submit-hs: > mach5-one-ysuenaga-JDK-8199519-20180327-0652-16322 > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Yasumasa > >>>>> > >>>>> > >>>>> > >>>>> 2018-03-27 0:03 GMT+09:00 Stefan Johansson > >>>>> >: > >>>>>> Hi Yasumasa, > >>>>>> > >>>>>> On 2018-03-22 11:35, Yasumasa Suenaga wrote: > >>>>>>> Hi all, > >>>>>>> > >>>>>>> Please review this change: > >>>>>>> > >>>>>>>? ? ? JBS: https://bugs.openjdk.java.net/browse/JDK-8199519 > >>>>>>> webrev: > cr.openjdk.java.net/~ysuenaga/JDK-8199519/webrev.00/ > > >>>>>> The fix seems to make things to work as expected. Manually > tested it > >>>>>> and > >>>>>> Mach5 also looks good. > >>>>>> > >>>>>> I have some comments regarding the patch. I think > 'forcibly' should be > >>>>>> rename to something more descriptive. Naming is never easy > but I think > >>>>>> 'required' would be better, as in, this column is required > and not > >>>>>> allowed > >>>>>> to print '-'. That would also render the code in > >>>>>> ExpressionResolver.java to > >>>>>> be: > >>>>>>? ? ?return new Literal(isRequired ? 0.0d : Double.NaN); > >>>>>> I think that also better explains why we return 0 instead > of NaN. > >>>>>> > >>>>>> I would also like to see the forcibly/required state moved > into the > >>>>>> Expression it self, that way we don't have to pass it > around but can > >>>>>> instead > >>>>>> do: > >>>>>>? ? ?return new Literal(e.isRequired() ? 0.0d : Double.NaN); > >>>>>> > >>>>>> Thanks, > >>>>>> Stefan > >>>>>> > >>>>>> > >>>>>>> After JDK-8153333, some jstat tests are failed because GCT > in jstat > >>>>>>> output > >>>>>>> is dash (-) if garbage collector is not concurrent > collector e.g. > >>>>>>> Serial GC. > >>>>>>> I fixed that GCT can be calculated correctly. > >>>>>>> > >>>>>>> This change has been tested on Mach5 by Stefan. 
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Yasumasa
> >>>>>>
>

From jcbeyler at google.com Wed Mar 28 15:43:28 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 28 Mar 2018 15:43:28 +0000
Subject: JDK-8171119: Low-Overhead Heap Profiling
In-Reply-To:
References: <5A819F10.8040201@oracle.com> <5A8414AC.3020209@oracle.com>
Message-ID:

Hi all,

I've mostly been working on deflaking the tests and on the wording in the JVMTI spec.

Here are the two incremental webrevs:
http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.5_6/
http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.06_07/

Here is the total webrev:
http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.07/

Here are the notes of this change:
- Currently the tests pass 100 times in a row; I am working on checking whether they pass 1000 times in a row.
- The default sampling rate is set to 512k. This is what we use internally, and having a default means that to enable sampling with the default, the user only has to do an enable event/disable event via JVMTI (instead of enable + set sample rate).
- I deprecated the code that was handling the fast path tlab refill if it happened, since this is now deprecated
  - Though I saw that Graal is still using it, so I have to see what needs to be done there exactly

Finally, using the Dacapo benchmark suite, I noted a 1% overhead when the event system is turned on and the callback to the native agent is just empty. I got a 3% overhead with a 512k sampling rate with the code I put in the native side of my tests.
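To make the idea of a 512k sampling rate concrete, here is a minimal sketch of byte-interval sampling. It is only an illustration under simplifying assumptions: it uses a fixed interval and discards any overshoot, whereas the actual patch may pick its sample points differently (for example with randomized intervals). The type and function names (`sketch_sampler`, `sketch_sampler_init`, `sketch_sampler_record`) are hypothetical and do not appear in the webrev or in JVMTI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: report roughly one allocation per interval_bytes allocated. */
typedef struct {
    size_t interval_bytes;      /* sampling rate, e.g. 512 * 1024 */
    size_t bytes_until_sample;  /* bytes left before the next sample point */
    unsigned long samples;      /* number of allocations reported so far */
} sketch_sampler;

static void sketch_sampler_init(sketch_sampler *s, size_t interval_bytes) {
    assert(interval_bytes > 0);
    s->interval_bytes = interval_bytes;
    s->bytes_until_sample = interval_bytes;
    s->samples = 0;
}

/* Records one allocation; returns true if this allocation would be sampled. */
static bool sketch_sampler_record(sketch_sampler *s, size_t size_bytes) {
    if (size_bytes < s->bytes_until_sample) {
        /* Still short of the next sample point: just count the bytes. */
        s->bytes_until_sample -= size_bytes;
        return false;
    }
    /* This allocation reaches or crosses the sample point: report it and re-arm. */
    s->samples++;
    s->bytes_until_sample = s->interval_bytes;
    return true;
}
```

With a 512k interval, most allocations only decrement a counter, and only the occasional allocation that crosses the threshold pays for a callback into the agent. That is the intuition behind keeping the overhead low when the callback itself is cheap.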
Thanks and comments are appreciated, Jc On Mon, Mar 19, 2018 at 2:06 PM JC Beyler wrote: > Hi all, > > The incremental webrev update is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event4_5/ > > The full webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/ > > Major change here is: > - I've removed the heapMonitoring.cpp code in favor of just having the > sampling events as per Serguei's request; I still have to do some overhead > measurements but the tests prove the concept can work > - Most of the tlab code is unchanged, the only major part is that > now things get sent off to event collectors when used and enabled. > - Added the interpreter collectors to handle interpreter execution > - Updated the name from SetTlabHeapSampling to SetHeapSampling to be > more generic > - Added a mutex for the thread sampling so that we can initialize an > internal static array safely > - Ported the tests from the old system to this new one > > I've also updated the JEP and CSR to reflect these changes: > https://bugs.openjdk.java.net/browse/JDK-8194905 > https://bugs.openjdk.java.net/browse/JDK-8171119 > > In order to make this have some forward progress, I've removed the heap > sampling code entirely and now rely entirely on the event sampling system. > The tests reflect this by using a simplified implementation of what an > agent could do: > > http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitor.c > (Search for anything mentioning event_storage). > > I have not taken the time to port the whole code we had originally in > heapMonitoring to this. I hesitate only because that code was in C++, I'd > have to port it to C and this is for tests so perhaps what I have now is > good enough? 
> > As far as testing goes, I've ported all the relevant tests and then added > a few: > - Turning the system on/off > - Testing using various GCs > - Testing using the interpreter > - Testing the sampling rate > - Testing with objects and arrays > - Testing with various threads > > Finally, as overhead goes, I have the numbers of the system off vs a clean > build and I have 0% overhead, which is what we'd want. This was using the > Dacapo benchmarks. I am now preparing to run a version with the events on > using dacapo and will report back here. > > Any comments are welcome :) > Jc > > > > > On Thu, Mar 8, 2018 at 4:00 PM JC Beyler wrote: > >> Hi all, >> >> I apologize for the delay but I wanted to add an event system and that >> took a bit longer than expected and I also reworked the code to take into >> account the deprecation of FastTLABRefill. >> >> This update has four parts: >> >> A) I moved the implementation from Thread to ThreadHeapSampler inside of >> Thread. Would you prefer it as a pointer inside of Thread or like this >> works for you? Second question would be would you rather have an >> association outside of Thread altogether that tries to remember when >> threads are live and then we would have something like: >> ThreadHeapSampler::get_sampling_size(this_thread); >> >> I worry about the overhead of this but perhaps it is not too too bad? >> >> B) I also have been working on the Allocation event system that sends out >> a notification at each sampled event. This will be practical when wanting >> to do something at the allocation point. I'm also looking at if the whole >> heapMonitoring code could not reside in the agent code and not in the JDK. >> I'm not convinced but I'm talking to Serguei about it to see/assess :) >> - Also added two tests for the new event subsystem >> >> C) Removed the slow_path fields inside the TLAB code since now >> FastTLABRefill is deprecated >> >> D) Updated the JVMTI documentation and specification for the methods. 
>>
>> So the incremental webrev is here:
>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.09_10/
>>
>> and the full webrev is here:
>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.10
>>
>> I believe I have updated the various JIRA issues that track this :)
>>
>> Thanks for your input,
>> Jc
>>
>>
>> On Wed, Feb 14, 2018 at 10:34 PM, JC Beyler wrote:
>>
>>> Hi Erik,
>>>
>>> I inlined my answers, which the last one seems to answer Robbin's
>>> concerns about the same thing (adding things to Thread).
>>>
>>> On Wed, Feb 14, 2018 at 2:51 AM, Erik Österlund <
>>> erik.osterlund at oracle.com> wrote:
>>>
>>>> Hi JC,
>>>>
>>>> Comments are inlined below.
>>>>
>>>>
>>>> On 2018-02-13 06:18, JC Beyler wrote:
>>>>
>>>> Hi Erik,
>>>>
>>>> Thanks for your answers, I've now inlined my own answers/comments.
>>>>
>>>> I've done a new webrev here:
>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/
>>>>
>>>> The incremental is here:
>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/
>>>>
>>>> Note to all:
>>>> - I've been integrating changes from Erin/Serguei/David comments so
>>>> this webrev incremental is a bit an answer to all comments in one. I
>>>> apologize for that :)
>>>>
>>>>
>>>> On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund <
>>>> erik.osterlund at oracle.com> wrote:
>>>>
>>>>> Hi JC,
>>>>>
>>>>> Sorry for the delayed reply.
>>>>>
>>>>> Inlined answers:
>>>>>
>>>>>
>>>>> On 2018-02-06 00:04, JC Beyler wrote:
>>>>>
>>>>>> Hi Erik,
>>>>>>
>>>>>> (Renaming this to be folded into the newly renamed thread :))
>>>>>>
>>>>>> First off, thanks a lot for reviewing the webrev! I appreciate it!
>>>>>> >>>>>> I updated the webrev to: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ >>>>>> >>>>>> And the incremental one is here: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ >>>>>> >>>>>> It contains: >>>>>> - The change for since from 9 to 11 for the jvmti.xml >>>>>> - The use of the OrderAccess for initialized >>>>>> - Clearing the oop >>>>>> >>>>>> I also have inlined my answers to your comments. The biggest question >>>>>> will come from the multiple *_end variables. A bit of the logic there >>>>>> is due to handling the slow path refill vs fast path refill and >>>>>> checking that the rug was not pulled underneath the slowpath. I >>>>>> believe that a previous comment was that TlabFastRefill was going to >>>>>> be deprecated. >>>>>> >>>>>> If this is true, we could revert this code a bit and just do: if >>>>>> TlabFastRefill is enabled, disable this. And then deprecate that when >>>>>> TlabFastRefill is deprecated. >>>>>> >>>>>> This might simplify this webrev and I can work on a follow-up that >>>>>> either removes TlabFastRefill if Robbin does not have the time to do >>>>>> it or adds the support to the assembly side to handle this correctly. >>>>>> What do you think? >>>>>> >>>>> >>>>> I support removing TlabFastRefill, but I think it is good to not >>>>> depend on that happening first. >>>>> >>>>> >>>> >>>> I'm slowly pushing on the FastTLABRefill ( >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping >>>> both separate for now though so that we can think of both differently >>>> >>>> >>>> >>>>> Now, below, inlined are my answers: >>>>>> >>>>>> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund >>>>>> wrote: >>>>>> >>>>>>> Hi JC, >>>>>>> >>>>>>> Hope I am reviewing the right version of your work. Here goes...
>>>>>>> >>>>>>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>>>>>> >>>>>>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, >>>>>>> size * >>>>>>> HeapWordSize, THREAD); >>>>>>> 160 >>>>>>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>>>>>> 162 return result; >>>>>>> 163 } >>>>>>> >>>>>>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>>>>>> >>>>>>> Done! >>>>>> >>>>> >>>>> More about this later. >>>>> >>>>> >>>>> >>>>>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>>>>>> >>>>>>> So first of all, there seems to quite a few ends. There is an "end", >>>>>>> a "hard >>>>>>> end", a "slow path end", and an "actual end". Moreover, it seems >>>>>>> like the >>>>>>> "hard end" is actually further away than the "actual end". So the >>>>>>> "hard end" >>>>>>> seems like more of a "really definitely actual end" or something. I >>>>>>> don't >>>>>>> know about you, but I think it looks kind of messy. In particular, I >>>>>>> don't >>>>>>> feel like the name "actual end" reflects what it represents, >>>>>>> especially when >>>>>>> there is another end that is behind the "actual end". >>>>>>> >>>>>>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>>>>>> 414 // Did a fast TLAB refill occur? >>>>>>> 415 if (_slow_path_end != _end) { >>>>>>> 416 // Fix up the actual end to be now the end of this TLAB. >>>>>>> 417 _slow_path_end = _end; >>>>>>> 418 _actual_end = _end; >>>>>>> 419 } >>>>>>> 420 >>>>>>> 421 return _actual_end + alignment_reserve(); >>>>>>> 422 } >>>>>>> >>>>>>> I really do not like making getters unexpectedly have these kind of >>>>>>> side >>>>>>> effects. It is not expected that when you ask for the "hard end", you >>>>>>> implicitly update the "slow path end" and "actual end" to new values. >>>>>>> >>>>>>> As I said, a lot of this is due to the FastTlabRefill. If I make this >>>>>> not supporting FastTlabRefill, this goes away. 
The reason the system >>>>>> needs to update itself at the get is that you only know at that get if >>>>>> things have shifted underneath the tlab slow path. I am not sure of >>>>>> really better names (naming is hard!), perhaps we could do these >>>>>> names: >>>>>> >>>>>> - current_tlab_end // Either the allocated tlab end or a >>>>>> sampling point >>>>>> - last_allocation_address // The end of the tlab allocation >>>>>> - last_slowpath_allocated_end // In case a fast refill occurred the >>>>>> end might have changed, this is to remember slow vs fast past refills >>>>>> >>>>>> the hard_end method can be renamed to something like: >>>>>> tlab_end_pointer() // The end of the lab including a bit of >>>>>> alignment reserved bytes >>>>>> >>>>> >>>>> Those names sound better to me. Could you please provide a mapping >>>>> from the old names to the new names so I understand which one is which >>>>> please? >>>>> >>>>> This is my current guess of what you are proposing: >>>>> >>>>> end -> current_tlab_end >>>>> actual_end -> last_allocation_address >>>>> slow_path_end -> last_slowpath_allocated_end >>>>> hard_end -> tlab_end_pointer >>>>> >>>>> >>>> Yes that is correct, that was what I was proposing. >>>> >>>> >>>>> I would prefer this naming: >>>>> >>>>> end -> slow_path_end // the end for taking a slow path; either due to >>>>> sampling or refilling >>>>> actual_end -> allocation_end // the end for allocations >>>>> slow_path_end -> last_slow_path_end // last address for slow_path_end >>>>> (as opposed to allocation_end) >>>>> hard_end -> reserved_end // the end of the reserved space of the TLAB >>>>> >>>>> About setting things in the getter... that still seems like a very >>>>> unpleasant thing to me. It would be better to inspect the call hierarchy >>>>> and explicitly update the ends where they need updating, and assert in the >>>>> getter that they are in sync, rather than implicitly setting various ends >>>>> as a surprising side effect in a getter. 
It looks like the call hierarchy >>>>> is very small. With my new naming convention, reserved_end() would >>>>> presumably return _allocation_end + alignment_reserve(), and have an assert >>>>> checking that _allocation_end == _last_slow_path_allocation_end, >>>>> complaining that this invariant must hold, and that a caller to this >>>>> function, such as make_parsable(), must first explicitly synchronize the >>>>> ends as required, to honor that invariant. >>>>> >>>>> >>>> >>>> I've renamed the variables to how you preferred it except for the _end >>>> one. I did: >>>> current_end >>>> last_allocation_address >>>> tlab_end_ptr >>>> >>>> The reason is that the architecture dependent code use the thread.hpp >>>> API and it already has tlab included into the name so it becomes >>>> tlab_current_end (which is better that tlab_current_tlab_end in my opinion). >>>> >>>> I also moved the update into a separate method with a TODO that says to >>>> remove it when FastTLABRefill is deprecated >>>> >>>> >>>> This looks a lot better now. Thanks. >>>> >>>> Note that the following comment now needs updating accordingly in >>>> threadLocalAllocBuffer.hpp: >>>> >>>> 41 // Heap sampling is performed via the end/actual_end fields. 42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. >>>> >>>> There might be other comments too, I have not looked in detail. >>>> >>> >>> This was the only spot that still had an actual_end, I fixed it now. >>> I'll do a sweep to double check other comments. >>> >>> >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> Not sure it's better but before updating the webrev, I wanted to try >>>>>> to get input/consensus :) >>>>>> >>>>>> (Note hard_end was always further off than end). 
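The getter-without-side-effects approach discussed above can be sketched as follows. This is only a minimal illustration, not the webrev's code: `TlabEnds`, `fast_refill()`, `sync_ends_after_fast_refill()`, `ends_in_sync()` and the value of `kAlignmentReserve` are assumptions, and addresses are modelled as plain integers. The field names follow the naming Erik proposes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the TLAB end bookkeeping discussed above.
// Addresses are modelled as plain offsets; this is not HotSpot code.
class TlabEnds {
 public:
  static constexpr std::size_t kAlignmentReserve = 8;  // assumed value

  explicit TlabEnds(std::size_t end)
      : _slow_path_end(end), _last_slow_path_end(end), _allocation_end(end) {}

  // Models a fast refill moving the end without informing the slow path.
  void fast_refill(std::size_t new_end) { _slow_path_end = new_end; }

  // Explicit synchronization step: a caller such as make_parsable() would
  // invoke this before asking for reserved_end().
  void sync_ends_after_fast_refill() {
    if (_last_slow_path_end != _slow_path_end) {
      _last_slow_path_end = _slow_path_end;
      _allocation_end = _slow_path_end;
    }
  }

  bool ends_in_sync() const { return _last_slow_path_end == _slow_path_end; }

  // Pure getter: no hidden updates, only an invariant check.
  std::size_t reserved_end() const {
    assert(ends_in_sync() && "caller must synchronize the ends first");
    return _allocation_end + kAlignmentReserve;
  }

 private:
  std::size_t _slow_path_end;       // end that triggers the slow path
  std::size_t _last_slow_path_end;  // value seen at the last slow path
  std::size_t _allocation_end;      // real end for allocations
};
```

With this shape, forgetting to synchronize shows up as an assertion failure in debug builds instead of a silent update inside the getter.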
>>>>>> >>>>>> src/hotspot/share/prims/jvmti.xml: >>>>>>> >>>>>>> 10357 >>>>>>> 10358 >>>>>>> 10359 Can sample the heap. >>>>>>> 10360 If this capability is enabled then the heap sampling >>>>>>> methods >>>>>>> can be called. >>>>>>> 10361 >>>>>>> 10362 >>>>>>> >>>>>>> Looks like this capability should not be "since 9" if it gets >>>>>>> integrated >>>>>>> now. >>>>>>> >>>>>> Updated now to 11, crossing my fingers :) >>>>>> >>>>>> >>>>>> src/hotspot/share/runtime/heapMonitoring.cpp: >>>>>>> >>>>>>> 448 if (is_alive->do_object_b(value)) { >>>>>>> 449 // Update the oop to point to the new object if it is >>>>>>> still >>>>>>> alive. >>>>>>> 450 f->do_oop(&(trace.obj)); >>>>>>> 451 >>>>>>> 452 // Copy the old trace, if it is still live. >>>>>>> 453 _allocated_traces->at_put(curr_pos++, trace); >>>>>>> 454 >>>>>>> 455 // Store the live trace in a cache, to be served up on >>>>>>> /heapz. >>>>>>> 456 _traces_on_last_full_gc->append(trace); >>>>>>> 457 >>>>>>> 458 count++; >>>>>>> 459 } else { >>>>>>> 460 // If the old trace is no longer live, add it to the >>>>>>> list of >>>>>>> 461 // recently collected garbage. >>>>>>> 462 store_garbage_trace(trace); >>>>>>> 463 } >>>>>>> >>>>>>> In the case where the oop was not live, I would like it to be >>>>>>> explicitly >>>>>>> cleared. >>>>>>> >>>>>> Done I think how you wanted it. Let me know because I'm not familiar >>>>>> with the RootAccess API. I'm unclear if I'm doing this right or not so >>>>>> reviews of these parts are highly appreciated. Robbin had talked of >>>>>> perhaps later pushing this all into a OopStorage, should I do this now >>>>>> do you think? Or can that wait a second webrev later down the road? >>>>>> >>>>> >>>>> I think using handles can and should be done later. You can use the >>>>> Access API now. >>>>> I noticed that you are missing an #include "oops/access.inline.hpp" in >>>>> your heapMonitoring.cpp file. 
>>>>> >>>>> >>>> The missing header is there for me so I don't know, I made sure it is >>>> present in the latest webrev. Sorry about that. >>>> >>>> >>>> >>>>> + Did I clear it the way you wanted me to or were you thinking of >>>>>> something else? >>>>>> >>>>> >>>>> That is precisely how I wanted it to be cleared. Thanks. >>>>> >>>>> + Final question here, seems like if I were to want to not do the >>>>>> f->do_oop directly on the trace.obj, I'd need to do something like: >>>>>> >>>>>> f->do_oop(&value); >>>>>> ... >>>>>> trace->store_oop(value); >>>>>> >>>>>> to update the oop internally. Is that right/is that one of the >>>>>> advantages of going to the Oopstorage sooner than later? >>>>>> >>>>> >>>>> I think you really want to do the do_oop on the root directly. Is >>>>> there a particular reason why you would not want to do that? >>>>> Otherwise, yes - the benefit with using the handle approach is that >>>>> you do not need to call do_oop explicitly in your code. >>>>> >>>>> >>>> There is no reason except that now we have a load_oop and a >>>> get_oop_addr, I was not sure what you would think of that. >>>> >>>> >>>> That's fine. >>>> >>>> >>>> >>>>> >>>>>> Also I see a lot of concurrent-looking use of the following field: >>>>>>> 267 volatile bool _initialized; >>>>>>> >>>>>>> Please note that the "volatile" qualifier does not help with >>>>>>> reordering >>>>>>> here. Reordering between volatile and non-volatile fields is >>>>>>> completely free >>>>>>> for both compiler and hardware, except for windows with MSVC, where >>>>>>> volatile >>>>>>> semantics is defined to use acquire/release semantics, and the >>>>>>> hardware is >>>>>>> TSO. But for the general case, I would expect this field to be >>>>>>> stored with >>>>>>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>>>>>> Otherwise it is not thread safe. >>>>>>> >>>>>> Because everything is behind a mutex, I wasn't really worried about >>>>>> this. 
I have a test that has multiple threads trying to hit this >>>>>> corner case and it passes. >>>>>> >>>>>> However, to be paranoid, I updated it to using the OrderAccess API >>>>>> now, thanks! Let me know what you think there too! >>>>>> >>>>> >>>>> If it is indeed always supposed to be read and written under a mutex, >>>>> then I would strongly prefer to have it accessed as a normal non-volatile >>>>> member, and have an assertion that given lock is held or we are in a >>>>> safepoint, as we do in many other places. Something like this: >>>>> >>>>> assert(HeapMonitorStorage_lock->owned_by_self() || >>>>> (SafepointSynchronize::is_at_safepoint() && >>>>> Thread::current()->is_VM_thread()), "this should not be accessed >>>>> concurrently"); >>>>> >>>>> It would be confusing to people reading the code if there are uses of >>>>> OrderAccess that are actually always protected under a mutex. >>>>> >>>>> >>>> Thank you for the exact example to be put in the code! I put it around >>>> each access/assignment of the _initialized method and found one case where >>>> yes you can touch it and not have the lock. It actually is "ok" because you >>>> don't act on the storage until later and only when you really want to >>>> modify the storage (see the object_alloc_do_sample method which calls the >>>> add_trace method). >>>> >>>> But, because of this, I'm going to put the OrderAccess here, I'll do >>>> some performance numbers later and if there are issues, I might add a >>>> "unsafe" read and a "safe" one to make it explicit to the reader. But I >>>> don't think it will come to that. >>>> >>>> >>>> Okay. This double return in heapMonitoring.cpp looks wrong: >>>> >>>> 283 bool initialized() { >>>> 284 return OrderAccess::load_acquire(&_initialized) != 0; >>>> 285 return _initialized; >>>> 286 } >>>> >>>> Since you said object_alloc_do_sample() is the only place where you do >>>> not hold the mutex while reading initialized(), I had a closer look at >>>> that. 
It looks like in its current shape, the lack of a mutex may lead to a >>>> memory leak. In particular, it first checks if (initialized()). Let's >>>> assume this is now true. It then allocates a bunch of stuff, and checks if >>>> the number of frames were over 0. If they were, it calls >>>> StackTraceStorage::storage()->add_trace() seemingly hoping that after >>>> grabbing the lock in there, initialized() will still return true. But it >>>> could now return false and skip doing anything, in which case the allocated >>>> stuff will never be freed. >>>> >>> >>> I fixed this now by making add_trace return a boolean and checking for >>> that. It will be in the next webrev. Thanks, the truth is that in our >>> implementation the system is always on or off, so this never really occurs >>> :). In this version though, that is not true and it's important to handle >>> so thanks again! >>> >>> >>> >>>> >>>> So the analysis seems to be that _initialized is only used outside of >>>> the mutex in once instance, where it is used to perform double-checked >>>> locking, that actually causes a memory leak. >>>> >>>> I am not proposing how to fix that, just raising the issue. If you >>>> still want to perform this double-checked locking somehow, then the use of >>>> acquire/release still seems odd. Because the memory ordering restrictions >>>> of it never comes into play in this particular case. If it ever did, then >>>> the use of destroy_stuff(); release_store(_initialized, 0) would be broken >>>> anyway as that would imply that whatever concurrent reader there ever was >>>> would after reading _initialized with load_acquire() could *never* read the >>>> data that is concurrently destroyed anyway. I would be biased to think that >>>> RawAccess::load/store looks like a more appropriate solution, >>>> given that the memory leak issue is resolved. I do not know how painful it >>>> would be to not perform this double-checked locking. >>>> >>> >>> So I agree with this entirely. 
I looked also a bit more and the >>> difference and code really stems from our internal version. In this version >>> however, there are actually a lot of things going on that I did not go >>> entirely through in my head but this comment made me ponder a bit more on >>> it. >>> >>> Since every object_alloc_do_sample is protected by a check to >>> HeapMonitoring::enabled(), there is only a small chance that the call is >>> happening when things have been disabled. So there is no real need to do a >>> first check on the initialized, it is a rare occurence that a call happens >>> to object_alloc_do_sample and the initialized of the storage returns false. >>> >>> (By the way, even if you did call object_alloc_do_sample without looking >>> at HeapMonitoring::enabled(), that would be ok too. You would gather the >>> stacktrace and get nowhere at the add_trace call, which would return false; >>> so though not optimal performance wise, nothing would break). >>> >>> Furthermore, the add_trace is really the moment of no return and we have >>> the mutex lock and then the initialized check. So, in the end, I did two >>> things: I removed that first check and then I removed the OrderAccess for >>> the storage initialized. I think now I have a better grasp and >>> understanding why it was done in our code and why it is not needed here. >>> Thanks for pointing it out :). This now still passes my JTREG tests, >>> especially the threaded one. >>> >>> >>> >>> >>> >>>> >>>> >>>> >>>> >>>> >>>>> As a kind of meta comment, I wonder if it would make sense to add >>>>>>> sampling >>>>>>> for non-TLAB allocations. Seems like if someone is rapidly >>>>>>> allocating a >>>>>>> whole bunch of 1 MB objects that never fit in a TLAB, I might still >>>>>>> be >>>>>>> interested in seeing that in my traces, and not get surprised that >>>>>>> the >>>>>>> allocation rate is very high yet not showing up in any profiles. 
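The resolution JC describes earlier in this exchange (no unlocked pre-check, the initialized flag read only under the mutex, and add_trace reporting whether it took the trace) can be sketched roughly as below. All names here (`TraceStorage`, `StackTrace`, `sample_allocation`) are hypothetical stand-ins, and `std::mutex` and `std::unique_ptr` are used in place of the HotSpot primitives, so this is an illustration of the idea rather than the actual heapMonitoring.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical placeholder for the gathered stack trace data.
struct StackTrace {
  std::vector<int> frames;
};

class TraceStorage {
 public:
  void set_initialized(bool value) {
    std::lock_guard<std::mutex> guard(_lock);
    _initialized = value;
    if (!value) {
      _traces.clear();  // storage is torn down while holding the lock
    }
  }

  // Returns true only if the storage took ownership of the trace.
  // _initialized is read only while the lock is held, so it can be a
  // plain bool and needs no acquire/release ordering.
  bool add_trace(std::unique_ptr<StackTrace>& trace) {
    assert(trace != nullptr);
    std::lock_guard<std::mutex> guard(_lock);
    if (!_initialized) {
      return false;  // caller keeps ownership and must release the trace
    }
    _traces.push_back(std::move(trace));
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(_lock);
    return _traces.size();
  }

 private:
  std::mutex _lock;
  bool _initialized = false;
  std::vector<std::unique_ptr<StackTrace>> _traces;
};

// Caller side: gather the trace first, then hand it over. If the storage
// was disabled in the meantime, the unique_ptr frees the trace on return,
// which is the leak the review pointed out.
bool sample_allocation(TraceStorage& storage, int frame) {
  std::unique_ptr<StackTrace> trace(new StackTrace());
  trace->frames.push_back(frame);
  return storage.add_trace(trace);
}
```

The cost of dropping the pre-check is one wasted stack walk in the rare case where sampling is disabled between the caller's check and the locked section.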
>>>>>>> >>>>>>> That is handled by the handle_sample where you wanted me to put a >>>>>> UseTlab because you hit that case if the allocation is too big. >>>>>> >>>>> >>>>> I see. It was not obvious to me that non-TLAB sampling is done in the >>>>> TLAB class. That seems like an abstraction crime. >>>>> What I wanted in my previous comment was that we do not call into the >>>>> TLAB when we are not using TLABs. If there is sampling logic in the TLAB >>>>> that is used for something else than TLABs, then it seems like that logic >>>>> simply does not belong inside of the TLAB. It should be moved out of the >>>>> TLAB, and instead have the TLAB call this common abstraction that makes >>>>> sense. >>>>> >>>>> >>>> So in the incremental version: >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is >>>> still a "crime". The reason is that the system has to have the >>>> bytes_until_sample on a per-thread level and it made "sense" to have it >>>> with the TLAB implementation. Also, I was not sure how people felt about >>>> adding something to the thread instance instead. >>>> >>>> Do you think it fits better at the Thread level? I can see how >>>> difficult it is to make it happen there and add some logic there. Let me >>>> know what you think. >>>> >>>> >>>> We have an unfortunate situation where everyone that has some fields >>>> that are thread local tend to dump them right into Thread, making the size >>>> and complexity of Thread grow as it becomes tightly coupled with various >>>> unrelated subsystems. It would be desirable to have a separate class for >>>> this instead that encapsulates the sampling logic. That class could >>>> possibly reside in Thread though as a value object of Thread. >>>> >>> >>> I imagined that would be the case but was not sure. I will look at the >>> example that Robbin is talking about (ThreadSMR) and will see how to >>> refactor my code to use that. 
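Erik's suggestion above, a separate class that encapsulates the sampling state and lives in Thread as a value object, might look roughly like this. It is a sketch under stated assumptions: `SketchThread`, `record_allocation()` and the fixed sampling interval are made up for illustration, and the thread does not say how the real implementation picks the next sampling point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: per-thread sampling state kept in its own class
// instead of being spread over Thread or the TLAB.
class ThreadHeapSampler {
 public:
  explicit ThreadHeapSampler(std::size_t interval)
      : _interval(interval), _bytes_until_sample(interval) {
    assert(interval > 0);
  }

  // Returns true when this allocation crosses the sampling point.
  // Resetting to a fixed interval is an assumption made for brevity.
  bool record_allocation(std::size_t size_in_bytes) {
    if (size_in_bytes < _bytes_until_sample) {
      _bytes_until_sample -= size_in_bytes;
      return false;
    }
    _bytes_until_sample = _interval;  // choose the next sampling point
    return true;
  }

  std::size_t bytes_until_sample() const { return _bytes_until_sample; }

 private:
  std::size_t _interval;
  std::size_t _bytes_until_sample;
};

// The thread only embeds the sampler as a value member; TLAB and non-TLAB
// allocation paths can both call into it without knowing its internals.
class SketchThread {
 public:
  explicit SketchThread(std::size_t interval) : _heap_sampler(interval) {}
  ThreadHeapSampler& heap_sampler() { return _heap_sampler; }

 private:
  ThreadHeapSampler _heap_sampler;
};
```

Keeping the counter in its own class means the TLAB no longer has to host logic for allocations that never touch a TLAB.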
>>> >>> Thanks again for your help, >>> Jc >>> >>> >>>> >>>> >>>> >>>> >>>> >>>>> Hope I have answered your questions and that my feedback makes sense >>>>> to you. >>>>> >>>>> >>>> You have and thank you for them, I think we are getting to a cleaner >>>> implementation and things are getting better and more readable :) >>>> >>>> >>>> Yes it is getting better. >>>> >>>> Thanks, >>>> /Erik >>>> >>>> >>>> Thanks for your help! >>>> Jc >>>> >>>> >>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> >>>>> I double checked by changing the test >>>>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >>>>>> >>>>>> to use a smaller Tlab (2048) and made the object bigger and it goes >>>>>> through that and passes. >>>>>> >>>>>> Thanks again for your review and I look forward to your pointers for >>>>>> the questions I now have raised! >>>>>> Jc >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>>> /Erik >>>>>>> >>>>>>> >>>>>>> On 2018-01-26 06:45, JC Beyler wrote: >>>>>>> >>>>>>>> Thanks Robbin for the reviews :) >>>>>>>> >>>>>>>> The new full webrev is here: >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>>>>>> The incremental webrev is here: >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>>>>>> >>>>>>>> I inlined my answers: >>>>>>>> >>>>>>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn < >>>>>>>> robbin.ehn at oracle.com> wrote: >>>>>>>> >>>>>>>>> Hi JC, great to see another revision! >>>>>>>>> >>>>>>>>> #### >>>>>>>>> heapMonitoring.cpp >>>>>>>>> >>>>>>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>>>>>> When StackTraceData is moved from _allocated_traces: >>>>>>>>> L452 store_garbage_trace(trace); >>>>>>>>> it contains a dead oop. 
>>>>>>>>> _allocated_traces could instead be a tupel of oop and >>>>>>>>> StackTraceData thus >>>>>>>>> dead oops are not kept. >>>>>>>>> >>>>>>>> Done I used inheritance to make the copier work regardless but the >>>>>>>> idea is the same. >>>>>>>> >>>>>>>> You should use the new Access API for loading the oop, something >>>>>>>>> like >>>>>>>>> this: >>>>>>>>> RootAccess::load(...) >>>>>>>>> I don't think you need to use Access API for clearing the oop, but >>>>>>>>> it >>>>>>>>> would >>>>>>>>> look nicer. And you shouldn't probably be using: >>>>>>>>> Universe::heap()->is_in_reserved(value) >>>>>>>>> >>>>>>>> I am unfamiliar with this but I think I did do it like you wanted me >>>>>>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>>>>>> oop exactly, is there somewhere that does that, which I can use to >>>>>>>> do >>>>>>>> the same? >>>>>>>> >>>>>>>> I removed the is_in_reserved, this came from our internal version, I >>>>>>>> don't know why it was there but my tests work without so I removed >>>>>>>> it >>>>>>>> :) >>>>>>>> >>>>>>>> >>>>>>>> The lock: >>>>>>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>>>>>> Is not needed as far as I can see. >>>>>>>>> weak_oops_do is called in a safepoint, no TLAB allocation can >>>>>>>>> happen and >>>>>>>>> JVMTI thread can't access these data-structures. Is there >>>>>>>>> something more >>>>>>>>> to >>>>>>>>> this lock that I'm missing? >>>>>>>>> >>>>>>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>>>>>> ones), it can get to the point of trying to copying the >>>>>>>> _allocated_traces. I imagine it is possible that this is happening >>>>>>>> during a GC or that it can be started and a GC happens afterwards. >>>>>>>> Therefore, it seems to me that you want this protected, no? 
>>>>>>>> >>>>>>>> >>>>>>>> #### >>>>>>>>> You have 6 files without any changes in them (any more): >>>>>>>>> g1CollectedHeap.cpp >>>>>>>>> psMarkSweep.cpp >>>>>>>>> psParallelCompact.cpp >>>>>>>>> genCollectedHeap.cpp >>>>>>>>> referenceProcessor.cpp >>>>>>>>> thread.hpp >>>>>>>>> >>>>>>>>> Done. >>>>>>>> >>>>>>>> #### >>>>>>>>> I have not looked closely, but is it possible to hide heap >>>>>>>>> sampling in >>>>>>>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>>>>>>> >>>>>>>>> I am imagining that you are saying to move the code that does the >>>>>>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>>>>>> etc.) into the AllocTracer code itself? I think that is right and >>>>>>>> I'll >>>>>>>> look if that is possible and prepare a webrev to show what would be >>>>>>>> needed to make that happen. >>>>>>>> >>>>>>>> #### >>>>>>>>> Minor nit, when declaring pointer there is a little mix of having >>>>>>>>> the >>>>>>>>> pointer adjacent by type name and data name. (Most hotspot code is >>>>>>>>> by >>>>>>>>> type >>>>>>>>> name) >>>>>>>>> E.g. >>>>>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>>>>>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>>>>>>> (not just this file) >>>>>>>>> >>>>>>>>> Done! >>>>>>>> >>>>>>>> #### >>>>>>>>> HeapMonitorThreadOnOffTest.java:77 >>>>>>>>> I would make g_tmp volatile, otherwise the assignment in loop may >>>>>>>>> theoretical be skipped. >>>>>>>>> >>>>>>>>> Also done! >>>>>>>> >>>>>>>> Thanks again! >>>>>>>> Jc >>>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From erik.joelsson at oracle.com Wed Mar 28 20:17:39 2018 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 28 Mar 2018 13:17:39 -0700 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <3fce1a22-39e4-b8ba-0713-03cd013a6709@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <86488d2c-61e3-e489-d9fc-976178c35775@oracle.com> <2329dea1-a75e-bb70-95a6-242a93006c6d@oracle.com> <32db4add-8a59-03db-0074-e7df0aed14b8@oracle.com> <3fce1a22-39e4-b8ba-0713-03cd013a6709@oracle.com> Message-ID: Build changes still look good to me. /Erik On 2018-03-28 03:31, Magnus Ihse Bursie wrote: > On 2018-03-28 01:52, Weijun Wang wrote: >> >>> On Mar 24, 2018, at 6:03 AM, Magnus Ihse Bursie >>> wrote: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8200193 -- for >>> jdk.security.auth >> There is only one function to export and it already has JNIEXPORT, so >> you can just remove the new $(LIBJAAS_CFLAGS) [1]. > Ok, thanks Max! >> Are you going to update your webrev? > Here is a new webrev. It includes your recommended change in > Lib-jdk.security.auth.gmk. > > It is also updated to keep track of changes in shared native libraries > that have happened in the mainline since my first webrev. Most notable > is the addition of libjsig. For now, I have just added the JNIEXPORT > markers for the platforms that need it. Hopefully we can unify libjsig > across all platforms, but that seems to be more complicated than I > thought, so that'll have to wait. > > I have also received word from Phil Race that there were no testing > issues for client, so he's happy as well.
> > Updated webrev: > http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.03 > > /Magnus > >> >> Thanks >> Max >> >> [1] >> http://cr.openjdk.java.net/~ihse/JDK-8200178-remove-mapfiles/webrev.01/make/lib/Lib-jdk.security.auth.gmk.sdiff.html > From martinrb at google.com Wed Mar 28 21:53:10 2018 From: martinrb at google.com (Martin Buchholz) Date: Wed, 28 Mar 2018 14:53:10 -0700 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> Message-ID: I can't find any documentation for what JNIEXPORT and friends actually do. People including myself have been cargo-culting JNIEXPORT and JNICALL for decades. Why aren't they in the JNI spec? --- It's fishy that the attribute externally_visible (which seems very interesting!) is ARM specific.
#ifdef ARM
  #define JNIEXPORT __attribute__((externally_visible,visibility("default")))
  #define JNIIMPORT __attribute__((externally_visible,visibility("default")))
#else
  #define JNIEXPORT __attribute__((visibility("default")))
  #define JNIIMPORT __attribute__((visibility("default")))
#endif
From magnus.ihse.bursie at oracle.com Wed Mar 28 22:14:13 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 29 Mar 2018 00:14:13 +0200 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> Message-ID: <5e7f7a0c-bf52-e095-ede6-a2599376cb22@oracle.com> On 2018-03-28 23:53, Martin Buchholz wrote: > I can't find any documentation for what JNIEXPORT and friends actually do. > People including myself have been cargo-culting JNIEXPORT and JNICALL > for decades. > Why aren't they in the JNI spec? That surprises me. I'm quite certain that javah (or rather, javac -h nowadays) generates header files with JNIEXPORT and JNICALL.
As you can see in the jni.h and jni_md.h files, JNIEXPORT equals __attribute__((visibility("default"))) for compilers that support it (gcc and friends), and __declspec(dllexport) for Windows. This means that the symbol should be exported. (And it's ignored if you use mapfiles aka linker scripts.) As for JNICALL, it's empty on most compilers, but evaluates to __stdcall on Windows. This defines the calling convention to use. This is required for JNI calls from Java. (Ask the JVM team why.) While it's not technically required for calling from one dll to another, it's good practice to use it all the time to be consistent. In any case, it doesn't hurt us. > > --- > > It's fishy that the attribute externally_visible (which seems very > interesting!) is ARM specific. >
> #ifdef ARM
>   #define JNIEXPORT
>     __attribute__((externally_visible,visibility("default")))
>   #define JNIIMPORT
>     __attribute__((externally_visible,visibility("default")))
Yeah, this is broken on so many levels. :-( The ARM here goes back to the old Oracle proprietary arm32 port. This used lto, link time optimization, to get an absolutely minimal runtime, at the expense of an extremely long build time. (I think linking libjvm took like 20 minutes.) But when using lto, you also need to decorate your functions with the externally_visible attribute. So this was added to get hotspot to export the proper symbols (since they, too, used the jni.h file). So, in short, we should have: 1) used a special, local jni.h file for the proprietary arm port, and/or 2) added the externally_visible attribute not based on platform, but on the existence of lto. At this point in time, we're not building the old 32-bit arm port, and I doubt anyone does. And even if so, we could probably remove the lto part, and thus remove this from jni_md.h. If you want, please file a bug. /Magnus
> #else
>   #define JNIEXPORT __attribute__((visibility("default")))
>   #define JNIIMPORT __attribute__((visibility("default")))
>
#endif > From david.holmes at oracle.com Thu Mar 29 01:32:25 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 29 Mar 2018 11:32:25 +1000 Subject: RFR(xxs): 8200384: jcmd help output should be sorted In-Reply-To: References: Message-ID: Re-directing to serviceability-dev. David On 29/03/2018 6:08 AM, Thomas Stüfe wrote: > Hi all, > > may I get reviews for this tiny trivial change which causes jcmd help > output (the command list) to be sorted? > > bug: https://bugs.openjdk.java.net/browse/JDK-8200384 > webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8200384-jcmd-help-sorted/webrev.00/webrev/ > > Thanks! > > Best Regards, Thomas > From martinrb at google.com Thu Mar 29 06:16:18 2018 From: martinrb at google.com (Martin Buchholz) Date: Wed, 28 Mar 2018 23:16:18 -0700 Subject: RFR: JDK-8200178 Remove mapfiles for JDK native libraries In-Reply-To: <5e7f7a0c-bf52-e095-ede6-a2599376cb22@oracle.com> References: <9b6ec99a-1f75-3302-36cb-679b59291a20@oracle.com> <5e7f7a0c-bf52-e095-ede6-a2599376cb22@oracle.com> Message-ID: On Wed, Mar 28, 2018 at 3:14 PM, Magnus Ihse Bursie < magnus.ihse.bursie at oracle.com> wrote: > On 2018-03-28 23:53, Martin Buchholz wrote: > > I can't find any documentation for what JNIEXPORT and friends actually do. > People including myself have been cargo-culting JNIEXPORT and JNICALL for > decades. > Why aren't they in the JNI spec? > > That surprises me. I'm quite certain that javah (or rather, javac -h > nowadays) generates header files with JNIEXPORT and JNICALL. > > As you can see in the jni.h and jni_md.h files, JNIEXPORT equals > __attribute__((visibility("default"))) for compilers that support it (gcc > and friends), and __declspec(dllexport) for Windows. This means that the > symbol should be exported. (And it's ignored if you use mapfiles aka linker > scripts.)
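To make the description of JNIEXPORT and JNICALL in this exchange concrete, here is a sketch of a pair of macros with the same shape. `SKETCH_EXPORT`, `SKETCH_CALL`, `sketch_add` and `sketch_usage` are made-up names for illustration; the real definitions live in jni_md.h and may differ per platform and JDK version.

```cpp
#include <cassert>

// Hypothetical macros mirroring the shape described in the thread:
// an export marker plus a calling-convention marker.
#if defined(_WIN32)
  #define SKETCH_EXPORT __declspec(dllexport)
  #define SKETCH_CALL __stdcall
#else
  // gcc and compatible compilers: make the symbol visible outside the
  // shared library even when building with -fvisibility=hidden.
  #define SKETCH_EXPORT __attribute__((visibility("default")))
  #define SKETCH_CALL
#endif

// Same placement as in javac -h output: export marker, return type,
// calling convention, function name.
extern "C" SKETCH_EXPORT int SKETCH_CALL sketch_add(int a, int b) {
  return a + b;
}

// Callers use the exported function like any other C function.
inline void sketch_usage() { assert(sketch_add(2, 3) == 5); }
```

Nothing in the macros themselves is specific to JNI, which is consistent with Martin's observation that they end up being used for arbitrary symbols between native components.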
> > As for JNICALL, it's empty on most compilers, but evaluates to __stdcall > on Windows. This defines the calling convention to use. This is required > for JNI calls from Java. (Ask the JVM team why.) While it's not technically > required for calling from one dll to another, it's good practice to use it > all the time to be consistent. In any case, it doesn't hurt us. > Sure, I can see how JNIEXPORT and JNICALL are implemented, but what do they *mean?* For example, one might expect from the JNI prefix that these macros are exclusively for use by JNI linking, i.e. unsupported except in the output of javac -h. But of course in practice they are used with arbitrary symbols to communicate between components of user native code, not just to communicate with the JVM. Is that a bug? From shafi.s.ahmad at oracle.com Thu Mar 29 09:11:44 2018 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Thu, 29 Mar 2018 02:11:44 -0700 (PDT) Subject: [8u] RFR for backport of "JDK-8165736: Error message should be shown when JVMTI agent cannot be attached" to jdk8u-dev Message-ID: <8c218a37-4a50-4b4f-847b-4c67e02b7866@default> Hi, Please review the backport of 'JDK-8165736: Error message should be shown when JVMTI agent cannot be attached' to jdk8u-dev. Please note that this is not a clean backport because we cannot backport the native jtreg tests, as the infrastructure for native jtreg tests has only been available since JDK 9. 
webrev: http://cr.openjdk.java.net/~shshahma/8165736/ jdk10 bug: https://bugs.openjdk.java.net/browse/JDK-8165736 original patch pushed to jdk10: http://hg.openjdk.java.net/jdk/jdk/rev/bc1cffa26561 Test: Run jprt -testset hotspot, -testset core Regards, Shafi From daniil.x.titov at oracle.com Thu Mar 29 17:27:44 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 29 Mar 2018 10:27:44 -0700 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request Message-ID: Please review the changes that ensure that no operations on deleted com.sun.jdi.request.EventRequest objects are permitted, as per the JDI specification for the com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi.EventRequestManagerImpl$EventRequestImpl throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: - getProperty() - putProperty(Object, Object) - suspendPolicy() - isEnabled() Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ Best regards, Daniil From serguei.spitsyn at oracle.com Thu Mar 29 18:46:05 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 29 Mar 2018 11:46:05 -0700 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request In-Reply-To: References: Message-ID: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> Hi Daniil, It looks good in general. 
One minor comment is that it would be nice to make a cleanup (as we already discussed) for all places like this: 202 if (isEnabled() || deleted) { 203 throw invalidState(); 204 } As the isEnabled() now checks for deleted and throws the invalidState() then we can simplify these fragments to be: 202 if (isEnabled()) { 203 throw invalidState(); 204 } Thanks, Serguei On 3/29/18 10:27, Daniil Titov wrote: > Please review the changes that ensure that no operation on deleted com.sun.jdi.request.EventRequest objects are permitted as per JDI specification for com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi. EventRequestManagerImpl$EventRequestImpl to throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: > - getProperty() > - putProperty(Object, Object) > - suspendPolicy() > - isEnabled() > > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ > > Best regards, > Daniil > > From daniil.x.titov at oracle.com Thu Mar 29 22:36:17 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 29 Mar 2018 15:36:17 -0700 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request In-Reply-To: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> References: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> Message-ID: Hi Serguei, Please review a new version of the fix that has these places corrected. Webreb: http://cr.openjdk.java.net/~dtitov/4613913/webrev.03 Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 Thanks! Best regards, Daniil ?On 3/29/18, 11:46 AM, "serguei.spitsyn at oracle.com" wrote: Hi Daniil, It looks good in general. 
One minor comment is that it would be nice to make a cleanup (as we already discussed) for all places like this: 202 if (isEnabled() || deleted) { 203 throw invalidState(); 204 } As the isEnabled() now checks for deleted and throws the invalidState() then we can simplify these fragments to be: 202 if (isEnabled()) { 203 throw invalidState(); 204 } Thanks, Serguei On 3/29/18 10:27, Daniil Titov wrote: > Please review the changes that ensure that no operation on deleted com.sun.jdi.request.EventRequest objects are permitted as per JDI specification for com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi. EventRequestManagerImpl$EventRequestImpl to throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: > - getProperty() > - putProperty(Object, Object) > - suspendPolicy() > - isEnabled() > > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ > > Best regards, > Daniil > > From serguei.spitsyn at oracle.com Thu Mar 29 22:38:10 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 29 Mar 2018 15:38:10 -0700 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request In-Reply-To: References: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> Message-ID: <57ec5515-317a-0989-de85-56527c867b90@oracle.com> Looks good. Thank you for the update! Thanks, Serguei On 3/29/18 15:36, Daniil Titov wrote: > Hi Serguei, > > Please review a new version of the fix that has these places corrected. > > Webreb: http://cr.openjdk.java.net/~dtitov/4613913/webrev.03 > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Thanks! > > Best regards, > Daniil > > ?On 3/29/18, 11:46 AM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It looks good in general. 
> One minor comment is that it would be nice to make a cleanup > (as we already discussed) for all places like this: > > 202 if (isEnabled() || deleted) { > 203 throw invalidState(); > 204 } > > As the isEnabled() now checks for deleted and throws the invalidState() > then we can simplify these fragments to be: > > 202 if (isEnabled()) { > 203 throw invalidState(); > 204 } > > > Thanks, > Serguei > > > On 3/29/18 10:27, Daniil Titov wrote: > > Please review the changes that ensure that no operation on deleted com.sun.jdi.request.EventRequest objects are permitted as per JDI specification for com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi. EventRequestManagerImpl$EventRequestImpl to throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: > > - getProperty() > > - putProperty(Object, Object) > > - suspendPolicy() > > - isEnabled() > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ > > > > Best regards, > > Daniil > > > > > > > > From david.holmes at oracle.com Fri Mar 30 00:12:50 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 30 Mar 2018 10:12:50 +1000 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request In-Reply-To: References: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> Message-ID: Daniil, Even as far back as 2007 there was concern that changing the current behaviour might break existing code. That has to be an even bigger concern now! Further the spec is sloppy here: " Once the eventRequest is deleted, no operations (for example, EventRequest.setEnabled(boolean)) are permitted." This is too loose. What is an "operation"? Is a query like isEnabled() really an "operation"? I would not consider it so. And if we can delete requests why is there no "isDeleted" query? 
The spec seems incomplete and too vague. To me this is something that should have been clarified in the spec first and then the implementation brought into alignment. But that should have happened many years ago. Changing this now seems risky to me. This change in long-standing behaviour also requires a CSR request if it is to proceed. David ----- On 30/03/2018 8:36 AM, Daniil Titov wrote: > Hi Serguei, > > Please review a new version of the fix that has these places corrected. > > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.03 > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Thanks! > > Best regards, > Daniil > > On 3/29/18, 11:46 AM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It looks good in general. > One minor comment is that it would be nice to make a cleanup > (as we already discussed) for all places like this: > > 202 if (isEnabled() || deleted) { > 203 throw invalidState(); > 204 } > > As the isEnabled() now checks for deleted and throws the invalidState() > then we can simplify these fragments to be: > > 202 if (isEnabled()) { > 203 throw invalidState(); > 204 } > > > Thanks, > Serguei > > > On 3/29/18 10:27, Daniil Titov wrote: > > Please review the changes that ensure that no operation on deleted com.sun.jdi.request.EventRequest objects are permitted as per JDI specification for com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi. 
EventRequestManagerImpl$EventRequestImpl to throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: > > - getProperty() > > - putProperty(Object, Object) > > - suspendPolicy() > > - isEnabled() > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ > > > > Best regards, > > Daniil > > > > > > > > From serguei.spitsyn at oracle.com Fri Mar 30 00:16:29 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 29 Mar 2018 17:16:29 -0700 Subject: RFR 4613913: Four EventRequest methods are invokable on deleted request In-Reply-To: References: <579aad5f-fdaa-e0c9-dc16-7bc2394cb82f@oracle.com> Message-ID: <3f3cad45-5d88-c825-e9ec-3b2697f68b49@oracle.com> Looks good. Thank you for the update! Thanks, Serguei On 3/29/18 15:36, Daniil Titov wrote: > Hi Serguei, > > Please review a new version of the fix that has these places corrected. > > Webreb: http://cr.openjdk.java.net/~dtitov/4613913/webrev.03 > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Thanks! > > Best regards, > Daniil > > ?On 3/29/18, 11:46 AM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It looks good in general. > One minor comment is that it would be nice to make a cleanup > (as we already discussed) for all places like this: > > 202 if (isEnabled() || deleted) { > 203 throw invalidState(); > 204 } > > As the isEnabled() now checks for deleted and throws the invalidState() > then we can simplify these fragments to be: > > 202 if (isEnabled()) { > 203 throw invalidState(); > 204 } > > > Thanks, > Serguei > > > On 3/29/18 10:27, Daniil Titov wrote: > > Please review the changes that ensure that no operation on deleted com.sun.jdi.request.EventRequest objects are permitted as per JDI specification for com.sun.jdi.request.EventRequestManager.deleteEventRequest(com.sun.jdi.request.EventRequest) method. The fix makes the following 4 methods in class com.sun.tools.jdi. 
EventRequestManagerImpl$EventRequestImpl to throw com.sun.jdi.request.InvalidRequestStateException if the request is deleted: > > - getProperty() > > - putProperty(Object, Object) > > - suspendPolicy() > > - isEnabled() > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-4613913 > > Webrev: http://cr.openjdk.java.net/~dtitov/4613913/webrev.02/ > > > > Best regards, > > Daniil > > > > > > > > From Derek.White at cavium.com Fri Mar 30 23:24:09 2018 From: Derek.White at cavium.com (White, Derek) Date: Fri, 30 Mar 2018 23:24:09 +0000 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: References: <5A819F10.8040201@oracle.com> <5A8414AC.3020209@oracle.com> Message-ID: Hi Jc, I've been having trouble getting your patch to apply correctly. I may have based it on the wrong version. In any case, I think there's a missing update to macroAssembler_aarch64.cpp, in MacroAssembler::tlab_allocate(), where "JavaThread::tlab_end_offset()" should become "JavaThread::tlab_current_end_offset()". This should correspond to the other ports' changes in templateTable_.cpp files. Thanks! - Derek From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of JC Beyler Sent: Wednesday, March 28, 2018 11:43 AM To: Erik Österlund Cc: serviceability-dev at openjdk.java.net; hotspot-compiler-dev Subject: Re: JDK-8171119: Low-Overhead Heap Profiling Hi all, I've been working mostly on deflaking the tests and on the wording in the JVMTI spec. Here are the two incremental webrevs: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.5_6/ http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.06_07/ Here is the total webrev: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.07/ Here are the notes of this change: - Currently the tests pass 100 times in a row, I am working on checking if they pass 1000 times in a row. 
- The default sampling rate is set to 512k; this is what we use internally, and having a default means that to enable the sampling with the default, the user only has to do an enable event/disable event via JVMTI (instead of enable + set sample rate). - I deprecated the code that was handling the fast path tlab refill if it happened, since this is now deprecated - Though I saw that Graal is still using it so I have to see what needs to be done there exactly Finally, using the Dacapo benchmark suite, I noted a 1% overhead for when the event system is turned on and the callback to the native agent is just empty. I got a 3% overhead with a 512k sampling rate with the code I put in the native side of my tests. Thanks and comments are appreciated, Jc On Mon, Mar 19, 2018 at 2:06 PM JC Beyler wrote: Hi all, The incremental webrev update is here: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event4_5/ The full webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/ Major changes here are: - I've removed the heapMonitoring.cpp code in favor of just having the sampling events as per Serguei's request; I still have to do some overhead measurements but the tests prove the concept can work - Most of the tlab code is unchanged, the only major part is that now things get sent off to event collectors when used and enabled. - Added the interpreter collectors to handle interpreter execution - Updated the name from SetTlabHeapSampling to SetHeapSampling to be more generic - Added a mutex for the thread sampling so that we can initialize an internal static array safely - Ported the tests from the old system to this new one I've also updated the JEP and CSR to reflect these changes: https://bugs.openjdk.java.net/browse/JDK-8194905 https://bugs.openjdk.java.net/browse/JDK-8171119 In order to make this have some forward progress, I've removed the heap sampling code entirely and now rely entirely on the event sampling system. 
The tests reflect this by using a simplified implementation of what an agent could do: http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event5/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitor.c (Search for anything mentioning event_storage). I have not taken the time to port the whole code we had originally in heapMonitoring to this. I hesitate only because that code was in C++, I'd have to port it to C and this is for tests so perhaps what I have now is good enough? As far as testing goes, I've ported all the relevant tests and then added a few: - Turning the system on/off - Testing using various GCs - Testing using the interpreter - Testing the sampling rate - Testing with objects and arrays - Testing with various threads Finally, as overhead goes, I have the numbers of the system off vs a clean build and I have 0% overhead, which is what we'd want. This was using the Dacapo benchmarks. I am now preparing to run a version with the events on using dacapo and will report back here. Any comments are welcome :) Jc On Thu, Mar 8, 2018 at 4:00 PM JC Beyler > wrote: Hi all, I apologize for the delay but I wanted to add an event system and that took a bit longer than expected and I also reworked the code to take into account the deprecation of FastTLABRefill. This update has four parts: A) I moved the implementation from Thread to ThreadHeapSampler inside of Thread. Would you prefer it as a pointer inside of Thread or like this works for you? Second question would be would you rather have an association outside of Thread altogether that tries to remember when threads are live and then we would have something like: ThreadHeapSampler::get_sampling_size(this_thread); I worry about the overhead of this but perhaps it is not too too bad? B) I also have been working on the Allocation event system that sends out a notification at each sampled event. This will be practical when wanting to do something at the allocation point. 
I'm also looking at whether the whole heapMonitoring code could reside in the agent code and not in the JDK. I'm not convinced but I'm talking to Serguei about it to see/assess :) - Also added two tests for the new event subsystem C) Removed the slow_path fields inside the TLAB code since now FastTLABRefill is deprecated D) Updated the JVMTI documentation and specification for the methods. So the incremental webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.09_10/ and the full webrev is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.10 I believe I have updated the various JIRA issues that track this :) Thanks for your input, Jc On Wed, Feb 14, 2018 at 10:34 PM, JC Beyler wrote: Hi Erik, I inlined my answers, the last of which seems to answer Robbin's concerns about the same thing (adding things to Thread). On Wed, Feb 14, 2018 at 2:51 AM, Erik Österlund wrote: Hi JC, Comments are inlined below. On 2018-02-13 06:18, JC Beyler wrote: Hi Erik, Thanks for your answers, I've now inlined my own answers/comments. I've done a new webrev here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ The incremental is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ Note to all: - I've been integrating changes from Erik/Serguei/David comments so this webrev incremental is a bit of an answer to all comments in one. I apologize for that :) On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund wrote: Hi JC, Sorry for the delayed reply. Inlined answers: On 2018-02-06 00:04, JC Beyler wrote: Hi Erik, (Renaming this to be folded into the newly renamed thread :)) First off, thanks a lot for reviewing the webrev! I appreciate it! 
I updated the webrev to: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ And the incremental one is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ It contains: - The change of "since" from 9 to 11 in jvmti.xml - The use of the OrderAccess for initialized - Clearing the oop I also have inlined my answers to your comments. The biggest question will come from the multiple *_end variables. A bit of the logic there is due to handling the slow path refill vs fast path refill and checking that the rug was not pulled underneath the slowpath. I believe that a previous comment was that TlabFastRefill was going to be deprecated. If this is true, we could revert this code a bit and just do: if TlabFastRefill is enabled, disable this. And then deprecate that when TlabFastRefill is deprecated. This might simplify this webrev and I can work on a follow-up that either: removes TlabFastRefill if Robbin does not have the time to do it or adds the support to the assembly side to handle this correctly. What do you think? I support removing TlabFastRefill, but I think it is good to not depend on that happening first. I'm slowly pushing on the FastTLABRefill (https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping both separate for now though so that we can think of both differently. Now, below, inlined are my answers: On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund wrote: Hi JC, Hope I am reviewing the right version of your work. Here goes... src/hotspot/share/gc/shared/collectedHeap.inline.hpp: 159 AllocTracer::send_allocation_outside_tlab(klass, result, size * HeapWordSize, THREAD); 160 161 THREAD->tlab().handle_sample(THREAD, result, size); 162 return result; 163 } Should not call tlab()->X without checking if (UseTLAB) IMO. Done! More about this later. src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: So first of all, there seem to be quite a few ends. There is an "end", a "hard end", a "slow path end", and an "actual end". 
Moreover, it seems like the "hard end" is actually further away than the "actual end". So the "hard end" seems like more of a "really definitely actual end" or something. I don't know about you, but I think it looks kind of messy. In particular, I don't feel like the name "actual end" reflects what it represents, especially when there is another end that is behind the "actual end". 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { 414 // Did a fast TLAB refill occur? 415 if (_slow_path_end != _end) { 416 // Fix up the actual end to be now the end of this TLAB. 417 _slow_path_end = _end; 418 _actual_end = _end; 419 } 420 421 return _actual_end + alignment_reserve(); 422 } I really do not like making getters unexpectedly have this kind of side effect. It is not expected that when you ask for the "hard end", you implicitly update the "slow path end" and "actual end" to new values. As I said, a lot of this is due to the FastTlabRefill. If I make this not support FastTlabRefill, this goes away. The reason the system needs to update itself at the get is that you only know at that get if things have shifted underneath the tlab slow path. I am not sure of really better names (naming is hard!), perhaps we could do these names: - current_tlab_end // Either the allocated tlab end or a sampling point - last_allocation_address // The end of the tlab allocation - last_slowpath_allocated_end // In case a fast refill occurred the end might have changed, this is to remember slow vs fast path refills the hard_end method can be renamed to something like: tlab_end_pointer() // The end of the tlab including a bit of alignment reserved bytes Those names sound better to me. Could you please provide a mapping from the old names to the new names so I understand which one is which please? 
This is my current guess of what you are proposing: end -> current_tlab_end; actual_end -> last_allocation_address; slow_path_end -> last_slowpath_allocated_end; hard_end -> tlab_end_pointer. Yes that is correct, that was what I was proposing. I would prefer this naming: end -> slow_path_end // the end for taking a slow path; either due to sampling or refilling actual_end -> allocation_end // the end for allocations slow_path_end -> last_slow_path_end // last address for slow_path_end (as opposed to allocation_end) hard_end -> reserved_end // the end of the reserved space of the TLAB About setting things in the getter... that still seems like a very unpleasant thing to me. It would be better to inspect the call hierarchy and explicitly update the ends where they need updating, and assert in the getter that they are in sync, rather than implicitly setting various ends as a surprising side effect in a getter. It looks like the call hierarchy is very small. With my new naming convention, reserved_end() would presumably return _allocation_end + alignment_reserve(), and have an assert checking that _allocation_end == _last_slow_path_allocation_end, complaining that this invariant must hold, and that a caller to this function, such as make_parsable(), must first explicitly synchronize the ends as required, to honor that invariant. I've renamed the variables to how you preferred it except for the _end one. I did: current_end, last_allocation_address, tlab_end_ptr. The reason is that the architecture-dependent code uses the thread.hpp API and it already has tlab included in the name, so it becomes tlab_current_end (which is better than tlab_current_tlab_end in my opinion). I also moved the update into a separate method with a TODO that says to remove it when FastTLABRefill is deprecated. This looks a lot better now. Thanks. Note that the following comment now needs updating accordingly in threadLocalAllocBuffer.hpp: 41 // Heap sampling is performed via the end/actual_end fields. 
42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. There might be other comments too, I have not looked in detail. This was the only spot that still had an actual_end, I fixed it now. I'll do a sweep to double check other comments. Not sure it's better but before updating the webrev, I wanted to try to get input/consensus :) (Note hard_end was always further off than end). src/hotspot/share/prims/jvmti.xml: 10357 10358 10359 Can sample the heap. 10360 If this capability is enabled then the heap sampling methods can be called. 10361 10362 Looks like this capability should not be "since 9" if it gets integrated now. Updated now to 11, crossing my fingers :) src/hotspot/share/runtime/heapMonitoring.cpp: 448 if (is_alive->do_object_b(value)) { 449 // Update the oop to point to the new object if it is still alive. 450 f->do_oop(&(trace.obj)); 451 452 // Copy the old trace, if it is still live. 453 _allocated_traces->at_put(curr_pos++, trace); 454 455 // Store the live trace in a cache, to be served up on /heapz. 456 _traces_on_last_full_gc->append(trace); 457 458 count++; 459 } else { 460 // If the old trace is no longer live, add it to the list of 461 // recently collected garbage. 462 store_garbage_trace(trace); 463 } In the case where the oop was not live, I would like it to be explicitly cleared. Done I think how you wanted it. Let me know because I'm not familiar with the RootAccess API. I'm unclear if I'm doing this right or not so reviews of these parts are highly appreciated. Robbin had talked of perhaps later pushing this all into a OopStorage, should I do this now do you think? Or can that wait a second webrev later down the road? I think using handles can and should be done later. You can use the Access API now. 
I noticed that you are missing an #include "oops/access.inline.hpp" in your heapMonitoring.cpp file. The missing header is there for me so I don't know, I made sure it is present in the latest webrev. Sorry about that. + Did I clear it the way you wanted me to or were you thinking of something else? That is precisely how I wanted it to be cleared. Thanks. + Final question here, seems like if I were to want to not do the f->do_oop directly on the trace.obj, I'd need to do something like: f->do_oop(&value); ... trace->store_oop(value); to update the oop internally. Is that right/is that one of the advantages of going to the Oopstorage sooner than later? I think you really want to do the do_oop on the root directly. Is there a particular reason why you would not want to do that? Otherwise, yes - the benefit with using the handle approach is that you do not need to call do_oop explicitly in your code. There is no reason except that now we have a load_oop and a get_oop_addr, I was not sure what you would think of that. That's fine. Also I see a lot of concurrent-looking use of the following field: 267 volatile bool _initialized; Please note that the "volatile" qualifier does not help with reordering here. Reordering between volatile and non-volatile fields is completely free for both compiler and hardware, except for windows with MSVC, where volatile semantics is defined to use acquire/release semantics, and the hardware is TSO. But for the general case, I would expect this field to be stored with OrderAccess::release_store and loaded with OrderAccess::load_acquire. Otherwise it is not thread safe. Because everything is behind a mutex, I wasn't really worried about this. I have a test that has multiple threads trying to hit this corner case and it passes. However, to be paranoid, I updated it to using the OrderAccess API now, thanks! Let me know what you think there too! 
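For readers less familiar with the memory-ordering point raised above, here is a minimal sketch in portable C++11 rather than HotSpot code. It assumes that std::atomic with memory_order_release / memory_order_acquire is an acceptable stand-in for OrderAccess::release_store / OrderAccess::load_acquire; the Storage class and its members are hypothetical and are not taken from the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the trace storage discussed in this thread.
class Storage {
 public:
  // Writer side: fill in the payload first, then publish the flag with
  // release semantics so the payload write cannot be reordered after it.
  void initialize(int max_entries) {
    _max_entries = max_entries;  // plain, non-atomic write
    _initialized.store(true, std::memory_order_release);
  }

  // Reader side: the acquire load pairs with the release store above, so
  // a reader that sees true also sees the payload written before it.
  bool initialized() const {
    return _initialized.load(std::memory_order_acquire);
  }

  // Only meaningful once initialized() has returned true.
  int max_entries() const {
    assert(initialized());
    return _max_entries;
  }

 private:
  int _max_entries = 0;
  std::atomic<bool> _initialized{false};
};
```

A plain volatile bool gives none of these pairing guarantees in standard C++, which is the point made about the _initialized field. If every access is made with a lock held, as proposed later in the thread, a plain bool plus an assertion on lock ownership would be the simpler choice.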
If it is indeed always supposed to be read and written under a mutex, then I would strongly prefer to have it accessed as a normal non-volatile member, and have an assertion that given lock is held or we are in a safepoint, as we do in many other places. Something like this: assert(HeapMonitorStorage_lock->owned_by_self() || (SafepointSynchronize::is_at_safepoint() && Thread::current()->is_VM_thread()), "this should not be accessed concurrently"); It would be confusing to people reading the code if there are uses of OrderAccess that are actually always protected under a mutex. Thank you for the exact example to be put in the code! I put it around each access/assignment of the _initialized method and found one case where yes you can touch it and not have the lock. It actually is "ok" because you don't act on the storage until later and only when you really want to modify the storage (see the object_alloc_do_sample method which calls the add_trace method). But, because of this, I'm going to put the OrderAccess here, I'll do some performance numbers later and if there are issues, I might add a "unsafe" read and a "safe" one to make it explicit to the reader. But I don't think it will come to that. Okay. This double return in heapMonitoring.cpp looks wrong: 283 bool initialized() { 284 return OrderAccess::load_acquire(&_initialized) != 0; 285 return _initialized; 286 } Since you said object_alloc_do_sample() is the only place where you do not hold the mutex while reading initialized(), I had a closer look at that. It looks like in its current shape, the lack of a mutex may lead to a memory leak. In particular, it first checks if (initialized()). Let's assume this is now true. It then allocates a bunch of stuff, and checks if the number of frames were over 0. If they were, it calls StackTraceStorage::storage()->add_trace() seemingly hoping that after grabbing the lock in there, initialized() will still return true. 
But it could now return false and skip doing anything, in which case the allocated stuff will never be freed. I fixed this now by making add_trace return a boolean and checking for that. It will be in the next webrev. Thanks, the truth is that in our implementation the system is always on or off, so this never really occurs :). In this version though, that is not true and it's important to handle so thanks again! So the analysis seems to be that _initialized is only used outside of the mutex in one instance, where it is used to perform double-checked locking, which actually causes a memory leak. I am not proposing how to fix that, just raising the issue. If you still want to perform this double-checked locking somehow, then the use of acquire/release still seems odd, because its memory ordering restrictions never come into play in this particular case. If they ever did, then the use of destroy_stuff(); release_store(_initialized, 0) would be broken anyway, as that would imply that whatever concurrent reader there ever was, after reading _initialized with load_acquire(), could *never* read the data that is concurrently destroyed anyway. I would be biased to think that RawAccess::load/store looks like a more appropriate solution, given that the memory leak issue is resolved. I do not know how painful it would be to not perform this double-checked locking. So I agree with this entirely. I also looked a bit more, and the difference and code really stem from our internal version. In this version however, there are actually a lot of things going on that I did not go entirely through in my head but this comment made me ponder a bit more on it. Since every object_alloc_do_sample is protected by a check to HeapMonitoring::enabled(), there is only a small chance that the call is happening when things have been disabled. 
So there is no real need to do a first check on initialized; it is a rare occurrence that a call happens to object_alloc_do_sample and the initialized check of the storage returns false. (By the way, even if you did call object_alloc_do_sample without looking at HeapMonitoring::enabled(), that would be ok too. You would gather the stacktrace and get nowhere at the add_trace call, which would return false; so though not optimal performance wise, nothing would break.)

Furthermore, add_trace is really the point of no return, and there we take the mutex lock and then do the initialized check. So, in the end, I did two things: I removed that first check and then I removed the OrderAccess for the storage initialized flag. I think I now have a better grasp and understanding of why it was done in our code and why it is not needed here. Thanks for pointing it out :). This now still passes my JTREG tests, especially the threaded one.

As a kind of meta comment, I wonder if it would make sense to add sampling for non-TLAB allocations. Seems like if someone is rapidly allocating a whole bunch of 1 MB objects that never fit in a TLAB, I might still be interested in seeing that in my traces, and not get surprised that the allocation rate is very high yet not showing up in any profiles.

That is handled by the handle_sample where you wanted me to put a UseTLAB check, because you hit that case if the allocation is too big.

I see. It was not obvious to me that non-TLAB sampling is done in the TLAB class. That seems like an abstraction crime. What I wanted in my previous comment was that we do not call into the TLAB when we are not using TLABs. If there is sampling logic in the TLAB that is used for something other than TLABs, then it seems like that logic simply does not belong inside of the TLAB. It should be moved out of the TLAB, and instead have the TLAB call this common abstraction that makes sense.
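One way to picture the "common abstraction" suggested above is a small sampler object that owns the bytes_until_sample counter and is called from both the TLAB and the non-TLAB allocation paths. The sketch below is only an illustration under assumptions: the class name AllocationSampler and its methods are hypothetical, and it uses a fixed sampling interval for simplicity, which is not necessarily how the actual patch picks the next sample point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread sampler that keeps the sampling state out of the
// TLAB class. Assumes interval > 0 and a fixed interval between samples.
class AllocationSampler {
 public:
  explicit AllocationSampler(std::size_t interval)
      : _interval(interval), _bytes_until_sample(interval) {}

  // Accounts for an allocation of `size` bytes, whichever path made it.
  // Returns true when this allocation should be sampled.
  bool record_allocation(std::size_t size) {
    if (size < _bytes_until_sample) {
      _bytes_until_sample -= size;
      return false;
    }
    _bytes_until_sample = _interval;  // start counting toward the next sample
    return true;
  }

  std::size_t bytes_until_sample() const { return _bytes_until_sample; }

 private:
  std::size_t _interval;
  std::size_t _bytes_until_sample;
};
```

A TLAB allocation path and a large-object path outside the TLAB would both call record_allocation on the same per-thread instance, so an allocation too big for any TLAB still triggers a sample without the TLAB code being involved.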
So in the incremental version: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is still a "crime". The reason is that the system has to have the bytes_until_sample on a per-thread level and it made "sense" to have it with the TLAB implementation. Also, I was not sure how people felt about adding something to the thread instance instead.

Do you think it fits better at the Thread level? I can see how difficult it is to make it happen there and add some logic there. Let me know what you think.

We have an unfortunate situation where everyone that has some fields that are thread local tends to dump them right into Thread, making the size and complexity of Thread grow as it becomes tightly coupled with various unrelated subsystems. It would be desirable to have a separate class for this instead that encapsulates the sampling logic. That class could possibly reside in Thread though, as a value object of Thread.

I imagined that would be the case but was not sure. I will look at the example that Robbin is talking about (ThreadSMR) and will see how to refactor my code to use that.

Thanks again for your help,
Jc

Hope I have answered your questions and that my feedback makes sense to you.

You have, and thank you for them. I think we are getting to a cleaner implementation and things are getting better and more readable :)

Yes it is getting better.

Thanks,
/Erik

Thanks for your help!
Jc

Thanks,
/Erik

I double checked by changing the test
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java
to use a smaller TLAB (2048) and made the object bigger, and it goes through that and passes.

Thanks again for your review and I look forward to your pointers for the questions I now have raised!
Jc

Thanks,
/Erik

On 2018-01-26 06:45, JC Beyler wrote:

Thanks Robbin for the reviews :)

The new full webrev is here:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/
The incremental webrev is here:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/

I inlined my answers:

On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn wrote:

Hi JC, great to see another revision!

####
heapMonitoring.cpp

StackTraceData should not contain the oop for 'safety' reasons. When StackTraceData is moved from _allocated_traces:
L452 store_garbage_trace(trace);
it contains a dead oop. _allocated_traces could instead be a tuple of oop and StackTraceData, thus dead oops are not kept.

Done. I used inheritance to make the copier work regardless, but the idea is the same.

You should use the new Access API for loading the oop, something like this:
RootAccess::load(...)
I don't think you need to use the Access API for clearing the oop, but it would look nicer. And you probably shouldn't be using:
Universe::heap()->is_in_reserved(value)

I am unfamiliar with this but I think I did do it like you wanted me to (all tests pass so that's a start). I'm not sure how to clear the oop exactly; is there somewhere that does that, which I can use to do the same?

I removed the is_in_reserved. This came from our internal version; I don't know why it was there, but my tests work without it so I removed it :)

The lock:
L424 MutexLocker mu(HeapMonitorStorage_lock);
is not needed as far as I can see. weak_oops_do is called in a safepoint, no TLAB allocation can happen, and a JVMTI thread can't access these data structures. Is there something more to this lock that I'm missing?

Since a thread can call the JVMTI getLiveTraces (or any of the other ones), it can get to the point of trying to copy the _allocated_traces. I imagine it is possible that this is happening during a GC, or that it can be started and a GC happens afterwards. Therefore, it seems to me that you want this protected, no?
####
You have 6 files without any changes in them (any more):
g1CollectedHeap.cpp
psMarkSweep.cpp
psParallelCompact.cpp
genCollectedHeap.cpp
referenceProcessor.cpp
thread.hpp

Done.

####
I have not looked closely, but is it possible to hide heap sampling in AllocTracer? (with some minor changes to the AllocTracer API)

I am imagining that you are saying to move the code that does the sampling (change the tlab end, do the call to HeapMonitoring, etc.) into the AllocTracer code itself? I think that is right, and I'll look into whether that is possible and prepare a webrev to show what would be needed to make that happen.

####
Minor nit: when declaring pointers there is a little mix of having the pointer adjacent to the type name and to the data name. (Most hotspot code is by type name.)
E.g.
heapMonitoring.cpp:711 jvmtiStackTrace *trace = ....
heapMonitoring.cpp:733 Method* m = vfst.method();
(not just this file)

Done!

####
HeapMonitorThreadOnOffTest.java:77
I would make g_tmp volatile, otherwise the assignment in the loop may theoretically be skipped.

Also done!

Thanks again!
Jc

From david.holmes at oracle.com Sat Mar 31 07:24:28 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 31 Mar 2018 17:24:28 +1000
Subject: [8u] RFR for backport of "JDK-8165736: Error message should be shown when JVMTI agent cannot be attached" to jdk8u-dev
In-Reply-To: <8c218a37-4a50-4b4f-847b-4c67e02b7866@default>
References: <8c218a37-4a50-4b4f-847b-4c67e02b7866@default>
Message-ID: <70a18b4a-a310-babe-1f41-c86100638457@oracle.com>

Hi Shafi,

On 29/03/2018 7:11 PM, Shafi Ahmad wrote:
> Hi,
>
> Please review the backport of 'JDK-8165736: Error message should be shown when JVMTI agent cannot be attached' to jdk8u-dev.
> Please note that this is not a clean backport because we cannot backport the native jtreg tests, as the infrastructure for native jtreg tests has only been available since JDK 9.

Ok.
> webrev: http://cr.openjdk.java.net/~shshahma/8165736/
> jdk10 bug: https://bugs.openjdk.java.net/browse/JDK-8165736
> original patch pushed to jdk10: http://hg.openjdk.java.net/jdk/jdk/rev/bc1cffa26561

src/share/vm/prims/jvmtiExport.cpp

You missed the initialization of ebuf:

+ char ebuf[1024] = {0};

Otherwise the functional backport seems okay.

Thanks,
David

> Test: Run jprt -testset hotspot, -testset core
>
> Regards,
> Shafi
>
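The ebuf remark above can be illustrated with a small standalone sketch. It assumes a hypothetical loader, load_agent, that only writes a message into the buffer on some failure paths; neither function below is the jvmtiExport.cpp code. Zero-initializing the buffer means that a failure path which writes nothing still yields a valid empty string instead of reading indeterminate memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical loader stand-in: fills ebuf only on one of its failure paths.
bool load_agent(const char* name, char* ebuf, std::size_t ebuflen) {
  if (name == nullptr) {
    return false;  // fails without writing anything into ebuf
  }
  if (std::strcmp(name, "missing") == 0) {
    std::snprintf(ebuf, ebuflen, "cannot open %s", name);
    return false;
  }
  return true;
}

// Builds the message that would be reported for a failed load.
// Returns an empty string when the load succeeds.
std::string describe_failure(const char* name) {
  char ebuf[1024] = {0};  // zero-initialized, as in the reviewed line
  if (load_agent(name, ebuf, sizeof(ebuf))) {
    return std::string();
  }
  return std::string("Agent load failed: ") + ebuf;
}
```

Without the `= {0}` initializer, the first failure path would append whatever bytes happened to be on the stack, which is undefined behavior in C++.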