From david.holmes at oracle.com Mon Dec 1 06:15:54 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 01 Dec 2014 16:15:54 +1000
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To: <5479E9DE.7070703@gmail.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com>
Message-ID: <547C079A.9020604@oracle.com>

Hi Yasumasa,

On 30/11/2014 1:44 AM, Yasumasa Suenaga wrote:
> Hi all,
>
>
> Thank you for checking my patch!
> I've uploaded new webrev:
> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.03/hotspot.patch
>
> David:
>> The change in:
>> src/os/aix/vm/os_aix.cpp
>> src/os/solaris/vm/os_solaris.cpp
>>
>> jio_snprintf(buffer, bufferSize, "%s/core or core.%d",
>> current_process_id());
>>
>> has no argument for the %s - presumably p was intended.
>
> I've fixed.

Thanks. The formatting needs fixing up though, the p should line up with
buffer.

I'm concerned by the changes in os_linux.cpp and os_posix.cpp to use
os::malloc. If this is being called from a signal handler there's a real
risk of deadlock if we try to use malloc/free. I know Thomas suggested this
(and sorry I didn't notice it then) but I don't think it is a good idea for
the crash handler.

Thanks,
David

>
> Staffan:
>> src/os/bsd/vm/os_linux.cpp:
>> Could we not simplify this to print a helpful message instead?
>
> Most of case in Linux, I think that core image name is "core." .
> In other case which except pipe redirection, I guess that user defines it.
> Thus I print string in kernel.core_pattern directly.
>
>> src/os/bsd/vm/os_bsd.cpp:
>> On OS X cores are by default written to /cores/core.. This is
>> configureable with the kern.corefile sysctl variable, although it is
>> rare to do so.
>
> Thank you!
> I changed path to "/cores/core." .
>
>
> Thomas:
>> - jio_snprintf() returns -1 on truncation. n+=written may walk
>> backwards. I would probably check for (written >= 0) and also, at the
>> start of the loop, for (n < sizeof(core_path)).
>> - code is used in error reporting. I would be hesitant to create
>> larger buffers on the stack. malloc may be better.
>
> I've fixed them.
>
>> - code does not detect truncation of core_path (unlikely but possible)
>
> Do you mean variable name?
> "core_path" in my patch stores /proc/sys/kernel/core_pattern .
> Length of kernel.core_pattern is defined 128 chars in Linux Kernel
> Documentation.
> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>
> Thus length of core_path (129 chars) is enough.
>
>> - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of
>> fgets may be a tiny bit simpler.
>
> I changed to use fgetc() .
>
>
> Thanks,
>
> Yasumasa
>
>
> (2014/11/26 23:12), Thomas Stüfe wrote:
>> Hi Yasumasa,
>>
>> I am not a Reviewer. Barring the general decision of the real
>> reviewers, here are some thoughts:
>>
>> os_linux.cpp
>>
>> - jio_snprintf() returns -1 on truncation. n+=written may walk
>> backwards. I would probably check for (written >= 0) and also, at the
>> start of the loop, for (n < sizeof(core_path)).
>> - code is used in error reporting. I would be hesitant to create
>> larger buffers on the stack. malloc may be better.
>> - code does not detect truncation of core_path (unlikely but possible)
>>
>> the rest is more matter of taste:
>> - I would prefer sizeof(core_path) over PATH_MAX at all places where
>> you refer to the size of the buffer. So you could make the buffer very
>> small and test e.g. how your code behaves with truncation.
>> - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of
>> fgets may be a tiny bit simpler.
>>
>> Kind Regards, Thomas
>>
>>
>>
>> On Wed, Nov 26, 2014 at 4:54 AM, Yasumasa Suenaga wrote:
>>
>> Hi Staffan,
>>
>> Thank you for reviewing!
>>
>> os_linux.cpp:
>> I want to print coredump location correctly to hs_err. So I want to output
>> whether coredump is processed in other process or is written to file.
>> If os::get_core_path() should be more simply, I will print raw string in
>> core_pattern.
>>
>> os_bsd.cpp:
>> I don't have OS X. So I cannot check it.
>> I am focusing Linux in this enhancement. Could you file it as another
>> enhancement if it need?
>>
>> Thanks,
>>
>> Yasumasa
>>
>> 2014/11/25 18:15 "Staffan Larsen":
>>
>> > src/os/bsd/vm/os_linux.cpp:
>> > I'm inclined to think this is too complicated and hard to test and
>> > maintain (and I see no tests in the webrev). Could we not simplify this to
>> > print a helpful message instead? Something that prints the core_pattern and
>> > perhaps some of the values that could be used for substitution, but does
>> > not do the actual substitution? I think that would go a long way but be a
>> > lot more maintainable.
>> >
>> > src/os/bsd/vm/os_bsd.cpp:
>> > On OS X cores are by default written to /cores/core.. This is
>> > configureable with the kern.corefile sysctl variable, although it is rare
>> > to do so.
>> >
>> > /Staffan
>> >
>> > > On 24 nov 2014, at 14:21, Yasumasa Suenaga wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I've uploaded webrev for this issue about a month ago.
>> > > Could you review it and sponsor it?
>> > >
>> > >
>> > > Thanks,
>> > >
>> > > Yasumasa
>> > >
>> > >
>> > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
>> > >> Hi David,
>> > >>
>> > >> I've uploaded new webrev:
>> > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>> > >>
>> > >>
>> > >>> I wasn't suggesting that you make such a change though because it is large and disruptive.
>> > >>
>> > >>> Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>> > >>
>> > >> I restored check_or_create_dump() to os_posix.cpp .
>> > >> And I changed get_core_path() to create message which represents core dump path
>> > >> (including filename) in each OS.
>> > >>
>> > >>
>> > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything).
>> > >>
>> > >> I implemented all parameters in Linux kernel documentation:
>> > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>> > >>
>> > >> So I think that parameters which are processed are enough.
>> > >>
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Yasumasa
>> > >>
>> > >>
>> > >>
>> > >> (2014/10/15 9:41), David Holmes wrote:
>> > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>> > >>>> Hi David,
>> > >>>>
>> > >>>> Thank you for comments!
>> > >>>> I've uploaded new webrev. Could you review it again?
>> > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>> > >>>>
>> > >>>> I am an author of jdk9. So I cannot commit it.
>> > >>>> Could you be a sponsor for this enhancement?
>> > >>>>
>> > >>>>
>> > >>>>> In which case that should be handled by the linux specific
>> > >>>>> get_core_path() function.
>> > >>>>
>> > >>>> Agree.
>> > >>>> So I implemented it in os_linux.cpp .
>> > >>>> But part of format characters (%P: global pid, %s: signal, %t dump time)
>> > >>>> are not processed
>> > >>>> in this function because I think these parameters are difficult to
>> > >>>> handle in it.
>> > >>>>
>> > >>>> %P: I could not find API for this.
>> > >>>> %s: We have to change arguments of get_core_path() .
>> > >>>> %t: This parameter means timestamp of coredump. It is decided in Kernel.
>> > >>>>
>> > >>>>
>> > >>>>> Fixing this means changing all the os_posix using platforms. But your
>> > >>>>> patch is not about this part. :)
>> > >>>>
>> > >>>> I moved os::check_or_create_dump() to each OS implementations (AIX, BSD,
>> > >>>> Solaris, Linux) .
>> > >>>> So I can write Linux specific code to check_or_create_dump() .
>> > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>> > >>>
>> > >>> I wasn't suggesting that you make such a change though because it is large and disruptive. The simple handling of the | part of core_pattern was basically ok.
>> > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>> > >>>
>> > >>> Sorry this has grown too large for me to deal with right now.
>> > >>>
>> > >>> David
>> > >>> -----
>> > >>>
>> > >>>>
>> > >>>>> Though I'm unclear whether it both invokes the program and creates a
>> > >>>>> core dump file; or just invokes the program?
>> > >>>>
>> > >>>> If '|' is set, Linux kernel will just redirect core image to user process.
>> > >>>> Kernel documentation says as below:
>> > >>>> ------------
>> > >>>> . If the first character of the pattern is a '|', the kernel will treat
>> > >>>> the rest of the pattern as a command to run. The core dump will be
>> > >>>> written to the standard input of that program instead of to a file.
>> > >>>> ------------
>> > >>>>
>> > >>>> And implementation of coredump (do_coredump()) follows to it.
>> > >>>>
>> > >>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
>> > >>>>
>> > >>>>
>> > >>>> In case of ABRT, ABRT dumps core image to default location
>> > >>>> (/core.)
>> > >>>> if user set unlimited to resource limit of core (ulimit -c) .
>> > >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>> > >>>>
>> > >>>>
>> > >>>>> A few style nits - you need spaces around keywords and before braces
>> > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
>> > >>>>> than "treated".
>> > >>>>> And as you don't do anything in the non-redirect case I suggest
>> > >>>>> collapsing this:
>> > >>>>
>> > >>>> I've fixed them.
>> > >>>>
>> > >>>>
>> > >>>> Thanks,
>> > >>>>
>> > >>>> Yasumasa
>> > >>>>
>> > >>>>
>> > >>>> (2014/10/13 9:41), David Holmes wrote:
>> > >>>>> Hi Yasumasa,
>> > >>>>>
>> > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>> > >>>>>> Hi David,
>> > >>>>>>
>> > >>>>>> Sorry for my English.
>> > >>>>>>
>> > >>>>>> I want to propose that JVM should create message according to core
>> > >>>>>> pattern (/proc/sys/kernel/core_pattern) .
>> > >>>>>> So I filed it to JBS and created a patch.
>> > >>>>>
>> > >>>>> So I've had a quick look at this core_pattern business and it seems to
>> > >>>>> me that there are two aspects to this.
>> > >>>>>
>> > >>>>> First, without the leading |, the entry in the core_pattern file is a
>> > >>>>> naming pattern for the core file. In which case that should be handled
>> > >>>>> by the linux specific get_core_path() function. Though that in itself
>> > >>>>> can't fully report the expected name, as part of it is provided in the
>> > >>>>> shared code in os::check_or_create_dump. Fixing this means changing
>> > >>>>> all the os_posix using platforms. But your patch is not about this
>> > >>>>> part. :)
>> > >>>>>
>> > >>>>> Second, with a leading | the core_pattern is actually the name of a
>> > >>>>> program to execute when the program is about to core dump, and that is
>> > >>>>> what you report with your patch. Though I'm unclear whether it both
>> > >>>>> invokes the program and creates a core dump file; or just invokes the
>> > >>>>> program?
>> > >>>>>
>> > >>>>> So with regards to this second part your patch seems functionally ok.
>> > >>>>> I do dislike having a big chunk of linux specific code in this "posix"
>> > >>>>> support file but ...
>> > >>>>>
>> > >>>>> A few style nits - you need spaces around keywords and before braces eg:
>> > >>>>>
>> > >>>>> if(x){
>> > >>>>>
>> > >>>>> should be
>> > >>>>>
>> > >>>>> if (x) {
>> > >>>>>
>> > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
>> > >>>>> than "treated".
>> > >>>>>
>> > >>>>> And as you don't do anything in the non-redirect case I suggest
>> > >>>>> collapsing this:
>> > >>>>>
>> > >>>>> 83 is_redirect = core_pattern[0] == '|';
>> > >>>>> 84 }
>> > >>>>> 85
>> > >>>>> 86 if(is_redirect){
>> > >>>>> 87 jio_snprintf(buffer, bufferSize,
>> > >>>>> 88 "Core dumps may be treated with \"%s\"",
>> > >>>>> &core_pattern[1]);
>> > >>>>> 89 }
>> > >>>>>
>> > >>>>> to just
>> > >>>>>
>> > >>>>> 83 if (core_pattern[0] == '|') { // redirect
>> > >>>>> 84 jio_snprintf(buffer, bufferSize, "Core dumps may be
>> > >>>>> processed with \"%s\"", &core_pattern[1]);
>> > >>>>> 85 }
>> > >>>>> 86 }
>> > >>>>>
>> > >>>>> Comments from other runtime folk appreciated.
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> David
>> > >>>>>
>> > >>>>>> Thanks,
>> > >>>>>>
>> > >>>>>> Yasumasa
>> > >>>>>>
>> > >>>>>> 2014/10/07 15:43 "David Holmes":
>> > >>>>>>
>> > >>>>>> Hi Yasumasa,
>> > >>>>>>
>> > >>>>>> I'm sorry but I don't understand what you are proposing. When you
>> > >>>>>> say
>> > >>>>>> "treat" do you mean "create"? Otherwise what do you mean by
>> > >>>>>> "treated"?
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> David
>> > >>>>>>
>> > >>>>>> On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>> > >>>>>> > I'm in Hackergarten @ JavaOne :-)
>> > >>>>>> >
>> > >>>>>> >
>> > >>>>>> > Hi all,
>> > >>>>>> >
>> > >>>>>> > I would like to enhance the messages in hs_err report.
>> > >>>>>> > Modern Linux kernel can treat core dump with user process (e.g. ABRT)
>> > >>>>>> > However, hs_err report cannot detect it.
>> > >>>>>> >
>> > >>>>>> > I think that hs_err report should output messages as below:
>> > >>>>>> > -------------
>> > >>>>>> > Failed to write core dump. Core dumps may be treated with "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e"
>> > >>>>>> > -------------
>> > >>>>>> >
>> > >>>>>> > I've uploaded webrev of this enhancement.
>> > >>>>>> > Could you review it?
>> > >>>>>> >
>> > >>>>>> > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>> > >>>>>> >
>> > >>>>>> > This patch works fine on Fedora20 x86_64.
>> > >>>>>> >
>> > >>>>>> >
>> > >>>>>> >
>> > >>>>>> > Thanks,
>> > >>>>>> >
>> > >>>>>> > Yasumasa
>> > >>>>>> >
>> > >>>>>>
>> >
>> >
>>
>>

From thomas.stuefe at gmail.com Mon Dec 1 07:18:17 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 1 Dec 2014 08:18:17 +0100
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To: <547C079A.9020604@oracle.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com>
Message-ID:

David, Yasumasa,

> Thanks. The formatting needs fixing up though, the p should line up with
> buffer.
>
> I'm concerned by the changes in os_linux.cpp and os_posix.cpp to use
> os::malloc. If this is being called from a signal handler there's a real
> risk of deadlock if we try to use malloc/free. I know Thomas suggested this
> (and sorry I didn't notice it then) but I don't think it is a good idea for
> the crash handler.
>
>
Correct. Sorry, my fault, I was not clear enough. I meant for you to use the
pure malloc(3), not os::malloc.

Using pure malloc is still a risk if the C-Heap is corrupted, but sometimes
it makes sense for larger buffers. The risk of running into a corrupted
C-Heap is sometimes offset by the risk of running out of stack space.

Kind Regards, Thomas

> Thanks,
> David
>
>

From david.holmes at oracle.com Mon Dec 1 07:26:13 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 01 Dec 2014 17:26:13 +1000
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To:
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com>
Message-ID: <547C1815.6050900@oracle.com>

On 1/12/2014 5:18 PM, Thomas Stüfe wrote:
> David, Yasumasa,
>
> Thanks. The formatting needs fixing up though, the p should line up
> with buffer.
>
> I'm concerned by the changes in os_linux.cpp and os_posix.cpp to use
> os::malloc. If this is being called from a signal handler there's a
> real risk of deadlock if we try to use malloc/free. I know Thomas
> suggested this (and sorry I didn't notice it then) but I don't think
> it is a good idea for the crash handler.
>
>
> Correct. Sorry, my fault, I was not clear enough. I meant for you to use
> the pure malloc(3), not os::malloc.

I was thinking both may be undesirable. I think my conservatism dial is up
higher than yours :) Let's see what Staffan (or others) thinks. Perhaps a
static buffer rather than either malloc or stack based?

Cheers,
David

> Using pure malloc is still a risk if the C-Heap is corrupted, but
> sometimes it makes sense for larger buffers. The risk of running into a
> corrupted C-Heap is sometimes offset by the risk of running out of stack
> space.
>
> Kind Regards, Thomas
>
> Thanks,
> David
>

From thomas.stuefe at gmail.com Mon Dec 1 07:53:59 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 1 Dec 2014 08:53:59 +0100
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To: <547C1815.6050900@oracle.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com>
Message-ID:

On Mon, Dec 1, 2014 at 8:26 AM, David Holmes wrote:

> On 1/12/2014 5:18 PM, Thomas Stüfe wrote:
>
>> David, Yasumasa,
>>
>> Thanks. The formatting needs fixing up though, the p should line up
>> with buffer.
>>
>> I'm concerned by the changes in os_linux.cpp and os_posix.cpp to use
>> os::malloc. If this is being called from a signal handler there's a
>> real risk of deadlock if we try to use malloc/free. I know Thomas
>> suggested this (and sorry I didn't notice it then) but I don't think
>> it is a good idea for the crash handler.
>>
>>
>> Correct. Sorry, my fault, I was not clear enough. I meant for you to use
>> the pure malloc(3), not os::malloc.
>>
>
> I was thinking both may be undesirable. I think my conservatism dial is up
> higher than yours :) Let's see what Staffan (or others) thinks. Perhaps a
> static buffer rather than either malloc or stack based?
>
>
That would work, VmError::report_and_die() is single-threaded. At least the
part which dumps out the core file name.

Another way would be to pre-calc the path at startup, in os::init() maybe.
You run the risk of the pattern changing during the lifetime of the process
though, but I guess that does not happen often.

But let others decide. Too many ways to do this :)

Kind Regards, Thomas

From yasuenag at gmail.com Mon Dec 1 09:45:17 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Mon, 01 Dec 2014 18:45:17 +0900
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To:
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com>
Message-ID: <547C38AD.6050703@gmail.com>

Hi Thomas, David,

Sorry, I didn't think about async signal safety.

> That would work, VmError::report_and_die() is singlethreaded. At least the part which dumps out the core file name.

I think that the signal handler (in this case) may run concurrently with
other threads.
If another thread calls malloc(3) in JNI, C Heap corruption may occur.

I want to rewrite the patch as below:

  - Use async-signal-safe functions.
      fopen -> open, fgets -> read, etc.

  - Use O_BUFLEN for buffer size.
      O_BUFLEN is defined to 2000 in ostream.hpp .
      This macro is used in various points. VMError::coredump_message is
      also defined with this value.

What do you think about this plan?


Thanks,

Yasumasa


(2014/12/01 16:53), Thomas Stüfe wrote:
>
>
> On Mon, Dec 1, 2014 at 8:26 AM, David Holmes wrote:
>
> On 1/12/2014 5:18 PM, Thomas Stüfe wrote:
>
> David, Yasumasa,
>
> Thanks. The formatting needs fixing up though, the p should line up
> with buffer.
>
> I'm concerned by the changes in os_linux.cpp and os_posix.cpp to use
> os::malloc. If this is being called from a signal handler there's a
> real risk of deadlock if we try to use malloc/free. I know Thomas
> suggested this (and sorry I didn't notice it then) but I don't think
> it is a good idea for the crash handler.
>
>
> Correct. Sorry, my fault, I was not clear enough. I meant for you to use
> the pure malloc(3), not os::malloc.
>
>
> I was thinking both may be undesirable. I think my conservatism dial is up higher than yours :) Let's see what Staffan (or others) thinks. Perhaps a static buffer rather than either malloc or stack based?
>
>
> That would work, VmError::report_and_die() is singlethreaded. At least the part which dumps out the core file name.
>
> Another way would be to pre-calc the path at startup, in os::init() maybe.
> You run the risk of the pattern changing during the lifetime of the process though, but I guess that does not happen often.
>
> But lets others decide. Too many ways to do this :)
>
> Kind Regards, Thomas
>

From thomas.stuefe at gmail.com Mon Dec 1 12:57:44 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 1 Dec 2014 13:57:44 +0100
Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern.
In-Reply-To: <547C38AD.6050703@gmail.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

 <5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com>
Message-ID:

Hi Yasumasa,

On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga wrote:

> Hi Thomas, David,
>
> Sorry, I didn't think about async signal safety.
>
>> That would work, VmError::report_and_die() is singlethreaded. At least
>> the part which dumps out the core file name.
>>
>
> I think that signal handler (in this case) may run concurrency with
> other thread.
> If another thread calls malloc(3) in JNI, C Heap corruption may occur.
>
>
No, malloc(3) should be thread safe on our platforms. But this was not the
point. If I understood David right, he suggested using a static buffer
inside get_core_path() for assembling the core path, which would make
get_core_path() thread-unsafe (multiple threads calling it would get garbled
results). But as get_core_path() is only called from within
VmError::report_and_die() and that section is only ever executed by one
thread, David's suggestion would still work.

> I want to rewrite a patch as below:
>
> - Use async signal safety functions.
> fopen -> open, fgets -> read, etc.
>
> - Use O_BUFLEN for buffer size.
> O_BUFLEN is defined to 2000 in ostream.hpp .
> This macro is used in various points. VMError::coredump_message is
> also defined with this value.
>
>
I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max.
length of temporary buffers to assemble an output line. And then it spread a
bit. But your intent is to hold a path and using PATH_MAX clearly documents
this.

And, to really nitpick, right now you do not handle ERANGE with
get_current_directory() (if the provided buffer is too small), which is
probably fine because it is improbable that a path is larger than PATH_MAX.
But if you change the size of the buffer to something which may be smaller
than PATH_MAX (O_BUFLEN), get_current_directory() may fail.

I like your patch, I think it could be a nice time saver when core_pattern
is something unusual. But I also see Staffan's point of
too-much-complexity. So I will keep out of this discussion until the real
Reviewers have decided what to do :)

Kind Regards, Thomas

From thomas.stuefe at gmail.com Mon Dec 1 13:30:01 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 1 Dec 2014 14:30:01 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process
In-Reply-To: <54770536.5090101@redhat.com>
References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com>
Message-ID:

Hi all,

let's not get this patch bogged down on ARM opcode discussions. For me, it
is just a question of style and which one would be most acceptable to the
OpenJDK. As I see it, here are my options:

1 leave the code as it is and whoever does ARM porting at Oracle will
  provide the SIGILL opcodes inside debug.cpp
2 like (1), but provide a fallback for CPUs where we do not know the SIGILL
  opcodes right now, by doing a raise(SIGILL). This would work but make the
  test a tiny bit less valuable on those platforms.
3 Move the CPU-dependent parts (the big #ifdef) away from debug.cpp into
  debug_.cpp. Would mean a bit of code duplication because for 3 out of 5
  cpus the SIGILL-generating opcode is 0. This basically would be the same
  as my second webrev (
  http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/)
4 like (3), but with additional introduction of a debug_.hpp, and adding a
  "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a common fallback
  for cpus where 0 generates SIGILL.

I am leaning toward (2) or (3) but I am okay with any of the four.
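
[Illustration, not part of the original message.] A rough sketch of what the raise(SIGILL) fallback in option 2 amounts to, assuming a POSIX system. All names here (record_signal, trigger_sigill_fallback, sigill_fallback_delivers_signal) are hypothetical and do not come from debug.cpp or the webrev; the handler only records the signal number so the delivery can be observed.

```cpp
#include <signal.h>
#include <string.h>

// Last signal number seen by the handler below.
static volatile sig_atomic_t g_last_signal = 0;

// Minimal handler: only stores the signal number.
static void record_signal(int sig) {
  g_last_signal = sig;
}

// Hypothetical fallback for CPUs without a known illegal opcode: ask the
// C library to deliver SIGILL to the calling thread instead of executing
// an instruction that faults.
static void trigger_sigill_fallback() {
  raise(SIGILL);
}

// Installs the recording handler, triggers the fallback, restores the
// previous disposition, and reports whether SIGILL was delivered.
static bool sigill_fallback_delivers_signal() {
  struct sigaction sa;
  struct sigaction old_sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = record_signal;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGILL, &sa, &old_sa) != 0) {
    return false;
  }
  g_last_signal = 0;
  trigger_sigill_fallback();
  sigaction(SIGILL, &old_sa, NULL);
  return g_last_signal == SIGILL;
}
```

A signal sent with raise() does not come from a faulting instruction, which is presumably the reason the option is described above as making the test slightly less valuable on those platforms.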

Kind Regards, Thomas Stuefe

On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley wrote:

> On 11/27/2014 11:00 AM, David Holmes wrote:
> > On 27/11/2014 8:55 PM, Andrew Haley wrote:
> >> On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
> >>> Hi Andrew, thank you! Does endianess matter ?
> >>
> >> Yes. I'd do it symbolically rather than mess with endian defines:
> >>
> >> #ifdef AARCH64
> >> unsigned insn;
> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
> >> #endif
> >
> > Does that work for ARMv7?
>
> Sorry, I don't know what a good choice there would be. And I must
> warn you: DCPS1 isn't necessarily guaranteed to do this forever, but
> it works on the kernels I've tried.
>
> Andrew.
>
>
>

From yumin.qi at oracle.com Mon Dec 1 15:23:45 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Mon, 01 Dec 2014 07:23:45 -0800
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <547676FE.3080204@oracle.com>
References: <53DAC336.6050302@oracle.com> <54752EAF.4020404@oracle.com> <547532C0.4080500@oracle.com> <54760F90.6040100@oracle.com> <547676FE.3080204@oracle.com>
Message-ID: <547C8801.60608@oracle.com>

Too late! It was pushed before I saw this email.
I will take care of this (if it happens or not), file a bug to fix it.

Thanks
Yumin

On 11/26/2014 4:57 PM, David Holmes wrote:
> On 27/11/2014 3:36 AM, Yumin Qi wrote:
>> Thanks for the review. Yes, the test will build testlibrary with
>>
>> @library /testlibrary /testlibrary/whitebox
>
> No that won't necessarily build the testlibrary.
>
> From other email:
>
> >> I'm having a problem running a test in 8u25 that uses the testlibrary
> >> ProcessTools API. I get a ClassNotFoundException. Looking in the
> >> classes directory I only see two testlibrary classes - which map to
> >> two specific testlibrary classes that one test has on its @build
> >> line.
> >> The test in question simply has:
> >>
> >> @library /testlibrary
> >>
> >> Does it need an explicit:
> >>
> >> @build com.oracle.java.testlibrary.*
> >
> > Yes. It turns out that JTReg might not compile the library classes on
> > demand (but it does sometimes). So it is better to specify the
> > required build manually.
> >
> > -JB-
>
> David
> -----
>
>>
>> Thanks
>> Yumin
>>
>>
>>
>> On 11/25/14, 5:54 PM, David Holmes wrote:
>>> Hi Yumin,
>>>
>>> On 26/11/2014 11:36 AM, Yumin Qi wrote:
>>>> Please review
>>>>
>>>> bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
>>>> webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/
>>>
>>> The test also needs to ensure the testlibrary gets built.
>>>
>>> Otherwise seems okay.
>>>
>>> Thanks,
>>> David
>>>
>>>> Now the API usage is in internal test case, see separate email for the
>>>> webrev.
>>>>
>>>> It is same as previous version (webrev00).
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 7/31/14, 3:29 PM, Yumin Qi wrote:
>>>>> Please review:
>>>>>
>>>>> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>>>>>
>>>>> Summary: Currently there is no java API to get underlying OS native VM
>>>>> page size unless using Unsafe which is not recommended. The new added
>>>>> method to WhiteBox can read this property and used in test.
>>>>>
>>>>>
>>>>> Tests: JPRT and jtreg.
>>>>>
>>>>> Thanks
>>>>> Yumin

From michail.chernov at oracle.com Mon Dec 1 16:35:07 2014
From: michail.chernov at oracle.com (Michail Chernov)
Date: Mon, 01 Dec 2014 19:35:07 +0300
Subject: RFR: 8064909: FragmentMetaspace.java got OutOfMemoryError
In-Reply-To: <54772711.6000003@oracle.com>
References: <5475D74A.2060907@oracle.com> <54762451.3070802@oracle.com> <54763D54.3070704@oracle.com> <54765B70.10509@oracle.com> <54772711.6000003@oracle.com>
Message-ID: <547C98BB.206@oracle.com>

Hi,

Can you please look at this review?

Thanks,
Michail

On 27.11.2014 16:28, Michail Chernov wrote:
> Hi,
>
> CC'ed hotspot-runtime-dev.
>
> Here is not test failure - test works as expected. OOME is occurred in
> compiler instance.
>
>     private JavaCompiler javac;
>     ...
>     javac = ToolProvider.getSystemJavaCompiler();
>     ...
>     int exitcode = javac.run(null, null, null,
>             file.getCanonicalPath());
>     if (exitcode != 0) {
>         throw new RuntimeException("javac failure when compiling: " +
>                 file.getCanonicalPath());
>
> Here is 2 ways - rewrite getGeneratedClass
> (runtime/testlibrary/GeneratedClassLoader.java) to allow them to throw
> not only RuntimeException, or to catch RuntimeException and check
> exception message comparing with "javac failure when compiling:". Both
> ways seem to me are not as clear as expected for this simple test.
> More - javac does not throw anything - it just returns exitcode
> (non-zero) and writes its messages to System.err.
>
> Also I can add comment to code like "OOME with message
> "java.lang.OutOfMemoryError: Java heap space" doesn't mean that
> something wrong with metaspace - need just to increase -Xmx".
>
> Thanks,
> Michail
>
> On 27.11.2014 2:00, Jon Masamitsu wrote:
>> Dima,
>>
>> If this test fails with an OOME in the future, I would like it to be
>> obvious that the failure is not that an OOME occurred. I cannot
>> tell that from looking at the test. Can the test be changed so
>> I don't have to spend time figuring out that the OOME is not
>> a failure mode of the test?
>>
>> Jon
>>
>>
>> On 11/26/2014 12:51 PM, Dmitry Fazunenko wrote:
>>> Hi Jon,
>>>
>>> The original version of test worked for 80 seconds trying to perform
>>> as many iterations as possible. The number of iterations performed
>>> depended on how fast is the machine. With each next iteration the
>>> size of generated and loaded classes is growing, so on fast enough
>>> machines 80 seconds is enough to run out of heap while generating a
>>> class.
>>>
>>> The fix not only sets the heap, but limits iterations. 300m heap is
>>> enough for 200 iterations.
>>>
>>> Your approach, with catching OOME(heap) and passing will also work,
>>> but it will reduce the test readability (and potentially could bring
>>> more problems).
>>>
>>> An alternative approach would be to limit metaspace and heap
>>> accordingly and load classes until we don't run out metaspace... But
>>> this might take awhile.
>>>
>>> So, I hope that Michael's fix is good.
>>>
>>> Thanks for looking and expressing comments.
>>> Dima
>>>
>>>
>>>
>>>
>>> On 26.11.2014 22:04, Jon Masamitsu wrote:
>>>> Michail,
>>>>
>>>> Your change makes this test pass but it seems like at
>>>> some future date 300m might not be big enough
>>>> (for whatever reason). Could the test be make to
>>>> caught an OOME, print out a message saying that
>>>> an OOME doesn't mean the test failed but that
>>>> the test needs a larger heap? Then pass an
>>>> exception up (maybe some type of Runtime
>>>> exception - sorry if that is vague but I don't
>>>> what type of exception would make sense). That
>>>> would mean we wouldn't have to spend time
>>>> diagnosing what the OOME means again.
>>>>
>>>> Jon
>>>>
>>>> On 11/26/2014 5:36 AM, Michail Chernov wrote:
>>>>> Hi,
>>>>>
>>>>> Please review this simple fix for nightly test failure:
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~eistepan/~mchernov/8064909/webrev.00/
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8064909
>>>>>
>>>>> Problem: test fails because of OOME (not enough heap size).
>>>>> Solution: heap size were increased.
>>>>> >>>>> Testing: >>>>> jtreg >>>>> >>>>> Thanks, >>>>> Michail >>>> >>> >> >> >> > From jon.masamitsu at oracle.com Mon Dec 1 19:52:37 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 01 Dec 2014 11:52:37 -0800 Subject: RFR: 8064909: FragmentMetaspace.java got OutOfMemoryError In-Reply-To: <54772711.6000003@oracle.com> References: <5475D74A.2060907@oracle.com> <54762451.3070802@oracle.com> <54763D54.3070704@oracle.com> <54765B70.10509@oracle.com> <54772711.6000003@oracle.com> Message-ID: <547CC705.5070004@oracle.com> On 11/27/2014 05:28 AM, Michail Chernov wrote: > Hi, > > CC'ed hotspot-runtime-dev. > > Here is not test failure - test works as expected. OOME is occurred in > compiler instance. > > private JavaCompiler javac; > ... > javac = ToolProvider.getSystemJavaCompiler(); > ... > int exitcode = javac.run(null, null, null, > file.getCanonicalPath()); > if (exitcode != 0) { > throw new RuntimeException("javac failure when compiling: " + > file.getCanonicalPath()); > > Here is 2 ways - rewrite getGeneratedClass > (runtime/testlibrary/GeneratedClassLoader.java) to allow them to throw > not only RuntimeException, Seems like this would be more precise with regard to recognizing the cause of the failure. Are there too many places which would have to change to catch the OOME. > or to catch RuntimeException and check exception message comparing > with "javac failure when compiling:". Both ways seem to me are not as > clear as expected for this simple test. More - javac does not throw > anything - it just returns exitcode (non-zero) and writes its messages > to System.err. > > Also I can add comment to code like "OOME with message > "java.lang.OutOfMemoryError: Java heap space" doesn't mean that > something wrong with metaspace - need just to increase -Xmx". That would be enough for me if you don't think throwing the OOME from GeneratedClassLoader() adds much value. 
Jon > > Thanks, > Michail > > On 27.11.2014 2:00, Jon Masamitsu wrote: >> Dima, >> >> If this test fails with an OOME in the future, I would like it to be >> obvious that the failure is not that an OOME occurred. I cannot >> tell that from looking at the test. Can the test be changed so >> I don't have to spend time figuring out that the OOME is not >> a failure mode of the test? >> >> Jon >> >> >> On 11/26/2014 12:51 PM, Dmitry Fazunenko wrote: >>> Hi Jon, >>> >>> The original version of test worked for 80 seconds trying to perform >>> as many iterations as possible. The number of iterations performed >>> depended on how fast is the machine. With each next iteration the >>> size of generated and loaded classes is growing, so on fast enough >>> machines 80 seconds is enough to run out of heap while generating a >>> class. >>> >>> The fix not only sets the heap, but limits iterations. 300m heap is >>> enough for 200 iterations. >>> >>> Your approach, with catching OOME(heap) and passing will also work, >>> but it will reduce the test readability (and potentially could bring >>> more problems). >>> >>> An alternative approach would be to limit metaspace and heap >>> accordingly and load classes until we don't run out metaspace... But >>> this might take awhile. >>> >>> So, I hope that Michael's fix is good. >>> >>> Thanks for looking and expressing comments. >>> Dima >>> >>> >>> >>> >>> On 26.11.2014 22:04, Jon Masamitsu wrote: >>>> Michail, >>>> >>>> Your change makes this test pass but it seems like at >>>> some future date 300m might not be big enough >>>> (for whatever reason). Could the test be make to >>>> caught an OOME, print out a message saying that >>>> an OOME doesn't mean the test failed but that >>>> the test needs a larger heap? Then pass an >>>> exception up (maybe some type of Runtime >>>> exception - sorry if that is vague but I don't >>>> what type of exception would make sense). 
That >>>> would mean we wouldn't have to spend time >>>> diagnosing what the OOME means again. >>>> >>>> Jon >>>> >>>> On 11/26/2014 5:36 AM, Michail Chernov wrote: >>>>> Hi, >>>>> >>>>> Please review this simple fix for nightly test failure: >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~eistepan/~mchernov/8064909/webrev.00/ >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8064909 >>>>> >>>>> Problem: test fails because of OOME (not enough heap size). >>>>> Solution: heap size were increased. >>>>> >>>>> Testing: >>>>> jtreg >>>>> >>>>> Thanks, >>>>> Michail >>>> >>> >> >> >> > From calvin.cheung at oracle.com Mon Dec 1 20:10:24 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Mon, 01 Dec 2014 12:10:24 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified Message-ID: <547CCB30.6010806@oracle.com> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, and SharedReadWriteSize. For the SharedMiscDataSize, it is based on MetaspaceShared::generate_vtable_methods(). Similar to what was done for the SharedMiscCodeSize. For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if they are at least the default size. I think it's reasonable to enforce the ro and rw sizes to be at least the default size. A default dump of CDS archive requires >8M of ro space and >11M of rw space. webrev: http://cr.openjdk.java.net/~ccheung/8065050/webrev/ tests: ran the testcase via jtreg on linux_x64 and windows_x64 JPRT thanks, Calvin From martinrb at google.com Mon Dec 1 22:31:04 2014 From: martinrb at google.com (Martin Buchholz) Date: Mon, 1 Dec 2014 14:31:04 -0800 Subject: Only C2 optimizes "release stores"? Message-ID: Hi fence team, I'm foolishly reading hotspot sources and see // The non-intrinsified versions of setOrdered just use setVolatile But why? Aside from conservative fear (never ever weaken fences!) 
why not do the obvious

diff --git a/src/share/vm/prims/unsafe.cpp b/src/share/vm/prims/unsafe.cpp
--- a/src/share/vm/prims/unsafe.cpp
+++ b/src/share/vm/prims/unsafe.cpp
@@ -171,8 +171,12 @@
 #define SET_FIELD_VOLATILE(obj, offset, type_name, x) \
   oop p = JNIHandles::resolve(obj); \
   OrderAccess::release_store_fence((volatile type_name*)index_oop_from_field_offset_long(p, offset), x);

+#define SET_FIELD_RELEASE(obj, offset, type_name, x) \
+  oop p = JNIHandles::resolve(obj); \
+  OrderAccess::release_store((volatile type_name*)index_oop_from_field_offset_long(p, offset), x);
+
 // Macros for oops that check UseCompressedOops

 #define GET_OOP_FIELD(obj, offset, v) \
   oop p = JNIHandles::resolve(obj); \
@@ -372,9 +376,9 @@
 // The non-intrinsified versions of setOrdered just use setVolatile

 UNSAFE_ENTRY(void, Unsafe_SetOrderedInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint x))
   UnsafeWrapper("Unsafe_SetOrderedInt");
-  SET_FIELD_VOLATILE(obj, offset, jint, x);
+  SET_FIELD_RELEASE(obj, offset, jint, x);
 UNSAFE_END

 UNSAFE_ENTRY(void, Unsafe_SetOrderedObject(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jobject x_h))
   UnsafeWrapper("Unsafe_SetOrderedObject");
@@ -386,15 +390,14 @@
     oop_store((narrowOop*)addr, x);
   } else {
     oop_store((oop*)addr, x);
   }
-  OrderAccess::fence();
 UNSAFE_END

 UNSAFE_ENTRY(void, Unsafe_SetOrderedLong(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jlong x))
   UnsafeWrapper("Unsafe_SetOrderedLong");
 #ifdef SUPPORTS_NATIVE_CX8
-  SET_FIELD_VOLATILE(obj, offset, jlong, x);
+  SET_FIELD_RELEASE(obj, offset, jlong, x);
 #else
   // Keep old code for platforms which may not have atomic long (8 bytes) instructions
   {
     if (VM_Version::supports_cx8()) {

From mikhailo.seledtsov at oracle.com Tue Dec 2 01:59:57 2014 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 01 Dec 2014 17:59:57 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To:
<547CCB30.6010806@oracle.com> References: <547CCB30.6010806@oracle.com> Message-ID: <547D1D1D.70803@oracle.com> Test changes look good. Thank you, Misha On 12/1/2014 12:10 PM, Calvin Cheung wrote: > JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 > > Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, and > SharedReadWriteSize. > > For the SharedMiscDataSize, it is based on > MetaspaceShared::generate_vtable_methods(). Similar to what was done > for the SharedMiscCodeSize. > > For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if > they are at least the default size. > I think it's reasonable to enforce the ro and rw sizes to be at least > the default size. A default dump of CDS archive requires >8M of ro > space and >11M of rw space. > > webrev: > http://cr.openjdk.java.net/~ccheung/8065050/webrev/ > > tests: > ran the testcase via jtreg on linux_x64 and windows_x64 > JPRT > > thanks, > Calvin From chris.plummer at oracle.com Tue Dec 2 02:39:22 2014 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 01 Dec 2014 18:39:22 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <546CBCAB.7040101@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> Message-ID: <547D265A.20005@oracle.com> Sorry about the long delay in getting back to this. I ran into two separate JPRT issues that were preventing me from testing these changes, plus I was on vacation last week. Here's an updated webrev. I'm not sure where we left things, so I'll just say what's changed since the original version: 1. Rewrote the test to be in Java instead of a shell script. 2. Moved the test from hotspot/test/runtime/memory to jdk/test/tools/launcher 3. 
Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to override the default 32k minimum value. https://bugs.openjdk.java.net/browse/JDK-6762191 http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ thanks, Chris On 11/19/14 7:52 AM, Chris Plummer wrote: > On 11/19/14 2:12 AM, David Holmes wrote: >> On 19/11/2014 6:49 PM, Chris Plummer wrote: >>> I've update the webrev to add STACK_SIZE_MINIMUM in place of the 32k >>> references, and also moved the test from hotspot/test/runtime to >>> jdk/test/tools/launcher as David requested. That required some >>> adjustments to the test script, since test_env.sh does not exist in >>> jdk/test, so I had to pull in the bits I needed into the script. >> >> Is there a reason this needs a shell script instead of using the >> testlibrary tools to launch the VM and check the output? > Not that I'm aware of. I guess I just really didn't look at what it > would take to make it all in java. I'll have a look at java examples > and convert it. > > Chris >> >> Sorry that should have been mentioned much earlier. >> >> David >> >> >>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/ >>> >>> I still need to rerun through JPRT. I'll do so once there are no more >>> suggested changes. >>> >>> thanks, >>> >>> Chris >>> >>> On 11/18/14 2:08 PM, Chris Plummer wrote: >>>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in >>>> java.c. >>>> >>>> Chris >>>> >>>> On 11/12/14 6:43 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Sorry for the delay. >>>>> >>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote: >>>>>> Hi, >>>>>> >>>>>> I'm still looking for reviewers. >>>>> >>>>> As the change is to the launcher it needs to be reviewed by the >>>>> launcher owner - which I think is serviceability (though also cc'd >>>>> Kumar :) ). >>>>> >>>>> Launcher change, and your rationale, seems okay to me. I'd probably >>>>> put the test in to jdk/test/tools/launcher/ though. 
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote: >>>>>>> This is an initial review for 6762191. I'm guessing there will be >>>>>>> recommendations to fix in a different way, but thought this >>>>>>> would be a >>>>>>> good time to start the discussion. >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/ >>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/ >>>>>>> >>>>>>> The bug is that if the -Xss size is set to something very small >>>>>>> (like >>>>>>> 16k), on linux there will be a crash due to overwriting the end >>>>>>> of the >>>>>>> stack. This happens before hotspot can compute its stack needs and >>>>>>> verify that the stack is big enough. >>>>>>> >>>>>>> It didn't seem viable to move the hotspot stack size check >>>>>>> earlier. It >>>>>>> depends on too much other work done before that point, and the >>>>>>> changes >>>>>>> would have been disruptive. The stack size check is currently >>>>>>> done in >>>>>>> os::init_2(). >>>>>>> >>>>>>> What is needed is a check before the thread is created. That way we >>>>>>> can create a thread with a big enough stack to handle all needs >>>>>>> up to >>>>>>> the point of the check in os::init_2(). This initial check does not >>>>>>> need to be the final check. It just needs to confirm that we have >>>>>>> enough stack to get us to the check in os::init_2(). >>>>>>> >>>>>>> I decided to check in java.c if the -Xss size is too small, and >>>>>>> set it >>>>>>> to a larger size if it is. I hard coded this size to 32k (I'll >>>>>>> explain >>>>>>> why 32k later). I suspect this is the part that will result in some >>>>>>> debate. If you have better suggestions let me know. If it does stay >>>>>>> here, then probably the 32k needs to be a #define, and maybe >>>>>>> even an >>>>>>> OS porting interface, but I'm not sure where to put it. 
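The launcher-side clamp described in the quoted message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: the helper name `adjust_requested_stack_size` is hypothetical, and treating 0 as "no -Xss given, keep the platform default" is an assumption. The 32k value and the overridable `STACK_SIZE_MINIMUM` name come from the thread itself.

```cpp
#include <cstdint>

// Overridable at build time (for example from a makefile), as the later
// revision of the patch is described as allowing. 32k is the value from the thread.
#ifndef STACK_SIZE_MINIMUM
#define STACK_SIZE_MINIMUM (32 * 1024)
#endif

// Hypothetical helper: raise a too-small requested stack size to the minimum
// that is assumed to be enough to reach the real check in os::init_2().
// Assumption: 0 means "not specified" and is passed through unchanged.
inline std::int64_t adjust_requested_stack_size(std::int64_t requested) {
    if (requested > 0 && requested < STACK_SIZE_MINIMUM) {
        return STACK_SIZE_MINIMUM;
    }
    return requested;
}
```

The point of clamping instead of rejecting is that the final, platform-specific validation still happens later in the VM; this early step only needs to keep the primordial thread alive until then.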
>>>>>>> >>>>>>> The reason I chose 32k is because this is big enough for all >>>>>>> platforms >>>>>>> to get to the stack size check in os::init_2(). It is also smaller >>>>>>> than the actual minimum stack size allowed on any platform. 32-bit >>>>>>> windows has the smallest requirement at 64k. I add some printfs to >>>>>>> print the minimum stack requirement, and then ran a simple JTReg >>>>>>> test >>>>>>> with every JPRT supported platform to get the results. >>>>>>> >>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k, >>>>>>> -Xss32k, and -XXss, where is the size from the >>>>>>> error message produced by the JVM, such as in the following: >>>>>>> >>>>>>> $ java -Xss32k -version >>>>>>> The stack size specified is too small, Specify at least 100k >>>>>>> Error: Could not create the Java Virtual Machine. >>>>>>> Error: A fatal exception has occurred. Program will exit. >>>>>>> >>>>>>> I ran this test through JPRT on all platforms, and they all pass. >>>>>>> >>>>>>> One thing to point out is that Windows behaves a bit different than >>>>>>> the other platforms. It always rounds the stack size up to a >>>>>>> multiple >>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On >>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so >>>>>>> there >>>>>>> is no error produced in this case. However, on 32-bit Windows >>>>>>> with C2, >>>>>>> 68k is the minimum, so an error is produced since the stack will >>>>>>> only >>>>>>> be 64k. There is no bug here. It's just a bit confusing. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>> >>>> >>> > From david.holmes at oracle.com Tue Dec 2 09:40:45 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 02 Dec 2014 19:40:45 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. 
In-Reply-To: References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> Message-ID: <547D891D.7010809@oracle.com> On 1/12/2014 10:57 PM, Thomas St?fe wrote: > Hi Yasumasa, > > On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga > wrote: > > Hi Thomas, David, > > Sorry, I didn't think about async signal safety. > > That would work, VmError::report_and_die() is singlethreaded. At > least the part which dumps out the core file name. > > > I think that signal handler (in this case) may run concurrency with > other thread. > If another thread calls malloc(3) in JNI, C Heap corruption may occur. > > > No, malloc(3) should be thread safe on our platforms. But this was not > the point. If I understood David right, he suggested using a static > buffer inside get_core_path() for assembling the core path, which would > make get_core_path() thread-unsafe (multiple threads calling it would > get garbled results). But as get_core_path() is only called from within > VmError::report_and_die() and that section is only ever executed by one > thread, Davids suggestion would still work. Yes that is what I was suggesting. > I want to rewrite a patch as below: > > - Use async signal safety functions. > fopen -> open, fgets -> read, etc. This is commendable if it is practical, but error reporting already does many, many things that are not async-signal safe, so there is no need to go to extreme measures here. > - Use O_BUFLEN for buffer size. > O_BUFLEN is defined to 2000 in ostream.hpp . > This macro is used in various points. VMError::coredump_message is > also defined with this value. > > > I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. > length of temporary buffers to assemble an output line. And then it > spread a bit. But your intend is to hold a path and using PATH_MAX > clearly documents this. 
> And, to really nitpick, right now you do not handle ERANGE with > get_current_path() (if the provided buffer is too small), which is > probably fine because it is improbable that a path is larger than > PATH_MAX. But if you change the size of the buffer to something which > may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may fail. > > I like your patch, I think it could be a nice time safer when > core_pattern is something unusual. But I also see Staffans point of > too-much-complexity. So I will keep out of this discussion until the > real Reviewers decided what to do :) I have a hard time evaluating the merits of the patch as I don't work in an environment where this extra info is needed. But I take it on good faith that it is useful for the context Yasumasa describes. David > Kind Regards, Thomas > From david.holmes at oracle.com Tue Dec 2 09:50:34 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 02 Dec 2014 19:50:34 +1000 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> Message-ID: <547D8B6A.6040002@oracle.com> On 1/12/2014 11:30 PM, Thomas St?fe wrote: > Hi all, > > lets not get this patch bogged down on ARM opcode discussions. > > For me, it is just a question of style and which one would be most > acceptable to the OpenJDK. > > As I see it, here are my options: > > 1 leave the code as it is and whoever does ARM porting at Oracle will > provide the SIGILL opcodes inside debug.cpp > 2 like (1), but provide a fallback for CPUs where we do not know the > SIGILL opcodes right now, by doing a raise(SIGILL). 
This would work but
> make the test a tiny bit less valuable on those platforms.
>
> 3 Move the CPU-dependent parts (the big #ifdef) away from debug.cpp
> into debug_.cpp. Would mean a bit of code duplication because for 3
> out of 5 CPUs the SIGILL-generating opcode is 0. This basically would be
> the same as my second webrev
> (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/)
> 4 like (3), but with additional introduction of a debug_.hpp, and
> adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a
> common fallback for CPUs where 0 generates SIGILL.
>
> I am leaning toward (2) or (3) but I am okay with any of the four.

I'm really undecided here. #1 makes me cringe because of the cpu ifdefs in
shared code (including those for non-OpenJDK platforms). #3 and #4 make me
cringe because it is a lot of overhead to introduce the debug_.hpp files
on all platforms.

That leaves #2, though I'm unclear how we will identify the platforms that
don't have defined bad opcodes. If that's still just a variant of the
ifdefs in #1 then I'm still cringing. :)

Would appreciate someone else from runtime jumping in with an opinion here :)

David

(PS. I'm on vacation tomorrow so apologies for delayed responses.)

> Kind Regards,
>
> Thomas Stuefe
>
>
> On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley
> wrote:
>
> On 11/27/2014 11:00 AM, David Holmes wrote:
> > On 27/11/2014 8:55 PM, Andrew Haley wrote:
> >> On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
> >>> Hi Andrew, thank you! Does endianness matter?
> >>
> >> Yes. I'd do it symbolically rather than mess with endian defines:
> >>
> >> #ifdef AARCH64
> >> unsigned insn;
> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
> >> #endif
> >
> > Does that work for ARMv7?
>
> Sorry, I don't know what a good choice there would be. And I must
> warn you: DCPS1 isn't necessarily guaranteed to do this forever, but
> it works on the kernels I've tried.
>
> Andrew.
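Option (2) from the list quoted above, falling back to raise(SIGILL) on CPUs with no known illegal opcode, might look something like the sketch below. All names here are hypothetical and nothing is taken from the webrev; it only shows that a raised SIGILL reaches an installed handler. As the thread notes, this does not exercise the hardware trap path that a real bad instruction would.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical sketch; names are illustrative only.
static volatile std::sig_atomic_t g_sigill_seen = 0;

// Minimal handler: just record that SIGILL was delivered.
extern "C" void on_sigill(int) { g_sigill_seen = 1; }

// Fallback for CPUs without a known SIGILL-generating opcode:
// deliver the signal in software instead of executing a bad instruction.
inline bool trigger_sigill_fallback() {
    return std::raise(SIGILL) == 0;
}

// Installs the handler, triggers the fallback, restores the old handler,
// and reports whether the handler actually ran.
inline bool sigill_fallback_reaches_handler() {
    g_sigill_seen = 0;
    auto previous = std::signal(SIGILL, on_sigill);
    assert(previous != SIG_ERR);
    bool raised = trigger_sigill_fallback();
    std::signal(SIGILL, previous);
    return raised && g_sigill_seen == 1;
}
```

A platform-specific variant would replace the body of `trigger_sigill_fallback()` with the known bad opcode where one exists, keeping the raise() path only as the default.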
> > > From yasuenag at gmail.com Tue Dec 2 14:30:19 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 02 Dec 2014 23:30:19 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <547D891D.7010809@oracle.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> Message-ID: <547DCCFB.3050209@gmail.com> Hi David, Thomas, I've uploaded new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.04/ >> I want to rewrite a patch as below: >> >> - Use async signal safety functions. >> fopen -> open, fgets -> read, etc. > > This is commendable if it is practical, but error reporting already does many, many things that are not async-signal safe, so there is no need to go to extreme measures here. I've used async-signal safe functions as possible. >> - Use O_BUFLEN for buffer size. >> O_BUFLEN is defined to 2000 in ostream.hpp . >> This macro is used in various points. VMError::coredump_message is >> also defined with this value. >> >> >> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. >> length of temporary buffers to assemble an output line. And then it >> spread a bit. But your intend is to hold a path and using PATH_MAX >> clearly documents this. I've used PATH_MAX again. >> And, to really nitpick, right now you do not handle ERANGE with >> get_current_path() (if the provided buffer is too small), which is >> probably fine because it is improbable that a path is larger than >> PATH_MAX. But if you change the size of the buffer to something which >> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may fail. If get_current_path() call is failed in get_core_path(), get_core_path() returns immediately with 0. Caller (check_or_create_dump()) handles this result as illegal state. get_current_path() calls getcwd() only and redirects result to caller. So result of this function is NULL, we can judge getcwd() was finished with error. I think it is enough. >> I like your patch, I think it could be a nice time safer when >> core_pattern is something unusual. But I also see Staffans point of >> too-much-complexity. 
So I will keep out of this discussion until the >> real Reviewers decided what to do :) > > I have a hard time evaluating the merits of the patch as I don't work in an environment where this extra info is needed. But I take it on good faith that it is useful for the context Yasumasa describes. I want to suggest to Java user where coredump is. Modern Linux distribution(s) contains ABRT. OS can dump corefile automatically despite a lack of setting coredump resource by user. I'm support engineer of Java. My customer says "coredump does not found.", but coredump is saved by ABRT. Thus I want them to know "coredump is available" through stderr and hs_err immediately. I belive it is first step of troubleshoot. Thanks, Yasumasa (2014/12/02 18:40), David Holmes wrote: > On 1/12/2014 10:57 PM, Thomas St?fe wrote: >> Hi Yasumasa, >> >> On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga > > wrote: >> >> Hi Thomas, David, >> >> Sorry, I didn't think about async signal safety. >> >> That would work, VmError::report_and_die() is singlethreaded. At >> least the part which dumps out the core file name. >> >> >> I think that signal handler (in this case) may run concurrency with >> other thread. >> If another thread calls malloc(3) in JNI, C Heap corruption may occur. >> >> >> No, malloc(3) should be thread safe on our platforms. But this was not >> the point. If I understood David right, he suggested using a static >> buffer inside get_core_path() for assembling the core path, which would >> make get_core_path() thread-unsafe (multiple threads calling it would >> get garbled results). But as get_core_path() is only called from within >> VmError::report_and_die() and that section is only ever executed by one >> thread, Davids suggestion would still work. > > Yes that is what I was suggesting. > >> I want to rewrite a patch as below: >> >> - Use async signal safety functions. >> fopen -> open, fgets -> read, etc. 
> > This is commendable if it is practical, but error reporting already does many, many things that are not async-signal safe, so there is no need to go to extreme measures here. > >> - Use O_BUFLEN for buffer size. >> O_BUFLEN is defined to 2000 in ostream.hpp . >> This macro is used in various points. VMError::coredump_message is >> also defined with this value. >> >> >> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. >> length of temporary buffers to assemble an output line. And then it >> spread a bit. But your intend is to hold a path and using PATH_MAX >> clearly documents this. >> And, to really nitpick, right now you do not handle ERANGE with >> get_current_path() (if the provided buffer is too small), which is >> probably fine because it is improbable that a path is larger than >> PATH_MAX. But if you change the size of the buffer to something which >> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may fail. >> >> I like your patch, I think it could be a nice time safer when >> core_pattern is something unusual. But I also see Staffans point of >> too-much-complexity. So I will keep out of this discussion until the >> real Reviewers decided what to do :) > > I have a hard time evaluating the merits of the patch as I don't work in an environment where this extra info is needed. But I take it on good faith that it is useful for the context Yasumasa describes. > > David > >> Kind Regards, Thomas >> From cheleswer.sahu at oracle.com Tue Dec 2 12:15:52 2014 From: cheleswer.sahu at oracle.com (Cheleswer Sahu) Date: Tue, 2 Dec 2014 04:15:52 -0800 (PST) Subject: [8u40] request for approval: 8035893: JVM_GetVersionInfo fails to zero structure Message-ID: <3bf5b22e-be81-4244-b2e8-6b514abefdad@default> Hi! May I please have approval to backport this fix from JDK9 to JDK8. I have build the JDK-8 hotspot and tested already. JDK9 fix applies cleanly to JDK8 source. 
As I do not have account for OpenJDK, David Buck will push the fix into jdk8u/hs-dev/hotspot. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8035893 JDK9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cd30121047ac review thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/011054.html Regards, Cheleswer From thomas.stuefe at gmail.com Tue Dec 2 16:04:32 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 2 Dec 2014 17:04:32 +0100 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: <547D8B6A.6040002@oracle.com> References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> Message-ID: Hi David, you are a hard man to uncringe :) Here is a last modification, which in my opinion would be the best balance. Basically, it is (2) with the CPU dependend code moved away from shared coding and a fallback for CPUs which have no (known) way to cause a SIGILL. http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.03/ Kind Regards, Thomas On Tue, Dec 2, 2014 at 10:50 AM, David Holmes wrote: > On 1/12/2014 11:30 PM, Thomas St?fe wrote: > >> Hi all, >> >> lets not get this patch bogged down on ARM opcode discussions. >> >> For me, it is just a question of style and which one would be most >> acceptable to the OpenJDK. >> >> As I see it, here are my options: >> >> 1 leave the code as it is and whoever does ARM porting at Oracle will >> provide the SIGILL opcodes inside debug.cpp >> 2 like (1), but provide a fallback for CPUs where we do not know the >> SIGILL opcodes right now, by doing a raise(SIGILL). 
This would work but >> make the test a tiny bit less valuable on those platforms. >> >> 3 Move the CPU-dependend parts (the big #ifdef) away from debug.cpp >> into debug_.cpp. Would mean a bit code duplication because for 3 >> out of 5 cpus the SIGILL-generating opcode is 0. This basically would be >> the same as my second webrev >> (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/) >> 4 like (3), but with additional introduction of a debug_.hpp, and >> adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a >> common fallback for cpus where 0 generates SIGILL. >> >> I am leaning toward (2) or (3) but I am okay with any of the four. >> > > I'm really undecided here. #1 makes me cringe because of the cpu ifdefs in > shared code (including those for non-OpenJDK platforms). #3 and #4 make me > cringe because it is a lot of overhead to introduce the debug_.hpp > files on all platforms. > > That leaves #2 though I'm unclear how we will identify the platforms that > don't have defined bad opcodes. If that's still just a variant of the > ifdefs in #1 then I'm still cringing. :) > > Would appreciate someone else from runtime jumping in with an opinion here > :) > > David > > (PS. I'm on vacation tomorrow so apologies for delayed responses.) > > > Kind Regards, >> >> Thomas Stuefe >> >> >> >> >> >> >> >> On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley > > wrote: >> >> On 11/27/2014 11:00 AM, David Holmes wrote: >> > On 27/11/2014 8:55 PM, Andrew Haley wrote: >> >> On 11/27/2014 10:38 AM, Thomas St?fe wrote: >> >>> Hi Andrew, thank you! Does endianess matter ? >> >> >> >> Yes. I'd do it symbolically rather than mess with endian defines: >> >> >> >> #ifdef AARCH64 >> >> unsigned insn; >> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn)); >> >> #endif >> > >> > Does that work for ARMv7? >> >> Sorry, I don't know what a good choice there would be. 
And I must
>> warn you: DCPS1 isn't necessarily guaranteed to do this forever, but
>> it works on the kernels I've tried.
>>
>> Andrew.
>>

From jiangli.zhou at oracle.com Tue Dec 2 21:44:45 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 02 Dec 2014 13:44:45 -0800
Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive
In-Reply-To: <5451626D.2030102@oracle.com>
References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com>
Message-ID: <547E32CD.5050103@oracle.com>

Hi,

I finally got back to this, sorry for the delay. Please see the following new webrev:

http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/

New in the webrev:

1. Further compression of the compact table
   * Remove the bucket_size table. With the sequential layout of the buckets, the lookup process can seek to the start of the next bucket without needing the current bucket size. For the last bucket, it can seek to the end of the table. The table end offset is added to the archived data.
   * A bucket with exactly one entry is marked as a 'compact' bucket, whose entry only contains the symbol offset. The symbol hash is eliminated for 'compact' buckets. Lookup compares the symbol directly in that case.
2. The shared symbol table is not always looked up first. The table that last fetched a symbol successfully is the one used for lookup.
3. Added a lot more comments in compactHashtable.hpp with details of the compact table layout and dump/lookup process.
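To make the layout in item 1 above more concrete, here is a small, simplified model of such a table. It is a sketch under assumptions, not the code in compactHashtable.hpp: the type and member names, the flag bit, the hash function, and the use of vectors in place of archived offsets are all hypothetical. It shows the two ideas from the message: a bucket's end is found from the next bucket's start (with one trailing table-end word), and a single-entry 'compact' bucket stores no hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical model of a compact symbol table.
struct CompactTable {
    using u32 = std::uint32_t;
    static constexpr u32 kCompactFlag = 0x80000000u;

    // One word per bucket: start index into `entries`, top bit set for a
    // 'compact' bucket. One extra trailing word holds the table end.
    std::vector<u32> bucket_info;
    // Regular bucket: (hash, symbol index) pairs. Compact bucket: symbol index only.
    std::vector<u32> entries;
    // Stand-in for the archived symbol data.
    std::vector<std::string> symbols;

    static u32 hash_of(const std::string& s) {
        u32 h = 0;
        for (unsigned char c : s) h = 31u * h + c;
        return h;
    }

    static CompactTable build(const std::vector<std::string>& names, u32 bucket_count) {
        assert(bucket_count > 0);
        std::vector<std::vector<u32>> buckets(bucket_count);
        for (u32 i = 0; i < names.size(); ++i) {
            buckets[hash_of(names[i]) % bucket_count].push_back(i);
        }
        CompactTable t;
        t.symbols = names;
        for (const auto& b : buckets) {
            u32 start = static_cast<u32>(t.entries.size());
            if (b.size() == 1) {
                // Compact bucket: no hash stored, only the symbol index.
                t.bucket_info.push_back(start | kCompactFlag);
                t.entries.push_back(b[0]);
            } else {
                t.bucket_info.push_back(start);
                for (u32 idx : b) {
                    t.entries.push_back(hash_of(names[idx]));
                    t.entries.push_back(idx);
                }
            }
        }
        // Table end offset, so the last bucket also has a known end.
        t.bucket_info.push_back(static_cast<u32>(t.entries.size()));
        return t;
    }

    // Returns the symbol index, or -1 if the name is not in the table.
    int lookup(const std::string& name) const {
        u32 h = hash_of(name);
        u32 b = h % static_cast<u32>(bucket_info.size() - 1);
        u32 start = bucket_info[b] & ~kCompactFlag;
        u32 end = bucket_info[b + 1] & ~kCompactFlag;  // next bucket start or table end
        if (bucket_info[b] & kCompactFlag) {
            // No stored hash to check: compare the symbol directly.
            u32 idx = entries[start];
            return symbols[idx] == name ? static_cast<int>(idx) : -1;
        }
        for (u32 i = start; i + 1 < end; i += 2) {
            if (entries[i] == h && symbols[entries[i + 1]] == name) {
                return static_cast<int>(entries[i + 1]);
            }
        }
        return -1;
    }
};
```

In this model an empty bucket simply has equal start and end offsets, so no per-bucket size field is needed, which is the saving the message describes.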
I measured using the classloading benchmark that Aleksey pointed me to. This benchmark loads classes using a user-defined classloader. There is a very small degradation shown in the benchmark comparing 'before' and 'after' with the archive dumped with the default configuration. When symbols from the test are added to the shared table, there is an observable speedup in the benchmark. The speedup is also very small. Thanks, Jiangli On 10/29/2014 02:55 PM, Jiangli Zhou wrote: > Hi John, > > Thank you for the thoughts on this! Yes, it's a good time to have > these conversations. Please see some quick responses from me below, > with more details to follow. > > On 10/29/2014 12:46 PM, John Rose wrote: >> I have a few points about the impact of this change on startup performance, and on trends in our code base: >> >> 1. We can live with small performance regressions on some benchmarks. Otherwise we'd never get anywhere. So I am not saying that the current (very interesting and multi-faceted) conversation must continue a long time before we can push any code. >> >> 2. Aleksey's challenge is valid, and I don't see a strong reply to it yet. Work like this, that emphasizes compactness and sharability can usually be given the benefit of the doubt for startup. But if we observe a problem we should make a more careful measurement. If this change goes in with a measurable regression, we need a follow-up conversation (centered around a tracking bug) about quantifying the performance regression and fixing it. (It's possible the conversation ends by saying "we decided we don't care and here's why", but I doubt it will end that way.) > > Besides my classloading benchmark results posted in an earlier message, > we have asked the performance team to help us measure startup, > classloading, and memory saving regarding this change, and verified > there was no regression in startup/classloading. The results were not > posted in the thread however. They were added to the bug report for > JDK-8059510.
> >> 3. The benchmark Aleksey chose, Nashorn startup, may well be an outlier. Dynamic language runtimes create lots of tiny methods, and are likely to put heavy loads on symbol resolution, including resolution of symbols that are not known statically. For these, the first-look (front-end) table being proposed might be a loss, especially if the Nashorn runtime is not in the first-look table. >> >> 4. Possible workarounds: #1 Track hit rates and start looking at the static table *second* if it fails to produce enough hits, compared to the dynamic table. #2 When that happens, incrementally migrate (rehash) frequently found symbols in the static table into the main table. #3 Re-evaluate the hashing algorithm. (More on this below.) > > That's interesting thinking. I'll follow up on this. > >> 5. I strongly support moving towards position-independent, shareable, pointer-free data structures, and I want us to learn to do it well over time. (Ioi, you are one of our main explorers here!) To me, a performance regression is a suggestion that we have something more to learn. And, it's not a surprise that we don't have it all figured out yet. > > Point taken! Position-independent is one of our goal for the archived > data. We'll be start looking into removing direct pointers in the > shared class data. > >> 6. For work like this, we need to agree upon at least one or two startup performance tests to shake out bottlenecks early and give confidence. People who work on startup performance should know how to run them and be able to quote them. > > Thank you for bring this up. Totally agree. For me, personally I've > played with different benchmarks and applications with different focus > over time. It would be good to have some commonly agreed startup > performance tests for this. A standalone classloading benchmark, > HelloWorld and Nashorn startup probably are good choices. We also have > an internal application for startup and memory saving measurement. > >> 7. 
There's a vibrant literature about offline (statically created) hash tables, and lots of tricks floating around, such as perfect or semi-perfect hash functions, and multiple-choice hashing, and locality-aware structures. I can think of several algorithmic tweaks I'd like to try on this code. (If they haven't already been tried or discarded: I assume Ioi has already tried a number of things.) Moreover, this is not just doorknob-polishing, because (as said in point 5) we want to develop our competency with these sorts of data structures. >> >> 8. I found the code hard to read, so it was hard to reason about the algorithm. The core routine, "lookup", has no comments and only one assert. It uses only primitive C types and (perhaps in the name of performance) is not factored into sub-functions. The code which generates the static table is also hard to reason about in similar ways. The bucket size parameter choice (a crucial compromise between performance and compactness) is 4 but the motivation and implications are left for the reader to puzzle out. Likewise, the use of hardware division instead of power-of-two table size is presented without comment, despite the fact that we favor power-of-two in other parts of our stack, and a power-of-two solution would be reasonable here (given the bucket size). >> >> 9. Perhaps I wouldn't care as much about the code style if the code were not performance sensitive or if it were a unique and isolated part of the JVM. In this case, I expect this code (and other similar code) to improve, over time, in readability and robustness, as we learn to work with this new kind of data structure. So even if we decide there is no significant regression here, and decide to push it as-is, we still need to use it as an example to help us get better at writing easy-to-read code which works with pointer-free data. >> >> 10. I would like to see (posted somewhere or attached to the bug) a sample list of the symbols in a typical symbol table. 
Perhaps this already exists and I missed it. I think it would be friendly to put some comments in the code that help the reader estimate numbers like table size, bucket length, number of queries, number of probes per query, symbol length statistics (mean, histogram). Of course such comments go stale over time, but so does the algorithm, and being coy about the conditions of the moment doesn't help us in the long run. Even a short comment is better than none, for example: >> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/classfile/vmSymbols.cpp#l206 > > Those are good suggestions. I'll add a sample list o the symbols to > the bug report and try to put more comments in the code. > > Thanks! > > Jiangli > >> It is a good time to have these conversations. >> >> Best wishes, >> ? John >> >> On Oct 13, 2014, at 11:46 AM, Aleksey Shipilev wrote: >> >>> Hi Jiangli, >>> >>> On 13.10.2014 18:26, Jiangli Zhou wrote: >>>> On 10/13/2014 03:18 AM, Aleksey Shipilev wrote: >>>>> On 13.10.2014 03:32, David Holmes wrote: >>>>>> On 11/10/2014 1:47 PM, Jiangli Zhou wrote: >>>>>> Also is the benchmarking being done on dedicated systems? >>>>> Also, specjvm98 is meaningless to estimate the classloading costs. >>>>> Please try specjvm2008:startup.* tests? >>>> The specjvm run was for Gerard's question about standard benchmarks. >>> SPECjvm2008 is a standard benchmark. In fact, it is a benchmark that >>> deprecates SPECjvm98. >>> >>>> These are not benchmarks specifically for classloading. >>> There are benchmarks that try to estimate the startup costs. >>> SPECjvm2008:startup.* tests are one of them. >>> >>>> However, I agree it's a good idea to run standard benchmarks to >>>> confirm there is no overall performance degradation. From all the >>>> benchmarks including classloading measurements, we have confirmed >>>> that this specific change does not have negative impact on >>>> classloading itself and the overall performance. >>> Excellent. 
What are those benchmarks? May we see those? Because I have a >>> counter-example in this thread that this change *does* negatively impact >>> classloading. >>> >>> -Aleksey. >>> > From dean.long at oracle.com Wed Dec 3 04:09:24 2014 From: dean.long at oracle.com (Dean Long) Date: Tue, 02 Dec 2014 20:09:24 -0800 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> Message-ID: <547E8CF4.3050305@oracle.com> Instead of get_illegal_instruction_sequence() that fills in a buffer in a reserved memory page, how about simply generate_illegal_instruction_sequence() that causes the SIGILL when executed? Then crash_with_sigill() simplifies to something like: tty->print_cr("will jump to PC " PTR_FORMAT ", which should cause a SIGILL.", generate_illegal_instruction_sequence); tty->flush(); generate_illegal_instruction_sequence(); // boom dl On 12/2/2014 8:04 AM, Thomas Stüfe wrote: > Hi David, you are a hard man to uncringe :) > > Here is a last modification, which in my opinion would be the best balance. > Basically, it is (2) with the CPU-dependent code moved away from shared > coding and a fallback for CPUs which have no (known) way to cause a SIGILL. > > http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.03/ > > Kind Regards, Thomas > > > On Tue, Dec 2, 2014 at 10:50 AM, David Holmes > wrote: > >> On 1/12/2014 11:30 PM, Thomas Stüfe wrote: >> >>> Hi all, >>> >>> let's not get this patch bogged down on ARM opcode discussions. >>> >>> For me, it is just a question of style and which one would be most >>> acceptable to the OpenJDK.
>>> >>> As I see it, here are my options: >>> >>> 1 leave the code as it is and whoever does ARM porting at Oracle will >>> provide the SIGILL opcodes inside debug.cpp >>> 2 like (1), but provide a fallback for CPUs where we do not know the >>> SIGILL opcodes right now, by doing a raise(SIGILL). This would work but >>> make the test a tiny bit less valuable on those platforms. >>> >>> 3 Move the CPU-dependend parts (the big #ifdef) away from debug.cpp >>> into debug_.cpp. Would mean a bit code duplication because for 3 >>> out of 5 cpus the SIGILL-generating opcode is 0. This basically would be >>> the same as my second webrev >>> (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/) >>> 4 like (3), but with additional introduction of a debug_.hpp, and >>> adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a >>> common fallback for cpus where 0 generates SIGILL. >>> >>> I am leaning toward (2) or (3) but I am okay with any of the four. >>> >> I'm really undecided here. #1 makes me cringe because of the cpu ifdefs in >> shared code (including those for non-OpenJDK platforms). #3 and #4 make me >> cringe because it is a lot of overhead to introduce the debug_.hpp >> files on all platforms. >> >> That leaves #2 though I'm unclear how we will identify the platforms that >> don't have defined bad opcodes. If that's still just a variant of the >> ifdefs in #1 then I'm still cringing. :) >> >> Would appreciate someone else from runtime jumping in with an opinion here >> :) >> >> David >> >> (PS. I'm on vacation tomorrow so apologies for delayed responses.) >> >> >> Kind Regards, >>> Thomas Stuefe >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley >> > wrote: >>> >>> On 11/27/2014 11:00 AM, David Holmes wrote: >>> > On 27/11/2014 8:55 PM, Andrew Haley wrote: >>> >> On 11/27/2014 10:38 AM, Thomas St?fe wrote: >>> >>> Hi Andrew, thank you! Does endianess matter ? >>> >> >>> >> Yes. 
I'd do it symbolically rather than mess with endian defines: >>> >> >>> >> #ifdef AARCH64 >>> >> unsigned insn; >>> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn)); >>> >> #endif >>> > >>> > Does that work for ARMv7? >>> >>> Sorry, I don't know what a good choice there would be. And I must >>> warn you: DCPS1 isn't necessarily guaranteed to do this forever, but >>> it works on the kernels I've tried. >>> >>> Andrew. >>> >>> >>> >>> From serguei.spitsyn at oracle.com Wed Dec 3 07:48:17 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 02 Dec 2014 23:48:17 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547D265A.20005@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> Message-ID: <547EC041.4000103@oracle.com> The fix still looks good to me. Thanks, Serguei On 12/1/14 6:39 PM, Chris Plummer wrote: > Sorry about the long delay in getting back to this. I ran into two > separate JPRT issues that were preventing me from testing these > changes, plus I was on vacation last week. Here's an updated webrev. > I'm not sure where we left things, so I'll just say what's changed > since the original version: > > 1. Rewrote the test to be in Java instead of a shell script. > 2. Moved the test from hotspot/test/runtime/memory to > jdk/test/tools/launcher > 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to override > the default 32k minimum value. 
> > https://bugs.openjdk.java.net/browse/JDK-6762191 > http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ > > thanks, > > Chris > > On 11/19/14 7:52 AM, Chris Plummer wrote: >> On 11/19/14 2:12 AM, David Holmes wrote: >>> On 19/11/2014 6:49 PM, Chris Plummer wrote: >>>> I've update the webrev to add STACK_SIZE_MINIMUM in place of the 32k >>>> references, and also moved the test from hotspot/test/runtime to >>>> jdk/test/tools/launcher as David requested. That required some >>>> adjustments to the test script, since test_env.sh does not exist in >>>> jdk/test, so I had to pull in the bits I needed into the script. >>> >>> Is there a reason this needs a shell script instead of using the >>> testlibrary tools to launch the VM and check the output? >> Not that I'm aware of. I guess I just really didn't look at what it >> would take to make it all in java. I'll have a look at java examples >> and convert it. >> >> Chris >>> >>> Sorry that should have been mentioned much earlier. >>> >>> David >>> >>> >>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/ >>>> >>>> I still need to rerun through JPRT. I'll do so once there are no more >>>> suggested changes. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 11/18/14 2:08 PM, Chris Plummer wrote: >>>>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in >>>>> java.c. >>>>> >>>>> Chris >>>>> >>>>> On 11/12/14 6:43 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Sorry for the delay. >>>>>> >>>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I'm still looking for reviewers. >>>>>> >>>>>> As the change is to the launcher it needs to be reviewed by the >>>>>> launcher owner - which I think is serviceability (though also cc'd >>>>>> Kumar :) ). >>>>>> >>>>>> Launcher change, and your rationale, seems okay to me. I'd probably >>>>>> put the test in to jdk/test/tools/launcher/ though. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote: >>>>>>>> This is an initial review for 6762191. I'm guessing there will be >>>>>>>> recommendations to fix in a different way, but thought this >>>>>>>> would be a >>>>>>>> good time to start the discussion. >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/ >>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/ >>>>>>>> >>>>>>>> The bug is that if the -Xss size is set to something very small >>>>>>>> (like >>>>>>>> 16k), on linux there will be a crash due to overwriting the end >>>>>>>> of the >>>>>>>> stack. This happens before hotspot can compute its stack needs and >>>>>>>> verify that the stack is big enough. >>>>>>>> >>>>>>>> It didn't seem viable to move the hotspot stack size check >>>>>>>> earlier. It >>>>>>>> depends on too much other work done before that point, and the >>>>>>>> changes >>>>>>>> would have been disruptive. The stack size check is currently >>>>>>>> done in >>>>>>>> os::init_2(). >>>>>>>> >>>>>>>> What is needed is a check before the thread is created. That >>>>>>>> way we >>>>>>>> can create a thread with a big enough stack to handle all needs >>>>>>>> up to >>>>>>>> the point of the check in os::init_2(). This initial check does >>>>>>>> not >>>>>>>> need to be the final check. It just needs to confirm that we have >>>>>>>> enough stack to get us to the check in os::init_2(). >>>>>>>> >>>>>>>> I decided to check in java.c if the -Xss size is too small, and >>>>>>>> set it >>>>>>>> to a larger size if it is. I hard coded this size to 32k (I'll >>>>>>>> explain >>>>>>>> why 32k later). I suspect this is the part that will result in >>>>>>>> some >>>>>>>> debate. If you have better suggestions let me know. 
If it does >>>>>>>> stay >>>>>>>> here, then probably the 32k needs to be a #define, and maybe >>>>>>>> even an >>>>>>>> OS porting interface, but I'm not sure where to put it. >>>>>>>> >>>>>>>> The reason I chose 32k is because this is big enough for all >>>>>>>> platforms >>>>>>>> to get to the stack size check in os::init_2(). It is also smaller >>>>>>>> than the actual minimum stack size allowed on any platform. 32-bit >>>>>>>> windows has the smallest requirement at 64k. I add some printfs to >>>>>>>> print the minimum stack requirement, and then ran a simple >>>>>>>> JTReg test >>>>>>>> with every JPRT supported platform to get the results. >>>>>>>> >>>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k, >>>>>>>> -Xss32k, and -XXss, where is the size from the >>>>>>>> error message produced by the JVM, such as in the following: >>>>>>>> >>>>>>>> $ java -Xss32k -version >>>>>>>> The stack size specified is too small, Specify at least 100k >>>>>>>> Error: Could not create the Java Virtual Machine. >>>>>>>> Error: A fatal exception has occurred. Program will exit. >>>>>>>> >>>>>>>> I ran this test through JPRT on all platforms, and they all pass. >>>>>>>> >>>>>>>> One thing to point out is that Windows behaves a bit different >>>>>>>> than >>>>>>>> the other platforms. It always rounds the stack size up to a >>>>>>>> multiple >>>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On >>>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so >>>>>>>> there >>>>>>>> is no error produced in this case. However, on 32-bit Windows >>>>>>>> with C2, >>>>>>>> 68k is the minimum, so an error is produced since the stack >>>>>>>> will only >>>>>>>> be 64k. There is no bug here. It's just a bit confusing. 
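The two size adjustments described above (the launcher raising a too-small -Xss value, and Windows rounding up to a multiple of 64k) can be sketched as below. This is a minimal illustration, not the code in java.c: the helper names `clamp_stack_size` and `round_up_to_64k` are hypothetical, and only the `STACK_SIZE_MINIMUM` name and the 32k and 64k figures come from the thread.

```cpp
#include <cassert>
#include <cstdint>

// 32k is the default discussed in the thread; a build could override it,
// for example with -DSTACK_SIZE_MINIMUM=... on the compiler command line.
#ifndef STACK_SIZE_MINIMUM
#define STACK_SIZE_MINIMUM (32 * 1024)
#endif

// Hypothetical helper: raise an explicitly requested -Xss value that is too
// small to reach the real stack size check in os::init_2().
inline std::int64_t clamp_stack_size(std::int64_t requested) {
  return requested < STACK_SIZE_MINIMUM ? STACK_SIZE_MINIMUM : requested;
}

// Hypothetical helper modelling the Windows behaviour described in the
// thread: the stack size is rounded up to a multiple of 64k.
inline std::int64_t round_up_to_64k(std::int64_t size) {
  const std::int64_t granule = 64 * 1024;
  return (size + granule - 1) / granule * granule;
}
```

Under these assumptions -Xss16k becomes 32k in the launcher and 64k on Windows, which matches the 32-bit C1 minimum, while the 68k minimum for 32-bit C2 is above 64k, so the VM reports the error there.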
>>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>> >>>>> >>>> >> > From aleksey.shipilev at oracle.com Wed Dec 3 10:42:40 2014 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 03 Dec 2014 13:42:40 +0300 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547E32CD.5050103@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> Message-ID: <547EE920.5060406@oracle.com> Hi Jiangli, On 12/03/2014 12:44 AM, Jiangli Zhou wrote: > http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ Looks good. > New in the webrev: > > 1. Further compression of the compact table > > * Remove the bucket_size table. With the sequential layout of the > buckets, lookup process can seek to the start of the next bucket > without the need of the current bucket size. For the last bucket, it > can seek to the end of the table. The table end offset is added to > the archived data. > * Bucket with exactly one entry is marked as 'compact' bucket, whose > entry only contains the symbol offset. The symbol hash is eliminated > for 'compact' buckets. Lookup compares the symbol directly in that case. I'll keep this for others to review. > 2. The shared symbol table is not always looked up first. The last table > that fetches symbol successfully is used for lookup. This is a good stuff. > I measured using the classloading benchmark that Aleksey pointed to me. 
> This benchmark loads classes using user defined classloader. There is a > very small degradation shown in the benchmark comparing 'before' and > 'after' with archive dumped with the default configuration. When symbols > from the test is added to the shared table, there is an observable > speedup in the benchmark. The speedup is also very small. Yes, I have remeasured on my scenario, and there seems to be no statistically significant degradation now. -Aleksey. From thomas.stuefe at gmail.com Wed Dec 3 10:47:42 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 3 Dec 2014 11:47:42 +0100 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: <547E8CF4.3050305@oracle.com> References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> <547E8CF4.3050305@oracle.com> Message-ID: Hi Dean, I don't understand. Such a function does not exist, does it? So I would have to write it. Do you mean generating and using a StubRoutine which would SIGILL? I did not do this because I wanted to be able to generate SIGILL in initialization code too, where StubRoutines may not yet be generated. This point may be arguable, but as this function is used to test error handling, it may be interesting to test it for half-initialized VMs too. Otherwise I would implement the CPU-specific generate_illegal_instruction_sequence() probably the same way as I now implement the crash_with_sigill() function.
That would mean a bit of more code duplication because: - Either I use the method I use now (reserve_memory and copy the instructions to the reserved page) - Or I use inline assembly - which probably does not work across multiple OSs, so for CPUs which span various OSs I would have to add one function per os_cpu combination, not just per cpu. Kind regards, Thomas On Wed, Dec 3, 2014 at 5:09 AM, Dean Long wrote: > Instead of get_illegal_instruction_sequence() that fills in a buffer in > reserved memory page, how > about simply generate_illegal_instruction_sequence() that causes the > SIGILL when executed? > Then crash_with_sigill() simplifies to something like: > > tty->print_cr("will jump to PC " PTR_FORMAT", which should cause a > SIGILL.",generate_illegal_instruction_sequence); > tty->flush(); > > generate_illegal_instruction_sequence(); // boom > > dl > > > On 12/2/2014 8:04 AM, Thomas St?fe wrote: > >> Hi David, you are a hard man to uncringe :) >> >> Here is a last modification, which in my opinion would be the best >> balance. >> Basically, it is (2) with the CPU dependend code moved away from shared >> coding and a fallback for CPUs which have no (known) way to cause a >> SIGILL. >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.03/ >> >> Kind Regards, Thomas >> >> >> On Tue, Dec 2, 2014 at 10:50 AM, David Holmes >> wrote: >> >> On 1/12/2014 11:30 PM, Thomas St?fe wrote: >>> >>> Hi all, >>>> >>>> lets not get this patch bogged down on ARM opcode discussions. >>>> >>>> For me, it is just a question of style and which one would be most >>>> acceptable to the OpenJDK. >>>> >>>> As I see it, here are my options: >>>> >>>> 1 leave the code as it is and whoever does ARM porting at Oracle will >>>> provide the SIGILL opcodes inside debug.cpp >>>> 2 like (1), but provide a fallback for CPUs where we do not know the >>>> SIGILL opcodes right now, by doing a raise(SIGILL). 
This would work but >>>> make the test a tiny bit less valuable on those platforms. >>>> >>>> 3 Move the CPU-dependend parts (the big #ifdef) away from debug.cpp >>>> into debug_.cpp. Would mean a bit code duplication because for 3 >>>> out of 5 cpus the SIGILL-generating opcode is 0. This basically would be >>>> the same as my second webrev >>>> (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/) >>>> 4 like (3), but with additional introduction of a debug_.hpp, and >>>> adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a >>>> common fallback for cpus where 0 generates SIGILL. >>>> >>>> I am leaning toward (2) or (3) but I am okay with any of the four. >>>> >>>> I'm really undecided here. #1 makes me cringe because of the cpu >>> ifdefs in >>> shared code (including those for non-OpenJDK platforms). #3 and #4 make >>> me >>> cringe because it is a lot of overhead to introduce the debug_.hpp >>> files on all platforms. >>> >>> That leaves #2 though I'm unclear how we will identify the platforms that >>> don't have defined bad opcodes. If that's still just a variant of the >>> ifdefs in #1 then I'm still cringing. :) >>> >>> Would appreciate someone else from runtime jumping in with an opinion >>> here >>> :) >>> >>> David >>> >>> (PS. I'm on vacation tomorrow so apologies for delayed responses.) >>> >>> >>> Kind Regards, >>> >>>> Thomas Stuefe >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley >>> > wrote: >>>> >>>> On 11/27/2014 11:00 AM, David Holmes wrote: >>>> > On 27/11/2014 8:55 PM, Andrew Haley wrote: >>>> >> On 11/27/2014 10:38 AM, Thomas St?fe wrote: >>>> >>> Hi Andrew, thank you! Does endianess matter ? >>>> >> >>>> >> Yes. I'd do it symbolically rather than mess with endian >>>> defines: >>>> >> >>>> >> #ifdef AARCH64 >>>> >> unsigned insn; >>>> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn)); >>>> >> #endif >>>> > >>>> > Does that work for ARMv7? 
>>>> >>>> Sorry, I don't know what a good choice there would be. And I must >>>> warn you: DCPS1 isn't necessarily guaranteed to do this forever, >>>> but >>>> it works on the kernels I've tried. >>>> >>>> Andrew. >>>> >>>> >>>> >>>> >>>> > From Alan.Bateman at oracle.com Wed Dec 3 12:56:41 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 03 Dec 2014 12:56:41 +0000 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547D265A.20005@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> Message-ID: <547F0889.5050204@oracle.com> On 02/12/2014 02:39, Chris Plummer wrote: > Sorry about the long delay in getting back to this. I ran into two > separate JPRT issues that were preventing me from testing these > changes, plus I was on vacation last week. Here's an updated webrev. > I'm not sure where we left things, so I'll just say what's changed > since the original version: > > 1. Rewrote the test to be in Java instead of a shell script. > 2. Moved the test from hotspot/test/runtime/memory to > jdk/test/tools/launcher > 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to override > the default 32k minimum value. > > https://bugs.openjdk.java.net/browse/JDK-6762191 > http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ This looks okay to me. A minor comment for java.c is that this code uses 4-space indent (different to hotspot). The test looks okay too; you might just check the copyright date, as I assume it was not written in 2010. Also I think the import of java.io.File may be left behind from the previous round.
-Alan From jiangli.zhou at oracle.com Wed Dec 3 17:42:04 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 09:42:04 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547EE920.5060406@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547EE920.5060406@oracle.com> Message-ID: <547F4B6C.3070400@oracle.com> Hi Aleksey, Thanks a lot for the review! Jiangli On 12/03/2014 02:42 AM, Aleksey Shipilev wrote: > Hi Jiangli, > > On 12/03/2014 12:44 AM, Jiangli Zhou wrote: >> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ > Looks good. > >> New in the webrev: >> >> 1. Further compression of the compact table >> >> * Remove the bucket_size table. With the sequential layout of the >> buckets, lookup process can seek to the start of the next bucket >> without the need of the current bucket size. For the last bucket, it >> can seek to the end of the table. The table end offset is added to >> the archived data. >> * Bucket with exactly one entry is marked as 'compact' bucket, whose >> entry only contains the symbol offset. The symbol hash is eliminated >> for 'compact' buckets. Lookup compares the symbol directly in that case. > I'll keep this for others to review. > >> 2. The shared symbol table is not always looked up first. The last table >> that fetches symbol successfully is used for lookup. > This is a good stuff. 
> >> I measured using the classloading benchmark that Aleksey pointed to me. >> This benchmark loads classes using user defined classloader. There is a >> very small degradation shown in the benchmark comparing 'before' and >> 'after' with archive dumped with the default configuration. When symbols >> from the test is added to the shared table, there is an observable >> speedup in the benchmark. The speedup is also very small. > Yes, I have remeasured on my scenario, and there seems to be no > statistically significant degradation now. > > -Aleksey. > > From dean.long at oracle.com Wed Dec 3 17:51:26 2014 From: dean.long at oracle.com (Dean Long) Date: Wed, 03 Dec 2014 09:51:26 -0800 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> <547E8CF4.3050305@oracle.com> Message-ID: <547F4D9E.9060202@oracle.com> On 12/3/2014 2:47 AM, Thomas St?fe wrote: > Hi Dean, > > I dont understand. Such a function does not exist, does it? So I would > have to write it: > > Do you mean generating and using a StubRoutine which would SIGILL? I > did not do this because I wanted to be able to generate SIGILL also in > initialization code, where StubRoutines may not yet be generated. This > point may may be arguable, but as this function is used to test error > handling, it may be interesting to test it for half-initialized VMs too. > > Otherwise I would implement the CPU specific > generate_illegal_instruction_sequence() probably the same way as I do > now the crash_with_sigill() function. 
That would mean a bit more > code duplication because: > - Either I use the method I use now (reserve_memory and copy the > instructions to the reserved page) > - Or I use inline assembly - which probably does not work across > multiple OSs, so for CPUs which span various OSs I would have to add > one function per os_cpu combination, not just per cpu. > Yes, I was thinking inline assembly or assembly in a .S file, but didn't consider CPUs which span various OSes. I don't have a strong opinion about the approach, except I would like to avoid ifdefs in shared code, and when using memcpy to new memory, we may need to flush the icache afterwards. And on aarch64, we may need to flush the pipeline of each processor. I think the aarch64 port does this at safepoints. I'm not sure, but you may be able to generate the code using the MacroAssembler early in initialization, even before StubRoutines are generated. If so, then that solves the per os_cpu combination problem, but not the icache/pipeline flushing problem. dl > Kind regards, Thomas > > On Wed, Dec 3, 2014 at 5:09 AM, Dean Long > wrote: > > Instead of get_illegal_instruction_sequence() that fills in a > buffer in reserved memory page, how > about simply generate_illegal_instruction_sequence() that causes > the SIGILL when executed? > Then crash_with_sigill() simplifies to something like: > > tty->print_cr("will jump to PC " PTR_FORMAT", which should > cause a SIGILL.",generate_illegal_instruction_sequence); > tty->flush(); > > generate_illegal_instruction_sequence(); // boom > > dl > > > On 12/2/2014 8:04 AM, Thomas Stüfe wrote: > > Hi David, you are a hard man to uncringe :) > > Here is a last modification, which in my opinion would be the > best balance. > Basically, it is (2) with the CPU-dependent code moved away > from shared > coding and a fallback for CPUs which have no (known) way to > cause a SIGILL. 
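A minimal sketch of such a fallback, assuming it is done with raise(SIGILL) as option (2) in the quoted discussion proposes. The names `raise_sigill_fallback`, `note_sigill` and `sigill_seen` are hypothetical, and the temporary handler exists only so the example can observe delivery; in the VM the regular crash handler would be the one receiving the signal.

```cpp
#include <csignal>

// Hypothetical flag so the caller can observe that SIGILL was delivered.
static volatile std::sig_atomic_t sigill_seen = 0;

extern "C" void note_sigill(int /*signo*/) { sigill_seen = 1; }

// Fallback for CPUs with no known illegal opcode: raise SIGILL in software.
// Returns true if the temporary handler observed the signal.
bool raise_sigill_fallback() {
  sigill_seen = 0;
  void (*previous)(int) = std::signal(SIGILL, note_sigill);
  if (previous == SIG_ERR) return false;
  std::raise(SIGILL);             // the handler runs before raise() returns
  std::signal(SIGILL, previous);  // restore the earlier disposition
  return sigill_seen == 1;
}
```

A signal raised this way does not come from the CPU faulting on an instruction, so it probably exercises the error-reporting path slightly differently from a real illegal opcode.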
> > http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.03/ > > > Kind Regards, Thomas > > > On Tue, Dec 2, 2014 at 10:50 AM, David Holmes > > > wrote: > > On 1/12/2014 11:30 PM, Thomas Stüfe wrote: > > Hi all, > > let's not get this patch bogged down on ARM opcode > discussions. > > For me, it is just a question of style and which one > would be most > acceptable to the OpenJDK. > > As I see it, here are my options: > > 1 leave the code as it is and whoever does ARM > porting at Oracle will > provide the SIGILL opcodes inside debug.cpp > 2 like (1), but provide a fallback for CPUs where we > do not know the > SIGILL opcodes right now, by doing a raise(SIGILL). > This would work but > make the test a tiny bit less valuable on those platforms. > > 3 Move the CPU-dependent parts (the big #ifdef) away > from debug.cpp > into debug_.cpp. Would mean a bit of code > duplication because for 3 > out of 5 cpus the SIGILL-generating opcode is 0. This > basically would be > the same as my second webrev > (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/ > ) > 4 like (3), but with additional introduction of a > debug_.hpp, and > adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro > to provide a > common fallback for cpus where 0 generates SIGILL. > > I am leaning toward (2) or (3) but I am okay with any > of the four. > > I'm really undecided here. #1 makes me cringe because of > the cpu ifdefs in > shared code (including those for non-OpenJDK platforms). > #3 and #4 make me > cringe because it is a lot of overhead to introduce the > debug_.hpp > files on all platforms. > > That leaves #2 though I'm unclear how we will identify the > platforms that > don't have defined bad opcodes. If that's still just a > variant of the > ifdefs in #1 then I'm still cringing. :) > > Would appreciate someone else from runtime jumping in with > an opinion here > :) > > David > > (PS. I'm on vacation tomorrow so apologies for delayed > responses.) 
> > > Kind Regards, > > Thomas Stuefe > > > > > > > > On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley > > >> wrote: > > On 11/27/2014 11:00 AM, David Holmes wrote: > > On 27/11/2014 8:55 PM, Andrew Haley wrote: > >> On 11/27/2014 10:38 AM, Thomas Stüfe wrote: > >>> Hi Andrew, thank you! Does endianness matter ? > >> > >> Yes. I'd do it symbolically rather than mess > with endian defines: > >> > >> #ifdef AARCH64 > >> unsigned insn; > >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : > "=r"(insn)); > >> #endif > > > > Does that work for ARMv7? > > Sorry, I don't know what a good choice there > would be. And I must > warn you: DCPS1 isn't necessarily guaranteed to > do this forever, but > it works on the kernels I've tried. > > Andrew. > > > > > > From kumar.x.srinivasan at oracle.com Wed Dec 3 18:58:24 2014 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Wed, 03 Dec 2014 10:58:24 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547D265A.20005@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> Message-ID: <547F5D50.30309@oracle.com> Hi Chris, Approved with some minor nits and typos which need correction. Yes, java.c follows the JDK indenting as Alan pointed out. TooSmallStackSize.java Copyright should be 2014, 1. 37 * stack size for the platform (as provided by the JVM error message when a very 38 * small is used), and then verify that the JVM can be launched with that stack s/small is/small stack is/ 2. 54 * Returns the minimum stack size this platform will allowed based on the s/allowed/allow/ 3. 58 * The TestResult argument must containthe result of having already run s/containthe/contain the/ 4. 
92 if (verbose) System.out.println("*** Testing " + stackSize); I know this is allowed in the HotSpot world but in JDK land we always use with a LF + Indent, also in other places. 5. 85 * Returns the mimumum allowed stack size gleaned from the error message, s/mimumum/minimum Finally I am concerned with the integration, since it spans both HotSpot and JDK, and it appears the test will fail if the HotSpot changes are not integrated first, or has it already ? Thanks Kumar On 12/1/2014 6:39 PM, Chris Plummer wrote: > Sorry about the long delay in getting back to this. I ran into two > separate JPRT issues that were preventing me from testing these > changes, plus I was on vacation last week. Here's an updated webrev. > I'm not sure where we left things, so I'll just say what's changed > since the original version: > > 1. Rewrote the test to be in Java instead of a shell script. > 2. Moved the test from hotspot/test/runtime/memory to > jdk/test/tools/launcher > 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to override > the default 32k minimum value. > > https://bugs.openjdk.java.net/browse/JDK-6762191 > http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ > > thanks, > > Chris > > On 11/19/14 7:52 AM, Chris Plummer wrote: >> On 11/19/14 2:12 AM, David Holmes wrote: >>> On 19/11/2014 6:49 PM, Chris Plummer wrote: >>>> I've update the webrev to add STACK_SIZE_MINIMUM in place of the 32k >>>> references, and also moved the test from hotspot/test/runtime to >>>> jdk/test/tools/launcher as David requested. That required some >>>> adjustments to the test script, since test_env.sh does not exist in >>>> jdk/test, so I had to pull in the bits I needed into the script. >>> >>> Is there a reason this needs a shell script instead of using the >>> testlibrary tools to launch the VM and check the output? >> Not that I'm aware of. I guess I just really didn't look at what it >> would take to make it all in java. I'll have a look at java examples >> and convert it. 
>> >> Chris >>> >>> Sorry that should have been mentioned much earlier. >>> >>> David >>> >>> >>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/ >>>> >>>> I still need to rerun through JPRT. I'll do so once there are no more >>>> suggested changes. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 11/18/14 2:08 PM, Chris Plummer wrote: >>>>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in >>>>> java.c. >>>>> >>>>> Chris >>>>> >>>>> On 11/12/14 6:43 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Sorry for the delay. >>>>>> >>>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I'm still looking for reviewers. >>>>>> >>>>>> As the change is to the launcher it needs to be reviewed by the >>>>>> launcher owner - which I think is serviceability (though also cc'd >>>>>> Kumar :) ). >>>>>> >>>>>> Launcher change, and your rationale, seems okay to me. I'd probably >>>>>> put the test in to jdk/test/tools/launcher/ though. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote: >>>>>>>> This is an initial review for 6762191. I'm guessing there will be >>>>>>>> recommendations to fix in a different way, but thought this >>>>>>>> would be a >>>>>>>> good time to start the discussion. >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/ >>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/ >>>>>>>> >>>>>>>> The bug is that if the -Xss size is set to something very small >>>>>>>> (like >>>>>>>> 16k), on linux there will be a crash due to overwriting the end >>>>>>>> of the >>>>>>>> stack. This happens before hotspot can compute its stack needs and >>>>>>>> verify that the stack is big enough. >>>>>>>> >>>>>>>> It didn't seem viable to move the hotspot stack size check >>>>>>>> earlier. 
It >>>>>>>> depends on too much other work done before that point, and the >>>>>>>> changes >>>>>>>> would have been disruptive. The stack size check is currently >>>>>>>> done in >>>>>>>> os::init_2(). >>>>>>>> >>>>>>>> What is needed is a check before the thread is created. That >>>>>>>> way we >>>>>>>> can create a thread with a big enough stack to handle all needs >>>>>>>> up to >>>>>>>> the point of the check in os::init_2(). This initial check does >>>>>>>> not >>>>>>>> need to be the final check. It just needs to confirm that we have >>>>>>>> enough stack to get us to the check in os::init_2(). >>>>>>>> >>>>>>>> I decided to check in java.c if the -Xss size is too small, and >>>>>>>> set it >>>>>>>> to a larger size if it is. I hard coded this size to 32k (I'll >>>>>>>> explain >>>>>>>> why 32k later). I suspect this is the part that will result in >>>>>>>> some >>>>>>>> debate. If you have better suggestions let me know. If it does >>>>>>>> stay >>>>>>>> here, then probably the 32k needs to be a #define, and maybe >>>>>>>> even an >>>>>>>> OS porting interface, but I'm not sure where to put it. >>>>>>>> >>>>>>>> The reason I chose 32k is because this is big enough for all >>>>>>>> platforms >>>>>>>> to get to the stack size check in os::init_2(). It is also smaller >>>>>>>> than the actual minimum stack size allowed on any platform. 32-bit >>>>>>>> windows has the smallest requirement at 64k. I add some printfs to >>>>>>>> print the minimum stack requirement, and then ran a simple >>>>>>>> JTReg test >>>>>>>> with every JPRT supported platform to get the results. >>>>>>>> >>>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k, >>>>>>>> -Xss32k, and -XXss, where is the size from the >>>>>>>> error message produced by the JVM, such as in the following: >>>>>>>> >>>>>>>> $ java -Xss32k -version >>>>>>>> The stack size specified is too small, Specify at least 100k >>>>>>>> Error: Could not create the Java Virtual Machine. 
>>>>>>>> Error: A fatal exception has occurred. Program will exit. >>>>>>>> >>>>>>>> I ran this test through JPRT on all platforms, and they all pass. >>>>>>>> >>>>>>>> One thing to point out is that Windows behaves a bit different >>>>>>>> than >>>>>>>> the other platforms. It always rounds the stack size up to a >>>>>>>> multiple >>>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On >>>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so >>>>>>>> there >>>>>>>> is no error produced in this case. However, on 32-bit Windows >>>>>>>> with C2, >>>>>>>> 68k is the minimum, so an error is produced since the stack >>>>>>>> will only >>>>>>>> be 64k. There is no bug here. It's just a bit confusing. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>> >>>>> >>>> >> > From chris.plummer at oracle.com Wed Dec 3 19:26:41 2014 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 03 Dec 2014 11:26:41 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547F5D50.30309@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F5D50.30309@oracle.com> Message-ID: <547F63F1.5050202@oracle.com> Hi Kumar, On 12/3/14 10:58 AM, Kumar Srinivasan wrote: > Hi Chris, > > Approved with some minor nits, typos which needs correction. > > yes java.c follows the JDK indenting as Alan pointed out. > > TooSmallStackSize.java > > Copyright should be 2014, Copy/paste error from example test I was referred to. I will fix, and also remove the import if not needed. > > 1. > > 37 * stack size for the platform (as provided by the JVM error > message when a very > 38 * small is used), and then verify that the JVM can be launched > with that stack > > s/small is/small stack is/ ok > > 2. 
> > 54 * Returns the minimum stack size this platform will allowed > based on the > > s/allowed/allow/ ok > > 3. > > 58 * The TestResult argument must containthe result of having > already run > s/containthe/contain the/ ok > > 4. > 92 if (verbose) System.out.println("*** Testing " + stackSize); > > I know this is allowed in the HotSpot world but in JDK land we always > use with a LF + Indent, also in other places. Are curly braces needed then? I know some coding conventions require them. > > 5. > 85 * Returns the mimumum allowed stack size gleaned from the > error message, > s/mimumum/minimum ok. > > > Finally I am concerned with the integration, since it spans both > HotSpot and JDK, and it appears the test will fail if the HotSpot > changes are not integrated first, or has it already ? There are no hotspot changes. java.c is where the fix is. thanks, Chris > > Thanks > Kumar > > > > > > > > On 12/1/2014 6:39 PM, Chris Plummer wrote: >> Sorry about the long delay in getting back to this. I ran into two >> separate JPRT issues that were preventing me from testing these >> changes, plus I was on vacation last week. Here's an updated webrev. >> I'm not sure where we left things, so I'll just say what's changed >> since the original version: >> >> 1. Rewrote the test to be in Java instead of a shell script. >> 2. Moved the test from hotspot/test/runtime/memory to >> jdk/test/tools/launcher >> 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to >> override the default 32k minimum value. >> >> https://bugs.openjdk.java.net/browse/JDK-6762191 >> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ >> >> thanks, >> >> Chris >> >> On 11/19/14 7:52 AM, Chris Plummer wrote: >>> On 11/19/14 2:12 AM, David Holmes wrote: >>>> On 19/11/2014 6:49 PM, Chris Plummer wrote: >>>>> I've update the webrev to add STACK_SIZE_MINIMUM in place of the 32k >>>>> references, and also moved the test from hotspot/test/runtime to >>>>> jdk/test/tools/launcher as David requested. 
That required some >>>>> adjustments to the test script, since test_env.sh does not exist in >>>>> jdk/test, so I had to pull in the bits I needed into the script. >>>> >>>> Is there a reason this needs a shell script instead of using the >>>> testlibrary tools to launch the VM and check the output? >>> Not that I'm aware of. I guess I just really didn't look at what it >>> would take to make it all in java. I'll have a look at java examples >>> and convert it. >>> >>> Chris >>>> >>>> Sorry that should have been mentioned much earlier. >>>> >>>> David >>>> >>>> >>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/ >>>>> >>>>> I still need to rerun through JPRT. I'll do so once there are no more >>>>> suggested changes. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 11/18/14 2:08 PM, Chris Plummer wrote: >>>>>> Adding core-libs-dev at openjdk.java.net, since one of the changes >>>>>> is in >>>>>> java.c. >>>>>> >>>>>> Chris >>>>>> >>>>>> On 11/12/14 6:43 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Sorry for the delay. >>>>>>> >>>>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm still looking for reviewers. >>>>>>> >>>>>>> As the change is to the launcher it needs to be reviewed by the >>>>>>> launcher owner - which I think is serviceability (though also cc'd >>>>>>> Kumar :) ). >>>>>>> >>>>>>> Launcher change, and your rationale, seems okay to me. I'd probably >>>>>>> put the test in to jdk/test/tools/launcher/ though. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote: >>>>>>>>> This is an initial review for 6762191. I'm guessing there will be >>>>>>>>> recommendations to fix in a different way, but thought this >>>>>>>>> would be a >>>>>>>>> good time to start the discussion. 
>>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/ >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/ >>>>>>>>> >>>>>>>>> The bug is that if the -Xss size is set to something very >>>>>>>>> small (like >>>>>>>>> 16k), on linux there will be a crash due to overwriting the >>>>>>>>> end of the >>>>>>>>> stack. This happens before hotspot can compute its stack needs >>>>>>>>> and >>>>>>>>> verify that the stack is big enough. >>>>>>>>> >>>>>>>>> It didn't seem viable to move the hotspot stack size check >>>>>>>>> earlier. It >>>>>>>>> depends on too much other work done before that point, and the >>>>>>>>> changes >>>>>>>>> would have been disruptive. The stack size check is currently >>>>>>>>> done in >>>>>>>>> os::init_2(). >>>>>>>>> >>>>>>>>> What is needed is a check before the thread is created. That >>>>>>>>> way we >>>>>>>>> can create a thread with a big enough stack to handle all >>>>>>>>> needs up to >>>>>>>>> the point of the check in os::init_2(). This initial check >>>>>>>>> does not >>>>>>>>> need to be the final check. It just needs to confirm that we have >>>>>>>>> enough stack to get us to the check in os::init_2(). >>>>>>>>> >>>>>>>>> I decided to check in java.c if the -Xss size is too small, >>>>>>>>> and set it >>>>>>>>> to a larger size if it is. I hard coded this size to 32k (I'll >>>>>>>>> explain >>>>>>>>> why 32k later). I suspect this is the part that will result in >>>>>>>>> some >>>>>>>>> debate. If you have better suggestions let me know. If it does >>>>>>>>> stay >>>>>>>>> here, then probably the 32k needs to be a #define, and maybe >>>>>>>>> even an >>>>>>>>> OS porting interface, but I'm not sure where to put it. >>>>>>>>> >>>>>>>>> The reason I chose 32k is because this is big enough for all >>>>>>>>> platforms >>>>>>>>> to get to the stack size check in os::init_2(). 
It is also >>>>>>>>> smaller >>>>>>>>> than the actual minimum stack size allowed on any platform. >>>>>>>>> 32-bit >>>>>>>>> windows has the smallest requirement at 64k. I add some >>>>>>>>> printfs to >>>>>>>>> print the minimum stack requirement, and then ran a simple >>>>>>>>> JTReg test >>>>>>>>> with every JPRT supported platform to get the results. >>>>>>>>> >>>>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k, >>>>>>>>> -Xss32k, and -XXss, where is the size from the >>>>>>>>> error message produced by the JVM, such as in the following: >>>>>>>>> >>>>>>>>> $ java -Xss32k -version >>>>>>>>> The stack size specified is too small, Specify at least 100k >>>>>>>>> Error: Could not create the Java Virtual Machine. >>>>>>>>> Error: A fatal exception has occurred. Program will exit. >>>>>>>>> >>>>>>>>> I ran this test through JPRT on all platforms, and they all pass. >>>>>>>>> >>>>>>>>> One thing to point out is that Windows behaves a bit different >>>>>>>>> than >>>>>>>>> the other platforms. It always rounds the stack size up to a >>>>>>>>> multiple >>>>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On >>>>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, >>>>>>>>> so there >>>>>>>>> is no error produced in this case. However, on 32-bit Windows >>>>>>>>> with C2, >>>>>>>>> 68k is the minimum, so an error is produced since the stack >>>>>>>>> will only >>>>>>>>> be 64k. There is no bug here. It's just a bit confusing. 
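The launcher-side check and the Windows rounding described above can be sketched as follows. This is a rough illustration under stated assumptions, not the java.c change itself: `clamp_stack_size` and `round_up_to_64k` are hypothetical helper names, and the 32k default for STACK_SIZE_MINIMUM is taken from the discussion in this thread.

```cpp
#include <cstdint>

// Assumed default; per the thread, a makefile may override STACK_SIZE_MINIMUM.
#ifndef STACK_SIZE_MINIMUM
#define STACK_SIZE_MINIMUM (32 * 1024)
#endif

// Launcher-side clamp: make sure the stack is large enough to reach the
// real minimum-size check that HotSpot performs later in os::init_2().
std::int64_t clamp_stack_size(std::int64_t requested) {
  return requested < STACK_SIZE_MINIMUM ? STACK_SIZE_MINIMUM : requested;
}

// Windows behaviour as described in the thread: the stack size is rounded
// up to a multiple of 64k.
std::int64_t round_up_to_64k(std::int64_t size) {
  const std::int64_t granule = 64 * 1024;
  return (size + granule - 1) / granule * granule;
}
```

With these helpers, -Xss16k becomes 32k after the clamp and 64k after the Windows rounding, which is still below the 68k minimum quoted for 32-bit Windows with C2, so the later check reports an error there.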
>>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>> >>>>>> >>>>> >>> >> > From ioi.lam at oracle.com Wed Dec 3 20:06:10 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 03 Dec 2014 12:06:10 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547E32CD.5050103@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> Message-ID: <547F6D32.6050708@oracle.com> Hi Jiangli, The changes look good. Just a few nits: (1) src/share/vm/services/diagnosticCommand.cpp: This seems to be not needed 27 #include "classfile/compactHashtable.hpp" (2) test/runtime/SharedArchiveFile/DumpSymbolAndStringTable.java: 43 output.shouldContain("DumpSymbolAndStringTable"); 48 output.shouldContain("java.lang.String");Maybe these two should include the full line, and the terminating \n? Also, why is this necessary for VM.stringtable but not done for VM.symboltable? 49 } catch (RuntimeException e) { 50 output.shouldContain("Unknown diagnostic command"); (3) src/share/vm/classfile/compactHashtable.cpp: 50 stats->hashentry_bytes = _num_entries * 8; // estimates only Is this always an conservative estimate (more than actually needed)? This should be explained in the comments. Also, if the actual number is different, stats->hashentry_bytes should be adjusted so that the statistics can be printed correctly. 
Indentation: 4->2 134 assert(bucket_type == COMPACT_BUCKET_TYPE, "Bad bucket type"); Is there a reason to use guarantee here and not assert (I realize that I wrote these two lines, and I don't remember why :-)? 142 guarantee(deltax < max_delta, "range check"); 147 guarantee(count == _bucket_sizes[index], "sanity"); I think these two blocks can be rewritten to avoid the use of the #ifdef 162 #ifdef _LP64 163 *p++ = juint(base_address >> 32); 164 #else 165 *p++ = 0; 166 #endif 167 *p++ = juint(base_address & 0xffffffff); // base address 205 juint upper = *p++; 206 juint lower = *p++; 207 #ifdef _LP64 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); 209 #else 210 _base_address = uintx(lower); 211 #endif -> 163 *p++ = juint(base_address >> 32); 167 *p++ = juint(base_address & 0xffffffff); 205 juint upper = *p++; 206 juint lower = *p++; 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); Also, do you have statistics of the percentage of COMPACT_BUCKET_TYPE vs REGULAR_BUCKET_TYPE? Thanks - Ioi On 12/2/14, 1:44 PM, Jiangli Zhou wrote: > Hi, > > I finally got back to this, sorry for the delay. Please see the > following new webrev. > > http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ > > New in the webrev: > > 1. Further compression of the compact table > > * Remove the bucket_size table. With the sequential layout of the > buckets, lookup process can seek to the start of the next bucket > without the need of the current bucket size. For the last bucket, > it can seek to the end of the table. The table end offset is added > to the archived data. > * Bucket with exactly one entry is marked as 'compact' bucket, whose > entry only contains the symbol offset. The symbol hash is > eliminated for 'compact' buckets. Lookup compares the symbol > directly in that case. > > 2. The shared symbol table is not always looked up first. The last > table that fetches symbol successfully is used for lookup. > > 3. 
Added a lot more comments in compactHashtable.hpp with details of > the compact table layout and dump/lookup process. > > I measured using the classloading benchmark that Aleksey pointed to > me. This benchmark loads classes using user defined classloader. There > is a very small degradation shown in the benchmark comparing 'before' > and 'after' with archive dumped with the default configuration. When > symbols from the test is added to the shared table, there is an > observable speedup in the benchmark. The speedup is also very small. > > Thanks, > Jiangli > > > On 10/29/2014 02:55 PM, Jiangli Zhou wrote: >> Hi John, >> >> Thank you for the thoughts on this! Yes, it's a good time to have >> these conversations. Please see some quick responses from me below, >> with more details to follow. >> >> On 10/29/2014 12:46 PM, John Rose wrote: >>> I have a few points about the impact of this change on startup performance, and on trends in our code base: >>> >>> 1. We can live with small performance regressions on some benchmarks. Otherwise we'd never get anywhere. So I am not saying that the current (very interesting and multi-facted) conversation must continue a long time before we can push any code. >>> >>> 2. Aleksey's challenge is valid, and I don't see a strong reply to it yet. Work like this, that emphasizes compactness and sharability can usually be given the benefit of the doubt for startup. But if we observe a problem we should make a more careful measurement. If this change goes in with a measurable regression, we need a follow-up conversation (centered around a tracking bug) about quantifying the performance regression and fixing it. (It's possible the conversation ends by saying "we decided we don't care and here's why", but I doubt it will end that way.) 
>> >> Besides my classloading benchmark results posted in earlier message, >> we have asked performance team to help us measure startup, >> classloding, and memory saving regarding this change, and verified >> there was no regression in startup/classloading. The results were not >> posted in the thread however. They were added to the bug report for >> JDK-8059510 . >> >>> 3. The benchmark Aleksey chose, Nashorn startup, may well be an outlier. Dynamic language runtimes create lots of tiny methods, and are likely to put heavy loads on symbol resolution, including resolution of symbols that are not known statically. For these, the first-look (front-end) table being proposed might be a loss, especially if the Nashorn runtime is not in the first-look table. >>> >>> 4. Possible workarounds: #1 Track hit rates and start looking at the static table *second* if it fails to produce enough hits, compared to the dynamic table. #2 When that happens, incrementally migrate (rehash) frequently found symbols in the static table into the main table. #3 Re-evaluate the hashing algorithm. (More on this below.) >> >> That's interesting thinking. I'll follow up on this. >> >>> 5. I strongly support moving towards position-independent, shareable, pointer-free data structures, and I want us to learn to do it well over time. (Ioi, you are one of our main explorers here!) To me, a performance regression is a suggestion that we have something more to learn. And, it's not a surprise that we don't have it all figured out yet. >> >> Point taken! Position-independent is one of our goal for the archived >> data. We'll be start looking into removing direct pointers in the >> shared class data. >> >>> 6. For work like this, we need to agree upon at least one or two startup performance tests to shake out bottlenecks early and give confidence. People who work on startup performance should know how to run them and be able to quote them. >> >> Thank you for bring this up. Totally agree. 
For me, personally I've >> played with different benchmarks and applications with different >> focus over time. It would be good to have some commonly agreed >> startup performance tests for this. A standalone classloading >> benchmark, HelloWorld and Nashorn startup probably are good choices. >> We also have an internal application for startup and memory saving >> measurement. >> >>> 7. There's a vibrant literature about offline (statically created) hash tables, and lots of tricks floating around, such as perfect or semi-perfect hash functions, and multiple-choice hashing, and locality-aware structures. I can think of several algorithmic tweaks I'd like to try on this code. (If they haven't already been tried or discarded: I assume Ioi has already tried a number of things.) Moreover, this is not just doorknob-polishing, because (as said in point 5) we want to develop our competency with these sorts of data structures. >>> >>> 8. I found the code hard to read, so it was hard to reason about the algorithm. The core routine, "lookup", has no comments and only one assert. It uses only primitive C types and (perhaps in the name of performance) is not factored into sub-functions. The code which generates the static table is also hard to reason about in similar ways. The bucket size parameter choice (a crucial compromise between performance and compactness) is 4 but the motivation and implications are left for the reader to puzzle out. Likewise, the use of hardware division instead of power-of-two table size is presented without comment, despite the fact that we favor power-of-two in other parts of our stack, and a power-of-two solution would be reasonable here (given the bucket size). >>> >>> 9. Perhaps I wouldn't care as much about the code style if the code were not performance sensitive or if it were a unique and isolated part of the JVM. 
In this case, I expect this code (and other similar code) to improve, over time, in readability and robustness, as we learn to work with this new kind of data structure. So even if we decide there is no significant regression here, and decide to push it as-is, we still need to use it as an example to help us get better at writing easy-to-read code which works with pointer-free data. >>> >>> 10. I would like to see (posted somewhere or attached to the bug) a sample list of the symbols in a typical symbol table. Perhaps this already exists and I missed it. I think it would be friendly to put some comments in the code that help the reader estimate numbers like table size, bucket length, number of queries, number of probes per query, symbol length statistics (mean, histogram). Of course such comments go stale over time, but so does the algorithm, and being coy about the conditions of the moment doesn't help us in the long run. Even a short comment is better than none, for example: >>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/classfile/vmSymbols.cpp#l206 >> >> Those are good suggestions. I'll add a sample list of the symbols to >> the bug report and try to put more comments in the code. >> >> Thanks! >> >> Jiangli >> >>> It is a good time to have these conversations. >>> >>> Best wishes, >>> -- John >>> >>> On Oct 13, 2014, at 11:46 AM, Aleksey Shipilev wrote: >>> >>>> Hi Jiangli, >>>> >>>> On 13.10.2014 18:26, Jiangli Zhou wrote: >>>>> On 10/13/2014 03:18 AM, Aleksey Shipilev wrote: >>>>>> On 13.10.2014 03:32, David Holmes wrote: >>>>>>> On 11/10/2014 1:47 PM, Jiangli Zhou wrote: >>>>>>> Also is the benchmarking being done on dedicated systems? >>>>>> Also, specjvm98 is meaningless to estimate the classloading costs. >>>>>> Please try specjvm2008:startup.* tests? >>>>> The specjvm run was for Gerard's question about standard benchmarks. >>>> SPECjvm2008 is a standard benchmark. In fact, it is a benchmark that >>>> deprecates SPECjvm98. 
>>>> >>>>> These are not benchmarks specifically for classloading. >>>> There are benchmarks that try to estimate the startup costs. >>>> SPECjvm2008:startup.* tests are one of them. >>>> >>>>> However, I agree it's a good idea to run standard benchmarks to >>>>> confirm there is no overall performance degradation. From all the >>>>> benchmarks including classloading measurements, we have confirmed >>>>> that this specific change does not have negative impact on >>>>> classloading itself and the overall performance. >>>> Excellent. What are those benchmarks? May we see those? Because I have a >>>> counter-example in this thread that this change *does* negatively impact >>>> classloading. >>>> >>>> -Aleksey. >>>> >> > From jiangli.zhou at oracle.com Wed Dec 3 21:20:52 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 13:20:52 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547F6D32.6050708@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> Message-ID: <547F7EB4.7090801@oracle.com> Hi Ioi, Thanks for the comments. On 12/03/2014 12:06 PM, Ioi Lam wrote: > Hi Jiangli, > > The changes look good. Just a few nits: > > (1) src/share/vm/services/diagnosticCommand.cpp: > > This seems to be not needed > 27 #include "classfile/compactHashtable.hpp" The #include of the header file is needed. 
The VM.symboltable and VM.stringtable definitions are moved to
compactHashtable.hpp to be closer to their usage.

>
> (2) test/runtime/SharedArchiveFile/DumpSymbolAndStringTable.java:
> 43 output.shouldContain("DumpSymbolAndStringTable");
> 48 output.shouldContain("java.lang.String");
> Maybe these two should include the full line, and the terminating \n?

Sounds good. Added.

> Also, why is this necessary for VM.stringtable but not done for
> VM.symboltable?
>
> 49 } catch (RuntimeException e) {
> 50 output.shouldContain("Unknown diagnostic command");

Fixed for VM.symboltable. Thanks for catching that.

>
> (3) src/share/vm/classfile/compactHashtable.cpp:
>
> 50 stats->hashentry_bytes = _num_entries * 8; // estimates only
>
> Is this always a conservative estimate (more than actually needed)?

Yes. Some entries will have only the 4-byte offset, but we don't know
how many entries at this point. So we use a conservative estimate.

> This should be explained in the comments.

Will do.

> Also, if the actual number is different, stats->hashentry_bytes should
> be adjusted so that the statistics can be printed correctly.

Ok.

>
> Indentation: 4->2
> 134 assert(bucket_type == COMPACT_BUCKET_TYPE, "Bad bucket type");

Fixed.

>
> Is there a reason to use guarantee here and not assert (I realize that I
> wrote these two lines, and I don't remember why :-)?
> 142 guarantee(deltax < max_delta, "range check");
> 147 guarantee(count == _bucket_sizes[index], "sanity");

I'm guessing you wanted to have the check in non-debug builds also.
>
> I think these two blocks can be rewritten to avoid the use of the #ifdef
> 162 #ifdef _LP64
> 163 *p++ = juint(base_address >> 32);
> 164 #else
> 165 *p++ = 0;
> 166 #endif
> 167 *p++ = juint(base_address & 0xffffffff); // base address
>
> 205 juint upper = *p++;
> 206 juint lower = *p++;
> 207 #ifdef _LP64
> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower);
> 209 #else
> 210 _base_address = uintx(lower);
> 211 #endif
>
> ->
>
> 163 *p++ = juint(base_address >> 32);
> 167 *p++ = juint(base_address & 0xffffffff);
>
> 205 juint upper = *p++;
> 206 juint lower = *p++;
> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower);

Sounds good.

>
> Also, do you have statistics of the percentage of COMPACT_BUCKET_TYPE
> vs REGULAR_BUCKET_TYPE?

Yes. :) With the default classlist, the compact bucket is about 6%.

  Entries per bucket   Buckets (bucket size 4)
  empty                     25
  1 entry                   71
  2 entries                172
  3 entries                247
  4 entries                229
  5 entries                158
  6 entries                108
  7 entries                 87
  8 entries                 29
  >8 entries                26

Thanks,
Jiangli

>
> Thanks
> - Ioi
>
>
> On 12/2/14, 1:44 PM, Jiangli Zhou wrote:
>> Hi,
>>
>> I finally got back to this, sorry for the delay. Please see the
>> following new webrev.
>>
>> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/
>>
>> New in the webrev:
>>
>> 1. Further compression of the compact table
>>
>> * Remove the bucket_size table. With the sequential layout of the
>> buckets, lookup process can seek to the start of the next bucket
>> without the need of the current bucket size. For the last bucket,
>> it can seek to the end of the table. The table end offset is
>> added to the archived data.
>> * Bucket with exactly one entry is marked as 'compact' bucket,
>> whose entry only contains the symbol offset. The symbol hash is
>> eliminated for 'compact' buckets. Lookup compares the symbol
>> directly in that case.
>>
>> 2. The shared symbol table is not always looked up first.
The last
>> table that fetches symbol successfully is used for lookup.
>>
>> 3. Added a lot more comments in compactHashtable.hpp with details of
>> the compact table layout and dump/lookup process.
>>
>> I measured using the classloading benchmark that Aleksey pointed me
>> to. This benchmark loads classes using a user-defined classloader.
>> There is a very small degradation shown in the benchmark comparing
>> 'before' and 'after' with archive dumped with the default
>> configuration. When symbols from the test are added to the shared
>> table, there is an observable speedup in the benchmark. The speedup
>> is also very small.
>>
>> Thanks,
>> Jiangli
>>
>>
>> On 10/29/2014 02:55 PM, Jiangli Zhou wrote:
>>> Hi John,
>>>
>>> Thank you for the thoughts on this! Yes, it's a good time to have
>>> these conversations. Please see some quick responses from me below,
>>> with more details to follow.
>>>
>>> On 10/29/2014 12:46 PM, John Rose wrote:
>>>> I have a few points about the impact of this change on startup performance, and on trends in our code base:
>>>>
>>>> 1. We can live with small performance regressions on some benchmarks. Otherwise we'd never get anywhere. So I am not saying that the current (very interesting and multi-faceted) conversation must continue a long time before we can push any code.
>>>>
>>>> 2. Aleksey's challenge is valid, and I don't see a strong reply to it yet. Work like this, that emphasizes compactness and sharability can usually be given the benefit of the doubt for startup. But if we observe a problem we should make a more careful measurement. If this change goes in with a measurable regression, we need a follow-up conversation (centered around a tracking bug) about quantifying the performance regression and fixing it. (It's possible the conversation ends by saying "we decided we don't care and here's why", but I doubt it will end that way.)
>>>
>>> Besides my classloading benchmark results posted in an earlier message,
>>> we have asked the performance team to help us measure startup,
>>> classloading, and memory saving regarding this change, and verified
>>> there was no regression in startup/classloading. The results were
>>> not posted in the thread however. They were added to the bug report
>>> for JDK-8059510.
>>>
>>>> 3. The benchmark Aleksey chose, Nashorn startup, may well be an outlier. Dynamic language runtimes create lots of tiny methods, and are likely to put heavy loads on symbol resolution, including resolution of symbols that are not known statically. For these, the first-look (front-end) table being proposed might be a loss, especially if the Nashorn runtime is not in the first-look table.
>>>>
>>>> 4. Possible workarounds: #1 Track hit rates and start looking at the static table *second* if it fails to produce enough hits, compared to the dynamic table. #2 When that happens, incrementally migrate (rehash) frequently found symbols in the static table into the main table. #3 Re-evaluate the hashing algorithm. (More on this below.)
>>>
>>> That's interesting thinking. I'll follow up on this.
>>>
>>>> 5. I strongly support moving towards position-independent, shareable, pointer-free data structures, and I want us to learn to do it well over time. (Ioi, you are one of our main explorers here!) To me, a performance regression is a suggestion that we have something more to learn. And, it's not a surprise that we don't have it all figured out yet.
>>>
>>> Point taken! Position-independence is one of our goals for the
>>> archived data. We'll start looking into removing direct pointers
>>> in the shared class data.
>>>
>>>> 6. For work like this, we need to agree upon at least one or two startup performance tests to shake out bottlenecks early and give confidence. People who work on startup performance should know how to run them and be able to quote them.
>>>
>>> Thank you for bringing this up.
Totally agree. For me, personally I've >>> played with different benchmarks and applications with different >>> focus over time. It would be good to have some commonly agreed >>> startup performance tests for this. A standalone classloading >>> benchmark, HelloWorld and Nashorn startup probably are good choices. >>> We also have an internal application for startup and memory saving >>> measurement. >>> >>>> 7. There's a vibrant literature about offline (statically created) hash tables, and lots of tricks floating around, such as perfect or semi-perfect hash functions, and multiple-choice hashing, and locality-aware structures. I can think of several algorithmic tweaks I'd like to try on this code. (If they haven't already been tried or discarded: I assume Ioi has already tried a number of things.) Moreover, this is not just doorknob-polishing, because (as said in point 5) we want to develop our competency with these sorts of data structures. >>>> >>>> 8. I found the code hard to read, so it was hard to reason about the algorithm. The core routine, "lookup", has no comments and only one assert. It uses only primitive C types and (perhaps in the name of performance) is not factored into sub-functions. The code which generates the static table is also hard to reason about in similar ways. The bucket size parameter choice (a crucial compromise between performance and compactness) is 4 but the motivation and implications are left for the reader to puzzle out. Likewise, the use of hardware division instead of power-of-two table size is presented without comment, despite the fact that we favor power-of-two in other parts of our stack, and a power-of-two solution would be reasonable here (given the bucket size). >>>> >>>> 9. Perhaps I wouldn't care as much about the code style if the code were not performance sensitive or if it were a unique and isolated part of the JVM. 
In this case, I expect this code (and other similar code) to improve, over time, in readability and robustness, as we learn to work with this new kind of data structure. So even if we decide there is no significant regression here, and decide to push it as-is, we still need to use it as an example to help us get better at writing easy-to-read code which works with pointer-free data.
>>>>
>>>> 10. I would like to see (posted somewhere or attached to the bug) a sample list of the symbols in a typical symbol table. Perhaps this already exists and I missed it. I think it would be friendly to put some comments in the code that help the reader estimate numbers like table size, bucket length, number of queries, number of probes per query, symbol length statistics (mean, histogram). Of course such comments go stale over time, but so does the algorithm, and being coy about the conditions of the moment doesn't help us in the long run. Even a short comment is better than none, for example:
>>>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/classfile/vmSymbols.cpp#l206
>>>
>>> Those are good suggestions. I'll add a sample list of the symbols to
>>> the bug report and try to put more comments in the code.
>>>
>>> Thanks!
>>>
>>> Jiangli
>>>
>>>> It is a good time to have these conversations.
>>>>
>>>> Best wishes,
>>>> - John
>>>>
>>>> On Oct 13, 2014, at 11:46 AM, Aleksey Shipilev wrote:
>>>>
>>>>> Hi Jiangli,
>>>>>
>>>>> On 13.10.2014 18:26, Jiangli Zhou wrote:
>>>>>> On 10/13/2014 03:18 AM, Aleksey Shipilev wrote:
>>>>>>> On 13.10.2014 03:32, David Holmes wrote:
>>>>>>>> On 11/10/2014 1:47 PM, Jiangli Zhou wrote:
>>>>>>>> Also is the benchmarking being done on dedicated systems?
>>>>>>> Also, specjvm98 is meaningless to estimate the classloading costs.
>>>>>>> Please try specjvm2008:startup.* tests?
>>>>>> The specjvm run was for Gerard's question about standard benchmarks.
>>>>> SPECjvm2008 is a standard benchmark.
In fact, it is a benchmark that >>>>> deprecates SPECjvm98. >>>>> >>>>>> These are not benchmarks specifically for classloading. >>>>> There are benchmarks that try to estimate the startup costs. >>>>> SPECjvm2008:startup.* tests are one of them. >>>>> >>>>>> However, I agree it's a good idea to run standard benchmarks to >>>>>> confirm there is no overall performance degradation. From all the >>>>>> benchmarks including classloading measurements, we have confirmed >>>>>> that this specific change does not have negative impact on >>>>>> classloading itself and the overall performance. >>>>> Excellent. What are those benchmarks? May we see those? Because I have a >>>>> counter-example in this thread that this change *does* negatively impact >>>>> classloading. >>>>> >>>>> -Aleksey. >>>>> >>> >> > From jiangli.zhou at oracle.com Wed Dec 3 21:26:14 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 13:26:14 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547F7EB4.7090801@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F7EB4.7090801@oracle.com> Message-ID: <547F7FF6.7050305@oracle.com> On 12/03/2014 01:20 PM, Jiangli Zhou wrote: > Hi Ioi, > > Thanks for the comments. > > On 12/03/2014 12:06 PM, Ioi Lam wrote: >> Hi Jiangli, >> >> The changes look good. 
Just a few nits: >> >> (1) src/share/vm/services/diagnosticCommand.cpp: >> >> This seems to be not needed >> 27 #include "classfile/compactHashtable.hpp" > > The #include of the header file is needed. The VM.symboltable and > VM.stringtable definitions are moved to compactHashtable.hpp to be > more close to their usage. > >> >> (2) test/runtime/SharedArchiveFile/DumpSymbolAndStringTable.java: >> 43 output.shouldContain("DumpSymbolAndStringTable"); >> 48 output.shouldContain("java.lang.String");Maybe these two >> should include the full line, and the terminating \n? > > Sounds good. Added. > >> Also, why is this necessary for VM.stringtable but not done for >> VM.symboltable? >> >> 49 } catch (RuntimeException e) { >> 50 output.shouldContain("Unknown diagnostic command"); > > Fixed for VM.symboltable. Thanks for catching that. > >> >> (3) src/share/vm/classfile/compactHashtable.cpp: >> >> 50 stats->hashentry_bytes = _num_entries * 8; // estimates only >> >> Is this always an conservative estimate (more than actually needed)? > > Yes. Some entries will have only the 4-byte offset, but we don't know > how many entries at this point. So we use a conservative estimate. > >> This should be explained in the comments. > > Will do. > >> Also, if the actual number is different, stats->hashentry_bytes >> should be adjusted so that the statistics can be printed correctly. > > Ok. > >> >> Indentation: 4->2 >> 134 assert(bucket_type == COMPACT_BUCKET_TYPE, "Bad bucket >> type"); > > Fixed. > >> >> Is the a reason to use guarantee here and not assert (I realize that >> I wrote these two lines, and I don't remember why :-)? >> 142 guarantee(deltax < max_delta, "range check"); >> 147 guarantee(count == _bucket_sizes[index], "sanity"); > > I'm guessing you wanted to have the check in non-debug build also. 
>
>>
>> I think these two blocks can be rewritten to avoid the use of the #ifdef
>> 162 #ifdef _LP64
>> 163 *p++ = juint(base_address >> 32);
>> 164 #else
>> 165 *p++ = 0;
>> 166 #endif
>> 167 *p++ = juint(base_address & 0xffffffff); // base address
>>
>> 205 juint upper = *p++;
>> 206 juint lower = *p++;
>> 207 #ifdef _LP64
>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower);
>> 209 #else
>> 210 _base_address = uintx(lower);
>> 211 #endif
>>
>> ->
>>
>> 163 *p++ = juint(base_address >> 32);
>> 167 *p++ = juint(base_address & 0xffffffff);
>>
>> 205 juint upper = *p++;
>> 206 juint lower = *p++;
>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower);
>
> Sounds good.
>
>>
>> Also, do you have statistics of the percentage of COMPACT_BUCKET_TYPE
>> vs REGULAR_BUCKET_TYPE?
>
> Yes. :) With the default classlist, the compact bucket is about 6%.

Here is a better formatted table:

  Entries per bucket   Number of buckets
  empty                     25
  1 entry                   71
  2 entries                172
  3 entries                247
  4 entries                229
  5 entries                158
  6 entries                108
  7 entries                 87
  8 entries                 29
  >8 entries                26

Thanks,
Jiangli
>>>>>> >>>> >>> >> > From ioi.lam at oracle.com Wed Dec 3 21:28:56 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 03 Dec 2014 13:28:56 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547F7EB4.7090801@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F7EB4.7090801@oracle.com> Message-ID: <547F8098.3050206@oracle.com> On 12/3/14, 1:20 PM, Jiangli Zhou wrote: > >> >> Is the a reason to use guarantee here and not assert (I realize that >> I wrote these two lines, and I don't remember why :-)? >> 142 guarantee(deltax < max_delta, "range check"); >> 147 guarantee(count == _bucket_sizes[index], "sanity"); > > I'm guessing you wanted to have the check in non-debug build also. I think it's better to switch to assert for uniformity. The checks aren't necessary for non-debug build. 
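To make the distinction in this exchange concrete, here is a minimal sketch of a check that is compiled into every build next to one that exists only in debug builds. The macro names SKETCH_GUARANTEE and SKETCH_ASSERT, the SKETCH_DEBUG_BUILD switch, the counter, and the helper checks_for_delta are all hypothetical stand-ins for illustration; they are not HotSpot's actual macros, which report failures through the VM's own error handling.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical counter: records how many checks were actually evaluated.
static int g_checks_run = 0;

// Hypothetical failure path: print a message and stop the process.
static void check_failed(const char* kind, const char* msg) {
  std::fprintf(stderr, "%s failed: %s\n", kind, msg);
  std::abort();
}

// Always compiled in (the role guarantee plays in the review above).
#define SKETCH_GUARANTEE(cond, msg) \
  do { ++g_checks_run; if (!(cond)) check_failed("guarantee", msg); } while (0)

// Compiled in only when SKETCH_DEBUG_BUILD is defined (the role assert plays).
#ifdef SKETCH_DEBUG_BUILD
#define SKETCH_ASSERT(cond, msg) \
  do { ++g_checks_run; if (!(cond)) check_failed("assert", msg); } while (0)
#else
#define SKETCH_ASSERT(cond, msg) do { } while (0)
#endif

// Runs both kinds of range check on an in-range delta and returns how many
// of them were evaluated in this build.
int checks_for_delta(unsigned deltax, unsigned max_delta) {
  int before = g_checks_run;
  SKETCH_ASSERT(deltax < max_delta, "range check (debug builds only)");
  SKETCH_GUARANTEE(deltax < max_delta, "range check (all builds)");
  return g_checks_run - before;
}
```

In a build without SKETCH_DEBUG_BUILD, checks_for_delta(3, 16) evaluates one check; with it defined, two. Switching a guarantee to an assert removes that remaining check from non-debug builds, which is the trade-off being weighed here.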
Thanks - Ioi > >> >> I think these two blocks can be rewritten to avoid the use of the #ifdef >> 162 #ifdef _LP64 >> 163 *p++ = juint(base_address >> 32); >> 164 #else >> 165 *p++ = 0; >> 166 #endif >> 167 *p++ = juint(base_address & 0xffffffff); // base address >> >> 205 juint upper = *p++; >> 206 juint lower = *p++; >> 207 #ifdef _LP64 >> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); >> 209 #else >> 210 _base_address = uintx(lower); >> 211 #endif >> >> -> >> >> 163 *p++ = juint(base_address >> 32); >> 167 *p++ = juint(base_address & 0xffffffff); >> >> 205 juint upper = *p++; >> 206 juint lower = *p++; >> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); > > Sounds good. > >> >> Also, do you have statistics of the percentage of COMPACT_BUCKET_TYPE >> vs REGULAR_BUCKET_TYPE? > > Yes. :) With the default classlist, the compact bucket is about 6% for. > > > empty buckets 1 entry buckets > 2 entries buckets > 3 entries buckets > 4 entries buckets > 5 entries buckets > 6 entries buckets > 7 entries buckets > 8 entries buckets > >8 entries > Bucket size 4 > 25 > 71 > 172 > 247 > 229 > 158 > 108 > 87 > 29 > 26 > > > Compressed > > > > > > > > > > > Thanks, > Jiangli > >> >> Thanks >> - Ioi >> >> >> On 12/2/14, 1:44 PM, Jiangli Zhou wrote: >>> Hi, >>> >>> I finally got back to this, sorry for the delay. Please see the >>> following new webre. >>> >>> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ >>> >>> New in the webrev: >>> >>> 1. Further compression of the compact table >>> >>> * Remove the bucket_size table. With the sequential layout of the >>> buckets, lookup process can seek to the start of the next bucket >>> without the need of the current bucket size. For the last >>> bucket, it can seek to the end of the table. The table end >>> offset is added to the archived data. >>> * Bucket with exactly one entry is marked as 'compact' bucket, >>> whose entry only contains the symbol offset. 
The symbol hash is >>> eliminated for 'compact' buckets. Lookup compares the symbol >>> directly in that case. >>> >>> 2. The shared symbol table is not always looked up first. The last >>> table that fetches symbol successfully is used for lookup. >>> >>> 3. Added a lot more comments in compactHashtable.hpp with details of >>> the compact table layout and dump/lookup process. >>> >>> I measured using the classloading benchmark that Aleksey pointed to >>> me. This benchmark loads classes using user defined classloader. >>> There is a very small degradation shown in the benchmark comparing >>> 'before' and 'after' with archive dumped with the default >>> configuration. When symbols from the test is added to the shared >>> table, there is an observable speedup in the benchmark. The speedup >>> is also very small. >>> >>> Thanks, >>> Jiangli >>> >>> >>> On 10/29/2014 02:55 PM, Jiangli Zhou wrote: >>>> Hi John, >>>> >>>> Thank you for the thoughts on this! Yes, it's a good time to have >>>> these conversations. Please see some quick responses from me below, >>>> with more details to follow. >>>> >>>> On 10/29/2014 12:46 PM, John Rose wrote: >>>>> I have a few points about the impact of this change on startup performance, and on trends in our code base: >>>>> >>>>> 1. We can live with small performance regressions on some benchmarks. Otherwise we'd never get anywhere. So I am not saying that the current (very interesting and multi-facted) conversation must continue a long time before we can push any code. >>>>> >>>>> 2. Aleksey's challenge is valid, and I don't see a strong reply to it yet. Work like this, that emphasizes compactness and sharability can usually be given the benefit of the doubt for startup. But if we observe a problem we should make a more careful measurement. If this change goes in with a measurable regression, we need a follow-up conversation (centered around a tracking bug) about quantifying the performance regression and fixing it. 
(It's possible the conversation ends by saying "we decided we don't care and here's why", but I doubt it will end that way.) >>>> >>>> Besides my classloading benchmark results posted in earlier >>>> message, we have asked performance team to help us measure startup, >>>> classloding, and memory saving regarding this change, and verified >>>> there was no regression in startup/classloading. The results were >>>> not posted in the thread however. They were added to the bug report >>>> for JDK-8059510 . >>>> >>>>> 3. The benchmark Aleksey chose, Nashorn startup, may well be an outlier. Dynamic language runtimes create lots of tiny methods, and are likely to put heavy loads on symbol resolution, including resolution of symbols that are not known statically. For these, the first-look (front-end) table being proposed might be a loss, especially if the Nashorn runtime is not in the first-look table. >>>>> >>>>> 4. Possible workarounds: #1 Track hit rates and start looking at the static table *second* if it fails to produce enough hits, compared to the dynamic table. #2 When that happens, incrementally migrate (rehash) frequently found symbols in the static table into the main table. #3 Re-evaluate the hashing algorithm. (More on this below.) >>>> >>>> That's interesting thinking. I'll follow up on this. >>>> >>>>> 5. I strongly support moving towards position-independent, shareable, pointer-free data structures, and I want us to learn to do it well over time. (Ioi, you are one of our main explorers here!) To me, a performance regression is a suggestion that we have something more to learn. And, it's not a surprise that we don't have it all figured out yet. >>>> >>>> Point taken! Position-independent is one of our goal for the >>>> archived data. We'll be start looking into removing direct pointers >>>> in the shared class data. >>>> >>>>> 6. 
For work like this, we need to agree upon at least one or two startup performance tests to shake out bottlenecks early and give confidence. People who work on startup performance should know how to run them and be able to quote them. >>>> >>>> Thank you for bring this up. Totally agree. For me, personally I've >>>> played with different benchmarks and applications with different >>>> focus over time. It would be good to have some commonly agreed >>>> startup performance tests for this. A standalone classloading >>>> benchmark, HelloWorld and Nashorn startup probably are good >>>> choices. We also have an internal application for startup and >>>> memory saving measurement. >>>> >>>>> 7. There's a vibrant literature about offline (statically created) hash tables, and lots of tricks floating around, such as perfect or semi-perfect hash functions, and multiple-choice hashing, and locality-aware structures. I can think of several algorithmic tweaks I'd like to try on this code. (If they haven't already been tried or discarded: I assume Ioi has already tried a number of things.) Moreover, this is not just doorknob-polishing, because (as said in point 5) we want to develop our competency with these sorts of data structures. >>>>> >>>>> 8. I found the code hard to read, so it was hard to reason about the algorithm. The core routine, "lookup", has no comments and only one assert. It uses only primitive C types and (perhaps in the name of performance) is not factored into sub-functions. The code which generates the static table is also hard to reason about in similar ways. The bucket size parameter choice (a crucial compromise between performance and compactness) is 4 but the motivation and implications are left for the reader to puzzle out. 
Likewise, the use of hardware division instead of power-of-two table size is presented without comment, despite the fact that we favor power-of-two in other parts of our stack, and a power-of-two solution would be reasonable here (given the bucket size). >>>>> >>>>> 9. Perhaps I wouldn't care as much about the code style if the code were not performance sensitive or if it were a unique and isolated part of the JVM. In this case, I expect this code (and other similar code) to improve, over time, in readability and robustness, as we learn to work with this new kind of data structure. So even if we decide there is no significant regression here, and decide to push it as-is, we still need to use it as an example to help us get better at writing easy-to-read code which works with pointer-free data. >>>>> >>>>> 10. I would like to see (posted somewhere or attached to the bug) a sample list of the symbols in a typical symbol table. Perhaps this already exists and I missed it. I think it would be friendly to put some comments in the code that help the reader estimate numbers like table size, bucket length, number of queries, number of probes per query, symbol length statistics (mean, histogram). Of course such comments go stale over time, but so does the algorithm, and being coy about the conditions of the moment doesn't help us in the long run. Even a short comment is better than none, for example: >>>>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/classfile/vmSymbols.cpp#l206 >>>> >>>> Those are good suggestions. I'll add a sample list o the symbols to >>>> the bug report and try to put more comments in the code. >>>> >>>> Thanks! >>>> >>>> Jiangli >>>> >>>>> It is a good time to have these conversations. >>>>> >>>>> Best wishes, >>>>> ? 
John >>>>> >>>>> On Oct 13, 2014, at 11:46 AM, Aleksey Shipilev wrote: >>>>> >>>>>> Hi Jiangli, >>>>>> >>>>>> On 13.10.2014 18:26, Jiangli Zhou wrote: >>>>>>> On 10/13/2014 03:18 AM, Aleksey Shipilev wrote: >>>>>>>> On 13.10.2014 03:32, David Holmes wrote: >>>>>>>>> On 11/10/2014 1:47 PM, Jiangli Zhou wrote: >>>>>>>>> Also is the benchmarking being done on dedicated systems? >>>>>>>> Also, specjvm98 is meaningless to estimate the classloading costs. >>>>>>>> Please try specjvm2008:startup.* tests? >>>>>>> The specjvm run was for Gerard's question about standard benchmarks. >>>>>> SPECjvm2008 is a standard benchmark. In fact, it is a benchmark that >>>>>> deprecates SPECjvm98. >>>>>> >>>>>>> These are not benchmarks specifically for classloading. >>>>>> There are benchmarks that try to estimate the startup costs. >>>>>> SPECjvm2008:startup.* tests are one of them. >>>>>> >>>>>>> However, I agree it's a good idea to run standard benchmarks to >>>>>>> confirm there is no overall performance degradation. From all the >>>>>>> benchmarks including classloading measurements, we have confirmed >>>>>>> that this specific change does not have negative impact on >>>>>>> classloading itself and the overall performance. >>>>>> Excellent. What are those benchmarks? May we see those? Because I have a >>>>>> counter-example in this thread that this change *does* negatively impact >>>>>> classloading. >>>>>> >>>>>> -Aleksey. 
>>>>>> >>>> >>> >> > From kumar.x.srinivasan at oracle.com Wed Dec 3 22:13:03 2014 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Wed, 03 Dec 2014 14:13:03 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547F63F1.5050202@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F5D50.30309@oracle.com> <547F63F1.5050202@oracle.com> Message-ID: <547F8AEF.3000002@oracle.com> On 12/3/2014 11:26 AM, Chris Plummer wrote: > Hi Kumar, > > On 12/3/14 10:58 AM, Kumar Srinivasan wrote: >> Hi Chris, >> >> Approved with some minor nits, typos which needs correction. >> >> yes java.c follows the JDK indenting as Alan pointed out. >> >> TooSmallStackSize.java >> >> Copyright should be 2014, > Copy/paste error from example test I was referred to. I will fix, and > also remove the import if not needed. >> >> 1. >> >> 37 * stack size for the platform (as provided by the JVM error >> message when a very >> 38 * small is used), and then verify that the JVM can be launched >> with that stack >> >> s/small is/small stack is/ > ok >> >> 2. >> >> 54 * Returns the minimum stack size this platform will allowed >> based on the >> >> s/allowed/allow/ > ok >> >> 3. >> >> 58 * The TestResult argument must containthe result of having >> already run >> s/containthe/contain the/ > ok >> >> 4. >> 92 if (verbose) System.out.println("*** Testing " + stackSize); >> >> I know this is allowed in the HotSpot world but in JDK land we always >> use with a LF + Indent, also in other places. > Are curly braces needed then? I know some coding conventions require > them. No not necessary for one liners. >> >> 5. >> 85 * Returns the mimumum allowed stack size gleaned from the >> error message, >> s/mimumum/minimum > ok. 
>> >> >> Finally I am concerned with the integration, since it spans both >> HotSpot and JDK, and it appears the test will fail if the HotSpot >> changes are not integrated first, or has it already ? > There are no hotspot changes. java.c is where the fix is. Great!. Thanks Kumar > > thanks, > > Chris >> >> Thanks >> Kumar >> >> >> >> >> >> >> >> On 12/1/2014 6:39 PM, Chris Plummer wrote: >>> Sorry about the long delay in getting back to this. I ran into two >>> separate JPRT issues that were preventing me from testing these >>> changes, plus I was on vacation last week. Here's an updated webrev. >>> I'm not sure where we left things, so I'll just say what's changed >>> since the original version: >>> >>> 1. Rewrote the test to be in Java instead of a shell script. >>> 2. Moved the test from hotspot/test/runtime/memory to >>> jdk/test/tools/launcher >>> 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to >>> override the default 32k minimum value. >>> >>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ >>> >>> thanks, >>> >>> Chris >>> >>> On 11/19/14 7:52 AM, Chris Plummer wrote: >>>> On 11/19/14 2:12 AM, David Holmes wrote: >>>>> On 19/11/2014 6:49 PM, Chris Plummer wrote: >>>>>> I've update the webrev to add STACK_SIZE_MINIMUM in place of the 32k >>>>>> references, and also moved the test from hotspot/test/runtime to >>>>>> jdk/test/tools/launcher as David requested. That required some >>>>>> adjustments to the test script, since test_env.sh does not exist in >>>>>> jdk/test, so I had to pull in the bits I needed into the script. >>>>> >>>>> Is there a reason this needs a shell script instead of using the >>>>> testlibrary tools to launch the VM and check the output? >>>> Not that I'm aware of. I guess I just really didn't look at what it >>>> would take to make it all in java. I'll have a look at java >>>> examples and convert it. 
>>>> >>>> Chris >>>>> >>>>> Sorry that should have been mentioned much earlier. >>>>> >>>>> David >>>>> >>>>> >>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/ >>>>>> >>>>>> I still need to rerun through JPRT. I'll do so once there are no >>>>>> more >>>>>> suggested changes. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 11/18/14 2:08 PM, Chris Plummer wrote: >>>>>>> Adding core-libs-dev at openjdk.java.net, since one of the changes >>>>>>> is in >>>>>>> java.c. >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 11/12/14 6:43 PM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Sorry for the delay. >>>>>>>> >>>>>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm still looking for reviewers. >>>>>>>> >>>>>>>> As the change is to the launcher it needs to be reviewed by the >>>>>>>> launcher owner - which I think is serviceability (though also cc'd >>>>>>>> Kumar :) ). >>>>>>>> >>>>>>>> Launcher change, and your rationale, seems okay to me. I'd >>>>>>>> probably >>>>>>>> put the test in to jdk/test/tools/launcher/ though. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote: >>>>>>>>>> This is an initial review for 6762191. I'm guessing there >>>>>>>>>> will be >>>>>>>>>> recommendations to fix in a different way, but thought this >>>>>>>>>> would be a >>>>>>>>>> good time to start the discussion. >>>>>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/ >>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/ >>>>>>>>>> >>>>>>>>>> The bug is that if the -Xss size is set to something very >>>>>>>>>> small (like >>>>>>>>>> 16k), on linux there will be a crash due to overwriting the >>>>>>>>>> end of the >>>>>>>>>> stack. 
This happens before hotspot can compute its stack >>>>>>>>>> needs and >>>>>>>>>> verify that the stack is big enough. >>>>>>>>>> >>>>>>>>>> It didn't seem viable to move the hotspot stack size check >>>>>>>>>> earlier. It >>>>>>>>>> depends on too much other work done before that point, and >>>>>>>>>> the changes >>>>>>>>>> would have been disruptive. The stack size check is currently >>>>>>>>>> done in >>>>>>>>>> os::init_2(). >>>>>>>>>> >>>>>>>>>> What is needed is a check before the thread is created. That >>>>>>>>>> way we >>>>>>>>>> can create a thread with a big enough stack to handle all >>>>>>>>>> needs up to >>>>>>>>>> the point of the check in os::init_2(). This initial check >>>>>>>>>> does not >>>>>>>>>> need to be the final check. It just needs to confirm that we >>>>>>>>>> have >>>>>>>>>> enough stack to get us to the check in os::init_2(). >>>>>>>>>> >>>>>>>>>> I decided to check in java.c if the -Xss size is too small, >>>>>>>>>> and set it >>>>>>>>>> to a larger size if it is. I hard coded this size to 32k >>>>>>>>>> (I'll explain >>>>>>>>>> why 32k later). I suspect this is the part that will result >>>>>>>>>> in some >>>>>>>>>> debate. If you have better suggestions let me know. If it >>>>>>>>>> does stay >>>>>>>>>> here, then probably the 32k needs to be a #define, and maybe >>>>>>>>>> even an >>>>>>>>>> OS porting interface, but I'm not sure where to put it. >>>>>>>>>> >>>>>>>>>> The reason I chose 32k is because this is big enough for all >>>>>>>>>> platforms >>>>>>>>>> to get to the stack size check in os::init_2(). It is also >>>>>>>>>> smaller >>>>>>>>>> than the actual minimum stack size allowed on any platform. >>>>>>>>>> 32-bit >>>>>>>>>> windows has the smallest requirement at 64k. I add some >>>>>>>>>> printfs to >>>>>>>>>> print the minimum stack requirement, and then ran a simple >>>>>>>>>> JTReg test >>>>>>>>>> with every JPRT supported platform to get the results. 
>>>>>>>>>> >>>>>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k, >>>>>>>>>> -Xss32k, and -XXss, where is the size from >>>>>>>>>> the >>>>>>>>>> error message produced by the JVM, such as in the following: >>>>>>>>>> >>>>>>>>>> $ java -Xss32k -version >>>>>>>>>> The stack size specified is too small, Specify at least 100k >>>>>>>>>> Error: Could not create the Java Virtual Machine. >>>>>>>>>> Error: A fatal exception has occurred. Program will exit. >>>>>>>>>> >>>>>>>>>> I ran this test through JPRT on all platforms, and they all >>>>>>>>>> pass. >>>>>>>>>> >>>>>>>>>> One thing to point out is that Windows behaves a bit >>>>>>>>>> different than >>>>>>>>>> the other platforms. It always rounds the stack size up to a >>>>>>>>>> multiple >>>>>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On >>>>>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, >>>>>>>>>> so there >>>>>>>>>> is no error produced in this case. However, on 32-bit Windows >>>>>>>>>> with C2, >>>>>>>>>> 68k is the minimum, so an error is produced since the stack >>>>>>>>>> will only >>>>>>>>>> be 64k. There is no bug here. It's just a bit confusing. 
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>

From jiangli.zhou at oracle.com Wed Dec 3 22:36:36 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Wed, 03 Dec 2014 14:36:36 -0800
Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive
In-Reply-To: <547F8098.3050206@oracle.com>
References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F7EB4.7090801@oracle.com> <547F8098.3050206@oracle.com>
Message-ID: <547F9074.9080504@oracle.com>

On 12/03/2014 01:28 PM, Ioi Lam wrote:
>
> On 12/3/14, 1:20 PM, Jiangli Zhou wrote:
>>
>>>
>>> Is there a reason to use guarantee here and not assert (I realize that
>>> I wrote these two lines, and I don't remember why :-)?
>>> 142 guarantee(deltax < max_delta, "range check");
>>> 147 guarantee(count == _bucket_sizes[index], "sanity");
>>
>> I'm guessing you wanted to have the check in non-debug build also.
>
> I think it's better to switch to assert for uniformity. The checks
> aren't necessary for non-debug build.

Changed both to asserts.
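
For readers following along, the difference between the two kinds of checks can be sketched roughly as below. This is only an illustration under stated assumptions: ALWAYS_CHECK, checked_delta and checked_count are hypothetical names, and the standard assert from <cassert> stands in for a debug-only check. None of this is the HotSpot code itself.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a check that stays enabled in every build,
// in the spirit of a guarantee. Not the HotSpot macro.
#define ALWAYS_CHECK(cond, msg)                            \
  do {                                                     \
    if (!(cond)) {                                         \
      std::fprintf(stderr, "check failed: %s\n", (msg));   \
      std::abort();                                        \
    }                                                      \
  } while (0)

// Debug-only check: the standard assert is compiled out when NDEBUG is
// defined, so this costs nothing in a release build.
inline unsigned checked_delta(unsigned deltax, unsigned max_delta) {
  (void)max_delta;  // avoids an unused-parameter warning under NDEBUG
  assert(deltax < max_delta && "range check");
  return deltax;
}

// Always-on check: still evaluated in a release build.
inline unsigned checked_count(unsigned count, unsigned expected) {
  ALWAYS_CHECK(count == expected, "sanity");
  return count;
}
```

The trade-off is the usual one: the always-on form catches corruption in product builds at a small runtime cost, while the debug-only form keeps the product path free of the extra compare.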
Thanks,
Jiangli
>
> Thanks
> - Ioi
>>
>>>
>>> I think these two blocks can be rewritten to avoid the use of the #ifdef
>>> 162 #ifdef _LP64
>>> 163   *p++ = juint(base_address >> 32);
>>> 164 #else
>>> 165   *p++ = 0;
>>> 166 #endif
>>> 167   *p++ = juint(base_address & 0xffffffff); // base address
>>>
>>> 205   juint upper = *p++;
>>> 206   juint lower = *p++;
>>> 207 #ifdef _LP64
>>> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>> 209 #else
>>> 210   _base_address = uintx(lower);
>>> 211 #endif
>>>
>>> ->
>>>
>>> 163   *p++ = juint(base_address >> 32);
>>> 167   *p++ = juint(base_address & 0xffffffff);
>>>
>>> 205   juint upper = *p++;
>>> 206   juint lower = *p++;
>>> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>
>> Sounds good.
>>
>>>
>>> Also, do you have statistics of the percentage of
>>> COMPACT_BUCKET_TYPE vs REGULAR_BUCKET_TYPE?
>>
>> Yes. :) With the default classlist, the compact bucket is about 6% for.
>>
>>                        Bucket size 4   Compressed
>> empty buckets                     25
>> 1 entry buckets                   71
>> 2 entries buckets                172
>> 3 entries buckets                247
>> 4 entries buckets                229
>> 5 entries buckets                158
>> 6 entries buckets                108
>> 7 entries buckets                 87
>> 8 entries buckets                 29
>> >8 entries                        26
>>
>> Thanks,
>> Jiangli
>>
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> On 12/2/14, 1:44 PM, Jiangli Zhou wrote:
>>>> Hi,
>>>>
>>>> I finally got back to this, sorry for the delay. Please see the
>>>> following new webrev.
>>>>
>>>> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/
>>>>
>>>> New in the webrev:
>>>>
>>>> 1. Further compression of the compact table
>>>>
>>>>   * Remove the bucket_size table. With the sequential layout of the
>>>>     buckets, lookup process can seek to the start of the next
>>>>     bucket without the need of the current bucket size. For the
>>>>     last bucket, it can seek to the end of the table. The table end
>>>>     offset is added to the archived data.
>>>> * Bucket with exactly one entry is marked as 'compact' bucket, >>>> whose entry only contains the symbol offset. The symbol hash is >>>> eliminated for 'compact' buckets. Lookup compares the symbol >>>> directly in that case. >>>> >>>> 2. The shared symbol table is not always looked up first. The last >>>> table that fetches symbol successfully is used for lookup. >>>> >>>> 3. Added a lot more comments in compactHashtable.hpp with details >>>> of the compact table layout and dump/lookup process. >>>> >>>> I measured using the classloading benchmark that Aleksey pointed to >>>> me. This benchmark loads classes using user defined classloader. >>>> There is a very small degradation shown in the benchmark comparing >>>> 'before' and 'after' with archive dumped with the default >>>> configuration. When symbols from the test is added to the shared >>>> table, there is an observable speedup in the benchmark. The speedup >>>> is also very small. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> >>>> On 10/29/2014 02:55 PM, Jiangli Zhou wrote: >>>>> Hi John, >>>>> >>>>> Thank you for the thoughts on this! Yes, it's a good time to have >>>>> these conversations. Please see some quick responses from me >>>>> below, with more details to follow. >>>>> >>>>> On 10/29/2014 12:46 PM, John Rose wrote: >>>>>> I have a few points about the impact of this change on startup performance, and on trends in our code base: >>>>>> >>>>>> 1. We can live with small performance regressions on some benchmarks. Otherwise we'd never get anywhere. So I am not saying that the current (very interesting and multi-facted) conversation must continue a long time before we can push any code. >>>>>> >>>>>> 2. Aleksey's challenge is valid, and I don't see a strong reply to it yet. Work like this, that emphasizes compactness and sharability can usually be given the benefit of the doubt for startup. But if we observe a problem we should make a more careful measurement. 
If this change goes in with a measurable regression, we need a follow-up conversation (centered around a tracking bug) about quantifying the performance regression and fixing it. (It's possible the conversation ends by saying "we decided we don't care and here's why", but I doubt it will end that way.) >>>>> >>>>> Besides my classloading benchmark results posted in earlier >>>>> message, we have asked performance team to help us measure >>>>> startup, classloding, and memory saving regarding this change, and >>>>> verified there was no regression in startup/classloading. The >>>>> results were not posted in the thread however. They were added to >>>>> the bug report for JDK-8059510 >>>>> . >>>>> >>>>>> 3. The benchmark Aleksey chose, Nashorn startup, may well be an outlier. Dynamic language runtimes create lots of tiny methods, and are likely to put heavy loads on symbol resolution, including resolution of symbols that are not known statically. For these, the first-look (front-end) table being proposed might be a loss, especially if the Nashorn runtime is not in the first-look table. >>>>>> >>>>>> 4. Possible workarounds: #1 Track hit rates and start looking at the static table *second* if it fails to produce enough hits, compared to the dynamic table. #2 When that happens, incrementally migrate (rehash) frequently found symbols in the static table into the main table. #3 Re-evaluate the hashing algorithm. (More on this below.) >>>>> >>>>> That's interesting thinking. I'll follow up on this. >>>>> >>>>>> 5. I strongly support moving towards position-independent, shareable, pointer-free data structures, and I want us to learn to do it well over time. (Ioi, you are one of our main explorers here!) To me, a performance regression is a suggestion that we have something more to learn. And, it's not a surprise that we don't have it all figured out yet. >>>>> >>>>> Point taken! Position-independent is one of our goal for the >>>>> archived data. 
We'll be start looking into removing direct >>>>> pointers in the shared class data. >>>>> >>>>>> 6. For work like this, we need to agree upon at least one or two startup performance tests to shake out bottlenecks early and give confidence. People who work on startup performance should know how to run them and be able to quote them. >>>>> >>>>> Thank you for bring this up. Totally agree. For me, personally >>>>> I've played with different benchmarks and applications with >>>>> different focus over time. It would be good to have some commonly >>>>> agreed startup performance tests for this. A standalone >>>>> classloading benchmark, HelloWorld and Nashorn startup probably >>>>> are good choices. We also have an internal application for startup >>>>> and memory saving measurement. >>>>> >>>>>> 7. There's a vibrant literature about offline (statically created) hash tables, and lots of tricks floating around, such as perfect or semi-perfect hash functions, and multiple-choice hashing, and locality-aware structures. I can think of several algorithmic tweaks I'd like to try on this code. (If they haven't already been tried or discarded: I assume Ioi has already tried a number of things.) Moreover, this is not just doorknob-polishing, because (as said in point 5) we want to develop our competency with these sorts of data structures. >>>>>> >>>>>> 8. I found the code hard to read, so it was hard to reason about the algorithm. The core routine, "lookup", has no comments and only one assert. It uses only primitive C types and (perhaps in the name of performance) is not factored into sub-functions. The code which generates the static table is also hard to reason about in similar ways. The bucket size parameter choice (a crucial compromise between performance and compactness) is 4 but the motivation and implications are left for the reader to puzzle out. 
Likewise, the use of hardware division instead of power-of-two table size is presented without comment, despite the fact that we favor power-of-two in other parts of our stack, and a power-of-two solution would be reasonable here (given the bucket size). >>>>>> >>>>>> 9. Perhaps I wouldn't care as much about the code style if the code were not performance sensitive or if it were a unique and isolated part of the JVM. In this case, I expect this code (and other similar code) to improve, over time, in readability and robustness, as we learn to work with this new kind of data structure. So even if we decide there is no significant regression here, and decide to push it as-is, we still need to use it as an example to help us get better at writing easy-to-read code which works with pointer-free data. >>>>>> >>>>>> 10. I would like to see (posted somewhere or attached to the bug) a sample list of the symbols in a typical symbol table. Perhaps this already exists and I missed it. I think it would be friendly to put some comments in the code that help the reader estimate numbers like table size, bucket length, number of queries, number of probes per query, symbol length statistics (mean, histogram). Of course such comments go stale over time, but so does the algorithm, and being coy about the conditions of the moment doesn't help us in the long run. Even a short comment is better than none, for example: >>>>>> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/classfile/vmSymbols.cpp#l206 >>>>> >>>>> Those are good suggestions. I'll add a sample list o the symbols >>>>> to the bug report and try to put more comments in the code. >>>>> >>>>> Thanks! >>>>> >>>>> Jiangli >>>>> >>>>>> It is a good time to have these conversations. >>>>>> >>>>>> Best wishes, >>>>>> ? 
John >>>>>> >>>>>> On Oct 13, 2014, at 11:46 AM, Aleksey Shipilev wrote: >>>>>> >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> On 13.10.2014 18:26, Jiangli Zhou wrote: >>>>>>>> On 10/13/2014 03:18 AM, Aleksey Shipilev wrote: >>>>>>>>> On 13.10.2014 03:32, David Holmes wrote: >>>>>>>>>> On 11/10/2014 1:47 PM, Jiangli Zhou wrote: >>>>>>>>>> Also is the benchmarking being done on dedicated systems? >>>>>>>>> Also, specjvm98 is meaningless to estimate the classloading costs. >>>>>>>>> Please try specjvm2008:startup.* tests? >>>>>>>> The specjvm run was for Gerard's question about standard benchmarks. >>>>>>> SPECjvm2008 is a standard benchmark. In fact, it is a benchmark that >>>>>>> deprecates SPECjvm98. >>>>>>> >>>>>>>> These are not benchmarks specifically for classloading. >>>>>>> There are benchmarks that try to estimate the startup costs. >>>>>>> SPECjvm2008:startup.* tests are one of them. >>>>>>> >>>>>>>> However, I agree it's a good idea to run standard benchmarks to >>>>>>>> confirm there is no overall performance degradation. From all the >>>>>>>> benchmarks including classloading measurements, we have confirmed >>>>>>>> that this specific change does not have negative impact on >>>>>>>> classloading itself and the overall performance. >>>>>>> Excellent. What are those benchmarks? May we see those? Because I have a >>>>>>> counter-example in this thread that this change *does* negatively impact >>>>>>> classloading. >>>>>>> >>>>>>> -Aleksey. >>>>>>> >>>>> >>>> >>> >> > From jiangli.zhou at oracle.com Wed Dec 3 22:55:05 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 14:55:05 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547CCB30.6010806@oracle.com> References: <547CCB30.6010806@oracle.com> Message-ID: <547F94C9.7020002@oracle.com> Hi Calvin, It's better to define 12M and 16M as enums in metaspaceShared.hpp now they are referenced in more than one place. 

I also have some questions. The 12M/16M values are not introduced by this change; do you know why they were chosen as the default RO and RW sizes? Now we require both spaces to be at least 12M on 32-bit machines and 16M on 64-bit machines; is that a reasonable requirement? What's the minimum size requirement for the RO and RW spaces with the default classlist?

Thanks,
Jiangli

On 12/01/2014 12:10 PM, Calvin Cheung wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050
>
> Adding more checks on the SharedMiscDataSize, SharedReadOnlySize, and
> SharedReadWriteSize.
>
> For the SharedMiscDataSize, it is based on
> MetaspaceShared::generate_vtable_methods(). Similar to what was done
> for the SharedMiscCodeSize.
>
> For the SharedReadOnlySize and SharedReadWriteSize, I'm checking if
> they are at least the default size.
> I think it's reasonable to enforce the ro and rw sizes to be at least
> the default size. A default dump of CDS archive requires >8M of ro
> space and >11M of rw space.
>
> webrev:
> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>
> tests:
> ran the testcase via jtreg on linux_x64 and windows_x64
> JPRT
>
> thanks,
> Calvin

From jiangli.zhou at oracle.com Wed Dec 3 23:38:52 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Wed, 03 Dec 2014 15:38:52 -0800
Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive
In-Reply-To: <547F6D32.6050708@oracle.com>
References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com>
Message-ID: <547F9F0C.1000904@oracle.com>

Hi Ioi,

>
> I think these two blocks can be rewritten to avoid the use of the #ifdef
> 162 #ifdef _LP64
> 163   *p++ = juint(base_address >> 32);
> 164 #else
> 165   *p++ = 0;
> 166 #endif
> 167   *p++ = juint(base_address & 0xffffffff); // base address
>
> 205   juint upper = *p++;
> 206   juint lower = *p++;
> 207 #ifdef _LP64
> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
> 209 #else
> 210   _base_address = uintx(lower);
> 211 #endif
>
> ->
>
> 163   *p++ = juint(base_address >> 32);
> 167   *p++ = juint(base_address & 0xffffffff);
>
> 205   juint upper = *p++;
> 206   juint lower = *p++;
> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>

Actually, it would have a problem on 32-bit platforms. The behaviour of a shift by a count greater than or equal to the number of bits in the operand is undefined. GCC gives a warning about the >>32 on linux-x86.
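
One way to drop the #ifdef and still avoid the undefined shift might be to widen the value to 64 bits before shifting. The following is a minimal sketch under that assumption, using standard fixed-width types in place of juint/uintx. The helper names split_address and join_address are hypothetical and are not part of the webrev.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not the code in the webrev.

// Split a pointer-sized value into two 32-bit words. The value is widened
// to 64 bits first, so the shift count (32) is always smaller than the
// operand width, on both 32-bit and 64-bit platforms.
inline void split_address(uintptr_t base_address,
                          uint32_t* upper, uint32_t* lower) {
  assert(upper != nullptr && lower != nullptr);
  const uint64_t wide = static_cast<uint64_t>(base_address);
  *upper = static_cast<uint32_t>(wide >> 32);  // 0 when uintptr_t is 32 bits
  *lower = static_cast<uint32_t>(wide & 0xffffffffu);
}

// Rebuild the pointer-sized value. The shift again happens on a 64-bit
// operand; the final cast keeps only the low bits on 32-bit platforms.
inline uintptr_t join_address(uint32_t upper, uint32_t lower) {
  const uint64_t wide = (static_cast<uint64_t>(upper) << 32) | lower;
  return static_cast<uintptr_t>(wide);
}
```

On a 32-bit build the upper word simply comes out as 0, which matches what the #else branch stores today, so the archived layout would stay the same.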

Thanks,
Jiangli

From calvin.cheung at oracle.com Wed Dec 3 23:51:23 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 03 Dec 2014 15:51:23 -0800
Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified
In-Reply-To: <547F94C9.7020002@oracle.com>
References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com>
Message-ID: <547FA1FB.1070801@oracle.com>

On 12/3/2014 2:55 PM, Jiangli Zhou wrote:
> Hi Calvin,
>
> It's better to define 12M and 16M as enums in metaspaceShared.hpp now
> they are referenced in more than one place.
So globals.hpp will need to include metaspaceShared.hpp.
>
> I also have some questions. The 12M/16M are not introduced by this
> change, do you know why those values were chosen as the default RO and
> RW sizes?
Sorry. I don't know the reasons why those values were chosen.
> Now we require both spaces have to be at least 12M on 32-bit machines
> and 16M on 64-bit machine, is it a reasonable requirement? What's the
> minimum size requirement for the RO and RW spaces with the default
> classlist?
Below are the numbers for the RO and RW spaces with the default classlist for various platforms: ===== linux ===== 64-bit ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% used] at 0x0000000800000000 rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% used] at 0x0000000801000000 32-bit ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% used] at 0x80000000 rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% used] at 0x80c00000 ======== windows ======== 64-bit ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% used] at 0x0000000800000000 rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% used] at 0x0000000801000000 32-bit ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% used] at 0x14690000 rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% used] at 0x15290000 ==== mac ==== 64-bit ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% used] at 0x0000000800000000 rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% used] at 0x0000000801000000 ==== So maybe we can define some enums as follows and leave the default values in globals.hpp alone? min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) thanks, Calvin > > Thanks, > Jiangli > > On 12/01/2014 12:10 PM, Calvin Cheung wrote: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >> >> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, and >> SharedReadWriteSize. >> >> For the SharedMiscDataSize, it is based on >> MetaspaceShared::generate_vtable_methods(). Similar to what was done >> for the SharedMiscCodeSize. >> >> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if >> they are at least the default size. >> I think it's reasonable to enforce the ro and rw sizes to be at least >> the default size. A default dump of CDS archive requires >8M of ro >> space and >11M of rw space. 
>> >> webrev: >> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >> >> tests: >> ran the testcase via jtreg on linux_x64 and windows_x64 >> JPRT >> >> thanks, >> Calvin > From jiangli.zhou at oracle.com Wed Dec 3 23:53:53 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 15:53:53 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547FA1FB.1070801@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> Message-ID: <547FA291.5050407@oracle.com> Hi Calvin, On 12/03/2014 03:51 PM, Calvin Cheung wrote: > On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> It's better to define 12M and 16M as enums in metaspaceShared.hpp now >> they are referenced in more than one place. > So global.hpp will need to include metaspaceShared.hpp. >> >> I also have some questions. The 12M/16M are not introduced by this >> change, do you know why those values were chosen as the default RO >> and RW sizes? > Sorry. I don't know the reasons why those values were chosen. >> Now we require both spaces have to be at lease 12M on 32-bit machines >> and 16M on 64-bit machine, is it a reasonable requirement? What's the >> minimum size requirement for the RO and RW spaces with the default >> classlist? 
> > Below are the numbers for the RO and RW spaces with the default > classlist for various platforms: > ===== > linux > ===== > 64-bit > ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% > used] at 0x0000000800000000 > rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% > used] at 0x0000000801000000 > > 32-bit > ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% > used] at 0x80000000 > rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% > used] at 0x80c00000 > > ======== > windows > ======== > 64-bit > ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% > used] at 0x0000000800000000 > rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% > used] at 0x0000000801000000 > > 32-bit > ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% > used] at 0x14690000 > rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% > used] at 0x15290000 > > ==== > mac > ==== > 64-bit > ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% > used] at 0x0000000800000000 > rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% > used] at 0x0000000801000000 > > ==== > > So maybe we can define some enums as follows and leave the default > values in globals.hpp alone? > > min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) > min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) Sounds good to me. Thanks, Jiangli > > thanks, > Calvin > >> >> Thanks, >> Jiangli >> >> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>> >>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, and >>> SharedReadWriteSize. >>> >>> For the SharedMiscDataSize, it is based on >>> MetaspaceShared::generate_vtable_methods(). Similar to what was done >>> for the SharedMiscCodeSize. >>> >>> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if >>> they are at least the default size. 
>>> I think it's reasonable to enforce the ro and rw sizes to be at >>> least the default size. A default dump of CDS archive requires >8M >>> of ro space and >11M of rw space. >>> >>> webrev: >>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>> >>> tests: >>> ran the testcase via jtreg on linux_x64 and windows_x64 >>> JPRT >>> >>> thanks, >>> Calvin >> > From calvin.cheung at oracle.com Thu Dec 4 01:40:26 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 03 Dec 2014 17:40:26 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547FA291.5050407@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> Message-ID: <547FBB89.1080409@oracle.com> Hi Jiangli, I've updated the webrev at the same location: http://cr.openjdk.java.net/~ccheung/8065050/webrev/ Previous version is saved at: http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ thanks, Calvin On 12/3/2014 3:53 PM, Jiangli Zhou wrote: > Hi Calvin, > > On 12/03/2014 03:51 PM, Calvin Cheung wrote: >> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>> now they are referenced in more than one place. >> So global.hpp will need to include metaspaceShared.hpp. >>> >>> I also have some questions. The 12M/16M are not introduced by this >>> change, do you know why those values were chosen as the default RO >>> and RW sizes? >> Sorry. I don't know the reasons why those values were chosen. >>> Now we require both spaces have to be at lease 12M on 32-bit >>> machines and 16M on 64-bit machine, is it a reasonable requirement? >>> What's the minimum size requirement for the RO and RW spaces with >>> the default classlist? 
>> >> Below are the numbers for the RO and RW spaces with the default >> classlist for various platforms: >> ===== >> linux >> ===== >> 64-bit >> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >> used] at 0x0000000800000000 >> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >> used] at 0x0000000801000000 >> >> 32-bit >> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >> used] at 0x80000000 >> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >> used] at 0x80c00000 >> >> ======== >> windows >> ======== >> 64-bit >> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >> used] at 0x0000000800000000 >> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >> used] at 0x0000000801000000 >> >> 32-bit >> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >> used] at 0x14690000 >> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >> used] at 0x15290000 >> >> ==== >> mac >> ==== >> 64-bit >> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >> used] at 0x0000000800000000 >> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >> used] at 0x0000000801000000 >> >> ==== >> >> So maybe we can define some enums as follows and leave the default >> values in globals.hpp alone? >> >> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) > > Sounds good to me. > > Thanks, > Jiangli > >> >> thanks, >> Calvin >> >>> >>> Thanks, >>> Jiangli >>> >>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>> >>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, >>>> and SharedReadWriteSize. >>>> >>>> For the SharedMiscDataSize, it is based on >>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>> done for the SharedMiscCodeSize. 
>>>> >>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if >>>> they are at least the default size. >>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>> least the default size. A default dump of CDS archive requires >8M >>>> of ro space and >11M of rw space. >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>> >>>> tests: >>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>> JPRT >>>> >>>> thanks, >>>> Calvin >>> >> > From jiangli.zhou at oracle.com Thu Dec 4 02:06:23 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 18:06:23 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547F9F0C.1000904@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> Message-ID: <547FC19F.6040406@oracle.com> Hi Ioi, I've updated the webrev: http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ Thanks, Jiangli On 12/03/2014 03:38 PM, Jiangli Zhou wrote: > Hi Ioi, >> >> I think these two blocks can be rewritten to avoid the use of the #ifdef >> 162 #ifdef _LP64 >> 163 *p++ = juint(base_address >> 32); >> 164 #else >> 165 *p++ = 0; >> 166 #endif >> 167 *p++ = juint(base_address & 0xffffffff); // base address >> >> 205 juint upper = *p++; >> 206 juint lower = *p++; >> 207 #ifdef _LP64 >> 208 _base_address = (uintx(upper) << 32 ) + 
uintx(lower); >> 209 #else >> 210 _base_address = uintx(lower); >> 211 #endif >> >> -> >> >> 163 *p++ = juint(base_address >> 32); >> 167 *p++ = juint(base_address & 0xffffffff); >> >> 205 juint upper = *p++; >> 206 juint lower = *p++; >> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); >> > > Actually it would have problem on 32-bit platforms. The behaviour of > shift by greater than or equal to the number of bits that exist in the > operand is undefined. Gcc gives warning about the >>32 on linux-x86. > > Thanks, > Jiangli > > From jiangli.zhou at oracle.com Thu Dec 4 04:04:51 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 03 Dec 2014 20:04:51 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547FBB89.1080409@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> Message-ID: <547FDD63.5020009@oracle.com> Hi Calvin, Looks good. Thanks, Jiangli On 12/03/2014 05:40 PM, Calvin Cheung wrote: > Hi Jiangli, > > I've updated the webrev at the same location: > http://cr.openjdk.java.net/~ccheung/8065050/webrev/ > > Previous version is saved at: > http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ > > thanks, > Calvin > > On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>> now they are referenced in more than one place. >>> So global.hpp will need to include metaspaceShared.hpp. >>>> >>>> I also have some questions. The 12M/16M are not introduced by this >>>> change, do you know why those values were chosen as the default RO >>>> and RW sizes? >>> Sorry. I don't know the reasons why those values were chosen. 
>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>> machines and 16M on 64-bit machine, is it a reasonable requirement? >>>> What's the minimum size requirement for the RO and RW spaces with >>>> the default classlist? >>> >>> Below are the numbers for the RO and RW spaces with the default >>> classlist for various platforms: >>> ===== >>> linux >>> ===== >>> 64-bit >>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >>> used] at 0x0000000800000000 >>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >>> used] at 0x0000000801000000 >>> >>> 32-bit >>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >>> used] at 0x80000000 >>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >>> used] at 0x80c00000 >>> >>> ======== >>> windows >>> ======== >>> 64-bit >>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >>> used] at 0x0000000800000000 >>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >>> used] at 0x0000000801000000 >>> >>> 32-bit >>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >>> used] at 0x14690000 >>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >>> used] at 0x15290000 >>> >>> ==== >>> mac >>> ==== >>> 64-bit >>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >>> used] at 0x0000000800000000 >>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >>> used] at 0x0000000801000000 >>> >>> ==== >>> >>> So maybe we can define some enums as follows and leave the default >>> values in globals.hpp alone? >>> >>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >> >> Sounds good to me. 
>> >> Thanks, >> Jiangli >> >>> >>> thanks, >>> Calvin >>> >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>> >>>>> Adding more checks on the SharedMiscDataSize, SharedReadOnlySize, >>>>> and SharedReadWriteSize. >>>>> >>>>> For the SharedMiscDataSize, it is based on >>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>> done for the SharedMiscCodeSize. >>>>> >>>>> For the SharedReadOnlySize and SharedReadWriteSize, I'm checking if >>>>> they are at least the default size. >>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>> least the default size. A default dump of CDS archive requires >8M >>>>> of ro space and >11M of rw space. >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>> >>>>> tests: >>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>> JPRT >>>>> >>>>> thanks, >>>>> Calvin >>>> >>> >> > From david.holmes at oracle.com Thu Dec 4 05:07:13 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 04 Dec 2014 15:07:13 +1000 Subject: [8u40] request for approval: 8035893: JVM_GetVersionInfo fails to zero structure In-Reply-To: <3bf5b22e-be81-4244-b2e8-6b514abefdad@default> References: <3bf5b22e-be81-4244-b2e8-6b514abefdad@default> Message-ID: <547FEC01.3080005@oracle.com> > Hi! > > May I please have approval to backport this fix from JDK9 to JDK8? I > have built the JDK-8 hotspot and tested it already. The JDK9 fix applies > cleanly to the JDK8 source. As I do not have an account for OpenJDK, David > Buck will push the fix into jdk8u/hs-dev/hotspot. Approved for backport.
Thanks, David > > BUGURL: https://bugs.openjdk.java.net/browse/JDK-8035893 > > JDK9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cd30121047ac > > review thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-February/011054.html > > > Regards, > Cheleswer From calvin.cheung at oracle.com Thu Dec 4 05:48:57 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 03 Dec 2014 21:48:57 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547FDD63.5020009@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <547FDD63.5020009@oracle.com> Message-ID: <547FF5C9.7060901@oracle.com> Thanks again - Jiangli. Calvin On 12/3/2014 8:04 PM, Jiangli Zhou wrote: > Hi Calvin, > > Looks good. > > Thanks, > Jiangli > > On 12/03/2014 05:40 PM, Calvin Cheung wrote: >> Hi Jiangli, >> >> I've updated the webrev at the same location: >> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >> >> Previous version is saved at: >> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ >> >> thanks, >> Calvin >> >> On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>>> Hi Calvin, >>>>> >>>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>>> now they are referenced in more than one place. >>>> So global.hpp will need to include metaspaceShared.hpp. >>>>> >>>>> I also have some questions. The 12M/16M are not introduced by this >>>>> change, do you know why those values were chosen as the default RO >>>>> and RW sizes? >>>> Sorry. I don't know the reasons why those values were chosen. >>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>>> machines and 16M on 64-bit machine, is it a reasonable >>>>> requirement? 
What's the minimum size requirement for the RO and RW >>>>> spaces with the default classlist? >>>> >>>> Below are the numbers for the RO and RW spaces with the default >>>> classlist for various platforms: >>>> ===== >>>> linux >>>> ===== >>>> 64-bit >>>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >>>> used] at 0x0000000800000000 >>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >>>> used] at 0x0000000801000000 >>>> >>>> 32-bit >>>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >>>> used] at 0x80000000 >>>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >>>> used] at 0x80c00000 >>>> >>>> ======== >>>> windows >>>> ======== >>>> 64-bit >>>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >>>> used] at 0x0000000800000000 >>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >>>> used] at 0x0000000801000000 >>>> >>>> 32-bit >>>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >>>> used] at 0x14690000 >>>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >>>> used] at 0x15290000 >>>> >>>> ==== >>>> mac >>>> ==== >>>> 64-bit >>>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >>>> used] at 0x0000000800000000 >>>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >>>> used] at 0x0000000801000000 >>>> >>>> ==== >>>> >>>> So maybe we can define some enums as follows and leave the default >>>> values in globals.hpp alone? >>>> >>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >>> >>> Sounds good to me. >>> >>> Thanks, >>> Jiangli >>> >>>> >>>> thanks, >>>> Calvin >>>> >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>>> >>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, >>>>>> and SharedReadWriteSize. 
>>>>>> >>>>>> For the SharedMiscDataSize, it is based on >>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>>> done for the SharedMiscCodeSize. >>>>>> >>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking >>>>>> if they are at least the default size. >>>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>>> least the default size. A default dump of CDS archive requires >>>>>> >8M of ro space and >11M of rw space. >>>>>> >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>> >>>>>> tests: >>>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>>> JPRT >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>> >>>> >>> >> > From chris.plummer at oracle.com Thu Dec 4 08:12:58 2014 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 04 Dec 2014 00:12:58 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <547F0889.5050204@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F0889.5050204@oracle.com> Message-ID: <5480178A.9090709@oracle.com> On 12/3/14 4:56 AM, Alan Bateman wrote: > On 02/12/2014 02:39, Chris Plummer wrote: >> Sorry about the long delay in getting back to this. I ran into two >> separate JPRT issues that were preventing me from testing these >> changes, plus I was on vacation last week. Here's an updated webrev. >> I'm not sure where we left things, so I'll just say what's changed >> since the original version: >> >> 1. Rewrote the test to be in Java instead of a shell script. >> 2. Moved the test from hotspot/test/runtime/memory to >> jdk/test/tools/launcher >> 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to >> override the default 32k minimum value. 
>> >> https://bugs.openjdk.java.net/browse/JDK-6762191 >> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ > This looks okay to me. A minor comment for java.c is that this code uses > 4-space indent (different to hotspot). > > The test looks okay too, you might just check the copyright date as > I assume it was not written in 2010. Also I think the import of > java.io.File may be left behind from the previous round. > > -Alan Hi Alan, While removing the java.io.File import, I also questioned why I had java.io.IOException being imported. There were a couple of methods that declared it thrown, and the main method therefore had to catch it, but it turns out this was just copy/paste from the Settings.java test I used as a template, and is not actually needed. I removed the import, throws, and try/catch of IOException. All the other issues mentioned by others have also been addressed. A new webrev can be found at: http://cr.openjdk.java.net/~cjplummer/6762191/webrev.03/ thanks, Chris From ioi.lam at oracle.com Thu Dec 4 10:32:07 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 04 Dec 2014 02:32:07 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid Message-ID: <54803827.8080402@oracle.com> Hi Folks, Please review a small fix: https://bugs.openjdk.java.net/browse/JDK-8066670 http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ Summary of fix: Do not set UseSharedSpaces to false if PrintSharedArchiveAndExit is enabled. After this fix, the JVM correctly exits when PrintSharedArchiveAndExit is enabled and an invalid archive is encountered. New test cases are in closed source code. 
Tests: JPRT JTREG From david.holmes at oracle.com Thu Dec 4 11:27:57 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 04 Dec 2014 21:27:57 +1000 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54803827.8080402@oracle.com> References: <54803827.8080402@oracle.com> Message-ID: <5480453D.1050801@oracle.com> Hi Ioi, On 4/12/2014 8:32 PM, Ioi Lam wrote: > Hi Folks, > > Please review a small fix: > > https://bugs.openjdk.java.net/browse/JDK-8066670 > http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ > > Summary of fix: > > Do not set UseSharedSpaces to false if PrintSharedArchiveAndExit is enabled. > > After this fix, the JVM correctly exits when > PrintSharedArchiveAndExit is enabled and an invalid archive is encountered. The change to metaspaceShared.cpp is fine. In filemap.cpp I'm less clear on the logic. It seems that if _validating_classpath_entry_table is false then we will still continue, even if PrintSharedArchiveAndExit is true. > > New test cases are in closed source code. Begs the question as to why there can't be an open test for this? Thanks, David > Tests: > > JPRT > JTREG > > > From ioi.lam at oracle.com Thu Dec 4 11:40:00 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 04 Dec 2014 03:40:00 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <5480453D.1050801@oracle.com> References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> Message-ID: <54804810.2060001@oracle.com> On 12/4/14, 3:27 AM, David Holmes wrote: > Hi Ioi, > > On 4/12/2014 8:32 PM, Ioi Lam wrote: >> Hi Folks, >> >> Please review a small fix: >> >> https://bugs.openjdk.java.net/browse/JDK-8066670 >> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >> >> Summary of fix: >> >> Do not set UseSharedSpaces to false if PrintSharedArchiveAndExit is >> enabled. 
>> >> After this fix, the JVM correctly exits when >> PrintSharedArchiveAndExit is enabled and an invalid archive is >> encountered. > > The change to metaspaceShared.cpp is fine. > > In filemap.cpp I'm less clear on the logic. It seems that if > _validating_classpath_entry_table is false then we will still > continue, even if PrintSharedArchiveAndExit is true. > The goal is to print out as much information as possible. It turns out that the most useful thing PrintSharedArchiveAndExit can report is which part of the classpath is invalid. When _validating_classpath_entry_table is true, we know it's safe to print an error message (about a part of the classpath that's invalid) and continue. When doing other things (_validating_classpath_entry_table==false), it's less clear whether we can continue if a failure is encountered. In this case, since PrintSharedArchiveAndExit is true, RequireSharedSpaces is automatically set to true (by arguments.cpp), so we will print out the error message and exit immediately. >> >> New test cases are in closed source code. > > Begs the question as to why there can't be an open test for this? > I will add an open test as well. 
Thanks - Ioi > Thanks, > David > >> Tests: >> >> JPRT >> JTREG >> >> >> From david.holmes at oracle.com Thu Dec 4 11:43:40 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 04 Dec 2014 21:43:40 +1000 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54804810.2060001@oracle.com> References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> <54804810.2060001@oracle.com> Message-ID: <548048EC.4030906@oracle.com> On 4/12/2014 9:40 PM, Ioi Lam wrote: > > On 12/4/14, 3:27 AM, David Holmes wrote: >> Hi Ioi, >> >> On 4/12/2014 8:32 PM, Ioi Lam wrote: >>> Hi Folks, >>> >>> Please review a small fix: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8066670 >>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >>> >>> Summary of fix: >>> >>> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >>> enabled. >>> >>> After this fix, the JVM correctly exits when >>> PrintSharedArchiveAndExit is enabled and an invalid archive is >>> encountered. >> >> The change to metaspaceShared.cpp is fine. >> >> In filemap.cpp I'm less clear on the logic. It seems that if >> _validating_classpath_entry_table is false then we will still >> continue, even if PrintSharedArchiveAndExit is true. >> > The goal is try to print out as much information as possible. It turns > out the most useful information with PrintSharedArchiveAndExit is to > find out which part of the classpath is invalid. When > _validating_classpath_entry_table is true, we know it's safe to print an > error message (about a part of the classpath that's invalid) and continue. > > When doing other things (_validating_classpath_entry_table==false), it's > less clear whether we can continue if a failure is encountered. In this > case, since PrintSharedArchiveAndExit is true, RequireSharedSpaces is > automatically set to true (by arguments.cpp), so we will print out the > error message and exit immediately. 
Okay - thanks for explaining. >>> >>> New test cases are in closed source code. >> >> Begs the question as to why there can't be an open test for this? >> > I will add an open test as well. Thanks. David > Thanks > - Ioi > >> Thanks, >> David >> >>> Tests: >>> >>> JPRT >>> JTREG >>> >>> >>> > From ioi.lam at oracle.com Thu Dec 4 12:04:13 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 04 Dec 2014 04:04:13 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <548048EC.4030906@oracle.com> References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> <54804810.2060001@oracle.com> <548048EC.4030906@oracle.com> Message-ID: <54804DBD.9070402@oracle.com> On 12/4/14, 3:43 AM, David Holmes wrote: > On 4/12/2014 9:40 PM, Ioi Lam wrote: >> >> On 12/4/14, 3:27 AM, David Holmes wrote: >>> Hi Ioi, >>> >>> On 4/12/2014 8:32 PM, Ioi Lam wrote: >>>> Hi Folks, >>>> >>>> Please review a small fix: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8066670 >>>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >>>> >>>> Summary of fix: >>>> >>>> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >>>> enabled. >>>> >>>> After this fix, the JVM correctly exits when >>>> PrintSharedArchiveAndExit is enabled and an invalid archive is >>>> encountered. >>> >>> The change to metaspaceShared.cpp is fine. >>> >>> In filemap.cpp I'm less clear on the logic. It seems that if >>> _validating_classpath_entry_table is false then we will still >>> continue, even if PrintSharedArchiveAndExit is true. >>> >> The goal is try to print out as much information as possible. It turns >> out the most useful information with PrintSharedArchiveAndExit is to >> find out which part of the classpath is invalid. When >> _validating_classpath_entry_table is true, we know it's safe to print an >> error message (about a part of the classpath that's invalid) and >> continue. 
>> >> When doing other things (_validating_classpath_entry_table==false), it's >> less clear whether we can continue if a failure is encountered. In this >> case, since PrintSharedArchiveAndExit is true, RequireSharedSpaces is >> automatically set to true (by arguments.cpp), so we will print out the >> error message and exit immediately. > > Okay - thanks for explaining. > >>>> >>>> New test cases are in closed source code. >>> >>> Begs the question as to why there can't be an open test for this? >>> >> I will add an open test as well. > I added the new test to the open code, at the same location: http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ Thanks - Ioi From michail.chernov at oracle.com Thu Dec 4 15:02:54 2014 From: michail.chernov at oracle.com (Michail Chernov) Date: Thu, 04 Dec 2014 18:02:54 +0300 Subject: RFR: 8064909: FragmentMetaspace.java got OutOfMemoryError In-Reply-To: <547CC705.5070004@oracle.com> References: <5475D74A.2060907@oracle.com> <54762451.3070802@oracle.com> <54763D54.3070704@oracle.com> <54765B70.10509@oracle.com> <54772711.6000003@oracle.com> <547CC705.5070004@oracle.com> Message-ID: <5480779E.4080407@oracle.com> Hi, Here is the updated webrev: http://cr.openjdk.java.net/~eistepan/~mchernov/8064909/webrev.01/ On 01.12.2014 22:52, Jon Masamitsu wrote: > > On 11/27/2014 05:28 AM, Michail Chernov wrote: >> Hi, >> >> CC'ed hotspot-runtime-dev. >> >> This is not a test failure - the test works as expected. The OOME occurred >> in the compiler instance. >> >> private JavaCompiler javac; >> ... >> javac = ToolProvider.getSystemJavaCompiler(); >> ... 
>> int exitcode = javac.run(null, null, null, >> file.getCanonicalPath()); >> if (exitcode != 0) { >> throw new RuntimeException("javac failure when compiling: >> " + >> file.getCanonicalPath()); >> >> Here is 2 ways - rewrite getGeneratedClass >> (runtime/testlibrary/GeneratedClassLoader.java) to allow them to >> throw not only RuntimeException, > > Seems like this would be more precise with regard to recognizing the > cause of the failure. Are there too many places which would have to > change to catch the OOME. > >> or to catch RuntimeException and check exception message comparing >> with "javac failure when compiling:". Both ways seem to me are not as >> clear as expected for this simple test. More - javac does not throw >> anything - it just returns exitcode (non-zero) and writes its >> messages to System.err. >> >> Also I can add comment to code like "OOME with message >> "java.lang.OutOfMemoryError: Java heap space" doesn't mean that >> something wrong with metaspace - need just to increase -Xmx". > > That would be enough for me if you don't think > throwing the OOME from GeneratedClassLoader() > adds much value. > > Jon > >> >> Thanks, >> Michail >> >> On 27.11.2014 2:00, Jon Masamitsu wrote: >>> Dima, >>> >>> If this test fails with an OOME in the future, I would like it to be >>> obvious that the failure is not that an OOME occurred. I cannot >>> tell that from looking at the test. Can the test be changed so >>> I don't have to spend time figuring out that the OOME is not >>> a failure mode of the test? >>> >>> Jon >>> >>> >>> On 11/26/2014 12:51 PM, Dmitry Fazunenko wrote: >>>> Hi Jon, >>>> >>>> The original version of test worked for 80 seconds trying to >>>> perform as many iterations as possible. The number of iterations >>>> performed depended on how fast is the machine. 
With each next >>>> iteration the size of generated and loaded classes is growing, so >>>> on fast enough machines 80 seconds is enough to run out of heap >>>> while generating a class. >>>> >>>> The fix not only sets the heap, but limits iterations. 300m heap >>>> is enough for 200 iterations. >>>> >>>> Your approach, with catching OOME(heap) and passing will also work, >>>> but it will reduce the test readability (and potentially could >>>> bring more problems). >>>> >>>> An alternative approach would be to limit metaspace and heap >>>> accordingly and load classes until we don't run out metaspace... >>>> But this might take awhile. >>>> >>>> So, I hope that Michael's fix is good. >>>> >>>> Thanks for looking and expressing comments. >>>> Dima >>>> >>>> >>>> >>>> >>>> On 26.11.2014 22:04, Jon Masamitsu wrote: >>>>> Michail, >>>>> >>>>> Your change makes this test pass but it seems like at >>>>> some future date 300m might not be big enough >>>>> (for whatever reason). Could the test be make to >>>>> caught an OOME, print out a message saying that >>>>> an OOME doesn't mean the test failed but that >>>>> the test needs a larger heap? Then pass an >>>>> exception up (maybe some type of Runtime >>>>> exception - sorry if that is vague but I don't >>>>> what type of exception would make sense). That >>>>> would mean we wouldn't have to spend time >>>>> diagnosing what the OOME means again. >>>>> >>>>> Jon >>>>> >>>>> On 11/26/2014 5:36 AM, Michail Chernov wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this simple fix for nightly test failure: >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~eistepan/~mchernov/8064909/webrev.00/ >>>>>> Bug: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8064909 >>>>>> >>>>>> Problem: test fails because of OOME (not enough heap size). >>>>>> Solution: heap size were increased. 
>>>>>> >>>>>> Testing: >>>>>> jtreg >>>>>> >>>>>> Thanks, >>>>>> Michail >>>>> >>>> >>> >>> >>> >> > > > From Alan.Bateman at oracle.com Thu Dec 4 17:30:48 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 04 Dec 2014 17:30:48 +0000 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <5480178A.9090709@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F0889.5050204@oracle.com> <5480178A.9090709@oracle.com> Message-ID: <54809A48.5090201@oracle.com> On 04/12/2014 08:12, Chris Plummer wrote: > Hi Alan, > > While removing the java.io.File import, I also questioned why I had > java.io.IOException being imported. There were a couple of methods > that declared it thrown, and the main method therefore had to catch > it, but it turns out this was just copy/paste from the Settings.java > test I used as a template, and is not actually needed. I removed the > import, throws, and try/catch of IOException. > > All the other issues mentioned by others have also been addressed. A > new webrev can be found at: > > http://cr.openjdk.java.net/~cjplummer/6762191/webrev.03/ This looks good to me. -Alan. From calvin.cheung at oracle.com Thu Dec 4 18:02:20 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 04 Dec 2014 10:02:20 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54803827.8080402@oracle.com> References: <54803827.8080402@oracle.com> Message-ID: <5480A1AC.7030602@oracle.com> Hi Ioi, The fix looks good. For the testcase, I'm wondering if it's possible to start only one java process for each scenario by calling output.shouldNotContain() twice. 
For (1) With a valid archive for example:

    pb = ProcessTools.createJavaProcessBuilder(
        "-XX:+UnlockDiagnosticVMOptions",
        "-XX:SharedArchiveFile=./sample.jsa",
        "-XX:+PrintSharedArchiveAndExit", "-version");
    output = new OutputAnalyzer(pb.start());
    output.shouldContain("archive is valid");
    output.shouldNotContain("Java HotSpot(TM)"); // Should not print JVM version
    output.shouldNotContain("Usage:");           // Should not print JVM help message
    output.shouldHaveExitValue(0);               // Should report success in error code.

thanks,
Calvin

On 12/4/2014 2:32 AM, Ioi Lam wrote:
> Hi Folks,
>
> Please review a small fix:
>
> https://bugs.openjdk.java.net/browse/JDK-8066670
> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/
>
> Summary of fix:
>
> Do not set UseSharedSpaces to false if PrintSharedArchiveAndExit is
> enabled.
>
> After this fix, the JVM correctly exits when
> PrintSharedArchiveAndExit is enabled
> and an invalid archive is encountered.
>
> New test cases are in closed source code.
>
> Tests:
>
> JPRT
> JTREG
>
>
>

From ioi.lam at oracle.com Thu Dec 4 18:39:02 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 04 Dec 2014 10:39:02 -0800
Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid
In-Reply-To: <5480A1AC.7030602@oracle.com>
References: <54803827.8080402@oracle.com> <5480A1AC.7030602@oracle.com>
Message-ID: <5480AA46.5030604@oracle.com>

Hi Calvin,

Without the -XX:+PrintSharedArchiveAndExit switch:

* The -version switch would trigger "Java HotSpot(TM) version ...." in
  the output
* Without -version, the usage info "Usage: ..." would be printed

Since the two output are mutually exclusive, I decided to put them in
two separate test cases.

Thanks
- Ioi

On 12/4/14, 10:02 AM, Calvin Cheung wrote:
> Hi Ioi,
>
> The fix looks good.
>
> For the testcase, I'm wondering if it's possible to start only one
> java process for each scenario by calling output.shouldNotContain()
> twice.
> > For (1) With a valid archive for example: > pb = ProcessTools.createJavaProcessBuilder( > "-XX:+UnlockDiagnosticVMOptions", > "-XX:SharedArchiveFile=./sample.jsa", > "-XX:+PrintSharedArchiveAndExit", "-version"); > output = new OutputAnalyzer(pb.start()); > output.shouldContain("archive is valid"); > output.shouldNotContain("Java HotSpot(TM)"); // Should not > print JVM version > output.shouldNotContain("Usage:"); // Should not > print JVM help message > output.shouldHaveExitValue(0); // Should report > success in error code. > > thanks, > Calvin > > > On 12/4/2014 2:32 AM, Ioi Lam wrote: >> Hi Folks, >> >> Please review a small fix: >> >> https://bugs.openjdk.java.net/browse/JDK-8066670 >> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >> >> Summary of fix: >> >> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >> enabled. >> >> After this fix, the JVM correctly exits when >> PrintSharedArchiveAndExit is enabled >> and an invalid archive is encountered. >> >> New test cases are in closed source code. >> >> Tests: >> >> JPRT >> JTREG >> >> >> > From chris.plummer at oracle.com Thu Dec 4 20:11:57 2014 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 04 Dec 2014 12:11:57 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <54809A48.5090201@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F0889.5050204@oracle.com> <5480178A.9090709@oracle.com> <54809A48.5090201@oracle.com> Message-ID: <5480C00D.1050003@oracle.com> On 12/4/14 9:30 AM, Alan Bateman wrote: > On 04/12/2014 08:12, Chris Plummer wrote: >> Hi Alan, >> >> While removing the java.io.File import, I also questioned why I had >> java.io.IOException being imported. 
There were a couple of methods >> that declared it thrown, and the main method therefore had to catch >> it, but it turns out this was just copy/paste from the Settings.java >> test I used as a template, and is not actually needed. I removed the >> import, throws, and try/catch of IOException. >> >> All the other issues mentioned by others have also been addressed. A >> new webrev can be found at: >> >> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.03/ > This looks good to me. > > -Alan. Thanks everyone for the reviews! Chris From calvin.cheung at oracle.com Thu Dec 4 20:41:36 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 04 Dec 2014 12:41:36 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <5480AA46.5030604@oracle.com> References: <54803827.8080402@oracle.com> <5480A1AC.7030602@oracle.com> <5480AA46.5030604@oracle.com> Message-ID: <5480C700.5080204@oracle.com> Oops... I missed the "-version" part. Looks good to me. Calvin On 12/4/2014 10:39 AM, Ioi Lam wrote: > Hi Calvin, > > Without the -XX:+PrintSharedArchiveAndExit switch: > > * The -version switch would trigger "Java HotSpot(TM) version ...." in > the output > * Without -version, the usage info "Usage: ..." would be printed > > Since the two output are mutually exclusive, I decided to put them in > two separate test cases. > > Thanks > - Ioi > > On 12/4/14, 10:02 AM, Calvin Cheung wrote: >> Hi Ioi, >> >> The fix looks good. >> >> For the testcase, I'm wondering if it's possible to start only one >> java process for each scenario by calling output.shouldNotContain() >> twice. 
>> >> For (1) With a valid archive for example: >> pb = ProcessTools.createJavaProcessBuilder( >> "-XX:+UnlockDiagnosticVMOptions", >> "-XX:SharedArchiveFile=./sample.jsa", >> "-XX:+PrintSharedArchiveAndExit", "-version"); >> output = new OutputAnalyzer(pb.start()); >> output.shouldContain("archive is valid"); >> output.shouldNotContain("Java HotSpot(TM)"); // Should not >> print JVM version >> output.shouldNotContain("Usage:"); // Should not >> print JVM help message >> output.shouldHaveExitValue(0); // Should >> report success in error code. >> >> thanks, >> Calvin >> >> >> On 12/4/2014 2:32 AM, Ioi Lam wrote: >>> Hi Folks, >>> >>> Please review a small fix: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8066670 >>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >>> >>> Summary of fix: >>> >>> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >>> enabled. >>> >>> After this fix, the JVM correctly exits when >>> PrintSharedArchiveAndExit is enabled >>> and an invalid archive is encountered. >>> >>> New test cases are in closed source code. 
>>> >>> Tests: >>> >>> JPRT >>> JTREG >>> >>> >>> >> > From ioi.lam at oracle.com Thu Dec 4 22:17:18 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 04 Dec 2014 14:17:18 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <547FC19F.6040406@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> Message-ID: <5480DD6E.7050509@oracle.com> Hi Jiangli, Looks good. Thanks! 
- Ioi

On 12/3/14, 6:06 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> I've updated the webrev:
> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/
>
> Thanks,
> Jiangli
>
> On 12/03/2014 03:38 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>>
>>> I think these two blocks can be rewritten to avoid the use of the
>>> #ifdef
>>>  162 #ifdef _LP64
>>>  163   *p++ = juint(base_address >> 32);
>>>  164 #else
>>>  165   *p++ = 0;
>>>  166 #endif
>>>  167   *p++ = juint(base_address & 0xffffffff); // base address
>>>
>>>  205   juint upper = *p++;
>>>  206   juint lower = *p++;
>>>  207 #ifdef _LP64
>>>  208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>>  209 #else
>>>  210   _base_address = uintx(lower);
>>>  211 #endif
>>>
>>> ->
>>>
>>>  163   *p++ = juint(base_address >> 32);
>>>  167   *p++ = juint(base_address & 0xffffffff);
>>>
>>>  205   juint upper = *p++;
>>>  206   juint lower = *p++;
>>>  208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>>
>>
>> Actually it would have problem on 32-bit platforms. The behaviour of
>> shift by greater than or equal to the number of bits that exist in
>> the operand is undefined. Gcc gives warning about the >>32 on linux-x86.
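The point about the undefined shift can be illustrated with a minimal sketch. It is not the code from the webrev: the typedefs below are stand-ins for HotSpot's juint and uintx, and write_base_address / read_base_address are hypothetical helper names. The idea is to widen the value to 64 bits before shifting, so that ">> 32" and "<< 32" stay well defined even where uintx is only 32 bits wide, which is one way to drop the #ifdef.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t  juint;  // stand-in for HotSpot's juint (assumption)
typedef uintptr_t uintx;  // stand-in for HotSpot's uintx (assumption)

// Hypothetical helper: store base_address as two 32-bit words, high word first.
// The shift is performed on a 64-bit value, so it is defined on 32-bit
// platforms too, where the high word simply comes out as 0.
inline juint* write_base_address(juint* p, uintx base_address) {
  uint64_t wide = static_cast<uint64_t>(base_address);
  *p++ = static_cast<juint>(wide >> 32);
  *p++ = static_cast<juint>(wide & 0xffffffffu);
  return p;
}

// Hypothetical helper: read the two words back and combine them in 64 bits
// before narrowing to the platform word size.
inline uintx read_base_address(const juint* p) {
  uint64_t upper = p[0];
  uint64_t lower = p[1];
  uint64_t wide  = (upper << 32) | lower;
  assert(wide <= UINTPTR_MAX);  // the high word must be 0 on 32-bit platforms
  return static_cast<uintx>(wide);
}
```

Whether the extra widening is preferable to the #ifdef is a style choice; the sketch only shows that the shift itself need not be undefined.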
>> >> Thanks, >> Jiangli >> >> > From jiangli.zhou at oracle.com Thu Dec 4 23:01:52 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 04 Dec 2014 15:01:52 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <5480DD6E.7050509@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> Message-ID: <5480E7E0.4040709@oracle.com> Thanks, Ioi! Jiangli On 12/04/2014 02:17 PM, Ioi Lam wrote: > Hi Jiangli, > > Looks good. Thanks! 
> > - Ioi > > On 12/3/14, 6:06 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> I've updated the webrev: >> http://cr.openjdk.java.net/~jiangli/8059510/webrev.06/ >> >> Thanks, >> Jiangli >> >> On 12/03/2014 03:38 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>>> >>>> I think these two blocks can be rewritten to avoid the use of the >>>> #ifdef >>>> 162 #ifdef _LP64 >>>> 163 *p++ = juint(base_address >> 32); >>>> 164 #else >>>> 165 *p++ = 0; >>>> 166 #endif >>>> 167 *p++ = juint(base_address & 0xffffffff); // base address >>>> >>>> 205 juint upper = *p++; >>>> 206 juint lower = *p++; >>>> 207 #ifdef _LP64 >>>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); >>>> 209 #else >>>> 210 _base_address = uintx(lower); >>>> 211 #endif >>>> >>>> -> >>>> >>>> 163 *p++ = juint(base_address >> 32); >>>> 167 *p++ = juint(base_address & 0xffffffff); >>>> >>>> 205 juint upper = *p++; >>>> 206 juint lower = *p++; >>>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); >>>> >>> >>> Actually it would have problem on 32-bit platforms. The behaviour of >>> shift by greater than or equal to the number of bits that exist in >>> the operand is undefined. Gcc gives warning about the >>32 on >>> linux-x86. >>> >>> Thanks, >>> Jiangli >>> >>> >> > From david.holmes at oracle.com Thu Dec 4 23:14:22 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 05 Dec 2014 09:14:22 +1000 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <547FBB89.1080409@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> Message-ID: <5480EACE.7000605@oracle.com> On 4/12/2014 11:40 AM, Calvin Cheung wrote: > Hi Jiangli, > > I've updated the webrev at the same location: > http://cr.openjdk.java.net/~ccheung/8065050/webrev/ Seems okay. 
Do we have a test that sets the flags to these allowed minimum values and dumps and then uses the archive? Thanks, David > Previous version is saved at: > http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ > > thanks, > Calvin > > On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>> now they are referenced in more than one place. >>> So global.hpp will need to include metaspaceShared.hpp. >>>> >>>> I also have some questions. The 12M/16M are not introduced by this >>>> change, do you know why those values were chosen as the default RO >>>> and RW sizes? >>> Sorry. I don't know the reasons why those values were chosen. >>>> Now we require both spaces have to be at lease 12M on 32-bit >>>> machines and 16M on 64-bit machine, is it a reasonable requirement? >>>> What's the minimum size requirement for the RO and RW spaces with >>>> the default classlist? 
>>>
>>> Below are the numbers for the RO and RW spaces with the default
>>> classlist for various platforms:
>>> =====
>>> linux
>>> =====
>>> 64-bit
>>> ro space:  8433480 [ 37.8% of total] out of 16777216 bytes [50.3% used] at 0x0000000800000000
>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% used] at 0x0000000801000000
>>>
>>> 32-bit
>>> ro space:  6316488 [ 48.9% of total] out of 12582912 bytes [50.2% used] at 0x80000000
>>> rw space:  5794312 [ 44.9% of total] out of 12582912 bytes [46.0% used] at 0x80c00000
>>>
>>> ========
>>> windows
>>> ========
>>> 64-bit
>>> ro space:  7888680 [ 37.7% of total] out of 16777216 bytes [47.0% used] at 0x0000000800000000
>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% used] at 0x0000000801000000
>>>
>>> 32-bit
>>> ro space:  6030640 [ 49.3% of total] out of 12582912 bytes [47.9% used] at 0x14690000
>>> rw space:  5440904 [ 44.5% of total] out of 12582912 bytes [43.2% used] at 0x15290000
>>>
>>> ====
>>> mac
>>> ====
>>> 64-bit
>>> ro space:  6798968 [ 37.0% of total] out of 16777216 bytes [40.5% used] at 0x0000000800000000
>>> rw space:  9446240 [ 51.4% of total] out of 16777216 bytes [56.3% used] at 0x0000000801000000
>>>
>>> ====
>>>
>>> So maybe we can define some enums as follows and leave the default
>>> values in globals.hpp alone?
>>>
>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M)
>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M)
>>
>> Sounds good to me.
>>
>> Thanks,
>> Jiangli
>>
>>>
>>> thanks,
>>> Calvin
>>>
>>>>
>>>> Thanks,
>>>> Jiangli
>>>>
>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote:
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050
>>>>>
>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize,
>>>>> and SharedReadWriteSize.
>>>>>
>>>>> For the SharedMiscDataSize, it is based on
>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was
>>>>> done for the SharedMiscCodeSize.
>>>>> >>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if >>>>> they are at least the default size. >>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>> least the default size. A default dump of CDS archive requires >8M >>>>> of ro space and >11M of rw space. >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>> >>>>> tests: >>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>> JPRT >>>>> >>>>> thanks, >>>>> Calvin >>>> >>> >> > From david.holmes at oracle.com Thu Dec 4 23:46:42 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 05 Dec 2014 09:46:42 +1000 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <5480178A.9090709@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F0889.5050204@oracle.com> <5480178A.9090709@oracle.com> Message-ID: <5480F262.9010407@oracle.com> Looks good to me too Chris - sorry for the delay getting back to you. But at least Kumar spotted all the typos :) David On 4/12/2014 6:12 PM, Chris Plummer wrote: > On 12/3/14 4:56 AM, Alan Bateman wrote: >> On 02/12/2014 02:39, Chris Plummer wrote: >>> Sorry about the long delay in getting back to this. I ran into two >>> separate JPRT issues that were preventing me from testing these >>> changes, plus I was on vacation last week. Here's an updated webrev. >>> I'm not sure where we left things, so I'll just say what's changed >>> since the original version: >>> >>> 1. Rewrote the test to be in Java instead of a shell script. >>> 2. Moved the test from hotspot/test/runtime/memory to >>> jdk/test/tools/launcher >>> 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to >>> override the default 32k minimum value. 
>>> >>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ >> This looks to me. A minor comment for java.c is that this code uses >> 4-space indent (different to hotspot). >> >> The test looks okay too, you might just checking the copyright date as >> I assume was not written in 2010. Also I think the import of >> java.io.File may be left behind from the previous round. >> >> -Alan > Hi Alan, > > While removing the java.io.File import, I also questioned why I had > java.io.IOException being imported. There were a couple of methods that > declared it thrown, and the main method therefore had to catch it, but > it turns out this was just copy/paste from the Settings.java test I used > as a template, and is not actually needed. I removed the import, throws, > and try/catch of IOException. > > All the other issues mentioned by others have also been addressed. A new > webrev can be found at: > > http://cr.openjdk.java.net/~cjplummer/6762191/webrev.03/ > > thanks, > > Chris From serguei.spitsyn at oracle.com Fri Dec 5 00:38:35 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 04 Dec 2014 16:38:35 -0800 Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation fault In-Reply-To: <5480F262.9010407@oracle.com> References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com> <54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com> <546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com> <546CBCAB.7040101@oracle.com> <547D265A.20005@oracle.com> <547F0889.5050204@oracle.com> <5480178A.9090709@oracle.com> <5480F262.9010407@oracle.com> Message-ID: <5480FE8B.5010301@oracle.com> It still looks good to me too. :) Thanks, Serguei On 12/4/14 3:46 PM, David Holmes wrote: > Looks good to me too Chris - sorry for the delay getting back to you. 
> But at least Kumar spotted all the typos :) > > David > > On 4/12/2014 6:12 PM, Chris Plummer wrote: >> On 12/3/14 4:56 AM, Alan Bateman wrote: >>> On 02/12/2014 02:39, Chris Plummer wrote: >>>> Sorry about the long delay in getting back to this. I ran into two >>>> separate JPRT issues that were preventing me from testing these >>>> changes, plus I was on vacation last week. Here's an updated webrev. >>>> I'm not sure where we left things, so I'll just say what's changed >>>> since the original version: >>>> >>>> 1. Rewrote the test to be in Java instead of a shell script. >>>> 2. Moved the test from hotspot/test/runtime/memory to >>>> jdk/test/tools/launcher >>>> 3. Added STACK_SIZE_MINIMUM to java.c, allowing a makefile to >>>> override the default 32k minimum value. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-6762191 >>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.02/ >>> This looks to me. A minor comment for java.c is that this code uses >>> 4-space indent (different to hotspot). >>> >>> The test looks okay too, you might just checking the copyright date as >>> I assume was not written in 2010. Also I think the import of >>> java.io.File may be left behind from the previous round. >>> >>> -Alan >> Hi Alan, >> >> While removing the java.io.File import, I also questioned why I had >> java.io.IOException being imported. There were a couple of methods that >> declared it thrown, and the main method therefore had to catch it, but >> it turns out this was just copy/paste from the Settings.java test I used >> as a template, and is not actually needed. I removed the import, throws, >> and try/catch of IOException. >> >> All the other issues mentioned by others have also been addressed. 
A new >> webrev can be found at: >> >> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.03/ >> >> thanks, >> >> Chris From david.holmes at oracle.com Fri Dec 5 01:15:52 2014 From: david.holmes at oracle.com (David Holmes) Date: Fri, 05 Dec 2014 11:15:52 +1000 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54804DBD.9070402@oracle.com> References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> <54804810.2060001@oracle.com> <548048EC.4030906@oracle.com> <54804DBD.9070402@oracle.com> Message-ID: <54810748.9000706@oracle.com> Hi Ioi, Thanks for the new open test. Two nits: output.shouldNotContain("Java HotSpot(TM)"); // Should not print JVM version This won't work for an OpenJDK build. 59 // (2) With an valid archive (boot class path has been prepended) valid -> invalid I assume? Thanks, David On 4/12/2014 10:04 PM, Ioi Lam wrote: > > On 12/4/14, 3:43 AM, David Holmes wrote: >> On 4/12/2014 9:40 PM, Ioi Lam wrote: >>> >>> On 12/4/14, 3:27 AM, David Holmes wrote: >>>> Hi Ioi, >>>> >>>> On 4/12/2014 8:32 PM, Ioi Lam wrote: >>>>> Hi Folks, >>>>> >>>>> Please review a small fix: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8066670 >>>>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >>>>> >>>>> Summary of fix: >>>>> >>>>> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >>>>> enabled. >>>>> >>>>> After this fix, the JVM correctly exits when >>>>> PrintSharedArchiveAndExit is enabled and an invalid archive is >>>>> encountered. >>>> >>>> The change to metaspaceShared.cpp is fine. >>>> >>>> In filemap.cpp I'm less clear on the logic. It seems that if >>>> _validating_classpath_entry_table is false then we will still >>>> continue, even if PrintSharedArchiveAndExit is true. >>>> >>> The goal is try to print out as much information as possible. 
It turns >>> out the most useful information with PrintSharedArchiveAndExit is to >>> find out which part of the classpath is invalid. When >>> _validating_classpath_entry_table is true, we know it's safe to print an >>> error message (about a part of the classpath that's invalid) and >>> continue. >>> >>> When doing other things (_validating_classpath_entry_table==false), it's >>> less clear whether we can continue if a failure is encountered. In this >>> case, since PrintSharedArchiveAndExit is true, RequireSharedSpaces is >>> automatically set to true (by arguments.cpp), so we will print out the >>> error message and exit immediately. >> >> Okay - thanks for explaining. >> >>>>> >>>>> New test cases are in closed source code. >>>> >>>> Begs the question as to why there can't be an open test for this? >>>> >>> I will add an open test as well. >> > I added the new test in the open code, under the same location: > > http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ > > Thanks > - Ioi From ioi.lam at oracle.com Fri Dec 5 01:37:59 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 04 Dec 2014 17:37:59 -0800 Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54810748.9000706@oracle.com> References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> <54804810.2060001@oracle.com> <548048EC.4030906@oracle.com> <54804DBD.9070402@oracle.com> <54810748.9000706@oracle.com> Message-ID: <54810C77.3000003@oracle.com> Hi David, Thanks for the review. 
I have fixed the test case (and also made similar changes in the closed test case)

ioilinux ~/jdk/jdk9/hotspot$ hg diff
diff -r a711ea14195a test/runtime/SharedArchiveFile/PrintSharedArchiveAndExit.java
--- a/test/runtime/SharedArchiveFile/PrintSharedArchiveAndExit.java Thu Dec 04 15:20:09 2014 -0800
+++ b/test/runtime/SharedArchiveFile/PrintSharedArchiveAndExit.java Thu Dec 04 17:36:49 2014 -0800
@@ -45,7 +45,7 @@
         "-XX:+PrintSharedArchiveAndExit", "-version");
       output = new OutputAnalyzer(pb.start());
       output.shouldContain("archive is valid");
-      output.shouldNotContain("Java HotSpot(TM)"); // Should not print JVM version
+      output.shouldNotContain("java version"); // Should not print JVM version
       output.shouldHaveExitValue(0); // Should report success in error code.

       pb = ProcessTools.createJavaProcessBuilder(
@@ -56,14 +56,14 @@
       output.shouldNotContain("Usage:"); // Should not print JVM help message
       output.shouldHaveExitValue(0); // Should report success in error code.

-      // (2) With an valid archive (boot class path has been prepended)
+      // (2) With an invalid archive (boot class path has been prepended)
       pb = ProcessTools.createJavaProcessBuilder(
         "-Xbootclasspath/p:foo.jar",
         "-XX:+UnlockDiagnosticVMOptions", "-XX:SharedArchiveFile=./sample.jsa",
         "-XX:+PrintSharedArchiveAndExit", "-version");
       output = new OutputAnalyzer(pb.start());
       output.shouldContain("archive is invalid");
-      output.shouldNotContain("Java HotSpot(TM)"); // Should not print JVM version
+      output.shouldNotContain("java version"); // Should not print JVM version
       output.shouldHaveExitValue(1); // Should report failure in error code.

- Ioi

On 12/4/14, 5:15 PM, David Holmes wrote:
> Hi Ioi,
>
> Thanks for the new open test. Two nits:
>
> output.shouldNotContain("Java HotSpot(TM)"); // Should not print JVM version
>
> This won't work for an OpenJDK build.
>
> 59 // (2) With an valid archive (boot class path has been prepended)
>
> valid -> invalid I assume?
> > Thanks, > David > > On 4/12/2014 10:04 PM, Ioi Lam wrote: >> >> On 12/4/14, 3:43 AM, David Holmes wrote: >>> On 4/12/2014 9:40 PM, Ioi Lam wrote: >>>> >>>> On 12/4/14, 3:27 AM, David Holmes wrote: >>>>> Hi Ioi, >>>>> >>>>> On 4/12/2014 8:32 PM, Ioi Lam wrote: >>>>>> Hi Folks, >>>>>> >>>>>> Please review a small fix: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8066670 >>>>>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/ >>>>>> >>>>>> Summary of fix: >>>>>> >>>>>> Do not set UseSharedSpaces to falseif PrintSharedArchiveAndExit is >>>>>> enabled. >>>>>> >>>>>> After this fix, the JVM correctly exits when >>>>>> PrintSharedArchiveAndExit is enabled and an invalid archive is >>>>>> encountered. >>>>> >>>>> The change to metaspaceShared.cpp is fine. >>>>> >>>>> In filemap.cpp I'm less clear on the logic. It seems that if >>>>> _validating_classpath_entry_table is false then we will still >>>>> continue, even if PrintSharedArchiveAndExit is true. >>>>> >>>> The goal is try to print out as much information as possible. It turns >>>> out the most useful information with PrintSharedArchiveAndExit is to >>>> find out which part of the classpath is invalid. When >>>> _validating_classpath_entry_table is true, we know it's safe to >>>> print an >>>> error message (about a part of the classpath that's invalid) and >>>> continue. >>>> >>>> When doing other things (_validating_classpath_entry_table==false), >>>> it's >>>> less clear whether we can continue if a failure is encountered. In >>>> this >>>> case, since PrintSharedArchiveAndExit is true, RequireSharedSpaces is >>>> automatically set to true (by arguments.cpp), so we will print out the >>>> error message and exit immediately. >>> >>> Okay - thanks for explaining. >>> >>>>>> >>>>>> New test cases are in closed source code. >>>>> >>>>> Begs the question as to why there can't be an open test for this? >>>>> >>>> I will add an open test as well. 
>>>
>> I added the new test in the open code, under the same location:
>>
>> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit/
>>
>> Thanks
>> - Ioi

From calvin.cheung at oracle.com Fri Dec 5 02:40:30 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 04 Dec 2014 18:40:30 -0800
Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified
In-Reply-To: <5480EACE.7000605@oracle.com>
References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com>
Message-ID: <54811B1E.6030901@oracle.com>

Hi David,

On 12/4/2014 3:14 PM, David Holmes wrote:
> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>> Hi Jiangli,
>>
>> I've updated the webrev at the same location:
>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>
> Seems okay. Do we have a test that sets the flags to these allowed
> minimum values and dumps and then uses the archive?

I've added more test scenarios to the testcase.
Updated webrev is at the same location.

thanks,
Calvin

>
> Thanks,
> David
>
>> Previous version is saved at:
>> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/
>>
>> thanks,
>> Calvin
>>
>> On 12/3/2014 3:53 PM, Jiangli Zhou wrote:
>>> Hi Calvin,
>>>
>>> On 12/03/2014 03:51 PM, Calvin Cheung wrote:
>>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote:
>>>>> Hi Calvin,
>>>>>
>>>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp
>>>>> now they are referenced in more than one place.
>>>> So global.hpp will need to include metaspaceShared.hpp.
>>>>>
>>>>> I also have some questions. The 12M/16M are not introduced by this
>>>>> change, do you know why those values were chosen as the default RO
>>>>> and RW sizes?
>>>> Sorry. I don't know the reasons why those values were chosen.
>>>>> Now we require both spaces have to be at least 12M on 32-bit
>>>>> machines and 16M on 64-bit machine, is it a reasonable requirement?
>>>>> What's the minimum size requirement for the RO and RW spaces with
>>>>> the default classlist?
>>>>
>>>> Below are the numbers for the RO and RW spaces with the default
>>>> classlist for various platforms:
>>>> =====
>>>> linux
>>>> =====
>>>> 64-bit
>>>> ro space:  8433480 [ 37.8% of total] out of 16777216 bytes [50.3% used] at 0x0000000800000000
>>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% used] at 0x0000000801000000
>>>>
>>>> 32-bit
>>>> ro space:  6316488 [ 48.9% of total] out of 12582912 bytes [50.2% used] at 0x80000000
>>>> rw space:  5794312 [ 44.9% of total] out of 12582912 bytes [46.0% used] at 0x80c00000
>>>>
>>>> ========
>>>> windows
>>>> ========
>>>> 64-bit
>>>> ro space:  7888680 [ 37.7% of total] out of 16777216 bytes [47.0% used] at 0x0000000800000000
>>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% used] at 0x0000000801000000
>>>>
>>>> 32-bit
>>>> ro space:  6030640 [ 49.3% of total] out of 12582912 bytes [47.9% used] at 0x14690000
>>>> rw space:  5440904 [ 44.5% of total] out of 12582912 bytes [43.2% used] at 0x15290000
>>>>
>>>> ====
>>>> mac
>>>> ====
>>>> 64-bit
>>>> ro space:  6798968 [ 37.0% of total] out of 16777216 bytes [40.5% used] at 0x0000000800000000
>>>> rw space:  9446240 [ 51.4% of total] out of 16777216 bytes [56.3% used] at 0x0000000801000000
>>>>
>>>> ====
>>>>
>>>> So maybe we can define some enums as follows and leave the default
>>>> values in globals.hpp alone?
>>>>
>>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M)
>>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M)
>>>
>>> Sounds good to me.
>>>
>>> Thanks,
>>> Jiangli
>>>
>>>>
>>>> thanks,
>>>> Calvin
>>>>
>>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>
>>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote:
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050
>>>>>>
>>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize,
>>>>>> and SharedReadWriteSize.
>>>>>>
>>>>>> For the SharedMiscDataSize, it is based on
>>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was
>>>>>> done for the SharedMiscCodeSize.
>>>>>>
>>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm checking if
>>>>>> they are at least the default size.
>>>>>> I think it's reasonable to enforce the ro and rw sizes to be at
>>>>>> least the default size. A default dump of CDS archive requires >8M
>>>>>> of ro space and >11M of rw space.
>>>>>>
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>>>>>
>>>>>> tests:
>>>>>> ran the testcase via jtreg on linux_x64 and windows_x64
>>>>>> JPRT
>>>>>>
>>>>>> thanks,
>>>>>> Calvin
>>>>>
>>>>
>>>
>>

From david.holmes at oracle.com Fri Dec 5 03:50:52 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 05 Dec 2014 13:50:52 +1000
Subject: RFR (XS): 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid
In-Reply-To: <54810C77.3000003@oracle.com>
References: <54803827.8080402@oracle.com> <5480453D.1050801@oracle.com> <54804810.2060001@oracle.com> <548048EC.4030906@oracle.com> <54804DBD.9070402@oracle.com> <54810748.9000706@oracle.com> <54810C77.3000003@oracle.com>
Message-ID: <54812B9C.7090804@oracle.com>

Thanks Ioi - good to go.

David

From david.holmes at oracle.com Fri Dec 5 04:08:37 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 05 Dec 2014 14:08:37 +1000
Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified
In-Reply-To: <54811B1E.6030901@oracle.com>
References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com>
Message-ID: <54812FC5.2000505@oracle.com>

On 5/12/2014 12:40 PM, Calvin Cheung wrote:
> Hi David,
>
> On 12/4/2014 3:14 PM, David Holmes wrote:
>> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>>> Hi Jiangli,
>>>
>>> I've updated the webrev at the same location:
>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>
>> Seems okay. Do we have a test that sets the flags to these allowed
>> minimum values and dumps and then uses the archive?
>
> I've added more test scenarios to the testcase.
> Updated webrev is at the same location.

These values:

! value = Platform.is64bit() ? "10M" : "8M";
! break;
! case RW:
! value = Platform.is64bit() ? "13M" : "7M";

should match these:

min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M),
min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M)

David
-----

From calvin.cheung at oracle.com Fri Dec 5 06:55:41 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 04 Dec 2014 22:55:41 -0800
Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified
In-Reply-To: <54812FC5.2000505@oracle.com>
References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com>
Message-ID: <548156ED.6070502@oracle.com>

On 12/4/2014 8:08 PM, David Holmes wrote:
> On 5/12/2014 12:40 PM, Calvin Cheung wrote:
>> Hi David,
>>
>> On 12/4/2014 3:14 PM, David Holmes wrote:
>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>>>> Hi Jiangli,
>>>>
>>>> I've updated the webrev at the same location:
>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>>
>>> Seems okay. Do we have a test that sets the flags to these allowed
>>> minimum values and dumps and then uses the archive?
>>
>> I've added more test scenarios to the testcase.
>> Updated webrev is at the same location.
>
> These values:
>
> ! value = Platform.is64bit() ? "10M" : "8M";
> ! break;
> ! case RW:
> ! value = Platform.is64bit() ? "13M" : "7M";
>
> should match these:
>
> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M),
> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M)

For 64-bit, I've changed the testcase to match with the definitions.
For 32-bit, I've changed the definitions to match with the testcase.
Otherwise, the test fails with "not enough space".
It may have something to do with the following calculation:
    SharedReadOnlySize = align_size_up(SharedReadOnlySize, max_alignment);
    SharedReadWriteSize = align_size_up(SharedReadWriteSize, max_alignment);

I've updated the webrev at the same location.

thanks,
Calvin

From david.holmes at oracle.com Fri Dec 5 07:05:40 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 05 Dec 2014 17:05:40 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process
In-Reply-To:
References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> <547E8CF4.3050305@oracle.com>
Message-ID: <54815944.1060500@oracle.com>

On 3/12/2014 8:47 PM, Thomas Stüfe wrote:
> Hi Dean,
>
> I dont understand. Such a function does not exist, does it?
> So I would have to write it:
>
> Do you mean generating and using a StubRoutine which would SIGILL? I did
> not do this because I wanted to be able to generate SIGILL also in
> initialization code, where StubRoutines may not yet be generated. This
> point may be arguable, but as this function is used to test error
> handling, it may be interesting to test it for half-initialized VMs too.
>
> Otherwise I would implement the CPU specific
> generate_illegal_instruction_sequence() probably the same way as I do
> now the crash_with_sigill() function. That would mean a bit of more code
> duplication because:
> - Either I use the method I use now (reserve_memory and copy the
>   instructions to the reserved page)
> - Or I use inline assembly - which probably does not work across
>   multiple OSs, so for CPUs which span various OSs I would have to add one
>   function per os_cpu combination, not just per cpu.

I don't think there is any OS dependency with inline assembly - only
compiler. And I am also concerned that writing code to an executable
page will also enter the realm of "self-modifying code" and all the
jumping through hoops that entails. That aspect hadn't occurred to me
till Dean raised it.

I'm forming the view that triggering a SIGILL is more effort than it is
worth for a secondary testing function.

I should also add that some of my cringing was unfounded as I was
mistakenly thinking that you were adding the cpu specific debug files
when you were not.

Side nit: raise() should be os::raise() (though I don't see any
implementation that does anything but raise() ).

David

> Kind regards, Thomas
>
> On Wed, Dec 3, 2014 at 5:09 AM, Dean Long wrote:
>
> Instead of get_illegal_instruction_sequence() that fills in a
> buffer in reserved memory page, how about simply
> generate_illegal_instruction_sequence() that causes the SIGILL
> when executed?
> Then crash_with_sigill() simplifies to something like:
>
>   tty->print_cr("will jump to PC " PTR_FORMAT", which should cause a SIGILL.",generate_illegal_instruction_sequence);
>   tty->flush();
>
>   generate_illegal_instruction_sequence(); // boom
>
> dl
>
>
> On 12/2/2014 8:04 AM, Thomas Stüfe wrote:
>
> Hi David, you are a hard man to uncringe :)
>
> Here is a last modification, which in my opinion would be the best
> balance. Basically, it is (2) with the CPU dependent code moved away
> from shared coding and a fallback for CPUs which have no (known) way
> to cause a SIGILL.
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.03/
>
> Kind Regards, Thomas
>
>
> On Tue, Dec 2, 2014 at 10:50 AM, David Holmes wrote:
>
> On 1/12/2014 11:30 PM, Thomas Stüfe wrote:
>
> Hi all,
>
> let's not get this patch bogged down on ARM opcode discussions.
>
> For me, it is just a question of style and which one would be most
> acceptable to the OpenJDK.
>
> As I see it, here are my options:
>
> 1 leave the code as it is and whoever does ARM porting at Oracle will
>   provide the SIGILL opcodes inside debug.cpp
> 2 like (1), but provide a fallback for CPUs where we do not know the
>   SIGILL opcodes right now, by doing a raise(SIGILL). This would work
>   but make the test a tiny bit less valuable on those platforms.
> 3 Move the CPU-dependent parts (the big #ifdef) away from debug.cpp
>   into debug_.cpp. Would mean a bit code duplication because for 3
>   out of 5 cpus the SIGILL-generating opcode is 0. This basically
>   would be the same as my second webrev
>   (http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/)
> 4 like (3), but with additional introduction of a debug_.hpp, and
>   adding a "ZERO_WILL_GENERATE_SIGILL" or somesuch macro to provide a
>   common fallback for cpus where 0 generates SIGILL.
>
> I am leaning toward (2) or (3) but I am okay with any of the four.
>
> I'm really undecided here.
> #1 makes me cringe because of the cpu ifdefs in shared code
> (including those for non-OpenJDK platforms). #3 and #4 make me
> cringe because it is a lot of overhead to introduce the debug_.hpp
> files on all platforms.
>
> That leaves #2 though I'm unclear how we will identify the platforms
> that don't have defined bad opcodes. If that's still just a variant
> of the ifdefs in #1 then I'm still cringing. :)
>
> Would appreciate someone else from runtime jumping in with an opinion
> here :)
>
> David
>
> (PS. I'm on vacation tomorrow so apologies for delayed responses.)
>
>
> Kind Regards,
>
> Thomas Stuefe
>
>
> On Thu, Nov 27, 2014 at 12:04 PM, Andrew Haley wrote:
>
> On 11/27/2014 11:00 AM, David Holmes wrote:
> > On 27/11/2014 8:55 PM, Andrew Haley wrote:
> >> On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
> >>> Hi Andrew, thank you! Does endianness matter ?
> >>
> >> Yes. I'd do it symbolically rather than mess with endian defines:
> >>
> >> #ifdef AARCH64
> >> unsigned insn;
> >> asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
> >> #endif
> >
> > Does that work for ARMv7?
>
> Sorry, I don't know what a good choice there would be. And I must
> warn you: DCPS1 isn't necessarily guaranteed to do this forever, but
> it works on the kernels I've tried.
>
> Andrew.
>

From david.holmes at oracle.com Fri Dec 5 07:19:48 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 05 Dec 2014 17:19:48 +1000
Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified
In-Reply-To: <548156ED.6070502@oracle.com>
References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com> <548156ED.6070502@oracle.com>
Message-ID: <54815C94.9080401@oracle.com>

On 5/12/2014 4:55 PM, Calvin Cheung wrote:
> On 12/4/2014 8:08 PM, David Holmes wrote:
>> On 5/12/2014 12:40 PM, Calvin Cheung wrote:
>>> Hi David,
>>>
>>> On 12/4/2014 3:14 PM, David Holmes wrote:
>>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>>>>> Hi Jiangli,
>>>>>
>>>>> I've updated the webrev at the same location:
>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>>>
>>>> Seems okay. Do we have a test that sets the flags to these allowed
>>>> minimum values and dumps and then uses the archive?
>>>
>>> I've added more test scenarios to the testcase.
>>> Updated webrev is at the same location.
>>
>> These values:
>>
>> ! value = Platform.is64bit() ? "10M" : "8M";
>> ! break;
>> ! case RW:
>> ! value = Platform.is64bit() ? "13M" : "7M";
>>
>> should match these:
>>
>> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M),
>> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M)
>
> For 64-bit, I've changed the testcase to match with the definitions.
> For 32-bit, I've changed the definitions to match with the testcase.
> Otherwise, the test fails with "not enough space".

Precisely what needed to be verified! Looks okay to me now.

Thanks,
David

From jesper.wilhelmsson at oracle.com Fri Dec 5 13:39:42 2014
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Fri, 05 Dec 2014 14:39:42 +0100
Subject: RFR: 6522873 - Java not print "Unrecognized option" when it is invalid option.
Message-ID: <5481B59E.1000601@oracle.com>

Hi,

Please review this patch to make argument parsing stop accepting random
characters at the end of command line flags. This topic was discussed in
hotspot-dev at openjdk.java.net and I strongly believe that this bug
should be reopened and fixed.

Short summary of the problem:
Today some (not all) flags are accepted even though they have random
characters appended to them.
Some examples are -Xconcgc, -Xcomp, -Xboundthreads, -XX:+AlwaysTenure etc which will also be accepted when written for instance -Xconcgcnoway, -Xcomposer, -Xboundthreadstodogs or -XX:+AlwaysTenureAtBlueMoon There is a potential problem here since we will also accept things like -XX:+ExtendedDTraceProbes-XX:+UseG1GC without saying a word (and of course without running with G1). Bug: https://bugs.openjdk.java.net/browse/JDK-6522873 Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ The full list of flags affected by this change is: -Xnoclassgc -Xconcgc -Xnoconcgc -Xbatch -green -native -Xsqnopause -Xrs -Xusealtsigs -Xoptimize -Xprof -Xconcurrentio -Xinternalversion -Xprintflags -Xint -Xmixed -Xcomp -Xshare:dump -Xshare:on -Xshare:auto -Xshare:off -Xdebug -Xnoagent -Xboundthreads vfprintf exit abort -XX:+AggressiveHeap -XX:+NeverTenure -XX:+AlwaysTenure -XX:+CMSPermGenSweepingEnabled -XX:-CMSPermGenSweepingEnabled -XX:+UseGCTimeLimit -XX:-UseGCTimeLimit -XX:+ResizeTLE -XX:-ResizeTLE -XX:+PrintTLE -XX:-PrintTLE -XX:+UseTLE -XX:-UseTLE -XX:+DisplayVMOutputToStderr -XX:+DisplayVMOutputToStdout -XX:+ExtendedDTraceProbes -XX:+FullGCALot -XX:+ManagementServer -XX:+PrintVMOptions -XX:-PrintVMOptions -XX:+IgnoreUnrecognizedVMOptions -XX:-IgnoreUnrecognizedVMOptions -XX:+PrintFlagsInitial -XX:+PrintFlagsWithComments Thanks, /Jesper From calvin.cheung at oracle.com Fri Dec 5 16:47:47 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Fri, 05 Dec 2014 08:47:47 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <54815C94.9080401@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com> <548156ED.6070502@oracle.com> <54815C94.9080401@oracle.com> Message-ID: <5481E1B3.5030106@oracle.com> 
Thanks again - David. Calvin On 12/4/2014 11:19 PM, David Holmes wrote: > On 5/12/2014 4:55 PM, Calvin Cheung wrote: >> On 12/4/2014 8:08 PM, David Holmes wrote: >>> On 5/12/2014 12:40 PM, Calvin Cheung wrote: >>>> Hi David, >>>> >>>> On 12/4/2014 3:14 PM, David Holmes wrote: >>>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote: >>>>>> Hi Jiangli, >>>>>> >>>>>> I've updated the webrev at the same location: >>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>> >>>>> Seems okay. Do we have a test that sets the flags to these allowed >>>>> minimum values and dumps and then uses the archive? >>>> >>>> I've added more test scenarios to the testcase. >>>> Updated webrev is at the same location. >>> >>> These values: >>> >>> ! value = Platform.is64bit() ? "10M" : "8M"; >>> ! break; >>> ! case RW: >>> ! value = Platform.is64bit() ? "13M" : "7M"; >>> >>> should match these: >>> >>> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M), >>> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M) >> >> For 64-bit, I've changed the testcase to match with the definitions. >> For 32-bit, I've changed the definitions to match with the testcase. >> Otherwise, the test fails with "not enough space". > > Precisely what needed to be verified! > > Looks okay to me now. > > Thanks, > David > >> It may have something to do with the following calculation: >> SharedReadOnlySize = align_size_up(SharedReadOnlySize, >> max_alignment); >> SharedReadWriteSize = align_size_up(SharedReadWriteSize, >> max_alignment); >> >> I've updated the webrev at the same location. 
>> >> thanks, >> Calvin >>> >>> David >>> ----- >>> >>>> thanks, >>>> Calvin >>>> >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Previous version is saved at: >>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>>> >>>>>> On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >>>>>>> Hi Calvin, >>>>>>> >>>>>>> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>>>>>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>>>>>>> Hi Calvin, >>>>>>>>> >>>>>>>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>>>>>>> now they are referenced in more than one place. >>>>>>>> So global.hpp will need to include metaspaceShared.hpp. >>>>>>>>> >>>>>>>>> I also have some questions. The 12M/16M are not introduced by >>>>>>>>> this >>>>>>>>> change, do you know why those values were chosen as the >>>>>>>>> default RO >>>>>>>>> and RW sizes? >>>>>>>> Sorry. I don't know the reasons why those values were chosen. >>>>>>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>>>>>>> machines and 16M on 64-bit machine, is it a reasonable >>>>>>>>> requirement? >>>>>>>>> What's the minimum size requirement for the RO and RW spaces with >>>>>>>>> the default classlist? 
>>>>>>>> >>>>>>>> Below are the numbers for the RO and RW spaces with the default >>>>>>>> classlist for various platforms: >>>>>>>> ===== >>>>>>>> linux >>>>>>>> ===== >>>>>>>> 64-bit >>>>>>>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> 32-bit >>>>>>>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >>>>>>>> used] at 0x80000000 >>>>>>>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >>>>>>>> used] at 0x80c00000 >>>>>>>> >>>>>>>> ======== >>>>>>>> windows >>>>>>>> ======== >>>>>>>> 64-bit >>>>>>>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> 32-bit >>>>>>>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >>>>>>>> used] at 0x14690000 >>>>>>>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >>>>>>>> used] at 0x15290000 >>>>>>>> >>>>>>>> ==== >>>>>>>> mac >>>>>>>> ==== >>>>>>>> 64-bit >>>>>>>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> ==== >>>>>>>> >>>>>>>> So maybe we can define some enums as follows and leave the default >>>>>>>> values in globals.hpp alone? >>>>>>>> >>>>>>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>>>>>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >>>>>>> >>>>>>> Sounds good to me. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>>> >>>>>>>> thanks, >>>>>>>> Calvin >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>>>>>>> >>>>>>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, >>>>>>>>>> and SharedReadWriteSize. >>>>>>>>>> >>>>>>>>>> For the SharedMiscDataSize, it is based on >>>>>>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>>>>>>> done for the SharedMiscCodeSize. >>>>>>>>>> >>>>>>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm >>>>>>>>>> checking if >>>>>>>>>> they are at least the default size. >>>>>>>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>>>>>>> least the default size. A default dump of CDS archive >>>>>>>>>> requires >8M >>>>>>>>>> of ro space and >11M of rw space. >>>>>>>>>> >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>>>>>> >>>>>>>>>> tests: >>>>>>>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>>>>>>> JPRT >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> Calvin >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> From jiangli.zhou at oracle.com Fri Dec 5 18:27:59 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 05 Dec 2014 10:27:59 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <548156ED.6070502@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com> <548156ED.6070502@oracle.com> Message-ID: <5481F92F.3000207@oracle.com> Hi Calvin, On 12/04/2014 10:55 PM, Calvin Cheung wrote: > On 12/4/2014 8:08 PM, David Holmes wrote: >> On 5/12/2014 12:40 PM, Calvin Cheung wrote: >>> Hi 
David,
>>>
>>> On 12/4/2014 3:14 PM, David Holmes wrote:
>>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>>>>> Hi Jiangli,
>>>>>
>>>>> I've updated the webrev at the same location:
>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>>>
>>>> Seems okay. Do we have a test that sets the flags to these allowed
>>>> minimum values and dumps and then uses the archive?
>>>
>>> I've added more test scenarios to the testcase.
>>> Updated webrev is at the same location.
>>
>> These values:
>>
>> ! value = Platform.is64bit() ? "10M" : "8M";
>> ! break;
>> ! case RW:
>> ! value = Platform.is64bit() ? "13M" : "7M";
>>
>> should match these:
>>
>> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M),
>> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M)
>
> For 64-bit, I've changed the testcase to match with the definitions.
> For 32-bit, I've changed the definitions to match with the testcase.
> Otherwise, the test fails with "not enough space".
> It may have something to do with the following calculation:
> SharedReadOnlySize = align_size_up(SharedReadOnlySize,
> max_alignment);
> SharedReadWriteSize = align_size_up(SharedReadWriteSize,
> max_alignment);

That's strange. Why would the test fail when using 7M/6M? The minimum aligned sizes should be less than those values, if the status output reports the correct values. Is there a bug that makes the reported memory usage wrong?

Thanks,
Jiangli
>
> I've updated the webrev at the same location.
> > thanks, > Calvin >> >> David >> ----- >> >>> thanks, >>> Calvin >>> >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Previous version is saved at: >>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ >>>>> >>>>> thanks, >>>>> Calvin >>>>> >>>>> On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >>>>>> Hi Calvin, >>>>>> >>>>>> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>>>>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>>>>>> Hi Calvin, >>>>>>>> >>>>>>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>>>>>> now they are referenced in more than one place. >>>>>>> So global.hpp will need to include metaspaceShared.hpp. >>>>>>>> >>>>>>>> I also have some questions. The 12M/16M are not introduced by this >>>>>>>> change, do you know why those values were chosen as the default RO >>>>>>>> and RW sizes? >>>>>>> Sorry. I don't know the reasons why those values were chosen. >>>>>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>>>>>> machines and 16M on 64-bit machine, is it a reasonable >>>>>>>> requirement? >>>>>>>> What's the minimum size requirement for the RO and RW spaces with >>>>>>>> the default classlist? 
>>>>>>> >>>>>>> Below are the numbers for the RO and RW spaces with the default >>>>>>> classlist for various platforms: >>>>>>> ===== >>>>>>> linux >>>>>>> ===== >>>>>>> 64-bit >>>>>>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >>>>>>> used] at 0x0000000800000000 >>>>>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >>>>>>> used] at 0x0000000801000000 >>>>>>> >>>>>>> 32-bit >>>>>>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >>>>>>> used] at 0x80000000 >>>>>>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >>>>>>> used] at 0x80c00000 >>>>>>> >>>>>>> ======== >>>>>>> windows >>>>>>> ======== >>>>>>> 64-bit >>>>>>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >>>>>>> used] at 0x0000000800000000 >>>>>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >>>>>>> used] at 0x0000000801000000 >>>>>>> >>>>>>> 32-bit >>>>>>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >>>>>>> used] at 0x14690000 >>>>>>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >>>>>>> used] at 0x15290000 >>>>>>> >>>>>>> ==== >>>>>>> mac >>>>>>> ==== >>>>>>> 64-bit >>>>>>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >>>>>>> used] at 0x0000000800000000 >>>>>>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >>>>>>> used] at 0x0000000801000000 >>>>>>> >>>>>>> ==== >>>>>>> >>>>>>> So maybe we can define some enums as follows and leave the default >>>>>>> values in globals.hpp alone? >>>>>>> >>>>>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>>>>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >>>>>> >>>>>> Sounds good to me. 
>>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>>> >>>>>>> thanks, >>>>>>> Calvin >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli >>>>>>>> >>>>>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>>>>>> >>>>>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, >>>>>>>>> and SharedReadWriteSize. >>>>>>>>> >>>>>>>>> For the SharedMiscDataSize, it is based on >>>>>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>>>>>> done for the SharedMiscCodeSize. >>>>>>>>> >>>>>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm >>>>>>>>> checking if >>>>>>>>> they are at least the default size. >>>>>>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>>>>>> least the default size. A default dump of CDS archive requires >>>>>>>>> >8M >>>>>>>>> of ro space and >11M of rw space. >>>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>>>>> >>>>>>>>> tests: >>>>>>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>>>>>> JPRT >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> Calvin >>>>>>>> >>>>>>> >>>>>> >>>>> >>> > From calvin.cheung at oracle.com Fri Dec 5 18:44:09 2014 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Fri, 05 Dec 2014 10:44:09 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <5481F92F.3000207@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com> <548156ED.6070502@oracle.com> <5481F92F.3000207@oracle.com> Message-ID: <5481FCF9.7000500@oracle.com> On 12/5/2014 10:27 AM, Jiangli Zhou wrote: > Hi Calvin, > > On 12/04/2014 10:55 PM, Calvin Cheung wrote: >> On 12/4/2014 8:08 PM, David Holmes wrote: >>> On 5/12/2014 
12:40 PM, Calvin Cheung wrote:
>>>> Hi David,
>>>>
>>>> On 12/4/2014 3:14 PM, David Holmes wrote:
>>>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote:
>>>>>> Hi Jiangli,
>>>>>>
>>>>>> I've updated the webrev at the same location:
>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/
>>>>>
>>>>> Seems okay. Do we have a test that sets the flags to these allowed
>>>>> minimum values and dumps and then uses the archive?
>>>>
>>>> I've added more test scenarios to the testcase.
>>>> Updated webrev is at the same location.
>>>
>>> These values:
>>>
>>> ! value = Platform.is64bit() ? "10M" : "8M";
>>> ! break;
>>> ! case RW:
>>> ! value = Platform.is64bit() ? "13M" : "7M";
>>>
>>> should match these:
>>>
>>> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M),
>>> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M)
>>
>> For 64-bit, I've changed the testcase to match with the definitions.
>> For 32-bit, I've changed the definitions to match with the testcase.
>> Otherwise, the test fails with "not enough space".
>> It may have something to do with the following calculation:
>> SharedReadOnlySize = align_size_up(SharedReadOnlySize,
>> max_alignment);
>> SharedReadWriteSize = align_size_up(SharedReadWriteSize,
>> max_alignment);
>
> That's strange. Why would the test fail when using 7M/6M? The minimum
> aligned sizes should be less than those value, if the status output
> reports the correct values. Is there any bug that causes the sizes of
> the reported memory usage wrong?

It looks like running it with a fastdebug build on linux requires slightly bigger ro and rw sizes:

ro space: 7720800 [ 50.5% of total] out of 12582912 bytes [61.4% used] at 0x80000000
rw space: 6325080 [ 41.3% of total] out of 7340032 bytes [86.2% used] at 0x80c00000

The numbers I listed before were from a release build.

thanks,
Calvin
>
> Thanks,
> Jiangli
>
>>
>> I've updated the webrev at the same location.
>> >> thanks, >> Calvin >>> >>> David >>> ----- >>> >>>> thanks, >>>> Calvin >>>> >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Previous version is saved at: >>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ >>>>>> >>>>>> thanks, >>>>>> Calvin >>>>>> >>>>>> On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >>>>>>> Hi Calvin, >>>>>>> >>>>>>> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>>>>>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>>>>>>> Hi Calvin, >>>>>>>>> >>>>>>>>> It's better to define 12M and 16M as enums in metaspaceShared.hpp >>>>>>>>> now they are referenced in more than one place. >>>>>>>> So global.hpp will need to include metaspaceShared.hpp. >>>>>>>>> >>>>>>>>> I also have some questions. The 12M/16M are not introduced by >>>>>>>>> this >>>>>>>>> change, do you know why those values were chosen as the >>>>>>>>> default RO >>>>>>>>> and RW sizes? >>>>>>>> Sorry. I don't know the reasons why those values were chosen. >>>>>>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>>>>>>> machines and 16M on 64-bit machine, is it a reasonable >>>>>>>>> requirement? >>>>>>>>> What's the minimum size requirement for the RO and RW spaces with >>>>>>>>> the default classlist? 
>>>>>>>> >>>>>>>> Below are the numbers for the RO and RW spaces with the default >>>>>>>> classlist for various platforms: >>>>>>>> ===== >>>>>>>> linux >>>>>>>> ===== >>>>>>>> 64-bit >>>>>>>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes [50.3% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes [68.1% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> 32-bit >>>>>>>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes [50.2% >>>>>>>> used] at 0x80000000 >>>>>>>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes [46.0% >>>>>>>> used] at 0x80c00000 >>>>>>>> >>>>>>>> ======== >>>>>>>> windows >>>>>>>> ======== >>>>>>>> 64-bit >>>>>>>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes [47.0% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes [63.8% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> 32-bit >>>>>>>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes [47.9% >>>>>>>> used] at 0x14690000 >>>>>>>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes [43.2% >>>>>>>> used] at 0x15290000 >>>>>>>> >>>>>>>> ==== >>>>>>>> mac >>>>>>>> ==== >>>>>>>> 64-bit >>>>>>>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes [40.5% >>>>>>>> used] at 0x0000000800000000 >>>>>>>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes [56.3% >>>>>>>> used] at 0x0000000801000000 >>>>>>>> >>>>>>>> ==== >>>>>>>> >>>>>>>> So maybe we can define some enums as follows and leave the default >>>>>>>> values in globals.hpp alone? >>>>>>>> >>>>>>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>>>>>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >>>>>>> >>>>>>> Sounds good to me. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>>> >>>>>>>> thanks, >>>>>>>> Calvin >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>>>>>>> >>>>>>>>>> Adding more checks on the SharedMiscDataSize, ShareReadOnlySize, >>>>>>>>>> and SharedReadWriteSize. >>>>>>>>>> >>>>>>>>>> For the SharedMiscDataSize, it is based on >>>>>>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>>>>>>> done for the SharedMiscCodeSize. >>>>>>>>>> >>>>>>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm >>>>>>>>>> checking if >>>>>>>>>> they are at least the default size. >>>>>>>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>>>>>>> least the default size. A default dump of CDS archive >>>>>>>>>> requires >8M >>>>>>>>>> of ro space and >11M of rw space. >>>>>>>>>> >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>>>>>> >>>>>>>>>> tests: >>>>>>>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>>>>>>> JPRT >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> Calvin >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> > From jiangli.zhou at oracle.com Fri Dec 5 18:52:25 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 05 Dec 2014 10:52:25 -0800 Subject: RFR(XS): 8065050: vm crashes during CDS dump when very small SharedMiscDataSize is specified In-Reply-To: <5481FCF9.7000500@oracle.com> References: <547CCB30.6010806@oracle.com> <547F94C9.7020002@oracle.com> <547FA1FB.1070801@oracle.com> <547FA291.5050407@oracle.com> <547FBB89.1080409@oracle.com> <5480EACE.7000605@oracle.com> <54811B1E.6030901@oracle.com> <54812FC5.2000505@oracle.com> <548156ED.6070502@oracle.com> <5481F92F.3000207@oracle.com> <5481FCF9.7000500@oracle.com> Message-ID: <5481FEE9.7060508@oracle.com> On 12/05/2014 10:44 AM, Calvin Cheung wrote: > On 12/5/2014 10:27 AM, Jiangli Zhou wrote: >> 
Hi Calvin, >> >> On 12/04/2014 10:55 PM, Calvin Cheung wrote: >>> On 12/4/2014 8:08 PM, David Holmes wrote: >>>> On 5/12/2014 12:40 PM, Calvin Cheung wrote: >>>>> Hi David, >>>>> >>>>> On 12/4/2014 3:14 PM, David Holmes wrote: >>>>>> On 4/12/2014 11:40 AM, Calvin Cheung wrote: >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> I've updated the webrev at the same location: >>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>> >>>>>> Seems okay. Do we have a test that sets the flags to these allowed >>>>>> minimum values and dumps and then uses the archive? >>>>> >>>>> I've added more test scenarios to the testcase. >>>>> Updated webrev is at the same location. >>>> >>>> These values: >>>> >>>> ! value = Platform.is64bit() ? "10M" : "8M"; >>>> ! break; >>>> ! case RW: >>>> ! value = Platform.is64bit() ? "13M" : "7M"; >>>> >>>> should match these: >>>> >>>> min_ro_size = NOT_LP64(7*M) LP64_ONLY(9*M), >>>> min_rw_size = NOT_LP64(6*M) LP64_ONLY(12*M) >>> >>> For 64-bit, I've changed the testcase to match with the definitions. >>> For 32-bit, I've changed the definitions to match with the testcase. >>> Otherwise, the test fails with "not enough space". >>> It may have something to do with the following calculation: >>> SharedReadOnlySize = align_size_up(SharedReadOnlySize, >>> max_alignment); >>> SharedReadWriteSize = align_size_up(SharedReadWriteSize, >>> max_alignment); >> >> That's strange. Why would the test fail when using 7M/6M? The minimum >> aligned sizes should be less than those value, if the status output >> reports the correct values. Is there any bug that causes the sizes of >> the reported memory usage wrong? > It looks like running it with a fastdebug build on linux requires a > slightly bigger ro and rw sizes: > ro space: 7720800 [ 50.5% of total] out of 12582912 bytes [61.4% > used] at 0x80000000 > rw space: 6325080 [ 41.3% of total] out of 7340032 bytes [86.2% > used] at 0x80c00000 > > The numbers I listed before were from a release build. Ok. 
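The arithmetic behind the 32-bit minimums discussed above can be checked with a short sketch. It assumes only that a space must be at least as large as the bytes reported as used; the helper names megabytes_needed and fits are hypothetical (not HotSpot functions), and the extra rounding done by align_size_up is not modelled.

```cpp
#include <cstddef>

// One mebibyte; stands in for HotSpot's M constant (assumption: M == 1024 * 1024).
constexpr std::size_t M = 1024 * 1024;

// Hypothetical helper: smallest whole number of megabytes that can hold used_bytes.
constexpr std::size_t megabytes_needed(std::size_t used_bytes) {
  return (used_bytes + M - 1) / M;
}

// Hypothetical helper: true when a space of size_mb megabytes can hold used_bytes.
constexpr bool fits(std::size_t used_bytes, std::size_t size_mb) {
  return used_bytes <= size_mb * M;
}
```

Fed the 32-bit linux figures from this thread, the release build (ro 6316488, rw 5794312) fits in 7M/6M, while the fastdebug build (ro 7720800, rw 6325080) needs 8M/7M, which is consistent with the 32-bit test values of "8M" and "7M".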
Thanks, Jiangli > > thanks, > Calvin > >> >> Thanks, >> Jiangli >> >>> >>> I've updated the webrev at the same location. >>> >>> thanks, >>> Calvin >>>> >>>> David >>>> ----- >>>> >>>>> thanks, >>>>> Calvin >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Previous version is saved at: >>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev.00/ >>>>>>> >>>>>>> thanks, >>>>>>> Calvin >>>>>>> >>>>>>> On 12/3/2014 3:53 PM, Jiangli Zhou wrote: >>>>>>>> Hi Calvin, >>>>>>>> >>>>>>>> On 12/03/2014 03:51 PM, Calvin Cheung wrote: >>>>>>>>> On 12/3/2014 2:55 PM, Jiangli Zhou wrote: >>>>>>>>>> Hi Calvin, >>>>>>>>>> >>>>>>>>>> It's better to define 12M and 16M as enums in >>>>>>>>>> metaspaceShared.hpp >>>>>>>>>> now they are referenced in more than one place. >>>>>>>>> So global.hpp will need to include metaspaceShared.hpp. >>>>>>>>>> >>>>>>>>>> I also have some questions. The 12M/16M are not introduced by >>>>>>>>>> this >>>>>>>>>> change, do you know why those values were chosen as the >>>>>>>>>> default RO >>>>>>>>>> and RW sizes? >>>>>>>>> Sorry. I don't know the reasons why those values were chosen. >>>>>>>>>> Now we require both spaces have to be at lease 12M on 32-bit >>>>>>>>>> machines and 16M on 64-bit machine, is it a reasonable >>>>>>>>>> requirement? >>>>>>>>>> What's the minimum size requirement for the RO and RW spaces >>>>>>>>>> with >>>>>>>>>> the default classlist? 
>>>>>>>>> >>>>>>>>> Below are the numbers for the RO and RW spaces with the default >>>>>>>>> classlist for various platforms: >>>>>>>>> ===== >>>>>>>>> linux >>>>>>>>> ===== >>>>>>>>> 64-bit >>>>>>>>> ro space: 8433480 [ 37.8% of total] out of 16777216 bytes >>>>>>>>> [50.3% >>>>>>>>> used] at 0x0000000800000000 >>>>>>>>> rw space: 11418608 [ 51.1% of total] out of 16777216 bytes >>>>>>>>> [68.1% >>>>>>>>> used] at 0x0000000801000000 >>>>>>>>> >>>>>>>>> 32-bit >>>>>>>>> ro space: 6316488 [ 48.9% of total] out of 12582912 bytes >>>>>>>>> [50.2% >>>>>>>>> used] at 0x80000000 >>>>>>>>> rw space: 5794312 [ 44.9% of total] out of 12582912 bytes >>>>>>>>> [46.0% >>>>>>>>> used] at 0x80c00000 >>>>>>>>> >>>>>>>>> ======== >>>>>>>>> windows >>>>>>>>> ======== >>>>>>>>> 64-bit >>>>>>>>> ro space: 7888680 [ 37.7% of total] out of 16777216 bytes >>>>>>>>> [47.0% >>>>>>>>> used] at 0x0000000800000000 >>>>>>>>> rw space: 10704496 [ 51.1% of total] out of 16777216 bytes >>>>>>>>> [63.8% >>>>>>>>> used] at 0x0000000801000000 >>>>>>>>> >>>>>>>>> 32-bit >>>>>>>>> ro space: 6030640 [ 49.3% of total] out of 12582912 bytes >>>>>>>>> [47.9% >>>>>>>>> used] at 0x14690000 >>>>>>>>> rw space: 5440904 [ 44.5% of total] out of 12582912 bytes >>>>>>>>> [43.2% >>>>>>>>> used] at 0x15290000 >>>>>>>>> >>>>>>>>> ==== >>>>>>>>> mac >>>>>>>>> ==== >>>>>>>>> 64-bit >>>>>>>>> ro space: 6798968 [ 37.0% of total] out of 16777216 bytes >>>>>>>>> [40.5% >>>>>>>>> used] at 0x0000000800000000 >>>>>>>>> rw space: 9446240 [ 51.4% of total] out of 16777216 bytes >>>>>>>>> [56.3% >>>>>>>>> used] at 0x0000000801000000 >>>>>>>>> >>>>>>>>> ==== >>>>>>>>> >>>>>>>>> So maybe we can define some enums as follows and leave the >>>>>>>>> default >>>>>>>>> values in globals.hpp alone? >>>>>>>>> >>>>>>>>> min_ro_size NOT_LP64(7*M) LP64_ONLY(9*M) >>>>>>>>> min_rw_size NOT_LP64(6*M) LP64_ONLY(12*M) >>>>>>>> >>>>>>>> Sounds good to me. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli >>>>>>>> >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> Calvin >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jiangli >>>>>>>>>> >>>>>>>>>> On 12/01/2014 12:10 PM, Calvin Cheung wrote: >>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8065050 >>>>>>>>>>> >>>>>>>>>>> Adding more checks on the SharedMiscDataSize, >>>>>>>>>>> ShareReadOnlySize, >>>>>>>>>>> and SharedReadWriteSize. >>>>>>>>>>> >>>>>>>>>>> For the SharedMiscDataSize, it is based on >>>>>>>>>>> MetaspaceShared::generate_vtable_methods(). Similar to what was >>>>>>>>>>> done for the SharedMiscCodeSize. >>>>>>>>>>> >>>>>>>>>>> For the ShareReadOnlySize and SharedReadWriteSize, I'm >>>>>>>>>>> checking if >>>>>>>>>>> they are at least the default size. >>>>>>>>>>> I think it's reasonable to enforce the ro and rw sizes to be at >>>>>>>>>>> least the default size. A default dump of CDS archive >>>>>>>>>>> requires >8M >>>>>>>>>>> of ro space and >11M of rw space. >>>>>>>>>>> >>>>>>>>>>> webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ccheung/8065050/webrev/ >>>>>>>>>>> >>>>>>>>>>> tests: >>>>>>>>>>> ran the testcase via jtreg on linux_x64 and windows_x64 >>>>>>>>>>> JPRT >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> Calvin >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> >> > From max.ockner at oracle.com Fri Dec 5 20:30:39 2014 From: max.ockner at oracle.com (Max Ockner) Date: Fri, 05 Dec 2014 15:30:39 -0500 Subject: RFR: 6522873 - Java not print "Unrecognized option" when it is invalid option. In-Reply-To: <5481B59E.1000601@oracle.com> References: <5481B59E.1000601@oracle.com> Message-ID: <548215EF.8020508@oracle.com> Jesper, I have reviewed your change (I am not an official reviewer yet, but I have recent experience with this code). True, there was a recent discussion about this issue, but only obsolete arguments were targeted. I have inspected your change, and I am convinced that there isn't really a better way to correct this additional issue. 
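A minimal sketch of the failure mode under review, assuming the parser compares only a prefix of each option string. The helpers match_prefix and match_exact are hypothetical names for illustration and are not taken from the webrev or from arguments.cpp.

```cpp
#include <cstring>

// Hypothetical prefix matcher: succeeds when option starts with name and
// reports whatever follows the name through tail.
static bool match_prefix(const char* option, const char* name, const char** tail) {
  std::size_t len = std::strlen(name);
  if (std::strncmp(option, name, len) == 0) {
    *tail = option + len;
    return true;
  }
  return false;
}

// Hypothetical stricter matcher: the option is accepted only when nothing
// follows the flag name, so trailing characters are rejected.
static bool match_exact(const char* option, const char* name) {
  const char* tail = nullptr;
  return match_prefix(option, name, &tail) && *tail == '\0';
}
```

With these helpers, "-Xcomposer" passes match_prefix against "-Xcomp" but fails match_exact, and the same holds for "-XX:+ExtendedDTraceProbes-XX:+UseG1GC" checked against "-XX:+ExtendedDTraceProbes".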
Max O

The only other solution I can think of is to move all flags into the flagTable, since the code which checks tabled flags already seems to work. But I'm no longer

On 12/5/2014 8:39 AM, Jesper Wilhelmsson wrote:
> Hi,
>
> Please review this patch to make argument parsing stop accepting
> random characters at the end of command line flags. This topic was
> discussed in hotspot-dev at openjdk.java.net and I strongly believe that
> this bug should be reopened and fixed.
>
> Short summary of the problem:
> Today some (not all) flags are accepted even though they have random
> characters appended to them. Some examples are -Xconcgc, -Xcomp,
> -Xboundthreads, -XX:+AlwaysTenure etc which will also be accepted when
> written for instance -Xconcgcnoway, -Xcomposer, -Xboundthreadstodogs
> or -XX:+AlwaysTenureAtBlueMoon
>
> There is a potential problem here since we will also accept things
> like -XX:+ExtendedDTraceProbes-XX:+UseG1GC without saying a word (and
> of course without running with G1).
> > Bug: https://bugs.openjdk.java.net/browse/JDK-6522873 > Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ > > > The full list of flags affected by this change is: > > -Xnoclassgc > -Xconcgc > -Xnoconcgc > -Xbatch > -green > -native > -Xsqnopause > -Xrs > -Xusealtsigs > -Xoptimize > -Xprof > -Xconcurrentio > -Xinternalversion > -Xprintflags > -Xint > -Xmixed > -Xcomp > -Xshare:dump > -Xshare:on > -Xshare:auto > -Xshare:off > -Xdebug > -Xnoagent > -Xboundthreads > vfprintf > exit > abort > -XX:+AggressiveHeap > -XX:+NeverTenure > -XX:+AlwaysTenure > -XX:+CMSPermGenSweepingEnabled > -XX:-CMSPermGenSweepingEnabled > -XX:+UseGCTimeLimit > -XX:-UseGCTimeLimit > -XX:+ResizeTLE > -XX:-ResizeTLE > -XX:+PrintTLE > -XX:-PrintTLE > -XX:+UseTLE > -XX:-UseTLE > -XX:+DisplayVMOutputToStderr > -XX:+DisplayVMOutputToStdout > -XX:+ExtendedDTraceProbes > -XX:+FullGCALot > -XX:+ManagementServer > -XX:+PrintVMOptions > -XX:-PrintVMOptions > -XX:+IgnoreUnrecognizedVMOptions > -XX:-IgnoreUnrecognizedVMOptions > -XX:+PrintFlagsInitial > -XX:+PrintFlagsWithComments > > > Thanks, > /Jesper From john.r.rose at oracle.com Fri Dec 5 22:15:25 2014 From: john.r.rose at oracle.com (John Rose) Date: Fri, 5 Dec 2014 14:15:25 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <5480DD6E.7050509@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> 
<547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> Message-ID: <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com>

On Dec 4, 2014, at 2:17 PM, Ioi Lam wrote:
>
>>>> 163 *p++ = juint(base_address >> 32);
>>>> 167 *p++ = juint(base_address & 0xffffffff);
>>>>
>>>> 205 juint upper = *p++;
>>>> 206 juint lower = *p++;
>>>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>>>
>>>
>>> Actually it would have problem on 32-bit platforms. The behaviour of shift by greater than or equal to the number of bits that exist in the operand is undefined. Gcc gives warning about the >>32 on linux-x86.

The use of "raw constants" 32 and 0xffffffff is an anti-pattern, a clue that something better can be done.
https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-NamedCons

In this case, we can get safer, more portable code by using functions from globalDefinitions.hpp. Let's use those functions when they are available, instead of raw, manual C shift operators.

    163 *p++ = high(base_address);
    167 *p++ = low(base_address);

    205 juint upper = *p++;
    206 juint lower = *p++;
    208 _base_address = jlong_from(upper, lower);

The statistics on bucket size are very interesting. It's particularly interesting (a little surprising to me) that reducing average bucket size below 4 doesn't seem to help performance. That suggests that cache line scale (bucket of size four "just happens" to be 64 bytes = x86 cache line) dominates the performance.

In that case, and given that length=1 is only 6% of buckets, I think we could drop the special 'COMPACT_BUCKET_TYPE'.

Getting rid of the bucket length table is good progress. A standard trick for this kind of "differential" data structure is to regularize the code by duplicating the value in _table_end_offset at the end of the _buckets array at _buckets[_bucket_count]. Then you won't need the extra check "if (index == int(_bucket_count - 1))".
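The two suggestions in this message can be illustrated with a small standalone sketch. It does not use the real helpers from globalDefinitions.hpp; `high_half`, `low_half`, `join_halves`, `CompactBuckets` and `make_buckets` are hypothetical names chosen for this example, and the offsets are made-up values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Named constants instead of raw 32 and 0xffffffff (hypothetical names).
constexpr unsigned kHalfBits = 32;
constexpr uint64_t kLowMask = (uint64_t{1} << kHalfBits) - 1;

// The value is always 64 bits wide here, so the shift count never reaches
// the operand width, even when the code is built for a 32-bit target.
inline uint32_t high_half(uint64_t v) { return static_cast<uint32_t>(v >> kHalfBits); }
inline uint32_t low_half(uint64_t v)  { return static_cast<uint32_t>(v & kLowMask); }

// Widen the upper half before shifting so that no bits are lost.
inline uint64_t join_halves(uint32_t hi, uint32_t lo) {
  return (uint64_t{hi} << kHalfBits) | lo;
}

// Offset table with one extra sentinel entry: offsets[bucket_count] holds the
// end offset of the whole table, so bucket i always spans
// [offsets[i], offsets[i + 1]) and the last bucket needs no special case.
struct CompactBuckets {
  std::vector<uint32_t> offsets;  // bucket_count + 1 entries

  std::size_t bucket_count() const { return offsets.size() - 1; }
  uint32_t bucket_size(std::size_t i) const { return offsets[i + 1] - offsets[i]; }
};

inline CompactBuckets make_buckets(const std::vector<uint32_t>& starts,
                                   uint32_t table_end_offset) {
  CompactBuckets b{starts};
  b.offsets.push_back(table_end_offset);  // duplicate the end offset as sentinel
  return b;
}
```

The cost of the sentinel is one extra table entry; in exchange every size lookup is the same subtraction.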
John From jiangli.zhou at oracle.com Sat Dec 6 00:44:35 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 05 Dec 2014 16:44:35 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com> Message-ID: <54825173.5020008@oracle.com> Hi John, Thank you for the feedback. And thank you again for all the great suggestions! On 12/05/2014 02:15 PM, John Rose wrote: > On Dec 4, 2014, at 2:17 PM, Ioi Lam > wrote: >> >>>>> 163 *p++ = juint(base_address >> 32); >>>>> 167 *p++ = juint(base_address & 0xffffffff); >>>>> >>>>> 205 juint upper = *p++; >>>>> 206 juint lower = *p++; >>>>> 208 _base_address = (uintx(upper) << 32 ) + uintx(lower); >>>>> >>>> >>>> Actually it would have problem on 32-bit platforms. The behaviour >>>> of shift by greater than or equal to the number of bits that exist >>>> in the operand is undefined. Gcc gives warning about the >>32 on >>>> linux-x86. > > The use of "raw constants" 32 and 0xffffffff is an anti-pattern, a > clue that something better can be done. 
> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-NamedCons > > In this case, we can get safer, more portable code by using functions > from globalDefinitions.hpp. Let's use those functions when they are > available, instead of raw, manual C shift operators. > > 163 *p++ = high(base_address); > 167 *p++ = low(base_address); > > 205 juint upper = *p++; > 206 juint lower = *p++; > 208 _base_address = jlong_from(upper, lower); Good to know there are existing APIs. I'll change to use those. > > The statistics on bucket size are very interesting. It's particularly > interesting (a little surprising to me) that reducing average bucket > size below 4 doesn't seem to help performance. That suggests that > cache line scale (bucket of size four "just happens" to be 64 bytes = > x86 cache line) dominates the performance. > > In that case, and given that length=1 is only 6% of buckets, I think > we could drop the special 'COMPACT_BUCKET_TYPE'. I wonder if it would help more when we also add the support for string table. With a lower hash table load factor, the percentage of buckets with one entry increases. I'm inclined to leave it in if there is no strong objection. > > Getting rid of the bucket length table is good progress. A standard > trick for this kind of "differential" data structure is to regularize > the code by duplicating the value in _table_end_offset at the end of > the _buckets array at _buckets[_bucket_count]. Then you won't need > the extra check "if (index == int(_bucket_count - 1))". That's a good trick. I'll make the change. Thanks! Jiangli > > ? 
John From john.r.rose at oracle.com Sat Dec 6 04:54:28 2014 From: john.r.rose at oracle.com (John Rose) Date: Fri, 5 Dec 2014 20:54:28 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <54825173.5020008@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com> <54825173.5020008@oracle.com> Message-ID: On Dec 5, 2014, at 4:44 PM, Jiangli Zhou wrote: > > I wonder if it would help more when we also add the support for string table. With a lower hash table load factor, the percentage of buckets with one entry increases. I'm inclined to leave it in if there is no strong objection. Yes, it will help with a lower load factor. I'm OK either way. (Would still like to reduce the number of indirections, but that's for later.) ? 
John From jiangli.zhou at oracle.com Sat Dec 6 05:05:31 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 05 Dec 2014 21:05:31 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com> <54825173.5020008@oracle.com> Message-ID: <54828E9B.10004@oracle.com> Ok. Thanks, John. Jiangli On 12/05/2014 08:54 PM, John Rose wrote: > On Dec 5, 2014, at 4:44 PM, Jiangli Zhou > wrote: >> >> I wonder if it would help more when we also add the support for >> string table. With a lower hash table load factor, the percentage of >> buckets with one entry increases. I'm inclined to leave it in if >> there is no strong objection. > > Yes, it will help with a lower load factor. I'm OK either way. > (Would still like to reduce the number of indirections, but that's > for later.) ? John From david.holmes at oracle.com Mon Dec 8 06:25:57 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 08 Dec 2014 16:25:57 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. 
In-Reply-To: <547DCCFB.3050209@gmail.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> Message-ID: <54854475.20800@oracle.com>

Hi Yasumasa,

I'm okay with these changes. Just a minor style nit (no need for updated webrev) can you remove the blank lines in os_linux.cpp:

6011 }
6012
6013 }
6014
6015 }

6057 }
6058
6059 }
6060
6061 }

If anyone has any objections please raise them asap.

Thanks,
David

On 3/12/2014 12:30 AM, Yasumasa Suenaga wrote:
> Hi David, Thomas,
>
> I've uploaded new webrev:
> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.04/
>
>>> I want to rewrite a patch as below:
>>>
>>> - Use async signal safety functions.
>>> fopen -> open, fgets -> read, etc.
>>
>> This is commendable if it is practical, but error reporting already
>> does many, many things that are not async-signal safe, so there is no
>> need to go to extreme measures here.
>
> I've used async-signal safe functions as possible.
>
>
>>> - Use O_BUFLEN for buffer size.
>>> O_BUFLEN is defined to 2000 in ostream.hpp .
>>> This macro is used in various points.
>>> VMError::coredump_message is
>>> also defined with this value.
>>>
>>>
>>> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max.
>>> length of temporary buffers to assemble an output line. And then it
>>> spread a bit. But your intend is to hold a path and using PATH_MAX
>>> clearly documents this.
>
> I've used PATH_MAX again.
>
>
>>> And, to really nitpick, right now you do not handle ERANGE with
>>> get_current_path() (if the provided buffer is too small), which is
>>> probably fine because it is improbable that a path is larger than
>>> PATH_MAX. But if you change the size of the buffer to something which
>>> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may
>>> fail.
>
> If get_current_path() call is failed in get_core_path(), get_core_path()
> returns immediately with 0.
> Caller (check_or_create_dump()) handles this result as illegal state. > > get_current_path() calls getcwd() only and redirects result to caller. > So result of this function is NULL, we can judge getcwd() was finished with > error. > I think it is enough. > > >>> I like your patch, I think it could be a nice time safer when >>> core_pattern is something unusual. But I also see Staffans point of >>> too-much-complexity. So I will keep out of this discussion until the >>> real Reviewers decided what to do :) >> >> I have a hard time evaluating the merits of the patch as I don't work >> in an environment where this extra info is needed. But I take it on >> good faith that it is useful for the context Yasumasa describes. > > I want to suggest to Java user where coredump is. > Modern Linux distribution(s) contains ABRT. > OS can dump corefile automatically despite a lack of setting coredump > resource by user. > > I'm support engineer of Java. My customer says "coredump does not found.", > but coredump is saved by ABRT. > Thus I want them to know "coredump is available" through stderr and > hs_err immediately. > I belive it is first step of troubleshoot. > > > Thanks, > > Yasumasa > > > (2014/12/02 18:40), David Holmes wrote: >> On 1/12/2014 10:57 PM, Thomas St?fe wrote: >>> Hi Yasumasa, >>> >>> On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga >> > wrote: >>> >>> Hi Thomas, David, >>> >>> Sorry, I didn't think about async signal safety. >>> >>> That would work, VmError::report_and_die() is singlethreaded. At >>> least the part which dumps out the core file name. >>> >>> >>> I think that signal handler (in this case) may run concurrency with >>> other thread. >>> If another thread calls malloc(3) in JNI, C Heap corruption may >>> occur. >>> >>> >>> No, malloc(3) should be thread safe on our platforms. But this was not >>> the point. 
If I understood David right, he suggested using a static >>> buffer inside get_core_path() for assembling the core path, which would >>> make get_core_path() thread-unsafe (multiple threads calling it would >>> get garbled results). But as get_core_path() is only called from within >>> VmError::report_and_die() and that section is only ever executed by one >>> thread, Davids suggestion would still work. >> >> Yes that is what I was suggesting. >> >>> I want to rewrite a patch as below: >>> >>> - Use async signal safety functions. >>> fopen -> open, fgets -> read, etc. >> >> This is commendable if it is practical, but error reporting already >> does many, many things that are not async-signal safe, so there is no >> need to go to extreme measures here. >> >>> - Use O_BUFLEN for buffer size. >>> O_BUFLEN is defined to 2000 in ostream.hpp . >>> This macro is used in various points. >>> VMError::coredump_message is >>> also defined with this value. >>> >>> >>> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. >>> length of temporary buffers to assemble an output line. And then it >>> spread a bit. But your intend is to hold a path and using PATH_MAX >>> clearly documents this. >>> And, to really nitpick, right now you do not handle ERANGE with >>> get_current_path() (if the provided buffer is too small), which is >>> probably fine because it is improbable that a path is larger than >>> PATH_MAX. But if you change the size of the buffer to something which >>> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may >>> fail. >>> >>> I like your patch, I think it could be a nice time safer when >>> core_pattern is something unusual. But I also see Staffans point of >>> too-much-complexity. So I will keep out of this discussion until the >>> real Reviewers decided what to do :) >> >> I have a hard time evaluating the merits of the patch as I don't work >> in an environment where this extra info is needed. 
But I take it on >> good faith that it is useful for the context Yasumasa describes. >> >> David >> >>> Kind Regards, Thomas >>> From thomas.stuefe at gmail.com Mon Dec 8 10:27:16 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 8 Dec 2014 11:27:16 +0100 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <54854475.20800@oracle.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> Message-ID: Hi, I do not really like the handling of the leading pipe symbol: So, we read the core_pattern, and if the pipe symbol is detected, we write the core pattern minus the pipe symbol but plus a leading quote to the output; the leading quote then serves as a info to the layer above in os_posix.cpp to treat this case specially. This means the logic spills out of the platform dependend os_linux.cpp to shared code and this is also difficult to read. This comes from the fact that "get_core_path()" assumes the core file is written to the file system. I think it just does not fit anymore, better would be to replace it with something like "os::print_core_file_location(outputStream* os)", and the OS handles both core path retrieval and the printing. Because then the shared code does not need to know whether core file gets printed traditionally or piped to a executable or whatever. Kind regards, Thomas On Mon, Dec 8, 2014 at 7:25 AM, David Holmes wrote: > Hi Yasumasa, > > I'm okay with these changes. Just a minor style nit (no need for updated > webrev) can you remove the blank lines in os_linux.cpp: > > 6011 } > 6012 > 6013 } > 6014 > 6015 } > > 6057 } > 6058 > 6059 } > 6060 > 6061 } > > If anyone has any objections please raise them asap. > > Thanks, > David > > > On 3/12/2014 12:30 AM, Yasumasa Suenaga wrote: > >> Hi David, Thomas, >> >> I've uploaded new webrev: >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.04/ >> >> I want to rewrite a patch as below: >>>> >>>> - Use async signal safety functions. >>>> fopen -> open, fgets -> read, etc. >>>> >>> >>> This is commendable if it is practical, but error reporting already >>> does many, many things that are not async-signal safe, so there is no >>> need to go to extreme measures here. 
>>> >> >> I've used async-signal safe functions as possible. >> >> >> - Use O_BUFLEN for buffer size. >>>> O_BUFLEN is defined to 2000 in ostream.hpp . >>>> This macro is used in various points. >>>> VMError::coredump_message is >>>> also defined with this value. >>>> >>>> >>>> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. >>>> length of temporary buffers to assemble an output line. And then it >>>> spread a bit. But your intend is to hold a path and using PATH_MAX >>>> clearly documents this. >>>> >>> >> I've used PATH_MAX again. >> >> >> And, to really nitpick, right now you do not handle ERANGE with >>>> get_current_path() (if the provided buffer is too small), which is >>>> probably fine because it is improbable that a path is larger than >>>> PATH_MAX. But if you change the size of the buffer to something which >>>> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may >>>> fail. >>>> >>> >> If get_current_path() call is failed in get_core_path(), get_core_path() >> returns immediately with 0. >> Caller (check_or_create_dump()) handles this result as illegal state. >> >> get_current_path() calls getcwd() only and redirects result to caller. >> So result of this function is NULL, we can judge getcwd() was finished >> with >> error. >> I think it is enough. >> >> >> I like your patch, I think it could be a nice time safer when >>>> core_pattern is something unusual. But I also see Staffans point of >>>> too-much-complexity. So I will keep out of this discussion until the >>>> real Reviewers decided what to do :) >>>> >>> >>> I have a hard time evaluating the merits of the patch as I don't work >>> in an environment where this extra info is needed. But I take it on >>> good faith that it is useful for the context Yasumasa describes. >>> >> >> I want to suggest to Java user where coredump is. >> Modern Linux distribution(s) contains ABRT. 
>> OS can dump corefile automatically despite a lack of setting coredump >> resource by user. >> >> I'm support engineer of Java. My customer says "coredump does not found.", >> but coredump is saved by ABRT. >> Thus I want them to know "coredump is available" through stderr and >> hs_err immediately. >> I belive it is first step of troubleshoot. >> >> >> Thanks, >> >> Yasumasa >> >> >> (2014/12/02 18:40), David Holmes wrote: >> >>> On 1/12/2014 10:57 PM, Thomas St?fe wrote: >>> >>>> Hi Yasumasa, >>>> >>>> On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga >>> > wrote: >>>> >>>> Hi Thomas, David, >>>> >>>> Sorry, I didn't think about async signal safety. >>>> >>>> That would work, VmError::report_and_die() is singlethreaded. At >>>> least the part which dumps out the core file name. >>>> >>>> >>>> I think that signal handler (in this case) may run concurrency with >>>> other thread. >>>> If another thread calls malloc(3) in JNI, C Heap corruption may >>>> occur. >>>> >>>> >>>> No, malloc(3) should be thread safe on our platforms. But this was not >>>> the point. If I understood David right, he suggested using a static >>>> buffer inside get_core_path() for assembling the core path, which would >>>> make get_core_path() thread-unsafe (multiple threads calling it would >>>> get garbled results). But as get_core_path() is only called from within >>>> VmError::report_and_die() and that section is only ever executed by one >>>> thread, Davids suggestion would still work. >>>> >>> >>> Yes that is what I was suggesting. >>> >>> I want to rewrite a patch as below: >>>> >>>> - Use async signal safety functions. >>>> fopen -> open, fgets -> read, etc. >>>> >>> >>> This is commendable if it is practical, but error reporting already >>> does many, many things that are not async-signal safe, so there is no >>> need to go to extreme measures here. >>> >>> - Use O_BUFLEN for buffer size. >>>> O_BUFLEN is defined to 2000 in ostream.hpp . >>>> This macro is used in various points. 
>>>> VMError::coredump_message is >>>> also defined with this value. >>>> >>>> >>>> I think PATH_MAX is fine. I think O_BUFLEN was originally used as a max. >>>> length of temporary buffers to assemble an output line. And then it >>>> spread a bit. But your intend is to hold a path and using PATH_MAX >>>> clearly documents this. >>>> And, to really nitpick, right now you do not handle ERANGE with >>>> get_current_path() (if the provided buffer is too small), which is >>>> probably fine because it is improbable that a path is larger than >>>> PATH_MAX. But if you change the size of the buffer to something which >>>> may be smaller than PATH_MAX (O_BUFLEN), get_current_directory() may >>>> fail. >>>> >>>> I like your patch, I think it could be a nice time safer when >>>> core_pattern is something unusual. But I also see Staffans point of >>>> too-much-complexity. So I will keep out of this discussion until the >>>> real Reviewers decided what to do :) >>>> >>> >>> I have a hard time evaluating the merits of the patch as I don't work >>> in an environment where this extra info is needed. But I take it on >>> good faith that it is useful for the context Yasumasa describes. 
>>> >>> David >>> >>> Kind Regards, Thomas >>>> >>>> From thomas.stuefe at gmail.com Mon Dec 8 10:37:02 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 8 Dec 2014 11:37:02 +0100 Subject: RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process In-Reply-To: <54815944.1060500@oracle.com> References: <54752CF8.5070408@oracle.com> <54759DFE.7020300@oracle.com> <5475C15E.30207@oracle.com> <54767502.6010907@oracle.com> <5476B417.9030008@oracle.com> <5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com> <5477030E.6070605@redhat.com> <54770436.8070705@oracle.com> <54770536.5090101@redhat.com> <547D8B6A.6040002@oracle.com> <547E8CF4.3050305@oracle.com> <54815944.1060500@oracle.com> Message-ID: Hi David, Dean, On Fri, Dec 5, 2014 at 8:05 AM, David Holmes wrote: > On 3/12/2014 8:47 PM, Thomas St?fe wrote: > >> Hi Dean, >> >> I dont understand. Such a function does not exist, does it? So I would >> have to write it: >> >> Do you mean generating and using a StubRoutine which would SIGILL? I did >> not do this because I wanted to be able to generate SIGILL also in >> initialization code, where StubRoutines may not yet be generated. This >> point may may be arguable, but as this function is used to test error >> handling, it may be interesting to test it for half-initialized VMs too. >> >> Otherwise I would implement the CPU specific >> generate_illegal_instruction___sequence() probably the same way as I do >> now the crash_with_sigill() function. That would mean a bit of more code >> duplication because: >> - Either I use the method I use now (reserve_memory and copy the >> instructions to the reserved page) >> - Or I use inline assembly - which probably does not work across >> multiple OSs, so for CPUs which span various OSs I would have to add one >> function per os_cpu combination, not just per cpu. >> > > I don't think there is any OS dependency with inline assembly - only > compiler. 
And I am also concerned that writing code to an executable page
> will also enter the realm of "self-modifying code" and all the jumping
> through hoops that entails. That aspect hadn't occurred to me till Dean
> raised it. I'm forming the view that triggering a SIGILL is more effort
> than it is worth for a secondary testing function.
>
>

Well, the code is used and works in our VM since some years on a number of CPUs, so the problem with the flushing do not occur at least in our cases.

But I agree with you, and this seems to be a point of contention and it is really too unimportant to stop the whole patch. The whole point of using SIGILL was to have another unblockable signal besides SIGSEGV to occur naturally (without raising) to be able to demonstrate the bug before fixing it.

I will now attempt to change the patch to use either SIGFPE or SIGBUS as a secondary signal. Maybe generating those signals with pure C/C++ is easier. If that does not work out, I will see what I can do with raise().

Thanks and Kind regards,
Thomas

From goetz.lindenmaier at sap.com Mon Dec 8 15:54:36 2014
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 8 Dec 2014 15:54:36 +0000
Subject: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling.
In-Reply-To: <5480AB58.80604@oracle.com> References: <4295855A5C1DE049A61835A1887419CC2CF264E2@DEWDFEMB12A.global.corp.sap> <5466A656.40707@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF27CD3@DEWDFEMB12A.global.corp.sap> <4295855A5C1DE049A61835A1887419CC2CF29936@DEWDFEMB12A.global.corp.sap> <547E4F3B.2090501@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF35E66@DEWDFEMB12A.global.corp.sap> <547F4914.1030904@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF36139@DEWDFEMB12A.global.corp.sap> <54809196.7030306@oracle.com> <4295855A5C1DE049A61835A1887419CC2CF39BA5@DEWDFEMB12A.global.corp.sap> <5480AB58.80604@oracle.com> Message-ID: <4295855A5C1DE049A61835A1887419CC2CF3D6BB@DEWDFEMB12A.global.corp.sap> Hi, This is just a ping to gc/rt mailing lists to reach appropriate people. I please need a reviewer from gc or rt, could somebody have a look at this? Short summary: - new cOops mode disjointbase that allows optimizations on PPC improving over heapbased - search for heaps: finds zerobased on sparc Solaris 11 and Aix - concentrate cOops heap allocation code in one function http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.02/ Please reply only to the original thread in hotspot-dev to keep this local. Thanks and best regards, Goetz. -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Donnerstag, 4. Dezember 2014 19:44 To: Lindenmaier, Goetz Cc: 'hotspot-dev developers' Subject: Re: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. This looks good to me. Now we need second review since changes are significant. Preferable from GC group since you changed ReservedHeapSpace. They will be affected most. Review from Runtime is also welcome. Thanks, Vladimir On 12/4/14 10:27 AM, Lindenmaier, Goetz wrote: > Hi Vladimir. > > Sorry. I updated the webrev once more. Hope it's fine now. 
> At least I can write comments :) > > Best regards, > Goetz > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Thursday, December 04, 2014 5:54 PM > To: Lindenmaier, Goetz > Cc: 'hotspot-dev developers' > Subject: Re: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. > > I spotted an other bug. > You replaced !_base with _base != NULL when moved code to try_reserve_range() - it should be _base == NULL. > The same problem in asserts: > > + assert(_base != NULL || markOopDesc::encode_pointer_as_mark(_base)->decode_pointer() == _base, > + "area must be distinguishable from marks for mark-sweep"); > + assert(_base != NULL || markOopDesc::encode_pointer_as_mark(&_base[size])->decode_pointer() == &_base[size], > + "area must be distinguishable from marks for mark-sweep"); > > > Also you did not remove _base && in next place: > > + (_base && _base + size > zerobased_max))) { // Unscaled delivered an arbitrary address. > > New comment is good. > > Thanks, > Vladimri > > On 12/4/14 1:45 AM, Lindenmaier, Goetz wrote: >> Hi Vladimir, >> >>> Add more extending comment explaining that. >> The comment for try_reserve_heap was meant to explain that. >> I further added a comment in initialize_compressed_heap(). >> >>> You need another parameter to pass UnscaledOopHeapMax or zerobased_max. >> Oh, thanks a lot! That's important. Fixed. >> >>> I mean that you already checked _base == NULL so on other side of || _base != NULL - why you need (_base &&) check? >> Sorry, now I got it. Removed. >> >> I updated the webrev: >> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.02/ >> Increment on top of the increment :) >> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.02/incremental_diffs2.patch >> >> Thanks, >> Goetz. >> >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Mittwoch, 3. 
Dezember 2014 18:32 >> To: Lindenmaier, Goetz; 'hotspot-dev at openjdk.java.net' >> Subject: Re: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. >> >> Comments are below. >> >> On 12/3/14 5:49 AM, Lindenmaier, Goetz wrote: >>> Hi Vladimir, >>> >>> thanks for looking at the change! See my comments inline below. >>> >>> I made a new webrev: >>> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.02/ >>> Incremental changes: >>> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.02/incremental_diffs.patch >>> >>> Best regards, >>> Goetz. >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Mittwoch, 3. Dezember 2014 00:46 >>>> To: Lindenmaier, Goetz; 'hotspot-dev at openjdk.java.net' >>>> Subject: Re: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. >>>> >>>> This looks good to me. Someone in runtime/gc have to look on it too. >>>> >>>> universe.cpp about SystemProperty("com.sap.vm.test.compressedOopsMode" >>>> we have: >>>> java.vm.info=mixed mode, sharing >>>> so we can have: >>>> java.vm.compressedOopsMode=... >>> Yes, that's good for me. Fixed. >>> >>>> I am not expert in properties names but I don't want to have 'com.sap' >>>> in VM's property name. >>> >>>> virtualspace.cpp: >>>> Could you fix release() - it does not reset _alignment? >>> Fixed. >>> >>>> In try_reserve_heap(), please, use (base == NULL) instead of (!base). >>>> And you don't need 'return;' in alignment check at the end of method. >>> Fixed. >>> >>>> In initialize_compressed_heap() again (!_base). >>> Fixed. >>> >>>> You don't stop (check >>>> (base == NULL)) after successful unscaled, zerobased, disjointbase >>>> allocations. 
You need to separate them with the check: >>>> >>>> + >>>> + } >>>> + } >>>> + if (_base == NULL) { >>>> + >>>> + if (PrintCompressedOopsMode && Verbose) { >>>> + tty->print(" == Z E R O B A S E D ==\n"); >>>> + } >>>> and so on. >>> No, I can't and don't want to check for _base != NULL. >>> I always keep the result of the last try, also if it didn't fulfil the required properties. >>> So I take that result and go into the next check. That check might succeed >>> with the heap allocated before. >>> This allows me to separate allocation and placement criteria, and to have the >>> placement criteria checked in only one place (per mode). >>> Only for HeapBaseMinAddress I don't do it that way, I explicitly call release(). >>> This way I can enforce mode heapbased. >> >> I see what you are saying. It was not clear from comments what is going on. >> Add more extending comment explaining that. >> >>> >>>> num_attempts calculation and while() loop are similar in unscaled and >>>> zerobased cases. Could you move it into a separate method? >>> I can do that, but I don't like it as I have to pass in 7 parameters. >> >> You need an other parameter to pass UnscaledOopHeapMax or zerobased_max. >> >>> That makes the code not much more readable. The function will look like this: >> >> I think initialize_compressed_heap() is more readable now. >> >>> >>> void ReserveHeapSpace::try_reserve_range(char *const highest_start, char *lowest_start, size_t attach_point_alignment, >>> char *aligned_HBMA, size_t size, size_t alignment, bool large) { >>> guarantee(HeapSearchSteps > 0, "Don't set HeapSearchSteps to 0"); >>> >>> const size_t attach_range = highest_start - lowest_start; >>> // Cap num_attempts at possible number. >>> // At least one is possible even for 0 sized attach range. 
>>> const uint64_t num_attempts_possible = (attach_range / attach_point_alignment) + 1; >>> const uint64_t num_attempts_to_try = MIN2(HeapSearchSteps, num_attempts_possible); >>> >>> const size_t stepsize = align_size_up(attach_range / num_attempts_to_try, attach_point_alignment); >>> >>> // Try attach points from top to bottom. >>> char* attach_point = highest_start; >>> while (attach_point >= lowest_start && >>> attach_point <= highest_start && // Avoid wrap around. >>> (!_base || _base < aligned_HBMA || _base + size > (char *)UnscaledOopHeapMax)) { >>> try_reserve_heap(size, alignment, large, attach_point); >>> attach_point -= stepsize; >>> } >>> } >>> >>> >>>> In disjointbase while() condition no need for _base second check: >>>> + (_base == NULL || >>>> + ((_base + size > (char *)OopEncodingHeapMax) && >>> I need this for the same reason as above: This is the check for successful allocation. >> >> I mean that you already checked _base == NULL so on other side of || _base != NULL - why you need (_base &&) check? >> >> Thanks, >> Vladimir >> >>> >>> >>> >>> Thanks, >>> Vladimir >>> >>> On 11/21/14 5:31 AM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> I prepared a new webrev trying to cover all the issues mentioned below. >>>> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.01/ >>>> >>>> I moved functionality from os.cpp and universe.cpp into >>>> ReservedHeapSpace::initialize_compressed_heap(). >>>> This class offers to save _base and _special, which I would have to reimplement >>>> if I had improved the methods I had added to os.cpp to also allocate large page >>>> heaps. >>>> Anyways, I think this class is the right place to gather code trying to reserve >>>> the heap. >>>> Also, I get along without setting the shift, base, implicit_null_check etc. fields >>>> of Universe, so there is no unnecessary calling back and forth between the two >>>> classes. 
>>>> Universe gets the heap back, and then sets the properties it needs to configure >>>> the compressed oops. >>>> All code handling the noaccess prefix is in a single method, too. >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> Btw, I had to work around an SS12u1 problem: it wouldn't compile >>>> char * x = (char*)UnscaledOopHeapMax - size in 32-bit mode. >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz >>>> Sent: Monday, 17 November 2014 09:33 >>>> To: 'Vladimir Kozlov'; 'hotspot-dev at openjdk.java.net' >>>> Subject: RE: RFR(L): 8064457: Introduce compressed oops mode "disjoint base" and improve compressed heap handling. >>>> >>>> Hi Vladimir, >>>> >>>>> It is very significant rewriting and it takes time to evaluate it. >>>> Yes, I know ... and I don't want to push, but nevertheless a ping >>>> can be useful sometimes. Thanks a lot for looking at it. >>>> >>>>> And I would not say it is simpler than before :) >>>> If I fix what you propose it's gonna get even more simple ;) >>>>> This is what I found so far. >>>> >>>>> The idea to try to allocate in a range instead of just below >>>>> UnscaledOopHeapMax or OopEncodingHeapMax is good. So I would ask to do >>>>> several attempts (3?) on non_PPC64 platforms too. >>>> Set to 3. >>>> >>>>> It is a matter of preference but I am not comfortable with switch in loop. >>>>> For me sequential 'if (addr == 0)' checks are simpler. >>>> I'll fix this. >>>> >>>>> One thing worries me that you release found space and try to get it >>>>> again with ReservedHeapSpace. Is it possible to add a new >>>>> ReservedHeapSpace ctor which simply uses already allocated space? >>>> This was to keep diffs small, but I also think a new constructor is good. >>>> I'll fix this. 
>>>> >>>>> The next code in ReservedHeapSpace() is hard to understand (): >>>>> (UseCompressedOops && (requested_address == NULL || >>>> requested_address+size > (char*)OopEncodingHeapMax) ? >>>>> may be move all this into noaccess_prefix_size() and add comments. >>>> I have to redo this anyways if I make new constructors. >>>> >>>>> Why you need prefix when requested_address == NULL? >>>> If we allocate with NULL, we most probably will get a heap where >>>> base != NULL and thus need a noaccess prefix. >>>> >>>>> Remove next comment in universe.cpp: >>>>> // SAPJVM GL 2014-09-22 >>>> Removed. >>>> >>>>> Again you will release space so why bother to include space for classes?: >>>>> + // For small heaps, save some space for compressed class pointer >>>>> + // space so it can be decoded with no base. >>>> This was done like this before. We must assure the upper bound of the >>>> heap is low enough that the compressed class space still fits in there. >>>> >>>> virtualspace.cpp >>>> >>>>> With new code size+noaccess_prefix could be requested. But later it is >>>>> not used if WIN64_ONLY(&& UseLargePages) and you will have empty >>>>> non-protected page below heap. >>>> There's several points to this: >>>> * Also if not protectable, the heap base has to be below the real start of the >>>> heap. Else the first object in the heap will be compressed to 'null' >>>> and decompression will fail. >>>> * If we don't reserve the memory other stuff can end up in this space. On >>>> errors, if would be quite unexpected to find memory there. >>>> * To get a heap for the new disjoint mode I must control the size of this. >>>> Requesting a heap starting at (aligned base + prefix) is more likely to fail. >>>> * The size for the prefix must anyways be considered when deciding whether the >>>> heap is small enough to run with compressed oops. 
>>>> So distinguishing the case where we really can omit this would require >>>> quite some additional checks everywhere, and I thought it's not worth it. >>>> >>>> matcher.hpp >>>> >>>>> Universe::narrow_oop_use_implicit_null_checks() should be true for such >>>>> case too. So you can add new condition with || to existing ones. The >>>>> only condition you relax is base != NULL. Right? >>>> Yes, that's how it's intended. >>>> >>>> arguments.* files >>>> >>>>> Why do you need the PropertyList_add changes? >>>> Oh, the code using it got lost. I commented on this in the description in the webrev. >>>> "To more efficiently run expensive tests in various compressed oop modes, we set a property with the mode the VM is running in. So far it's called "com.sap.vm.test.compressedOopsMode" better suggestions are welcome (and necessary I guess). Our long running tests that are supposed to run in a dedicated compressed oop mode check this property and abort themselves if it's not the expected mode." >>>> When I know about the heap I do >>>> Arguments::PropertyList_add(new SystemProperty("com.sap.vm.test.compressedOopsMode", >>>> narrow_oop_mode_to_string(narrow_oop_mode()), >>>> false)); >>>> in universe.cpp. >>>> On some OSes it's deterministic which modes work, there we don't start such tests. >>>> Others, as you mentioned OSX, are very nondeterministic. Here we save test runtime with this. >>>> But it's not that important. >>>> We can still parse the PrintCompressedOopsMode output after the test and discard the >>>> run. >>>> >>>>> Do you have platform specific changes? >>>> Yes, for ppc and aix. I'll submit them once this is in. >>>> >>>> From your other mail: >>>>> One more thing. You should allow an allocation in the range when the address returned from the OS does not match >>>>> the requested address. We had such cases on OSX, for example, when OS allocates at different address but still inside range. >>>> Good point. I'll fix that in os::attempt_reserve_memory_in_range. 
>>>> >>>> I'll ping again once a new webrev is done! >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> >>>> On 11/10/14 6:57 AM, Lindenmaier, Goetz wrote: >>>>> Hi, >>>>> >>>>> I need to improve a number of things around compressed oops heap handling >>>>> to achieve good performance on ppc. >>>>> I prepared a first webrev for review: >>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064457-disjoint/webrev.00/ >>>>> >>>>> A detailed technical description of the change is in the webrev and the corresponding bug. >>>>> >>>>> If requested, I will split the change into parts with more or less impact on >>>>> non-ppc platforms. >>>>> >>>>> The change is derived from well-tested code in our VM. Originally it was >>>>> crafted to require the least changes of VM coding, I changed it to be better >>>>> streamlined with the VM. >>>>> I tested this change to deliver heaps at about the same addresses as before. >>>>> Heap addresses mostly differ in lower bits. In some cases (Solaris 5.11) a heap >>>>> in a better compressed oops mode is found, though. >>>>> I ran (and adapted) test/runtime/CompressedOops and gc/arguments/TestUseCompressedOops*. >>>>> >>>>> Best regards, >>>>> Goetz. >>>>> >>>>> From christian.thalinger at oracle.com Mon Dec 8 18:01:50 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 8 Dec 2014 10:01:50 -0800 Subject: RFR: 6522873 - Java not print "Unrecognized option" when it is invalid option. In-Reply-To: <5481B59E.1000601@oracle.com> References: <5481B59E.1000601@oracle.com> Message-ID: <4A0D9F57-2322-45D8-BB44-EE78DF5DE141@oracle.com> Interesting. I honestly thought we fixed that. The poster child for fixing this issue is IBM J9's -Xcompressedrefs http://www-01.ibm.com/support/knowledgecenter/#!/SSYKE2_7.0.0/com.ibm.java.aix.71.doc/diag/appendixes/cmdline/Xcompressedrefs.html It nicely matches -Xcomp which leads to interesting bug reports. 
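A minimal sketch of why a prefix comparison accepts -Xcompressedrefs as -Xcomp while an exact comparison does not. The helper names matches_prefix and matches_exact are hypothetical and exist only for this illustration; this is not the actual match_option() code in arguments.cpp.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (assumption, not HotSpot code): returns true when
// `option` merely starts with `name`. This is the lenient behaviour the
// bug report describes.
bool matches_prefix(const char* option, const char* name) {
  assert(option != nullptr && name != nullptr);
  return std::strncmp(option, name, std::strlen(name)) == 0;
}

// Hypothetical strict variant: the whole option string must equal `name`,
// so trailing characters cause a mismatch.
bool matches_exact(const char* option, const char* name) {
  assert(option != nullptr && name != nullptr);
  return std::strcmp(option, name) == 0;
}
```

With the prefix variant, "-Xcompressedrefs" and "-XX:+ExtendedDTraceProbes-XX:+UseG1GC" are both treated as the shorter flag; with the exact variant they fall through and can be reported as unrecognized. Flags that take a value after the name would still need a separate tail-aware match.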
> On Dec 5, 2014, at 5:39 AM, Jesper Wilhelmsson wrote: > > Hi, > > Please review this patch to make argument parsing stop accepting random characters at the end of command line flags. This topic was discussed in hotspot-dev at openjdk.java.net and I strongly believe that this bug should be reopened and fixed. > > Short summary of the problem: > Today some (not all) flags are accepted even though they have random characters appended to them. Some examples are -Xconcgc, -Xcomp, -Xboundthreads, -XX:+AlwaysTenure etc which will also be accepted when written for instance -Xconcgcnoway, -Xcomposer, -Xboundthreadstodogs or -XX:+AlwaysTenureAtBlueMoon > > There is a potential problem here since we will also accept things like -XX:+ExtendedDTraceProbes-XX:+UseG1GC without saying a word (and of course without running with G1). > > Bug: https://bugs.openjdk.java.net/browse/JDK-6522873 > Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ > > > The full list of flags affected by this change is: > > -Xnoclassgc > -Xconcgc > -Xnoconcgc > -Xbatch > -green > -native > -Xsqnopause > -Xrs > -Xusealtsigs > -Xoptimize > -Xprof > -Xconcurrentio > -Xinternalversion > -Xprintflags > -Xint > -Xmixed > -Xcomp > -Xshare:dump > -Xshare:on > -Xshare:auto > -Xshare:off > -Xdebug > -Xnoagent > -Xboundthreads > vfprintf > exit > abort > -XX:+AggressiveHeap > -XX:+NeverTenure > -XX:+AlwaysTenure > -XX:+CMSPermGenSweepingEnabled > -XX:-CMSPermGenSweepingEnabled > -XX:+UseGCTimeLimit > -XX:-UseGCTimeLimit > -XX:+ResizeTLE > -XX:-ResizeTLE > -XX:+PrintTLE > -XX:-PrintTLE > -XX:+UseTLE > -XX:-UseTLE > -XX:+DisplayVMOutputToStderr > -XX:+DisplayVMOutputToStdout > -XX:+ExtendedDTraceProbes > -XX:+FullGCALot > -XX:+ManagementServer > -XX:+PrintVMOptions > -XX:-PrintVMOptions > -XX:+IgnoreUnrecognizedVMOptions > -XX:-IgnoreUnrecognizedVMOptions > -XX:+PrintFlagsInitial > -XX:+PrintFlagsWithComments > > > Thanks, > /Jesper From daniel.daugherty at oracle.com Mon Dec 8 18:40:38 2014 From: 
daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 08 Dec 2014 11:40:38 -0700 Subject: RFR: 6522873 - Java not print "Unrecognized option" when it is invalid option. In-Reply-To: <5481B59E.1000601@oracle.com> References: <5481B59E.1000601@oracle.com> Message-ID: <5485F0A6.6010804@oracle.com> > Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ General: I agree that this should be fixed (again). I think we're early enough in JDK9 to make this change and deal with any fall out. src/share/vm/runtime/arguments.cpp These are the only lines that caught my eye: line 3140 } else if (match_option(option, "vfprintf")) { line 3141 _vfprintf_hook = CAST_TO_FN_PTR(vfprintf_hook_t, option->extraInfo); line 3142 } else if (match_option(option, "exit")) { line 3143 _exit_hook = CAST_TO_FN_PTR(exit_hook_t, option->extraInfo); line 3144 } else if (match_option(option, "abort")) { line 3145 _abort_hook = CAST_TO_FN_PTR(abort_hook_t, option->extraInfo); These strange options will not likely make it past the Java launcher so they must only be available via a JNI invocation. I'm going to bet that they also worked with the ancient gamma launcher (that's been removed). Lines 3140, 3142, and 3144 date back to this Teamware delta: $ sp -r1.92.7.1 src/share/vm/runtime/arguments.cpp src/share/vm/runtime/SCCS/s.arguments.cpp: D 1.92.7.1 99/05/18 11:24:40 renes 227 216 00032/00022/00619 MRs: COMMENTS: This delta predates when the HotSpot project used bug IDs in the delta comments so I can't nail this down to a specific bug ID (if there was one). 
Since the 'extraInfo' field only shows up in jni.h, that supports that these options can only be used from a JNI invocation: $ rgrep extraInfo src src/share/vm/prims/jni.h: void *extraInfo; src/share/vm/runtime/arguments.cpp: process_java_launcher_argument(tail, option->extraInfo); src/share/vm/runtime/arguments.cpp: _vfprintf_hook = CAST_TO_FN_PTR(vfprintf_hook_t, option->extraInfo); src/share/vm/runtime/arguments.cpp: _exit_hook = CAST_TO_FN_PTR(exit_hook_t, option->extraInfo); src/share/vm/runtime/arguments.cpp: _abort_hook = CAST_TO_FN_PTR(abort_hook_t, option->extraInfo); See the JNI spec for JNI_CreateJavaVM() where those options are discussed, e.g.: http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/invocation.html#wp16334 Thumbs up on this change! Dan On 12/5/14 6:39 AM, Jesper Wilhelmsson wrote: > Hi, > > Please review this patch to make argument parsing stop accepting > random characters at the end of command line flags. This topic was > discussed in hotspot-dev at openjdk.java.net and I strongly believe that > this bug should be reopened and fixed. > > Short summary of the problem: > Today some (not all) flags are accepted even though they have random > characters appended to them. Some examples are -Xconcgc, -Xcomp, > -Xboundthreads, -XX:+AlwaysTenure etc which will also be accepted when > written for instance -Xconcgcnoway, -Xcomposer, -Xboundthreadstodogs > or -XX:+AlwaysTenureAtBlueMoon > > There is a potential problem here since we will also accept things > like -XX:+ExtendedDTraceProbes-XX:+UseG1GC without saying a word (and > of course without running with G1). 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-6522873 > Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ > > > The full list of flags affected by this change is: > > -Xnoclassgc > -Xconcgc > -Xnoconcgc > -Xbatch > -green > -native > -Xsqnopause > -Xrs > -Xusealtsigs > -Xoptimize > -Xprof > -Xconcurrentio > -Xinternalversion > -Xprintflags > -Xint > -Xmixed > -Xcomp > -Xshare:dump > -Xshare:on > -Xshare:auto > -Xshare:off > -Xdebug > -Xnoagent > -Xboundthreads > vfprintf > exit > abort > -XX:+AggressiveHeap > -XX:+NeverTenure > -XX:+AlwaysTenure > -XX:+CMSPermGenSweepingEnabled > -XX:-CMSPermGenSweepingEnabled > -XX:+UseGCTimeLimit > -XX:-UseGCTimeLimit > -XX:+ResizeTLE > -XX:-ResizeTLE > -XX:+PrintTLE > -XX:-PrintTLE > -XX:+UseTLE > -XX:-UseTLE > -XX:+DisplayVMOutputToStderr > -XX:+DisplayVMOutputToStdout > -XX:+ExtendedDTraceProbes > -XX:+FullGCALot > -XX:+ManagementServer > -XX:+PrintVMOptions > -XX:-PrintVMOptions > -XX:+IgnoreUnrecognizedVMOptions > -XX:-IgnoreUnrecognizedVMOptions > -XX:+PrintFlagsInitial > -XX:+PrintFlagsWithComments > > > Thanks, > /Jesper From harold.seigel at oracle.com Mon Dec 8 19:40:02 2014 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 08 Dec 2014 14:40:02 -0500 Subject: RFR: 6522873 - Java not print "Unrecognized option" when it is invalid option. In-Reply-To: <5481B59E.1000601@oracle.com> References: <5481B59E.1000601@oracle.com> Message-ID: <5485FE92.1030201@oracle.com> Hi Jesper, Can you add some tests for this fix? Thanks, Harold On 12/5/2014 8:39 AM, Jesper Wilhelmsson wrote: > Hi, > > Please review this patch to make argument parsing stop accepting > random characters at the end of command line flags. This topic was > discussed in hotspot-dev at openjdk.java.net and I strongly believe that > this bug should be reopened and fixed. > > Short summary of the problem: > Today some (not all) flags are accepted even though they have random > characters appended to them. 
Some examples are -Xconcgc, -Xcomp, > -Xboundthreads, -XX:+AlwaysTenure etc which will also be accepted when > written for instance -Xconcgcnoway, -Xcomposer, -Xboundthreadstodogs > or -XX:+AlwaysTenureAtBlueMoon > > There is a potential problem here since we will also accept things > like -XX:+ExtendedDTraceProbes-XX:+UseG1GC without saying a word (and > of course without running with G1). > > Bug: https://bugs.openjdk.java.net/browse/JDK-6522873 > Webrev: http://cr.openjdk.java.net/~jwilhelm/6522873/webrev.00/ > > > The full list of flags affected by this change is: > > -Xnoclassgc > -Xconcgc > -Xnoconcgc > -Xbatch > -green > -native > -Xsqnopause > -Xrs > -Xusealtsigs > -Xoptimize > -Xprof > -Xconcurrentio > -Xinternalversion > -Xprintflags > -Xint > -Xmixed > -Xcomp > -Xshare:dump > -Xshare:on > -Xshare:auto > -Xshare:off > -Xdebug > -Xnoagent > -Xboundthreads > vfprintf > exit > abort > -XX:+AggressiveHeap > -XX:+NeverTenure > -XX:+AlwaysTenure > -XX:+CMSPermGenSweepingEnabled > -XX:-CMSPermGenSweepingEnabled > -XX:+UseGCTimeLimit > -XX:-UseGCTimeLimit > -XX:+ResizeTLE > -XX:-ResizeTLE > -XX:+PrintTLE > -XX:-PrintTLE > -XX:+UseTLE > -XX:-UseTLE > -XX:+DisplayVMOutputToStderr > -XX:+DisplayVMOutputToStdout > -XX:+ExtendedDTraceProbes > -XX:+FullGCALot > -XX:+ManagementServer > -XX:+PrintVMOptions > -XX:-PrintVMOptions > -XX:+IgnoreUnrecognizedVMOptions > -XX:-IgnoreUnrecognizedVMOptions > -XX:+PrintFlagsInitial > -XX:+PrintFlagsWithComments > > > Thanks, > /Jesper From ioi.lam at oracle.com Mon Dec 8 23:26:19 2014 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 08 Dec 2014 15:26:19 -0800 Subject: RFR (XS) [8u40] 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid Message-ID: <5486339B.8090204@oracle.com> Hi Folks, Please approve the backport of this P1 bug to 8u40 https://bugs.openjdk.java.net/browse/JDK-8066670 http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit-8u40/ It's a clean import of the JDK9 changes. 
Tests: JPRT SQE RT_Baseline nightly Thanks - Ioi From jiangli.zhou at oracle.com Tue Dec 9 00:18:19 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Mon, 08 Dec 2014 16:18:19 -0800 Subject: RFR (XS) [8u40] 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <5486339B.8090204@oracle.com> References: <5486339B.8090204@oracle.com> Message-ID: <54863FCB.4020702@oracle.com> Looks good for backport. Thanks, Jiangli On 12/08/2014 03:26 PM, Ioi Lam wrote: > Hi Folks, > > Please approve the backport of this P1 bug to 8u40 > > https://bugs.openjdk.java.net/browse/JDK-8066670 > http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit-8u40/ > > It's a clean import of the JDK9 changes. > > Tests: JPRT > SQE RT_Baseline nightly > > Thanks > - Ioi From david.holmes at oracle.com Tue Dec 9 01:07:25 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 09 Dec 2014 11:07:25 +1000 Subject: RFR (XS) [8u40] 8066670 - PrintSharedArchiveAndExit does not exit the VM when the archive is invalid In-Reply-To: <54863FCB.4020702@oracle.com> References: <5486339B.8090204@oracle.com> <54863FCB.4020702@oracle.com> Message-ID: <54864B4D.2080005@oracle.com> +1 David On 9/12/2014 10:18 AM, Jiangli Zhou wrote: > Looks good for backport. > > Thanks, > Jiangli > > On 12/08/2014 03:26 PM, Ioi Lam wrote: >> Hi Folks, >> >> Please approve the backport of this P1 bug to 8u40 >> >> https://bugs.openjdk.java.net/browse/JDK-8066670 >> http://cr.openjdk.java.net/~iklam/8066670-PrintSharedArchiveAndExit-8u40/ >> >> It's a clean import of the JDK9 changes. 
>> >> Tests: JPRT >> SQE RT_Baseline nightly >> >> Thanks >> - Ioi > From jiangli.zhou at oracle.com Tue Dec 9 04:49:24 2014 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Mon, 08 Dec 2014 20:49:24 -0800 Subject: RFR JDK-8059510 Compact symbol table layout inside shared archive In-Reply-To: <54828E9B.10004@oracle.com> References: <542DC4A3.3060608@oracle.com> <542E3051.6050100@oracle.com> <542ECEFE.3090504@oracle.com> <54334A12.3060708@oracle.com> <5436A3EB.7040701@oracle.com> <5436DDD9.4030108@oracle.com> <5437F77B.9010204@oracle.com> <54383D41.9030809@oracle.com> <5438453F.9010706@oracle.com> <54384A69.9050103@oracle.com> <54386937.2020309@oracle.com> <5438A837.9060809@oracle.com> <543B2BAB.8010502@oracle.com> <543BA6EB.4090608@oracle.com> <543BFD38.1080205@oracle.com> <543C1E23.6050405@oracle.com> <4BFC34C0-E417-4D72-A3D8-729FCE4491DC@oracle.com> <5451626D.2030102@oracle.com> <547E32CD.5050103@oracle.com> <547F6D32.6050708@oracle.com> <547F9F0C.1000904@oracle.com> <547FC19F.6040406@oracle.com> <5480DD6E.7050509@oracle.com> <90637548-0BB1-4D09-803F-C98A500B8E27@oracle.com> <54825173.5020008@oracle.com> <54828E9B.10004@oracle.com> Message-ID: <54867F54.7030603@oracle.com> Hi John, Here is the webrev that incorporates your latest suggestions, if you want to take another look. http://cr.openjdk.java.net/~jiangli/8059510/webrev.07/ Going through the final testing (including JPRT) of the change, so far so good. Thanks, Jiangli On 12/05/2014 09:05 PM, Jiangli Zhou wrote: > Ok. Thanks, John. > > Jiangli > > On 12/05/2014 08:54 PM, John Rose wrote: >> On Dec 5, 2014, at 4:44 PM, Jiangli Zhou wrote: >>> >>> I wonder if it would help more when we also add the support for >>> string table. With a lower hash table load factor, the percentage of >>> buckets with one entry increases. I'm inclined to leave it in if >>> there is no strong objection. >> >> Yes, it will help with a lower load factor. I'm OK either way. 
>> (Would still like to reduce the number of indirections, but that's >> for later.) - John > From david.holmes at oracle.com Tue Dec 9 05:39:14 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 09 Dec 2014 15:39:14 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> Message-ID: <54868B02.1070408@oracle.com> Hi Thomas, On 8/12/2014 8:27 PM, Thomas Stüfe wrote: > Hi, > > I do not really like the handling of the leading pipe symbol: To be fair to Yasumasa this aspect of the fix has been the same since Oct 15: http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ and was not flagged. > So, we read the core_pattern, and if the pipe symbol is detected, we > write the core pattern minus the pipe symbol but plus a leading quote to > the output; the leading quote then serves as a hint to the layer above > in os_posix.cpp to treat this case specially. This means the logic > spills out of the platform-dependent os_linux.cpp to shared code and > this is also difficult to read. > > This comes from the fact that "get_core_path()" assumes the core file is > written to the file system. I think it just does not fit anymore, better > would be to replace it with something like > "os::print_core_file_location(outputStream* os)", and the OS handles > both core path retrieval and the printing. Because then the shared code > does not need to know whether core file gets printed traditionally or > piped to an executable or whatever. This sounds like a refactoring that I suggested would be too disruptive. http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015547.html http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015557.html http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015573.html David ----- > Kind regards, Thomas > > > On Mon, Dec 8, 2014 at 7:25 AM, David Holmes wrote: > > Hi Yasumasa, > > I'm okay with these changes. 
Just a minor style nit (no need for > updated webrev) can you remove the blank lines in os_linux.cpp: > > 6011 } > 6012 > 6013 } > 6014 > 6015 } > > 6057 } > 6058 > 6059 } > 6060 > 6061 } > > If anyone has any objections please raise them asap. > > Thanks, > David > > > On 3/12/2014 12:30 AM, Yasumasa Suenaga wrote: > > Hi David, Thomas, > > I've uploaded new webrev: > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.04/ > > > I want to rewrite a patch as below: > > - Use async signal safety functions. > fopen -> open, fgets -> read, etc. > > > This is commendable if it is practical, but error reporting > already > does many, many things that are not async-signal safe, so > there is no > need to go to extreme measures here. > > > I've used async-signal safe functions as far as possible. > > > - Use O_BUFLEN for buffer size. > O_BUFLEN is defined to 2000 in ostream.hpp . > This macro is used in various points. > VMError::coredump_message is > also defined with this value. > > > I think PATH_MAX is fine. I think O_BUFLEN was > originally used as a max. > length of temporary buffers to assemble an output line. > And then it > spread a bit. But your intent is to hold a path and > using PATH_MAX > clearly documents this. > > > I've used PATH_MAX again. > > > And, to really nitpick, right now you do not handle > ERANGE with > get_current_path() (if the provided buffer is too > small), which is > probably fine because it is improbable that a path is > larger than > PATH_MAX. But if you change the size of the buffer to > something which > may be smaller than PATH_MAX (O_BUFLEN), > get_current_directory() may > fail. > > > If the get_current_path() call fails in get_core_path(), > get_core_path() > returns immediately with 0. > Caller (check_or_create_dump()) handles this result as illegal > state. > > get_current_path() calls getcwd() only and redirects result to > caller. > So if the result of this function is NULL, we can conclude that getcwd() > finished with an > error. 
> I think it is enough. > > > I like your patch, I think it could be a nice time safer > when > core_pattern is something unusual. But I also see > Staffans point of > too-much-complexity. So I will keep out of this > discussion until the > real Reviewers decided what to do :) > > > I have a hard time evaluating the merits of the patch as I > don't work > in an environment where this extra info is needed. But I > take it on > good faith that it is useful for the context Yasumasa describes. > > > I want to suggest to Java user where coredump is. > Modern Linux distribution(s) contains ABRT. > OS can dump corefile automatically despite a lack of setting > coredump > resource by user. > > I'm support engineer of Java. My customer says "coredump does > not found.", > but coredump is saved by ABRT. > Thus I want them to know "coredump is available" through stderr and > hs_err immediately. > I belive it is first step of troubleshoot. > > > Thanks, > > Yasumasa > > > (2014/12/02 18:40), David Holmes wrote: > > On 1/12/2014 10:57 PM, Thomas St?fe wrote: > > Hi Yasumasa, > > On Mon, Dec 1, 2014 at 10:45 AM, Yasumasa Suenaga > > >> > wrote: > > Hi Thomas, David, > > Sorry, I didn't think about async signal safety. > > That would work, VmError::report_and_die() is > singlethreaded. At > least the part which dumps out the core file name. > > > I think that signal handler (in this case) may run > concurrency with > other thread. > If another thread calls malloc(3) in JNI, C Heap > corruption may > occur. > > > No, malloc(3) should be thread safe on our platforms. > But this was not > the point. If I understood David right, he suggested > using a static > buffer inside get_core_path() for assembling the core > path, which would > make get_core_path() thread-unsafe (multiple threads > calling it would > get garbled results). 
But as get_core_path() is only > called from within > VmError::report_and_die() and that section is only ever > executed by one > thread, Davids suggestion would still work. > > > Yes that is what I was suggesting. > > I want to rewrite a patch as below: > > - Use async signal safety functions. > fopen -> open, fgets -> read, etc. > > > This is commendable if it is practical, but error reporting > already > does many, many things that are not async-signal safe, so > there is no > need to go to extreme measures here. > > - Use O_BUFLEN for buffer size. > O_BUFLEN is defined to 2000 in ostream.hpp . > This macro is used in various points. > VMError::coredump_message is > also defined with this value. > > > I think PATH_MAX is fine. I think O_BUFLEN was > originally used as a max. > length of temporary buffers to assemble an output line. > And then it > spread a bit. But your intend is to hold a path and > using PATH_MAX > clearly documents this. > And, to really nitpick, right now you do not handle > ERANGE with > get_current_path() (if the provided buffer is too > small), which is > probably fine because it is improbable that a path is > larger than > PATH_MAX. But if you change the size of the buffer to > something which > may be smaller than PATH_MAX (O_BUFLEN), > get_current_directory() may > fail. > > I like your patch, I think it could be a nice time safer > when > core_pattern is something unusual. But I also see > Staffans point of > too-much-complexity. So I will keep out of this > discussion until the > real Reviewers decided what to do :) > > > I have a hard time evaluating the merits of the patch as I > don't work > in an environment where this extra info is needed. But I > take it on > good faith that it is useful for the context Yasumasa describes. 
> > David > > Kind Regards, Thomas > > From thomas.stuefe at gmail.com Tue Dec 9 08:09:15 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 9 Dec 2014 09:09:15 +0100 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <54868B02.1070408@oracle.com> References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> Message-ID: Hi David, On Tue, Dec 9, 2014 at 6:39 AM, David Holmes wrote: > Hi Thomas, > > On 8/12/2014 8:27 PM, Thomas Stüfe wrote: > >> Hi, >> >> I do not really like the handling of the leading pipe symbol: >> > > To be fair to Yasumasa this aspect of the fix has been the same since Oct > 15: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ > > and was not flagged. You are right, I did not read those mails closely enough. > > > So, we read the core_pattern, and if the pipe symbol is detected, we >> write the core pattern minus the pipe symbol but plus a leading quote to >> the output; the leading quote then serves as a hint to the layer above >> in os_posix.cpp to treat this case specially. This means the logic >> spills out of the platform-dependent os_linux.cpp to shared code and >> this is also difficult to read. >> >> This comes from the fact that "get_core_path()" assumes the core file is >> written to the file system. I think it just does not fit anymore, better >> would be to replace it with something like >> "os::print_core_file_location(outputStream* os)", and the OS handles >> both core path retrieval and the printing. Because then the shared code >> does not need to know whether core file gets printed traditionally or >> piped to an executable or whatever. >> > > This sounds like a refactoring that I suggested would be too disruptive. > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015547.html > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015557.html > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015573.html > > I do not think that this would be such a big change, but it also could be done with another patch. 
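For readers following the leading-pipe discussion, here is a minimal sketch of how the contents of /proc/sys/kernel/core_pattern can be classified. On Linux a leading '|' means the core image is piped to the named program instead of being written to a file. The function name describe_core_location and the message wording are assumptions made for this illustration; they are not taken from the webrev, and the use of std::string makes this unsuitable as-is for a crash handler.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (assumption, not part of the patch under review):
// given the raw contents of /proc/sys/kernel/core_pattern, describe where
// the kernel will send the core image.
std::string describe_core_location(std::string core_pattern) {
  // The proc file ends with a newline; drop trailing line terminators.
  while (!core_pattern.empty() &&
         (core_pattern.back() == '\n' || core_pattern.back() == '\r')) {
    core_pattern.pop_back();
  }
  // A leading '|' tells the kernel to pipe the core to a program.
  if (!core_pattern.empty() && core_pattern[0] == '|') {
    return "core image is piped to \"" + core_pattern.substr(1) + "\"";
  }
  return "core image is written to \"" + core_pattern + "\"";
}
```

Keeping the pipe check and the message formatting in one platform-specific function is the shape of the print_core_file_location() idea discussed above: shared code then only prints the returned text and never inspects the pattern itself.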
Apart from the reservations I stated above, the code looks fine and is definitely an improvement (just last week I was helplessly looking for a core on a machine where core_pattern turned out to be a redirection to another program). Kind Regards, Thomas > David From david.holmes at oracle.com Tue Dec 9 08:56:50 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 09 Dec 2014 18:56:50 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> Message-ID: <5486B952.4070708@oracle.com> Hi Thomas, So can we take this as-is for now and file a RFE to address your concerns? Anybody else object to that? Thanks, David
From thomas.stuefe at gmail.com Tue Dec 9 12:06:53 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 9 Dec 2014 13:06:53 +0100 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <5486B952.4070708@oracle.com> References: <542C8274.3010809@gmail.com> <543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> <5486B952.4070708@oracle.com> Message-ID: Yes, Sure :-) @Yasumasa : thank you for this patch! Kind regards, Thomas
From yasuenag at gmail.com Tue Dec 9 13:56:42 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Tue, 09 Dec 2014 22:56:42 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: References: <542C8274.3010809@gmail.com> <543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> <5486B952.4070708@oracle.com> Message-ID: <5486FF9A.2050604@gmail.com> David, Thomas, Thank you so much! I am waiting for a second reviewer. BTW, I'm not a committer, so I'm also waiting for a sponsor :-) > I'm okay with these changes. Just a minor style nit (no need for updated webrev) can you remove the blank lines in os_linux.cpp: > > 6011 } > 6012 > 6013 } > 6014 > 6015 } > > 6057 } > 6058 > 6059 } > 6060 > 6061 } > > If anyone has any objections please raise them asap. I will upload a new webrev that fixes them after the review. Thanks, Yasumasa
From volker.simonis at gmail.com Tue Dec 9 17:39:31 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 9 Dec 2014 18:39:31 +0100 Subject: RFR(XXS): 8067015: Implement os::pd_map_memory() on AIX Message-ID: Hi, could I please get a review for the following trivial change which simply implements os::pd_map_memory() on AIX: http://cr.openjdk.java.net/~simonis/webrevs/8067015/ Until now os::pd_map_memory() was only used in the context of class data sharing (CDS) which isn't supported on AIX anyway, so we hadn't implemented it in os_aix.cpp. However with the integration of the modularity change, os::pd_map_memory() is now also needed for the loading of image files. The implementation is a straightforward copy of the corresponding Linux version.
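For readers without the webrev at hand, the following is a rough sketch of what mapping a file region with POSIX mmap() typically looks like. It is a stand-alone illustration, not the code under review: the names map_file_region and map_file_region_self_check are invented here, and the choice of MAP_PRIVATE plus optional MAP_FIXED is an assumption about how such a function is usually written.

```cpp
#include <cassert>
#include <cstddef>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>

// Hypothetical stand-alone helper, not the HotSpot source: map `bytes` bytes
// of an open file descriptor, starting at `file_offset`, into memory.
// If `addr` is non-null the mapping is forced to that address (MAP_FIXED).
// Returns nullptr on failure.
char* map_file_region(int fd, size_t file_offset, char* addr, size_t bytes,
                      bool read_only, bool allow_exec) {
  assert(bytes > 0);  // mmap rejects zero-length mappings
  int prot = read_only ? PROT_READ : (PROT_READ | PROT_WRITE);
  if (allow_exec) {
    prot |= PROT_EXEC;
  }
  int flags = MAP_PRIVATE;  // modifications stay private to this process
  if (addr != nullptr) {
    flags |= MAP_FIXED;     // caller insists on this exact address
  }
  void* mapped = mmap(addr, bytes, prot, flags, fd,
                      static_cast<off_t>(file_offset));
  return mapped == MAP_FAILED ? nullptr : static_cast<char*>(mapped);
}

// Small self-check: write a scratch file, map it read-only, compare contents.
bool map_file_region_self_check() {
  FILE* f = tmpfile();
  if (f == nullptr) {
    return false;
  }
  const char payload[] = "image";
  const size_t len = sizeof(payload) - 1;
  bool ok = fwrite(payload, 1, len, f) == len && fflush(f) == 0;
  if (ok) {
    char* p = map_file_region(fileno(f), 0, nullptr, len, true, false);
    ok = (p != nullptr);
    for (size_t i = 0; ok && i < len; i++) {
      ok = (p[i] == payload[i]);
    }
    if (p != nullptr) {
      munmap(p, len);
    }
  }
  fclose(f);
  return ok;
}
```

Note that file_offset has to be a multiple of the page size for mmap() to accept it; the sketch leaves that check to the caller.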
I'd like to push this directly to jdk9/dev/hotspot because it was introduced there and because it affects all our AIX builds. I hope that's no problem, especially because the change only touches AIX-only files. Thank you and best regards, Volker From david.holmes at oracle.com Wed Dec 10 06:29:46 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 Dec 2014 16:29:46 +1000 Subject: RFR(XXS): 8067015: Implement os::pd_map_memory() on AIX In-Reply-To: References: Message-ID: <5487E85A.6060000@oracle.com> Hi Volker, On 10/12/2014 3:39 AM, Volker Simonis wrote: > Hi, > > could I please get a review for the following trivial change which > simply implements os::pd_map_memory() on AIX: > > http://cr.openjdk.java.net/~simonis/webrevs/8067015/ > > Until now os::pd_map_memory() was only used in the context of class > data sharing (CDS) which isn't supported on AIX anyway, so we hadn't > implemented it in os_aix.cpp. > > However with the integration of the modularity change, > os::pd_map_memory() is now also needed for the loading of image files. > > The implementation is a straightforward copy of the corresponding Linux version. Change looks fine to me. > I'd like to push this directly to jdk9/dev/hotspot because it was > introduced there and because it affects all our AIX builds. I hope > that's no problem, especially because the change only touches > AIX-only files. Need to check this with Alejandro - cc'd. Thanks, David > Thank you and best regards, > Volker > From david.holmes at oracle.com Wed Dec 10 06:37:41 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 Dec 2014 16:37:41 +1000 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <5486FF9A.2050604@gmail.com> References: <542C8274.3010809@gmail.com> <543E80F8.3080204@gmail.com> <547330E5.1050708@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> <5486B952.4070708@oracle.com> <5486FF9A.2050604@gmail.com> Message-ID: <5487EA35.6070709@oracle.com> Hi Yasumasa, On 9/12/2014 11:56 PM, Yasumasa Suenaga wrote: > David, Thomas, > > Thank you so much! > I am waiting for a second reviewer. I'm a Reviewer and I think Thomas counts as a reviewer. Plus Staffan had a look too. So I think this is good to go - though I'll give it till my morning before finalizing it. > BTW, I'm not a committer, > so I'm also waiting for a sponsor :-) I will sponsor if you can prepare the changeset. Thanks, David
From thomas.stuefe at gmail.com Wed Dec 10 09:05:15 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 10 Dec 2014 10:05:15 +0100 Subject: RFR(XXS): 8067015: Implement os::pd_map_memory() on AIX In-Reply-To: References: Message-ID: Hi Volker, Looks fine :-) Kind Regards, Thomas
From yasuenag at gmail.com Wed Dec 10 14:44:19 2014 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 10 Dec 2014 23:44:19 +0900 Subject: RFR: JDK-8059586: hs_err report should treat redirected core pattern. In-Reply-To: <5487EA35.6070709@oracle.com> References: <542C8274.3010809@gmail.com>

<5479E9DE.7070703@gmail.com> <547C079A.9020604@oracle.com> <547C1815.6050900@oracle.com> <547C38AD.6050703@gmail.com> <547D891D.7010809@oracle.com> <547DCCFB.3050209@gmail.com> <54854475.20800@oracle.com> <54868B02.1070408@oracle.com> <5486B952.4070708@oracle.com> <5486FF9A.2050604@gmail.com> <5487EA35.6070709@oracle.com> Message-ID: <54885C43.4060500@gmail.com> Hi David, > I will sponsor if you can prepare the changeset. Thank you so much! I've uploaded a webrev which contains fixes for your comments. http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.05/ Could you push this patch? Thanks, Yasumasa (2014/12/10 15:37), David Holmes wrote: > Hi Yasumasa, > > On 9/12/2014 11:56 PM, Yasumasa Suenaga wrote: >> David, Thomas, >> >> Thank you so much! >> I am waiting for a second reviewer. > > I'm a Reviewer and I think Thomas counts as a reviewer. Plus Staffan had a look too. So I think this is good to go - though I'll give it till my morning before finalizing it. > >> BTW, I'm not a committer, >> so I'm also waiting for a sponsor :-) > > I will sponsor if you can prepare the changeset. > > Thanks, > David > >>> I'm okay with these changes. Just a minor style nit (no need for >>> updated webrev) can you remove the blank lines in os_linux.cpp: >>> >>> 6011 } >>> 6012 >>> 6013 } >>> 6014 >>> 6015 } >>> >>> 6057 } >>> 6058 >>> 6059 } >>> 6060 >>> 6061 } >>> >>> If anyone has any objections please raise them asap. >> >> I will upload a new webrev that fixes them after the review. >> >> >> Thanks, >> >> Yasumasa >> >> >> (2014/12/09 21:06), Thomas Stüfe wrote: >>> Yes, Sure :-) @Yasumasa : thank you for this patch! >>> >>> Kind regards, Thomas >>> >>> On Dec 9, 2014 9:56 AM, "David Holmes" wrote: >>> >>> Hi Thomas, >>> >>> So can we take this as-is for now and file a RFE to address your >>> concerns? >>> >>> Anybody else object to that?
>>> >>> Thanks, >>> David >>> >>> On 9/12/2014 6:09 PM, Thomas Stüfe wrote: >>> >>> Hi David, >>> >>> On Tue, Dec 9, 2014 at 6:39 AM, David Holmes wrote: >>> >>> Hi Thomas, >>> >>> On 8/12/2014 8:27 PM, Thomas Stüfe wrote: >>> >>> Hi, >>> >>> I do not really like the handling of the leading pipe symbol: >>> >>> To be fair to Yasumasa this aspect of the fix has been the same since Oct 15: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/ >>> >>> and was not flagged. >>> >>> You are right, I did not read those mails closely enough. >>> >>> So, we read the core_pattern, and if the pipe symbol is detected, we >>> write the core pattern minus the pipe symbol but plus a leading quote to >>> the output; the leading quote then serves as a hint to the layer above >>> in os_posix.cpp to treat this case specially. This means the logic >>> spills out of the platform-dependent os_linux.cpp into shared code, and >>> this is also difficult to read. >>> >>> This comes from the fact that "get_core_path()" assumes the core file is >>> written to the file system. I think it just does not fit anymore; it >>> would be better to replace it with something like >>> "os::print_core_file_location(outputStream* os)", so that the OS handles >>> both core path retrieval and the printing. Then the shared code >>> does not need to know whether the core file gets written traditionally or >>> piped to an executable or whatever. >>> >>> This sounds like a refactoring that I suggested would be too disruptive.
>>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015547.html >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015557.html >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015573.html >>> >>