From serguei.spitsyn at oracle.com  Sat Nov  1 09:59:39 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sat, 01 Nov 2014 02:59:39 -0700
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <5454218D.40009@oracle.com>
References: <543C591E.8010602@oracle.com>
	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>
	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>
	<544E8844.1070907@oracle.com>
	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com>
Message-ID: <5454AF0B.4060008@oracle.com>

On 10/31/14 4:55 PM, Yumin Qi wrote:
> Karen,
>
>   Thanks for your detailed message for debugging. Yes, from my 
> debugging, the exception did happen in TestThread rather than the main 
> thread. I have no idea why in the end the exception was reported in the 
> main thread.
>
>    You mention
>
> So that change to the test would be:
>     in TestTransformer:
>        if (loader != null) {
>            if (tName.equals("TestThread")) {
>                loadClasses(3);
>            }
>        }
>        return null;
>     }
>
> The loader is the one defined in the test case, right?

Not sure I understand your question correctly.

If the thread is TestThread, then most likely the answer is "Yes".
This one is expected:
                 sClassLoader = new URLClassLoader(new URL[] {sURL});

The class loading for TestThread has to happen in loadClasses(2).
I wonder if we ever observe any other loader for TestThread.

I ask because TestThread is pretty simple:

         private static class TestThread extends Thread {
                 private final int fIndex;
                 public TestThread(int index) {         // <== it is called with index = 2
                      super("TestThread");
                      fIndex = index;
                 }
                 public void  run() {  loadClasses(fIndex); }
         }
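
One way to check which loaders TestThread actually sees would be a transformer that only records (thread name, loader) pairs. This is a rough sketch under assumptions, not code from the test: the class name LoaderObservingTransformer, the describe() helper and the "seen" set are all hypothetical. Only the ClassFileTransformer interface, and the fact that the boot loader shows up as a null loader, are standard.

```java
import java.lang.instrument.ClassFileTransformer;
import java.net.URL;
import java.net.URLClassLoader;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical observer: records which loader each thread presents to transform().
class LoaderObservingTransformer implements ClassFileTransformer {
    // Distinct "thread -> loader" pairs seen so far (illustrative bookkeeping only).
    final Set<String> seen = ConcurrentHashMap.newKeySet();

    // The boot loader is represented by null in the instrumentation API.
    static String describe(ClassLoader loader) {
        return loader == null ? "boot" : loader.getClass().getName();
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        seen.add(Thread.currentThread().getName() + " -> " + describe(loader));
        return null; // observe only; null means "no transformation"
    }

    public static void main(String[] args) throws Exception {
        LoaderObservingTransformer t = new LoaderObservingTransformer();
        try (URLClassLoader ucl = new URLClassLoader(new URL[0])) {
            // Simulate two callbacks on a thread named like the one in the test.
            Thread worker = new Thread(() -> {
                t.transform(null, "sun/misc/URLClassPath$JarLoader$2", null, null, new byte[0]);
                t.transform(ucl, "TestClass2", null, null, new byte[0]);
            }, "TestThread");
            worker.start();
            worker.join();
        }
        t.seen.forEach(System.out::println);
    }
}
```

The main method only simulates the callbacks. In a real agent the recording itself can trigger class loading, so this would need care before being used in the failing scenario.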

Thanks,
Serguei


> The system class loader is never null.
> I will try this change, let's see if it can work it out.
>
> Thanks
> Yumin
>
> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>> Yumin,
>>
>>  From your earlier exception stack trace (many thanks) you reported:
>>
>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I don't know why this is in thread "main")
>> sun/misc/URLClassPath$JarLoader$2
>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:340)
>> at ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>> at ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>
>>
>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError -XX:+ShowMessageBoxOnError to get
>> a log file and stack trace. See my instructions below on how to do that.
>>
>> I did this, attached a debugger, which didn't help enough since I needed to see the java stack frames,
>>   and got an hs_err_log also, so the stack traces came from the error log.
>>
>> The stack trace was on Thread 2, which in the hs_err_log was TestThread (which makes sense for what the test logic says).
>> See later in email for stack traces from Thread 2.
>>
>> Summary of stack trace:
>>
>> TestThread:
>>    loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>      vm calls out to URLClassLoader.loadClass(String) which is inherited from java.lang.ClassLoader.loadClass(String)
>>      ... calls java.net.URLClassLoader.findClass(...) which calls
>>        DoPrivileged  java.net.URLClassLoader$1.run which calls
>>           sun.misc.URLClassPath.getResource(name, false)  which calls
>>               sun.misc.URLClassPath$JarLoader.getResource which calls
>>                   sun.misc.URLClassPath$JarLoader.checkResource which tries to call sun.misc.URLClassPath$JarLoader$2
>>     - and then the transformer jumps in with loadClasses(#) (where # is known to be 3) and walks the same logic, which tries to load sun.misc.URLClassPath$JarLoader$2 again
>>
>> Note that in the placeholder table information that Yumin printed, the circularity error is on sun.misc.URLClassPath$JarLoader$2 with the null == boot loader, which
>> makes sense -- that is the appropriate defining loader, and therefore the one the CFLH would intercept during the defineClass phase.
>>
>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the method checkResource
>> ... return new Resource() { ... }
>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1, $2 and $3 at build time or when that was added.
>> I would guess that is when the bug started happening.
>>
>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads before any TestClass1 loads.
>>
>> My belief is that the point of the test is to test parallel class loading for URL class loaders.
>> I don't think the point is to test the bootstrap class loader, nor to test bootstrapping - i.e. running the agent before
>> we have loaded sufficient classes to allow loading URLClassLoader classes.
>>
>> What I suggested to Yumin that he try would be to change the test to NOT intercept boot loader loads, so that sun.misc.URLClassPath$JarLoader$#
>> can load which will in turn allow classes loaded by a URLClassLoader subclass to load.
>>
>> So that change to the test would be:
>>     in TestTransformer:
>>        if (loader != null) {
>>            if (tName.equals("TestThread")) {
>>                loadClasses(3);
>>            }
>>        }
>>        return null;
>>     }
>> // I also suspect with that change, we can remove the sleep loop
>> Note: there was a printed message which said that the Thread "Signal Dispatcher" has called transform(), which I
>> ignored, however it is good that we don't call loadClass on that thread  - which is part of what the sleep loop does -
>> but that would be handled by the boot loader screening above
>>
>> Alternatively we can preload the URLClassPath classes, but I don't think we want to do that, or
>> we can have the agent explicitly screen on a variety of jdk bootstrapping classes. But I think the cleaner
>> solution is to screen on the boot loader.
>>
>> Does that make any sense to others?
>>
>> thanks,
>> Karen
>>
>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option, but with a shell script in the test, this is more complex, so
>> the following should be easier):
>>
>> So what I did was run the test once for it to pass (not your script, but just once with jtreg) so that it generated
>> the $DST/work directory.
>> I then created a rerun.csh script - attached - you can modify for your own $DST directory.
>> I used it to be able to quickly rerun the test without the jtreg framework and compile time etc. but mostly
>> to be able to actually add hotspot command-line flags.
>>
>>
>>
>> p.p.s. details from the error log (let me know if you want me to attach the error log to the bug report)
>>
>> note: error log shows last 10 events including:
>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>> Event: 0.928 loading class TestClass3
>> Event: 0.929 loading class TestClass3 done
>> Event: 0.929 loading class java/lang/ClassCircularityError
>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>
>> TestThread
>>
>> java frames:
>>
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>> j  ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>> j  sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>> j  sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>> v  ~StubRoutines::call_stub
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>> v  ~StubRoutines::call_stub
>>
>>
>>    
>>
>> detailed frames:
>>
>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*, int, Symbol*, char const*)+0x7c
>> V  [libjvm.so+0xce005c]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x7d8
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x690fbc]  ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*, ConstantPool*, int)+0x14a
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>> V  [libjvm.so+0xce2096]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*, Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>> j  ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>> j  sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>> j  sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>> V  [libjvm.so+0xa04afa]  JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>> V  [libjvm.so+0xa0485e]  JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>> V  [libjvm.so+0x9fb6e1]  JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle, unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*, ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*, TempNewSymbol&, bool, Thread*)+0x2af
>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*, ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*, Thread*)+0x2ed
>> V  [libjvm.so+0xce1cc4]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x690fbc]  ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*, ConstantPool*, int)+0x14a
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>> V  [libjvm.so+0xce2096]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*, Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>> ...<more frames>...


From karen.kinnear at oracle.com  Sat Nov  1 12:47:07 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Sat, 1 Nov 2014 08:47:07 -0400
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <5454AF0B.4060008@oracle.com>
References: <543C591E.8010602@oracle.com>
	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>
	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>
	<544E8844.1070907@oracle.com>
	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com> <5454AF0B.4060008@oracle.com>
Message-ID: <48A070EE-B8B8-4C3E-B29C-1A45BF48C84B@oracle.com>

So the loader this code refers to is the loader that is passed in when we get a CFLH
(ClassFileLoadHook) event. Run the test with -XX:+TraceClassLoading and, in the error
situation, see which class loader is trying to load URLClassPath$JarLoader$2. That should
be the boot loader, and that is the value of "loader" coming in to the TestTransformer
for the situation below.

The way I read this, the boot loader is trying to load URLClassPath$JarLoader$2 so that
the URLClassLoader can do its findClass. The agent intercepts that load and calls
loadClasses, which itself uses the loader in the test case. That would put us into an
infinite loop, except that circularity detection catches us.
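
To make the screening concrete, here is a minimal sketch of a transformer that ignores boot loader loads. It is an illustration under assumptions, not the actual test code: the class name BootLoaderScreeningTransformer, the shouldTrigger() helper and the counter that stands in for loadClasses(3) are hypothetical. The only parts taken as given are the ClassFileTransformer interface and the convention that a null loader means the boot loader.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical transformer that skips classes defined by the boot loader.
class BootLoaderScreeningTransformer implements ClassFileTransformer {
    // Stand-in for the test's loadClasses(3) call, so the sketch stays self-contained.
    final AtomicInteger triggered = new AtomicInteger();

    // A null loader means the class is being defined by the boot loader.
    static boolean shouldTrigger(ClassLoader loader, String threadName) {
        return loader != null && "TestThread".equals(threadName);
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (shouldTrigger(loader, Thread.currentThread().getName())) {
            triggered.incrementAndGet(); // the real test would load classes here
        }
        return null; // null tells the JVM to keep the original class bytes
    }

    public static void main(String[] args) throws InterruptedException {
        BootLoaderScreeningTransformer t = new BootLoaderScreeningTransformer();
        Thread worker = new Thread(() -> {
            // Boot loader load: screened out, so no nested class loading starts.
            t.transform(null, "sun/misc/URLClassPath$JarLoader$2", null, null, new byte[0]);
            // Load through an application loader: triggers the stand-in action.
            t.transform(ClassLoader.getSystemClassLoader(), "TestClass2", null, null, new byte[0]);
        }, "TestThread");
        worker.start();
        worker.join();
        System.out.println("triggered = " + t.triggered.get()); // prints "triggered = 1"
    }
}
```

With this shape, the load of URLClassPath$JarLoader$2 passes through the agent untouched, so the nested loadClasses call never starts while that class is still being loaded.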

See if that makes sense with experimentation please.

thanks,
Karen

On Nov 1, 2014, at 5:59 AM, serguei.spitsyn at oracle.com wrote:

> On 10/31/14 4:55 PM, Yumin Qi wrote:
>> Karen, 
>> 
>>   Thanks for your detailed message for debugging. Yes, from my debugging, the exception did happen in TestThread rather than the main thread. I have no idea why in the end the exception was reported in the main thread. 
>> 
>>    You mention 
>>  
>> So that change to the test would be:
>>    in TestTransformer:
>>       if (loader != null) {
>>           if (tName.equals("TestThread")) {
>>               loadClasses(3);
>>           }
>>       }
>>       return null;
>>    }
>> 
>> The loader is the one defined in the test case, right?
> 
> Not sure I understand your question correctly.
> 
> If the thread is TestThread, then most likely the answer is "Yes".
> This one is expected:
>                 sClassLoader = new URLClassLoader(new URL[] {sURL});
> 
> The class loading for TestThread has to happen in loadClasses(2).
> I wonder if we ever observe any other loader for TestThread.
> 
> I ask because TestThread is pretty simple:
> 
>         private static class TestThread extends Thread {
>                 private final int fIndex;
>                 public TestThread(int index) {         <== it is called with index = 2
>                      super("TestThread");
>                      fIndex = index;
>                 }
>                 public void  run() {  loadClasses(fIndex); }
>         }
> 
> Thanks,
> Serguei
> 
> 
>> The system class loader is never null.
>> I will try this change, let's see if it can work it out.
>> 
>> Thanks
>> Yumin  
>>   
>> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>>> Yumin,
>>> 
>>> From your earlier exception stack trace (many thanks) you reported:
>>> 
>>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I don't know why this is in thread "main")
>>> sun/misc/URLClassPath$JarLoader$2
>>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>>> at java.lang.Class.forName0(Native Method)
>>> at java.lang.Class.forName(Class.java:340)
>>> at ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>>> at ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>> 
>>> 
>>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError -XX:+ShowMessageBoxOnError to get
>>> a log file and stack trace. See my instructions below on how to do that.
>>> 
>>> I did this, attached a debugger, which didn't help enough since I needed to see the java stack frames,
>>>  and got an hs_err_log also, so the stack traces came from the error log.
>>> 
>>> The stack trace was on Thread 2, which in the hs_err_log was TestThread (which makes sense for what the test logic says).
>>> See later in email for stack traces from Thread 2.
>>> 
>>> Summary of stack trace:
>>> 
>>> TestThread:
>>>   loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>>     vm calls out to URLClassLoader.loadClass(String) which is inherited from java.lang.ClassLoader.loadClass(String)
>>>     ... calls java.net.URLClassLoader.findClass(...) which calls
>>>       DoPrivileged  java.net.URLClassLoader$1.run which calls
>>>          sun.misc.URLClassPath.getResource(name, false)  which calls
>>>              sun.misc.URLClassPath$JarLoader.getResource which calls
>>>                  sun.misc.URLClassPath$JarLoader.checkResource which tries to call sun.misc.URLClassPath$JarLoader$2
>>>    - and then the transformer jumps in with loadClasses(#) (where # is known to be 3) and walks the same logic, which tries to load sun.misc.URLClassPath$JarLoader$2 again
>>> 
>>> Note that in the placeholder table information that Yumin printed, the circularity error is on sun.misc.URLClassPath$JarLoader$2 with the null == boot loader, which
>>> makes sense -- that is the appropriate defining loader, and therefore the one the CFLH would intercept during the defineClass phase.
>>> 
>>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the method checkResource 
>>> ... return new Resource() { ... }
>>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1, $2 and $3 at build time or when that was added.
>>> I would guess that is when the bug started happening.
>>> 
>>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads before any TestClass1 loads.
>>> 
>>> My belief is that the point of the test is to test parallel class loading for URL class loaders.
>>> I don't think the point is to test the bootstrap class loader, nor to test bootstrapping - i.e. running the agent before
>>> we have loaded sufficient classes to allow loading URLClassLoader classes.
>>> 
>>> What I suggested to Yumin that he try would be to change the test to NOT intercept boot loader loads, so that sun.misc.URLClassPath$JarLoader$#
>>> can load which will in turn allow classes loaded by a URLClassLoader subclass to load.
>>> 
>>> So that change to the test would be:
>>>    in TestTransformer:
>>>       if (loader != null) {
>>>           if (tName.equals("TestThread")) {
>>>               loadClasses(3);
>>>           }
>>>       }
>>>       return null;
>>>    }
>>> // I also suspect with that change, we can remove the sleep loop
>>> Note: there was a printed message which said that the Thread "Signal Dispatcher" has called transform(), which I
>>> ignored, however it is good that we don't call loadClass on that thread  - which is part of what the sleep loop does -
>>> but that would be handled by the boot loader screening above
>>> 
>>> Alternatively we can preload the URLClassPath classes, but I don't think we want to do that, or
>>> we can have the agent explicitly screen on a variety of jdk bootstrapping classes. But I think the cleaner
>>> solution is to screen on the boot loader.
>>> 
>>> Does that make any sense to others?
>>> 
>>> thanks,
>>> Karen
>>> 
>>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option, but with a shell script in the test, this is more complex, so
>>> the following should be easier):
>>> 
>>> So what I did was run the test once for it to pass (not your script, but just once with jtreg) so that it generated
>>> the $DST/work directory.
>>> I then created a rerun.csh script - attached - you can modify for your own $DST directory.
>>> I used it to be able to quickly rerun the test without the jtreg framework and compile time etc. but mostly
>>> to be able to actually add hotspot command-line flags.
>>> 
>>> 
>>> 
>>> p.p.s. details from the error log (let me know if you want me to attach the error log to the bug report)
>>> 
>>> note: error log shows last 10 events including:
>>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>>> Event: 0.928 loading class TestClass3
>>> Event: 0.929 loading class TestClass3 done
>>> Event: 0.929 loading class java/lang/ClassCircularityError
>>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>> 
>>> TestThread 
>>> 
>>> java frames:
>>> 
>>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>> j  ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>> j  sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>> j  sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>> v  ~StubRoutines::call_stub
>>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>>> v  ~StubRoutines::call_stub
>>> 
>>> 
>>>   
>>> 
>>> detailed frames:
>>> 
>>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*, int, Symbol*, char const*)+0x7c
>>> V  [libjvm.so+0xce005c]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x7d8
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x690fbc]  ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*, ConstantPool*, int)+0x14a
>>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>> V  [libjvm.so+0xce2096]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*, Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>> j  ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>> j  sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>> j  sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>>> V  [libjvm.so+0xa04afa]  JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>>> V  [libjvm.so+0xa0485e]  JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>>> V  [libjvm.so+0x9fb6e1]  JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle, unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*, ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*, TempNewSymbol&, bool, Thread*)+0x2af
>>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*, ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*, Thread*)+0x2ed
>>> V  [libjvm.so+0xce1cc4]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x690fbc]  ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*, ConstantPool*, int)+0x14a
>>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>> V  [libjvm.so+0xce2096]  SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>> V  [libjvm.so+0xce00a8]  SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*, Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*, Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*, Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>> ...<more frames>...
> 


From peter.levart at gmail.com  Sat Nov  1 16:40:02 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Sat, 01 Nov 2014 17:40:02 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <D3573C59-2279-4977-B9D5-471C42145A80@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>
	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>
	<545406BC.20005@gmail.com>
	<D3573C59-2279-4977-B9D5-471C42145A80@oracle.com>
Message-ID: <54550CE2.9090605@gmail.com>


On 10/31/2014 11:59 PM, David Chase wrote:
> Thanks very much, I shall attend to these irregularities.
>
> David

Hi David,

Just a nit (in Class.ClassData):

2537             if (oldCapacity >  0) {
2538                 element_data[oldCapacity] = element_data[oldCapacity - 1];
2539                 // all array elements are non-null and sorted, increase size.
2540                 // if store to element_data above floats below
2541                 // store to size on the next line, that will be
2542                 // inconsistent to the VM if a safepoint occurs here.
2543                 size += 1;
2544                 for (int i = oldCapacity; i > index; i--) {
2545                     // pre: element_data[i] is duplicated at [i+1]
2546                     element_data[i] = element_data[i - 1];
2547                     // post: element_data[i-1] is duplicated at [i]
2548                 }
2549                 // element_data[index] is duplicated at [index+1]
2550                 element_data[index] = (Comparable<?>) e;
2551             } else {


In line 2544, you could start the for loop with (int i = oldCapacity - 
1; ...), since you have already moved the last element before 
incrementing the size. Also, I would more quickly grasp the code if 
"oldCapacity" was called "oldSize".

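To make the nit concrete, here is a minimal, self-contained sketch of the
insertion with the loop starting at oldSize - 1. The class name
SortedInsertSketch, its fields and the growth policy are hypothetical
stand-ins of mine, not the webrev code; the only point is that the live
prefix [0, size) stays non-null and sorted (with at most one duplicate)
after every single store.

```java
import java.util.Arrays;

public class SortedInsertSketch {
    // Hypothetical stand-ins for ClassData.elementData and ClassData.size.
    private Comparable<?>[] elementData = new Comparable<?>[4];
    private volatile int size;

    /** Inserts e at index; elementData[0..size) is non-null and sorted after every store. */
    public void insertAt(int index, Comparable<?> e) {
        int oldSize = size;
        if (oldSize + 1 > elementData.length) {
            // Replacing the array with a longer copy keeps the elements identical.
            elementData = Arrays.copyOf(elementData, oldSize + (oldSize >> 1) + 1);
        }
        if (oldSize > 0) {
            // Duplicate the last element into the new slot, then expose that slot.
            elementData[oldSize] = elementData[oldSize - 1];
            size = oldSize + 1;
            // Slot oldSize already holds a copy of [oldSize - 1], so start one lower.
            for (int i = oldSize - 1; i > index; i--) {
                elementData[i] = elementData[i - 1];
            }
            elementData[index] = e;
        } else {
            elementData[0] = e;
            size = 1;
        }
    }

    /** Returns a copy of the live prefix, for inspection. */
    public Object[] snapshot() {
        return Arrays.copyOf(elementData, size);
    }
}
```

Appending (index == oldSize) skips the loop entirely and simply overwrites
the duplicated slot with the new element.
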
Now just a thought...

What is the expected ratio of intern() calls that insert a new element
to those that just return an existing interned element? If calls that
return an existing element are frequent, and since you already carefully
arrange insertion so that the VM can see a "consistent" state without
null elements at any safepoint, I wonder if intern() could itself
perform an optimistic search without holding an exclusive lock.

This is just speculation, but would the following code work?

     private Comparable<?>[] elementData() {
         Comparable<?>[] elementData = this.elementData;
         if (elementData == null) {
             synchronized (this) {
                 elementData = this.elementData;
                 if (elementData == null) {
                     this.elementData = elementData = new Comparable[1];
                 }
             }
         }
         return elementData;
     }

     private final StampedLock lock = new StampedLock();

     public <E extends Comparable<? super E>> E intern(Class<?> klass,
             E memberName, int redefined_count) {
         int size, index = 0;
         Comparable<?>[] elementData;
         // try to take an optimistic-read stamp
         long rstamp = lock.tryOptimisticRead();
         long wstamp = 0L;

         if (rstamp != 0L) { // successful
             // 1st read size so that it doesn't overshoot the actual
             // elementData.length
             size = this.size;
             // 2nd read elementData
             elementData = elementData();

             index = Arrays.binarySearch(elementData, 0, size, memberName);
             if (index >= 0) {
                 E element = (E) elementData[index];
                 // validate that our reads were not disturbed by any writes
                 if (lock.validate(rstamp)) {
                     return element;
                 }
             }

             // try to convert to write lock
             wstamp = lock.tryConvertToWriteLock(rstamp);
         }

         if (wstamp == 0L) {
             // either tryOptimisticRead or tryConvertToWriteLock failed -
             // must acquire write lock and re-read/re-try search
             wstamp = lock.writeLock();
             size = this.size;
             elementData = elementData();
             index = Arrays.binarySearch(elementData, 0, size, memberName);
             if (index >= 0) {
                 E element = (E) elementData[index];
                 lock.unlockWrite(wstamp);
                 return element;
             }
         }

         // we have a write lock and are sure there was no element found
         E element = add(klass, ~index, memberName, redefined_count);

         lock.unlockWrite(wstamp);
         return element;
     }


The only change needed in the add() method is to publish new elements
safely. Code doing a binary search under an optimistic read could
observe an unsafely published MemberName, and comparing against such an
instance could lead to an NPE, for example. To remedy this, the newly
inserted MemberName would have to be published using a volatile write to
the array slot (using Unsafe). Moving existing elements up and down the
array does not have to be performed with volatile writes, since they
have already been published.
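
A rough sketch of what "volatile write to the array slot" could look
like. It uses a VarHandle for array elements (Java 9 and later) as a
stand-in for the Unsafe call mentioned above, purely so the example runs
on its own; the class name SlotPublishSketch and its methods are
hypothetical and not part of the webrev.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class SlotPublishSketch {
    // Array-element handle; an assumption standing in for Unsafe-based access.
    private static final VarHandle SLOT =
            MethodHandles.arrayElementVarHandle(Comparable[].class);

    private final Comparable<?>[] elementData = new Comparable<?>[8];

    /** Publishes a newly constructed element with volatile-store semantics. */
    public void publish(int index, Comparable<?> fresh) {
        SLOT.setVolatile(elementData, index, fresh);
    }

    /** Moves an element that was already published, using a plain store. */
    public void shiftUp(int from) {
        elementData[from + 1] = elementData[from];
    }

    /** Reader side: a volatile load that pairs with the store in publish(). */
    public Comparable<?> read(int index) {
        return (Comparable<?>) SLOT.getVolatile(elementData, index);
    }
}
```

Note that the reader in this sketch uses a volatile load, whereas the
intern() code above reads through Arrays.binarySearch, which performs
plain array loads.
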

Do you think this would be worth the effort?

Regards, Peter


> On 2014-10-31, at 6:01 PM, Peter Levart <peter.levart at gmail.com> wrote:
>
>> On 10/31/2014 07:11 PM, David Chase wrote:
>>> I found a lurking bug and updated the webrevs - I was mistaken
>>> about this version having passed the ute tests (but now, for real, it does).
>>>
>>> I also converted Christian's test code into a jtreg test (which passes):
>>>
>>>
>>> http://cr.openjdk.java.net/~drchase/8013267/hotspot.05/
>>> http://cr.openjdk.java.net/~drchase/8013267/jdk.05/
>> Hi David,
>>
>> I'll just comment on the JDK side of things.
>>
>> In Class.ClassData.intern(), there is a part that synchronizes on the elementData (volatile field holding array of Comparable(s)):
>>
>> 2500             synchronized (elementData) {
>> 2501                 final int index = Arrays.binarySearch(elementData, 0, size, memberName);
>> 2502                 if (index >= 0) {
>> 2503                     return (E) elementData[index];
>> 2504                 }
>> 2505                 // Not found, add carefully.
>> 2506                 return add(klass, ~index, memberName, redefined_count);
>> 2507             }
>>
>> Inside this synchronized block, add() method is called, which can call grow() method:
>>
>> 2522             if (oldCapacity + 1 > element_data.length ) {
>> 2523                 // Replacing array with a copy is safe; elements are identical.
>> 2524                 grow(oldCapacity + 1);
>> 2525                 element_data = elementData;
>> 2526             }
>>
>> grow() method creates a copy of elementData array and replaces it on this volatile field (line 2584):
>>
>> 2577         private void grow(int minCapacity) {
>> 2578             // overflow-conscious code
>> 2579             int oldCapacity = elementData.length;
>> 2580             int newCapacity = oldCapacity + (oldCapacity >> 1);
>> 2581             if (newCapacity - minCapacity < 0)
>> 2582                 newCapacity = minCapacity;
>> 2583             // minCapacity is usually close to size, so this is a win:
>> 2584             elementData = Arrays.copyOf(elementData, newCapacity);
>> 2585         }
>>
>> A concurrent call to intern() can therefore synchronize on a different monitor, so two threads will be inserting the element into the same array at the same time. Ouch!
>>
>>
>>
>> Also, lazy construction of ClassData instance:
>>
>> 2593     private ClassData<T> classData() {
>> 2594         if (this.classData != null) {
>> 2595             return this.classData;
>> 2596         }
>> 2597         synchronized (this) {
>> 2598             if (this.classData == null) {
>> 2599                 this.classData = new ClassData<>();
>> 2600             }
>> 2601         }
>> 2602         return this.classData;
>> 2603     }
>>
>> Synchronizes on the j.l.Class instance, which can interfere with user synchronization (think synchronized static methods). This is dangerous.
>>
>> There's an inner class Class.Atomic which is a home for Unsafe machinery in j.l.Class. You can add a casClassData method to it and use it to atomically install the ClassData instance without synchronized blocks.
>>
>>
>>
>> Regards, Peter
>>
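
Regarding the quoted point about lazy ClassData construction, the
CAS-based install could be sketched roughly as follows. This is only an
illustration under my own assumptions: it uses
AtomicReferenceFieldUpdater in place of the Unsafe-based Class.Atomic
helper, and the names LazyInstallSketch, Data and DATA are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class LazyInstallSketch {
    /** Hypothetical stand-in for Class.ClassData. */
    public static final class Data { }

    private volatile Data data;

    // Stands in for the casClassData helper suggested for Class.Atomic.
    private static final AtomicReferenceFieldUpdater<LazyInstallSketch, Data> DATA =
            AtomicReferenceFieldUpdater.newUpdater(LazyInstallSketch.class, Data.class, "data");

    /** Lazily installs the Data instance without taking any monitor. */
    public Data data() {
        Data d = data;
        if (d == null) {
            Data fresh = new Data();
            // Only one thread wins the CAS; a loser drops its instance and re-reads the field.
            d = DATA.compareAndSet(this, null, fresh) ? fresh : data;
        }
        return d;
    }
}
```

A losing thread allocates one instance that is thrown away, which is the
usual price of avoiding the lock on the j.l.Class instance.
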


From david.r.chase at oracle.com  Sat Nov  1 17:03:28 2014
From: david.r.chase at oracle.com (David Chase)
Date: Sat, 1 Nov 2014 13:03:28 -0400
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <54550CE2.9090605@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>
	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>
	<545406BC.20005@gmail.com>
	<D3573C59-2279-4977-B9D5-471C42145A80@oracle.com>
	<54550CE2.9090605@gmail.com>
Message-ID: <13FF3B65-027D-4152-8CEF-0F31773976EA@oracle.com>

Hello Peter,

I think it is expected that inserting-interns will be asymptotically rare - classes
have a finite number of methods, after all.  I'm not sure if that is worth doing right
now, since this is also a bug fix - maybe the performance enhancements go
in as an RFE. 

Maybe other reviewers will have an opinion?

David

On 2014-11-01, at 12:40 PM, Peter Levart <peter.levart at gmail.com> wrote:

> 
> On 10/31/2014 11:59 PM, David Chase wrote:
>> Thanks very much, I shall attend to these irregularities.
>> 
>> David
>> 
> 
> Hi David,
> 
> Just a nit (in Class.ClassData):
> 
> 2537             if (oldCapacity >  0) {
> 2538                 element_data[oldCapacity] = element_data[oldCapacity - 1];
> 2539                 // all array elements are non-null and sorted, increase size.
> 2540                 // if store to element_data above floats below
> 2541                 // store to size on the next line, that will be
> 2542                 // inconsistent to the VM if a safepoint occurs here.
> 2543                 size += 1;
> 2544                 for (int i = oldCapacity; i > index; i--) {
> 2545                     // pre: element_data[i] is duplicated at [i+1]
> 2546                     element_data[i] = element_data[i - 1];
> 2547                     // post: element_data[i-1] is duplicated at [i]
> 2548                 }
> 2549                 // element_data[index] is duplicated at [index+1]
> 2550                 element_data[index] = (Comparable<?>) e;
> 2551             } else {
> 
> In line 2544, you could start the for loop with (int i = oldCapacity - 1; ...), since you have already moved the last element before incrementing the size. Also, I would more quickly grasp the code if "oldCapacity" was called "oldSize".
> 
> Now just a thought...
> 
> What is the expected ratio of intern() calls that insert new element to those that just return existing interned element? If those that return existing element are frequent and since you already carefully arrange insertion so that VM can at any safepoint see the "consistent" state without null elements, I wonder if intern() could itself perform an optimistic search without holding an exclusive lock. 
> 
> This is just a speculation, but would the following code work?
> 
>     private Comparable<?>[] elementData() {
>         Comparable<?>[] elementData = this.elementData;
>         if (elementData == null) {
>             synchronized (this) {
>                 elementData = this.elementData;
>                 if (elementData == null) {
>                     this.elementData = elementData = new Comparable[1];
>                 }
>             }
>         }
>         return elementData;
>     }
> 
>     private final StampedLock lock = new StampedLock();
> 
>     public <E extends Comparable<? super E>> E intern(Class<?> klass, E memberName, int redefined_count) {
>         int size, index = 0;
>         Comparable<?>[] elementData;
>         // try to take an optimistic-read stamp
>         long rstamp = lock.tryOptimisticRead();
>         long wstamp = 0L;
>         
>         if (rstamp != 0L) { // successful
>             // 1st read size so that it doesn't overshoot the actual elementData.length
>             size = this.size;
>             // 2nd read elementData
>             elementData = elementData();
> 
>             index = Arrays.binarySearch(elementData, 0, size, memberName);
>             if (index >= 0) {
>                 E element = (E) elementData[index];
>                 // validate that our reads were not disturbed by any writes
>                 if (lock.validate(rstamp)) {
>                     return element;
>                 }
>             }
> 
>             // try to convert to write lock
>             wstamp = lock.tryConvertToWriteLock(rstamp);
>         }
> 
>         if (wstamp == 0L) {
>             // either tryOptimisticRead or tryConvertToWriteLock failed -
>             // must acquire write lock and re-read/re-try search
>             wstamp = lock.writeLock();
>             size = this.size;
>             elementData = elementData();
>             index = Arrays.binarySearch(elementData, 0, size, memberName);
>             if (index >= 0) {
>                 E element = (E) elementData[index];
>                 lock.unlockWrite(wstamp);
>                 return element;
>             }
>         }
> 
>         // we have a write lock and are sure there was no element found
>         E element = add(klass, ~index, memberName, redefined_count);
> 
>         lock.unlockWrite(wstamp);
>         return element;
>     }
> 
> 
> 
> The only change needed in the add() method is to publish new elements safely. Code doing a binary search under an optimistic read could observe an unsafely published MemberName, and comparing against such an instance could lead to an NPE, for example. To remedy this, the newly inserted MemberName would have to be published using a volatile write to the array slot (using Unsafe). Moving existing elements up and down the array does not have to be performed with volatile writes, since they have already been published.
> 
> Do you think this would be worth the effort?
> 
> Regards, Peter
> 
> 
>> On 2014-10-31, at 6:01 PM, Peter Levart <peter.levart at gmail.com>
>>  wrote:
>> 
>> 
>>> On 10/31/2014 07:11 PM, David Chase wrote:
>>> 
>>>> I found a lurking bug and updated the webrevs - I was mistaken
>>>> about this version having passed the ute tests (but now, for real, it does).
>>>> 
>>>> I also converted Christian's test code into a jtreg test (which passes):
>>>> 
>>>> 
>>>> 
>>>> http://cr.openjdk.java.net/~drchase/8013267/hotspot.05/
>>>> http://cr.openjdk.java.net/~drchase/8013267/jdk.05/
>>> Hi David,
>>> 
>>> I'll just comment on the JDK side of things.
>>> 
>>> In Class.ClassData.intern(), there is a part that synchronizes on the elementData (volatile field holding array of Comparable(s)):
>>> 
>>> 2500             synchronized (elementData) {
>>> 2501                 final int index = Arrays.binarySearch(elementData, 0, size, memberName);
>>> 2502                 if (index >= 0) {
>>> 2503                     return (E) elementData[index];
>>> 2504                 }
>>> 2505                 // Not found, add carefully.
>>> 2506                 return add(klass, ~index, memberName, redefined_count);
>>> 2507             }
>>> 
>>> Inside this synchronized block, add() method is called, which can call grow() method:
>>> 
>>> 2522             if (oldCapacity + 1 > element_data.length ) {
>>> 2523                 // Replacing array with a copy is safe; elements are identical.
>>> 2524                 grow(oldCapacity + 1);
>>> 2525                 element_data = elementData;
>>> 2526             }
>>> 
>>> grow() method creates a copy of elementData array and replaces it on this volatile field (line 2584):
>>> 
>>> 2577         private void grow(int minCapacity) {
>>> 2578             // overflow-conscious code
>>> 2579             int oldCapacity = elementData.length;
>>> 2580             int newCapacity = oldCapacity + (oldCapacity >> 1);
>>> 2581             if (newCapacity - minCapacity < 0)
>>> 2582                 newCapacity = minCapacity;
>>> 2583             // minCapacity is usually close to size, so this is a win:
>>> 2584             elementData = Arrays.copyOf(elementData, newCapacity);
>>> 2585         }
>>> 
>>> A concurrent call to intern() can therefore synchronize on a different monitor, so two threads will be inserting the element into the same array at the same time. Ouch!
>>> 
>>> 
>>> 
>>> Also, lazy construction of ClassData instance:
>>> 
>>> 2593     private ClassData<T> classData() {
>>> 2594         if (this.classData != null) {
>>> 2595             return this.classData;
>>> 2596         }
>>> 2597         synchronized (this) {
>>> 2598             if (this.classData == null) {
>>> 2599                 this.classData = new ClassData<>();
>>> 2600             }
>>> 2601         }
>>> 2602         return this.classData;
>>> 2603     }
>>> 
>>> Synchronizes on the j.l.Class instance, which can interfere with user synchronization (think synchronized static methods). This is dangerous.
>>> 
>>> There's an inner class Class.Atomic which is a home for Unsafe machinery in j.l.Class. You can add a casClassData method to it and use it to atomically install the ClassData instance without synchronized blocks.
>>> 
>>> 
>>> 
>>> Regards, Peter
>>> 
>>> 
> 


From david.r.chase at oracle.com  Mon Nov  3 00:05:16 2014
From: david.r.chase at oracle.com (David Chase)
Date: Sun, 2 Nov 2014 19:05:16 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>
	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>
	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>
	<5453C230.8010709@oracle.com>
	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>
	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>
Message-ID: <1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>


On 2014-10-31, at 5:45 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:

> The volatile load prevents subsequent loads and stores from reordering with it, but that doesn't stop C from moving before the B store.  So breaking B into the load (call it BL) and store (BS) you can still get this ordering: A, BL, C, BS

I think this should do the trick.

                element_data[oldCapacity] = element_data[oldCapacity - 1];
                // all array elements are non-null and sorted, increase size.
                // if store to element_data above floats below
                // store to size on the next line, that will be
                // inconsistent to the VM if a safepoint occurs here.
                size += 1;
                // Load of volatile size prevents movement of element_data store
                for (int i = size - 1; i > index; i--) {

The change is to load the volatile size for the loop bound; this stops the stores
in the loop from moving earlier, right?

David


From david.holmes at oracle.com  Mon Nov  3 02:29:29 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 12:29:29 +1000
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW1okT_-iYx5Ng1NrimppPezfNWg9tGf-w+rESqd_Cf6KQ@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5453AC95.3060904@oracle.com>
	<CAPYFHW1okT_-iYx5Ng1NrimppPezfNWg9tGf-w+rESqd_Cf6KQ@mail.gmail.com>
Message-ID: <5456E889.8080100@oracle.com>

Again adding in serviceability.

David

On 1/11/2014 6:17 AM, Jeremy Manson wrote:
> Thanks, Coleen - I saw that you committed it, but the change had a long
> contributed-by line, so I wasn't sure whether you were the right person to
> reach out to.
>
> Jeremy
>
> On Fri, Oct 31, 2014 at 8:36 AM, Coleen Phillimore <
> coleen.phillimore at oracle.com> wrote:
>
>>
>> Jeremy,
>> I will review and sponsor this for you since I wrote the original code.
>> Thanks,
>> Coleen
>>
>>
>> On 10/30/14, 1:02 PM, Jeremy Manson wrote:
>>
>>> There's a significant regression in the speed of JVMTI GetClassMethods in
>>> JDK8. I've tracked this down to allocation of jmethodids in a tight loop.
>>> The issue can be addressed by preallocating enough space for all of the
>>> jmethodids when starting the operation and not iterating over all of the
>>> existing jmethodids when you allocate a new one.
>>>
>>> A patch is here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>
>>> A reproducible test case can be found here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>
>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>
>>> For whoever reviews it: can you explain to me why it is okay that this
>>> code reuses jmethodIDs (in JNIMethodBlock::add_method)?  I can imagine a
>>> lot of problems stemming from accidental reuse.
>>>
>>> Jeremy
>>>
>>
>>

From david.holmes at oracle.com  Mon Nov  3 02:39:27 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 12:39:27 +1000
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <5454218D.40009@oracle.com>
References: <543C591E.8010602@oracle.com>	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>	<544E8844.1070907@oracle.com>	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com>
Message-ID: <5456EADF.4050203@oracle.com>

On 1/11/2014 9:55 AM, Yumin Qi wrote:
> Karen,
>
>    Thanks for your detailed message for debugging. Yes, from my debugging,
> the exception did happen in TestThread rather than the main thread. I have
> no idea why in the end the exception was reported in the main thread.

Until that question is answered I will remain uneasy about simply 
tweaking the test until it no longer fails. I would also like to know 
when it started failing - Karen alludes to the possible introduction of 
a new inner class at some point.

Thanks,
David

>     You mention
>
> So that change to the test would be:
>     in TestTransformer:
>        if (loader != null) {
>            if (tName.equals("TestThread")) {
>               loadClasses(3);
>            }
>        }
>        return null;
>     }
>
>
> The loader is the one defined in the test case, right? The system class
> loader is never null.
> I will try this change, let's see if it works out.
>
> Thanks
> Yumin
>
> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>> Yumin,
>>
>>  From your earlier exception stack trace (many thanks) you reported:
>>
>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I
>> don't know why this is in thread "main")
>> sun/misc/URLClassPath$JarLoader$2
>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:340)
>> at ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>> at ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>
>>
>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError
>> -XX:+ShowMessageBoxOnError to get
>> a log file and stack trace. See my instructions below on how to do that.
>>
>> I did this, attached a debugger, which didn't help enough since I
>> needed to see the java stack frames,
>>   and got an hs_err_log also, so the stack traces came from the error
>> log.
>>
>> The stack trace was on Thread 2, which in the hs_err_log was
>> TestThread (which makes sense for what the test logic says).
>> See later in email for stack traces from Thread 2.
>>
>> Summary of stack trace:
>>
>> TestThread:
>>    loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>      vm calls out to URLClassLoader.loadClass(String) which is
>> inherited from java.lang.ClassLoader.loadClass(String)
>>      ... calls java.net.URLClassLoader.findClass(...) which calls
>>        DoPrivileged  java.net.URLClassLoader$1.run which calls
>>           sun.misc.URLClassPath.getResource(name, false)  which calls
>>               sun.misc.URLClassPath$JarLoader.getResource which calls
>>                   sun.misc.URLClassPath$JarLoader.checkResource which
>> tries to call sun.misc.URLClassPath$JarLoader$2
>>     - and then the transformer jumps in with loadClasses(#) (which we
>> know is 3) and walks the same logic, which tries to load
>> sun.misc.URLClassPath$JarLoader$2 again
>>
>> Note that in the placeholder table information that Yumin printed, the
>> circularity error is on sun.misc.URLClassPath$JarLoader$2 with the
>> null == boot loader, which
>> makes sense -- that is the appropriate defining loader, and therefore
>> the one the CFLH would intercept during the defineClass phase.
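
The re-entrancy that trips the placeholder check can be modelled in a
few lines. This is only a sketch under simplifying assumptions:
PlaceholderCircularitySketch and its methods are hypothetical, the
per-thread set merely stands in for the VM's placeholder table, and the
real logic in SystemDictionary is considerably more involved.

```java
import java.util.HashSet;
import java.util.Set;

final class PlaceholderCircularitySketch {
    // Names the current thread is in the middle of loading
    // (hypothetical stand-in for the VM's placeholder table).
    private final ThreadLocal<Set<String>> inProgress =
            ThreadLocal.withInitial(HashSet::new);

    /** Marks name as "loading", runs duringLoad, then clears the mark. */
    void load(String name, Runnable duringLoad) {
        Set<String> loading = inProgress.get();
        if (!loading.add(name)) {
            // The same thread is already loading this class.
            throw new ClassCircularityError(name);
        }
        try {
            duringLoad.run();
        } finally {
            loading.remove(name);
        }
    }

    /** Replays the nesting from the stack trace; returns the error message. */
    static String demo() {
        PlaceholderCircularitySketch table = new PlaceholderCircularitySketch();
        String helper = "sun/misc/URLClassPath$JarLoader$2";
        try {
            // Loading the helper class fires the transformer, which loads
            // TestClass3, which needs the same helper class again.
            table.load(helper,
                () -> table.load("TestClass3",
                    () -> table.load(helper, () -> { })));
            return "no error";
        } catch (ClassCircularityError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("circularity on: " + demo());
    }
}
```

If the helper class has finished loading before the transformer runs,
the inner request never finds an in-progress entry, which matches the
observation that successful runs load JarLoader$2 early.
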
>>
>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the
>> method checkResource
>> ... return new Resource() { ... }
>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1,
>> $2 and $3 at build time or when that was added.
>> I would guess that is when the bug started happening.
>>
>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads
>> before any TestClass1 loads.
>>
>> My belief is that the point of the test is to test parallel class
>> loading for URL class loaders.
>> I don't think the point is to test the bootstrap class loader, nor to
>> test bootstrapping - i.e. running the agent before
>> we have loaded sufficient classes to allow loading URLClassLoader
>> classes.
>>
>> What I suggested to Yumin that he try would be to change the test to
>> NOT intercept boot loader loads, so that
>> sun.misc.URLClassPath$JarLoader$#
>> can load which will in turn allow classes loaded by a URLClassLoader
>> subclass to load.
>>
>> So that change to the test would be:
>>     in TestTransformer:
>>        if (loader != null) {
>>            if (tName.equals("TestThread")) {
>>               loadClasses(3);
>>            }
>>        }
>>        return null;
>>     }
>> // I also suspect with that change, we can remove the sleep loop
>> Note: there was a printed message which said that the Thread "Signal
>> Dispatcher" has called transform(), which I
>> ignored, however it is good that we don't call loadClass on that
>> thread  - which is part of what the sleep loop does -
>> but that would be handled by the boot loader screening above
>>
>> Alternatively we can preload the URLClassPath classes, but I don't
>> think we want to do that, or
>> we can have the agent explicitly screen on a variety of jdk
>> bootstrapping classes. But I think the cleaner
>> solution is to screen on the boot loader.
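
The boot-loader screening described here can be sketched as a
standalone transformer. This is a minimal sketch, not the test's actual
agent: ScreeningTransformer, nestedLoad and countNestedLoads are
hypothetical names, and nestedLoad only stands in for the test's
loadClasses(3). It relies on the documented behaviour that transform()
receives a null loader for classes defined by the bootstrap loader.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

final class BootLoaderScreeningSketch {

    /** Runs nestedLoad only for non-boot loads that happen on "TestThread". */
    static final class ScreeningTransformer implements ClassFileTransformer {
        private final Runnable nestedLoad; // stand-in for loadClasses(3)

        ScreeningTransformer(Runnable nestedLoad) {
            this.nestedLoad = nestedLoad;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            String tName = Thread.currentThread().getName();
            // loader == null: the bootstrap loader is defining the class,
            // so do not start a nested load from here.
            if (loader != null && tName.equals("TestThread")) {
                nestedLoad.run();
            }
            return null; // null means "leave the class bytes unchanged"
        }
    }

    /** Calls transform() once on a thread with the given name; returns the nested-load count. */
    static int countNestedLoads(ClassLoader loader, String threadName) {
        AtomicInteger calls = new AtomicInteger();
        ScreeningTransformer t = new ScreeningTransformer(calls::incrementAndGet);
        Thread worker = new Thread(
            () -> t.transform(loader, "TestClass2", null, null, new byte[0]),
            threadName);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        ClassLoader app = ClassLoader.getSystemClassLoader();
        System.out.println("boot loader:  " + countNestedLoads(null, "TestThread"));
        System.out.println("app loader:   " + countNestedLoads(app, "TestThread"));
        System.out.println("other thread: " + countNestedLoads(app, "main"));
    }
}
```

Calling transform() directly, as countNestedLoads does, only exercises
the screening logic; it does not install the transformer as an agent.
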
>>
>> Does that make any sense to others?
>>
>> thanks,
>> Karen
>>
>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option,
>> but with a shell script in the test, this is more complex, so
>> the following should be easier):
>>
>> So what I did was run the test once for it to pass (not your script,
>> but just once with jtreg) so that it generated
>> the $DST/work directory.
>> I then created a rerun.csh script - attached - you can modify for your
>> own $DST directory.
>> I used it to be able to quickly rerun the test without the jtreg
>> framework and compile time etc. but mostly
>> to be able to actually add hotspot command-line flags.
>>
>>
>>
>>
>> p.p.s. details from the error log (let me know if you want me to
>> attach the error log to the bug report)
>>
>> note: error log shows last 10 events including:
>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>> Event: 0.928 loading class TestClass3
>> Event: 0.929 loading class TestClass3 done
>> Event: 0.929 loading class java/lang/ClassCircularityError
>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>
>> TestThread
>>
>> java frames:
>>
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>> j  ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>> j  sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>> j  sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>> v  ~StubRoutines::call_stub
>> j  sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>> j  sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>> j  sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> j  java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>> j  java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> j  java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>> j  java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>> v  ~StubRoutines::call_stub
>>
>>
>>
>> detailed frames:
>>
>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*,
>> int, Symbol*, char const*)+0x7c
>> V  [libjvm.so+0xce005c]
>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>> Handle, Thread*)+0x7d8
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>> Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>> Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x690fbc]
>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>> ConstantPool*, int)+0x14a
>> j
>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>
>> j
>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>
>> j
>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>> JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>> j
>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>
>> j
>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j
>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>> JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>> V  [libjvm.so+0xce2096]
>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>> V  [libjvm.so+0xce00a8]
>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>> Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>> Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>> Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>> j
>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>
>> j
>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>
>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>> j
>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>
>> j
>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>
>> j
>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>> JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*,
>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>> V  [libjvm.so+0xa04afa]
>> JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>> V  [libjvm.so+0xa0485e]
>> JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>> V  [libjvm.so+0x9fb6e1]
>> JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle,
>> unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*,
>> ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*,
>> TempNewSymbol&, bool, Thread*)+0x2af
>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*,
>> ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*,
>> Thread*)+0x2ed
>> V  [libjvm.so+0xce1cc4]
>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>> V  [libjvm.so+0xce00a8]
>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>> Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>> Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>> Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x690fbc]
>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>> ConstantPool*, int)+0x14a
>> j
>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>
>> j
>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>
>> j
>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>> JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>> j
>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>
>> j
>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>> j
>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>> JavaCallArguments*, Thread*)+0x7d
>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>> V  [libjvm.so+0xce2096]
>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>> V  [libjvm.so+0xce00a8]
>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>> Handle, Thread*)+0x824
>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>> Handle, Handle, Thread*)+0x26d
>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>> Handle, Handle, bool, Thread*)+0x39
>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>> j
>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>
>> j
>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>
>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>> ...<more frames>...
>>
>> On Oct 27, 2014, at 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>
>>> Ok.
>>>
>>> Thanks, Dan!
>>> Serguei
>>>
>>>
>>> On 10/27/14 7:05 AM, Daniel D. Daugherty wrote:
>>>>> The test case was added by Dan.
>>>>> We may want to ask him to clarify the test case purpose.
>>>>> (added Dan to the to-list)
>>>> Here's the changeset that added the test:
>>>>
>>>> $ hg log -v -r bca8bf23ac59
>>>> test/java/lang/instrument/ParallelTransformerLoader.sh
>>>> changeset:   132:bca8bf23ac59
>>>> user:        dcubed
>>>> date:        Mon Mar 24 15:05:09 2008 -0700
>>>> files: test/java/lang/instrument/ParallelTransformerLoader.sh
>>>> test/java/lang/instrument/ParallelTransformerLoaderAgent.java
>>>> test/java/lang/instrument/ParallelTransformerLoaderApp.java
>>>> test/java/lang/instrument/TestClass1.java
>>>> test/java/lang/instrument/TestClass2.java
>>>> test/java/lang/instrument/TestClass3.java
>>>> description:
>>>> 5088398: 3/2 java.lang.instrument TCK test deadlock (test11)
>>>> Summary: Add regression test for single-threaded bootstrap classloader.
>>>> Reviewed-by: sspitsyn
>>>>
>>>>
>>>> Based on my e-mail archive for this bug and from the bug report itself,
>>>> it looks like we got this test from Wily Labs. The original bug was a
>>>> deadlock that stopped being reproducible after:
>>>>
>>>> Karen fixed the bootstrap class loader to work in parallel via:
>>>>
>>>>     4997893 4/5 Investigate allowing bootstrap loader to work in
>>>> parallel
>>>>
>>>> with that fix in place the deadlock no longer reproduces.
>>>> I'm planning to use this bug as the vehicle for getting
>>>> the test program into the INSTRUMENT_REGRESSION test suite.
>>>>
>>>> *** (#2 of 2): 2008-02-29 18:20:17 GMT+00:00 daniel.daugherty at sun.com
>>>>
>>>>
>>>> A careful reading of JDK-5088398 might reveal the intentions of this
>>>> test...
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 10/24/14 5:57 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Yumin,
>>>>>
>>>>> On 10/24/14 4:08 PM, Yumin Qi wrote:
>>>>>> Serguei,
>>>>>>
>>>>>>   Thanks for your comments.
>>>>>>   This test fails intermittently, but it can now be reproduced with 8/9.
>>>>>>   TestClass1 is loaded in the main thread while TestClass2 is loaded
>>>>>> in TestThread in parallel. Both loads call transform, since
>>>>>> TestClass[1-3] are loaded via the agent. While loading TestClass2,
>>>>>> transform loads TestClass3 in TestThread.
>>>>>>   Note in the main thread, for loop:
>>>>>>
>>>>>>                 for (int i = 0; i < kNumIterations; i++)
>>>>>>                 {
>>>>>>                         // load some classes from multiple threads
>>>>>>                         // (this thread and one other)
>>>>>>                         Thread testThread = new TestThread(2);
>>>>>>                         testThread.start();
>>>>>>                         loadClasses(1);
>>>>>>
>>>>>>                         // log that it completed and reset for the
>>>>>>                         // next iteration
>>>>>>                         testThread.join();
>>>>>>                         System.out.print(".");
>>>>>>                         ParallelTransformerLoaderAgent.generateNewClassLoader();
>>>>>>                 }
>>>>>>
>>>>>> The loader is replaced only after testThread.join(), so both
>>>>>> threads use the exact same class loader.
>>>>> You are right, thanks.
>>>>> It means that all three classes (TestClass1, TestClass2 and TestClass3)
>>>>> are loaded by the same class loader in each iteration.
>>>>>
>>>>> However, I see more cases when the TestClass3 gets loaded.
>>>>> It happens in a CFLH event when any other class (not TestClass*) in
>>>>> the system is loaded.
>>>>> The class loading thread can be any thread, not only the "main" or
>>>>> "TestThread" thread.
>>>>> I suspect this test case mostly targets class loading that happens
>>>>> on other threads.
>>>>> It is because of the lines:
>>>>>                         // In 160_03 and older, transform() is called
>>>>>                         // with the "system_loader_lock" held and that
>>>>>                         // prevents the bootstrap class loader from
>>>>>                         // running in parallel. If we add a slight sleep
>>>>>                         // delay here when the transform() call is not
>>>>>                         // main or TestThread, then the deadlock in
>>>>>                         // 160_03 and older is much more reproducible.
>>>>>                         if (!tName.equals("main") &&
>>>>>                             !tName.equals("TestThread")) {
>>>>>                             System.out.println("Thread '" + tName +
>>>>>                                 "' has called transform()");
>>>>>                             try {
>>>>>                                 Thread.sleep(500);
>>>>>                             } catch (InterruptedException ie) {
>>>>>                             }
>>>>>                         }
>>>>>
>>>>> What about the following?
>>>>>
>>>>> In the ParallelTransformerLoaderAgent.java  make this change:
>>>>>               if (!tName.equals("main"))
>>>>>                   => if (tName.equals("TestThread"))
>>>>>
>>>>> Does the updated test still fail?
>>>>>
>>>>>> After a new class loader is created, the next loop iteration uses it.
>>>>>> This is why quite often on the stack trace we can see it resolves
>>>>>> JarLoader$2.
>>>>>>
>>>>>> I do not quite understand the test case either. Loading TestClass3
>>>>>> inside transform using the same class loader causes another call to
>>>>>> transform and forms a cycle. Nonetheless, if we see that
>>>>>> TestClass2 is already loaded, the loop will end, but that is still a
>>>>>> risk.
>>>>> In fact, I don't like that the test loads the class TestClass3 at
>>>>> the TestClass3 CFLH event.
>>>>> However, it is interesting to know why we did not see (is it the
>>>>> case?) this issue before.
>>>>> Also, it is interesting why the test stops failing with your fix
>>>>> (replacing loader with SystemClassLoader).
>>>>>
>>>>> The test case was added by Dan.
>>>>> We may want to ask him to clarify the test case purpose.
>>>>> (added Dan to the to-list)
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 10/24/2014 1:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yumin,
>>>>>>>
>>>>>>> Below is some analysis to make sure I understand the test
>>>>>>> scenario correctly.
>>>>>>>
>>>>>>> The ParallelTransformerLoaderApp.main() executes a 1000 iteration
>>>>>>> loop.
>>>>>>> At each iteration it does:
>>>>>>>   - creates and starts a new TestThread
>>>>>>>   - loads TestClass1 with the current class loader:
>>>>>>> ParallelTransformerLoaderAgent.getClassLoader()
>>>>>>>   - changes the current class loader with new one:
>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader()
>>>>>>>
>>>>>>> The TestThread loads the TestClass2 concurrently with the main
>>>>>>> thread.
>>>>>>>
>>>>>>> At the CFLH events, the ParallelTransformerLoaderAgent does the
>>>>>>> class retransformation.
>>>>>>> If the thread loading the class is not "main", it loads the class
>>>>>>> TestClass3
>>>>>>> with the current class loader
>>>>>>> ParallelTransformerLoaderAgent.getClassLoader().
>>>>>>>
>>>>>>> Sometimes, the TestClass2 and TestClass3 are loaded by the same
>>>>>>> class loader recursively.
>>>>>>> It happens if the class loader has not been changed between
>>>>>>> loading TestClass2 and TestClass3 classes.
>>>>>>>
>>>>>>> I'm not convinced yet the test is incorrect.
>>>>>>> And it is not clear why we get a ClassCircularityError.
>>>>>>>
>>>>>>> Please, let me know if the above understanding is wrong.
>>>>>>> I also see the reply from David and share his concerns.
>>>>>>>
>>>>>>> It is not clear if this failure is a regression.
>>>>>>> Did we observe this issue before?
>>>>>>> If not, then when and why did this failure start to appear?
>>>>>>>
>>>>>>> Unfortunately, it is impossible to look at the test run history
>>>>>>> at the moment.
>>>>>>> Aurora is down for maintenance.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>> On 10/13/14 3:58 PM, Yumin Qi wrote:
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>>>>>>>>
>>>>>>>> the bug marked as confidential so post the webrev internally.
>>>>>>>>
>>>>>>>> Problem: The test case tries to load a class from the same jar
>>>>>>>> via the agent while another class from that jar is being loaded
>>>>>>>> via the same class loader in the same thread. The call happens in
>>>>>>>> transform, which is a rare case: loading one class in the middle
>>>>>>>> of loading another. The result is a ClassCircularityError. While
>>>>>>>> the first class is loading, the VM puts JarLoader$2 in the
>>>>>>>> placeholder table and then starts defineClass, which calls
>>>>>>>> transform. That begins loading the second class, which follows the
>>>>>>>> same routine, tries to load JarLoader$2 first, and finds it already
>>>>>>>> in the placeholder table, so a ClassCircularityError is thrown.
>>>>>>>> Fix: The test case should not load a class with the same class
>>>>>>>> loader, in the same thread, from the same jar in the 'transform'
>>>>>>>> method. I modified it to load with the system class loader, and we
>>>>>>>> expect to see a ClassNotFoundException. For details, see the bug
>>>>>>>> comments.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>

From david.holmes at oracle.com  Mon Nov  3 03:49:45 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 13:49:45 +1000
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>
	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>
Message-ID: <5456FB59.60905@oracle.com>

On 3/11/2014 10:05 AM, David Chase wrote:
>
> On 2014-10-31, at 5:45 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>> The volatile load prevents subsequent loads and stores from reordering with it, but that doesn't stop C from moving before the B store.  So breaking B into the load (call it BL) and store (BS) you can still get this ordering: A, BL, C, BS
>
> I think this should do the trick.
>
>                 element_data[oldCapacity] = element_data[oldCapacity - 1];
>                  // all array elements are non-null and sorted, increase size.
>                  // if store to element_data above floats below
>                  // store to size on the next line, that will be
>                  // inconsistent to the VM if a safepoint occurs here.
>                  size += 1;
>                  // Load of volatile size prevents movement of element_data store
>                  for (int i = size - 1; i > index; i--) {
>
> The change is to load the volatile size for the loop bound; this stops the stores
> in the loop from moving earlier, right?

Treating volatile accesses like memory barriers is playing a bit 
fast-and-loose with the spec+implementation. The basic happens-before 
relationship for volatiles states that if a volatile read sees a value 
X, then the volatile write that wrote X happened-before the read [1]. 
But in this code there are no checks of the values of the volatile 
fields. Instead you are relying on a volatile read "acting like 
acquire()" and a volatile write "acting like release()".
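
A minimal sketch of the value-checked pattern that the happens-before
rule does cover. The class and field names are hypothetical and
unrelated to the MemberNameTable change: the reader touches the plain
field only after it has observed the specific value stored by the
volatile write.

```java
final class VolatilePublicationSketch {
    private int payload;             // plain (non-volatile) data
    private volatile boolean ready;  // volatile flag that publishes payload

    void publish(int value) {
        payload = value;  // ordinary write
        ready = true;     // volatile write
    }

    int awaitAndRead() {
        // Volatile read: loop until it observes the value written by publish().
        while (!ready) {
            Thread.yield();
        }
        // The write of true happened-before the read that saw it, so the
        // earlier write to payload is visible here.
        return payload;
    }

    /** Publishes 42 from a second thread and returns what the reader saw. */
    static int demo() {
        VolatilePublicationSketch box = new VolatilePublicationSketch();
        Thread writer = new Thread(() -> box.publish(42), "writer");
        writer.start();
        int seen = box.awaitAndRead();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + demo());
    }
}
```

The size/element_data code quoted above has no such check of the value
read, which is why the argument has to lean on acquire/release-like
behaviour instead.
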

That said you are trying to "synchronize" the hotspot code with the JDK 
code so you have stepped outside the JMM in any case and reasoning about 
what is and is not allowed is somewhat moot - unless the hotspot code 
always uses Java-style accesses to the Java-level variables.

BTW the Java side of this needs to be reviewed on 
core-libs-dev at openjdk.java.net

David H.

[1] http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4


> David
>

From david.holmes at oracle.com  Mon Nov  3 04:44:59 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 03 Nov 2014 14:44:59 +1000
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <5452C0B4.4070601@oracle.com>
References: <5452C0B4.4070601@oracle.com>
Message-ID: <5457084B.6070808@oracle.com>

Hi Dan,

Looks good. Couple of nits and one semantic query below ...

src/cpu/sparc/vm/macroAssembler_sparc.cpp

Formatting changes were a bit of a distraction.

---

src/cpu/x86/vm/macroAssembler_x86.cpp

Formatting changes were a bit of a distraction.

1929     // unconditionally set stackBox->_displaced_header = 3
1930     movptr(Address(boxReg, 0), (int32_t)intptr_t(markOopDesc::unused_mark()));

At 1870 we refer to box rather than stackBox. Also it takes some 
sleuthing to realize that "3" here is somehow a pseudonym for 
unused_mark(). Back up at 1808 we have a to-do:

1808     //   use markOop::unused_mark() instead of "3".

so the current change seems to be implementing that, even though other 
uses of "3" are left untouched.

---

src/share/vm/runtime/sharedRuntime.cpp

1794 JRT_BLOCK_ENTRY(void, SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* lock, JavaThread* thread))
1795   if (!SafepointSynchronize::is_synchronizing()) {
1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return;

Is it necessary to check is_synchronizing? If we are executing this code 
we are not at a safepoint and the quick_enter won't change that, so I'm 
not sure what we are guarding against.

---

src/share/vm/runtime/synchronizer.cpp

Minor nit: on line 153, the usual acronym is NPE (for 
NullPointerException), not NPX.

Nit:  159     Thread * const ox

Please change ox to owner.

---

Thanks,
David


On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have the Contended Locking fast enter bucket ready for review.
>
> The code changes in this bucket are primarily a quick_enter()
> function that works on inflated but uncontended Java monitors.
> This quick_enter() function is used on the "slow path" for Java
> Monitor enter operations when the built-in "fast path" (read
> assembly code) doesn't work.
>
> This work is being tracked by the following bug ID:
>
>      JDK-8061553 Contended Locking fast enter bucket
>      https://bugs.openjdk.java.net/browse/JDK-8061553
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>
> Here is the JEP link:
>
>      https://bugs.openjdk.java.net/browse/JDK-8046133
>
> 8061553 summary of changes:
>
> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>
> - clean up spacing around some
>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
> - remove optional (EmitSync & 64) code
> - change from cmp() to andcc() so icc.zf flag is set
>
> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>
> - remove optional (EmitSync & 2) code
> - rewrite LP64 inflated lock code that tries to CAS in
>    the new owner value to be more efficient
>
> interfaceSupport.hpp:
>
> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>    JRT_BLOCK_ENTRY into two pieces.
>
> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>
> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>    to permit ObjectSynchronizer::quick_enter() call
> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>
> synchronizer.[ch]pp:
>
> - add ObjectSynchronizer::quick_enter() for entering an
>    inflated but unowned Java monitor without thread state
>    changes
>
> Testing:
>
> - Aurora Adhoc RT/SVC baseline batch
> - JPRT test jobs
> - MonitorEnterStresser micro-benchmark (in process)
> - CallTimerGrid stress testing (in process)
> - Aurora performance testing:
>    - out of the box for the "promotion" and 32-bit server configs
>    - heavy weight monitors for the "promotion" and 32-bit server configs
>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>      (in process)
>
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan

From yumin.qi at oracle.com  Mon Nov  3 05:52:39 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Sun, 02 Nov 2014 21:52:39 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <5453DB4F.70709@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com>
Message-ID: <54571827.5050807@oracle.com>

Misha,

   It is a generic name; for now it only targets FileMapHeader, but other
VM data structures can be added in future if needed. Maybe a name like
getOffsetForName(String name) is better?

Thanks
Yumin

On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
> Hi Yumin,
>
>  The name getOffsets() seems too generic. Perhaps, we could rename it 
> to be more specific to the task.
>
> Thank you,
> Misha
>
> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>> Please review the new changeset at same location.
>> The new API supplies an interface to get a data member's offset by its name.
>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>
>> Thanks
>> Yumin
>>
>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>> Please review
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>
>>> Summary: An internal test failed because the variable offsets changed in
>>> hotspot. The way the test gets the offsets is hard-coded. To reduce
>>> the risk from future changes to hotspot offsets, the fix adds a
>>> WhiteBox API function to get a map for FileMapHeaderInfo, which
>>> returns the members' offsets in a Hashtable.
>>>
>>> Tests: JPRT, jtreg.
>>>
>>> Thanks
>>> Yumin
>>
>


From david.r.chase at oracle.com  Mon Nov  3 12:49:55 2014
From: david.r.chase at oracle.com (David Chase)
Date: Mon, 3 Nov 2014 07:49:55 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5456FB59.60905@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>
	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>
	<5456FB59.60905@oracle.com>
Message-ID: <632A5C98-B386-4625-BE12-355241581955@oracle.com>


On 2014-11-02, at 10:49 PM, David Holmes <david.holmes at oracle.com> wrote:
>> The change is to load the volatile size for the loop bound; this stops the stores
>> in the loop from moving earlier, right?
> 
> Treating volatile accesses like memory barriers is playing a bit fast-and-loose with the spec+implementation. The basic happens-before relationship for volatiles states that if a volatile read sees a value X, then the volatile write that wrote X happened-before the read [1]. But in this code there are no checks of the values of the volatile fields. Instead you are relying on a volatile read "acting like acquire()" and a volatile write "acting like release()".
> 
> That said you are trying to "synchronize" the hotspot code with the JDK code so you have stepped outside the JMM in any case and reasoning about what is and is not allowed is somewhat moot - unless the hotspot code always uses Java-style accesses to the Java-level variables.

My main concern is that the compiler is inhibited from any peculiar code motion; I assume that taking a safe point has a bit of barrier built into it anyway, especially given that the worry case is safepoint + JVMTI.

Given the worry, what's the best way to spell "barrier" here?
I could synchronize on classData (it would be a recursive lock in the current version of the code)
  synchronized (this) { size++; }
or I could synchronize on elementData (no longer used for a lock elsewhere, so always uncontended)
  synchronized (elementData) { size++; }
or is there some Unsafe thing that would be better?

(core-libs-dev - there will be another webrev coming.  This is a runtime+jdk patch.)

David

> BTW the Java side of this needs to be reviewed on core-libs-dev at openjdk.java.net
> 
> David H.
> 
> [1] http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4
> 
> 
>> David


From peter.levart at gmail.com  Mon Nov  3 16:16:53 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Mon, 03 Nov 2014 17:16:53 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <632A5C98-B386-4625-BE12-355241581955@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
Message-ID: <5457AA75.8090103@gmail.com>

On 11/03/2014 01:49 PM, David Chase wrote:
> On 2014-11-02, at 10:49 PM, David Holmes <david.holmes at oracle.com> wrote:
>>> The change is to load the volatile size for the loop bound; this stops the stores
>>> in the loop from moving earlier, right?
>> Treating volatile accesses like memory barriers is playing a bit fast-and-loose with the spec+implementation. The basic happens-before relationship for volatiles states that if a volatile read sees a value X, then the volatile write that wrote X happened-before the read [1]. But in this code there are no checks of the values of the volatile fields. Instead you are relying on a volatile read "acting like acquire()" and a volatile write "acting like release()".
>>
>> That said you are trying to "synchronize" the hotspot code with the JDK code so you have stepped outside the JMM in any case and reasoning about what is and is not allowed is somewhat moot - unless the hotspot code always uses Java-style accesses to the Java-level variables.
> My main concern is that the compiler is inhibited from any peculiar code motion; I assume that taking a safe point has a bit of barrier built into it anyway, especially given that the worry case is safepoint + JVMTI.
>
> Given the worry, what's the best way to spell "barrier" here?
> I could synchronize on classData (it would be a recursive lock in the current version of the code)
>    synchronized (this) { size++; }
> or I could synchronize on elementData (no longer used for a lock elsewhere, so always uncontended)
>    synchronized (elementData) { size++; }
> or is there some Unsafe thing that would be better?
>
> (core-libs-dev - there will be another webrev coming.  This is a runtime+jdk patch.)
>
> David

Hi David,

You're worried that the writes moving array elements up by one slot could
bubble up before the write of size = size+1, right? If that happens, the VM
could skip an existing (last) element and not update it.

It seems that Unsafe.storeFence() between size++ and the moving of elements
would do, as the javadoc for it says:

     /**
      * Ensures lack of reordering of stores before the fence
      * with loads or stores after the fence.
      * @since 1.8
      */
     public native void storeFence();


Regards, Peter


>
>> BTW the Java side of this needs to be reviewed on core-libs-dev at openjdk.java.net
>>
>> David H.
>>
>> [1] http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4
>>
>>
>>> David


From david.r.chase at oracle.com  Mon Nov  3 16:36:59 2014
From: david.r.chase at oracle.com (David Chase)
Date: Mon, 3 Nov 2014 11:36:59 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5457AA75.8090103@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
Message-ID: <F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>


>> My main concern is that the compiler is inhibited from any peculiar code motion; I assume that taking a safe point has a bit of barrier built into it anyway, especially given that the worry case is safepoint + JVMTI.
>> 
>> Given the worry, what's the best way to spell "barrier" here?
>> I could synchronize on classData (it would be a recursive lock in the current version of the code)
>>   synchronized (this) { size++; }
>> or I could synchronize on elementData (no longer used for a lock elsewhere, so always uncontended)
>>   synchronized (elementData) { size++; }
>> or is there some Unsafe thing that would be better?
> 

> You're worried that writes moving array elements up for one slot would bubble up before write of size = size+1, right? If that happens, VM could skip an existing (last) element and not update it.

exactly, with the restriction that it would be compiler-induced bubbling, not architectural.
Which is both better, and worse - I don't have to worry about crazy hardware, but the rules
of java/jvm "memory model" are not as thoroughly defined as those for java itself.

I added a method to Atomic (.storeFence() ).  New webrev to come after I rebuild and retest.

Thanks much,

David

> It seems that Unsafe.storeFence() between size++ and moving of elements could do, as the javadoc for it says:
> 
>    /**
>     * Ensures lack of reordering of stores before the fence
>     * with loads or stores after the fence.
>     * @since 1.8
>     */
>    public native void storeFence();


From peter.levart at gmail.com  Mon Nov  3 16:42:30 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Mon, 03 Nov 2014 17:42:30 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5457AA75.8090103@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
Message-ID: <5457B076.10205@gmail.com>

On 11/03/2014 05:16 PM, Peter Levart wrote:
> On 11/03/2014 01:49 PM, David Chase wrote:
>> On 2014-11-02, at 10:49 PM, David Holmes <david.holmes at oracle.com> 
>> wrote:
>>>> The change is to load the volatile size for the loop bound; this 
>>>> stops the stores
>>>> in the loop from moving earlier, right?
>>> Treating volatile accesses like memory barriers is playing a bit 
>>> fast-and-loose with the spec+implementation. The basic 
>>> happens-before relationship for volatiles states that if a volatile 
>>> read sees a value X, then the volatile write that wrote X 
>>> happened-before the read [1]. But in this code there are no checks 
>>> of the values of the volatile fields. Instead you are relying on a 
>>> volatile read "acting like acquire()" and a volatile write "acting 
>>> like release()".
>>>
>>> That said you are trying to "synchronize" the hotspot code with the 
>>> JDK code so you have stepped outside the JMM in any case and 
>>> reasoning about what is and is not allowed is somewhat moot - unless 
>>> the hotspot code always uses Java-style accesses to the Java-level 
>>> variables.
>> My main concern is that the compiler is inhibited from any peculiar 
>> code motion; I assume that taking a safe point has a bit of barrier 
>> built into it anyway, especially given that the worry case is 
>> safepoint + JVMTI.
>>
>> Given the worry, what's the best way to spell "barrier" here?
>> I could synchronize on classData (it would be a recursive lock in the 
>> current version of the code)
>>    synchronized (this) { size++; }
>> or I could synchronize on elementData (no longer used for a lock 
>> elsewhere, so always uncontended)
>>    synchronized (elementData) { size++; }
>> or is there some Unsafe thing that would be better?
>>
>> (core-libs-dev - there will be another webrev coming.  This is a
>> runtime+jdk patch.)
>>
>> David
>
> Hi David,
>
> You're worried that writes moving array elements up for one slot would 
> bubble up before write of size = size+1, right? If that happens, VM 
> could skip an existing (last) element and not update it.
>
> It seems that Unsafe.storeFence() between size++ and moving of 
> elements could do, as the javadoc for it says:
>
>     /**
>      * Ensures lack of reordering of stores before the fence
>      * with loads or stores after the fence.
>      * @since 1.8
>      */
>     public native void storeFence();

You might need a storeFence() between each two writes into the array 
too. Your moving loop is the following:

2544                 for (int i = oldCapacity; i > index; i--) {
2545                     // pre: element_data[i] is duplicated at [i+1]
2546                     element_data[i] = element_data[i - 1];
2547                     // post: element_data[i-1] is duplicated at [i]
2548                 }


If we start unrolling, it becomes:

w1: element_data[old_capacity - 0] = element_data[old_capacity - 1];
w2: element_data[old_capacity - 1] = element_data[old_capacity - 2];
w3: element_data[old_capacity - 2] = element_data[old_capacity - 3];
...

Can the compiler reorder w2 and w3 (just writes - not the whole statements)?
Say that it reads a chunk of elements into the registers and then writes 
them out, but in different order, and a check for safepoint comes inside 
this chunk of writes... This is hypothetical, but it could do it without 
breaking the local semantics...
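
To make the two fence placements discussed in this thread concrete, here is a
minimal sketch. It assumes a JDK 9 or later runtime and uses
VarHandle.storeStoreFence() as a stand-in for the Unsafe.storeFence() call
mentioned above. The class FencedShiftSketch, its fields and its fixed
capacity are hypothetical and are not taken from the webrev. Whether the
per-element fences are really needed against a VM reader at a safepoint is
exactly the open question here, so treat this as an illustration rather than
a recommendation.

```java
import java.lang.invoke.VarHandle;

// Hypothetical illustration only; this is not the code under review.
final class FencedShiftSketch {
    private final Object[] elementData = new Object[16];
    private int size;

    /** Inserts value at index, shifting later elements up by one slot. */
    void insertAt(int index, Object value) {
        if (index < 0 || index > size || size == elementData.length) {
            throw new IndexOutOfBoundsException("index=" + index + ", size=" + size);
        }
        int oldSize = size;
        // Publish the larger size first. A reader scanning [0, size) may then
        // see a null or a duplicated element, but it cannot stop short of an
        // element that already existed.
        size = oldSize + 1;
        VarHandle.storeStoreFence(); // keep the size store ahead of the element stores
        for (int i = oldSize; i > index; i--) {
            elementData[i] = elementData[i - 1];
            VarHandle.storeStoreFence(); // keep successive element stores in program order
        }
        elementData[index] = value;
    }

    Object get(int i) { return elementData[i]; }

    int size() { return size; }
}
```

The fence inside the loop is the "between each two writes" variant; dropping
it leaves only the fence between the size update and the first element move.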

Peter

>
>
> Regards, Peter
>
>
>
>>
>>> BTW the Java side of this needs to be reviewed on 
>>> core-libs-dev at openjdk.java.net
>>>
>>> David H.
>>>
>>> [1] 
>>> http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4
>>>
>>>
>>>> David
>


From christian.tornqvist at oracle.com  Mon Nov  3 16:58:12 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Mon, 3 Nov 2014 11:58:12 -0500
Subject: RFR(L): 8056049: getProcessCpuLoad() stops working in one
	process	when a different process exits
In-Reply-To: <88c4f758-f8b7-45d2-96e6-d940bb40fcb3@default>
References: <88c4f758-f8b7-45d2-96e6-d940bb40fcb3@default>
Message-ID: <002f01cff787$5b106340$113129c0$@oracle.com>

Hi Markus,

Thanks for the detailed walkthrough, this looks really good. 

Thanks,
Christian

-----Original Message-----
From: hotspot-runtime-dev
[mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Markus
Grönlund
Sent: Wednesday, October 29, 2014 5:23 AM
To: hotspot-runtime-dev at openjdk.java.net
Subject: FW: RFR(L): 8056049: getProcessCpuLoad() stops working in one
process when a different process exits

Hi,

Trying my luck with Runtime as well - any Windows people that might be able
to do a review?

Bug: https://bugs.openjdk.java.net/browse/JDK-8056049

Webrev: http://cr.openjdk.java.net/~mgronlun/8056049/webrev01/


Thanks in advance
Markus


-----Original Message-----
From: Markus Grönlund
Sent: den 24 oktober 2014 12:04
To: core-libs-dev Libs
Subject: FW: RFR(L): 8056049: getProcessCpuLoad() stops working in one
process when a different process exits

Also sending this to core-libs.

Thanks in advance

Markus

From: Markus Grönlund
Sent: den 22 oktober 2014 11:44
To: serviceability-dev at openjdk.java.net; jmx-dev at openjdk.java.net
Subject: RFR(L): 8056049: getProcessCpuLoad() stops working in one process
when a different process exits

Greetings,

Kindly asking for reviews for the following changeset.

Bug: https://bugs.openjdk.java.net/browse/JDK-8056049

Webrev: http://cr.openjdk.java.net/~mgronlun/8056049/webrev01/

Description:

The issue is Windows specific. The problem relates to using the
Performance Data Helper API (PDH), more specifically how to use the
"Process" PDH object in PDH queries:

// code comment extract

/*
* Working against the Process object and its related counters is inherently
* problematic when using the PDH API:
*
* For PDH, a process is not primarily identified by its process id,
* but with a sequential number, for example \Process(java#0),
* \Process(java#1), ....
* The really bad part is that this list is reset as soon as one process
* exits:
* If \Process(java#1) exits, \Process(java#3) now becomes \Process(java#2)
* etc.
*
* The PDH query api requires a process identifier to be submitted when
* registering a query, but as soon as the list resets, the query is
* invalidated (since the name changed).
*
* Solution:
* The #number identifier for a Process query can only decrease after process
* creation.
*
* Therefore we create an array of counter queries for all process object
* instances up to and including ourselves:
*
* Ex. we come in as third process instance (java#2), we then create and
* register queries for the following Process object instances:
* java#0, java#1, java#2
*
* currentQueryIndexForProcess() keeps track of the current "correct" query
* (in order to keep this index valid when the list resets from underneath,
* ensure to call getCurrentQueryIndexForProcess() before every query
* involving Process object instance data).
*/
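
The indexing scheme in the comment above can be sketched as a small
simulation. Nothing here calls the real PDH API: PdhIndexSketch, the list of
pids standing in for PDH's instance numbering, and the query strings are all
hypothetical. The sketch only shows why pre-registering one query per index
up to our own, and re-resolving the index before each use, survives another
process exiting.

```java
import java.util.List;

// Hypothetical simulation; no real PDH calls are made.
final class PdhIndexSketch {
    // Position i in this list plays the role of \Process(java#i).
    private final List<Long> instancePids;
    private final long myPid;
    // One pre-registered query path per index we could ever occupy.
    private final String[] queries;

    PdhIndexSketch(List<Long> instancePids, long myPid) {
        this.instancePids = instancePids;
        this.myPid = myPid;
        int initialIndex = instancePids.indexOf(myPid);
        if (initialIndex < 0) {
            throw new IllegalArgumentException("pid not in instance list: " + myPid);
        }
        // The index can only decrease after creation, so 0..initialIndex suffices.
        queries = new String[initialIndex + 1];
        for (int i = 0; i <= initialIndex; i++) {
            queries[i] = "\\Process(java#" + i + ")";
        }
    }

    /** Re-resolves our position; call this before every Process query. */
    int currentQueryIndexForProcess() {
        int index = instancePids.indexOf(myPid);
        if (index < 0 || index >= queries.length) {
            throw new IllegalStateException("unexpected instance index: " + index);
        }
        return index;
    }

    /** Returns the query path that currently refers to this process. */
    String currentQuery() {
        return queries[currentQueryIndexForProcess()];
    }
}
```

With three instances and our process last, currentQuery() starts out as the
java#2 path and moves to the java#1 path once an earlier instance is removed
from the list.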

I have already fixed this in the VM as of
https://bugs.openjdk.java.net/browse/JDK-8019921

In the process of fixing this issue now in the JDK, I realized that the
previous implementation of using PDH in the JDK was a bit convoluted,
especially if you would like to reuse functionality or add new counters.

Therefore this change also includes an overall rewrite of how the JDK
will interface with the PDH library, a rewrite which (hopefully) improves
both readability and extensibility.

I can do a live code walkthrough if anyone is interested in the exact
details of this change.

Testing completed: Testset SVC (includes jdk_instrument, jdk_management,
jdk_jmx, jdk_jdi)

Thanks in advance

Markus


From david.r.chase at oracle.com  Mon Nov  3 17:28:00 2014
From: david.r.chase at oracle.com (David Chase)
Date: Mon, 3 Nov 2014 12:28:00 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5457B076.10205@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com> <5457B076.10205@gmail.com>
Message-ID: <73CF1882-22F1-4D9A-B37A-EAC9BCF675B0@oracle.com>


On 2014-11-03, at 11:42 AM, Peter Levart <peter.levart at gmail.com> wrote:
>> You're worried that writes moving array elements up for one slot would bubble up before write of size = size+1, right? If that happens, VM could skip an existing (last) element and not update it.
>> 
>> It seems that Unsafe.storeFence() between size++ and moving of elements could do, as the javadoc for it says:
>> 
>>    /**
>>     * Ensures lack of reordering of stores before the fence
>>     * with loads or stores after the fence.
>>     * @since 1.8
>>     */
>>    public native void storeFence();


> You might need a storeFence() between each two writes into the array too. Your moving loop is the following:
> 
> 2544                 for (int i = oldCapacity; i > index; i--) {
> 2545                     // pre: element_data[i] is duplicated at [i+1]
> 2546                     element_data[i] = element_data[i - 1];
> 2547                     // post: element_data[i-1] is duplicated at [i]
> 2548                 }
> 
> 
> If we start unrolling, it becomes:
> 
> w1: element_data[old_capacity - 0] = element_data[old_capacity - 1];
> w2: element_data[old_capacity - 1] = element_data[old_capacity - 2];
> w3: element_data[old_capacity - 2] = element_data[old_capacity - 3];
> ...
> 
> Can compiler reorder w2 and w3 (just writes - not the whole statements)? Say that it reads a chunk of elements into the registers and then writes them out, but in different order, and a check for safepoint comes inside this chunk of writes... This is hypothetical, but it could do it without breaking the local semantics...

I think you are right, certainly in theory, and if I don't hear someone else declaring that in practice we're both just being paranoid, I'll do that too.  Seems like it might eventually slow things down to do all those fences.

David


From mikhailo.seledtsov at oracle.com  Mon Nov  3 20:05:55 2014
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Mon, 03 Nov 2014 12:05:55 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <54571827.5050807@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com> <54571827.5050807@oracle.com>
Message-ID: <5457E023.20801@oracle.com>

Hi Yumin,

  If this API is intended to get offsets for various data structures, I
would expect a data structure type identifier to be passed as a
parameter. For instance,

public native int getOffset(String dataStructureId, String fieldName)
     where
         dataStructureId would be some kind of ID for data structure, 
either data structure name or internal alias
         fieldName - the name of the field for which the offset value is 
returned
       A specific error code could be returned for unsupported 
dataStructureId

Alternatively, this could be an API specific to a given data-structure. 
E.g. getMySpecifiedDatastructOffset(String fieldName)
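
As a rough illustration of the two-parameter shape suggested above, here is a
pure-Java sketch. The real API would be a native WhiteBox method backed by
the VM; the class OffsetLookupSketch, the error codes, and the field names
and offsets ("fieldA", "fieldB", 0, 8) are made-up placeholders, not actual
FileMapHeader members.

```java
import java.util.Map;

// Hypothetical stand-in for a native WhiteBox lookup; all values are placeholders.
final class OffsetLookupSketch {
    static final int UNSUPPORTED_STRUCTURE = -1; // unknown dataStructureId
    static final int UNKNOWN_FIELD = -2;         // known structure, unknown field

    // Placeholder table; a real implementation would ask the VM for these.
    private static final Map<String, Map<String, Integer>> OFFSETS =
        Map.of("FileMapHeader", Map.of("fieldA", 0, "fieldB", 8));

    /** Returns the offset of fieldName in dataStructureId, or an error code. */
    static int getOffset(String dataStructureId, String fieldName) {
        Map<String, Integer> fields = OFFSETS.get(dataStructureId);
        if (fields == null) {
            return UNSUPPORTED_STRUCTURE;
        }
        Integer offset = fields.get(fieldName);
        return offset == null ? UNKNOWN_FIELD : offset;
    }
}
```

Returning distinct negative codes keeps "structure not supported" apart from
"field not found", which matches the request for a specific error code for
an unsupported dataStructureId.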

Thank you,
Misha


On 11/2/2014 9:52 PM, Yumin Qi wrote:
> Misha,
>
>   It is a generic name; for now it only targets FileMapHeader, but other
> VM data structures can be added in future if needed. Maybe a name like
> getOffsetForName(String name) is better?
>
> Thanks
> Yumin
>
> On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
>> Hi Yumin,
>>
>>  The name getOffsets() seems too generic. Perhaps, we could rename it 
>> to be more specific to the task.
>>
>> Thank you,
>> Misha
>>
>> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>>> Please review the new changeset at same location.
>>> The new API supplies an interface to get a data member's offset by its name.
>>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>>> Please review
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>
>>>> Summary: An internal test failed because the variable offsets changed in
>>>> hotspot. The way the test gets the offsets is hard-coded. To reduce
>>>> the risk from future changes to hotspot offsets, the fix adds a
>>>> WhiteBox API function to get a map for FileMapHeaderInfo, which
>>>> returns the members' offsets in a Hashtable.
>>>>
>>>> Tests: JPRT, jtreg.
>>>>
>>>> Thanks
>>>> Yumin
>>>
>>
>


From peter.levart at gmail.com  Mon Nov  3 20:09:29 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Mon, 03 Nov 2014 21:09:29 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
Message-ID: <5457E0F9.8090004@gmail.com>

Hi David,

I was thinking about the fact that java.lang.invoke code is leaking into
java.lang.Class. Perhaps, if you don't mind rewriting the code, a better
structure would be for the j.l.Class changes to consist only of adding a
simple field:

+ // A reference to canonicalizing cache of java.lang.invoke.MemberName(s)
+ // for members declared by class represented by this Class object
+ private transient volatile Object memberNameData;

...and nothing else. All the logic could live in MemberName itself 
(together with Unsafe machinery for accessing/cas-ing Class.memberNameData).

Now to an idea about implementation. Since VM code is not doing any 
binary-search and only linearly scans the array when it has to update 
MemberNames, the code could be changed to scan a linked-list of 
MemberName(s) instead. You could add a field to MemberName:

class MemberName {
...
     // next MemberName in chain of interned MemberNames for particular
     // declaring class
     private MemberName next;


Have a volatile field in MemberNameData (or ClassData - whatever you 
call it):

class MemberNameData {
...
     // a chain of interned MemberName(s) for particular declaring class
     // accessed by VM when it has to modify them in-place
     private volatile MemberName memberNames;

     MemberName add(Class<?> klass, int index, MemberName mn,
                    int redefined_count) {
         mn.next = memberNames;
         memberNames = mn;
         if (jla.getClassRedefinedCount(klass) == redefined_count) {
             // no changes to class
             ...
             ... code to update array of sorted MemberName(s) with new 'mn'
             ...
             return mn;
         }
         // lost race, undo insertion
         memberNames = mn.next;
         return null;
     }


This way all the worries about ordering of writes into array and/or size 
are gone. The array is still used to quickly search for an element, but 
VM only scans the linked-list.
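
A self-contained version of the insert-then-check pattern above might look
like the following sketch. InternChainSketch, its Node type and the
IntSupplier standing in for jla.getClassRedefinedCount(klass) are
hypothetical, and the sorted array is left out. The sketch also assumes that
calls to add() are serialized (here via synchronized), which the fragment
above does not state.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration of the linked-list idea; not the proposed patch.
final class InternChainSketch {
    static final class Node {
        final String name;
        Node next; // next interned entry for the same declaring class
        Node(String name) { this.name = name; }
    }

    // Head of the chain a VM-side reader would scan linearly.
    private volatile Node head;
    // Stand-in for jla.getClassRedefinedCount(klass).
    private final IntSupplier redefinedCount;

    InternChainSketch(IntSupplier redefinedCount) {
        this.redefinedCount = redefinedCount;
    }

    /** Links mn at the head; undoes the link if the class was redefined meanwhile. */
    synchronized Node add(Node mn, int expectedCount) {
        mn.next = head;
        head = mn; // make mn reachable before checking for redefinition
        if (redefinedCount.getAsInt() == expectedCount) {
            return mn; // no changes to the class: insertion stands
        }
        head = mn.next; // lost the race: undo the insertion
        mn.next = null;
        return null;
    }

    int chainLength() {
        int n = 0;
        for (Node p = head; p != null; p = p.next) {
            n++;
        }
        return n;
    }
}
```

Because the node is linked before the redefinition count is checked, a
redefinition that happens in between still sees the new entry on the chain;
the caller is expected to retry when add() returns null.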

What do you think of this?

Regards, Peter


On 11/03/2014 05:36 PM, David Chase wrote:
>>> My main concern is that the compiler is inhibited from any peculiar code motion; I assume that taking a safe point has a bit of barrier built into it anyway, especially given that the worry case is safepoint + JVMTI.
>>>
>>> Given the worry, what's the best way to spell "barrier" here?
>>> I could synchronize on classData (it would be a recursive lock in the current version of the code)
>>>    synchronized (this) { size++; }
>>> or I could synchronize on elementData (no longer used for a lock elsewhere, so always uncontended)
>>>    synchronized (elementData) { size++; }
>>> or is there some Unsafe thing that would be better?
>> You're worried that writes moving array elements up for one slot would bubble up before write of size = size+1, right? If that happens, VM could skip an existing (last) element and not update it.
> exactly, with the restriction that it would be compiler-induced bubbling, not architectural.
> Which is both better, and worse - I don't have to worry about crazy hardware, but the rules
> of java/jvm "memory model" are not as thoroughly defined as those for java itself.
>
> I added a method to Atomic (.storeFence() ).  New webrev to come after I rebuild and retest.
>
> Thanks much,
>
> David
>
>> It seems that Unsafe.storeFence() between size++ and moving of elements could do, as the javadoc for it says:
>>
>>     /**
>>      * Ensures lack of reordering of stores before the fence
>>      * with loads or stores after the fence.
>>      * @since 1.8
>>      */
>>     public native void storeFence();


From coleen.phillimore at oracle.com  Mon Nov  3 20:19:54 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 03 Nov 2014 15:19:54 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
Message-ID: <5457E36A.3020800@oracle.com>


Hi Jeremy,

I reviewed your new code and it looks fine.  I had one comment in

http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html

The name "need_to_resolve" doesn't make sense when reading this code.
Isn't it more like "need_to_ensure_space"?  The current name makes me think
of method resolution, which this code doesn't do.

I was trying to find a way to make this new code not appear twice (maybe 
with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is 
m->method_holder()).

Also, I was trying to figure out if the new class in utilities called
chunkedList.hpp could be used to store jmethodIDs, since the data
structures are similar.  There are still more things JNIMethodBlock
has to do, so I think a specialized structure is still needed (which is
why I originally wrote it to be very simple).  I'm not sure if the
comment above it still applies.  Maybe only the first and third
sentences.  Can you rewrite the comment slightly?

Your other comments in the changes are good.

I can't completely answer your question about reusing free_methods - but 
if a jmethodID is created provisionally in InstanceKlass::get_jmethod_id 
and not needed because it loses the race in the method id cache, it's 
never handed back to native code, so it's safe to reuse.  This is 
different than jmethodIDs for methods that are unloaded.  They are 
cleared and never reused.  At least that's my reading of this caching 
code but it's pretty complicated stuff.
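
To make that reuse rule concrete, here is a minimal Java sketch of the
idea as I read it. It is not the HotSpot code: MethodIdCache,
provisionalId, publish and unload are made-up names, and plain ints
stand in for jmethodIDs.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the reuse rule; not the JNIMethodBlock implementation.
final class MethodIdCache {
    private final Map<String, Integer> published = new HashMap<>();
    private final ArrayDeque<Integer> freeIds = new ArrayDeque<>();
    private int nextId = 1;

    // Create an ID before we know whether it will actually be needed.
    synchronized int provisionalId() {
        Integer recycled = freeIds.poll();
        return recycled != null ? recycled : nextId++;
    }

    // Publish the provisional ID unless another caller already won the race.
    synchronized int publish(String method, int provisional) {
        Integer winner = published.putIfAbsent(method, provisional);
        if (winner == null) {
            return provisional;
        }
        freeIds.push(provisional); // never handed out, so it is safe to reuse
        return winner;
    }

    // An unloaded method's ID was visible to callers: clear it, never recycle it.
    synchronized void unload(String method) {
        published.remove(method);
    }
}
```

The only IDs that reach the free list are ones that lost the race in
publish(); an ID that was ever returned as the winner is dropped on
unload and its number is not issued again.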

I've also run our nsk and jck vm/jvmti on this change and they all 
passed.  I'd be happy to sponsor it with these suggested changes and it 
needs another reviewer.

Thanks for diagnosing and fixing this problem!
Coleen


On 10/30/2014 01:02 PM, Jeremy Manson wrote:
> There's a significant regression in the speed of JVMTI GetClassMethods in
> JDK8. I've tracked this down to allocation of jmethodids in a tight loop.
> The issue can be addressed by preallocating enough space for all of the
> jmethodids when starting the operation and not iterating over all of the
> existing jmethodids when you allocate a new one.
>
> A patch is here:
>
> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>
> A reproducible test case can be found here:
>
> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>
> It's a benchmark, though: I have no idea how to turn it into a test.
>
> For whoever reviews it: can you explain to me why it is okay that this code
> reuses jmethodIDs (in JNIMethodBlock::add_method)?  I can imagine a lot of
> problems stemming from accidental reuse.
>
> Jeremy
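
The fix described in the quoted message can be pictured with a small
Java sketch. It is only an illustration under stated assumptions:
IdBlock, addScanning, ensureCapacity and addAtTop are hypothetical
names, slot indices stand in for jmethodIDs, and the real change is the
one in the webrev.

```java
import java.util.Arrays;

// Hypothetical ID block; slot indices stand in for jmethodIDs.
final class IdBlock {
    private Object[] slots = new Object[8];
    private int top = 0;       // index of the first never-used slot
    long slotsVisited = 0;     // counts the work done while placing entries

    // Slow pattern: walk every existing entry looking for a free slot.
    int addScanning(Object method) {
        for (int i = 0; i < top; i++) {
            slotsVisited++;
            if (slots[i] == null) {
                slots[i] = method;
                return i;
            }
        }
        return addAtTop(method);
    }

    // Reserve room for n more entries once, before the loop starts.
    void ensureCapacity(int n) {
        if (top + n > slots.length) {
            slots = Arrays.copyOf(slots, Math.max(top + n, slots.length * 2));
        }
    }

    // Fast pattern: append at the end without looking at existing entries.
    int addAtTop(Object method) {
        ensureCapacity(1);
        slotsVisited++;
        slots[top] = method;
        return top++;
    }
}
```

Adding n entries through addScanning touches on the order of n*n slots
in total, while ensureCapacity(n) followed by n calls to addAtTop
touches n.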


From david.r.chase at oracle.com  Mon Nov  3 20:41:30 2014
From: david.r.chase at oracle.com (David Chase)
Date: Mon, 3 Nov 2014 15:41:30 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5457E0F9.8090004@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
Message-ID: <F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>


On 2014-11-03, at 3:09 PM, Peter Levart <peter.levart at gmail.com> wrote:

> Hi David,
> 
> I was thinking about the fact that java.lang.invoke code is leaking into java.lang.Class. Perhaps, If you don't mind rewriting the code, a better code structure would be, if j.l.Class changes only consisted of adding a simple:
> ?

> This way all the worries about ordering of writes into array and/or size are gone. The array is still used to quickly search for an element, but VM only scans the linked-list.
> 
> What do you think of this?

I'm not sure.  I know Coleen Ph would like to see that happen.

A couple of people have vague plans to move more of the MemberName resolution into core libs.
(Years ago I worked on a VM where *all* of this occurred in Java, but some of it was ahead of time.)

I heard mention of "we want to put more stuff in there" but I got the impression that already happened
(there's reflection data, for example) so I'm not sure that makes sense.

There's also a proposal from people in the runtime to just use a jmethodid, take the hit of an extra indirection,
and have no need for this worrisome jvm/java concurrency.

And if we instead wrote a hash table that only grew, and never relocated elements, we could
(I think) allow non-synchronized O(1) probes of the table from the Java side, synchronized
O(1) insertions from the Java side, and because nothing moves, a smaller dance with the
VM.  I'm rather tempted to look into this - given the amount of work it would take to do the
benchmarking to see if (a) jmethodid would have acceptable performance or (b) the existing
costs are too high, I could instead just write fast code and be done.
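
A rough Java sketch of that kind of table, under simplifying
assumptions: the bucket count is fixed, so chains simply get longer as
the table grows (probes stay O(1) only while chains are short), and
GrowOnlyTable, find and intern are invented names, not anything in the
patch.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical grow-only intern table: nodes are never moved or removed.
final class GrowOnlyTable<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;   // immutable link, so readers never see a half-built chain
        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReferenceArray<Node<T>> buckets;

    GrowOnlyTable(int bucketCount) {
        buckets = new AtomicReferenceArray<>(bucketCount);
    }

    private int indexFor(Object o) {
        return (o.hashCode() & 0x7fffffff) % buckets.length();
    }

    // Unsynchronized probe: one volatile read of the bucket head, then a chain walk.
    T find(T probe) {
        for (Node<T> n = buckets.get(indexFor(probe)); n != null; n = n.next) {
            if (n.value.equals(probe)) {
                return n.value;
            }
        }
        return null;
    }

    // Synchronized insert: re-check under the lock, then publish a new head node.
    synchronized T intern(T candidate) {
        T existing = find(candidate);
        if (existing != null) {
            return existing;
        }
        int i = indexFor(candidate);
        buckets.set(i, new Node<>(candidate, buckets.get(i)));
        return candidate;
    }
}
```

Because a node's fields are final and the bucket head is written with
volatile semantics, a reader that sees the new head also sees a fully
built node, and since no node ever moves, a scanner walking the chains
needs no coordination with relocation.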

And another way to view this is that we're now quibbling about performance, when we still
have an existing correctness problem that this patch solves, so maybe we should just get this
done and then file an RFE.

David


From yumin.qi at oracle.com  Mon Nov  3 21:15:17 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Mon, 03 Nov 2014 13:15:17 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <5457E023.20801@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com> <54571827.5050807@oracle.com>
	<5457E023.20801@oracle.com>
Message-ID: <5457F065.7070300@oracle.com>

If you check the name passed to this function, it already tells you which
structure is meant (I just changed it to getOffsetForName and am testing):

wb.getOffsetForName("FileMapHeader::_crc")

There is no need to give type info (that would need building a list of type
info like in vmStructs). This is simple code for testing purposes, so I think
we should keep it simple. I added that if the offset name is not supported,
an exception is thrown instead. I will post a webrev today after JPRT finishes.

Thanks
Yumin

On 11/3/2014 12:05 PM, Mikhailo Seledtsov wrote:
> Hi Yumin,
>
>  If this API is intended to get offsets for various data structures, I
> would expect a data structure type identifier to be passed as a
> parameter. For instance,
>
> public native int getOffset(String dataStructureId, String fieldName)
>     where
>         dataStructureId would be some kind of ID for data structure, 
> either data structure name or internal alias
>         fieldName - the name of the field for which the offset value 
> is returned
>       A specific error code could be returned for unsupported 
> dataStructureId
>
> Alternatively, this could be an API specific to a given 
> data-structure. E.g. getMySpecifiedDatastructOffset(String fieldName)
>
> Thank you,
> Misha
>
>
> On 11/2/2014 9:52 PM, Yumin Qi wrote:
>> Misha,
>>
>>   It is a generic name; for now it only targets FileMapHeader, but
>> other VM data structures can be added if needed in the future. Maybe
>> a name like getOffsetForName(String name) is better?
>>
>> Thanks
>> Yumin
>>
>> On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
>>> Hi Yumin,
>>>
>>>  The name getOffsets() seems too generic. Perhaps, we could rename 
>>> it to be more specific to the task.
>>>
>>> Thank you,
>>> Misha
>>>
>>> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>>>> Please review the new changeset at same location.
>>>> The new API supplies an interface to get a data member's offset by its name.
>>>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>>>> Please review
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>
>>>>> Summary: An internal test failed because the variable offsets changed
>>>>> in hotspot. The test obtains the offsets in a hard-coded way. To
>>>>> reduce the risk from future changes of hotspot offsets, the fix adds
>>>>> a WhiteBox API function to get a map for FileMapHeaderInfo, which
>>>>> returns the members' offsets in a Hashtable.
>>>>>
>>>>> Tests: JPRT, jtreg.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>
>>>
>>
>


From christian.thalinger at oracle.com  Mon Nov  3 21:30:07 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 3 Nov 2014 13:30:07 -0800
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>
	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>
	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>
	<5453C230.8010709@oracle.com>
	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>
	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>
	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>
	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
Message-ID: <00703487-00EB-4E43-9613-01EE9EE64147@oracle.com>


> On Nov 3, 2014, at 12:41 PM, David Chase <david.r.chase at oracle.com> wrote:
> 
> 
> On 2014-11-03, at 3:09 PM, Peter Levart <peter.levart at gmail.com> wrote:
> 
>> Hi David,
>> 
>> I was thinking about the fact that java.lang.invoke code is leaking into java.lang.Class. Perhaps, If you don't mind rewriting the code, a better code structure would be, if j.l.Class changes only consisted of adding a simple:
>> ?
> 
>> This way all the worries about ordering of writes into array and/or size are gone. The array is still used to quickly search for an element, but VM only scans the linked-list.
>> 
>> What do you think of this?
> 
> I'm not sure.  I know Coleen Ph would like to see that happen.
> 
> A couple of people have vague plans to move more of the MemberName resolution into core libs.
> (Years ago I worked on a VM where *all* of this occurred in Java, but some of it was ahead of time.)
> 
> I heard mention of "we want to put more stuff in there" but I got the impression that already happened
> (there's reflection data, for example) so I'm not sure that makes sense.
> 
> There's also a proposal from people in the runtime to just use a jmethodid, take the hit of an extra indirection,
> and have no need for this worrisome jvm/java concurrency.
> 
> And if we instead wrote a hash table that only grew, and never relocated elements, we could
> (I think) allow non-synchronized O(1) probes of the table from the Java side, synchronized
> O(1) insertions from the Java side, and because nothing moves, a smaller dance with the
> VM.  I'm rather tempted to look into this - given the amount of work it would take to do the
> benchmarking to see if (a) jmethodid would have acceptable performance or (b) the existing
> costs are too high, I could instead just write fast code and be done.

...but you still have to do the benchmarking.  Let's not forget that there was a performance regression with the first C++ implementation of this.

> 
> And another way to view this is that we're now quibbling about performance, when we still
> have an existing correctness problem that this patch solves, so maybe we should just get this
> done and then file an RFE.
> 
> David
> 


From mikhailo.seledtsov at oracle.com  Mon Nov  3 21:34:44 2014
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Mon, 03 Nov 2014 13:34:44 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <5457F065.7070300@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com> <54571827.5050807@oracle.com>
	<5457E023.20801@oracle.com> <5457F065.7070300@oracle.com>
Message-ID: <5457F4F4.60804@oracle.com>

Yumin,

  OK, I see you chose to pass the name of the struct implicitly instead 
of using an explicit parameter. I have no strong objection to that.
Please make sure to clarify this in the comments.

Thank you,
Misha

On 11/3/2014 1:15 PM, Yumin Qi wrote:
> If you check the name passed to this function, it already tells you which
> structure is meant (I just changed it to getOffsetForName and am testing):
>
> wb.getOffsetForName("FileMapHeader::_crc")
>
> There is no need to give type info (that would need building a list of
> type info like in vmStructs). This is simple code for testing purposes,
> so I think we should keep it simple. I added that if the offset name is
> not supported, an exception is thrown instead. I will post a webrev
> today after JPRT finishes.
>
> Thanks
> Yumin
>
> On 11/3/2014 12:05 PM, Mikhailo Seledtsov wrote:
>> Hi Yumin,
>>
>>  If this API is intended to get offsets for various data structures,
>> I would expect a data structure type identifier to be passed as a
>> parameter. For instance,
>>
>> public native int getOffset(String dataStructureId, String fieldName)
>>     where
>>         dataStructureId would be some kind of ID for data structure, 
>> either data structure name or internal alias
>>         fieldName - the name of the field for which the offset value 
>> is returned
>>       A specific error code could be returned for unsupported 
>> dataStructureId
>>
>> Alternatively, this could be an API specific to a given 
>> data-structure. E.g. getMySpecifiedDatastructOffset(String fieldName)
>>
>> Thank you,
>> Misha
>>
>>
>> On 11/2/2014 9:52 PM, Yumin Qi wrote:
>>> Misha,
>>>
>>>   It is a generic name; for now it only targets FileMapHeader, but
>>> other VM data structures can be added if needed in the future. Maybe
>>> a name like getOffsetForName(String name) is better?
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
>>>> Hi Yumin,
>>>>
>>>>  The name getOffsets() seems too generic. Perhaps, we could rename 
>>>> it to be more specific to the task.
>>>>
>>>> Thank you,
>>>> Misha
>>>>
>>>> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>>>>> Please review the new changeset at same location.
>>>>> The new API supplies an interface to get a data member's offset by its name.
>>>>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>>>>> Please review
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>>
>>>>>> Summary: An internal test failed because the variable offsets changed
>>>>>> in hotspot. The test obtains the offsets in a hard-coded way. To
>>>>>> reduce the risk from future changes of hotspot offsets, the fix adds
>>>>>> a WhiteBox API function to get a map for FileMapHeaderInfo, which
>>>>>> returns the members' offsets in a Hashtable.
>>>>>>
>>>>>> Tests: JPRT, jtreg.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>
>>>>
>>>
>>
>


From yumin.qi at oracle.com  Tue Nov  4 00:11:21 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Mon, 03 Nov 2014 16:11:21 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <5457F4F4.60804@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com> <54571827.5050807@oracle.com>
	<5457E023.20801@oracle.com> <5457F065.7070300@oracle.com>
	<5457F4F4.60804@oracle.com>
Message-ID: <545819A9.20400@oracle.com>

I have changed the function name in WhiteBox.
New webrev at

http://cr.openjdk.java.net/~minqi/8062247/webrev01/

The function getOffsetForName(String name) takes a name of the form
"FileMapHeader::_magic", as in the VM. If the VM has no entry under that
name to look up an offset for, a RuntimeException with the message
"<offsetname> not found" will be thrown.
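
As a minimal sketch of that contract (not the code in the webrev): the
OffsetTable class and the offset values below are invented for
illustration; only the "Struct::_field" name form and the
"<offsetname> not found" RuntimeException come from the description
above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup; the offsets below are invented values.
final class OffsetTable {
    private final Map<String, Integer> offsets = new HashMap<>();

    OffsetTable() {
        offsets.put("FileMapHeader::_magic", 0);
        offsets.put("FileMapHeader::_crc", 4);
    }

    // The struct is named implicitly by the "Struct::" prefix of the key.
    int getOffsetForName(String name) {
        Integer offset = offsets.get(name);
        if (offset == null) {
            throw new RuntimeException(name + " not found");
        }
        return offset;
    }
}
```

A test would then call getOffsetForName("FileMapHeader::_crc") instead
of hard-coding the number, and an unsupported name fails loudly rather
than returning a wrong offset.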

tests: JPRT, jtreg

Thanks
Yumin


On 11/3/2014 1:34 PM, Mikhailo Seledtsov wrote:
> Yumin,
>
>  OK, I see you chose to pass the name of the struct implicitly instead 
> of using an explicit parameter. I have no strong objection to that.
> Please make sure to clarify this in the comments.
>
> Thank you,
> Misha
>
> On 11/3/2014 1:15 PM, Yumin Qi wrote:
>> If you check the name passed to this function, it already tells you which
>> structure is meant (I just changed it to getOffsetForName and am testing):
>>
>> wb.getOffsetForName("FileMapHeader::_crc")
>>
>> There is no need to give type info (that would need building a list of
>> type info like in vmStructs). This is simple code for testing purposes,
>> so I think we should keep it simple. I added that if the offset name is
>> not supported, an exception is thrown instead. I will post a webrev
>> today after JPRT finishes.
>>
>> Thanks
>> Yumin
>>
>> On 11/3/2014 12:05 PM, Mikhailo Seledtsov wrote:
>>> Hi Yumin,
>>>
>>>  If this API is intended to get offsets for various data structures,
>>> I would expect a data structure type identifier to be passed as a
>>> parameter. For instance,
>>>
>>> public native int getOffset(String dataStructureId, String fieldName)
>>>     where
>>>         dataStructureId would be some kind of ID for data structure, 
>>> either data structure name or internal alias
>>>         fieldName - the name of the field for which the offset value 
>>> is returned
>>>       A specific error code could be returned for unsupported 
>>> dataStructureId
>>>
>>> Alternatively, this could be an API specific to a given 
>>> data-structure. E.g. getMySpecifiedDatastructOffset(String fieldName)
>>>
>>> Thank you,
>>> Misha
>>>
>>>
>>> On 11/2/2014 9:52 PM, Yumin Qi wrote:
>>>> Misha,
>>>>
>>>>   It is a generic name; for now it only targets FileMapHeader, but
>>>> other VM data structures can be added if needed in the future. Maybe
>>>> a name like getOffsetForName(String name) is better?
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
>>>>> Hi Yumin,
>>>>>
>>>>>  The name getOffsets() seems too generic. Perhaps, we could rename 
>>>>> it to be more specific to the task.
>>>>>
>>>>> Thank you,
>>>>> Misha
>>>>>
>>>>> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>>>>>> Please review the new changeset at same location.
>>>>>> The new API supplies an interface to get a data member's offset by its name.
>>>>>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>>>>>> Please review
>>>>>>>
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>>>
>>>>>>> Summary: An internal test failed because the variable offsets changed
>>>>>>> in hotspot. The test obtains the offsets in a hard-coded way. To
>>>>>>> reduce the risk from future changes of hotspot offsets, the fix
>>>>>>> adds a WhiteBox API function to get a map for FileMapHeaderInfo,
>>>>>>> which returns the members' offsets in a Hashtable.
>>>>>>>
>>>>>>> Tests: JPRT, jtreg.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>
>>>>>
>>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Tue Nov  4 01:59:42 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 03 Nov 2014 18:59:42 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <5457084B.6070808@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
Message-ID: <5458330E.1080207@oracle.com>

David,

Thanks for the review! As usual, replies are embedded below...


On 11/2/14 9:44 PM, David Holmes wrote:
> Hi Dan,
>
> Looks good.

Thanks!


> Couple of nits and one semantic query below ...
>
> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>
> Formatting changes were a bit of a distraction.

Yes, I have no idea what got into me. Normally I do formatting
changes separately so the noise does not distract...

It turns out there is a constant defined that should be used
instead of all these literal '2's:

src/share/vm/oops/markOop.hpp:         monitor_value            = 2

Typically used as follows:

src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset = 
ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;

I will clean this up just for the files that I'm touching as
part of this fix.


>
> ---
>
> src/cpu/x86/vm/macroAssembler_x86.cpp
>
> Formatting changes were a bit of a distraction.

Same reply as for macroAssembler_sparc.cpp.


> 1929     // unconditionally set stackBox->_displaced_header = 3
> 1930     movptr(Address(boxReg, 0), 
> (int32_t)intptr_t(markOopDesc::unused_mark()));
>
> At 1870 we refer to box rather than stackBox. Also it takes some 
> sleuthing to realize that "3" here is somehow a pseudonym for 
> unused_mark(). Back up at 1808 we have a to-do:
>
> 1808     //   use markOop::unused_mark() instead of "3".
>
> so the current change seems to be implementing that, even though other 
> uses of "3" are left untouched.

I'll take a look at cleaning those up also...

In some cases markOopDesc::marked_value will work for the literal '3',
but in other cases we'll use markOop::unused_mark():

   static markOop unused_mark() {
     return (markOop) marked_value;
   }

to save us the noise of the (markOop) cast.


> ---
>
> src/share/vm/runtime/sharedRuntime.cpp
>
> 1794 JRT_BLOCK_ENTRY(void, 
> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* 
> lock, JavaThread* thread))
> 1795   if (!SafepointSynchronize::is_synchronizing()) {
> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return;
>
> Is it necessary to check is_synchronizing? If we are executing this 
> code we are not at a safepoint and the quick_enter won't change that,
> so I'm not sure what we are guarding against.

So this first state checker:

src/share/vm/runtime/safepoint.hpp:
inline static bool is_synchronizing()  { return _state == _synchronizing;  }

means that we want to go to a safepoint and:

inline static bool is_at_safepoint()   { return _state == _synchronized;  }

means that we are at a safepoint. Dice's optimization bails out if
we want to go to a safepoint and ObjectSynchronizer::quick_enter()
has a "No_Safepoint_Verifier nsv" in it so we're expecting that
code to be quick (and not go to a safepoint). I'm not seeing
anything obvious....

Sometimes we have to be careful with JavaThread suspend requests and
monitor acquisition, but I don't think that's a problem here... In
order for the "suspend requesting" thread to be surprised, the suspend
API, e.g., JVM/TI SuspendThread() has to return to the caller and then
the suspend target has to do something unexpected like acquire a monitor
that it was previously blocked upon when it was suspended. We've had
bugs like that in the past... In this optimization case, our target
thread is not blocked on a contended monitor...

In this particular case, the "suspend requesting" thread will set the
suspend request state on the target thread, but the target thread is
busy trying to enter this uncontended monitor (quickly). So the
"suspend requesting" thread will request a no-op safepoint, but it
won't return from the suspend API until that safepoint completes.
The safepoint won't complete until the target thread is done acquiring
the previously uncontended monitor... so the target thread will be
suspended while holding the previously uncontended monitor and the
"suspend requesting" thread will return from the suspend API all
happy...

Well, I don't see the reason either so I'll have to ping Dave Dice
and Karen Kinnear to see if either of them can fill in the history
here. This could be an abundance of caution case.


> ---
>
> src/share/vm/runtime/synchronizer.cpp
>
> Minor nit: line 153 the usual acronym is NPE (for 
> NullPointerException) not NPX

I'll do a search for uses of NPX and other uses of 'X' in exception
acronyms...


>
> Nit:  159     Thread * const ox
>
> Please change ox to owner.

Will do.

Thanks again for the review!

Dan


>
> ---
>
> Thanks,
> David
>
>
>
> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have the Contended Locking fast enter bucket ready for review.
>>
>> The code changes in this bucket are primarily a quick_enter()
>> function that works on inflated but uncontended Java monitors.
>> This quick_enter() function is used on the "slow path" for Java
>> Monitor enter operations when the built-in "fast path" (read
>> assembly code) doesn't work.
>>
>> This work is being tracked by the following bug ID:
>>
>>      JDK-8061553 Contended Locking fast enter bucket
>>      https://bugs.openjdk.java.net/browse/JDK-8061553
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>
>> Here is the JEP link:
>>
>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>
>> 8061553 summary of changes:
>>
>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>
>> - clean up spacing around some
>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>> - remove optional (EmitSync & 64) code
>> - change from cmp() to andcc() so icc.zf flag is set
>>
>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>
>> - remove optional (EmitSync & 2) code
>> - rewrite LP64 inflated lock code that tries to CAS in
>>    the new owner value to be more efficient
>>
>> interfaceSupport.hpp:
>>
>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>    JRT_BLOCK_ENTRY into two pieces.
>>
>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>
>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>    to permit ObjectSynchronizer::quick_enter() call
>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>
>> synchronizer.[ch]pp:
>>
>> - add ObjectSynchronizer::quick_enter() for entering an
>>    inflated but unowned Java monitor without thread state
>>    changes
>>
>> Testing:
>>
>> - Aurora Adhoc RT/SVC baseline batch
>> - JPRT test jobs
>> - MonitorEnterStresser micro-benchmark (in process)
>> - CallTimerGrid stress testing (in process)
>> - Aurora performance testing:
>>    - out of the box for the "promotion" and 32-bit server configs
>>    - heavy weight monitors for the "promotion" and 32-bit server configs
>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>      (in process)
>>
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan


From mikhailo.seledtsov at oracle.com  Tue Nov  4 02:30:31 2014
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Mon, 03 Nov 2014 18:30:31 -0800
Subject: RFR: 8062247: Allow WhiteBox test to access JVM offsets
In-Reply-To: <545819A9.20400@oracle.com>
References: <544F150A.3090500@oracle.com> <54518A60.9080800@oracle.com>
	<5453DB4F.70709@oracle.com> <54571827.5050807@oracle.com>
	<5457E023.20801@oracle.com> <5457F065.7070300@oracle.com>
	<5457F4F4.60804@oracle.com> <545819A9.20400@oracle.com>
Message-ID: <54583A47.2090207@oracle.com>

Looks good to me.

Misha

On 11/3/2014 4:11 PM, Yumin Qi wrote:
> I have changed the function name in WhiteBox.
> New webrev at
>
> http://cr.openjdk.java.net/~minqi/8062247/webrev01/
>
> The function getOffsetForName(String name) takes a name of the form
> "FileMapHeader::_magic", as in the VM. If the VM has no entry under that
> name to look up an offset for, a RuntimeException with the message
> "<offsetname> not found" will be thrown.
>
> tests: JPRT, jtreg
>
> Thanks
> Yumin
>
>
> On 11/3/2014 1:34 PM, Mikhailo Seledtsov wrote:
>> Yumin,
>>
>>  OK, I see you chose to pass the name of the struct implicitly 
>> instead of using an explicit parameter. I have no strong objection to 
>> that.
>> Please make sure to clarify this in the comments.
>>
>> Thank you,
>> Misha
>>
>> On 11/3/2014 1:15 PM, Yumin Qi wrote:
>>> If you check the name passed to this function, it already tells you which
>>> structure is meant (I just changed it to getOffsetForName and am testing):
>>>
>>> wb.getOffsetForName("FileMapHeader::_crc")
>>>
>>> There is no need to give type info (that would need building a list of
>>> type info like in vmStructs). This is simple code for testing purposes,
>>> so I think we should keep it simple. I added that if the offset name is
>>> not supported, an exception is thrown instead. I will post a webrev
>>> today after JPRT finishes.
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 11/3/2014 12:05 PM, Mikhailo Seledtsov wrote:
>>>> Hi Yumin,
>>>>
>>>>  If this API is intended to get offsets for various data
>>>> structures, I would expect a data structure type identifier to be
>>>> passed as a parameter. For instance,
>>>>
>>>> public native int getOffset(String dataStructureId, String fieldName)
>>>>     where
>>>>         dataStructureId would be some kind of ID for data 
>>>> structure, either data structure name or internal alias
>>>>         fieldName - the name of the field for which the offset 
>>>> value is returned
>>>>       A specific error code could be returned for unsupported 
>>>> dataStructureId
>>>>
>>>> Alternatively, this could be an API specific to a given 
>>>> data-structure. E.g. getMySpecifiedDatastructOffset(String fieldName)
>>>>
>>>> Thank you,
>>>> Misha
>>>>
>>>>
>>>> On 11/2/2014 9:52 PM, Yumin Qi wrote:
>>>>> Misha,
>>>>>
>>>>>   It is a generic name; for now it only targets FileMapHeader, but
>>>>> other VM data structures can be added if needed in the future. Maybe
>>>>> a name like getOffsetForName(String name) is better?
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 10/31/2014 11:56 AM, Mikhailo Seledtsov wrote:
>>>>>> Hi Yumin,
>>>>>>
>>>>>>  The name getOffsets() seems too generic. Perhaps, we could 
>>>>>> rename it to be more specific to the task.
>>>>>>
>>>>>> Thank you,
>>>>>> Misha
>>>>>>
>>>>>> On 10/29/2014 5:46 PM, Yumin Qi wrote:
>>>>>>> Please review the new changeset at same location.
>>>>>>> The new API supplies an interface to get a data member's offset by its name.
>>>>>>> http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>>
>>>>>>> On 10/27/2014 9:01 PM, Yumin Qi wrote:
>>>>>>>> Please review
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8062247
>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8062247/webrev00/
>>>>>>>>
>>>>>>>> Summary: An internal test failed because the variable offsets
>>>>>>>> changed in hotspot. The test obtains the offsets in a
>>>>>>>> hard-coded way. To reduce the risk from future changes of hotspot
>>>>>>>> offsets, the fix adds a WhiteBox API function to get a map for
>>>>>>>> FileMapHeaderInfo, which returns the members' offsets in a
>>>>>>>> Hashtable.
>>>>>>>>
>>>>>>>> Tests: JPRT, jtreg.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From david.holmes at oracle.com  Tue Nov  4 07:03:29 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 04 Nov 2014 17:03:29 +1000
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <5458330E.1080207@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
	<5458330E.1080207@oracle.com>
Message-ID: <54587A41.2020508@oracle.com>

Hi Dan,

One follow up deep below ...

On 4/11/2014 11:59 AM, Daniel D. Daugherty wrote:
> David,
>
> Thanks for the review! As usual, replies are embedded below...
>
>
> On 11/2/14 9:44 PM, David Holmes wrote:
>> Hi Dan,
>>
>> Looks good.
>
> Thanks!
>
>
>> Couple of nits and one semantic query below ...
>>
>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>
>> Formatting changes were a bit of a distraction.
>
> Yes, I have no idea what got into me. Normally I do formatting
> changes separately so the noise does not distract...
>
> It turns out there is a constant defined that should be used
> instead of all these literal '2's:
>
> src/share/vm/oops/markOop.hpp:         monitor_value            = 2
>
> Typically used as follows:
>
> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset =
> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>
> I will clean this up just for the files that I'm touching as
> part of this fix.
>
>
>>
>> ---
>>
>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>
>> Formatting changes were a bit of a distraction.
>
> Same reply as for macroAssembler_sparc.cpp.
>
>
>> 1929     // unconditionally set stackBox->_displaced_header = 3
>> 1930     movptr(Address(boxReg, 0),
>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>
>> At 1870 we refer to box rather than stackBox. Also it takes some
>> sleuthing to realize that "3" here is somehow a pseudonym for
>> unused_mark(). Back up at 1808 we have a to-do:
>>
>> 1808     //   use markOop::unused_mark() instead of "3".
>>
>> so the current change seems to be implementing that, even though other
>> uses of "3" are left untouched.
>
> I'll take a look at cleaning those up also...
>
> In some cases markOopDesc::marked_value will work for the literal '3',
> but in other cases we'll use markOop::unused_mark():
>
>    static markOop unused_mark() {
>      return (markOop) marked_value;
>    }
>
> to save us the noise of the (markOop) cast.
>
>
>> ---
>>
>> src/share/vm/runtime/sharedRuntime.cpp
>>
>> 1794 JRT_BLOCK_ENTRY(void,
>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>> lock, JavaThread* thread))
>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return;
>>
>> Is it necessary to check is_synchronizing? If we are executing this
>> code we are not at a safepoint and the quick_enter won't change that,
>> so I'm not sure what we are guarding against.
>
> So this first state checker:
>
> src/share/vm/runtime/safepoint.hpp:
> inline static bool is_synchronizing()  { return _state ==
> _synchronizing;  }
>
> means that we want to go to a safepoint and:
>
> inline static bool is_at_safepoint()   { return _state == _synchronized;  }
>
> means that we are at a safepoint. Dice's optimization bails out if
> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
> code to be quick (and not go to a safepoint). I'm not seeing
> anything obvious....

So it occurred to me that this is just an optimization not a true guard 
- as the safepoint could be initiated just after we do the check. So 
it's basically trying to ensure that if a safepoint has been requested 
then we don't unduly delay it by taking the non-safepointing quick_enter 
path.
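
To illustrate the point, here is a minimal Java sketch (not HotSpot code). The class MonitorSketch, the safepointRequested flag and the method names are all hypothetical stand-ins: the flag plays the role of is_synchronizing(), and the compare-and-set on the owner field plays the role of quick_enter() on an inflated but unowned monitor. The flag check is only a hint, since a request can arrive right after it is read.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of "check for a pending safepoint, then try the quick path".
final class MonitorSketch {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicBoolean safepointRequested = new AtomicBoolean();

    // Stand-in for quick_enter(): succeeds only if the monitor is unowned.
    boolean quickEnter(Thread self) {
        return owner.compareAndSet(null, self);
    }

    // Returns which path was taken, purely so the example can be observed.
    String enter(Thread self) {
        // Optimization, not a guard: the flag may flip right after this read.
        if (!safepointRequested.get() && quickEnter(self)) {
            return "quick";
        }
        return slowEnter(self);
    }

    // Stand-in for the slow path, which in a real VM could stop at a safepoint.
    private String slowEnter(Thread self) {
        while (!owner.compareAndSet(null, self)) {
            Thread.yield();
        }
        return "slow";
    }

    void exit(Thread self) {
        owner.compareAndSet(self, null);
    }

    void requestSafepoint(boolean requested) {
        safepointRequested.set(requested);
    }
}
```

With the flag clear, an uncontended enter takes the quick path; once a safepoint has been requested, the same enter goes through the slow path instead, so the request is not delayed by code that cannot stop for it.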

Cheers,
David

> Sometimes we have to be careful with JavaThread suspend requests and
> monitor acquisition, but I don't think that's a problem here... In
> order for the "suspend requesting" thread to be surprised, the suspend
> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
> the suspend target has to do something unexpected like acquire a monitor
> that it was previously blocked upon when it was suspended. We've had
> bugs like that in the past... In this optimization case, our target
> thread is not blocked on a contended monitor...
>
> In this particular case, the "suspend requesting" thread will set the
> suspend request state on the target thread, but the target thread is
> busy trying to enter this uncontended monitor (quickly). So the
> "suspend requesting" thread will request a no-op safepoint, but it
> won't return from the suspend API until that safepoint completes.
> The safepoint won't complete until the target thread is done acquiring
> the previously uncontended monitor... so the target thread will be
> suspended while holding the previous uncontended monitor and the
> "suspend requesting" thread will return from the suspend API all
> happy...
>
> Well, I don't see the reason either so I'll have to ping Dave Dice
> and Karen Kinnear to see if either of them can fill in the history
> here. This could be an abundance of caution case.
>
>
>> ---
>>
>> src/share/vm/runtime/synchronizer.cpp
>>
>> Minor nit: line 153 the usual acronym is NPE (for
>> NullPointerException) not NPX
>
> I'll do a search for uses of NPX and other uses of 'X' in exception
> acronyms...
>
>
>>
>> Nit:  159     Thread * const ox
>>
>> Please change ox to owner.
>
> Will do.
>
> Thanks again for the review!
>
> Dan
>
>
>>
>> ---
>>
>> Thanks,
>> David
>>
>>
>>
>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have the Contended Locking fast enter bucket ready for review.
>>>
>>> The code changes in this bucket are primarily a quick_enter()
>>> function that works on inflated but uncontended Java monitors.
>>> This quick_enter() function is used on the "slow path" for Java
>>> Monitor enter operations when the built-in "fast path" (read
>>> assembly code) doesn't work.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>      JDK-8061553 Contended Locking fast enter bucket
>>>      https://bugs.openjdk.java.net/browse/JDK-8061553
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> 8061553 summary of changes:
>>>
>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>
>>> - clean up spacing around some
>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>> - remove optional (EmitSync & 64) code
>>> - change from cmp() to andcc() so icc.zf flag is set
>>>
>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>
>>> - remove optional (EmitSync & 2) code
>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>    the new owner value to be more efficient
>>>
>>> interfaceSupport.hpp:
>>>
>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>    JRT_BLOCK_ENTRY into two pieces.
>>>
>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>
>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>    to permit ObjectSynchronizer::quick_enter() call
>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>
>>> synchronizer.[ch]pp:
>>>
>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>    inflated but unowned Java monitor without thread state
>>>    changes
>>>
>>> Testing:
>>>
>>> - Aurora Adhoc RT/SVC baseline batch
>>> - JPRT test jobs
>>> - MonitorEnterStresser micro-benchmark (in process)
>>> - CallTimerGrid stress testing (in process)
>>> - Aurora performance testing:
>>>    - out of the box for the "promotion" and 32-bit server configs
>>>    - heavy weight monitors for the "promotion" and 32-bit server configs
>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>      (in process)
>>>
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>

From peter.levart at gmail.com  Tue Nov  4 10:07:56 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Tue, 04 Nov 2014 11:07:56 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
Message-ID: <5458A57C.4060208@gmail.com>

On 11/03/2014 09:41 PM, David Chase wrote:
> On 2014-11-03, at 3:09 PM, Peter Levart <peter.levart at gmail.com> wrote:
>
>> Hi David,
>>
>> I was thinking about the fact that java.lang.invoke code is leaking into java.lang.Class. Perhaps, if you don't mind rewriting the code, a better code structure would be for the j.l.Class changes to consist only of adding a simple:
>> ?
>> This way all the worries about ordering of writes into array and/or size are gone. The array is still used to quickly search for an element, but VM only scans the linked-list.
>>
>> What do you think of this?
> I'm not sure.  I know Coleen Ph would like to see that happen.
>
> A couple of people have vague plans to move more of the MemberName resolution into core libs.
> (Years ago I worked on a VM where *all* of this occurred in Java, but some of it was ahead of time.)

Hi David,

>
> I heard mention of "we want to put more stuff in there" but I got the impression that already happened
> (there's reflection data, for example) so I'm not sure that makes sense.

Reflection is an API that is rooted in j.l.Class. If the plans are to 
move some of the java.lang.invoke public API to java.lang package (into 
the j.l.Class,  ...), then this is understandable.

>
> There's also a proposal from people in the runtime to just use a jmethodid, take the hit of an extra indirection,
> and no need for this worrisome jvm/java concurrency.

The linked list of MemberName(s) is also worry-less and doesn't need an 
extra indirection via jmethodid. Does the hit of extra indirection occur 
when invoking a MethodHandle?

>
> And if we instead wrote a hash table that only grew, and never relocated elements, we could
> (I think) allow non-synchronized O(1) probes of the table from the Java side, synchronized
> O(1) insertions from the Java side, and because nothing moves, a smaller dance with the
> VM.  I'm rather tempted to look into this -- given the amount of work it would take to do the
> benchmarking to see if (a) jmethodid would have acceptable performance or (b) the existing
> costs are too high, I could instead just write fast code and be done.

Are you thinking of an IdentityHashMap type of hash table (no 
linked-list of elements for same bucket, just search for 1st free slot 
on insert)? The problem would be how to pre-size the array. Count 
declared members?

>
> And another way to view this is that we're now quibbling about performance, when we still
> have an existing correctness problem that this patch solves, so maybe we should just get this
> done and then file an RFE.

Perhaps, yes. But note that questions about JMM and ordering of writes 
to array elements are about correctness, not performance.

Regards, Peter

>
> David
>


From daniel.daugherty at oracle.com  Tue Nov  4 14:46:51 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 04 Nov 2014 07:46:51 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <54587A41.2020508@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
	<5458330E.1080207@oracle.com> <54587A41.2020508@oracle.com>
Message-ID: <5458E6DB.8020409@oracle.com>

On 11/4/14 12:03 AM, David Holmes wrote:
> Hi Dan,
>
> One follow up deep below ...
>
> On 4/11/2014 11:59 AM, Daniel D. Daugherty wrote:
>> David,
>>
>> Thanks for the review! As usual, replies are embedded below...
>>
>>
>> On 11/2/14 9:44 PM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> Looks good.
>>
>> Thanks!
>>
>>
>>> Couple of nits and one semantic query below ...
>>>
>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>
>>> Formatting changes were a bit of a distraction.
>>
>> Yes, I have no idea what got into me. Normally I do formatting
>> changes separately so the noise does not distract...
>>
>> It turns out there is a constant defined that should be used
>> instead of all these literal '2's:
>>
>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>
>> Typically used as follows:
>>
>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset =
>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>
>> I will clean this up just for the files that I'm touching as
>> part of this fix.
>>
>>
>>>
>>> ---
>>>
>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>
>>> Formatting changes were a bit of a distraction.
>>
>> Same reply as for macroAssembler_sparc.cpp.
>>
>>
>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>> 1930     movptr(Address(boxReg, 0),
>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>
>>> At 1870 we refer to box rather than stackBox. Also it takes some
>>> sleuthing to realize that "3" here is somehow a pseudonym for
>>> unused_mark(). Back up at 1808 we have a to-do:
>>>
>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>
>>> so the current change seems to be implementing that, even though other
>>> uses of "3" are left untouched.
>>
>> I'll take a look at cleaning those up also...
>>
>> In some cases markOopDesc::marked_value will work for the literal '3',
>> but in other cases we'll use markOop::unused_mark():
>>
>>    static markOop unused_mark() {
>>      return (markOop) marked_value;
>>    }
>>
>> to save us the noise of the (markOop) cast.
>>
>>
>>> ---
>>>
>>> src/share/vm/runtime/sharedRuntime.cpp
>>>
>>> 1794 JRT_BLOCK_ENTRY(void,
>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>>> lock, JavaThread* thread))
>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) 
>>> return;
>>>
>>> Is it necessary to check is_synchronizing? If we are executing this
>>> code we are not at a safepoint and the quick_enter won't change that,
>>> so I'm not sure what we are guarding against.
>>
>> So this first state checker:
>>
>> src/share/vm/runtime/safepoint.hpp:
>> inline static bool is_synchronizing()  { return _state ==
>> _synchronizing;  }
>>
>> means that we want to go to a safepoint and:
>>
>> inline static bool is_at_safepoint()   { return _state == 
>> _synchronized;  }
>>
>> means that we are at a safepoint. Dice's optimization bails out if
>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>> code to be quick (and not go to a safepoint). I'm not seeing
>> anything obvious....
>
> So it occurred to me that this is just an optimization not a true 
> guard - as the safepoint could be initiated just after we do the 
> check. So it's basically trying to ensure that if a safepoint has been 
> requested then we don't unduly delay it by taking the non-safepointing 
> quick_enter path.

Sounds reasonable to me.

Dan


>
> Cheers,
> David
>
>> Sometimes we have to be careful with JavaThread suspend requests and
>> monitor acquisition, but I don't think that's a problem here... In
>> order for the "suspend requesting" thread to be surprised, the suspend
>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>> the suspend target has to do something unexpected like acquire a monitor
>> that it was previously blocked upon when it was suspended. We've had
>> bugs like that in the past... In this optimization case, our target
>> thread is not blocked on a contended monitor...
>>
>> In this particular case, the "suspend requesting" thread will set the
>> suspend request state on the target thread, but the target thread is
>> busy trying to enter this uncontended monitor (quickly). So the
>> "suspend requesting" thread will request a no-op safepoint, but it
>> won't return from the suspend API until that safepoint completes.
>> The safepoint won't complete until the target thread is done acquiring
>> the previously uncontended monitor... so the target thread will be
>> suspended while holding the previous uncontended monitor and the
>> "suspend requesting" thread will return from the suspend API all
>> happy...
>>
>> Well, I don't see the reason either so I'll have to ping Dave Dice
>> and Karen Kinnear to see if either of them can fill in the history
>> here. This could be an abundance of caution case.
>>
>>
>>> ---
>>>
>>> src/share/vm/runtime/synchronizer.cpp
>>>
>>> Minor nit: line 153 the usual acronym is NPE (for
>>> NullPointerException) not NPX
>>
>> I'll do a search for uses of NPX and other uses of 'X' in exception
>> acronyms...
>>
>>
>>>
>>> Nit:  159     Thread * const ox
>>>
>>> Please change ox to owner.
>>
>> Will do.
>>
>> Thanks again for the review!
>>
>> Dan
>>
>>
>>>
>>> ---
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>
>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>
>>>> The code changes in this bucket are primarily a quick_enter()
>>>> function that works on inflated but uncontended Java monitors.
>>>> This quick_enter() function is used on the "slow path" for Java
>>>> Monitor enter operations when the built-in "fast path" (read
>>>> assembly code) doesn't work.
>>>>
>>>> This work is being tracked by the following bug ID:
>>>>
>>>>      JDK-8061553 Contended Locking fast enter bucket
>>>>      https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>
>>>> Here is the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>
>>>> Here is the JEP link:
>>>>
>>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>
>>>> 8061553 summary of changes:
>>>>
>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>
>>>> - clean up spacing around some
>>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>> - remove optional (EmitSync & 64) code
>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>
>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>
>>>> - remove optional (EmitSync & 2) code
>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>    the new owner value to be more efficient
>>>>
>>>> interfaceSupport.hpp:
>>>>
>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>    JRT_BLOCK_ENTRY into two pieces.
>>>>
>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>
>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>    to permit ObjectSynchronizer::quick_enter() call
>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>
>>>> synchronizer.[ch]pp:
>>>>
>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>    inflated but unowned Java monitor without thread state
>>>>    changes
>>>>
>>>> Testing:
>>>>
>>>> - Aurora Adhoc RT/SVC baseline batch
>>>> - JPRT test jobs
>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>> - CallTimerGrid stress testing (in process)
>>>> - Aurora performance testing:
>>>>    - out of the box for the "promotion" and 32-bit server configs
>>>>    - heavy weight monitors for the "promotion" and 32-bit server 
>>>> configs
>>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>      (in process)
>>>>
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>


From david.r.chase at oracle.com  Tue Nov  4 15:19:24 2014
From: david.r.chase at oracle.com (David Chase)
Date: Tue, 4 Nov 2014 10:19:24 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5458A57C.4060208@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
Message-ID: <260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>


On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?

It can't be an IdentityHashMap, because we are interning member names.
In spite of my grumbling about benchmarking, I'm inclined to do that and try a couple of experiments.
One possibility would be to use two data structures, one for interning, the other for communication with the VM.
Because there's no lookup in the VM data structure, it can just be an array that gets elements appended,
and the synchronization dance is much simpler.

For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:

mn = resolve(args)
// deal with any errors
mn' = chm.get(mn)
if (mn' != null) return mn' // hoped-for-common-case

synchronized (something) {
  mn' = chm.get(mn)
  if (mn' != null) return mn'

  txn_class = mn.getDeclaringClass()

  while (true) {
    redef_count = txn_class.redefCount()
    mn = resolve(args)

    shared_array.add(mn);
    // barrier, because we are paranoid
    if (redef_count == txn_class.redefCount()) {
      chm.add(mn); // safe to publish to other Java threads.
      return mn;
    }
    shared_array.drop_last(); // Try again
  }
}

(Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).
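
Fleshed out a little, the idiom could look like the Java sketch below. This is only an illustration under assumptions: InternTable, the resolve supplier, the AtomicInteger standing in for the declaring class's redefinition count, and the ArrayList standing in for the VM-visible array are all hypothetical names, not code from the webrev.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch: one map for interning, one append-only list for the VM.
final class InternTable<T> {
    private final ConcurrentHashMap<T, T> interned = new ConcurrentHashMap<>();
    private final List<T> sharedArray = new ArrayList<>(); // guarded by "this"
    private final AtomicInteger redefCount = new AtomicInteger(); // stand-in counter

    T intern(Supplier<T> resolve) {
        T candidate = resolve.get();
        T existing = interned.get(candidate);
        if (existing != null) return existing; // hoped-for common case, no lock

        synchronized (this) {
            existing = interned.get(candidate);
            if (existing != null) return existing;

            while (true) {
                int before = redefCount.get();
                candidate = resolve.get();      // re-resolve inside the "transaction"
                sharedArray.add(candidate);     // visible to the (imaginary) VM first
                if (before == redefCount.get()) {
                    interned.put(candidate, candidate); // now publish to Java threads
                    return candidate;
                }
                sharedArray.remove(sharedArray.size() - 1); // redefined meanwhile, retry
            }
        }
    }

    void simulateRedefinition() { redefCount.incrementAndGet(); }

    synchronized int sharedSize() { return sharedArray.size(); }
}
```

Two intern() calls that resolve to equal values return the same instance, and only one entry ends up in the shared list.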

David

>> 
>> And another way to view this is that we're now quibbling about performance, when we still
>> have an existing correctness problem that this patch solves, so maybe we should just get this
>> done and then file an RFE.
> 
> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
> 
> Regards, Peter
> 
>> 
>> David


From coleen.phillimore at oracle.com  Tue Nov  4 15:26:39 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 04 Nov 2014 10:26:39 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5458A57C.4060208@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>	<632A5C98-B386-4625-BE12-355241581955@oracle.com>	<5457AA75.8090103@gmail.com>	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>	<5457E0F9.8090004@gmail.com>	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
Message-ID: <5458F02F.5020409@oracle.com>


On 11/04/2014 05:07 AM, Peter Levart wrote:
> On 11/03/2014 09:41 PM, David Chase wrote:
>> On 2014-11-03, at 3:09 PM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> I was thinking about the fact that java.lang.invoke code is leaking 
>>> into java.lang.Class. Perhaps, if you don't mind rewriting the code, 
>>> a better code structure would be, if j.l.Class changes only 
>>> consisted of adding a simple:
>>> ?

Peter,

I agreed with your comment about java/lang/invoke things leaking into 
java/lang/Class.  I think this should be in another class with a pointer 
in java/lang/Class to it.   I'm adding jdk9-dev because I think the 
core-libs people may have an opinion about this.

On the JVM side, I suggested jmethodID as an alternate place to store 
the Method* to save the JVM from knowing how to inspect the contents of 
the MemberName type.   I'm not sure if that's the best solution since 
jmethodIDs leak memory and except for jvmti, the code assumes there 
aren't many.   But I would like us to think of a better solution.  My 
original idea was to save method->idnum() like we do with reflection but 
finding Method* from idnum can be complicated and apparently the code to 
do this is in assembly code for MethodHandles.

I would be surprised if the extra level of indirection at these calls 
would be a performance issue given all the code added to intern these 
things.

The idea that we should ship this because it works and file an RFE to 
rewrite it later is not acceptable to me.

Thanks,
Coleen

>>> This way all the worries about ordering of writes into array and/or 
>>> size are gone. The array is still used to quickly search for an 
>>> element, but VM only scans the linked-list.
>>>
>>> What do you think of this?
>> I'm not sure.  I know Coleen Ph would like to see that happen.
>>
>> A couple of people have vague plans to move more of the MemberName 
>> resolution into core libs.
>> (Years ago I worked on a VM where *all* of this occurred in Java, but 
>> some of it was ahead of time.)
>
> Hi David,
>
>>
>> I heard mention of "we want to put more stuff in there" but I got the 
>> impression that already happened
>> (there's reflection data, for example) so I'm not sure that makes sense.
>
> Reflection is an API that is rooted in j.l.Class. If the plans are to 
> move some of the java.lang.invoke public API to java.lang package 
> (into the j.l.Class,  ...), then this is understandable.
>
>>
>> There's also a proposal from people in the runtime to just use a 
>> jmethodid, take the hit of an extra indirection,
>> and no need for this worrisome jvm/java concurrency.
>
> The linked list of MemberName(s) is also worry-less and doesn't need 
> an extra indirection via jmethodid. Does the hit of extra indirection 
> occur when invoking a MethodHandle?
>
>>
>> And if we instead wrote a hash table that only grew, and never 
>> relocated elements, we could
>> (I think) allow non-synchronized O(1) probes of the table from the 
>> Java side, synchronized
>> O(1) insertions from the Java side, and because nothing moves, a 
>> smaller dance with the
>> VM.  I'm rather tempted to look into this -- given the amount of work 
>> it would take to do the
>> benchmarking to see if (a) jmethodid would have acceptable 
>> performance or (b) the existing
>> costs are too high, I could instead just write fast code and be done.
>
> Are you thinking of an IdentityHashMap type of hash table (no 
> linked-list of elements for same bucket, just search for 1st free slot 
> on insert)? The problem would be how to pre-size the array. Count 
> declared members?
>
>>
>> And another way to view this is that we're now quibbling about 
>> performance, when we still
>> have an existing correctness problem that this patch solves, so maybe 
>> we should just get this
>> done and then file an RFE.
>
> Perhaps, yes. But note that questions about JMM and ordering of writes 
> to array elements are about correctness, not performance.
>
> Regards, Peter
>
>>
>> David
>>
>


From peter.levart at gmail.com  Tue Nov  4 16:48:14 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Tue, 04 Nov 2014 17:48:14 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
Message-ID: <5459034E.8070809@gmail.com>

On 11/04/2014 04:19 PM, David Chase wrote:
> On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
>> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?
> It can't be an IdentityHashMap, because we are interning member names.

I know it can't be IdentityHashMap - I just wondered if you were 
thinking of an IdentityHashMap-like data structure in contrast to a 
standard HashMap-like one. Not in terms of the equality/hashCode used, 
but in terms of internal data structure. IdentityHashMap is just an 
array of elements (well, pairs of them - key and value are placed in two 
consecutive array slots). Lookup searches linearly through the array, 
starting from a hashCode-based index, until it finds the element or the 
1st empty array slot. It's very easy to implement if the only operations 
are get() and put(), and it could be used both for interning and as a 
shared structure for the VM to scan, but the array has to be sized to at 
least 3/2 the number of elements for performance not to degrade.
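
A minimal Java sketch of such a structure, to show what I mean. It is single-threaded and fixed-size on purpose; ProbeTable, intern() and the sizing rule in the constructor are names and choices made up for this example, and it uses equals()/hashCode() rather than identity.

```java
// Hypothetical open-addressing table: one flat array, linear probing, no buckets.
final class ProbeTable<T> {
    private final Object[] slots;

    ProbeTable(int expectedElements) {
        // At least 3/2 the expected element count, and always one spare empty slot.
        int capacity = Math.max(expectedElements + 1, (expectedElements * 3 + 1) / 2);
        slots = new Object[capacity];
    }

    private int startIndex(Object element) {
        return (element.hashCode() & 0x7fffffff) % slots.length;
    }

    // Returns the stored element equal to the argument, or null if absent.
    @SuppressWarnings("unchecked")
    T get(T element) {
        int i = startIndex(element);
        for (int probes = 0; probes < slots.length; probes++) {
            Object slot = slots[i];
            if (slot == null) return null;            // 1st empty slot ends the search
            if (slot.equals(element)) return (T) slot;
            i = (i + 1) % slots.length;
        }
        return null;
    }

    // Returns the already stored equal element, or stores and returns the argument.
    @SuppressWarnings("unchecked")
    T intern(T element) {
        int i = startIndex(element);
        for (int probes = 0; probes < slots.length; probes++) {
            Object slot = slots[i];
            if (slot == null) {
                slots[i] = element;                   // elements never move afterwards
                return element;
            }
            if (slot.equals(element)) return (T) slot;
            i = (i + 1) % slots.length;
        }
        throw new IllegalStateException("table full");
    }
}
```

Because nothing is ever relocated, a scanner only has to walk the array and skip the null slots; the open question of how to pre-size it remains.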

> In spite of my grumbling about benchmarking, I'm inclined to do that and try a couple of experiments.
> One possibility would be to use two data structures, one for interning, the other for communication with the VM.
> Because there's no lookup in the VM data structure, it can just be an array that gets elements appended,
> and the synchronization dance is much simpler.
>
> For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:
>
> mn = resolve(args)
> // deal with any errors
> mn' = chm.get(mn)
> if (mn' != null) return mn' // hoped-for-common-case
>
> synchronized (something) {
>   mn' = chm.get(mn)
>   if (mn' != null) return mn'
>
>   txn_class = mn.getDeclaringClass()
>
>   while (true) {
>     redef_count = txn_class.redefCount()
>     mn = resolve(args)
>
>     shared_array.add(mn);
>     // barrier, because we are paranoid
>     if (redef_count == txn_class.redefCount()) {
>       chm.add(mn); // safe to publish to other Java threads.
>       return mn;
>     }
>     shared_array.drop_last(); // Try again
>   }
> }
>
> (Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).

Yes, that's similar to what I suggested, except that I used a linked-list 
of MemberName(s) instead of the "shared_array" (easier to reason about 
ordering of writes) and a sorted array of MemberName(s) instead of the 
"chm" in your scheme above. ConcurrentHashMap would certainly be the 
most performant solution in terms of lookup/insertion time and 
concurrent throughput, but it will use more heap than a simple packed 
array of MemberNames. CHM is much better in JDK 8 regarding heap use, 
though.

A combination of the two approaches is also possible:

- instead of maintaining a "shared_array" of MemberName(s), have them 
form a linked-list (you trade a slot in array for 'next' pointer in 
MemberName)
- use ConcurrentHashMap for interning.
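
A rough Java sketch of that combination, assuming a simplified Member class with a name and a 'next' field in place of the real MemberName (LinkedInternTable, Member and countLinked() are hypothetical names for this example only):

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: CHM for interning, intrusive linked list for scanning.
final class LinkedInternTable {
    static final class Member {
        final String name;
        Member next; // set once, before the node is published through "head"

        Member(String name) { this.name = name; }

        @Override public boolean equals(Object o) {
            return o instanceof Member && ((Member) o).name.equals(name);
        }

        @Override public int hashCode() { return name.hashCode(); }
    }

    private final ConcurrentHashMap<Member, Member> interned = new ConcurrentHashMap<>();
    private volatile Member head; // list entry point for the scanner

    Member intern(Member candidate) {
        Member existing = interned.get(candidate);
        if (existing != null) return existing; // lock-free common case

        synchronized (this) {
            existing = interned.get(candidate);
            if (existing != null) return existing;
            candidate.next = head;   // link first ...
            head = candidate;        // ... then publish with a volatile write
            interned.put(candidate, candidate);
            return candidate;
        }
    }

    // Walks the list the way a scanner would; here it just counts nodes.
    int countLinked() {
        int count = 0;
        for (Member m = head; m != null; m = m.next) count++;
        return count;
    }
}
```

The only ordering a scanner depends on is "next is written before head", which is easier to argue about than array element and size updates.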

Regards, Peter

>
> David
>
>>> And another way to view this is that we're now quibbling about performance, when we still
>>> have an existing correctness problem that this patch solves, so maybe we should just get this
>>> done and then file an RFE.
>> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
>>
>> Regards, Peter
>>
>>> David


From daniel.daugherty at oracle.com  Tue Nov  4 18:26:02 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 04 Nov 2014 11:26:02 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <5458330E.1080207@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
	<5458330E.1080207@oracle.com>
Message-ID: <54591A3A.1090005@oracle.com>

The cleanup is turning into a bigger change than the fast enter
bucket itself so I'm spinning the cleanup into a new bug:

     JDK-8062851 cleanup ObjectMonitor offset adjustments
     https://bugs.openjdk.java.net/browse/JDK-8062851

Yes, this means that the Contended Locking cleanup bucket has reopened
for yet another change...

We'll get back to "fast enter" after the dust has settled...

Dan


On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
> David,
>
> Thanks for the review! As usual, replies are embedded below...
>
>
> On 11/2/14 9:44 PM, David Holmes wrote:
>> Hi Dan,
>>
>> Looks good.
>
> Thanks!
>
>
>> Couple of nits and one semantic query below ...
>>
>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>
>> Formatting changes were a bit of a distraction.
>
> Yes, I have no idea what got into me. Normally I do formatting
> changes separately so the noise does not distract...
>
> It turns out there is a constant defined that should be used
> instead of all these literal '2's:
>
> src/share/vm/oops/markOop.hpp:         monitor_value            = 2
>
> Typically used as follows:
>
> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset = 
> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>
> I will clean this up just for the files that I'm touching as
> part of this fix.
>
>
>>
>> ---
>>
>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>
>> Formatting changes were a bit of a distraction.
>
> Same reply as for macroAssembler_sparc.cpp.
>
>
>> 1929     // unconditionally set stackBox->_displaced_header = 3
>> 1930     movptr(Address(boxReg, 0), 
>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>
>> At 1870 we refer to box rather than stackBox. Also it takes some 
>> sleuthing to realize that "3" here is somehow a pseudonym for 
>> unused_mark(). Back up at 1808 we have a to-do:
>>
>> 1808     //   use markOop::unused_mark() instead of "3".
>>
>> so the current change seems to be implementing that, even though 
>> other uses of "3" are left untouched.
>
> I'll take a look at cleaning those up also...
>
> In some cases markOopDesc::marked_value will work for the literal '3',
> but in other cases we'll use markOop::unused_mark():
>
>   static markOop unused_mark() {
>     return (markOop) marked_value;
>   }
>
> to save us the noise of the (markOop) cast.
>
>
>> ---
>>
>> src/share/vm/runtime/sharedRuntime.cpp
>>
>> 1794 JRT_BLOCK_ENTRY(void, 
>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* 
>> lock, JavaThread* thread))
>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) 
>> return;
>>
>> Is it necessary to check is_synchronizing? If we are executing this 
>> code we are not at a safepoint and the quick_enter won't change that, 
>> so I'm not sure what we are guarding against.
>
> So this first state checker:
>
> src/share/vm/runtime/safepoint.hpp:
> inline static bool is_synchronizing()  { return _state == 
> _synchronizing;  }
>
> means that we want to go to a safepoint and:
>
> inline static bool is_at_safepoint()   { return _state == 
> _synchronized;  }
>
> means that we are at a safepoint. Dice's optimization bails out if
> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
> code to be quick (and not go to a safepoint). I'm not seeing
> anything obvious....
>
> Sometimes we have to be careful with JavaThread suspend requests and
> monitor acquisition, but I don't think that's a problem here... In
> order for the "suspend requesting" thread to be surprised, the suspend
> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
> the suspend target has to do something unexpected like acquire a monitor
> that it was previously blocked upon when it was suspended. We've had
> bugs like that in the past... In this optimization case, our target
> thread is not blocked on a contended monitor...
>
> In this particular case, the "suspend requesting" thread will set the
> suspend request state on the target thread, but the target thread is
> busy trying to enter this uncontended monitor (quickly). So the
> "suspend requesting" thread, will request a no-op safepoint, but it
> won't return from the suspend API until that safepoint completes.
> The safepoint won't complete until the target thread is done acquiring
> the previously uncontended monitor... so the target thread will be
> suspended while holding the previous uncontended monitor and the
> "suspend requesting" thread will return from the suspend API all
> happy...
>
> Well, I don't see the reason either so I'll have to ping Dave Dice
> and Karen Kinnear to see if either of them can fill in the history
> here. This could be an abundance of caution case.
>
>
>> ---
>>
>> src/share/vm/runtime/synchronizer.cpp
>>
>> Minor nit: line 153 the usual acronym is NPE (for 
>> NullPointerException) not NPX
>
> I'll do a search for uses of NPX and other uses of 'X' in exception
> acronyms...
>
>
>>
>> Nit:  159     Thread * const ox
>>
>> Please change ox to owner.
>
> Will do.
>
> Thanks again for the review!
>
> Dan
>
>
>>
>> ---
>>
>> Thanks,
>> David
>>
>>
>>
>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have the Contended Locking fast enter bucket ready for review.
>>>
>>> The code changes in this bucket are primarily a quick_enter()
>>> function that works on inflated but uncontended Java monitors.
>>> This quick_enter() function is used on the "slow path" for Java
>>> Monitor enter operations when the built-in "fast path" (read
>>> assembly code) doesn't work.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>      JDK-8061553 Contended Locking fast enter bucket
>>>      https://bugs.openjdk.java.net/browse/JDK-8061553
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> 8061553 summary of changes:
>>>
>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>
>>> - clean up spacing around some
>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>> - remove optional (EmitSync & 64) code
>>> - change from cmp() to andcc() so icc.zf flag is set
>>>
>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>
>>> - remove optional (EmitSync & 2) code
>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>    the new owner value to be more efficient
>>>
>>> interfaceSupport.hpp:
>>>
>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>    JRT_BLOCK_ENTRY into two pieces.
>>>
>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>
>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>    to permit ObjectSynchronizer::quick_enter() call
>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>
>>> synchronizer.[ch]pp:
>>>
>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>    inflated but unowned Java monitor without thread state
>>>    changes
>>>
>>> Testing:
>>>
>>> - Aurora Adhoc RT/SVC baseline batch
>>> - JPRT test jobs
>>> - MonitorEnterStresser micro-benchmark (in process)
>>> - CallTimerGrid stress testing (in process)
>>> - Aurora performance testing:
>>>    - out of the box for the "promotion" and 32-bit server configs
>>>    - heavy weight monitors for the "promotion" and 32-bit server 
>>> configs
>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>      (in process)
>>>
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>
>


From serguei.spitsyn at oracle.com  Tue Nov  4 19:57:54 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Nov 2014 11:57:54 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <5457E36A.3020800@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com>
Message-ID: <54592FC2.7090406@oracle.com>

Hi Jeremy and Coleen,

I'm reviewing this too.
We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.

Thanks,
Serguei

On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>
> Hi Jeremy,
>
> I reviewed your new code and it looks fine.  I had one comment in
>
> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html 
>
>
> The name "need_to_resolve" doesn't make sense when reading this code.
> Isn't it more like "need_to_ensure_space"?  The other name makes me think
> of method resolution, which this code doesn't do.
>
> I was trying to find a way to make this new code not appear twice 
> (maybe with a local jvmtiEnv function get_jmethodID(m) - instanceK_h 
> is m->method_holder()).

Agreed on the above.

>
> Also, I was trying to figure out if the new class in utilities called 
> chunkedList.hpp could be used to store jmethodIDs, since the data 
> structures are similar.  There are still more things JNIMethodBlock
> has to do, so I think a specialized structure is still needed (which is 
> why I originally wrote it to be very simple).  I'm not sure if the 
> comment above it still applies. Maybe only the first and third 
> sentences.  Can you rewrite the comment slightly?
>
> Your other comments in the changes are good.
>
> I can't completely answer your question about reusing free_methods - 
> but if a jmethodID is created provisionally in 
> InstanceKlass::get_jmethod_id and not needed because it loses the race 
> in the method id cache, it's never handed back to native code, so it's 
> safe to reuse.  This is different than jmethodIDs for methods that are 
> unloaded.  They are cleared and never reused.  At least that's my 
> reading of this caching code but it's pretty complicated stuff.
>
> I've also run our nsk and jck vm/jvmti on this change and they all 
> passed.  I'd be happy to sponsor it with these suggested changes and 
> it needs another reviewer.
>
> Thanks for diagnosing and fixing this problem!
> Coleen
>
>
> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>> There's a significant regression in the speed of JVMTI GetClassMethods
>> in JDK8. I've tracked this down to allocation of jmethodids in a tight
>> loop. The issue can be addressed by preallocating enough space for all
>> of the jmethodids when starting the operation and not iterating over all
>> of the existing jmethodids when you allocate a new one.
>>
>> A patch is here:
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>
>> A reproducible test case can be found here:
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>
>> It's a benchmark, though: I have no idea how to turn it into a test.
>>
>> For whoever reviews it: can you explain to me why it is okay that this
>> code reuses jmethodIDs (in JNIMethodBlock::add_method)?  I can imagine a
>> lot of problems stemming from accidental reuse.
>>
>> Jeremy
>


From jeremymanson at google.com  Tue Nov  4 19:58:26 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Tue, 4 Nov 2014 11:58:26 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <5457E36A.3020800@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com>
Message-ID: <CAPYFHW0dRoDcgGQqnzn9meQMU-6cze2YPAuKJUbyQzgUb+p2aA@mail.gmail.com>

Thanks for taking a look, Coleen!

On Mon, Nov 3, 2014 at 12:19 PM, Coleen Phillimore <
coleen.phillimore at oracle.com> wrote:

>
> Hi Jeremy,
>
> I reviewed your new code and it looks fine.  I had one comment in
>
> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/
> share/vm/prims/jvmtiEnv.cpp.udiff.html
>
> The name "need_to_resolve" doesn't make sense when reading this code.
> Isn't it more like "need_to_ensure_space"?  The other name makes me think
> of method resolution, which this code doesn't do.
>

Hmmm... it is there to tell you that there are jmethodids for that class
that haven't been instantiated.  Is it all right if I change it to
"jmethodids_found" (and reverse the sense of it)?


> I was trying to find a way to make this new code not appear twice (maybe
> with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is
> m->method_holder()).
>

You know, I initially did that, but this file is parsed with some weird XSL
setup that doesn't allow methods other than the ones that map directly to
the JVMTI calls.

> Also, I was trying to figure out if the new class in utilities called
> chunkedList.hpp could be used to store jmethodIDs, since the data
> structures are similar.  There are still more things JNIMethodBlock has
> to do, so I think a specialized structure is still needed (which is why I
> originally wrote it to be very simple).  I'm not sure if the comment above
> it still applies.  Maybe only the first and third sentences.  Can you
> rewrite the comment slightly?
>

chunkedList wouldn't work as is, because it doesn't let you parameterize
the bucket size, but it could probably be made to work (in the same way I
made this one work).  It's also an oddly bare-bones class - I'm not sure
why it doesn't have contains and insert methods and so on.
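
To make the preallocation idea from this thread concrete, here is a rough
Java sketch (not the HotSpot C++ code in the webrev). It assumes a
hypothetical ChunkedIdTable whose chunk size is a constructor parameter and
whose ensureCapacity() reserves slots up front, so that each later add()
appends in constant time instead of walking the existing entries. All names
are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not JNIMethodBlock or chunkedList.hpp.
final class ChunkedIdTable<T> {
    private final int chunkSize;                          // parameterizable bucket size
    private final List<Object[]> chunks = new ArrayList<>();
    private int size;                                     // number of slots in use

    ChunkedIdTable(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Reserve room for `extra` more entries before a bulk operation.
    void ensureCapacity(int extra) {
        int needed = size + extra;
        while (chunks.size() * chunkSize < needed) {
            chunks.add(new Object[chunkSize]);
        }
    }

    // Append at the tail; never scans existing entries for a free slot.
    int add(T value) {
        ensureCapacity(1);
        int index = size++;
        chunks.get(index / chunkSize)[index % chunkSize] = value;
        return index;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (T) chunks.get(index / chunkSize)[index % chunkSize];
    }

    int size() { return size; }

    int chunkCount() { return chunks.size(); }
}
```

A caller that knows it is about to create N entries would call
ensureCapacity(N) once and then add() N times; no new chunks are allocated
inside that loop.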

I'm not in love with the idea of doing it, because a) it would complicate
my backport and b) I don't really have a lot of time to do hotspot
refactoring, but if you think it should happen, I can make it happen
(perhaps not in a timely way :) ).

As for the comment, I'll eliminate all but the first and third sentences.


> Your other comments in the changes are good.
>
> I can't completely answer your question about reusing free_methods - but
> if a jmethodID is created provisionally in InstanceKlass::get_jmethod_id
> and not needed because it loses the race in the method id cache, it's never
> handed back to native code, so it's safe to reuse.  This is different than
> jmethodIDs for methods that are unloaded.  They are cleared and never
> reused.  At least that's my reading of this caching code but it's pretty
> complicated stuff.
>

Ah, I see.  Thanks.
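
Coleen's explanation can be pictured with a small Java sketch (an analogy
only, not the HotSpot C++ in InstanceKlass::get_jmethod_id). It assumes a
hypothetical ProvisionalIdCache: an ID is allocated provisionally, a
compare-and-set decides which thread's ID gets published, and a losing ID
goes back to a free list because no caller ever saw it. Every name here is
made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration only; not the HotSpot method id cache.
final class ProvisionalIdCache {
    static final class Id {
        final int serial;
        Id(int serial) { this.serial = serial; }
    }

    private final AtomicReferenceArray<Id> cache;          // one slot per method
    private final ArrayDeque<Id> freeList = new ArrayDeque<>();
    private int nextSerial;

    ProvisionalIdCache(int methodCount) {
        cache = new AtomicReferenceArray<>(methodCount);
    }

    Id idFor(int methodIndex) {
        Id cached = cache.get(methodIndex);
        if (cached != null) {
            return cached;                                  // already published
        }
        Id provisional = allocate();
        if (cache.compareAndSet(methodIndex, null, provisional)) {
            return provisional;                             // won the race; now visible to callers
        }
        release(provisional);                               // lost: never escaped, so reusable
        return cache.get(methodIndex);
    }

    // Prefer a recycled, never-published ID over a fresh one.
    synchronized Id allocate() {
        Id reused = freeList.poll();
        return reused != null ? reused : new Id(nextSerial++);
    }

    synchronized void release(Id id) {
        freeList.push(id);
    }
}
```

The sketch has no counterpart for unloaded methods; per the explanation
above, those IDs are cleared and never reused, so they would not be handed
to release().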


> I've also run our nsk and jck vm/jvmti on this change and they all
> passed.  I'd be happy to sponsor it with these suggested changes and it
> needs another reviewer.
>

I've cc'd Chuck Rasbold, who has already reviewed it internally and given
it the thumbs-up.  I'm sure he would be happy to do so publicly, too.

> Thanks for diagnosing and fixing this problem!


Happy to do it!  And so are the programs that use my JVMTI!

Jeremy

From jeremymanson at google.com  Tue Nov  4 19:59:33 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Tue, 4 Nov 2014 11:59:33 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <54592FC2.7090406@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
Message-ID: <CAPYFHW3jmFGzE1t7FfEZPMZce3tUyXhguyHuDhMbTvqSFV9nEg@mail.gmail.com>

Weird coincidence.

On Tue, Nov 4, 2014 at 11:57 AM, serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jeremy and Coleen,
>
> I'm reviewing this too.
> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>
> Thanks,
> Serguei
>
> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>
>>
>> Hi Jeremy,
>>
>> I reviewed your new code and it looks fine.  I had one comment in
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/
>> share/vm/prims/jvmtiEnv.cpp.udiff.html
>>
>> The name "need_to_resolve" doesn't make sense when reading this code.
>> Isn't it more like "need_to_ensure_space"?  The other name makes me think
>> of method resolution, which this code doesn't do.
>>
>> I was trying to find a way to make this new code not appear twice (maybe
>> with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is
>> m->method_holder()).
>>
>
> Agreed on the above.
>


Per my message to Coleen, you can't add methods to this file.  All other
possibilities seemed like overkill, but other suggestions welcome.

Jeremy

From coleen.phillimore at oracle.com  Tue Nov  4 20:40:33 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 04 Nov 2014 15:40:33 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW0dRoDcgGQqnzn9meQMU-6cze2YPAuKJUbyQzgUb+p2aA@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com>
	<CAPYFHW0dRoDcgGQqnzn9meQMU-6cze2YPAuKJUbyQzgUb+p2aA@mail.gmail.com>
Message-ID: <545939C1.6040703@oracle.com>


Hi Jeremy,

Having Chuck reply publicly to the review would be good.  We miss seeing 
his emails :)

On 11/04/2014 02:58 PM, Jeremy Manson wrote:
>
> Thanks for taking a look, Coleen!
>
> On Mon, Nov 3, 2014 at 12:19 PM, Coleen Phillimore
> <coleen.phillimore at oracle.com> wrote:
>
>
>     Hi Jeremy,
>
>     I reviewed your new code and it looks fine.  I had one comment in
>
>     http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html
>
>     The name "need_to_resolve" doesn't make sense when reading this
>     code.  Isn't it more like "need_to_ensure_space"?  The other name
>     makes me think of method resolution, which this code doesn't do.
>
>
> Hmmm... it is there to tell you that there are jmethodids for that 
> class that haven't been instantiated. Is it all right if I change it 
> to "jmethodids_found" (and reverse the sense of it)?

Okay, yes jmethodids_found makes more sense to me in this context.

>     I was trying to find a way to make this new code not appear twice
>     (maybe with a local jvmtiEnv function get_jmethodID(m) -
>     instanceK_h is m->method_holder()).
>
>
> You know, I initially did that, but this file is parsed with some 
> weird XSL setup that doesn't allow methods other than the ones that 
> map directly to the JVMTI calls.

Oh, yes.  You are right.  The code is fine then.  It's not too much 
duplicated.

>
>     Also, I was trying to figure out if the new class in utilities
>     called chunkedList.hpp could be used to store jmethodIDs, since
>     the data structures are similar.  There are still more things
>     JNIMethodBlock has to do, so I think a specialized structure is
>     still needed (which is why I originally wrote it to be very
>     simple).  I'm not sure if the comment above it still applies. 
>     Maybe only the first and third sentences.  Can you rewrite the
>     comment slightly?
>
>
> chunkedList wouldn't work as is, because it doesn't let you 
> parameterize the bucket size, but it could probably be made to work 
> (in the same way I made this one work).  It's also an oddly bare-bones 
> class - I'm not sure why it doesn't have contains and insert methods 
> and so on.
>
> I'm not in love with the idea of doing it, because a) it would 
> complicate my backport and b) I don't really have a lot of time to do 
> hotspot refactoring, but if you think it should happen, I can make it 
> happen (perhaps not in a timely way :) ).
>

No, I don't think you should do this.  It was a general comment that 
this utility class is available for such things but has only one use so far.

> As for the comment, I'll eliminate all but the first and third sentences.

Thanks!

>     Your other comments in the changes are good.
>
>     I can't completely answer your question about reusing free_methods
>     - but if a jmethodID is created provisionally in
>     InstanceKlass::get_jmethod_id and not needed because it loses the
>     race in the method id cache, it's never handed back to native
>     code, so it's safe to reuse.  This is different than jmethodIDs
>     for methods that are unloaded. They are cleared and never reused. 
>     At least that's my reading of this caching code but it's pretty
>     complicated stuff.
>
>
> Ah, I see.  Thanks.
>
>     I've also run our nsk and jck vm/jvmti on this change and they all
>     passed.  I'd be happy to sponsor it with these suggested changes
>     and it needs another reviewer.
>
>
> I've cc'd Chuck Rasbold, who has already reviewed it internally and 
> given it the thumbs-up.  I'm sure he would be happy to do so publicly, 
> too.
>
>     Thanks for diagnosing and fixing this problem!
>
>
> Happy to do it!  And so are the programs that use my JVMTI!
>

Thank you!   If you commit and send me the result of hg export your 
changeset, then I'll get your comments also and won't get the chance to 
mess up and not use commit -u jmanson.

Coleen

> Jeremy
>


From coleen.phillimore at oracle.com  Tue Nov  4 20:43:02 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 04 Nov 2014 15:43:02 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <54592FC2.7090406@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
Message-ID: <54593A56.4050603@oracle.com>


On 11/04/2014 02:57 PM, serguei.spitsyn at oracle.com wrote:
> Hi Jeremy and Coleen,
>
> I'm reviewing this too.
> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.

Hi Serguei,  I ran all of vm.quick.testlist on this which includes 
jvmti, jdi tests.  I'll run jtreg jdi tests too (where are they?)

Thanks,
Coleen

>
> Thanks,
> Serguei
>
> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>
>> Hi Jeremy,
>>
>> I reviewed your new code and it looks fine.  I had one comment in
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html 
>>
>>
>> The name "need_to_resolve" doesn't make sense when reading this
>> code.  Isn't it more like "need_to_ensure_space"?  The other name
>> makes me think of method resolution, which this code doesn't do.
>>
>> I was trying to find a way to make this new code not appear twice 
>> (maybe with a local jvmtiEnv function get_jmethodID(m) - instanceK_h 
>> is m->method_holder()).
>
> Agreed on the above.
>
>>
>> Also, I was trying to figure out if the new class in utilities called 
>> chunkedList.hpp could be used to store jmethodIDs, since the data 
>> structures are similar.  There are still more things JNIMethodBlock
>> has to do, so I think a specialized structure is still needed (which 
>> is why I originally wrote it to be very simple).  I'm not sure if the 
>> comment above it still applies. Maybe only the first and third 
>> sentences.  Can you rewrite the comment slightly?
>>
>> Your other comments in the changes are good.
>>
>> I can't completely answer your question about reusing free_methods - 
>> but if a jmethodID is created provisionally in 
>> InstanceKlass::get_jmethod_id and not needed because it loses the 
>> race in the method id cache, it's never handed back to native code, 
>> so it's safe to reuse.  This is different than jmethodIDs for methods 
>> that are unloaded.  They are cleared and never reused.  At least 
>> that's my reading of this caching code but it's pretty complicated 
>> stuff.
>>
>> I've also run our nsk and jck vm/jvmti on this change and they all 
>> passed.  I'd be happy to sponsor it with these suggested changes and 
>> it needs another reviewer.
>>
>> Thanks for diagnosing and fixing this problem!
>> Coleen
>>
>>
>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>> There's a significant regression in the speed of JVMTI GetClassMethods
>>> in JDK8. I've tracked this down to allocation of jmethodids in a tight
>>> loop. The issue can be addressed by preallocating enough space for all
>>> of the jmethodids when starting the operation and not iterating over
>>> all of the existing jmethodids when you allocate a new one.
>>>
>>> A patch is here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>
>>> A reproducible test case can be found here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>
>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>
>>> For whoever reviews it: can you explain to me why it is okay that this
>>> code reuses jmethodIDs (in JNIMethodBlock::add_method)?  I can imagine
>>> a lot of problems stemming from accidental reuse.
>>>
>>> Jeremy
>>
>


From david.r.chase at oracle.com  Tue Nov  4 20:54:03 2014
From: david.r.chase at oracle.com (David Chase)
Date: Tue, 4 Nov 2014 15:54:03 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <5459034E.8070809@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
Message-ID: <D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>

I'm working on the initial benchmarking, and so far this arrangement (with synchronization
and binary search for lookup, lots of barriers and linear cost insertion) has not yet been any
slower.

I am nonetheless tempted by the 2-tables solution, because I think the simpler JVM-side
interface that it allows is desirable.
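
As a rough picture of the 2-tables idea, here is a minimal Java sketch. It
is an assumption-laden simplification, not the patch under review: a
hypothetical TwoTableInterner keeps a ConcurrentHashMap for interning and a
separate append-only list standing in for the structure the VM would scan.
It deliberately leaves out the class-redefinition retry loop from the idiom
quoted further down in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the JDK's MemberName code.
final class TwoTableInterner<T> {
    // Table 1: lock-free lookup for the hoped-for common case.
    private final ConcurrentHashMap<T, T> interned = new ConcurrentHashMap<>();
    // Table 2: append-only list standing in for the VM-visible structure.
    private final List<T> sharedList = new ArrayList<>();

    T intern(T candidate) {
        T existing = interned.get(candidate);
        if (existing != null) {
            return existing;                     // fast path, no lock taken
        }
        synchronized (sharedList) {
            existing = interned.get(candidate);  // re-check under the lock
            if (existing != null) {
                return existing;
            }
            sharedList.add(candidate);           // visible to the scanner first
            interned.put(candidate, candidate);  // then published to Java threads
            return candidate;
        }
    }

    int sharedSize() {
        synchronized (sharedList) {
            return sharedList.size();
        }
    }
}
```

Because the second table is only ever appended to, the VM-facing side in
this sketch needs no lookup logic at all, which is the simplification being
weighed here.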

David

On 2014-11-04, at 11:48 AM, Peter Levart <peter.levart at gmail.com> wrote:

> On 11/04/2014 04:19 PM, David Chase wrote:
>> On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?
>> It can't be an identityHashMap, because we are interning member names.
> 
> I know it can't be IdentityHashMap - I just wondered if you were thinking of an IdentityHashMap-like data structure in contrast to standard HashMap-like. Not in terms of equality/hashCode used, but in terms of internal data structure. IdentityHashMap is just an array of elements (well pairs of them - key, value are placed in two consecutive array slots). Lookup searches for element linearly in the array starting from hashCode based index to the element if found or 1st empty array slot. It's very easy to implement if the only operations are get() and put() and could be used for interning and as a shared structure for VM to scan, but array has to be sized to at least 3/2 the number of elements for performance to not degrade.
> 
>> In spite of my grumbling about benchmarking, I'm inclined to do that and try a couple of experiments.
>> One possibility would be to use two data structures, one for interning, the other for communication with the VM.
>> Because there's no lookup in the VM data structure it can just be an array that gets elements appended,
>> and the synchronization dance is much simpler.
>> 
>> For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:
>> 
>> mn = resolve(args)
>> // deal with any errors
>> mn' = chm.get(mn)
>> if (mn' != null) return mn' // hoped-for-common-case
>> 
>> synchronized (something) {
>>   mn' = chm.get(mn)
>>   if (mn' != null) return mn'
>>   txn_class = mn.getDeclaringClass()
>> 
>>   while (true) {
>>     redef_count = txn_class.redefCount()
>>     mn = resolve(args)
>> 
>>     shared_array.add(mn);
>>     // barrier, because we are paranoid
>>     if (redef_count == txn_class.redefCount()) {
>>       chm.add(mn); // safe to publish to other Java threads.
>>       return mn;
>>     }
>>     shared_array.drop_last(); // Try again
>>   }
>> }
>> 
>> (Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).
> 
> Yes, that's similar to what I suggested by using a linked-list of MemberName(s) instead of the "shared_array" (easier to reason about ordering of writes) and a sorted array of MemberName(s) instead of the "chm" in your scheme above. ConcurrentHashMap would certainly be the most performant solution in terms of lookup/insertion-time and concurrent throughput, but it will use more heap than a simple packed array of MemberNames. CHM is much better now in JDK8 though regarding heap use.
> 
> A combination of the two approaches is also possible:
> 
> - instead of maintaining a "shared_array" of MemberName(s), have them form a linked-list (you trade a slot in array for 'next' pointer in MemberName)
> - use ConcurrentHashMap for interning.
> 
> Regards, Peter
> 
>> 
>> David
>> 
>>>> And another way to view this is that we're now quibbling about performance, when we still
>>>> have an existing correctness problem that this patch solves, so maybe we should just get this
>>>> done and then file an RFE.
>>> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
>>> 
>>> Regards, Peter
>>> 
>>>> David


From jeremymanson at google.com  Tue Nov  4 21:05:37 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Tue, 4 Nov 2014 13:05:37 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <54593A56.4050603@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<54593A56.4050603@oracle.com>
Message-ID: <CAPYFHW2iqXiJjsyeZGBBCsTJ8xsdwj2oi=_T4uhdXz67QJLtBg@mail.gmail.com>

FWIW, all of the JDK8 jtreg tests passed.

On Tue, Nov 4, 2014 at 12:43 PM, Coleen Phillimore <
coleen.phillimore at oracle.com> wrote:

>
> On 11/04/2014 02:57 PM, serguei.spitsyn at oracle.com wrote:
>
>> Hi Jeremy and Coleen,
>>
>> I'm reviewing this too.
>> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>>
>
> Hi Serguei,  I ran all of vm.quick.testlist on this which includes jvmti,
> jdi tests.  I'll run jtreg jdi tests too (where are they?)
>
> Thanks,
> Coleen
>
>
>
>> Thanks,
>> Serguei
>>
>> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>
>>>
>>> Hi Jeremy,
>>>
>>> I reviewed your new code and it looks fine.  I had one comment in
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/
>>> share/vm/prims/jvmtiEnv.cpp.udiff.html
>>>
>>> The name "need_to_resolve" doesn't make sense when reading this code.
>>> Isn't it more like "need_to_ensure_space"?  The other name makes me think
>>> of method resolution, which this code doesn't do.
>>>
>>> I was trying to find a way to make this new code not appear twice (maybe
>>> with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is
>>> m->method_holder()).
>>>
>>
>> Agreed on the above.
>>
>>
>>> Also, I was trying to figure out if the new class in utilities called
>>> chunkedList.hpp could be used to store jmethodIDs, since the data
>>> structures are similar.  There are still more things JNIMethodBlock has
>>> to do, so I think a specialized structure is still needed (which is why I
>>> originally wrote it to be very simple).  I'm not sure if the comment above
>>> it still applies. Maybe only the first and third sentences.  Can you
>>> rewrite the comment slightly?
>>>
>>> Your other comments in the changes are good.
>>>
>>> I can't completely answer your question about reusing free_methods - but
>>> if a jmethodID is created provisionally in InstanceKlass::get_jmethod_id
>>> and not needed because it loses the race in the method id cache, it's never
>>> handed back to native code, so it's safe to reuse.  This is different than
>>> jmethodIDs for methods that are unloaded.  They are cleared and never
>>> reused.  At least that's my reading of this caching code but it's pretty
>>> complicated stuff.
>>>
>>> I've also run our nsk and jck vm/jvmti on this change and they all
>>> passed.  I'd be happy to sponsor it with these suggested changes and it
>>> needs another reviewer.
>>>
>>> Thanks for diagnosing and fixing this problem!
>>> Coleen
>>>
>>>
>>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>>
>>>> There's a significant regression in the speed of JVMTI GetClassMethods
>>>> in JDK8. I've tracked this down to allocation of jmethodids in a tight
>>>> loop. The issue can be addressed by preallocating enough space for all
>>>> of the jmethodids when starting the operation and not iterating over
>>>> all of the existing jmethodids when you allocate a new one.
>>>>
>>>> A patch is here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>>
>>>> A reproducible test case can be found here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>>
>>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>>
>>>> For whoever reviews it: can you explain to me why it is okay that this
>>>> code reuses jmethodIDs (in JNIMethodBlock::add_method)?  I can imagine
>>>> a lot of problems stemming from accidental reuse.
>>>>
>>>> Jeremy
>>>>
>>>
>>>
>>
>

From serguei.spitsyn at oracle.com  Tue Nov  4 21:07:55 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Nov 2014 13:07:55 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <54593A56.4050603@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<54593A56.4050603@oracle.com>
Message-ID: <5459402B.5030304@oracle.com>

On 11/4/14 12:43 PM, Coleen Phillimore wrote:
>
> On 11/04/2014 02:57 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Jeremy and Coleen,
>>
>> I'm reviewing this too.
>> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>
> Hi Serguei,  I ran all of vm.quick.testlist on this which includes 
> jvmti, jdi tests.  I'll run jtreg jdi tests too (where are they?)

Hi Coleen,

It is safer to run the nsk.jvmti.testlist and nsk.jdi.testlist 
instead of the vm.quick.testlist.
The jtreg jdi tests are in the <repo>/jdk/test/com/sun/jdi folder.

Thanks,
Serguei

>
> Thanks,
> Coleen
>
>>
>> Thanks,
>> Serguei
>>
>> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>>
>>> Hi Jeremy,
>>>
>>> I reviewed your new code and it looks fine.  I had one comment in
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html 
>>>
>>>
>>> The name "need_to_resolve" doesn't make sense when reading this 
>>> code.  Isn't it more like "need_to_ensure_space" ?  I think method 
>>> resolution with the other name, which it doesn't do.
>>>
>>> I was trying to find a way to make this new code not appear twice 
>>> (maybe with a local jvmtiEnv function get_jmethodID(m) - instanceK_h 
>>> is m->method_holder()).
>>
>> Agreed on the above.
>>
>>>
>>> Also, I was trying to figure out if the new class in utilities 
>>> called chunkedList.hpp could be used to store jmethodIDs, since the 
>>> data structures are similar.  There is still more things in 
>>> JNIMethodBlock has to do so I think a specialized structure is still 
>>> needed (which is why I originally wrote it to be very simple).  I'm 
>>> not sure if the comment above it still applies. Maybe only the first 
>>> and third sentences.  Can you rewrite the comment slightly?
>>>
>>> Your other comments in the changes are good.
>>>
>>> I can't completely answer your question about reusing free_methods - 
>>> but if a jmethodID is created provisionally in 
>>> InstanceKlass::get_jmethod_id and not needed because it loses the 
>>> race in the method id cache, it's never handed back to native code, 
>>> so it's safe to reuse.  This is different than jmethodIDs for 
>>> methods that are unloaded.  They are cleared and never reused.  At 
>>> least that's my reading of this caching code but it's pretty 
>>> complicated stuff.
>>>
>>> I've also run our nsk and jck vm/jvmti on this change and they all 
>>> passed.  I'd be happy to sponsor it with these suggested changes and 
>>> it needs another reviewer.
>>>
>>> Thanks for diagnosing and fixing this problem!
>>> Coleen
>>>
>>>
>>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>>> There's a significant regression in the speed of JVMTI 
>>>> GetClassMethods in
>>>> JDK8. I've tracked this down to allocation of jmethodids in a tight 
>>>> loop.
>>>> The issue can be addressed by preallocating enough space for all of 
>>>> the
>>>> jmethodids when starting the operation and not iterating over all 
>>>> of the
>>>> existing jmethodids when you allocate a new one.
>>>>
>>>> A patch is here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>>
>>>> A reproducible test case can be found here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>>
>>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>>
>>>> For whoever reviews it: can you explain to me why it is okay that 
>>>> this code
>>>> reuses jmethodIDs (in JNIMethodBlock::add_method?  I can imagine a 
>>>> lot of
>>>> problems stemming from accidental reuse.
>>>>
>>>> Jeremy
>>>
>>
>


From rasbold at google.com  Tue Nov  4 21:11:32 2014
From: rasbold at google.com (Chuck Rasbold)
Date: Tue, 4 Nov 2014 13:11:32 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW2iqXiJjsyeZGBBCsTJ8xsdwj2oi=_T4uhdXz67QJLtBg@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<54593A56.4050603@oracle.com>
	<CAPYFHW2iqXiJjsyeZGBBCsTJ8xsdwj2oi=_T4uhdXz67QJLtBg@mail.gmail.com>
Message-ID: <CALFb4Ku6YXCOamkQMrxDF7NufsAUwq50P5NgfpE4NwQ6hnBLKA@mail.gmail.com>

Jeremy's webrev looks good to me.

-- Chuck

On Tue, Nov 4, 2014 at 1:05 PM, Jeremy Manson <jeremymanson at google.com>
wrote:

> FWIW, all of the JDK8 jtreg tests passed.
>
> On Tue, Nov 4, 2014 at 12:43 PM, Coleen Phillimore <
> coleen.phillimore at oracle.com> wrote:
>
>>
>> On 11/04/2014 02:57 PM, serguei.spitsyn at oracle.com wrote:
>>
>>> Hi Jeremy and Coleen,
>>>
>>> I'm reviewing this too.
>>> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>>>
>>
>> Hi Serguei,  I ran all of vm.quick.testlist on this which includes jvmti,
>> jdi tests.  I'll run jtreg jdi tests too (where are they?)
>>
>> Thanks,
>> Coleen
>>
>>
>>
>>> Thanks,
>>> Serguei
>>>
>>> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>>
>>>>
>>>> Hi Jeremy,
>>>>
>>>> I reviewed your new code and it looks fine.  I had one comment in
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/
>>>> share/vm/prims/jvmtiEnv.cpp.udiff.html
>>>>
>>>> The name "need_to_resolve" doesn't make sense when reading this code.
>>>> Isn't it more like "need_to_ensure_space" ?  I think method resolution with
>>>> the other name, which it doesn't do.
>>>>
>>>> I was trying to find a way to make this new code not appear twice
>>>> (maybe with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is
>>>> m->method_holder()).
>>>>
>>>
>>> Agreed on the above.
>>>
>>>
>>>> Also, I was trying to figure out if the new class in utilities called
>>>> chunkedList.hpp could be used to store jmethodIDs, since the data
>>>> structures are similar.  There is still more things in JNIMethodBlock has
>>>> to do so I think a specialized structure is still needed (which is why I
>>>> originally wrote it to be very simple).  I'm not sure if the comment above
>>>> it still applies. Maybe only the first and third sentences.  Can you
>>>> rewrite the comment slightly?
>>>>
>>>> Your other comments in the changes are good.
>>>>
>>>> I can't completely answer your question about reusing free_methods -
>>>> but if a jmethodID is created provisionally in
>>>> InstanceKlass::get_jmethod_id and not needed because it loses the race in
>>>> the method id cache, it's never handed back to native code, so it's safe to
>>>> reuse.  This is different than jmethodIDs for methods that are unloaded.
>>>> They are cleared and never reused.  At least that's my reading of this
>>>> caching code but it's pretty complicated stuff.
>>>>
>>>> I've also run our nsk and jck vm/jvmti on this change and they all
>>>> passed.  I'd be happy to sponsor it with these suggested changes and it
>>>> needs another reviewer.
>>>>
>>>> Thanks for diagnosing and fixing this problem!
>>>> Coleen
>>>>
>>>>
>>>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>>>
>>>>> There's a significant regression in the speed of JVMTI GetClassMethods
>>>>> in
>>>>> JDK8. I've tracked this down to allocation of jmethodids in a tight
>>>>> loop.
>>>>> The issue can be addressed by preallocating enough space for all of the
>>>>> jmethodids when starting the operation and not iterating over all of
>>>>> the
>>>>> existing jmethodids when you allocate a new one.
>>>>>
>>>>> A patch is here:
>>>>>
>>>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>>>
>>>>> A reproducible test case can be found here:
>>>>>
>>>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>>>
>>>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>>>
>>>>> For whoever reviews it: can you explain to me why it is okay that this
>>>>> code
>>>>> reuses jmethodIDs (in JNIMethodBlock::add_method?  I can imagine a lot
>>>>> of
>>>>> problems stemming from accidental reuse.
>>>>>
>>>>> Jeremy
>>>>>
>>>>
>>>>
>>>
>>
>

From serguei.spitsyn at oracle.com  Tue Nov  4 22:15:56 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Nov 2014 14:15:56 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <54592FC2.7090406@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>
Message-ID: <5459501C.4040807@oracle.com>

Jeremy and Coleen,

Thank you for taking care of this bug!

The fix looks good to me.
I do not see any issues.

Coleen,

Please, let me know if you need any help with testing or anything else.

Thanks,
Serguei

On 11/4/14 11:57 AM, serguei.spitsyn at oracle.com wrote:
> Hi Jeremy and Coleen,
>
> I'm reviewing this too.
> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>
> Thanks,
> Serguei
>
> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>
>> Hi Jeremy,
>>
>> I reviewed your new code and it looks fine.  I had one comment in
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html 
>>
>>
>> The name "need_to_resolve" doesn't make sense when reading this 
>> code.  Isn't it more like "need_to_ensure_space" ?  I think method 
>> resolution with the other name, which it doesn't do.
>>
>> I was trying to find a way to make this new code not appear twice 
>> (maybe with a local jvmtiEnv function get_jmethodID(m) - instanceK_h 
>> is m->method_holder()).
>
> Agreed on the above.
>
>>
>> Also, I was trying to figure out if the new class in utilities called 
>> chunkedList.hpp could be used to store jmethodIDs, since the data 
>> structures are similar.  There is still more things in JNIMethodBlock 
>> has to do so I think a specialized structure is still needed (which 
>> is why I originally wrote it to be very simple).  I'm not sure if the 
>> comment above it still applies. Maybe only the first and third 
>> sentences.  Can you rewrite the comment slightly?
>>
>> Your other comments in the changes are good.
>>
>> I can't completely answer your question about reusing free_methods - 
>> but if a jmethodID is created provisionally in 
>> InstanceKlass::get_jmethod_id and not needed because it loses the 
>> race in the method id cache, it's never handed back to native code, 
>> so it's safe to reuse.  This is different than jmethodIDs for methods 
>> that are unloaded.  They are cleared and never reused.  At least 
>> that's my reading of this caching code but it's pretty complicated 
>> stuff.
>>
>> I've also run our nsk and jck vm/jvmti on this change and they all 
>> passed.  I'd be happy to sponsor it with these suggested changes and 
>> it needs another reviewer.
>>
>> Thanks for diagnosing and fixing this problem!
>> Coleen
>>
>>
>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>> There's a significant regression in the speed of JVMTI 
>>> GetClassMethods in
>>> JDK8. I've tracked this down to allocation of jmethodids in a tight 
>>> loop.
>>> The issue can be addressed by preallocating enough space for all of the
>>> jmethodids when starting the operation and not iterating over all of 
>>> the
>>> existing jmethodids when you allocate a new one.
>>>
>>> A patch is here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>
>>> A reproducible test case can be found here:
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>
>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>
>>> For whoever reviews it: can you explain to me why it is okay that 
>>> this code
>>> reuses jmethodIDs (in JNIMethodBlock::add_method?  I can imagine a 
>>> lot of
>>> problems stemming from accidental reuse.
>>>
>>> Jeremy
>>
>


From jeremymanson at google.com  Wed Nov  5 01:52:50 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Tue, 4 Nov 2014 17:52:50 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <5459501C.4040807@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
Message-ID: <CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>

Updated patch here:

http://cr.openjdk.java.net/~jmanson/8062116/webrev.01/

Jeremy

On Tue, Nov 4, 2014 at 2:15 PM, serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Jeremy and Coleen,
>
> Thank you for taking care about this bug!
>
> The fix looks good to me.
> I do not see any issues.
>
> Coleen,
>
> Please, let me know if you need any help with testing or anything else.
>
> Thanks,
> Serguei
>
>
> On 11/4/14 11:57 AM, serguei.spitsyn at oracle.com wrote:
>
>> Hi Jeremy and Coleen,
>>
>> I'm reviewing this too.
>> We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>>
>> Thanks,
>> Serguei
>>
>> On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>
>>>
>>> Hi Jeremy,
>>>
>>> I reviewed your new code and it looks fine.  I had one comment in
>>>
>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/
>>> share/vm/prims/jvmtiEnv.cpp.udiff.html
>>>
>>> The name "need_to_resolve" doesn't make sense when reading this code.
>>> Isn't it more like "need_to_ensure_space" ?  I think method resolution with
>>> the other name, which it doesn't do.
>>>
>>> I was trying to find a way to make this new code not appear twice (maybe
>>> with a local jvmtiEnv function get_jmethodID(m) - instanceK_h is
>>> m->method_holder()).
>>>
>>
>> Agreed on the above.
>>
>>
>>> Also, I was trying to figure out if the new class in utilities called
>>> chunkedList.hpp could be used to store jmethodIDs, since the data
>>> structures are similar.  There is still more things in JNIMethodBlock has
>>> to do so I think a specialized structure is still needed (which is why I
>>> originally wrote it to be very simple).  I'm not sure if the comment above
>>> it still applies. Maybe only the first and third sentences.  Can you
>>> rewrite the comment slightly?
>>>
>>> Your other comments in the changes are good.
>>>
>>> I can't completely answer your question about reusing free_methods - but
>>> if a jmethodID is created provisionally in InstanceKlass::get_jmethod_id
>>> and not needed because it loses the race in the method id cache, it's never
>>> handed back to native code, so it's safe to reuse.  This is different than
>>> jmethodIDs for methods that are unloaded.  They are cleared and never
>>> reused.  At least that's my reading of this caching code but it's pretty
>>> complicated stuff.
>>>
>>> I've also run our nsk and jck vm/jvmti on this change and they all
>>> passed.  I'd be happy to sponsor it with these suggested changes and it
>>> needs another reviewer.
>>>
>>> Thanks for diagnosing and fixing this problem!
>>> Coleen
>>>
>>>
>>> On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>>
>>>> There's a significant regression in the speed of JVMTI GetClassMethods
>>>> in
>>>> JDK8. I've tracked this down to allocation of jmethodids in a tight
>>>> loop.
>>>> The issue can be addressed by preallocating enough space for all of the
>>>> jmethodids when starting the operation and not iterating over all of the
>>>> existing jmethodids when you allocate a new one.
>>>>
>>>> A patch is here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>>>>
>>>> A reproducible test case can be found here:
>>>>
>>>> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>>>
>>>> It's a benchmark, though: I have no idea how to turn it into a test.
>>>>
>>>> For whoever reviews it: can you explain to me why it is okay that this
>>>> code
>>>> reuses jmethodIDs (in JNIMethodBlock::add_method?  I can imagine a lot
>>>> of
>>>> problems stemming from accidental reuse.
>>>>
>>>> Jeremy
>>>>
>>>
>>>
>>
>

From daniel.daugherty at oracle.com  Wed Nov  5 04:34:53 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 04 Nov 2014 21:34:53 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
Message-ID: <5459A8ED.8060808@oracle.com>

Greetings,

I have a Contended Locking cleanup bucket fix ready for review.

This fix was spun off from the Contended Locking fast enter bucket
which was sent out for review late last week. This fix cleans up
the computation of ObjectMonitor field pointers and gets rid of
the use of literal '-2' in appropriate places. For example:

-         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2, Rscratch);
+         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);

The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
specified field and subtracts markOopDesc::monitor_value (2).
There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
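
For readers without the webrev at hand, here is a minimal sketch of the
idea behind such a macro. It is not the code from the webrev: the struct
MonitorSketch, the constant kMonitorValue, the macro OM_OFFSET_SKETCH and
both helper functions are hypothetical names invented for illustration,
and the field layout is assumed. It only shows why folding the tag into
the offset lets a field be reached directly from a tagged pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; the real field layout differs.
struct MonitorSketch {
  void* header;
  void* object;
  void* owner;
};

// Assumed tag value, matching the "(2)" mentioned above.
const std::ptrdiff_t kMonitorValue = 2;

// Offset of a field, pre-adjusted for the tag carried in the mark word.
#define OM_OFFSET_SKETCH(f) \
  (static_cast<std::ptrdiff_t>(offsetof(MonitorSketch, f)) - kMonitorValue)

// Tag a monitor pointer the way an inflated mark word would carry it.
inline std::uintptr_t tagged_mark(MonitorSketch* m) {
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(m);
  assert((bits & 3) == 0);  // the low bits must be free to hold the tag
  return bits | static_cast<std::uintptr_t>(kMonitorValue);
}

// Reach the owner field straight from the tagged value, without
// untagging it first.
inline void** owner_addr(std::uintptr_t mark) {
  return reinterpret_cast<void**>(mark + OM_OFFSET_SKETCH(owner));
}
```

With this shape, owner_addr(tagged_mark(&mon)) yields &mon.owner, and the
literal 2 appears in exactly one place instead of at every call site.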

Thanks to David Holmes for his comments on JDK-8061553 that
motivated this (long overdue) cleanup.

This work is being tracked by the following bug ID:

     JDK-8062851 cleanup ObjectMonitor offset adjustments
     https://bugs.openjdk.java.net/browse/JDK-8062851

Here is the webrev URL:

http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/

Here is the JEP link:

     https://bugs.openjdk.java.net/browse/JDK-8046133

Testing:

- JPRT test jobs (since this is only syntax and comment cleanup)

Thanks, in advance, for any comments, questions or suggestions.

Dan

From serguei.spitsyn at oracle.com  Wed Nov  5 04:56:27 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Nov 2014 20:56:27 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
Message-ID: <5459ADFB.4090808@oracle.com>

The fix looks good in general.

src/share/vm/oops/method.cpp

1785   bool contains(Method** m) {
1786     if (m == NULL) return false;
1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
1789         ptrdiff_t idx = m - b->_methods;
1790         if (b->_methods + idx == m) {
1791           return true;
1792         }
1793       }
1794     }
1795     return false;  // not found
1796   }


Just noticed that lines 1789-1792 can be replaced with a one-liner:
        return true;

This is because the condition (b->_methods + idx == m) is always true.
   :)

Also, should we check the condition *m != _free_method?
What about the following?
        return (*m != _free_method);
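
The suggestion above can be sketched as a small standalone program.
This is only an illustration, not the HotSpot code: Method, BlockNode,
kBlockCapacity and kFreeMethod are simplified, hypothetical stand-ins,
and std::less is swapped in for the raw <= and < so that the comparison
is well defined across unrelated arrays. The sketch also assumes m is a
correctly aligned Method**; if callers can pass arbitrary bits, the idx
round-trip may still be worth keeping as an alignment check.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

struct Method {};  // hypothetical stand-in for HotSpot's Method

// Hypothetical sentinel marking a slot that is allocated but unused.
static Method free_sentinel;
static Method* const kFreeMethod = &free_sentinel;

const int kBlockCapacity = 4;  // assumed, small for illustration

// Hypothetical, simplified version of a JNIMethodBlockNode.
struct BlockNode {
  Method* methods[kBlockCapacity];
  int number_of_methods;
  BlockNode* next;
};

// True if m points at a slot inside some block and that slot is in use.
bool contains(BlockNode* head, Method** m) {
  if (m == nullptr) return false;
  std::less<Method**> before;  // total order even across unrelated arrays
  for (BlockNode* b = head; b != nullptr; b = b->next) {
    assert(b->number_of_methods <= kBlockCapacity);
    Method** first = b->methods;
    Method** last = b->methods + b->number_of_methods;
    if (!before(m, first) && before(m, last)) {
      // For an aligned m the range check already identifies the slot,
      // so only the free-slot test remains.
      return *m != kFreeMethod;
    }
  }
  return false;  // not found
}
```

A pointer to an in-use slot is reported as contained, while a pointer to
a slot holding the free sentinel, a slot past number_of_methods, or a
Method* variable outside every block is not.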


Thanks,
Serguei


On 11/4/14 5:52 PM, Jeremy Manson wrote:
> Updated patch here:
>
> http://cr.openjdk.java.net/~jmanson/8062116/webrev.01/ 
> <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.01/>
>
> Jeremy
>
> On Tue, Nov 4, 2014 at 2:15 PM, serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com>> wrote:
>
>     Jeremy and Coleen,
>
>     Thank you for taking care about this bug!
>
>     The fix looks good to me.
>     I do not see any issues.
>
>     Coleen,
>
>     Please, let me know if you need any help with testing or anything
>     else.
>
>     Thanks,
>     Serguei
>
>
>     On 11/4/14 11:57 AM, serguei.spitsyn at oracle.com
>     <mailto:serguei.spitsyn at oracle.com> wrote:
>
>         Hi Jeremy and Coleen,
>
>         I'm reviewing this too.
>         We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>
>         Thanks,
>         Serguei
>
>         On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>
>
>             Hi Jeremy,
>
>             I reviewed your new code and it looks fine.  I had one
>             comment in
>
>             http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html
>             <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html>
>
>
>             The name "need_to_resolve" doesn't make sense when reading
>             this code.  Isn't it more like "need_to_ensure_space" ?  I
>             think method resolution with the other name, which it
>             doesn't do.
>
>             I was trying to find a way to make this new code not
>             appear twice (maybe with a local jvmtiEnv function
>             get_jmethodID(m) - instanceK_h is m->method_holder()).
>
>
>         Agreed on the above.
>
>
>             Also, I was trying to figure out if the new class in
>             utilities called chunkedList.hpp could be used to store
>             jmethodIDs, since the data structures are similar.  There
>             is still more things in JNIMethodBlock has to do so I
>             think a specialized structure is still needed (which is
>             why I originally wrote it to be very simple).  I'm not
>             sure if the comment above it still applies. Maybe only the
>             first and third sentences.  Can you rewrite the comment
>             slightly?
>
>             Your other comments in the changes are good.
>
>             I can't completely answer your question about reusing
>             free_methods - but if a jmethodID is created provisionally
>             in InstanceKlass::get_jmethod_id and not needed because it
>             loses the race in the method id cache, it's never handed
>             back to native code, so it's safe to reuse.  This is
>             different than jmethodIDs for methods that are unloaded. 
>             They are cleared and never reused.  At least that's my
>             reading of this caching code but it's pretty complicated
>             stuff.
>
>             I've also run our nsk and jck vm/jvmti on this change and
>             they all passed.  I'd be happy to sponsor it with these
>             suggested changes and it needs another reviewer.
>
>             Thanks for diagnosing and fixing this problem!
>             Coleen
>
>
>             On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>
>                 There's a significant regression in the speed of JVMTI
>                 GetClassMethods in
>                 JDK8. I've tracked this down to allocation of
>                 jmethodids in a tight loop.
>                 The issue can be addressed by preallocating enough
>                 space for all of the
>                 jmethodids when starting the operation and not
>                 iterating over all of the
>                 existing jmethodids when you allocate a new one.
>
>                 A patch is here:
>
>                 http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>                 <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.00/>
>
>                 A reproducible test case can be found here:
>
>                 http://cr.openjdk.java.net/~jmanson/8062116/repro/
>                 <http://cr.openjdk.java.net/%7Ejmanson/8062116/repro/>
>
>                 It's a benchmark, though: I have no idea how to turn
>                 it into a test.
>
>                 For whoever reviews it: can you explain to me why it
>                 is okay that this code
>                 reuses jmethodIDs (in JNIMethodBlock::add_method? I
>                 can imagine a lot of
>                 problems stemming from accidental reuse.
>
>                 Jeremy
>
>
>
>
>


From serguei.spitsyn at oracle.com  Wed Nov  5 06:08:05 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 04 Nov 2014 22:08:05 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <5459ADFB.4090808@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>	<5459501C.4040807@oracle.com>	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
Message-ID: <5459BEC5.4090809@oracle.com>

Got rid of the bold selection below to make it more readable.

Thanks,
Serguei

On 11/4/14 8:56 PM, serguei.spitsyn at oracle.com wrote:
> The fix looks good in general.
>
> src/share/vm/oops/method.cpp
> 1785   bool contains(Method** m) {
> 1786     if (m == NULL) return false;
> 1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
> 1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
> 1789         ptrdiff_t idx = m - b->_methods;
> 1790         if (b->_methods + idx == m) {
> 1791           return true;
> 1792         }
> 1793       }
> 1794     }
> 1795     return false;  // not found
> 1796   }
>
> Just noticed that the lines 1789-1792 can be replaced with one liner:
>         return true;
>
> It is because the condition (b->_methods + idx == m) is always true.   
>   :)
>
> Also, should we check the condition:  *m != _free_method?
> What about the following ?:
>         return (*m != _free_method);
>
>
> Thanks,
> Serguei
>
>
> On 11/4/14 5:52 PM, Jeremy Manson wrote:
>> Updated patch here:
>>
>> http://cr.openjdk.java.net/~jmanson/8062116/webrev.01/ 
>> <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.01/>
>>
>> Jeremy
>>
>> On Tue, Nov 4, 2014 at 2:15 PM, serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     Jeremy and Coleen,
>>
>>     Thank you for taking care about this bug!
>>
>>     The fix looks good to me.
>>     I do not see any issues.
>>
>>     Coleen,
>>
>>     Please, let me know if you need any help with testing or anything
>>     else.
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 11/4/14 11:57 AM, serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com> wrote:
>>
>>         Hi Jeremy and Coleen,
>>
>>         I'm reviewing this too.
>>         We also need to run the nsk.jvmti, nsk.jdi and jtreg jdi tests.
>>
>>         Thanks,
>>         Serguei
>>
>>         On 11/3/14 12:19 PM, Coleen Phillimore wrote:
>>
>>
>>             Hi Jeremy,
>>
>>             I reviewed your new code and it looks fine.  I had one
>>             comment in
>>
>>             http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html
>>             <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.00/src/share/vm/prims/jvmtiEnv.cpp.udiff.html>
>>
>>
>>             The name "need_to_resolve" doesn't make sense when
>>             reading this code.  Isn't it more like
>>             "need_to_ensure_space" ?  I think method resolution with
>>             the other name, which it doesn't do.
>>
>>             I was trying to find a way to make this new code not
>>             appear twice (maybe with a local jvmtiEnv function
>>             get_jmethodID(m) - instanceK_h is m->method_holder()).
>>
>>
>>         Agreed on the above.
>>
>>
>>             Also, I was trying to figure out if the new class in
>>             utilities called chunkedList.hpp could be used to store
>>             jmethodIDs, since the data structures are similar.  There
>>             is still more things in JNIMethodBlock has to do so I
>>             think a specialized structure is still needed (which is
>>             why I originally wrote it to be very simple).  I'm not
>>             sure if the comment above it still applies. Maybe only
>>             the first and third sentences.  Can you rewrite the
>>             comment slightly?
>>
>>             Your other comments in the changes are good.
>>
>>             I can't completely answer your question about reusing
>>             free_methods - but if a jmethodID is created
>>             provisionally in InstanceKlass::get_jmethod_id and not
>>             needed because it loses the race in the method id cache,
>>             it's never handed back to native code, so it's safe to
>>             reuse.  This is different than jmethodIDs for methods
>>             that are unloaded.  They are cleared and never reused. 
>>             At least that's my reading of this caching code but it's
>>             pretty complicated stuff.
>>
>>             I've also run our nsk and jck vm/jvmti on this change and
>>             they all passed.  I'd be happy to sponsor it with these
>>             suggested changes and it needs another reviewer.
>>
>>             Thanks for diagnosing and fixing this problem!
>>             Coleen
>>
>>
>>             On 10/30/2014 01:02 PM, Jeremy Manson wrote:
>>
>>                 There's a significant regression in the speed of
>>                 JVMTI GetClassMethods in
>>                 JDK8. I've tracked this down to allocation of
>>                 jmethodids in a tight loop.
>>                 The issue can be addressed by preallocating enough
>>                 space for all of the
>>                 jmethodids when starting the operation and not
>>                 iterating over all of the
>>                 existing jmethodids when you allocate a new one.
>>
>>                 A patch is here:
>>
>>                 http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/ <http://cr.openjdk.java.net/%7Ejmanson/8062116/webrev.00/>
>>
>>                 A reproducible test case can be found here:
>>
>>                 http://cr.openjdk.java.net/~jmanson/8062116/repro/
>>                 <http://cr.openjdk.java.net/%7Ejmanson/8062116/repro/>
>>
>>                 It's a benchmark, though: I have no idea how to turn
>>                 it into a test.
>>
>>                 For whoever reviews it: can you explain to me why it
>>                 is okay that this code
>>                 reuses jmethodIDs (in JNIMethodBlock::add_method?  I
>>                 can imagine a lot of
>>                 problems stemming from accidental reuse.
>>
>>                 Jeremy
>>
>>
>>
>>
>>
>


From david.holmes at oracle.com  Wed Nov  5 10:42:25 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 05 Nov 2014 20:42:25 +1000
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <5459A8ED.8060808@oracle.com>
References: <5459A8ED.8060808@oracle.com>
Message-ID: <5459FF11.1080801@oracle.com>

Hi Dan,

Reviewed.

I find the name OM_OFFSET_NO_MONITOR_VALUE somewhat awkward but have no 
better suggestion. In fact I have to ask what _is_ the object monitor 
tagging mechanism? I can't see it defined in the objectMonitor.* files. ??

Thanks,
David

On 5/11/2014 2:34 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a Contended Locking cleanup bucket fix ready for review.
>
> This fix was spun off from the Contended Locking fast enter bucket
> which was sent out for review late last week. This fix cleans up
> the computation of ObjectMonitor field pointers and gets rid of
> the use of literal '-2' in appropriate places. For example:
>
> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
> Rscratch);
> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>
> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
> specified field and subtracts markOopDesc:monitor_value (2).
> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>
> Thanks to David Holmes for his comments on JDK-8061553 that
> motivated this (long overdue) cleanup.
>
> This work is being tracked by the following bug ID:
>
>      JDK-8062851 cleanup ObjectMonitor offset adjustments
>      https://bugs.openjdk.java.net/browse/JDK-8062851
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>
> Here is the JEP link:
>
>      https://bugs.openjdk.java.net/browse/JDK-8046133
>
> Testing:
>
> - JPRT test jobs (since this is only syntax and comment cleanup)
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan

From christian.tornqvist at oracle.com  Wed Nov  5 14:54:27 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Wed, 5 Nov 2014 09:54:27 -0500
Subject: RFR(XS): 8061733 - [TESTBUG] Exclude tests that have issues with
	Jigsaw M2 changes
Message-ID: <013c01cff908$65d00560$31701020$@oracle.com>

Hi everyone,

 
Please review this small change that adds @ignore to one test that fails
when running with the upcoming changes for Jigsaw M2. The affected test is
not critical and will be fixed at a later time.

 
Webrev:

http://cr.openjdk.java.net/~ctornqvi/webrev/8061733/webrev.00/

 
Bug:

https://bugs.openjdk.java.net/browse/JDK-8061733

 
Thanks,

Christian

 
From lois.foltan at oracle.com  Wed Nov  5 15:01:12 2014
From: lois.foltan at oracle.com (Lois Foltan)
Date: Wed, 05 Nov 2014 10:01:12 -0500
Subject: RFR(XS): 8061733 - [TESTBUG] Exclude tests that have issues with
	Jigsaw M2 changes
In-Reply-To: <013c01cff908$65d00560$31701020$@oracle.com>
References: <013c01cff908$65d00560$31701020$@oracle.com>
Message-ID: <545A3BB8.7020400@oracle.com>

Looks good.
Lois

On 11/5/2014 9:54 AM, Christian Tornqvist wrote:
> Hi everyone,
>
>   
>
> Please review this small change that adds @ignore to one test that fails
> when running with the upcoming changes for Jigsaw M2. The affected test is
> not critical and will be fixed at a later time.
>
>   
>
> Webrev:
>
> http://cr.openjdk.java.net/~ctornqvi/webrev/8061733/webrev.00/
>
>   
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8061733
>
>   
>
> Thanks,
>
> Christian
>
>   
>


From george.triantafillou at oracle.com  Wed Nov  5 15:01:31 2014
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Wed, 05 Nov 2014 10:01:31 -0500
Subject: RFR(XS): 8061733 - [TESTBUG] Exclude tests that have issues with
	Jigsaw M2 changes
In-Reply-To: <013c01cff908$65d00560$31701020$@oracle.com>
References: <013c01cff908$65d00560$31701020$@oracle.com>
Message-ID: <545A3BCB.5010604@oracle.com>

Christian,

Looks good.

-George

On 11/5/2014 9:54 AM, Christian Tornqvist wrote:
> Hi everyone,
>
>   
>
> Please review this small change that adds @ignore to one test that fails
> when running with the upcoming changes for Jigsaw M2. The affected test is
> not critical and will be fixed at a later time.
>
>   
>
> Webrev:
>
> http://cr.openjdk.java.net/~ctornqvi/webrev/8061733/webrev.00/
>
>   
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8061733
>
>   
>
> Thanks,
>
> Christian
>
>   
>


From daniel.daugherty at oracle.com  Wed Nov  5 15:29:56 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 05 Nov 2014 08:29:56 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <5459FF11.1080801@oracle.com>
References: <5459A8ED.8060808@oracle.com> <5459FF11.1080801@oracle.com>
Message-ID: <545A4274.6090409@oracle.com>

On 11/5/14 3:42 AM, David Holmes wrote:
> Hi Dan,
>
> Reviewed.

Thanks!


> I find the name OM_OFFSET_NO_MONITOR_VALUE somewhat awkward but have 
> no better suggestion.

Understood. I didn't like the original "OFFSET_SKEWED" name
especially since I was moving it to objectMonitor.hpp...

If you think of a better name, let me know... we can always change it.


> In fact I have to ask what _is_ the object monitor tagging mechanism? 
> I can't see it defined in the objectMonitor.* files. ??

That would be this code:

src/share/vm/oops/markOop.hpp:

  static markOop encode(ObjectMonitor* monitor) {
    intptr_t tmp = (intptr_t) monitor;
    return (markOop) (tmp | monitor_value);
  }

and the other methods in that file that have to account for
the monitor_value being set...
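
To make the "-2" concrete, here is a minimal standalone sketch of the tagging idea. FakeMonitor, kMonitorValue and the two helpers are hypothetical stand-ins, not the HotSpot types; the only thing taken from the thread is that the tag value is 2 and that it is OR'ed into the monitor pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; only the layout idea matters.
// The alignment guarantees that the low bits of its address are zero.
struct alignas(8) FakeMonitor {
  void* header;
  void* owner;
};

// Assumed to mirror markOopDesc::monitor_value (2) from the thread.
const intptr_t kMonitorValue = 2;

// Tag a monitor pointer by OR'ing in the tag bits, as encode() does.
inline intptr_t encode(FakeMonitor* m) {
  return reinterpret_cast<intptr_t>(m) | kMonitorValue;
}

// Load the owner field straight from the tagged word. Instead of
// untagging first, the tag is folded into the field offset, which is
// what the "owner_offset_in_bytes() - 2" expressions were doing.
inline void* owner_from_tagged(intptr_t tagged) {
  const intptr_t skewed =
      static_cast<intptr_t>(offsetof(FakeMonitor, owner)) - kMonitorValue;
  return *reinterpret_cast<void**>(tagged + skewed);
}
```

Because the low bits of an aligned pointer are zero, OR'ing in 2 is the same as adding 2, so subtracting 2 from the field offset cancels the tag in a single address computation.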

Dan


>
> Thanks,
> David
>
> On 5/11/2014 2:34 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Contended Locking cleanup bucket fix ready for review.
>>
>> This fix was spun off from the Contended Locking fast enter bucket
>> which was sent out for review late last week. This fix cleans up
>> the computation of ObjectMonitor field pointers and gets rid of
>> the use of literal '-2' in appropriate places. For example:
>>
>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>> Rscratch);
>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>
>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>> specified field and subtracts markOopDesc::monitor_value (2).
>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>
>> Thanks to David Holmes for his comments on JDK-8061553 that
>> motivated this (long overdue) cleanup.
>>
>> This work is being tracked by the following bug ID:
>>
>>      JDK-8062851 cleanup ObjectMonitor offset adjustments
>>      https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>
>> Here is the JEP link:
>>
>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>
>> Testing:
>>
>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan


From claes.redestad at oracle.com  Wed Nov  5 15:49:45 2014
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 05 Nov 2014 16:49:45 +0100
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <5459A8ED.8060808@oracle.com>
References: <5459A8ED.8060808@oracle.com>
Message-ID: <545A4719.50705@oracle.com>

Hi,

On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a Contended Locking cleanup bucket fix ready for review.
>
> This fix was spun off from the Contended Locking fast enter bucket
> which was sent out for review late last week. This fix cleans up
> the computation of ObjectMonitor field pointers and gets rid of
> the use of literal '-2' in appropriate places. For example:
>
> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2, 
> Rscratch);
> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>
> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
> specified field and subtracts markOopDesc::monitor_value (2).
> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.

Any reason not to add it as a function in objectMonitor.hpp instead of a 
macro? How about:

   static int no_monitor_offset_in_bytes() {
     return offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value;
   }

Example usage:

-         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2, Rscratch);
+         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(), Rscratch);


Seems this should be inlined regardless and looks a bit cleaner to me.
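
For comparison, a rough standalone sketch of the two shapes side by side. FakeMonitor, kMonitorValue and every name below are made up for illustration and are not the webrev's code; the point is only that a macro can take the field name as an argument, while a plain function needs one accessor per field.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ObjectMonitor.
struct FakeMonitor {
  void* header;
  void* owner;
  long  recursions;
};

// Assumed to mirror markOopDesc::monitor_value (2).
const int kMonitorValue = 2;

// Macro shape: a single definition covers every field.
#define FAKE_OM_OFFSET_NO_MONITOR_VALUE(f) \
  (static_cast<int>(offsetof(FakeMonitor, f)) - kMonitorValue)

// Function shape: type-checked and debugger-friendly, but one accessor
// is needed for each field that generated code wants to reach.
inline int owner_offset_no_monitor_value() {
  return static_cast<int>(offsetof(FakeMonitor, owner)) - kMonitorValue;
}

inline int recursions_offset_no_monitor_value() {
  return static_cast<int>(offsetof(FakeMonitor, recursions)) - kMonitorValue;
}
```

Both forms should fold to the same constant once inlined, so the choice is mostly about how many fields need the adjustment.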

Thanks!

/Claes

>
> Thanks to David Holmes for his comments on JDK-8061553 that
> motivated this (long overdue) cleanup.
>
> This work is being tracked by the following bug ID:
>
>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>     https://bugs.openjdk.java.net/browse/JDK-8062851
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>
> Here is the JEP link:
>
>     https://bugs.openjdk.java.net/browse/JDK-8046133
>
> Testing:
>
> - JPRT test jobs (since this is only syntax and comment cleanup)
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan


From george.triantafillou at oracle.com  Wed Nov  5 16:01:08 2014
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Wed, 05 Nov 2014 11:01:08 -0500
Subject: RFR(S): 8058251 - assert(_count > 0) failed: Negative counter
	when running runtime/NMT/MallocTrackingVerify.java
In-Reply-To: <54394D75.8080601@oracle.com>
References: <028001cfe4e5$8078ae30$816a0a90$@oracle.com>
	<54394D75.8080601@oracle.com>
Message-ID: <545A49C4.3040808@oracle.com>

Hi Christian,

This looks good.  Thanks for fixing this.

As Coleen requested, I filed 8062870 
<https://bugs.openjdk.java.net/browse/JDK-8062870> and assigned it to her.

-George

On 10/11/2014 11:32 AM, Coleen Phillimore wrote:
>
> Hi Christian,
>
> This is a good cleanup.  As we were talking about, I suspect that the 
> tracking level was in the header for startup so that it could be 
> increased, which is something that isn't used.
>
> We should write a test that explicitly overflows the malloc site table 
> buckets though, if we don't have one already.
>
> But this code looks good and we should file another bug for the malloc 
> site table overflows and poor hashing.
>
> Thanks,
> Coleen
>
> On 10/10/14, 7:54 PM, Christian Tornqvist wrote:
>> Hi everyone,
>>
>>
>> Fairly small change which fixes one of the instances of
>> assert(count > 0). The issue was that the mallocSiteTable became full
>> and NMT changed from detail to summary but never updated the tracking
>> level field in the malloc header. Since the malloc was never inserted
>> into the mallocSiteTable, we didn't update the bucket and position in
>> the malloc header, and when we later tried to free that memory block
>> we found tracking level == detailed and used the never-initialized
>> fields for bucket and position indexes.
>>
>>
>> The only place that looked at the level field in the header was
>> MallocHeader::release and it could check the global level state
>> instead. So I removed the 2-bit level field from the malloc headers,
>> and this enabled me to get rid of the 30-bit malloc limitation on
>> 32-bit systems.
>>
>>
>> Also fixed a sign conversion issue on 32bit platforms in WB API
>> NMTMallocWithPseudoStack.
>>
>>
>> Note that this fix doesn't solve all the sources for the assert and
>> I'm not going to enable the test at this point as we continue to
>> track down the additional issues.
>>
>>
>> The fix has been tested using jprt and aurora adhoc with NMT.
>>
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~ctornqvi/webrev/8058251/webrev.00/
>>
>>
>> Bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8058251
>>
>>
>> Thanks,
>>
>> Christian
>>
>>
>


From coleen.phillimore at oracle.com  Wed Nov  5 17:33:13 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 12:33:13 -0500
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <5452A077.2050903@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>
	<5452A786.7030000@oracle.com> <5452A077.2050903@oracle.com>
Message-ID: <545A5F59.2020907@oracle.com>


On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
> Coleen,
>
> I implemented 2 approaches of the fix.
>
> The fix with a special case for VM anon classes is:
> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>
> Both fix the bug, but have different properties.
>
> (1) Special case for VM anon class is very focused on the actual 
> cause, but more fragile - all the logic which keeps metadata from 
> being deallocated is non-trivial and scattered around the whole 
> ciMetadata hierarchy.
>
> (2) On the other hand, initial version, which forcibly creates 
> klass_holder ciObject for each ciMetadata, is much cleaner and 
> localized, but does unnecessary work.
>
> Am I right that you prefer (1) as a fix?

Yes, I think this version does less unnecessary work and creates fewer 
ciObjects.   And the comment is useful for finding how we keep 
ciMetadata alive for anonymous classes.   You still have a UseNewCode in 
the webrev, though, that you want to take out.

>
>> I'm sorry that I didn't get to my email today but from the discussion I
>> think changing two occurrences of "class_loader" in
>> ciInstanceKlass::ciInstanceKlass to "klass_holder" would have solved
>> your problem.
> I don't think that's what we want. For VM anon classes, _loader == 
> NULL, but if we place java_mirror there instead, it could cause 
> problems in other parts of VM, since non-NULL _loader value implicates 
> ClassLoader instance. Not sure all these places are guarded against 
> seeing VM anon classes.

I thought that field was only added to hold the class_loader as a holder 
but if you think using mirror would cause problems, it seems like a 
reason to not do this.

I reviewed this code.

Coleen

>
>
>> Unless, you can add a ciMethod or ciMethodData without adding a
>> ciInstanceKlass (which I don't think you can).
> It's not possible right now. But ciObjectFactory doesn't forbid that.
>
>> I think Roland pointed out a flaw though that you can safepoint before
>> adding a ciInstanceKlass though, which you could fix by moving this up
>> in ciMethod::ciMethod to before the safepoint.
>>
>>    _holder = env->get_instance_klass(h_m()->method_holder());
> I simply pass _holder value into the ciMethod ctor.
>
> Best regards,
> Vladimir Ivanov
>
>> I know I suggested adding the ciObject in ciMetadata but that's because
>> this is done somewhere that is hard to find.  A good comment that this
>> is what keeps metadata that ci points to from being unloaded by GC would
>> help a lot with that.
>
>
>>
>> Thanks,
>> Coleen
>>
>>
>> On 10/30/2014 02:06 PM, Vladimir Kozlov wrote:
>>> I would go with webrev.01 (updated initial version).
>>>
>>> Regards,
>>> Vladimir
>>>
>>> On 10/30/14 7:55 AM, Vladimir Ivanov wrote:
>>>>>> As a solution, _holder can be passed into ciMethod::ciMethod as a
>>>>>> parameter. It should fix the problem.
>>>>>
>>>>> The first change you suggested
>>>>> (http://cr.openjdk.java.net/~vlivanov/8060147/webrev.00) would fix 
>>>>> the
>>>>> ciMethod::ciMethod problem, right? The code would be more robust that
>>>>> way and other similar issues could be avoided.
>>>> Yes, initial version fixes ciMethod::ciMethod problem. It's also more
>>>> robust and easier to reason about.
>>>>
>>>> The downside is that for every ciMetadata instantiation we do more 
>>>> work.
>>>>
>>>> I have an alternative version:
>>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>>>
>>>> Initial version (updated):
>>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.01/
>>>>
>>>> I like initial version more, but I don't have strong opinion here.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>


From vladimir.x.ivanov at oracle.com  Wed Nov  5 17:02:12 2014
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 05 Nov 2014 21:02:12 +0400
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <545A5F59.2020907@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>
	<5452A786.7030000@oracle.com> <5452A077.2050903@oracle.com>
	<545A5F59.2020907@oracle.com>
Message-ID: <545A5814.8000109@oracle.com>


On 11/5/14, 9:33 PM, Coleen Phillimore wrote:
>
> On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
>> Coleen,
>>
>> I implemented 2 approaches of the fix.
>>
>> The fix with a special case for VM anon classes is:
>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>
>> Both fix the bug, but have different properties.
>>
>> (1) Special case for VM anon class is very focused on the actual
>> cause, but more fragile - all the logic which keeps metadata from
>> being deallocated is non-trivial and scattered around the whole
>> ciMetadata hierarchy.
>>
>> (2) On the other hand, initial version, which forcibly creates
>> klass_holder ciObject for each ciMetadata, is much cleaner and
>> localized, but does unnecessary work.
>>
>> Am I right that you prefer (1) as a fix?
>
> Yes, I think this version does less unnecessary work and creates fewer
> ciObjects.   And the comment is useful for finding how we keep
> ciMetadata alive for anonymous classes.   You still have a UseNewCode in
> the webrev, though, that you want to take out.

Thanks, Coleen.

VladimirK, Roland, what do you think about (1)?

Best regards,
Vladimir Ivanov

From coleen.phillimore at oracle.com  Wed Nov  5 18:37:55 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 13:37:55 -0500
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545A4274.6090409@oracle.com>
References: <5459A8ED.8060808@oracle.com> <5459FF11.1080801@oracle.com>
	<545A4274.6090409@oracle.com>
Message-ID: <545A6E83.8060909@oracle.com>


Dan,  I had a look at this change too.

On 11/5/14, 10:29 AM, Daniel D. Daugherty wrote:
> On 11/5/14 3:42 AM, David Holmes wrote:
>> Hi Dan,
>>
>> Reviewed.
>
> Thanks!
>
>
>> I find the name OM_OFFSET_NO_MONITOR_VALUE somewhat awkward but have 
>> no better suggestion.
>
> Understood. I didn't like the original "OFFSET_SKEWED" name
> especially since I was moving it to objectMonitor.hpp...
>
> If you think of a better name, let me know... we can always change it.
>

So the -2 was a tag?  Then maybe a better name is UNTAGGED_OM_OFFSET ..  
Weird stuff anyway.

In 
http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/src/cpu/x86/vm/macroAssembler_x86.cpp.udiff.html

Can you make the whitespace changes to the lines you've changed:

+         movptr(tmpReg, Address (tmpReg, OM_OFFSET_NO_MONITOR_VALUE(owner)));   // rax, = m->_owner


to

+         movptr(tmpReg, Address(tmpReg, OM_OFFSET_NO_MONITOR_VALUE(owner)));   // rax, = m->_owner


In general, this looks like a great improvement not subtracting two from 
seemingly random places in assembly code.

thanks,
Coleen

>
>
>> In fact I have to ask what _is_ the object monitor tagging mechanism? 
>> I can't see it defined in the objectMonitor.* files. ??
>
> That would be this code:
>
> src/share/vm/oops/markOop.hpp:
>
>   static markOop encode(ObjectMonitor* monitor) {
>     intptr_t tmp = (intptr_t) monitor;
>     return (markOop) (tmp | monitor_value);
>   }
>
> and the other methods in that file that have to account for
> the monitor_value being set...
>
> Dan
>
>
>>
>> Thanks,
>> David
>>
>> On 5/11/2014 2:34 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>
>>> This fix was spun off from the Contended Locking fast enter bucket
>>> which was sent out for review late last week. This fix cleans up
>>> the computation of ObjectMonitor field pointers and gets rid of
>>> the use of literal '-2' in appropriate places. For example:
>>>
>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>> Rscratch);
>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>
>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>> specified field and subtracts markOopDesc::monitor_value (2).
>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>
>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>> motivated this (long overdue) cleanup.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>      JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>      https://bugs.openjdk.java.net/browse/JDK-8062851
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> Testing:
>>>
>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>


From david.buck at oracle.com  Wed Nov  5 18:59:25 2014
From: david.buck at oracle.com (david buck)
Date: Thu, 06 Nov 2014 03:59:25 +0900
Subject: RFR 8058715: stability issues when being launched as an embedded
	JVM via JNI
Message-ID: <545A738D.2080201@oracle.com>

Hi!

This is a request for code review of my fix for JDK-8058715.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
WEBREV: http://cr.openjdk.java.net/~dbuck/8058715/webrev.01/

We have also received confirmation from the original reporter of the 
issue that this solution resolves the crashes they were seeing in their 
environment. I have tested that this change does not break the original 
NX bug workaround. I also ran the NX bug reproducer (v8 benchmark of 
Nashorn running in a loop) using a fastdebug build with the 
-XX:NativeMemoryTracking=summary option. Obviously no crashes or other 
issues were detected.

Cheers,
-Buck

From calvin.cheung at oracle.com  Wed Nov  5 19:14:20 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 05 Nov 2014 11:14:20 -0800
Subject: RFR(XS): 8060721: Test runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
Message-ID: <545A770C.3030503@oracle.com>

While upgrading the compiler on Mac for jdk9, we found this compiler bug 
where it skips the following 2 lines of code in metaspaceShared.cpp when 
optimization is enabled (set to -Os) for the fastdebug and product builds.
     strcat(class_list_path_str, os::file_separator());
     strcat(class_list_path_str, "classlist");

The bug is reproducible with Xcode 5.1.1 and 6.1.

A workaround fix is to rewrite an "if" block in the 
MetaspaceShared::preload_and_dump() method.
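
As an aside, and not what the webrev does (the fix there rewrites the "if" block), one way to avoid depending on two consecutive strcat calls would be to format the path in a single bounded call. The helper name and its parameters below are hypothetical; whether this shape would also sidestep the Xcode optimizer problem has not been tested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: writes "<dir><sep>classlist" into buf.
// Returns true only if the whole path fit, including the terminator.
inline bool build_class_list_path(char* buf, size_t buf_len,
                                  const char* dir, const char* sep) {
  int n = std::snprintf(buf, buf_len, "%s%s%s", dir, sep, "classlist");
  return n >= 0 && static_cast<size_t>(n) < buf_len;
}
```

A caller would pass the directory string and os::file_separator(), and could treat a false return as a truncation error instead of silently using a short path.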

JBS: https://bugs.openjdk.java.net/browse/JDK-8060721

webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/

Testing:
     JPRT
     The affected testcase with product, fastdebug, and debug builds 
built with Xcode 5.1.1 and 6.1.

thanks,
Calvin

From coleen.phillimore at oracle.com  Wed Nov  5 19:50:20 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 14:50:20 -0500
Subject: RFR 8058715: stability issues when being launched as an embedded
	JVM via JNI
In-Reply-To: <545A738D.2080201@oracle.com>
References: <545A738D.2080201@oracle.com>
Message-ID: <545A7F7C.7030007@oracle.com>


Looks good, David.
Thank you for diagnosing and resolving this customer problem!
Coleen

On 11/5/14, 1:59 PM, david buck wrote:
> Hi!
>
> This is a request for code review of my fix for JDK-8058715.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
> WEBREV: http://cr.openjdk.java.net/~dbuck/8058715/webrev.01/
>
> We have also received confirmation from the original reporter of the 
> issue that this solution resolves the crashes they were seeing in 
> their environment. I have tested that this change does not break the 
> original NX bug workaround. I also ran the NX bug reproducer (v8 
> benchmark of Nashorn running in a loop) using a fastdebug build with 
> the -XX:NativeMemoryTracking=summary option. Obviously no crashes or 
> other issues were detected.
>
> Cheers,
> -Buck


From dean.long at oracle.com  Wed Nov  5 21:28:28 2014
From: dean.long at oracle.com (Dean Long)
Date: Wed, 05 Nov 2014 13:28:28 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545A770C.3030503@oracle.com>
References: <545A770C.3030503@oracle.com>
Message-ID: <545A967C.6020200@oracle.com>

I'm just curious if the following also works:

    strcat(class_list_path_str, (volatile char *)os::file_separator());
    strcat(class_list_path_str, (volatile char *)"classlist");

dl

On 11/5/2014 11:14 AM, Calvin Cheung wrote:
> While upgrading the compiler on Mac for jdk9, we found this compiler 
> bug where it skips the following 2 lines of code in 
> metaspaceShared.cpp when optimization is enabled (set to -Os) for the 
> fastdebug and product builds.
>     strcat(class_list_path_str, os::file_separator());
>     strcat(class_list_path_str, "classlist");
>
> The bug is reproducible with Xcode 5.1.1 and 6.1.
>
> A workaround fix is to rewrite an "if" block in the 
> MetaspaceShared::preload_and_dump() method.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>
> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>
> Testing:
>     JPRT
>     The affected testcase with product, fastdebug, and debug builds 
> built with Xcode 5.1.1 and 6.1.
>
> thanks,
> Calvin


From vladimir.kozlov at oracle.com  Wed Nov  5 21:51:39 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 05 Nov 2014 13:51:39 -0800
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <545A5814.8000109@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>
	<5452A786.7030000@oracle.com> <5452A077.2050903@oracle.com>
	<545A5F59.2020907@oracle.com> <545A5814.8000109@oracle.com>
Message-ID: <545A9BEB.8020507@oracle.com>

I am fine with targeted fix only.

One comment: env->get_instance_klass() checks for NULL. Your new code in 
create_new_metadata() does not:

ciInstanceKlass* holder = get_metadata(h_m()->method_holder())->as_instance_klass();

Thanks,
Vladimir K

On 11/5/14 9:02 AM, Vladimir Ivanov wrote:
>
> On 11/5/14, 9:33 PM, Coleen Phillimore wrote:
>>
>> On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
>>> Coleen,
>>>
>>> I implemented 2 approaches of the fix.
>>>
>>> The fix with a special case for VM anon classes is:
>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>>
>>> Both fix the bug, but have different properties.
>>>
>>> (1) Special case for VM anon class is very focused on the actual
>>> cause, but more fragile - all the logic which keeps metadata from
>>> being deallocated is non-trivial and scattered around the whole
>>> ciMetadata hierarchy.
>>>
>>> (2) On the other hand, initial version, which forcibly creates
>>> klass_holder ciObject for each ciMetadata, is much cleaner and
>>> localized, but does unnecessary work.
>>>
>>> Am I right that you prefer (1) as a fix?
>>
>> Yes, I think this version does less unnecessary work and creates fewer
>> ciObjects.   And the comment is useful for finding how we keep
>> ciMetadata alive for anonymous classes.   You still have a UseNewCode in
>> the webrev, though, that you want to take out.
>
> Thanks, Coleen.
>
> VladimirK, Roland, what do you think about (1)?
>
> Best regards,
> Vladimir Ivanov

From yumin.qi at oracle.com  Wed Nov  5 22:16:38 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 05 Nov 2014 14:16:38 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545A770C.3030503@oracle.com>
References: <545A770C.3030503@oracle.com>
Message-ID: <545AA1C6.70902@oracle.com>

Looks good to me.

Thanks
Yumin

On 11/5/2014 11:14 AM, Calvin Cheung wrote:

> While upgrading the compiler on Mac for jdk9, we found this compiler 
> bug where it skips the following 2 lines of code in 
> metaspaceShared.cpp when optimization is enabled (set to -Os) for the 
> fastdebug and product builds.
>     strcat(class_list_path_str, os::file_separator());
>     strcat(class_list_path_str, "classlist");
>
> The bug is reproducible with Xcode 5.1.1 and 6.1.
>
> A workaround fix is to rewrite an "if" block in the 
> MetaspaceShared::preload_and_dump() method.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>
> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>
> Testing:
>     JPRT
>     The affected testcase with product, fastdebug, and debug builds 
> built with Xcode 5.1.1 and 6.1.
>
> thanks,
> Calvin


From calvin.cheung at oracle.com  Wed Nov  5 22:34:47 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 05 Nov 2014 14:34:47 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545A967C.6020200@oracle.com>
References: <545A770C.3030503@oracle.com> <545A967C.6020200@oracle.com>
Message-ID: <545AA607.2050500@oracle.com>

Hi Dean,

I've tried your suggestion but got the following compilation error:

Compiling 
/Users/ccheung/jdk9-comp-upgrade/hotspot/src/share/vm/memory/metaspaceShared.cpp
/Users/ccheung/jdk9-comp-upgrade/hotspot/src/share/vm/memory/metaspaceShared.cpp:720:5: 
error: no matching function for call to 'strcat'
     strcat(class_list_path_str, (volatile char *)os::file_separator());
     ^~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/string.h:75:7: 
note: candidate function not viable: 2nd argument ('volatile char *') 
would lose volatile qualifier
char    *strcat(char *, const char *);
          ^
/Users/ccheung/jdk9-comp-upgrade/hotspot/src/share/vm/memory/metaspaceShared.cpp:721:5: 
error: no matching function for call to 'strcat'
     strcat(class_list_path_str, (volatile char *)"classlist");
     ^~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/string.h:75:7: 
note: candidate function not viable: 2nd argument ('volatile char *') 
would lose volatile qualifier
char    *strcat(char *, const char *);
          ^
2 errors generated.
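
The errors come from the qualifier rules: a 'volatile char *' cannot be passed where 'const char *' is expected, because that would drop the volatile on the pointed-to characters. A variant that does compile makes the pointer itself volatile instead. This is only a sketch with a made-up helper name, and I have not checked whether it changes what the optimizer does with the two calls.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: the pointer object is volatile, not the
// characters it points to, so reading it yields a plain 'const char*'
// that strcat accepts. The caller must ensure dst has enough room.
inline void append_via_volatile_pointer(char* dst, const char* src) {
  const char* volatile p = src;
  std::strcat(dst, p);
}
```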

Calvin

On 11/5/2014 1:28 PM, Dean Long wrote:
> I'm just curious if the following also works:
>
>  721     strcat(class_list_path_str, (volatile char 
> *)os::file_separator());
>  722     strcat(class_list_path_str,(volatile char *)"classlist");
>
> dl
>
> On 11/5/2014 11:14 AM, Calvin Cheung wrote:
>> While upgrading the compiler on Mac for jdk9, we found this compiler 
>> bug where it skips the following 2 lines of code in 
>> metaspaceShared.cpp when optimization is enabled (set to -Os) for the 
>> fastdebug and product builds.
>>     strcat(class_list_path_str, os::file_separator());
>>     strcat(class_list_path_str, "classlist");
>>
>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>
>> A workaround fix is to rewrite an "if" block in the 
>> MetaspaceShared::preload_and_dump() method.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>
>> Testing:
>>     JPRT
>>     The affected testcase with product, fastdebug, and debug builds 
>> built with Xcode 5.1.1 and 6.1.
>>
>> thanks,
>> Calvin
>


From david.holmes at oracle.com  Wed Nov  5 23:12:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 09:12:42 +1000
Subject: RFR 8058715: stability issues when being launched as an embedded
	JVM via JNI
In-Reply-To: <545A738D.2080201@oracle.com>
References: <545A738D.2080201@oracle.com>
Message-ID: <545AAEEA.1070605@oracle.com>

Hi David,

On 6/11/2014 4:59 AM, david buck wrote:
> Hi!
>
> This is a request for code review of my fix for JDK-8058715.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
> WEBREV: http://cr.openjdk.java.net/~dbuck/8058715/webrev.01/
>
> We have also received confirmation from the original reporter of the
> issue that this solution resolves the crashes they were seeing in their
> environment. I have tested that this change does not break the original
> NX bug workaround. I also ran the NX bug reproducer (v8 benchmark of
> Nashorn running in a loop) using a fastdebug build with the
> -XX:NativeMemoryTracking=summary option. Obviously no crashes or other
> issues were detected.

The failure mode for this suggests we are lacking something when we 
attempt to reserve memory. I think that needs closer examination as we 
should not have something that leads to silent corruption followed by 
spurious failures!

That aside I don't see how this "does not break the original NX bug 
workaround". We will skip the workaround if the memory reservation 
fails. Is it the case that in such circumstances we don't need the 
workaround?

Thanks,
David H.

> Cheers,
> -Buck

From christian.thalinger at oracle.com  Wed Nov  5 23:13:12 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 5 Nov 2014 15:13:12 -0800
Subject: RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining
	is disabled: assert(dmw->is_neutral()) failed: invariant
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116566ACE597E@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116566ACE597E@DEWDFEMB19C.global.corp.sap>
Message-ID: <B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>

I'm not exactly sure who is our biased locking expert these days but I guess it's someone from the runtime team.  CC'ing them.

> On Nov 5, 2014, at 7:38 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
> 
> Hi,
>  
> we found a bug in MacroAssembler::fast_lock on x86 which shows up when UseOptoBiasInlining is switched off.
> The problem is that biased_locking_enter is used with swap_reg_contains_mark==true, which is no longer correct after biased_locking_enter was put in front of check for IsInflated.
>  
> Please review
> http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/ <http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/>
>  
> Best regards,
> Martin


From jeremymanson at google.com  Wed Nov  5 23:13:45 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Wed, 5 Nov 2014 15:13:45 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <5459ADFB.4090808@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
Message-ID: <CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>

On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

>  The fix looks good in general.
>
> src/share/vm/oops/method.cpp
>
> 1785   bool contains(Method** m) {
> 1786     if (m == NULL) return false;
> 1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
> 1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
> 1789         ptrdiff_t idx = m - b->_methods;
> 1790         if (b->_methods + idx == m) {
> 1791           return true;
> 1792         }
> 1793       }
> 1794     }
> 1795     return false;  // not found
> 1796   }
>
>
> Just noticed that the lines 1789-1792 can be replaced with one liner:
>           return true;
>

Ah, you have found our crappy workaround for wild pointers to non-aligned
places in the middle of _methods.


> It is because the condition (b->_methods + idx == m) is always true.
>   :)
>
> Also, should we check the condition:  *m != _free_method ?
> What about the following ?:
>           return (*m != _free_method);
>

I don't mind adding this, if Coleen thinks those are the semantics this
needs.  It wasn't there before, of course.

Jeremy

From vladimir.kozlov at oracle.com  Wed Nov  5 23:30:54 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 05 Nov 2014 15:30:54 -0800
Subject: RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining
	is disabled: assert(dmw->is_neutral()) failed: invariant
In-Reply-To: <B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>
References: <7C9B87B351A4BA4AA9EC95BB418116566ACE597E@DEWDFEMB19C.global.corp.sap>
	<B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>
Message-ID: <545AB32E.8070402@oracle.com>

It is our (Compiler group) code. This problem was introduced with my 
changes for RTM locking.

Martin, your changes are good. But could you clean up this code a bit, since we 
now never put the markword into tmpReg before this call?

Thanks,
Vladimir

On 11/5/14 3:13 PM, Christian Thalinger wrote:
> I'm not exactly sure who is our biased locking expert these days but I guess it's someone from the runtime team.  CC'ing them.
>
>> On Nov 5, 2014, at 7:38 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
>>
>> Hi,
>>
>> we found a bug in MacroAssembler::fast_lock on x86 which shows up when UseOptoBiasInlining is switched off.
>> The problem is that biased_locking_enter is used with swap_reg_contains_mark==true, which is no longer correct after biased_locking_enter was put in front of check for IsInflated.
>>
>> Please review
>> http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/ <http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/>
>>
>> Best regards,
>> Martin
>

From coleen.phillimore at oracle.com  Wed Nov  5 23:40:17 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 18:40:17 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
Message-ID: <545AB561.9020204@oracle.com>


On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>
>
> On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com>> wrote:
>
>     The fix looks good in general.
>
>     src/share/vm/oops/method.cpp
>
>     1785   bool contains(Method** m) {
>     1786     if (m == NULL) return false;
>     1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>     1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>     1789         ptrdiff_t idx = m - b->_methods;
>     1790         if (b->_methods + idx == m) {
>     1791           return true;
>     1792         }
>     1793       }
>     1794     }
>     1795     return false;  // not found
>     1796   }
>
>
>     Just noticed that the lines 1789-1792 can be replaced with one liner:
>             return true;
>
>
> Ah, you have found our crappy workaround for wild pointers to 
> non-aligned places in the middle of _methods.

Can you explain this?  Why are there wild pointers?
>
>     It is because the condition (b->_methods + idx == m) is always
>     true.     :)
>
>     Also, should we check the condition: *m != _free_method ?
>     What about the following ?:
>             return (*m != _free_method);
>
>
> I don't mind adding this, if Coleen thinks those are the semantics 
> this needs.  It wasn't there before, of course.
>

The semantics weren't there before and the way this is called has 
already checked that *m != _free_method.  Would it be an improvement?  I 
don't think so.  It seems that contains() should just check that the 
Method** is contained in the methodID table.  To be more correct, 
is_method_id should check that it's not a freed methodID but the caller 
verifies this already.   So I don't think this should change.

BTW, I've run the test sets suggested by Serguei and they all passed.

Coleen

> Jeremy
>


From david.holmes at oracle.com  Wed Nov  5 23:44:57 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 09:44:57 +1000
Subject: RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining
	is disabled: assert(dmw->is_neutral()) failed: invariant
In-Reply-To: <B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>
References: <7C9B87B351A4BA4AA9EC95BB418116566ACE597E@DEWDFEMB19C.global.corp.sap>
	<B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>
Message-ID: <545AB679.4040705@oracle.com>

On 6/11/2014 9:13 AM, Christian Thalinger wrote:
> I'm not exactly sure who is our biased locking expert these days but I guess it's someone from the runtime team.  CC'ing them.

The fact I am responding does not imply I am, or consider myself, such 
an expert. ;-) I think we need to hear from Vladimir and Roland 
concerning the original fix for:

8033805: Move Fast_Lock/Fast_Unlock code from .ad files to macroassembler

Looking at that changeset:

http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/5292439ef895

it seems that in x86_32.ad we had:

if (UseBiasedLocking) {
   masm.biased_locking_enter(boxReg, objReg, tmpReg, scrReg, false, 
DONE_LABEL, NULL, _counters);
}

which passes "false", but in x86_64.ad we had:

if (UseBiasedLocking && !UseOptoBiasInlining) {
   masm.biased_locking_enter(boxReg, objReg, tmpReg, scrReg, true, 
DONE_LABEL, NULL, _counters);
   masm.movptr(tmpReg, Address(objReg, 0)) ;        // [FETCH]
}

which passes "true" because there was a prior load of the markword into 
tmpReg.

The new code then has the 64-bit version:

if (UseBiasedLocking && !UseOptoBiasInlining) {
   biased_locking_enter(boxReg, objReg, tmpReg, scrReg, true, 
DONE_LABEL, NULL, counters);
}

but not the prior load and hence is incorrect.

So I concur with Martin's suggested fix.
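
To make the contract concrete, here is a toy model of it (not HotSpot 
code; Regs, mark_seen_by_enter and the three *_path functions are names 
made up for illustration). A callee that trusts a "register already 
contains the mark" flag is only correct if the caller really did the 
load first:

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: none of these names exist in HotSpot.
struct Regs {
  std::uintptr_t tmp;  // stands in for tmpReg (the swap register)
};

// Stands in for biased_locking_enter(): returns the mark word value it
// would operate on. If the flag is true it trusts the register;
// otherwise it loads the mark word from the object header itself.
inline std::uintptr_t mark_seen_by_enter(Regs& r,
                                         const std::uintptr_t* obj_header,
                                         bool swap_reg_contains_mark) {
  if (!swap_reg_contains_mark) {
    r.tmp = *obj_header;  // load done by the callee
  }
  return r.tmp;
}

// Old x86_64.ad shape as described above: load first, then pass true.
inline std::uintptr_t old_64bit_path(const std::uintptr_t* obj_header) {
  Regs r{0xdead};       // stale register contents
  r.tmp = *obj_header;  // prior load of the mark word
  return mark_seen_by_enter(r, obj_header, true);
}

// Shape after the move to the macroassembler: no prior load, still true.
inline std::uintptr_t broken_path(const std::uintptr_t* obj_header) {
  Regs r{0xdead};
  return mark_seen_by_enter(r, obj_header, true);
}

// One possible repair: pass false so the callee loads the mark word.
inline std::uintptr_t fixed_path(const std::uintptr_t* obj_header) {
  Regs r{0xdead};
  return mark_seen_by_enter(r, obj_header, false);
}
```

In this model broken_path() hands back the stale register value instead 
of the header, which is the kind of mismatch that would trip an 
invariant check further down.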

Cheers,
David

>> On Nov 5, 2014, at 7:38 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
>>
>> Hi,
>>
>> we found a bug in MacroAssembler::fast_lock on x86 which shows up when UseOptoBiasInlining is switched off.
>> The problem is that biased_locking_enter is used with swap_reg_contains_mark==true, which is no longer correct after biased_locking_enter was put in front of check for IsInflated.


Thanks,
David

>>
>> Please review
>> http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/ <http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/>
>>
>> Best regards,
>> Martin
>

From serguei.spitsyn at oracle.com  Wed Nov  5 23:51:03 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 05 Nov 2014 15:51:03 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545AB561.9020204@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
Message-ID: <545AB7E7.4020809@oracle.com>

On 11/5/14 3:40 PM, Coleen Phillimore wrote:
>
> On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>
>>
>> On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     The fix looks good in general.
>>
>>     src/share/vm/oops/method.cpp
>>
>>     1785   bool contains(Method** m) {
>>     1786     if (m == NULL) return false;
>>     1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>>     1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>>     1789         ptrdiff_t idx = m - b->_methods;
>>     1790         if (b->_methods + idx == m) {
>>     1791           return true;
>>     1792         }
>>     1793       }
>>     1794     }
>>     1795     return false;  // not found
>>     1796   }
>>
>>
>>     Just noticed that the lines 1789-1792 can be replaced with one liner:
>>             return true;
>>
>>
>> Ah, you have found our crappy workaround for wild pointers to 
>> non-aligned places in the middle of _methods.
>
> Can you explain this?  Why are there wild pointers?
>>
>>     It is because the condition (b->_methods + idx == m) is always
>>     true.     :)
>>
>>     Also, should we check the condition: *m != _free_method ?
>>     What about the following ?:
>>             return (*m != _free_method);
>>
>>
>> I don't mind adding this, if Coleen thinks those are the semantics 
>> this needs.  It wasn't there before, of course.
>>
>
> The semantics weren't there before and the way this is called has 
> already checked that *m != _free_method.  Would it be an improvement?  
> I don't think so.  It seems that contains() should just check that the 
> Method** is contained in the methodID table. To be more correct, 
> is_method_id should check that it's not a freed methodID but the 
> caller verifies this already.   So I don't think this should change.

Agreed.
Thank you for the explanation!


>
> BTW, I've run the test sets suggested by Serguei and they all passed.

Nice!


Thanks,
Serguei

>
> Coleen
>
>> Jeremy
>>
>


From david.holmes at oracle.com  Thu Nov  6 00:16:49 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 10:16:49 +1000
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545A4719.50705@oracle.com>
References: <5459A8ED.8060808@oracle.com> <545A4719.50705@oracle.com>
Message-ID: <545ABDF1.6050107@oracle.com>

On 6/11/2014 1:49 AM, Claes Redestad wrote:
> Hi,
>
> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Contended Locking cleanup bucket fix ready for review.
>>
>> This fix was spun off from the Contended Locking fast enter bucket
>> which was sent out for review late last week. This fix cleans up
>> the computation of ObjectMonitor field pointers and gets rid of
>> the use of literal '-2' in appropriate places. For example:
>>
>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>> Rscratch);
>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>
>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>> specified field and subtracts markOopDesc::monitor_value (2).
>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>
> any reason not to add it as a function in objectMonitor.hpp instead of a
> macro? How about:
>
>    static int no_monitor_offset_in_bytes()  { return
> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }

_owner is not the only field used so you would need a function for each one.

David
-----

> Example usage:
>
> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
> Rscratch);
> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
> Rscratch);
>
>
> Seems this should be inlined regardless and looks a bit cleaner to me.
>
> Thanks!
>
> /Claes
>
>>
>> Thanks to David Holmes for his comments on JDK-8061553 that
>> motivated this (long overdue) cleanup.
>>
>> This work is being tracked by the following bug ID:
>>
>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>
>> Here is the JEP link:
>>
>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>
>> Testing:
>>
>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>

From david.holmes at oracle.com  Thu Nov  6 00:21:58 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 10:21:58 +1000
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545A4274.6090409@oracle.com>
References: <5459A8ED.8060808@oracle.com> <5459FF11.1080801@oracle.com>
	<545A4274.6090409@oracle.com>
Message-ID: <545ABF26.5010505@oracle.com>

On 6/11/2014 1:29 AM, Daniel D. Daugherty wrote:
> On 11/5/14 3:42 AM, David Holmes wrote:
>> Hi Dan,
>>
>> Reviewed.
>
> Thanks!
>
>
>> I find the name OM_OFFSET_NO_MONITOR_VALUE somewhat awkward but have
>> no better suggestion.
>
> Understood. I didn't like the original "OFFSET_SKEWED" name
> especially since I was moving it to objectMonitor.hpp...
>
> If you think of a better one, let me know... we can always change it.
>
>
>
>> In fact I have to ask what _is_ the object monitor tagging mechanism?
>> I can't see it defined in the objectMonitor.* files. ??
>
> That would be this code:
>
> src/share/vm/oops/markOop.hpp:

Doh! The markword encoding - of course.
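
For anyone else following along, the tagging can be sketched in 
isolation. This is a standalone model, assuming monitor_value is 2 as 
stated in this thread; ToyMonitor and the helper names are hypothetical 
and only the arithmetic is the point:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption from this thread: markOopDesc::monitor_value is 2.
const std::uintptr_t kMonitorValue = 2;

// Hypothetical stand-in for ObjectMonitor; only the layout matters here.
struct ToyMonitor {
  void* header;
  void* owner;
};

// Same shape as markOopDesc::encode(): OR the tag into the pointer.
// The monitor is pointer-aligned, so its low two bits start out clear.
inline std::uintptr_t encode(ToyMonitor* m) {
  return reinterpret_cast<std::uintptr_t>(m) | kMonitorValue;
}

// Address of the owner field computed straight from the tagged mark
// word: adding (offset - 2) cancels the tag without a masking step.
inline void** owner_addr_from_mark(std::uintptr_t mark) {
  return reinterpret_cast<void**>(
      mark + offsetof(ToyMonitor, owner) - kMonitorValue);
}
```

That cancellation is why the generated code can use "offset - 2" 
directly against the register that holds the mark word.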

Thanks,
David

>      317   static markOop encode(ObjectMonitor* monitor) {
>      318     intptr_t tmp = (intptr_t) monitor;
>      319     return (markOop) (tmp | monitor_value);
>      320   }
>
> and the other methods in that file that have to account for
> the monitor_value being set...
>
> Dan
>
>
>>
>> Thanks,
>> David
>>
>> On 5/11/2014 2:34 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>
>>> This fix was spun off from the Contended Locking fast enter bucket
>>> which was sent out for review late last week. This fix cleans up
>>> the computation of ObjectMonitor field pointers and gets rid of
>>> the use of literal '-2' in appropriate places. For example:
>>>
>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>> Rscratch);
>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>
>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>> specified field and subtracts markOopDesc::monitor_value (2).
>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>
>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>> motivated this (long overdue) cleanup.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>      JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>      https://bugs.openjdk.java.net/browse/JDK-8062851
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> Testing:
>>>
>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>

From coleen.phillimore at oracle.com  Thu Nov  6 00:23:42 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 19:23:42 -0500
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545ABDF1.6050107@oracle.com>
References: <5459A8ED.8060808@oracle.com> <545A4719.50705@oracle.com>
	<545ABDF1.6050107@oracle.com>
Message-ID: <545ABF8E.1050408@oracle.com>


On 11/5/14, 7:16 PM, David Holmes wrote:
> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>> Hi,
>>
>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>
>>> This fix was spun off from the Contended Locking fast enter bucket
>>> which was sent out for review late last week. This fix cleans up
>>> the computation of ObjectMonitor field pointers and gets rid of
>>> the use of literal '-2' in appropriate places. For example:
>>>
>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>> Rscratch);
>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>
>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>> specified field and subtracts markOopDesc::monitor_value (2).
>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>
>> any reason not to add it as a function in objectMonitor.hpp instead of a
>> macro? How about:
>>
>>    static int no_monitor_offset_in_bytes()  { return
>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>
> _owner is not the only field used so you would need a function for 
> each one.

I thought this would be better too.   There are only 6 functions (6 
lines) max that need this.  It would look nicer.

My suggestion would be to make them static int 
untagged_offset_in_bytes() or whatever monitor_value is.  It's not a 
very descriptive name so better to name the functions after what it's for.

Coleen

>
> David
> -----
>
>> Example usage:
>>
>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>> Rscratch);
>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>> Rscratch);
>>
>>
>> Seems this should be inlined regardless and looks a bit cleaner to me.
>>
>> Thanks!
>>
>> /Claes
>>
>>>
>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>> motivated this (long overdue) cleanup.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> Testing:
>>>
>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>


From david.holmes at oracle.com  Thu Nov  6 00:34:04 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 10:34:04 +1000
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545ABF8E.1050408@oracle.com>
References: <5459A8ED.8060808@oracle.com>
	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>
	<545ABF8E.1050408@oracle.com>
Message-ID: <545AC1FC.8010905@oracle.com>

On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>
> On 11/5/14, 7:16 PM, David Holmes wrote:
>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>> Hi,
>>>
>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>
>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>> which was sent out for review late last week. This fix cleans up
>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>> the use of literal '-2' in appropriate places. For example:
>>>>
>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>> Rscratch);
>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>>
>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>
>>> any reason not to add it as a function in objectMonitor.hpp instead of a
>>> macro? How about:
>>>
>>>    static int no_monitor_offset_in_bytes()  { return
>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>
>> _owner is not the only field used so you would need a function for
>> each one.
>
> I thought this would be better too.   There are only 6 functions (6
> lines) max that need this.  It would look nicer.

That only changes an upper-case macro name to a lower-case function name.

> My suggestion would be to make them static int
> untagged_offset_in_bytes() or whatever monitor_value is.  It's not a
> very descriptive name so better to name the functions after what it's for.

You need the field name included in the function name:

untagged_offset_of_owner()
untagged_offset_of_xxx()

but it is only untagged if the OM is currently inflated, so then:

untagged_offset_of_XXX_for_inflated_om()

I can live with Dan's macro (which is an improvement on the original).
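
For comparison, the two shapes under discussion can be sketched side by 
side. This is a toy model: ToyMonitor, TOY_OFFSET_NO_MONITOR_VALUE and 
the per-field functions are hypothetical, and the real macro in 
objectMonitor.hpp may well be written differently:

```cpp
#include <cassert>
#include <cstddef>

// Assumption from this thread: markOopDesc::monitor_value is 2.
const int kMonitorValue = 2;

// Hypothetical stand-in for ObjectMonitor.
struct ToyMonitor {
  void* _header;
  void* _owner;
  void* _recursions;

  // Function style: one accessor has to be written per field.
  static int untagged_offset_of_owner() {
    return static_cast<int>(offsetof(ToyMonitor, _owner)) - kMonitorValue;
  }
  static int untagged_offset_of_recursions() {
    return static_cast<int>(offsetof(ToyMonitor, _recursions)) - kMonitorValue;
  }
};

// Macro style: a single definition covers every field via token pasting.
#define TOY_OFFSET_NO_MONITOR_VALUE(f) \
  (static_cast<int>(offsetof(ToyMonitor, _##f)) - kMonitorValue)
```

Both forms produce the same constants; the difference is only in how 
many definitions have to be maintained and how the call sites read.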

David

> Coleen
>
>>
>> David
>> -----
>>
>>> Example usage:
>>>
>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>> Rscratch);
>>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>>> Rscratch);
>>>
>>>
>>> Seems this should be inlined regardless and looks a bit cleaner to me.
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>>>
>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>> motivated this (long overdue) cleanup.
>>>>
>>>> This work is being tracked by the following bug ID:
>>>>
>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>
>>>> Here is the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>
>>>> Here is the JEP link:
>>>>
>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>
>>>> Testing:
>>>>
>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>>
>

From jeremymanson at google.com  Thu Nov  6 00:35:23 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Wed, 5 Nov 2014 16:35:23 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545AB561.9020204@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
Message-ID: <CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>

On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore <
coleen.phillimore at oracle.com> wrote:

>
> On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>
>
>
> On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com <
> serguei.spitsyn at oracle.com> wrote:
>
>>  The fix looks good in general.
>>
>> src/share/vm/oops/method.cpp
>>
>> 1785   bool contains(Method** m) {
>> 1786     if (m == NULL) return false;
>> 1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>> 1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>> 1789         ptrdiff_t idx = m - b->_methods;
>> 1790         if (b->_methods + idx == m) {
>> 1791           return true;
>> 1792         }
>> 1793       }
>> 1794     }
>> 1795     return false;  // not found
>> 1796   }
>>
>>
>> Just noticed that the lines 1789-1792 can be replaced with one liner:
>>           return true;
>>
>
>  Ah, you have found our crappy workaround for wild pointers to
> non-aligned places in the middle of _methods.
>
>
> Can you explain this?  Why are there wild pointers?
>

My belief was that end user code could pass any old garbage to this
function.  It's called by Method::is_method_id, which is called
by jniCheck::validate_jmethod_id.  The idea was that this would help check
jni deliver useful information in the case of the end user inputting
garbage that happened to be in the right memory range.

Having said that, at a second glance, it looks as if that call is
protected by a call to is_method() (in checked_resolve_jmethod_id), so the
program will probably crash before it gets to this check.

The other point about it was that the result of >= and < is technically
unspecified; if it were ever implemented as anything other than a binary
comparison between integers (which it never is, now that no one has a
segmented architecture), the comparison could pass spuriously, so checking
would be a good thing.  Of course, the comparison could fail spuriously,
too.

Anyway, I'm happy to leave it in as belt-and-suspenders (and add a comment,
obviously, since it has caused confusion), or take it out.  Your call.
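
To make the alignment point concrete, here is a sketch under stated
assumptions (Block and both functions are hypothetical, not the
method.cpp code). With typed pointer arithmetic, base + (m - base) == m
holds whenever the subtraction is defined, so it cannot reject a pointer
into the middle of a slot. A check on integer addresses can:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Method;  // opaque; only pointers to it are used

// Hypothetical stand-in for JNIMethodBlockNode.
struct Block {
  Method** methods;
  int number_of_methods;
};

// Range check only, done on integer addresses: accepts any address
// inside the block, whether or not it sits on a slot boundary.
inline bool contains_range_only(const Block& b, const void* p) {
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(b.methods);
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  const std::uintptr_t size =
      static_cast<std::uintptr_t>(b.number_of_methods) * sizeof(Method*);
  return addr >= base && addr < base + size;
}

// Range check plus an explicit slot-alignment check, so a pointer to
// the middle of a slot is rejected.
inline bool contains_aligned(const Block& b, const void* p) {
  if (!contains_range_only(b, p)) return false;
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(b.methods);
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  return (addr - base) % sizeof(Method*) == 0;
}
```

Comparing integer addresses also sidesteps the question of what the
relational operators do for pointers into different arrays, at the cost
of relying on the implementation-defined pointer-to-integer mapping.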

Jeremy

From coleen.phillimore at oracle.com  Thu Nov  6 00:41:42 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 19:41:42 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
Message-ID: <545AC3C6.6070105@oracle.com>


Yes, leave it in and add a comment then (sorry for top-posting). Thank 
you for the explanation.

Coleen

On 11/5/14, 7:35 PM, Jeremy Manson wrote:
>
>
> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore 
> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>> 
> wrote:
>
>
>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>
>>
>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>         The fix looks good in general.
>>
>>         src/share/vm/oops/method.cpp
>>
>>         1785   bool contains(Method** m) {
>>         1786     if (m == NULL) return false;
>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>>         1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>>         1789         ptrdiff_t idx = m - b->_methods;
>>         1790         if (b->_methods + idx == m) {
>>         1791           return true;
>>         1792         }
>>         1793       }
>>         1794     }
>>         1795     return false;  // not found
>>         1796   }
>>
>>
>>         Just noticed that the lines 1789-1792 can be replaced with
>>         one liner:
>>                 return true;
>>
>>
>>     Ah, you have found our crappy workaround for wild pointers to
>>     non-aligned places in the middle of _methods.
>
>     Can you explain this?  Why are there wild pointers?
>
>
> My belief was that end user code could pass any old garbage to this 
> function.  It's called by Method::is_method_id, which is called 
> by jniCheck::validate_jmethod_id.  The idea was that this would help 
> check jni deliver useful information in the case of the end user 
> inputting garbage that happened to be in the right memory range.
>
> Having said that, at a second glance, it looks as if that call is 
> protected by a call to is_method() (in checked_resolve_jmethod_id), so 
> the program will probably crash before it gets to this check.
>
> The other point about it was that the result of >= and < is 
> technically unspecified; if it were ever implemented as anything other 
> than a binary comparison between integers (which it never is, now that 
> no one has a segmented architecture), the comparison could pass 
> spuriously, so checking would be a good thing.  Of course, the 
> comparison could fail spuriously, too.
>
> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a 
> comment, obviously, since it has caused confusion), or take it out.  
> Your call.
>
> Jeremy


From david.holmes at oracle.com  Thu Nov  6 00:50:27 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 10:50:27 +1000
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545A770C.3030503@oracle.com>
References: <545A770C.3030503@oracle.com>
Message-ID: <545AC5D3.9090005@oracle.com>

On 6/11/2014 5:14 AM, Calvin Cheung wrote:
> While upgrading the compiler on Mac for jdk9, we found this compiler bug
> where it skips the following 2 lines of code in metaspaceShared.cpp when
> optimization is enabled (set to -Os) for the fastdebug and product builds.
>      strcat(class_list_path_str, os::file_separator());
>      strcat(class_list_path_str, "classlist");
>
> The bug is reproducible with Xcode 5.1.1 and 6.1.
>
> A workaround fix is to rewrite an "if" block in the
> MetaspaceShared::preload_and_dump() method.

Can't you simply replace the strcats with jio_snprintf and do away with 
the sub_path array?

Or even try strncat instead of strcat?
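
Something along these lines is what I have in mind. It is only a 
sketch: std::snprintf stands in for jio_snprintf so that it compiles 
outside the VM, and append_classlist is a made-up helper name, not 
something in metaspaceShared.cpp:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Appends "<sep>classlist" to the NUL-terminated string in path using a
// single bounded formatting call instead of two strcat calls.
// Returns true only if the whole suffix fit into the buffer.
inline bool append_classlist(char* path, std::size_t cap, const char* sep) {
  const std::size_t len = std::strlen(path);
  if (len >= cap) return false;  // no room left, not even for the NUL
  // Write at the current end of the string; snprintf never writes more
  // than (cap - len) bytes, including the terminating NUL.
  const int n = std::snprintf(path + len, cap - len, "%sclasslist", sep);
  return n >= 0 && static_cast<std::size_t>(n) < cap - len;
}
```

Besides avoiding the code the compiler mishandles, this form also 
reports truncation, which the strcat version cannot do.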

David

> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>
> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>
> Testing:
>      JPRT
>      The affected testcase with product, fastdebug, and debug builds
> built with Xcode 5.1.1 and 6.1.
>
> thanks,
> Calvin

From serguei.spitsyn at oracle.com  Thu Nov  6 01:11:00 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 05 Nov 2014 17:11:00 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
Message-ID: <545ACAA4.3020906@oracle.com>


On 11/5/14 4:35 PM, Jeremy Manson wrote:
>
>
> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore 
> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>> 
> wrote:
>
>
>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>
>>
>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>         The fix looks good in general.
>>
>>         src/share/vm/oops/method.cpp
>>
>>         1785   bool contains(Method** m) {
>>         1786     if (m == NULL) return false;
>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>>         1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>>         *1789         ptrdiff_t idx = m - b->_methods;**
>>         **1790         if (b->_methods + idx == m) {**
>>         1791           return true;
>>         1792         }*
>>         1793       }
>>         1794     }
>>         1795     return false;  // not found
>>         1796   }
>>
>>
>>         Just noticed that the lines 1789-1792 can be replaced with
>>         one liner:
>>         *        return true;*
>>
>>
>>     Ah, you have found our crappy workaround for wild pointers to
>>     non-aligned places in the middle of _methods.
>
>     Can you explain this?  Why are there wild pointers?
>
>
> My belief was that end user code could pass any old garbage to this 
> function.  It's called by Method::is_method_id, which is called 
> by jniCheck::validate_jmethod_id.  The idea was that this would help 
> check jni deliver useful information in the case of the end user 
> inputting garbage that happened to be in the right memory range.
>
> Having said that, at a second glance, it looks as if that call is 
> protected by a call to is_method() (in checked_resolve_jmethod_id), so 
> the program will probably crash before it gets to this check.
>
> The other point about it was that the result of >= and < is 
> technically unspecified; if it were ever implemented as anything other 
> than a binary comparison between integers (which it never is, now that 
> no one has a segmented architecture), the comparison could pass 
> spuriously, so checking would be a good thing.  Of course, the 
> comparison could fail spuriously, too.
>
> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a 
> comment, obviously, since it has caused confusion), or take it out.  
> Your call.

I'm still confused.

How could this code possibly check anything?
    ptrdiff_t idx = m - b->_methods;
    if (b->_methods + idx == m) {

The condition above is always true:
    b->_methods + (idx) == b->_methods + (m - b->_methods) == 
(b->_methods - b->_methods) + m == (0 + m) == m

Even if m were unaligned, in the end we compare m with m, which is 
still true.
Am I missing anything?


Thanks,
Serguei

>
> Jeremy


From david.holmes at oracle.com  Thu Nov  6 01:34:43 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 11:34:43 +1000
Subject: RFR 8058715: stability issues when being launched as an embedded
	JVM via JNI
In-Reply-To: <545AAEEA.1070605@oracle.com>
References: <545A738D.2080201@oracle.com> <545AAEEA.1070605@oracle.com>
Message-ID: <545AD033.8060107@oracle.com>

On 6/11/2014 9:12 AM, David Holmes wrote:
> Hi David,
>
> On 6/11/2014 4:59 AM, david buck wrote:
>> Hi!
>>
>> This is a request for code review of my fix for jdk8058715
>>
>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
>> WEBREV: http://cr.openjdk.java.net/~dbuck/8058715/webrev.01/
>>
>> We have also received confirmation from the original reporter of the
>> issue that this solution resolves the crashes they were seeing in their
>> environment. I have tested that this change does not break the original
>> NX bug workaround. I also ran the NX bug reproducer (v8 benchmark of
>> Nashorn running in a loop) using a fastdebug build with the
>> -XX:NativeMemoryTracking=summary option. Obviously no crashes or other
>> issues were detected.
>
> The failure mode for this suggests we are lacking something when we
> attempt to reserve memory. I think that needs closer examination as we
> should not have something that leads to silent corruption followed by
> spurious failures!
>
> That aside I don't see how this "does not break the original NX bug
> workaround". We will skip the workaround if the memory reservation
> fails. Is it the case that in such circumstances we don't need the
> workaround?

Sorry, ignore this part. The original code was already bailing out if the 
reservation failed.

Thanks,
David


> Thanks,
> David H.
>
>> Cheers,
>> -Buck

From coleen.phillimore at oracle.com  Thu Nov  6 03:00:42 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 05 Nov 2014 22:00:42 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545ACAA4.3020906@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
	<545ACAA4.3020906@oracle.com>
Message-ID: <545AE45A.5080003@oracle.com>


On 11/5/14, 8:11 PM, serguei.spitsyn at oracle.com wrote:
>
> On 11/5/14 4:35 PM, Jeremy Manson wrote:
>>
>>
>> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore 
>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>> 
>> wrote:
>>
>>
>>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>>
>>>
>>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>
>>>         The fix looks good in general.
>>>
>>>         src/share/vm/oops/method.cpp
>>>
>>>         1785   bool contains(Method** m) {
>>>         1786     if (m == NULL) return false;
>>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>>>         1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>>>         *1789         ptrdiff_t idx = m - b->_methods;**
>>>         **1790         if (b->_methods + idx == m) {**
>>>         1791           return true;
>>>         1792         }*
>>>         1793       }
>>>         1794     }
>>>         1795     return false;  // not found
>>>         1796   }
>>>
>>>
>>>         Just noticed that the lines 1789-1792 can be replaced with
>>>         one liner:
>>>         *        return true;*
>>>
>>>
>>>     Ah, you have found our crappy workaround for wild pointers to
>>>     non-aligned places in the middle of _methods.
>>
>>     Can you explain this?  Why are there wild pointers?
>>
>>
>> My belief was that end user code could pass any old garbage to this 
>> function.  It's called by Method::is_method_id, which is called 
>> by jniCheck::validate_jmethod_id.  The idea was that this would help 
>> check jni deliver useful information in the case of the end user 
>> inputting garbage that happened to be in the right memory range.
>>
>> Having said that, at a second glance, it looks as if that call is 
>> protected by a call to is_method() (in checked_resolve_jmethod_id), 
>> so the program will probably crash before it gets to this check.
>>
>> The other point about it was that the result of >= and < is 
>> technically unspecified; if it were ever implemented as anything 
>> other than a binary comparison between integers (which it never is, 
>> now that no one has a segmented architecture), the comparison could 
>> pass spuriously, so checking would be a good thing.  Of course, the 
>> comparison could fail spuriously, too.
>>
>> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a 
>> comment, obviously, since it has caused confusion), or take it out.  
>> Your call.
>
> I'm still confused.
>
> How this code could possibly check anything?
>    ptrdiff_t idx = m - b->_methods;
>    if (b->_methods + idx == m) {
>
> The condition above always gives true:
>    b->_methods + (idx) == b->_methods + (m - b->_methods) == 
> (b->_methods- b->_methods) + m == (0 + m) == m
>
> Even if m was unaligned then at the end we compare m with m which is 
> still true.
> Do I miss anything?

If 'm' is unaligned we would fail this comparison:

(gdb) print &methods->_data[2]
$34 = (Method **) 0x7fffe0022440
(gdb) print &methods->_data[0]
$35 = (Method **) 0x7fffe0022430
(gdb) print 0x7fffe0022444 - 0x7fffe0022430
$32 = 20
(gdb) print 20/8
$33 = 2

If m is the misaligned address 0x7fffe0022444, idx would be 2 and the 
expression (b->_methods + idx) would evaluate to the aligned 
0x7fffe0022440, which is not equal to m.

But the code could check for misaligned m instead (or it would have 
already crashed).  I think all bets are off if the address space is 
segmented.
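
A minimal sketch of what such an explicit misalignment check might look 
like, assuming a flat address space and doing the range and alignment 
tests on integer addresses.  The helper name block_contains, its 
parameters base and count, and the dummy Method type are all 
hypothetical stand-ins for illustration, not the HotSpot code:

```cpp
#include <cstddef>
#include <cstdint>

struct Method { int dummy; };   // stand-in type, not the HotSpot class

// Hypothetical helper: true only if m points exactly at one of the
// `count` pointer slots starting at `base`.
bool block_contains(Method** base, std::size_t count, Method** m) {
  // Compare addresses as integers so the range test does not depend on
  // relational operators applied to unrelated pointers.
  std::uintptr_t lo = reinterpret_cast<std::uintptr_t>(base);
  std::uintptr_t hi = lo + count * sizeof(Method*);
  std::uintptr_t p  = reinterpret_cast<std::uintptr_t>(m);
  if (p < lo || p >= hi) return false;
  // A pointer into the middle of a slot leaves a non-zero remainder.
  return (p - lo) % sizeof(Method*) == 0;
}
```

This spells the alignment test out directly instead of relying on the 
idx round trip, at the cost of assuming that pointers convert to 
integers in the obvious way.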

The comment Jeremy added is:

       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
         // This is a bit of extra checking, for two reasons.  One is
         // that contains() deals with pointers that are passed in by
         // JNI code, so making sure that the pointer is aligned
         // correctly is valuable.  The other is that <= and < are
         // technically not defined on pointers, so the if guard can
         // pass spuriously; no modern compiler is likely to make that
         // a problem, though (and if one did, the guard could also
         // fail spuriously, which would be bad).
         ptrdiff_t idx = m - b->_methods;
         if (b->_methods + idx == m) {
           return true;
         }

Coleen
>
>
> Thanks,
> Serguei
>
> **
>>
>> Jeremy
>


From david.holmes at oracle.com  Thu Nov  6 03:11:18 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 13:11:18 +1000
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545AE45A.5080003@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>	<5459501C.4040807@oracle.com>	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>	<5459ADFB.4090808@oracle.com>	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>	<545AB561.9020204@oracle.com>	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>	<545ACAA4.3020906@oracle.com>
	<545AE45A.5080003@oracle.com>
Message-ID: <545AE6D6.4040401@oracle.com>

On 6/11/2014 1:00 PM, Coleen Phillimore wrote:
>
> On 11/5/14, 8:11 PM, serguei.spitsyn at oracle.com wrote:
>>
>> On 11/5/14 4:35 PM, Jeremy Manson wrote:
>>>
>>>
>>> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore
>>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
>>> wrote:
>>>
>>>
>>>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>>>
>>>>
>>>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>
>>>>         The fix looks good in general.
>>>>
>>>>         src/share/vm/oops/method.cpp
>>>>
>>>>         1785   bool contains(Method** m) {
>>>>         1786     if (m == NULL) return false;
>>>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b = b->_next) {
>>>>         1788       if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>>>>         *1789         ptrdiff_t idx = m - b->_methods;**
>>>>         **1790         if (b->_methods + idx == m) {**
>>>>         1791           return true;
>>>>         1792         }*
>>>>         1793       }
>>>>         1794     }
>>>>         1795     return false;  // not found
>>>>         1796   }
>>>>
>>>>
>>>>         Just noticed that the lines 1789-1792 can be replaced with
>>>>         one liner:
>>>>         *        return true;*
>>>>
>>>>
>>>>     Ah, you have found our crappy workaround for wild pointers to
>>>>     non-aligned places in the middle of _methods.
>>>
>>>     Can you explain this?  Why are there wild pointers?
>>>
>>>
>>> My belief was that end user code could pass any old garbage to this
>>> function.  It's called by Method::is_method_id, which is called
>>> by jniCheck::validate_jmethod_id.  The idea was that this would help
>>> check jni deliver useful information in the case of the end user
>>> inputting garbage that happened to be in the right memory range.
>>>
>>> Having said that, at a second glance, it looks as if that call is
>>> protected by a call to is_method() (in checked_resolve_jmethod_id),
>>> so the program will probably crash before it gets to this check.
>>>
>>> The other point about it was that the result of >= and < is
>>> technically unspecified; if it were ever implemented as anything
>>> other than a binary comparison between integers (which it never is,
>>> now that no one has a segmented architecture), the comparison could
>>> pass spuriously, so checking would be a good thing.  Of course, the
>>> comparison could fail spuriously, too.
>>>
>>> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a
>>> comment, obviously, since it has caused confusion), or take it out.
>>> Your call.
>>
>> I'm still confused.
>>
>> How this code could possibly check anything?
>>    ptrdiff_t idx = m - b->_methods;
>>    if (b->_methods + idx == m) {
>>
>> The condition above always gives true:
>>    b->_methods + (idx) == b->_methods + (m - b->_methods) ==
>> (b->_methods- b->_methods) + m == (0 + m) == m
>>
>> Even if m was unaligned then at the end we compare m with m which is
>> still true.
>> Do I miss anything?
>
> If 'm' is unaligned we would fail this comparison:
>
> (gdb) print &methods->_data[2]
> $34 = (Method **) 0x7fffe0022440
> (gdb) print &methods->_data[0]
> $35 = (Method **) 0x7fffe0022430
> (gdb) print 0x7fffe0022444 - 0x7fffe0022430
> $32 = 20

I was confused about this too. What we have here is pointer arithmetic, 
not regular arithmetic, so I'm assuming an unaligned value has to be 
adjusted before the actual difference is computed. So in practice:

m - b->_methods

is really

adjusted_for_alignment(m) - b->_methods

David
-----

> (gdb) print 20/8
> $33 = 2
>
> if m is misaligned 0x7fffe0022444 the idx would be 2 and the expression
> (b->_methods + idx) would evaluate to the aligned 0x7fffe0022440  so not
> equal m.
>
> But the code could check for misaligned m instead (or it would have
> already crashed).  I think all bets are off if the address space is
> segmented.
>
> The comment Jeremy added is:
>
>        if (b->_methods <= m && m < b->_methods + b->_number_of_methods) {
>          // This is a bit of extra checking, for two reasons.  One is
>          // that contains() deals with pointers that are passed in by
>          // JNI code, so making sure that the pointer is aligned
>          // correctly is valuable.  The other is that <= and > are
>          // technically not defined on pointers, so the if guard can
>          // pass spuriously; no modern compiler is likely to make that
>          // a problem, though (and if one did, the guard could also
>          // fail spuriously, which would be bad).
>          ptrdiff_t idx = m - b->_methods;
>          if (b->_methods + idx == m) {
>            return true;
>          }
>
> Coleen
>>
>>
>> Thanks,
>> Serguei
>>
>> **
>>>
>>> Jeremy
>>
>

From calvin.cheung at oracle.com  Thu Nov  6 04:28:34 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 05 Nov 2014 20:28:34 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545AC5D3.9090005@oracle.com>
References: <545A770C.3030503@oracle.com> <545AC5D3.9090005@oracle.com>
Message-ID: <545AF8F2.1010106@oracle.com>

On 11/5/2014 4:50 PM, David Holmes wrote:
> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>> While upgrading the compiler on Mac for jdk9, we found this compiler bug
>> where it skips the following 2 lines of code in metaspaceShared.cpp when
>> optimization is enabled (set to -Os) for the fastdebug and product 
>> builds.
>>      strcat(class_list_path_str, os::file_separator());
>>      strcat(class_list_path_str, "classlist");
>>
>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>
>> A workaround fix is to rewrite an "if" block in the
>> MetaspaceShared::preload_and_dump() method.
>
> Can't you simply replace the strcats with jio_snprintf and do away 
> with the sub_path array?
The following works. I'll do more testing before sending an updated webrev.

--- a/src/share/vm/memory/metaspaceShared.cpp
+++ b/src/share/vm/memory/metaspaceShared.cpp
@@ -713,12 +713,15 @@
      int class_list_path_len = (int)strlen(class_list_path_str);
      if (class_list_path_len >= 3) {
        if (strcmp(class_list_path_str + class_list_path_len - 3, "lib") != 0) {
-        strcat(class_list_path_str, os::file_separator());
-        strcat(class_list_path_str, "lib");
+        jio_snprintf(class_list_path_str + class_list_path_len,
+                     sizeof(class_list_path_str) - class_list_path_len,
+                     "%slib", os::file_separator());
        }
      }
-    strcat(class_list_path_str, os::file_separator());
-    strcat(class_list_path_str, "classlist");
+    class_list_path_len = (int)strlen(class_list_path_str);
+    jio_snprintf(class_list_path_str + class_list_path_len,
+                 sizeof(class_list_path_str) - class_list_path_len,
+                 "%sclasslist", os::file_separator());
      class_list_path = class_list_path_str;
    } else {
      class_list_path = SharedClassListFile;
>
> Or even try strncat instead of strcat?
I think jio_snprintf is better because it null terminates the string.
If I use strncat, I'll need to initialize the entire buffer to null.
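
For anyone without the HotSpot sources at hand, here is a rough sketch 
of the append pattern used in the diff above.  It uses the standard 
std::snprintf as a stand-in for jio_snprintf (an assumption: the two may 
differ in details such as the return value on truncation), and 
append_component is a hypothetical helper name, not something in the 
webrev:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: append sep + name to the NUL-terminated string
// already in buf.  Returns false if the result did not fit; buf is then
// truncated but still NUL-terminated.
bool append_component(char* buf, std::size_t buf_size,
                      const char* sep, const char* name) {
  std::size_t len = std::strlen(buf);
  if (len >= buf_size) return false;        // no room left, even for the NUL
  // Write at the current end of the string, limited to the remaining space.
  int n = std::snprintf(buf + len, buf_size - len, "%s%s", sep, name);
  return n >= 0 && static_cast<std::size_t>(n) < buf_size - len;
}
```

Recomputing the length before each append mirrors the second strlen 
call in the diff.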

thanks,
Calvin
>
> David
>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>
>> Testing:
>>      JPRT
>>      The affected testcase with product, fastdebug, and debug builds
>> built with Xcode 5.1.1 and 6.1.
>>
>> thanks,
>> Calvin


From coleen.phillimore at oracle.com  Thu Nov  6 05:02:01 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 06 Nov 2014 00:02:01 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545AE6D6.4040401@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>	<5459501C.4040807@oracle.com>	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>	<5459ADFB.4090808@oracle.com>	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>	<545AB561.9020204@oracle.com>	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>	<545ACAA4.3020906@oracle.com>
	<545AE45A.5080003@oracle.com> <545AE6D6.4040401@oracle.com>
Message-ID: <545B00C9.1070502@oracle.com>


David and Serguei (and Jeremy), see below.   Summary: I think Jeremy's 
code and comments are good.

On 11/5/14, 10:11 PM, David Holmes wrote:
> On 6/11/2014 1:00 PM, Coleen Phillimore wrote:
>>
>> On 11/5/14, 8:11 PM, serguei.spitsyn at oracle.com wrote:
>>>
>>> On 11/5/14 4:35 PM, Jeremy Manson wrote:
>>>>
>>>>
>>>> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore
>>>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
>>>> wrote:
>>>>
>>>>
>>>>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>>>>
>>>>>
>>>>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>
>>>>>         The fix looks good in general.
>>>>>
>>>>>         src/share/vm/oops/method.cpp
>>>>>
>>>>>         1785   bool contains(Method** m) {
>>>>>         1786     if (m == NULL) return false;
>>>>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b 
>>>>> = b->_next) {
>>>>>         1788       if (b->_methods <= m && m < b->_methods + 
>>>>> b->_number_of_methods) {
>>>>>         *1789         ptrdiff_t idx = m - b->_methods;**
>>>>>         **1790         if (b->_methods + idx == m) {**
>>>>>         1791           return true;
>>>>>         1792         }*
>>>>>         1793       }
>>>>>         1794     }
>>>>>         1795     return false;  // not found
>>>>>         1796   }
>>>>>
>>>>>
>>>>>         Just noticed that the lines 1789-1792 can be replaced with
>>>>>         one liner:
>>>>>         *        return true;*
>>>>>
>>>>>
>>>>>     Ah, you have found our crappy workaround for wild pointers to
>>>>>     non-aligned places in the middle of _methods.
>>>>
>>>>     Can you explain this?  Why are there wild pointers?
>>>>
>>>>
>>>> My belief was that end user code could pass any old garbage to this
>>>> function.  It's called by Method::is_method_id, which is called
>>>> by jniCheck::validate_jmethod_id.  The idea was that this would help
>>>> check jni deliver useful information in the case of the end user
>>>> inputting garbage that happened to be in the right memory range.
>>>>
>>>> Having said that, at a second glance, it looks as if that call is
>>>> protected by a call to is_method() (in checked_resolve_jmethod_id),
>>>> so the program will probably crash before it gets to this check.
>>>>
>>>> The other point about it was that the result of >= and < is
>>>> technically unspecified; if it were ever implemented as anything
>>>> other than a binary comparison between integers (which it never is,
>>>> now that no one has a segmented architecture), the comparison could
>>>> pass spuriously, so checking would be a good thing.  Of course, the
>>>> comparison could fail spuriously, too.
>>>>
>>>> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a
>>>> comment, obviously, since it has caused confusion), or take it out.
>>>> Your call.
>>>
>>> I'm still confused.
>>>
>>> How this code could possibly check anything?
>>>    ptrdiff_t idx = m - b->_methods;
>>>    if (b->_methods + idx == m) {
>>>
>>> The condition above always gives true:
>>>    b->_methods + (idx) == b->_methods + (m - b->_methods) ==
>>> (b->_methods- b->_methods) + m == (0 + m) == m
>>>
>>> Even if m was unaligned then at the end we compare m with m which is
>>> still true.
>>> Do I miss anything?
>>
>> If 'm' is unaligned we would fail this comparison:
>>
>> (gdb) print &methods->_data[2]
>> $34 = (Method **) 0x7fffe0022440
>> (gdb) print &methods->_data[0]
>> $35 = (Method **) 0x7fffe0022430
>> (gdb) print 0x7fffe0022444 - 0x7fffe0022430
>> $32 = 20
>
> I was confused about this too. What we have here is pointer 
> arithmetic, not regular arithmetic, so I'm assuming an unaligned value 
> has to be adjusted before the actual difference is computed. So in 
> practice:
>
> m - b->_methods
>
> is really
>
> adjusted_for_alignment(m) - b->_methods

It's not adjusted for alignment:

#include <cstddef>

extern "C" int printf(const char *,...);
class Method {
   int i ; int j; int k;
};

// Ten pointer-sized slots; only their addresses matter for the test.
Method* array[10] = { new Method(),new Method(),new Method(),new Method(),
                      new Method(),new Method(),new Method(),new Method(),
                      new Method(),new Method() };

// Reports whether array + (m - array) round-trips back to m.
void test(Method** m) {
    printf("m is 0x%p ", m);
    ptrdiff_t idx = m - array;
    if (array + idx == m) {
      printf("true %ld\n", idx);
    } else {
      printf("false %ld\n", idx);
    }
}
int main() {
   Method** xx = &array[3];
   test(xx);                            // aligned slot address
   test((Method**)(((char*)xx) - 2));   // two bytes short of the slot
}

cphilli% a.out
m is 0x0x601098 true 3
m is 0x0x601096 false 2


Coleen

>
> David
> -----
>
>> (gdb) print 20/8
>> $33 = 2
>>
>> if m is misaligned 0x7fffe0022444 the idx would be 2 and the expression
>> (b->_methods + idx) would evaluate to the aligned 0x7fffe0022440  so not
>> equal m.
>>
>> But the code could check for misaligned m instead (or it would have
>> already crashed).  I think all bets are off if the address space is
>> segmented.
>>
>> The comment Jeremy added is:
>>
>>        if (b->_methods <= m && m < b->_methods + 
>> b->_number_of_methods) {
>>          // This is a bit of extra checking, for two reasons. One is
>>          // that contains() deals with pointers that are passed in by
>>          // JNI code, so making sure that the pointer is aligned
>>          // correctly is valuable.  The other is that <= and > are
>>          // technically not defined on pointers, so the if guard can
>>          // pass spuriously; no modern compiler is likely to make that
>>          // a problem, though (and if one did, the guard could also
>>          // fail spuriously, which would be bad).
>>          ptrdiff_t idx = m - b->_methods;
>>          if (b->_methods + idx == m) {
>>            return true;
>>          }
>>
>> Coleen
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>> **
>>>>
>>>> Jeremy
>>>
>>


From david.holmes at oracle.com  Thu Nov  6 05:35:10 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 15:35:10 +1000
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545B00C9.1070502@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>	<5459501C.4040807@oracle.com>	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>	<5459ADFB.4090808@oracle.com>	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>	<545AB561.9020204@oracle.com>	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>	<545ACAA4.3020906@oracle.com>
	<545AE45A.5080003@oracle.com> <545AE6D6.4040401@oracle.com>
	<545B00C9.1070502@oracle.com>
Message-ID: <545B088E.20903@oracle.com>

On 6/11/2014 3:02 PM, Coleen Phillimore wrote:
>
> David and Serguei (and Jeremy), see below.   Summary: I think Jeremy's
> code and comments are good.
>
> On 11/5/14, 10:11 PM, David Holmes wrote:
>> On 6/11/2014 1:00 PM, Coleen Phillimore wrote:
>>>
>>> On 11/5/14, 8:11 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> On 11/5/14 4:35 PM, Jeremy Manson wrote:
>>>>>
>>>>>
>>>>> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore
>>>>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
>>>>> wrote:
>>>>>
>>>>>
>>>>>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>>>>>
>>>>>>
>>>>>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>>>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>>
>>>>>>         The fix looks good in general.
>>>>>>
>>>>>>         src/share/vm/oops/method.cpp
>>>>>>
>>>>>>         1785   bool contains(Method** m) {
>>>>>>         1786     if (m == NULL) return false;
>>>>>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b
>>>>>> = b->_next) {
>>>>>>         1788       if (b->_methods <= m && m < b->_methods +
>>>>>> b->_number_of_methods) {
>>>>>>         *1789         ptrdiff_t idx = m - b->_methods;**
>>>>>>         **1790         if (b->_methods + idx == m) {**
>>>>>>         1791           return true;
>>>>>>         1792         }*
>>>>>>         1793       }
>>>>>>         1794     }
>>>>>>         1795     return false;  // not found
>>>>>>         1796   }
>>>>>>
>>>>>>
>>>>>>         Just noticed that the lines 1789-1792 can be replaced with
>>>>>>         one liner:
>>>>>>         *        return true;*
>>>>>>
>>>>>>
>>>>>>     Ah, you have found our crappy workaround for wild pointers to
>>>>>>     non-aligned places in the middle of _methods.
>>>>>
>>>>>     Can you explain this?  Why are there wild pointers?
>>>>>
>>>>>
>>>>> My belief was that end user code could pass any old garbage to this
>>>>> function.  It's called by Method::is_method_id, which is called
>>>>> by jniCheck::validate_jmethod_id.  The idea was that this would help
>>>>> check jni deliver useful information in the case of the end user
>>>>> inputting garbage that happened to be in the right memory range.
>>>>>
>>>>> Having said that, at a second glance, it looks as if that call is
>>>>> protected by a call to is_method() (in checked_resolve_jmethod_id),
>>>>> so the program will probably crash before it gets to this check.
>>>>>
>>>>> The other point about it was that the result of >= and < is
>>>>> technically unspecified; if it were ever implemented as anything
>>>>> other than a binary comparison between integers (which it never is,
>>>>> now that no one has a segmented architecture), the comparison could
>>>>> pass spuriously, so checking would be a good thing.  Of course, the
>>>>> comparison could fail spuriously, too.
>>>>>
>>>>> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a
>>>>> comment, obviously, since it has caused confusion), or take it out.
>>>>> Your call.
>>>>
>>>> I'm still confused.
>>>>
>>>> How this code could possibly check anything?
>>>>    ptrdiff_t idx = m - b->_methods;
>>>>    if (b->_methods + idx == m) {
>>>>
>>>> The condition above always gives true:
>>>>    b->_methods + (idx) == b->_methods + (m - b->_methods) ==
>>>> (b->_methods- b->_methods) + m == (0 + m) == m
>>>>
>>>> Even if m was unaligned then at the end we compare m with m which is
>>>> still true.
>>>> Do I miss anything?
>>>
>>> If 'm' is unaligned we would fail this comparison:
>>>
>>> (gdb) print &methods->_data[2]
>>> $34 = (Method **) 0x7fffe0022440
>>> (gdb) print &methods->_data[0]
>>> $35 = (Method **) 0x7fffe0022430
>>> (gdb) print 0x7fffe0022444 - 0x7fffe0022430
>>> $32 = 20
>>
>> I was confused about this too. What we have here is pointer
>> arithmetic, not regular arithmetic, so I'm assuming an unaligned value
>> has to be adjusted before the actual difference is computed. So in
>> practice:
>>
>> m - b->_methods
>>
>> is really
>>
>> adjusted_for_alignment(m) - b->_methods
>
> It's not adjusted for alignment:

Right - now I get it. Pointer difference is an algebraic subtraction 
with "div sizeof what is pointed to". For aligned pointers there will be 
no remainder and adding back the difference to the initial pointer will 
yield the end pointer. But if one of the pointers is not aligned that is 
not the case.
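
To make that concrete, here is a small sketch that models the round trip 
with plain 64-bit integers, assuming a flat address space and the 
truncating division a compiler would typically emit for pointer 
subtraction.  The function name round_trips is hypothetical; the 
addresses in the usage note are the ones from Coleen's gdb session:

```cpp
#include <cstdint>

// Models `base + (m - base) == m` for elements of elem_size bytes.
bool round_trips(std::uint64_t base, std::uint64_t m, std::uint64_t elem_size) {
  // Byte distance between the two addresses.
  std::int64_t bytes = static_cast<std::int64_t>(m - base);
  // Pointer difference: byte distance divided by the element size,
  // with any remainder discarded.
  std::int64_t idx = bytes / static_cast<std::int64_t>(elem_size);
  // Adding the index back scales it by the element size again.
  std::uint64_t back = base + static_cast<std::uint64_t>(idx) * elem_size;
  return back == m;
}
```

With base 0x7fffe0022430 and an element size of 8, m = 0x7fffe0022440 
gives a byte distance of 16 and idx 2, so the round trip succeeds; 
m = 0x7fffe0022444 gives 20 bytes, idx 2 again, and the sum lands on 
0x7fffe0022440 rather than m.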

All rather icky.

Thanks,
David
----

> #include <cstddef>
>
> extern "C" int printf(const char *,...);
> class Method {
>    int i ; int j; int k;
> };
>
> Method* array[10] = { new Method(),new Method(),new Method(),new
> Method(),new Method(),n
> ew Method(),new Method(),new Method(),new Method(),new Method() };
>
> void test(Method** m) {
>     printf("m is 0x%p ", m);
>     ptrdiff_t idx = m - array;
>     if (array + idx == m) {
>       printf("true %ld\n", idx);
>     } else {
>       printf("false %ld\n", idx);
>     }
> }
> main() {
>    Method** xx = &array[3];
>    test(xx);
>    test((Method**)(((char*)xx) - 2));
> }
>
> cphilli% a.out
> m is 0x0x601098 true 3
> m is 0x0x601096 false 2
>
>
> Coleen
>
>>
>> David
>> -----
>>
>>> (gdb) print 20/8
>>> $33 = 2
>>>
>>> if m is misaligned 0x7fffe0022444 the idx would be 2 and the expression
>>> (b->_methods + idx) would evaluate to the aligned 0x7fffe0022440  so not
>>> equal m.
>>>
>>> But the code could check for misaligned m instead (or it would have
>>> already crashed).  I think all bets are off if the address space is
>>> segmented.
>>>
>>> The comment Jeremy added is:
>>>
>>>        if (b->_methods <= m && m < b->_methods +
>>> b->_number_of_methods) {
>>>          // This is a bit of extra checking, for two reasons. One is
>>>          // that contains() deals with pointers that are passed in by
>>>          // JNI code, so making sure that the pointer is aligned
>>>          // correctly is valuable.  The other is that <= and > are
>>>          // technically not defined on pointers, so the if guard can
>>>          // pass spuriously; no modern compiler is likely to make that
>>>          // a problem, though (and if one did, the guard could also
>>>          // fail spuriously, which would be bad).
>>>          ptrdiff_t idx = m - b->_methods;
>>>          if (b->_methods + idx == m) {
>>>            return true;
>>>          }
>>>
>>> Coleen
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Jeremy
>>>>
>>>
>

From jeremymanson at google.com  Thu Nov  6 05:44:53 2014
From: jeremymanson at google.com (Jeremy Manson)
Date: Wed, 5 Nov 2014 21:44:53 -0800
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545B088E.20903@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>
	<5457E36A.3020800@oracle.com> <54592FC2.7090406@oracle.com>
	<5459501C.4040807@oracle.com>
	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>
	<5459ADFB.4090808@oracle.com>
	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>
	<545AB561.9020204@oracle.com>
	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>
	<545ACAA4.3020906@oracle.com> <545AE45A.5080003@oracle.com>
	<545AE6D6.4040401@oracle.com> <545B00C9.1070502@oracle.com>
	<545B088E.20903@oracle.com>
Message-ID: <CAPYFHW1bzf67f0sLrc20WVHF8tuQJZ7RvBn2vscSqPrWPFTi+Q@mail.gmail.com>

Wow, go take care of my toddler for a few hours, come back, and all the
questions are answered for me!  Thanks, Coleen.

To be fair, the original code was actually correct (instead of, you know,
implementation-dependent-correct), so I feel a little weird about the whole
thing.
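
For comparison, a version that stays within what the C++ standard specifies might look like the following. This is only a sketch, not the code in method.cpp: points_at_slot is a made-up name, std::less is used because it yields a total order on pointers where the built-in relational operators on unrelated pointers do not have to, and the modulo on uintptr_t values assumes a flat address space where the integer difference is the byte offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Returns true if p points exactly at one of the count slots starting at base.
// Generic over the slot type so it does not depend on HotSpot's Method class.
template <typename T>
bool points_at_slot(const T* base, std::size_t count, const void* p) {
    const void* first = base;
    const void* last = base + count;   // one past the last slot
    std::less<const void*> before;     // total order, even for unrelated pointers
    if (before(p, first) || !before(p, last)) {
        return false;                  // outside [first, last)
    }
    // Integer arithmetic only: no pointer subtraction on a possibly
    // misaligned p.
    const std::uintptr_t offset =
        reinterpret_cast<std::uintptr_t>(p) -
        reinterpret_cast<std::uintptr_t>(first);
    return offset % sizeof(T) == 0;
}
```

A pointer two bytes before a valid slot falls inside the range but fails the modulo test, which is the same case the idx round-trip is meant to catch.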

Jeremy

On Wed, Nov 5, 2014 at 9:35 PM, David Holmes <david.holmes at oracle.com>
wrote:

> On 6/11/2014 3:02 PM, Coleen Phillimore wrote:
>
>>
>> David and Serguei (and Jeremy), see below.   Summary: I think Jeremy's
>> code and comments are good.
>>
>> On 11/5/14, 10:11 PM, David Holmes wrote:
>>
>>> On 6/11/2014 1:00 PM, Coleen Phillimore wrote:
>>>
>>>>
>>>> On 11/5/14, 8:11 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>>>
>>>>> On 11/5/14 4:35 PM, Jeremy Manson wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 5, 2014 at 3:40 PM, Coleen Phillimore
>>>>>> <coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>     On 11/5/14, 6:13 PM, Jeremy Manson wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     On Tue, Nov 4, 2014 at 8:56 PM, serguei.spitsyn at oracle.com
>>>>>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>>>
>>>>>>>         The fix looks good in general.
>>>>>>>
>>>>>>>         src/share/vm/oops/method.cpp
>>>>>>>
>>>>>>>         1785   bool contains(Method** m) {
>>>>>>>         1786     if (m == NULL) return false;
>>>>>>>         1787     for (JNIMethodBlockNode* b = &_head; b != NULL; b
>>>>>>> = b->_next) {
>>>>>>>         1788       if (b->_methods <= m && m < b->_methods +
>>>>>>> b->_number_of_methods) {
>>>>>>>         1789         ptrdiff_t idx = m - b->_methods;
>>>>>>>         1790         if (b->_methods + idx == m) {
>>>>>>>         1791           return true;
>>>>>>>         1792         }
>>>>>>>         1793       }
>>>>>>>         1794     }
>>>>>>>         1795     return false;  // not found
>>>>>>>         1796   }
>>>>>>>
>>>>>>>
>>>>>>>         Just noticed that the lines 1789-1792 can be replaced with
>>>>>>>         one liner:
>>>>>>>                 return true;
>>>>>>>
>>>>>>>
>>>>>>>     Ah, you have found our crappy workaround for wild pointers to
>>>>>>>     non-aligned places in the middle of _methods.
>>>>>>>
>>>>>>
>>>>>>     Can you explain this?  Why are there wild pointers?
>>>>>>
>>>>>>
>>>>>> My belief was that end user code could pass any old garbage to this
>>>>>> function.  It's called by Method::is_method_id, which is called
>>>>>> by jniCheck::validate_jmethod_id.  The idea was that this would help
>>>>>> check jni deliver useful information in the case of the end user
>>>>>> inputting garbage that happened to be in the right memory range.
>>>>>>
>>>>>> Having said that, at a second glance, it looks as if that call is
>>>>>> protected by a call to is_method() (in checked_resolve_jmethod_id),
>>>>>> so the program will probably crash before it gets to this check.
>>>>>>
>>>>>> The other point about it was that the result of >= and < is
>>>>>> technically unspecified; if it were ever implemented as anything
>>>>>> other than a binary comparison between integers (which it never is,
>>>>>> now that no one has a segmented architecture), the comparison could
>>>>>> pass spuriously, so checking would be a good thing.  Of course, the
>>>>>> comparison could fail spuriously, too.
>>>>>>
>>>>>> Anyway, I'm happy to leave it in as belt-and-suspenders (and add a
>>>>>> comment, obviously, since it has caused confusion), or take it out.
>>>>>> Your call.
>>>>>>
>>>>>
>>>>> I'm still confused.
>>>>>
>>>>> How could this code possibly check anything?
>>>>>    ptrdiff_t idx = m - b->_methods;
>>>>>    if (b->_methods + idx == m) {
>>>>>
>>>>> The condition above always gives true:
>>>>>    b->_methods + (idx) == b->_methods + (m - b->_methods) ==
>>>>> (b->_methods- b->_methods) + m == (0 + m) == m
>>>>>
>>>>> Even if m was unaligned then at the end we compare m with m which is
>>>>> still true.
>>>>> Am I missing anything?
>>>>>
>>>>
>>>> If 'm' is unaligned we would fail this comparison:
>>>>
>>>> (gdb) print &methods->_data[2]
>>>> $34 = (Method **) 0x7fffe0022440
>>>> (gdb) print &methods->_data[0]
>>>> $35 = (Method **) 0x7fffe0022430
>>>> (gdb) print 0x7fffe0022444 - 0x7fffe0022430
>>>> $32 = 20
>>>>
>>>
>>> I was confused about this too. What we have here is pointer
>>> arithmetic, not regular arithmetic, so I'm assuming an unaligned value
>>> has to be adjusted before the actual difference is computed. So in
>>> practice:
>>>
>>> m - b->_methods
>>>
>>> is really
>>>
>>> adjusted_for_alignment(m) - b->_methods
>>>
>>
>> It's not adjusted for alignment:
>>
>
> Right - now I get it. Pointer difference is an algebraic subtraction with
> "div sizeof what is pointed to". For aligned pointers there will be no
> remainder and adding back the difference to the initial pointer will yield
> the end pointer. But if one of the pointers is not aligned that is not the
> case.
>
> All rather icky.
>
> Thanks,
> David
> ----
>
>
>> #include <cstddef>
>>
>> extern "C" int printf(const char *,...);
>> class Method {
>>    int i ; int j; int k;
>> };
>>
>> Method* array[10] = { new Method(), new Method(), new Method(), new Method(),
>>                       new Method(), new Method(), new Method(), new Method(),
>>                       new Method(), new Method() };
>>
>> void test(Method** m) {
>>     printf("m is 0x%p ", m);
>>     ptrdiff_t idx = m - array;
>>     if (array + idx == m) {
>>       printf("true %ld\n", idx);
>>     } else {
>>       printf("false %ld\n", idx);
>>     }
>> }
>> main() {
>>    Method** xx = &array[3];
>>    test(xx);
>>    test((Method**)(((char*)xx) - 2));
>> }
>>
>> cphilli% a.out
>> m is 0x0x601098 true 3
>> m is 0x0x601096 false 2
>>
>>
>> Coleen
>>
>>
>>> David
>>> -----
>>>
>>>> (gdb) print 20/8
>>>> $33 = 2
>>>>
>>>> if m is misaligned 0x7fffe0022444 the idx would be 2 and the expression
>>>> (b->_methods + idx) would evaluate to the aligned 0x7fffe0022440 so not
>>>> equal m.
>>>>
>>>> But the code could check for misaligned m instead (or it would have
>>>> already crashed).  I think all bets are off if the address space is
>>>> segmented.
>>>>
>>>> The comment Jeremy added is:
>>>>
>>>>        if (b->_methods <= m && m < b->_methods +
>>>> b->_number_of_methods) {
>>>>          // This is a bit of extra checking, for two reasons. One is
>>>>          // that contains() deals with pointers that are passed in by
>>>>          // JNI code, so making sure that the pointer is aligned
>>>>          // correctly is valuable.  The other is that <= and > are
>>>>          // technically not defined on pointers, so the if guard can
>>>>          // pass spuriously; no modern compiler is likely to make that
>>>>          // a problem, though (and if one did, the guard could also
>>>>          // fail spuriously, which would be bad).
>>>>          ptrdiff_t idx = m - b->_methods;
>>>>          if (b->_methods + idx == m) {
>>>>            return true;
>>>>          }
>>>>
>>>> Coleen
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>
>>>>>
>>>>
>>

From david.simms at oracle.com  Thu Nov  6 07:46:01 2014
From: david.simms at oracle.com (David Simms)
Date: Thu, 06 Nov 2014 08:46:01 +0100
Subject: RFR 8058715: stability issues when being launched as an embedded
	JVM via JNI
In-Reply-To: <545A738D.2080201@oracle.com>
References: <545A738D.2080201@oracle.com>
Message-ID: <545B2739.4000807@oracle.com>


Patch looks good David.

Cheers

On 2014-11-05 19:59, david buck wrote:
> Hi!
>
> This is a request for code review of my fix for JDK-8058715.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
> WEBREV: http://cr.openjdk.java.net/~dbuck/8058715/webrev.01/
>
> We have also received confirmation from the original reporter of the 
> issue that this solution resolves the crashes they were seeing in 
> their environment. I have tested that this change does not break the 
> original NX bug workaround. I also ran the NX bug reproducer (v8 
> benchmark of Nashorn running in a loop) using a fastdebug build with 
> the -XX:NativeMemoryTracking=summary option. Obviously no crashes or 
> other issues were detected.
>
> Cheers,
> -Buck


From roland.westrelin at oracle.com  Thu Nov  6 09:53:17 2014
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 6 Nov 2014 10:53:17 +0100
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <545A5814.8000109@oracle.com>
References: <5450F261.60400@oracle.com> <545114DF.7040005@oracle.com>
	<54511744.4060904@oracle.com> <5451F43A.1010108@oracle.com>
	<5452128C.4090408@oracle.com> <54522805.5040701@oracle.com>
	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>
	<54522357.4070705@oracle.com>
	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>
	<5452425D.7040405@oracle.com>
	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>
	<5452517C.4050104@oracle.com> <54527E1E.1070507@oracle.com>
	<5452A786.7030000@oracle.com> <5452A077.2050903@oracle.com>
	<545A5F59.2020907@oracle.com> <545A5814.8000109@oracle.com>
Message-ID: <B34125A3-0A7D-4ABD-B610-543D3975E4E5@oracle.com>

> VladimirK, Roland, what do you think about (1)?

Looks ok to me.

Roland.

From david.buck at oracle.com  Thu Nov  6 10:21:38 2014
From: david.buck at oracle.com (david buck)
Date: Thu, 06 Nov 2014 19:21:38 +0900
Subject: [8u40] RFR backport 8058715: stability issues when being launched
	as an embedded JVM via JNI
Message-ID: <545B4BB2.9020701@oracle.com>

Hi!

This is a request for approval to backport this fix to jdk8. The jdk9 
change applies cleanly and I have already built and tested on 8.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
JDK9 changeset: 
http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6748f6322b92

Cheers,
-Buck

From david.holmes at oracle.com  Thu Nov  6 10:32:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 06 Nov 2014 20:32:06 +1000
Subject: [8u40] RFR backport 8058715: stability issues when being launched
	as an embedded JVM via JNI
In-Reply-To: <545B4BB2.9020701@oracle.com>
References: <545B4BB2.9020701@oracle.com>
Message-ID: <545B4E26.9030807@oracle.com>

Approved.

David H.

On 6/11/2014 8:21 PM, david buck wrote:
> Hi!
>
> This is a request for approval to backport this fix to jdk8. The jdk9
> change applies cleanly and I have already built and tested on 8.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058715
> JDK9 changeset:
> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6748f6322b92
>
> Cheers,
> -Buck

From martin.doerr at sap.com  Thu Nov  6 10:40:24 2014
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 6 Nov 2014 10:40:24 +0000
Subject: RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining
	is disabled: assert(dmw->is_neutral()) failed: invariant
In-Reply-To: <545AB32E.8070402@oracle.com>
References: <7C9B87B351A4BA4AA9EC95BB418116566ACE597E@DEWDFEMB19C.global.corp.sap>
	<B35EB47F-84F4-4FEA-9C9B-DFC12A87FA08@oracle.com>
	<545AB32E.8070402@oracle.com>
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116566ACE6BEC@DEWDFEMB19C.global.corp.sap>

Hi Vladimir,

thanks for replying quickly.

Are you sure you want the swap_reg_contains_mark flag to get removed?
There's a TODO in front of the changed line:
    // TODO: optimize away redundant LDs of obj->mark and improve the markword triage
    // order to reduce the number of conditional branches in the most common cases.
    // Beware -- there's a subtle invariant that fetch of the markword
    // at [FETCH], below, will never observe a biased encoding (*101b).
    // If this invariant is not held we risk exclusion (safety) failure.

So I'm not sure if the flag may be useful again when somebody works on this TODO.
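
For readers not familiar with the encoding, the "*101b" in that comment refers to the low three bits of the mark word, and the assert in the subject line checks for the neutral pattern. A minimal sketch of the two predicates involved follows. The constants are as I understand them from markOop.hpp and should be treated as assumptions; the helper names are invented for this example and are not HotSpot identifiers.

```cpp
#include <cstdint>

// Low-bit encodings of the mark word (assumed values, see lead-in).
constexpr std::uintptr_t kBiasedLockMask    = 0x7;  // lowest three bits
constexpr std::uintptr_t kUnlockedValue     = 0x1;  // ...001b: neutral
constexpr std::uintptr_t kBiasedLockPattern = 0x5;  // ...101b: biased

// Neutral means unlocked and not biased: the low three bits are 001b.
constexpr bool mark_is_neutral(std::uintptr_t mark) {
    return (mark & kBiasedLockMask) == kUnlockedValue;
}

// Biased encoding: the low three bits are 101b.
constexpr bool mark_has_bias_pattern(std::uintptr_t mark) {
    return (mark & kBiasedLockMask) == kBiasedLockPattern;
}
```

A word that ends in 101b can never satisfy mark_is_neutral, which is why code that expects a neutral displaced mark word must not be handed a biased one.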

Best regards,
Martin


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Donnerstag, 6. November 2014 00:31
To: Christian Thalinger; Doerr, Martin
Cc: hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime
Subject: Re: RFR (XS): 8062950: Bug in locking code when UseOptoBiasInlining is disabled: assert(dmw->is_neutral()) failed: invariant

It is our (Compiler group) code. This problem was introduced with my 
changes for RTM locking.

Martin, your changes are good. But could you clean up this code a bit, since we 
now never put the markword into tmpReg before this call?

Thanks,
Vladimir

On 11/5/14 3:13 PM, Christian Thalinger wrote:
> I'm not exactly sure who is our biased locking expert these days but I guess it's someone from the runtime team.  CC'ing them.
>
>> On Nov 5, 2014, at 7:38 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
>>
>> Hi,
>>
>> we found a bug in MacroAssembler::fast_lock on x86 which shows up when UseOptoBiasInlining is switched off.
>> The problem is that biased_locking_enter is used with swap_reg_contains_mark==true, which is no longer correct after biased_locking_enter was put in front of the check for IsInflated.
>>
>> Please review
>> http://cr.openjdk.java.net/~goetz/webrevs/8062950-lockBug/webrev.00/
>>
>> Best regards,
>> Martin
>

From aleksey.shipilev at oracle.com  Thu Nov  6 13:00:38 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 06 Nov 2014 16:00:38 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <525B0A18.8000105@oracle.com>
References: <525AC628.4020906@oracle.com>
	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>
	<525B0A18.8000105@oracle.com>
Message-ID: <545B70F6.60801@oracle.com>

Hi,

The Halloween is over, but here is a creepy undead patch from the past.
  http://cr.openjdk.java.net/~shade/8015272/webrev.01/
  https://bugs.openjdk.java.net/browse/JDK-8015272

8u does not need the patch, but it would be nice to have it in 9.

I have checked:
  * Still the same chunk of code as for non-@Contended cases
  * Still builds fine on Linux x86_64 fastdebug/release
  * Still passes all hotspot/test/runtime jtreg tests
  * Still passes the JPRT

Thanks,
-Aleksey.

On 10/14/2013 01:01 AM, Aleksey Shipilev wrote:
> Hi Christian,
> 
> Your call. I'm merely announcing the patch is ready. :)
> 
> -Aleksey.
> 
> On 10/13/2013 10:02 PM, Christian Tornqvist wrote:
>> Hi Aleksey
>>
>> We're well past Feature Complete (May 23) and Zero Bug Bounce is around the
>> corner, this is not the time to push enhancements. In my opinion this should
>> be done in the next 8u or in jdk9.
>>
>> Thanks,
>> Christian
>>
>> -----Original Message-----
>> From: hotspot-runtime-dev-bounces at openjdk.java.net
>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Aleksey
>> Shipilev
>> Sent: Sunday, October 13, 2013 12:11 PM
>> To: hotspot-runtime-dev at openjdk.java.net
>> Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to use
>> the same oop map
>>
>> Hi,
>>
>> Please review the simple improvement:
>>   http://cr.openjdk.java.net/~shade/8015272/webrev.00/
>>
>> I have copy-pasted the same block from the non-@Contended handling, because
>> it is generic for both cases. The change is also on the path which is
>> exercised only with @Contended with the same tag. Both @Contended
>> regression tests cover this case, as well as j.l.Thread containing
>> @Contended over the TLR state implicitly tests it in every VM run.
>>
>> Testing:
>>  - Linux x86_64 fastdebug/release: all hotspot/test/runtime jtreg
>>  - JPRT full cycle against hotspot-rt
>>  - vm.quick (running)
>>
>> -Aleksey.
>>
> 


From coleen.phillimore at oracle.com  Thu Nov  6 13:49:12 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 06 Nov 2014 08:49:12 -0500
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: <545B088E.20903@oracle.com>
References: <CAPYFHW24hC4wAe6uYVOurrRc1QstN76njrOC-kdqza-dVSwtnQ@mail.gmail.com>	<5457E36A.3020800@oracle.com>
	<54592FC2.7090406@oracle.com>	<5459501C.4040807@oracle.com>	<CAPYFHW0n8VK7h_u9Deh_NsxNUuqWUVg5hOhOEetsLk9erBEGCA@mail.gmail.com>	<5459ADFB.4090808@oracle.com>	<CAPYFHW2ByP5v3Se6WjcN-QOABUqFBJwoaUjAu1h9AcNNV-1EyA@mail.gmail.com>	<545AB561.9020204@oracle.com>	<CAPYFHW3SQaCZpv64ad=yaNJxh2xGLn6ZX+R-tEWfKFZxGVhgNA@mail.gmail.com>	<545ACAA4.3020906@oracle.com>
	<545AE45A.5080003@oracle.com> <545AE6D6.4040401@oracle.com>
	<545B00C9.1070502@oracle.com> <545B088E.20903@oracle.com>
Message-ID: <545B7C58.8070404@oracle.com>


David, you didn't recommend taking the code out, because it looked like 
something that would trick people, so we'll leave it in.   It's benign. 
The rest of the change improves performance, which we want.

Thanks,
Coleen

On 11/6/14, 12:35 AM, David Holmes wrote:
> Right - now I get it. Pointer difference is an algebraic subtraction 
> with "div sizeof what is pointed to". For aligned pointers there will 
> be no remainder and adding back the difference to the initial pointer 
> will yield the end pointer. But if one of the pointers is not aligned 
> that is not the case.
>
> All rather icky.
>
> Thanks,
> David
> ----


From karen.kinnear at oracle.com  Thu Nov  6 15:01:30 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 6 Nov 2014 10:01:30 -0500
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <545B70F6.60801@oracle.com>
References: <525AC628.4020906@oracle.com>
	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>
	<525B0A18.8000105@oracle.com> <545B70F6.60801@oracle.com>
Message-ID: <7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>

I agree with Christian that it is too late for jdk8u.

Could you please do additional testing  
- e.g. what about the vmtestbase vm/runtime/contended tests (and yes, some tests should be removed from that testlist)
- vmtestbase: vm.quick.testlist (required for runtime changes)
- and since @Contended annotation is used in the JDK core libraries - the jtreg jdk tests?

Does @Contended sometimes run into platform-specific bugs? Looking through earlier bugtails
I see bugs only filed against specific platforms, but it is not clear to me if the bugs also were seen
on other platforms and not recorded.

So the question is - is this a feature that needs testing on multiple platforms?

thanks,
Karen

On Nov 6, 2014, at 8:00 AM, Aleksey Shipilev wrote:

> Hi,
> 
> The Halloween is over, but here is a creepy undead patch from the past.
>  http://cr.openjdk.java.net/~shade/8015272/webrev.01/
>  https://bugs.openjdk.java.net/browse/JDK-8015272
> 
> 8u does not need the patch, but it would be nice to have it in 9.
> 
> I have checked:
>  * Still the same chunk of code as for non-@Contended cases
>  * Still builds fine on Linux x86_64 fastdebug/release
>  * Still passes all hotspot/test/runtime jtreg tests
>  * Still passes the JPRT
> 
> Thanks,
> -Aleksey.
> 
> On 10/14/2013 01:01 AM, Aleksey Shipilev wrote:
>> Hi Christian,
>> 
>> Your call. I'm merely announcing the patch is ready. :)
>> 
>> -Aleksey.
>> 
>> On 10/13/2013 10:02 PM, Christian Tornqvist wrote:
>>> Hi Aleksey
>>> 
>>> We're well past Feature Complete (May 23) and Zero Bug Bounce is around the
>>> corner, this is not the time to push enhancements. In my opinion this should
>>> be done in the next 8u or in jdk9.
>>> 
>>> Thanks,
>>> Christian
>>> 
>>> -----Original Message-----
>>> From: hotspot-runtime-dev-bounces at openjdk.java.net
>>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Aleksey
>>> Shipilev
>>> Sent: Sunday, October 13, 2013 12:11 PM
>>> To: hotspot-runtime-dev at openjdk.java.net
>>> Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to use
>>> the same oop map
>>> 
>>> Hi,
>>> 
>>> Please review the simple improvement:
>>>  http://cr.openjdk.java.net/~shade/8015272/webrev.00/
>>> 
>>> I have copy-pasted the same block from the non-@Contended handling, because
>>> it is generic for both cases. The change is also on the path which is
>>> exercised only with @Contended with the same tag. Both @Contended
>>> regression tests cover this case, as well as j.l.Thread containing
>>> @Contended over the TLR state implicitly tests it in every VM run.
>>> 
>>> Testing:
>>> - Linux x86_64 fastdebug/release: all hotspot/test/runtime jtreg
>>> - JPRT full cycle against hotspot-rt
>>> - vm.quick (running)
>>> 
>>> -Aleksey.
>>> 
>> 
> 
> 


From aleksey.shipilev at oracle.com  Thu Nov  6 16:07:28 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 06 Nov 2014 19:07:28 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>
References: <525AC628.4020906@oracle.com>
	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>
	<525B0A18.8000105@oracle.com> <545B70F6.60801@oracle.com>
	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>
Message-ID: <545B9CC0.3080106@oracle.com>

Hi Karen,

Thanks for looking into this.

On 11/06/2014 06:01 PM, Karen Kinnear wrote:
> I agree with Christian that it is too late for jdk8u.

...and I am not advocating for inclusion in jdk8u, as you can see in
my message from earlier today. This minor cleanup may be done in jdk9 only,
so the jdk8u schedule does not apply.

> Could you please do additional testing  

Sure, that would take a while. I'm not sure why we need to burn time on
this trivial change, since the code is copied 1:1 from the same
well-exercised codepath for non-@Contended oops, and additionally
exercised by runtime jtreg tests.

Anyhow:

> - e.g. what about the vmtestbase vm/runtime/contended tests (and yes, some tests should be removed from that testlist)

Submitted the test job, no progress yet.

> - vmtestbase: vm.quick.testlist (required for runtime changes)

Submitted the test job, no progress yet.

> - and since @Contended annotation is used in the JDK core libraries - the jtreg jdk tests?

There are two groups of @Contended users: java/lang/Thread and
java/util/concurrent/*. jdk/test/java/lang and
jdk/test/java/util/concurrent jtreg tests yield no new failures on my
Linux x86_64/fastdebug. The testing jobs submitted above should run them
on all platforms.


> Does @Contended sometimes run into platform-specific bugs? Looking through earlier bugtails
> I see bugs only filed against specific platforms, but it is not clear to me if the bugs also were seen
> on other platforms and not recorded.

@Contended handling code is platform-agnostic, and we haven't seen any
platform-specific bugs there.

> So the question is - is this a feature that needs testing on multiple platforms?

No, I don't think so.

Thanks,
-Aleksey.

> thanks,
> Karen
> 
> On Nov 6, 2014, at 8:00 AM, Aleksey Shipilev wrote:
> 
>> Hi,
>>
>> The Halloween is over, but here is a creepy undead patch from the past.
>>  http://cr.openjdk.java.net/~shade/8015272/webrev.01/
>>  https://bugs.openjdk.java.net/browse/JDK-8015272
>>
>> 8u does not need the patch, but it would be nice to have it in 9.
>>
>> I have checked:
>>  * Still the same chunk of code as for non-@Contended cases
>>  * Still builds fine on Linux x86_64 fastdebug/release
>>  * Still passes all hotspot/test/runtime jtreg tests
>>  * Still passes the JPRT
>>
>> Thanks,
>> -Aleksey.
>>
>> On 10/14/2013 01:01 AM, Aleksey Shipilev wrote:
>>> Hi Christian,
>>>
>>> Your call. I'm merely announcing the patch is ready. :)
>>>
>>> -Aleksey.
>>>
>>> On 10/13/2013 10:02 PM, Christian Tornqvist wrote:
>>>> Hi Aleksey
>>>>
>>>> We're well past Feature Complete (May 23) and Zero Bug Bounce is around the
>>>> corner, this is not the time to push enhancements. In my opinion this should
>>>> be done in the next 8u or in jdk9.
>>>>
>>>> Thanks,
>>>> Christian
>>>>
>>>> -----Original Message-----
>>>> From: hotspot-runtime-dev-bounces at openjdk.java.net
>>>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Aleksey
>>>> Shipilev
>>>> Sent: Sunday, October 13, 2013 12:11 PM
>>>> To: hotspot-runtime-dev at openjdk.java.net
>>>> Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to use
>>>> the same oop map
>>>>
>>>> Hi,
>>>>
>>>> Please review the simple improvement:
>>>>  http://cr.openjdk.java.net/~shade/8015272/webrev.00/
>>>>
>>>> I have copy-pasted the same block from the non-@Contended handling, because
>>>> it is generic for both cases. The change is also on the path which is
>>>> exercised only with @Contended with the same tag. Both @Contended
>>>> regression tests cover this case, as well as j.l.Thread containing
>>>> @Contended over the TLR state implicitly tests it in every VM run.
>>>>
>>>> Testing:
>>>> - Linux x86_64 fastdebug/release: all hotspot/test/runtime jtreg
>>>> - JPRT full cycle against hotspot-rt
>>>> - vm.quick (running)
>>>>
>>>> -Aleksey.
>>>>
>>>
>>
>>
> 


From andreas.eriksson at oracle.com  Thu Nov  6 16:38:25 2014
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Thu, 06 Nov 2014 17:38:25 +0100
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the bootclasspath
	could lead to jvm fatal error
In-Reply-To: <545B797D.70907@oracle.com>
References: <545B797D.70907@oracle.com>
Message-ID: <545BA401.1070205@oracle.com>

Hi,

Could someone please review this jdk7 backport of JDK-8020675 
<https://bugs.openjdk.java.net/browse/JDK-8020675>.
Summary:
invalid jar file in the bootclasspath could lead to jvm fatal error
removed offending EXCEPTION_MARK calls and code cleanup

One code change necessary for the backport was in method 
ClassLoader::load_classfile.
The change was to use CHECK_(instanceKlassHandle()) instead of CHECK_NULL.
See the mail thread at 
http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
for more information.

Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/

Regards,
Andreas


From daniel.daugherty at oracle.com  Thu Nov  6 18:01:45 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 11:01:45 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545ABDF1.6050107@oracle.com>
References: <5459A8ED.8060808@oracle.com> <545A4719.50705@oracle.com>
	<545ABDF1.6050107@oracle.com>
Message-ID: <545BB789.2050906@oracle.com>

On 11/5/14 5:16 PM, David Holmes wrote:
> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>> Hi,
>>
>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>
>>> This fix was spun off from the Contended Locking fast enter bucket
>>> which was sent out for review late last week. This fix cleans up
>>> the computation of ObjectMonitor field pointers and gets rid of
>>> the use of literal '-2' in appropriate places. For example:
>>>
>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>> Rscratch);
>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>
>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>> specified field and subtracts markOopDesc::monitor_value (2).
>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>
>> any reason not to add it as a function in objectMonitor.hpp instead of a
>> macro? How about:
>>
>>    static int no_monitor_offset_in_bytes()  { return
>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>
> _owner is not the only field used so you would need a function for 
> each one.

David, thanks for jumping in on this part of the review thread.
I ended up being off-the-air yesterday from mid-morning on.

Claes, David is correct that your suggestion would require a function
for each field that we use currently and for completeness should have
a function for each field that has an offset_in_bytes() function. We
have 12 fields in ObjectMonitor for which we provide an offset_in_bytes()
so I don't think we want to go down that route.
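
To illustrate the difference, here is a sketch only: DemoMonitor, its three fields, and DEMO_OM_OFFSET_NO_MONITOR_VALUE are invented stand-ins, not the definitions in objectMonitor.hpp. A single macro can paste the field name into the existing accessor name, whereas the function approach needs one hand-written function per field:

```cpp
#include <cstddef>

// Invented stand-in for ObjectMonitor with only three fields.
struct DemoMonitor {
    void* _header;
    void* _object;
    void* _owner;

    static int header_offset_in_bytes() {
        return static_cast<int>(offsetof(DemoMonitor, _header));
    }
    static int object_offset_in_bytes() {
        return static_cast<int>(offsetof(DemoMonitor, _object));
    }
    static int owner_offset_in_bytes() {
        return static_cast<int>(offsetof(DemoMonitor, _owner));
    }
};

// Stand-in for markOopDesc::monitor_value (2).
constexpr int kDemoMonitorValue = 2;

// One macro covers every field that has an *_offset_in_bytes() accessor.
#define DEMO_OM_OFFSET_NO_MONITOR_VALUE(f) \
    (DemoMonitor::f##_offset_in_bytes() - kDemoMonitorValue)

// The function alternative: this one handles _owner only, and each further
// field would need its own copy.
inline int demo_owner_no_monitor_offset_in_bytes() {
    return DemoMonitor::owner_offset_in_bytes() - kDemoMonitorValue;
}
```

With 12 such accessors, the function approach would mean 12 near-identical functions, while the macro stays a single definition.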

Dan


>
> David
> -----
>
>> Example usage:
>>
>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>> Rscratch);
>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>> Rscratch);
>>
>>
>> Seems this should be inlined regardless and looks a bit cleaner to me.
>>
>> Thanks!
>>
>> /Claes
>>
>>>
>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>> motivated this (long overdue) cleanup.
>>>
>>> This work is being tracked by the following bug ID:
>>>
>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>
>>> Here is the JEP link:
>>>
>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>
>>> Testing:
>>>
>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>
>


From daniel.daugherty at oracle.com  Thu Nov  6 18:02:35 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 11:02:35 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545A4719.50705@oracle.com>
References: <5459A8ED.8060808@oracle.com> <545A4719.50705@oracle.com>
Message-ID: <545BB7BB.6050202@oracle.com>

On 11/5/14 8:49 AM, Claes Redestad wrote:
> Hi,
>
> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Contended Locking cleanup bucket fix ready for review.
>>
>> This fix was spun off from the Contended Locking fast enter bucket
>> which was sent out for review late last week. This fix cleans up
>> the computation of ObjectMonitor field pointers and gets rid of
>> the use of literal '-2' in appropriate places. For example:
>>
>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2, 
>> Rscratch);
>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>
>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>> specified field and subtracts markOopDesc::monitor_value (2).
>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>
> any reason not to add it as a function in objectMonitor.hpp instead of 
> a macro? How about:
>
>   static int no_monitor_offset_in_bytes()  { return 
> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>
> Example usage:
>
> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2, 
> Rscratch);
> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(), 
> Rscratch);
>
>
> Seems this should be inlined regardless and looks a bit cleaner to me.

Claes, thanks for reviewing!

Please see my reply to David H where I pick up your comments
(and hopefully resolve them).

Dan


>
> Thanks!
>
> /Claes
>
>>
>> Thanks to David Holmes for his comments on JDK-8061553 that
>> motivated this (long overdue) cleanup.
>>
>> This work is being tracked by the following bug ID:
>>
>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>
>> Here is the JEP link:
>>
>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>
>> Testing:
>>
>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>
>


From daniel.daugherty at oracle.com  Thu Nov  6 18:10:14 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 11:10:14 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545A6E83.8060909@oracle.com>
References: <5459A8ED.8060808@oracle.com> <5459FF11.1080801@oracle.com>
	<545A4274.6090409@oracle.com> <545A6E83.8060909@oracle.com>
Message-ID: <545BB986.3000803@oracle.com>

On 11/5/14 11:37 AM, Coleen Phillimore wrote:
>
> Dan,  I had a look at this change too.

Thanks for reviewing!


> On 11/5/14, 10:29 AM, Daniel D. Daugherty wrote:
>> On 11/5/14 3:42 AM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> Reviewed.
>>
>> Thanks!
>>
>>
>>> I find the name OM_OFFSET_NO_MONITOR_VALUE somewhat awkward but have 
>>> no better suggestion.
>>
>> Understood. I didn't like the original "OFFSET_SKEWED" name
>> especially since I was moving it to objectMonitor.hpp...
>>
>> If you think of a better name, let me know... we can always change it.
>>
>
> So the -2 was a tag?  Then maybe a better name is UNTAGGED_OM_OFFSET 
> ..  Weird stuff anyway.

Yes, the '2' is one of the markWord encodings so we have to remove it
in order to have a proper pointer. Definitely weird stuff.
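
For anyone following along, here is a minimal, standalone sketch of the
tagging idea. It is not HotSpot code: MonitorSketch, kMonitorValue,
encode_sketch(), OFFSET_NO_TAG_SKETCH and owner_from_tagged() are
hypothetical names made up for illustration. The only things taken from
the thread are the tag value (2) and the shape of the encode() method
quoted from markOop.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; the real class has many more fields.
struct MonitorSketch {
  void* header;
  void* owner;
};

// Assumed to play the role of markOopDesc::monitor_value (2).
const std::intptr_t kMonitorValue = 2;

// Same shape as the encode() quoted in this thread: OR the tag into the address.
inline std::intptr_t encode_sketch(MonitorSketch* m) {
  return reinterpret_cast<std::intptr_t>(m) | kMonitorValue;
}

// Offset of field f relative to the *tagged* value, so the tag is cancelled
// by the offset itself instead of by a separate untagging instruction.
#define OFFSET_NO_TAG_SKETCH(f) \
  (static_cast<int>(offsetof(MonitorSketch, f)) - static_cast<int>(kMonitorValue))

// Load the owner field directly from the tagged value.
inline void* owner_from_tagged(std::intptr_t tagged) {
  assert((tagged & 3) == kMonitorValue);  // low two bits must hold the tag
  char* base = reinterpret_cast<char*>(tagged);
  return *reinterpret_cast<void**>(base + OFFSET_NO_TAG_SKETCH(owner));
}
```

The sketch relies on the monitor being at least 4-byte aligned, so that
OR-ing in the tag is the same as adding 2 to the address.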


> In 
> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/src/cpu/x86/vm/macroAssembler_x86.cpp.udiff.html
>
> Can you make the whitespace changes to the lines you've changed:
>
> +         movptr(tmpReg, Address (tmpReg, 
> OM_OFFSET_NO_MONITOR_VALUE(owner)));   // rax, = m->_owner
>
>
> to
>
> +         movptr(tmpReg, Address(tmpReg, 
> OM_OFFSET_NO_MONITOR_VALUE(owner)));   // rax, = m->_owner

Yes, I'll make some whitespace cleanups to the lines that I touch.


> In general, this looks like a great improvement not subtracting two 
> from seemingly random places in assembly code.

Thanks go to David H. for noticing the cleanup note that was
left in the code previously. My only tweak to it was putting the
macro in place where it could be shared by different CPU impls
and hopefully I improved the comment. :-)

Dan


>
> thanks,
> Coleen
>
>>
>>
>>> In fact I have to ask what _is_ the object monitor tagging 
>>> mechanism? I can't see it defined in the objectMonitor.* files. ??
>>
>> That would be this code:
>>
>> src/share/vm/oops/markOop.hpp:
>>
>>     317   static markOop encode(ObjectMonitor* monitor) {
>>     318     intptr_t tmp = (intptr_t) monitor;
>>     319     return (markOop) (tmp | monitor_value);
>>     320   }
>>
>> and the other methods in that file that have to account for
>> the monitor_value being set...
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> David
>>>
>>> On 5/11/2014 2:34 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>
>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>> which was sent out for review late last week. This fix cleans up
>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>> the use of literal '-2' in appropriate places. For example:
>>>>
>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>> Rscratch);
>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>>
>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>
>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>> motivated this (long overdue) cleanup.
>>>>
>>>> This work is being tracked by the following bug ID:
>>>>
>>>>      JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>      https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>
>>>> Here is the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>
>>>> Here is the JEP link:
>>>>
>>>>      https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>
>>>> Testing:
>>>>
>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>
>
>


From calvin.cheung at oracle.com  Thu Nov  6 18:12:47 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 06 Nov 2014 10:12:47 -0800
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the
	bootclasspath could lead to jvm fatal error
In-Reply-To: <545BA401.1070205@oracle.com>
References: <545B797D.70907@oracle.com> <545BA401.1070205@oracle.com>
Message-ID: <545BBA1F.3040301@oracle.com>

Hi Andreas,

The change looks good.
There should be a dummy.jar to go with the test cases.
http://cr.openjdk.java.net/~ccheung/8020675/webrev.02/

The webrev won't show any diffs for the jar file but don't forget to 
include it when you push the fix.

thanks,
Calvin

On 11/6/2014 8:38 AM, Andreas Eriksson wrote:
> Hi,
>
> Could someone please review this jdk7 backport of JDK-8020675 
> <https://bugs.openjdk.java.net/browse/JDK-8020675>.
> Summary:
> invalid jar file in the bootclasspath could lead to jvm fatal error
> removed offending EXCEPTION_MARK calls and code cleanup
>
> One code change necessary for the backport was in method 
> ClassLoader::load_classfile.
> The change was to use CHECK_(instanceKlassHandle()) instead of 
> CHECK_NULL.
> See the mail thread at 
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
> for more information.
>
> Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/
>
> Regards,
> Andreas
>
>


From daniel.daugherty at oracle.com  Thu Nov  6 18:13:46 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 11:13:46 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545AC1FC.8010905@oracle.com>
References: <5459A8ED.8060808@oracle.com>
	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>
	<545ABF8E.1050408@oracle.com> <545AC1FC.8010905@oracle.com>
Message-ID: <545BBA5A.9000007@oracle.com>

On 11/5/14 5:34 PM, David Holmes wrote:
> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>
>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>> Hi,
>>>>
>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>
>>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>>> which was sent out for review late last week. This fix cleans up
>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>
>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>> Rscratch);
>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), Rscratch);
>>>>>
>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>
>>>> any reason not to add it as a function in objectMonitor.hpp instead 
>>>> of a
>>>> macro? How about:
>>>>
>>>>    static int no_monitor_offset_in_bytes()  { return
>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>
>>> _owner is not the only field used so you would need a function for
>>> each one.
>>
>> I thought this would be better too.   There are only 6 functions (6
>> lines) max that need this.  It would look nicer.
>
> Only changes an upper case macro name to a lower case function name.

As I mentioned in my reply to Claes, there are 12 offset_in_bytes()
functions, so for completeness we would add 12 new functions. I really
don't want to do that.


>
>> My suggestion would be to make them static int
>> untagged_offset_in_bytes() or whatever monitor_value is.  It's not a
>> very descriptive name so better to name the functions after what it's 
>> for.
>
> You need the field name included in the function name:
>
> untagged_offset_of_owner()
> untagged_offset_of_xxx()
>
> but it is only untagged if the OM is currently inflated, so then:
>
> untagged_offset_of_XXX_for_inflated_om()
>
> I can live with Dan's macro (which is an improvement on the original).

Thanks! I'm planning to stick with the macro which has a much smaller
footprint for such a strange little task...
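
To illustrate the footprint argument, here is a rough sketch of how a
single token-pasting macro can cover every existing accessor. This is
an illustration under assumptions, not the code in the webrev:
OmSketch, kTagSketch and OM_OFFSET_NO_TAG_SKETCH are hypothetical
names, and only three of the offset accessors are modeled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, cut-down stand-in for ObjectMonitor with three of its
// *_offset_in_bytes() accessors.
struct OmSketch {
  void* _header;
  void* _owner;
  void* _cxq;
  static int header_offset_in_bytes() { return static_cast<int>(offsetof(OmSketch, _header)); }
  static int owner_offset_in_bytes()  { return static_cast<int>(offsetof(OmSketch, _owner)); }
  static int cxq_offset_in_bytes()    { return static_cast<int>(offsetof(OmSketch, _cxq)); }
};

// Assumed to stand in for markOopDesc::monitor_value (2).
const int kTagSketch = 2;

// One macro reuses whichever accessor already exists for field f, so no
// per-field "untagged" function has to be added.
#define OM_OFFSET_NO_TAG_SKETCH(f) (OmSketch::f ## _offset_in_bytes() - kTagSketch)

// Example use: the owner offset relative to a tagged monitor address.
inline int owner_offset_from_tagged() {
  int off = OM_OFFSET_NO_TAG_SKETCH(owner);
  assert(off == OmSketch::owner_offset_in_bytes() - kTagSketch);
  return off;
}
```

With the function approach, each of the fields above would need its own
extra accessor; with the macro, adding a field costs nothing beyond the
offset accessor it already has.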

Dan


>
> David
>
>> Coleen
>>
>>>
>>> David
>>> -----
>>>
>>>> Example usage:
>>>>
>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>> Rscratch);
>>>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>>>> Rscratch);
>>>>
>>>>
>>>> Seems this should be inlined regardless and looks a bit cleaner to me.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>>
>>>>>
>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>> motivated this (long overdue) cleanup.
>>>>>
>>>>> This work is being tracked by the following bug ID:
>>>>>
>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>
>>>>> Here is the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>
>>>>> Here is the JEP link:
>>>>>
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>
>>>>> Testing:
>>>>>
>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>>
>>
>
>
>


From daniel.daugherty at oracle.com  Thu Nov  6 18:19:07 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 11:19:07 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545BBA5A.9000007@oracle.com>
References: <5459A8ED.8060808@oracle.com>
	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>
	<545ABF8E.1050408@oracle.com> <545AC1FC.8010905@oracle.com>
	<545BBA5A.9000007@oracle.com>
Message-ID: <545BBB9B.5000807@oracle.com>

I just reread the entire review thread.

I'm going to tweak the macro name a little bit:

     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG

I think that captures the intent quite nicely...

Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
made the macro even longer... :-)

Dan


On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
> On 11/5/14 5:34 PM, David Holmes wrote:
>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>
>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>> Hi,
>>>>>
>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>
>>>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>
>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>> Rscratch);
>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), 
>>>>>> Rscratch);
>>>>>>
>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>
>>>>> any reason not to add it as a function in objectMonitor.hpp 
>>>>> instead of a
>>>>> macro? How about:
>>>>>
>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>
>>>> _owner is not the only field used so you would need a function for
>>>> each one.
>>>
>>> I thought this would be better too.   There are only 6 functions (6
>>> lines) max that need this.  It would look nicer.
>>
>> Only changes an upper case macro name to a lower case function name.
>
> As I mentioned in my reply to Claes, there are 12 offset_in_bytes() 
> functions
> so for completeness we would add 12 new functions. I really don't want 
> to do
> that.
>
>
>>
>>> My suggestion would be to make them static int
>>> untagged_offset_in_bytes() or whatever monitor_value is.  It's not a
>>> very descriptive name so better to name the functions after what 
>>> it's for.
>>
>> You need the field name included in the function name:
>>
>> untagged_offset_of_owner()
>> untagged_offset_of_xxx()
>>
>> but it is only untagged if the OM is currently inflated, so then:
>>
>> untagged_offset_of_XXX_for_inflated_om()
>>
>> I can live with Dan's macro (which is an improvement on the original).
>
> Thanks! I'm planning to stick with the macro which has a much smaller
> footprint for such a strange little task...
>
> Dan
>
>
>>
>> David
>>
>>> Coleen
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Example usage:
>>>>>
>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>> Rscratch);
>>>>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>> Rscratch);
>>>>>
>>>>>
>>>>> Seems this should be inlined regardless and looks a bit cleaner to 
>>>>> me.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> /Claes
>>>>>
>>>>>>
>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>> motivated this (long overdue) cleanup.
>>>>>>
>>>>>> This work is being tracked by the following bug ID:
>>>>>>
>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>
>>>>>> Here is the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>
>>>>>> Here is the JEP link:
>>>>>>
>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>
>>>>>> Testing:
>>>>>>
>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>
>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>
>>>>>> Dan
>>>>>
>>>
>>
>>
>>
>


From daniel.daugherty at oracle.com  Thu Nov  6 19:01:47 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 12:01:47 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545BBB9B.5000807@oracle.com>
References: <5459A8ED.8060808@oracle.com>
	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>
	<545ABF8E.1050408@oracle.com> <545AC1FC.8010905@oracle.com>
	<545BBA5A.9000007@oracle.com> <545BBB9B.5000807@oracle.com>
Message-ID: <545BC59B.5080902@oracle.com>

Here's an updated webrev:

http://cr.openjdk.java.net/~dcubed/8062851-webrev/1-jdk9-hs-rt/

What I did to sanity check this is compare the patch files from
the two webrevs...

Dan


On 11/6/14 11:19 AM, Daniel D. Daugherty wrote:
> I just reread the entire review thread.
>
> I'm going to tweak the macro name a little bit:
>
>     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG
>
> I think that captures the intent quite nicely...
>
> Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
> made the macro even longer... :-)
>
> Dan
>
>
> On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
>> On 11/5/14 5:34 PM, David Holmes wrote:
>>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>>
>>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>>
>>>>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>>
>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>>> Rscratch);
>>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), 
>>>>>>> Rscratch);
>>>>>>>
>>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>>
>>>>>> any reason not to add it as a function in objectMonitor.hpp 
>>>>>> instead of a
>>>>>> macro? How about:
>>>>>>
>>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>>
>>>>> _owner is not the only field used so you would need a function for
>>>>> each one.
>>>>
>>>> I thought this would be better too.   There are only 6 functions (6
>>>> lines) max that need this.  It would look nicer.
>>>
>>> Only changes an upper case macro name to a lower case function name.
>>
>> As I mentioned in my reply to Claes, there are 12 offset_in_bytes() 
>> functions
>> so for completeness we would add 12 new functions. I really don't 
>> want to do
>> that.
>>
>>
>>>
>>>> My suggestion would be to make them static int
>>>> untagged_offset_in_bytes() or whatever monitor_value is. It's not a
>>>> very descriptive name so better to name the functions after what 
>>>> it's for.
>>>
>>> You need the field name included in the function name:
>>>
>>> untagged_offset_of_owner()
>>> untagged_offset_of_xxx()
>>>
>>> but it is only untagged if the OM is currently inflated, so then:
>>>
>>> untagged_offset_of_XXX_for_inflated_om()
>>>
>>> I can live with Dan's macro (which is an improvement on the original).
>>
>> Thanks! I'm planning to stick with the macro which has a much smaller
>> footprint for such a strange little task...
>>
>> Dan
>>
>>
>>>
>>> David
>>>
>>>> Coleen
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Example usage:
>>>>>>
>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>> Rscratch);
>>>>>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>>> Rscratch);
>>>>>>
>>>>>>
>>>>>> Seems this should be inlined regardless and looks a bit cleaner 
>>>>>> to me.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> /Claes
>>>>>>
>>>>>>>
>>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>>> motivated this (long overdue) cleanup.
>>>>>>>
>>>>>>> This work is being tracked by the following bug ID:
>>>>>>>
>>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>>
>>>>>>> Here is the webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>>
>>>>>>> Here is the JEP link:
>>>>>>>
>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>>
>>>>>>> Testing:
>>>>>>>
>>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>>
>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>
>>>>
>>>
>>>
>>>
>>
>
>


From coleen.phillimore at oracle.com  Thu Nov  6 19:34:50 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 06 Nov 2014 14:34:50 -0500
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545BC59B.5080902@oracle.com>
References: <5459A8ED.8060808@oracle.com>	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>	<545ABF8E.1050408@oracle.com>
	<545AC1FC.8010905@oracle.com>	<545BBA5A.9000007@oracle.com>
	<545BBB9B.5000807@oracle.com> <545BC59B.5080902@oracle.com>
Message-ID: <545BCD5A.6060001@oracle.com>


While I'm not a huge fan of long macro names, it's still shorter than 
the thing it replaced:

-      add(Rmark, ObjectMonitor::owner_offset_in_bytes()-2, Rmark);
+      add(Rmark, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner), Rmark);

I like it!

Coleen

On 11/6/14, 2:01 PM, Daniel D. Daugherty wrote:
> Here's an updated webrev:
>
> http://cr.openjdk.java.net/~dcubed/8062851-webrev/1-jdk9-hs-rt/
>
> What I did to sanity check this is compare the patch files from
> the two webrevs...
>
> Dan
>
>
> On 11/6/14 11:19 AM, Daniel D. Daugherty wrote:
>> I just reread the entire review thread.
>>
>> I'm going to tweak the macro name a little bit:
>>
>>     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG
>>
>> I think that captures the intent quite nicely...
>>
>> Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
>> made the macro even longer... :-)
>>
>> Dan
>>
>>
>> On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
>>> On 11/5/14 5:34 PM, David Holmes wrote:
>>>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>>>
>>>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>>>
>>>>>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>>>
>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() 
>>>>>>>> - 2,
>>>>>>>> Rscratch);
>>>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), 
>>>>>>>> Rscratch);
>>>>>>>>
>>>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>>>
>>>>>>> any reason not to add it as a function in objectMonitor.hpp 
>>>>>>> instead of a
>>>>>>> macro? How about:
>>>>>>>
>>>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>>>
>>>>>> _owner is not the only field used so you would need a function for
>>>>>> each one.
>>>>>
>>>>> I thought this would be better too.   There are only 6 functions (6
>>>>> lines) max that need this.  It would look nicer.
>>>>
>>>> Only changes an upper case macro name to a lower case function name.
>>>
>>> As I mentioned in my reply to Claes, there are 12 offset_in_bytes() 
>>> functions
>>> so for completeness we would add 12 new functions. I really don't 
>>> want to do
>>> that.
>>>
>>>
>>>>
>>>>> My suggestion would be to make them static int
>>>>> untagged_offset_in_bytes() or whatever monitor_value is. It's not a
>>>>> very descriptive name so better to name the functions after what 
>>>>> it's for.
>>>>
>>>> You need the field name included in the function name:
>>>>
>>>> untagged_offset_of_owner()
>>>> untagged_offset_of_xxx()
>>>>
>>>> but it is only untagged if the OM is currently inflated, so then:
>>>>
>>>> untagged_offset_of_XXX_for_inflated_om()
>>>>
>>>> I can live with Dan's macro (which is an improvement on the original).
>>>
>>> Thanks! I'm planning to stick with the macro which has a much smaller
>>> footprint for such a strange little task...
>>>
>>> Dan
>>>
>>>
>>>>
>>>> David
>>>>
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> Example usage:
>>>>>>>
>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>>> Rscratch);
>>>>>>> +         ld_ptr(Rmark, 
>>>>>>> ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>>>> Rscratch);
>>>>>>>
>>>>>>>
>>>>>>> Seems this should be inlined regardless and looks a bit cleaner 
>>>>>>> to me.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> /Claes
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>>>> motivated this (long overdue) cleanup.
>>>>>>>>
>>>>>>>> This work is being tracked by the following bug ID:
>>>>>>>>
>>>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>>>
>>>>>>>> Here is the webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>>>
>>>>>>>> Here is the JEP link:
>>>>>>>>
>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>
>>>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>


From daniel.daugherty at oracle.com  Thu Nov  6 20:32:22 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 13:32:22 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545BCD5A.6060001@oracle.com>
References: <5459A8ED.8060808@oracle.com>	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>	<545ABF8E.1050408@oracle.com>
	<545AC1FC.8010905@oracle.com>	<545BBA5A.9000007@oracle.com>
	<545BBB9B.5000807@oracle.com> <545BC59B.5080902@oracle.com>
	<545BCD5A.6060001@oracle.com>
Message-ID: <545BDAD6.5050609@oracle.com>

Thanks for the re-review!

Dan


On 11/6/14 12:34 PM, Coleen Phillimore wrote:
>
> While I'm not a huge fan of long macro names, it's still shorter than 
> the thing it replaced:
>
> -      add(Rmark, ObjectMonitor::owner_offset_in_bytes()-2, Rmark);
> +      add(Rmark, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner), Rmark);
>
> I like it!
>
> Coleen
>
> On 11/6/14, 2:01 PM, Daniel D. Daugherty wrote:
>> Here's an updated webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/1-jdk9-hs-rt/
>>
>> What I did to sanity check this is compare the patch files from
>> the two webrevs...
>>
>> Dan
>>
>>
>> On 11/6/14 11:19 AM, Daniel D. Daugherty wrote:
>>> I just reread the entire review thread.
>>>
>>> I'm going to tweak the macro name a little bit:
>>>
>>>     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG
>>>
>>> I think that captures the intent quite nicely...
>>>
>>> Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
>>> made the macro even longer... :-)
>>>
>>> Dan
>>>
>>>
>>> On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
>>>> On 11/5/14 5:34 PM, David Holmes wrote:
>>>>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>>>>
>>>>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>>>>
>>>>>>>>> This fix was spun off from the Contended Locking fast enter 
>>>>>>>>> bucket
>>>>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>>>>
>>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() 
>>>>>>>>> - 2,
>>>>>>>>> Rscratch);
>>>>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner), 
>>>>>>>>> Rscratch);
>>>>>>>>>
>>>>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>>>>
>>>>>>>> any reason not to add it as a function in objectMonitor.hpp 
>>>>>>>> instead of a
>>>>>>>> macro? How about:
>>>>>>>>
>>>>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>>>>
>>>>>>> _owner is not the only field used so you would need a function for
>>>>>>> each one.
>>>>>>
>>>>>> I thought this would be better too.   There are only 6 functions (6
>>>>>> lines) max that need this.  It would look nicer.
>>>>>
>>>>> Only changes an upper case macro name to a lower case function name.
>>>>
>>>> As I mentioned in my reply to Claes, there are 12 offset_in_bytes() 
>>>> functions
>>>> so for completeness we would add 12 new functions. I really don't 
>>>> want to do
>>>> that.
>>>>
>>>>
>>>>>
>>>>>> My suggestion would be to make them static int
>>>>>> untagged_offset_in_bytes() or whatever monitor_value is. It's not a
>>>>>> very descriptive name so better to name the functions after what 
>>>>>> it's for.
>>>>>
>>>>> You need the field name included in the function name:
>>>>>
>>>>> untagged_offset_of_owner()
>>>>> untagged_offset_of_xxx()
>>>>>
>>>>> but it is only untagged if the OM is currently inflated, so then:
>>>>>
>>>>> untagged_offset_of_XXX_for_inflated_om()
>>>>>
>>>>> I can live with Dan's macro (which is an improvement on the 
>>>>> original).
>>>>
>>>> Thanks! I'm planning to stick with the macro which has a much smaller
>>>> footprint for such a strange little task...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>> Coleen
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Example usage:
>>>>>>>>
>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() 
>>>>>>>> - 2,
>>>>>>>> Rscratch);
>>>>>>>> +         ld_ptr(Rmark, 
>>>>>>>> ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>>>>> Rscratch);
>>>>>>>>
>>>>>>>>
>>>>>>>> Seems this should be inlined regardless and looks a bit cleaner 
>>>>>>>> to me.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> /Claes
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>>>>> motivated this (long overdue) cleanup.
>>>>>>>>>
>>>>>>>>> This work is being tracked by the following bug ID:
>>>>>>>>>
>>>>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>>>>
>>>>>>>>> Here is the webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>>>>
>>>>>>>>> Here is the JEP link:
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>>
>>>>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>


From calvin.cheung at oracle.com  Fri Nov  7 01:06:57 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 06 Nov 2014 17:06:57 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545AF8F2.1010106@oracle.com>
References: <545A770C.3030503@oracle.com> <545AC5D3.9090005@oracle.com>
	<545AF8F2.1010106@oracle.com>
Message-ID: <545C1B31.3060901@oracle.com>

I've updated the webrev at the same location:
     http://cr.openjdk.java.net/~ccheung/8060721/webrev/
I also re-ran the tests.

Please take a look.
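
For reference, the append pattern used in the updated change can be
sketched outside of HotSpot roughly as follows. This is an assumption-
laden illustration, not the webrev code: it uses the standard
std::snprintf as a stand-in for jio_snprintf, and append_component() is
a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Appends "<sep><name>" to the NUL-terminated string already in buf, which
// has total capacity cap. Returns false if the result did not fit; in that
// case buf is truncated but still NUL-terminated.
inline bool append_component(char* buf, std::size_t cap,
                             const char* sep, const char* name) {
  std::size_t len = std::strlen(buf);
  if (len >= cap) {
    return false;  // no room left, not even for the terminator
  }
  // snprintf writes at most (cap - len - 1) characters plus the terminator
  // and returns the length the full output would have had.
  int n = std::snprintf(buf + len, cap - len, "%s%s", sep, name);
  return n >= 0 && static_cast<std::size_t>(n) < cap - len;
}
```

Usage would look like append_component(path, sizeof(path), "/", "lib")
followed by append_component(path, sizeof(path), "/", "classlist"),
recomputing the current length on each call just as the diff does.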

thanks,
Calvin

On 11/5/2014 8:28 PM, Calvin Cheung wrote:
> On 11/5/2014 4:50 PM, David Holmes wrote:
>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>> While upgrading the compiler on Mac for jdk9, we found this compiler bug
>>> where it skips the following 2 lines of code in metaspaceShared.cpp when
>>> optimization is enabled (set to -Os) for the fastdebug and product builds.
>>>      strcat(class_list_path_str, os::file_separator());
>>>      strcat(class_list_path_str, "classlist");
>>>
>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>
>>> A workaround fix is to rewrite an "if" block in the
>>> MetaspaceShared::preload_and_dump() method.
>>
>> Can't you simply replace the strcats with jio_snprintf and do away 
>> with the sub_path array?
> The following works. I'll do more testing before sending an updated 
> webrev.
>
> --- a/src/share/vm/memory/metaspaceShared.cpp
> +++ b/src/share/vm/memory/metaspaceShared.cpp
> @@ -713,12 +713,15 @@
>      int class_list_path_len = (int)strlen(class_list_path_str);
>      if (class_list_path_len >= 3) {
>        if (strcmp(class_list_path_str + class_list_path_len - 3, 
> "lib") != 0) {
> -        strcat(class_list_path_str, os::file_separator());
> -        strcat(class_list_path_str, "lib");
> +        jio_snprintf(class_list_path_str + class_list_path_len,
> +                     sizeof(class_list_path_str) - class_list_path_len,
> +                     "%slib", os::file_separator());
>        }
>      }
> -    strcat(class_list_path_str, os::file_separator());
> -    strcat(class_list_path_str, "classlist");
> +    class_list_path_len = (int)strlen(class_list_path_str);
> +    jio_snprintf(class_list_path_str + class_list_path_len,
> +                 sizeof(class_list_path_str) - class_list_path_len,
> +                 "%sclasslist", os::file_separator());
>      class_list_path = class_list_path_str;
>    } else {
>      class_list_path = SharedClassListFile;
>>
>> Or even try strncat instead of strcat?
> I think jio_snprintf is better because it null terminates the string.
> If I use strncat, I'll need to initialize the entire buffer to null.
>
> thanks,
> Calvin
>>
>> David
>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>
>>> Testing:
>>>      JPRT
>>>      The affected testcase with product, fastdebug, and debug builds
>>> built with Xcode 5.1.1 and 6.1.
>>>
>>> thanks,
>>> Calvin
>


From jiangli.zhou at oracle.com  Fri Nov  7 01:35:34 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Thu, 06 Nov 2014 17:35:34 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
Message-ID: <545C21E6.90709@oracle.com>

Hi,

Please review the following changes that fix the crash with 
-XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
During VM initialization,  current_stack_pointer() could be called 
before the VM generates stub routines. The generated get_previous_sp 
routine cannot be used during that time, so the fix uses the estimated 
sp value instead. The x86 implementation is unaffected by the change and 
always returns the estimated sp value as before.

bug: https://bugs.openjdk.java.net/browse/JDK-8054008
webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/

Tested with JPRT and ExtBadJAR test.

Background:
As part of the VM initialization, classLoader_init() calls ZIP_Open from 
the zip library for processing the boot class path when 
-XX:-LazyBootClassLoader is specified. The call path re-enters VM before 
returning from the zip library call. The following is the backtrace right 
before the crash happens. The Windows x64 version of 
current_stack_pointer() uses generated stub routine get_previous_sp 
(generated by generate_get_previous_sp()) to obtain the stack pointer 
value. Since classLoader_init() happens before stubRoutines_init1() and 
the stub routines are not generated at the time, the execution jumps to 
address 0 (referenced by _get_previous_sp_entry which should contain the 
address of the generated routine after stubRoutines_init1()) when it's 
trying to call the stub routine and crashes.


      jvm.dll!os::current_stack_pointer() Line 468 C++
      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
      jvm.dll!JVM_NativePath(char * path) Line 691 C++
      zip.dll!000007feebc49de0()
      [Frames below may be incorrect and/or missing, no symbols loaded 
for zip.dll]
      zip.dll!000007feebc4af1d()
      zip.dll!000007feebc4b004()
      jvm.dll!ClassLoader::create_class_path_entry(const char * path, 
const stat * st, bool lazy, bool throw_exception, Thread * 
__the_thread__) Line 666 + 0x13 bytes C++
      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
path, bool check_for_duplicates, bool throw_exception) Line 763 + 0x2d 
bytes C++
      jvm.dll!ClassLoader::setup_search_path(const char * class_path) 
Line 630 C++
      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
      jvm.dll!ClassLoader::initialize() Line 1237 C++
      jvm.dll!classLoader_init() Line 1291 C++
      jvm.dll!init_globals() Line 100 C++
      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
canTryAgain) Line 3414 + 0x5 bytes C++
      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
args) Line 5199 + 0x12 bytes C++
      java.exe!000000013f0520f6()
      java.exe!000000013f05cb63()
      java.exe!000000013f05cbf7()
      kernel32.dll!0000000076ba59ed()
      ntdll.dll!0000000076cdc541()

Thanks,
Jiangli


From daniel.daugherty at oracle.com  Fri Nov  7 02:17:37 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 06 Nov 2014 19:17:37 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <54591A3A.1090005@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
	<5458330E.1080207@oracle.com> <54591A3A.1090005@oracle.com>
Message-ID: <545C2BC0.3080207@oracle.com>

The fix for JDK-8062851 has been reviewed, tested and pushed to
RT_Baseline.

Time to get back to this review thread so here's an updated webrev:

http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/

David H., I believe I've addressed all of your comments. Please
let me know if I missed something...

Thanks, in advance, for any comments, questions or suggestions.

Dan


On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
> The cleanup is turning into a bigger change than the fast enter
> bucket itself so I'm spinning the cleanup into a new bug:
>
>     JDK-8062851 cleanup ObjectMonitor offset adjustments
> https://bugs.openjdk.java.net/browse/JDK-8062851
>
> Yes, this means that the Contended Locking cleanup bucket has reopened
> for yet another change...
>
> We'll get back to "fast enter" after the dust has settled...
>
> Dan
>
>
> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>> David,
>>
>> Thanks for the review! As usual, replies are embedded below...
>>
>>
>> On 11/2/14 9:44 PM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> Looks good.
>>
>> Thanks!
>>
>>
>>> Couple of nits and one semantic query below ...
>>>
>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>
>>> Formatting changes were a bit of a distraction.
>>
>> Yes, I have no idea what got into me. Normally I do formatting
>> changes separately so the noise does not distract...
>>
>> It turns out there is a constant defined that should be used
>> instead of all these literal '2's:
>>
>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>
>> Typically used as follows:
>>
>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset = 
>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>
>> I will clean this up just for the files that I'm touching as
>> part of this fix.
>>
>>
>>>
>>> ---
>>>
>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>
>>> Formatting changes were a bit of a distraction.
>>
>> Same reply as for macroAssembler_sparc.cpp.
>>
>>
>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>> 1930     movptr(Address(boxReg, 0), 
>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>
>>> At 1870 we refer to box rather than stackBox. Also it takes some 
>>> sleuthing to realize that "3" here is somehow a pseudonym for 
>>> unused_mark(). Back up at 1808 we have a to-do:
>>>
>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>
>>> so the current change seems to be implementing that, even though 
>>> other uses of "3" are left untouched.
>>
>> I'll take a look at cleaning those up also...
>>
>> In some cases markOopDesc::marked_value will work for the literal '3',
>> but in other cases we'll use markOop::unused_mark():
>>
>>   static markOop unused_mark() {
>>     return (markOop) marked_value;
>>   }
>>
>> to save us the noise of the (markOop) cast.
>>
>>
>>> ---
>>>
>>> src/share/vm/runtime/sharedRuntime.cpp
>>>
>>> 1794 JRT_BLOCK_ENTRY(void, 
>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* 
>>> lock, JavaThread* thread))
>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) 
>>> return;
>>>
>>> Is it necessary to check is_synchronizing? If we are executing this 
>>> code we are not at a safepoint and the quick_enter won't change that, 
>>> so I'm not sure what we are guarding against.
>>
>> So this first state checker:
>>
>> src/share/vm/runtime/safepoint.hpp:
>> inline static bool is_synchronizing()  { return _state == 
>> _synchronizing;  }
>>
>> means that we want to go to a safepoint and:
>>
>> inline static bool is_at_safepoint()   { return _state == 
>> _synchronized;  }
>>
>> means that we are at a safepoint. Dice's optimization bails out if
>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>> code to be quick (and not go to a safepoint). I'm not seeing
>> anything obvious....
>>
>> Sometimes we have to be careful with JavaThread suspend requests and
>> monitor acquisition, but I don't think that's a problem here... In
>> order for the "suspend requesting" thread to be surprised, the suspend
>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>> the suspend target has to do something unexpected like acquire a monitor
>> that it was previously blocked upon when it was suspended. We've had
>> bugs like that in the past... In this optimization case, our target
>> thread is not blocked on a contended monitor...
>>
>> In this particular case, the "suspend requesting" thread will set the
>> suspend request state on the target thread, but the target thread is
>> busy trying to enter this uncontended monitor (quickly). So the
>> "suspend requesting" thread, will request a no-op safepoint, but it
>> won't return from the suspend API until that safepoint completes.
>> The safepoint won't complete until the target thread is done acquiring
>> the previously uncontended monitor... so the target thread will be
>> suspended while holding the previous uncontended monitor and the
>> "suspend requesting" thread will return from the suspend API all
>> happy...
>>
>> Well, I don't see the reason either so I'll have to ping Dave Dice
>> and Karen Kinnear to see if either of them can fill in the history
>> here. This could be an abundance of caution case.
>>
>>
>>> ---
>>>
>>> src/share/vm/runtime/synchronizer.cpp
>>>
>>> Minor nit: line 153 the usual acronym is NPE (for 
>>> NullPointerException) not NPX
>>
>> I'll do a search for uses of NPX and other uses of 'X' in exception
>> acronyms...
>>
>>
>>>
>>> Nit:  159     Thread * const ox
>>>
>>> Please change ox to owner.
>>
>> Will do.
>>
>> Thanks again for the review!
>>
>> Dan
>>
>>
>>>
>>> ---
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>
>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>
>>>> The code changes in this bucket are primarily a quick_enter()
>>>> function that works on inflated but uncontended Java monitors.
>>>> This quick_enter() function is used on the "slow path" for Java
>>>> Monitor enter operations when the built-in "fast path" (read
>>>> assembly code) doesn't work.
>>>>
>>>> This work is being tracked by the following bug ID:
>>>>
>>>>      JDK-8061553 Contended Locking fast enter bucket
>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>
>>>> Here is the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>
>>>> Here is the JEP link:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>
>>>> 8061553 summary of changes:
>>>>
>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>
>>>> - clean up spacing around some
>>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>> - remove optional (EmitSync & 64) code
>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>
>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>
>>>> - remove optional (EmitSync & 2) code
>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>    the new owner value to be more efficient
>>>>
>>>> interfaceSupport.hpp:
>>>>
>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>    JRT_BLOCK_ENTRY into two pieces.
>>>>
>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>
>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>    to permit ObjectSynchronizer::quick_enter() call
>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>
>>>> synchronizer.[ch]pp:
>>>>
>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>    inflated but unowned Java monitor without thread state
>>>>    changes
>>>>
>>>> Testing:
>>>>
>>>> - Aurora Adhoc RT/SVC baseline batch
>>>> - JPRT test jobs
>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>> - CallTimerGrid stress testing (in process)
>>>> - Aurora performance testing:
>>>>    - out of the box for the "promotion" and 32-bit server configs
>>>>    - heavy weight monitors for the "promotion" and 32-bit server 
>>>> configs
>>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>      (in process)
>>>>
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>
>>
>
>
>


From david.holmes at oracle.com  Fri Nov  7 06:26:40 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 16:26:40 +1000
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545C21E6.90709@oracle.com>
References: <545C21E6.90709@oracle.com>
Message-ID: <545C6620.1040301@oracle.com>

Looks good to me! Glad to see this could be resolved with changing the 
initialization sequence!

Please update copyright year before pushing.

Thanks,
David

On 7/11/2014 11:35 AM, Jiangli Zhou wrote:
> Hi,
>
> Please review the following changes that fix the crash with
> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only).
> During VM initialization,  current_stack_pointer() could be called
> before the VM generates stub routines. The generated get_previous_sp
> routine cannot be used during that time, use the estimated value for the
> sp value instead. The x86 implementation is unaffected by the change and
> always returns the estimated sp value as before.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>
> Tested with JPRT and ExtBadJAR test.
>
> Background:
> As part of the VM initialization, classLoader_init() calls ZIP_Open from
> the zip library for processing the boot class path when
> -XX:-LazyBootClassLoader is specified. The call path re-enters VM before
> returning from the zip library call. Following is the backtrace right
> before when the crash happens. The windows x64 version of
> current_stack_pointer() uses generated stub routine get_previous_sp
> (generated by generate_get_previous_sp()) to obtain the stack pointer
> value. Since classLoader_init() happens before stubRoutines_init1() and
> the stub routines are not generated at the time, the execution jumps to
> address 0 (referenced by _get_previous_sp_entry which should contain the
> address of the generated routine after stubRoutines_init1()) when it's
> trying to call the stub routine and crashes.
>
>
>       jvm.dll!os::current_stack_pointer() Line 468 C++
>       jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>       jvm.dll!JVM_NativePath(char * path) Line 691 C++
>       zip.dll!000007feebc49de0()
>       [Frames below may be incorrect and/or missing, no symbols loaded
> for zip.dll]
>       zip.dll!000007feebc4af1d()
>       zip.dll!000007feebc4b004()
>       jvm.dll!ClassLoader::create_class_path_entry(const char * path,
> const stat * st, bool lazy, bool throw_exception, Thread *
> __the_thread__) Line 666 + 0x13 bytes C++
>       jvm.dll!ClassLoader::update_class_path_entry_list(const char *
> path, bool check_for_duplicates, bool throw_exception) Line 763 + 0x2d
> bytes C++
>       jvm.dll!ClassLoader::setup_search_path(const char * class_path)
> Line 630 C++
>       jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>       jvm.dll!ClassLoader::initialize() Line 1237 C++
>       jvm.dll!classLoader_init() Line 1291 C++
>       jvm.dll!init_globals() Line 100 C++
>       jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool *
> canTryAgain) Line 3414 + 0x5 bytes C++
>       jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void *
> args) Line 5199 + 0x12 bytes C++
>       java.exe!000000013f0520f6()
>       java.exe!000000013f05cb63()
>       java.exe!000000013f05cbf7()
>       kernel32.dll!0000000076ba59ed()
>       ntdll.dll!0000000076cdc541()
>
> Thanks,
> Jiangli
>

From david.holmes at oracle.com  Fri Nov  7 06:31:36 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 16:31:36 +1000
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545BC59B.5080902@oracle.com>
References: <5459A8ED.8060808@oracle.com>	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>	<545ABF8E.1050408@oracle.com>
	<545AC1FC.8010905@oracle.com>	<545BBA5A.9000007@oracle.com>
	<545BBB9B.5000807@oracle.com> <545BC59B.5080902@oracle.com>
Message-ID: <545C6748.1000901@oracle.com>

Still fine for me.

Thanks,
David

On 7/11/2014 5:01 AM, Daniel D. Daugherty wrote:
> Here's an updated webrev:
>
> http://cr.openjdk.java.net/~dcubed/8062851-webrev/1-jdk9-hs-rt/
>
> What I did to sanity check this is compare the patch files from
> the two webrevs...
>
> Dan
>
>
> On 11/6/14 11:19 AM, Daniel D. Daugherty wrote:
>> I just reread the entire review thread.
>>
>> I'm going to tweak the macro name a little bit:
>>
>>     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG
>>
>> I think that captures the intent quite nicely...
>>
>> Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
>> made the macro even longer... :-)
>>
>> Dan
>>
>>
>> On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
>>> On 11/5/14 5:34 PM, David Holmes wrote:
>>>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>>>
>>>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>>>
>>>>>>>> This fix was spun off from the Contended Locking fast enter bucket
>>>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>>>
>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>>>> Rscratch);
>>>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner),
>>>>>>>> Rscratch);
>>>>>>>>
>>>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>>>> specified field and subtracts markOopDesc::monitor_value (2).
>>>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>>>
>>>>>>> any reason not to add it as a function in objectMonitor.hpp
>>>>>>> instead of a
>>>>>>> macro? How about:
>>>>>>>
>>>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>>>
>>>>>> _owner is not the only field used so you would need a function for
>>>>>> each one.
>>>>>
>>>>> I thought this would be better too.   There are only 6 functions (6
>>>>> lines) max that need this.  It would look nicer.
>>>>
>>>> Only changes an upper case macro name to a lower case function name.
>>>
>>> As I mentioned in my reply to Claes, there are 12 offset_in_bytes()
>>> functions
>>> so for completeness we would add 12 new functions. I really don't
>>> want to do
>>> that.
>>>
>>>
>>>>
>>>>> My suggestion would be to make them static int
>>>>> untagged_offset_in_bytes() or whatever monitor_value is. It's not a
>>>>> very descriptive name so better to name the functions after what
>>>>> it's for.
>>>>
>>>> You need the field name included in the function name:
>>>>
>>>> untagged_offset_of_owner()
>>>> untagged_offset_of_xxx()
>>>>
>>>> but it is only untagged if the OM is currently inflated, so then:
>>>>
>>>> untagged_offset_of_XXX_for_inflated_om()
>>>>
>>>> I can live with Dan's macro (which is an improvement on the original).
>>>
>>> Thanks! I'm planning to stick with the macro which has a much smaller
>>> footprint for such a strange little task...
>>>
>>> Dan
>>>
>>>
>>>>
>>>> David
>>>>
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> Example usage:
>>>>>>>
>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() - 2,
>>>>>>> Rscratch);
>>>>>>> +         ld_ptr(Rmark, ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>>>> Rscratch);
>>>>>>>
>>>>>>>
>>>>>>> Seems this should be inlined regardless and looks a bit cleaner
>>>>>>> to me.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> /Claes
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>>>> motivated this (long overdue) cleanup.
>>>>>>>>
>>>>>>>> This work is being tracked by the following bug ID:
>>>>>>>>
>>>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>>>
>>>>>>>> Here is the webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>>>
>>>>>>>> Here is the JEP link:
>>>>>>>>
>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>
>>>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>

From david.holmes at oracle.com  Fri Nov  7 06:38:31 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 16:38:31 +1000
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545C1B31.3060901@oracle.com>
References: <545A770C.3030503@oracle.com>
	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>
	<545C1B31.3060901@oracle.com>
Message-ID: <545C68E7.4080807@oracle.com>

Hi Calvin,

On 7/11/2014 11:06 AM, Calvin Cheung wrote:
> I've updated the webrev at the same location:
>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
> I also re-ran the tests.
>
> Please take a look.

  717         jio_snprintf(class_list_path_str + class_list_path_len,
  718                      sizeof(class_list_path_str) - 
class_list_path_len,
  719                      "%slib", os::file_separator());
  720       }
  721     }
  722     class_list_path_len = (int)strlen(class_list_path_str);

The strlen recalculation at #722 should be moved inside the if-block as 
that is the only time it is needed. Also can we not just do += 4 ?
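
Roughly what I have in mind, as a standalone sketch rather than the
actual metaspaceShared.cpp code: build_class_list_path() is a hypothetical
helper, and std::snprintf stands in for jio_snprintf. The sketch
recomputes the length with strlen inside the if-block; a plain += 4 would
only be right if the separator is a single character and nothing was
truncated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: appends "<sep>lib" unless the path already ends in
// "lib", then appends "<sep>classlist". buf must already hold a
// null-terminated string that fits in buf_size bytes.
static void build_class_list_path(char* buf, std::size_t buf_size,
                                  const char* sep) {
  std::size_t len = std::strlen(buf);
  if (len >= 3 && std::strcmp(buf + len - 3, "lib") != 0) {
    std::snprintf(buf + len, buf_size - len, "%slib", sep);
    // The length only changes on this path, so recompute it here.
    len = std::strlen(buf);
  }
  // snprintf always null-terminates within the space it is given.
  std::snprintf(buf + len, buf_size - len, "%sclasslist", sep);
}
```

With a buffer holding "/jdk/home" and "/" as the separator this produces
"/jdk/home/lib/classlist"; a buffer already ending in "lib" only gets the
"/classlist" suffix.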

Thanks,
David

> thanks,
> Calvin
>
> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>> While upgrading the compiler on Mac for jdk9, we found this compiler
>>>> bug
>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>> when
>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>> builds.
>>>>      strcat(class_list_path_str, os::file_separator());
>>>>      strcat(class_list_path_str, "classlist");
>>>>
>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>
>>>> A workaround fix is to rewrite an "if" block in the
>>>> MetaspaceShared::preload_and_dump() method.
>>>
>>> Can't you simply replace the strcats with jio_snprintf and do away
>>> with the sub_path array?
>> The following works. I'll do more testing before sending an updated
>> webrev.
>>
>> --- a/src/share/vm/memory/metaspaceShared.cpp
>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>> @@ -713,12 +713,15 @@
>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>      if (class_list_path_len >= 3) {
>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>> "lib") != 0) {
>> -        strcat(class_list_path_str, os::file_separator());
>> -        strcat(class_list_path_str, "lib");
>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>> +                     sizeof(class_list_path_str) - class_list_path_len,
>> +                     "%slib", os::file_separator());
>>        }
>>      }
>> -    strcat(class_list_path_str, os::file_separator());
>> -    strcat(class_list_path_str, "classlist");
>> +    class_list_path_len = (int)strlen(class_list_path_str);
>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>> +                 sizeof(class_list_path_str) - class_list_path_len,
>> +                 "%sclasslist", os::file_separator());
>>      class_list_path = class_list_path_str;
>>    } else {
>>      class_list_path = SharedClassListFile;
>>>
>>> Or even try strncat instead of strcat?
>> I think jio_snprintf is better because it null terminates the string.
>> If I use strncat, I'll need to initialize the entire buffer to null.
>>
>> thanks,
>> Calvin
>>>
>>> David
>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>
>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>
>>>> Testing:
>>>>      JPRT
>>>>      The affected testcase with product, fastdebug, and debug builds
>>>> built with Xcode 5.1.1 and 6.1.
>>>>
>>>> thanks,
>>>> Calvin
>>
>

From david.holmes at oracle.com  Fri Nov  7 06:40:10 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 07 Nov 2014 16:40:10 +1000
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545C6620.1040301@oracle.com>
References: <545C21E6.90709@oracle.com> <545C6620.1040301@oracle.com>
Message-ID: <545C694A.9070004@oracle.com>

On 7/11/2014 4:26 PM, David Holmes wrote:
> Looks good to me! Glad to see this could be resolved with changing the
> initialization sequence!

s/with/without/ :)

David

>
> Please update copyright year before pushing.
>
> Thanks,
> David
>
> On 7/11/2014 11:35 AM, Jiangli Zhou wrote:
>> Hi,
>>
>> Please review the following changes that fix the crash with
>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only).
>> During VM initialization,  current_stack_pointer() could be called
>> before the VM generates stub routines. The generated get_previous_sp
>> routine cannot be used during that time, use the estimated value for the
>> sp value instead. The x86 implementation is unaffected by the change and
>> always returns the estimated sp value as before.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>
>> Tested with JPRT and ExtBadJAR test.
>>
>> Background:
>> As part of the VM initialization, classLoader_init() calls ZIP_Open from
>> the zip library for processing the boot class path when
>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM before
>> returning from the zip library call. Following is the backtrace right
>> before when the crash happens. The windows x64 version of
>> current_stack_pointer() uses generated stub routine get_previous_sp
>> (generated by generate_get_previous_sp()) to obtain the stack pointer
>> value. Since classLoader_init() happens before stubRoutines_init1() and
>> the stub routines are not generated at the time, the execution jumps to
>> address 0 (referenced by _get_previous_sp_entry which should contain the
>> address of the generated routine after stubRoutines_init1()) when it's
>> trying to call the stub routine and crashes.
>>
>>
>>       jvm.dll!os::current_stack_pointer() Line 468 C++
>>       jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>       jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>       zip.dll!000007feebc49de0()
>>       [Frames below may be incorrect and/or missing, no symbols loaded
>> for zip.dll]
>>       zip.dll!000007feebc4af1d()
>>       zip.dll!000007feebc4b004()
>>       jvm.dll!ClassLoader::create_class_path_entry(const char * path,
>> const stat * st, bool lazy, bool throw_exception, Thread *
>> __the_thread__) Line 666 + 0x13 bytes C++
>>       jvm.dll!ClassLoader::update_class_path_entry_list(const char *
>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 0x2d
>> bytes C++
>>       jvm.dll!ClassLoader::setup_search_path(const char * class_path)
>> Line 630 C++
>>       jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>       jvm.dll!ClassLoader::initialize() Line 1237 C++
>>       jvm.dll!classLoader_init() Line 1291 C++
>>       jvm.dll!init_globals() Line 100 C++
>>       jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool *
>> canTryAgain) Line 3414 + 0x5 bytes C++
>>       jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void *
>> args) Line 5199 + 0x12 bytes C++
>>       java.exe!000000013f0520f6()
>>       java.exe!000000013f05cb63()
>>       java.exe!000000013f05cbf7()
>>       kernel32.dll!0000000076ba59ed()
>>       ntdll.dll!0000000076cdc541()
>>
>> Thanks,
>> Jiangli
>>

From roland.westrelin at oracle.com  Fri Nov  7 13:16:21 2014
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 7 Nov 2014 14:16:21 +0100
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545C21E6.90709@oracle.com>
References: <545C21E6.90709@oracle.com>
Message-ID: <682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>

Hi Jiangli,

> Please review the following changes that fix the crash with -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). During VM initialization,  current_stack_pointer() could be called before the VM generates stub routines. The generated get_previous_sp routine cannot be used during that time, use the estimated value for the sp value instead. The x86 implementation is unaffected by the change and always returns the estimated sp value as before.
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
> 
> Tested with JPRT and ExtBadJAR test.

But if what os::current_stack_pointer() returns is no longer "accurate", aren't you at risk of hitting the assert in os::verify_stack_alignment()? Shouldn't you skip the assert entirely if the routine is not yet available?

Also, why not make that change on all platforms to improve robustness while you're doing this?

Roland.

From daniel.daugherty at oracle.com  Fri Nov  7 14:02:47 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 07 Nov 2014 07:02:47 -0700
Subject: RFR(S) Contended Locking cleanup bucket (8062851)
In-Reply-To: <545C6748.1000901@oracle.com>
References: <5459A8ED.8060808@oracle.com>	<545A4719.50705@oracle.com>	<545ABDF1.6050107@oracle.com>	<545ABF8E.1050408@oracle.com>
	<545AC1FC.8010905@oracle.com>	<545BBA5A.9000007@oracle.com>
	<545BBB9B.5000807@oracle.com> <545BC59B.5080902@oracle.com>
	<545C6748.1000901@oracle.com>
Message-ID: <545CD107.8060700@oracle.com>

Thanks for the re-review!

Dan


On 11/6/14 11:31 PM, David Holmes wrote:
> Still fine for me.
>
> Thanks,
> David
>
> On 7/11/2014 5:01 AM, Daniel D. Daugherty wrote:
>> Here's an updated webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/1-jdk9-hs-rt/
>>
>> What I did to sanity check this is compare the patch files from
>> the two webrevs...
>>
>> Dan
>>
>>
>> On 11/6/14 11:19 AM, Daniel D. Daugherty wrote:
>>> I just reread the entire review thread.
>>>
>>> I'm going to tweak the macro name a little bit:
>>>
>>>     OM_OFFSET_NO_MONITOR_VALUE => OM_OFFSET_NO_MONITOR_VALUE_TAG
>>>
>>> I think that captures the intent quite nicely...
>>>
>>> Yes, I considered OM_OFFSET_WITHOUT_MONITOR_VALUE_TAG, but that
>>> made the macro even longer... :-)
>>>
>>> Dan
>>>
>>>
>>> On 11/6/14 11:13 AM, Daniel D. Daugherty wrote:
>>>> On 11/5/14 5:34 PM, David Holmes wrote:
>>>>> On 6/11/2014 10:23 AM, Coleen Phillimore wrote:
>>>>>>
>>>>>> On 11/5/14, 7:16 PM, David Holmes wrote:
>>>>>>> On 6/11/2014 1:49 AM, Claes Redestad wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 11/05/2014 05:34 AM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> I have a Contended Locking cleanup bucket fix ready for review.
>>>>>>>>>
>>>>>>>>> This fix was spun off from the Contended Locking fast enter 
>>>>>>>>> bucket
>>>>>>>>> which was sent out for review late last week. This fix cleans up
>>>>>>>>> the computation of ObjectMonitor field pointers and gets rid of
>>>>>>>>> the use of literal '-2' in appropriate places. For example:
>>>>>>>>>
>>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() 
>>>>>>>>> - 2,
>>>>>>>>> Rscratch);
>>>>>>>>> +         ld_ptr(Rmark, OM_OFFSET_NO_MONITOR_VALUE(owner),
>>>>>>>>> Rscratch);
>>>>>>>>>
>>>>>>>>> The OM_OFFSET_NO_MONITOR_VALUE macro computes the offset to the
>>>>>>>>> specified field and subtracts markOopDesc:monitor_value (2).
>>>>>>>>> There's a nice comment in src/share/vm/runtime/objectMonitor.hpp.
>>>>>>>>
>>>>>>>> any reason not to add it as a function in objectMonitor.hpp
>>>>>>>> instead of a
>>>>>>>> macro? How about:
>>>>>>>>
>>>>>>>>    static int no_monitor_offset_in_bytes()  { return
>>>>>>>> offset_of(ObjectMonitor, _owner) - markOopDesc::monitor_value; }
>>>>>>>
>>>>>>> _owner is not the only field used so you would need a function for
>>>>>>> each one.
>>>>>>
>>>>>> I thought this would be better too.   There are only 6 functions (6
>>>>>> lines) max that need this.  It would look nicer.
>>>>>
>>>>> Only changes an upper case macro name to a lower case function name.
>>>>
>>>> As I mentioned in my reply to Claes, there are 12 offset_in_bytes()
>>>> functions
>>>> so for completeness we would add 12 new functions. I really don't
>>>> want to do
>>>> that.
>>>>
>>>>
>>>>>
>>>>>> My suggestion would be to make them static int
>>>>>> untagged_offset_in_bytes() or whatever monitor_value is. It's not a
>>>>>> very descriptive name so better to name the functions after what
>>>>>> it's for.
>>>>>
>>>>> You need the field name included in the function name:
>>>>>
>>>>> untagged_offset_of_owner()
>>>>> untagged_offset_of_xxx()
>>>>>
>>>>> but it is only untagged if the OM is currently inflated, so then:
>>>>>
>>>>> untagged_offset_of_XXX_for_inflated_om()
>>>>>
>>>>> I can live with Dan's macro (which is an improvement on the 
>>>>> original).
>>>>
>>>> Thanks! I'm planning to stick with the macro which has a much smaller
>>>> footprint for such a strange little task...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>> Coleen
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Example usage:
>>>>>>>>
>>>>>>>> -         ld_ptr(Rmark, ObjectMonitor::owner_offset_in_bytes() 
>>>>>>>> - 2,
>>>>>>>> Rscratch);
>>>>>>>> +         ld_ptr(Rmark, 
>>>>>>>> ObjectMonitor::no_monitor_offset_in_bytes(),
>>>>>>>> Rscratch);
>>>>>>>>
>>>>>>>>
>>>>>>>> Seems this should be inlined regardless and looks a bit cleaner
>>>>>>>> to me.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> /Claes
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks to David Holmes for his comments on JDK-8061553 that
>>>>>>>>> motivated this (long overdue) cleanup.
>>>>>>>>>
>>>>>>>>> This work is being tracked by the following bug ID:
>>>>>>>>>
>>>>>>>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>>>>>>>>
>>>>>>>>> Here is the webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8062851-webrev/0-jdk9-hs-rt/
>>>>>>>>>
>>>>>>>>> Here is the JEP link:
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>>
>>>>>>>>> - JPRT test jobs (since this is only syntax and comment cleanup)
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
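
The offset adjustment discussed in this thread can be illustrated with a minimal, self-contained sketch. It is not the HotSpot code: Monitor, kMonitorTag, OFFSET_NO_TAG and owner_via_tagged are hypothetical stand-ins for ObjectMonitor, markOopDesc::monitor_value, the OM_OFFSET_NO_MONITOR_VALUE macro and the ld_ptr call site. The only point it makes is that subtracting the tag from the field offset lets a tagged pointer be used directly as the base address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; the field set is illustrative only.
struct Monitor {
  void*         header;
  void*         owner;
  std::intptr_t recursions;
};

// Stand-in for markOopDesc::monitor_value (2): the tag carried in the low
// bits of a mark word that points at an inflated monitor.
constexpr std::intptr_t kMonitorTag = 2;

// Field offset with the tag already subtracted, so that
// (tagged pointer + offset) lands exactly on the field.
#define OFFSET_NO_TAG(f) \
  (static_cast<std::intptr_t>(offsetof(Monitor, f)) - kMonitorTag)

// Load the owner field through a tagged pointer without untagging it first.
inline void* owner_via_tagged(std::intptr_t tagged) {
  assert((tagged & 3) == kMonitorTag && "expected a monitor-tagged value");
  return *reinterpret_cast<void**>(tagged + OFFSET_NO_TAG(owner));
}
```

A macro takes the field name as a token, which is why one definition covers every field; the function-based alternative raised in the review would need one accessor per field.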


From andreas.eriksson at oracle.com  Fri Nov  7 14:48:16 2014
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Fri, 07 Nov 2014 15:48:16 +0100
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the
	bootclasspath could lead to jvm fatal error
In-Reply-To: <545BBA1F.3040301@oracle.com>
References: <545B797D.70907@oracle.com> <545BA401.1070205@oracle.com>
	<545BBA1F.3040301@oracle.com>
Message-ID: <545CDBB0.80700@oracle.com>

Oh, interesting.
The hsx25 changeset does not display the dummy.jar as being a part of 
the checkin:
http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/7e7dd25666da

But when I navigate to the dummy.jar path I can see that it was checked 
in as part of that changeset:
http://hg.openjdk.java.net/hsx/hsx25/hotspot/log/7e7dd25666da/test/runtime/LoadClass/dummy.jar

Is this a known issue with Mercurial?

Anyway, thanks for pointing this out, I would probably have missed it 
otherwise.
It seems that if the dummy.jar is not present the test always succeeds.

Thanks,
Andreas

On 2014-11-06 19:12, Calvin Cheung wrote:
> Hi Andreas,
>
> The change looks good.
> There should be a dummy.jar to go with the test cases.
> http://cr.openjdk.java.net/~ccheung/8020675/webrev.02/
>
> The webrev won't show any diffs for the jar file but don't forget to 
> include it when you push the fix.
>
> thanks,
> Calvin
>
> On 11/6/2014 8:38 AM, Andreas Eriksson wrote:
>> Hi,
>>
>> Could someone please review this jdk7 backport of JDK-8020675 
>> <https://bugs.openjdk.java.net/browse/JDK-8020675>.
>> Summary:
>> invalid jar file in the bootclasspath could lead to jvm fatal error
>> removed offending EXCEPTION_MARK calls and code cleanup
>>
>> One code change necessary for the backport was in method 
>> ClassLoader::load_classfile.
>> The change was to use CHECK_(instanceKlassHandle()) instead of 
>> CHECK_NULL.
>> See the mail thread at 
>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
>> for more information.
>>
>> Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/
>>
>> Regards,
>> Andreas
>>
>>
>


From andreas.eriksson at oracle.com  Fri Nov  7 15:11:01 2014
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Fri, 07 Nov 2014 16:11:01 +0100
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the
	bootclasspath could lead to jvm fatal error
In-Reply-To: <545CDBB0.80700@oracle.com>
References: <545B797D.70907@oracle.com> <545BA401.1070205@oracle.com>
	<545BBA1F.3040301@oracle.com> <545CDBB0.80700@oracle.com>
Message-ID: <545CE105.4020208@oracle.com>

I think I need a jdk7u Reviewer to look at this as well, right?

New webrev where I added the 0 byte dummy.jar:
http://cr.openjdk.java.net/~aeriksso/8020675/webrev.01/

Checked so that the test fails on older versions and still passes on a 
fixed version.

Regards,
Andreas

On 2014-11-07 15:48, Andreas Eriksson wrote:
> Oh, interesting.
> The hsx25 changeset does not display the dummy.jar as being a part of 
> the checkin:
> http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/7e7dd25666da
>
> But when I navigate to the dummy.jar path I can see that it was 
> checked in as part of that changeset:
> http://hg.openjdk.java.net/hsx/hsx25/hotspot/log/7e7dd25666da/test/runtime/LoadClass/dummy.jar 
>
>
> Is this a known issue with Mercurial?
>
> Anyway, thanks for pointing this out, I would probably have missed it 
> otherwise.
> It seems that if the dummy.jar is not present the test always succeeds.
>
> Thanks,
> Andreas
>
> On 2014-11-06 19:12, Calvin Cheung wrote:
>> Hi Andreas,
>>
>> The change looks good.
>> There should be a dummy.jar to go with the test cases.
>> http://cr.openjdk.java.net/~ccheung/8020675/webrev.02/
>>
>> The webrev won't show any diffs for the jar file but don't forget to 
>> include it when you push the fix.
>>
>> thanks,
>> Calvin
>>
>> On 11/6/2014 8:38 AM, Andreas Eriksson wrote:
>>> Hi,
>>>
>>> Could someone please review this jdk7 backport of JDK-8020675 
>>> <https://bugs.openjdk.java.net/browse/JDK-8020675>.
>>> Summary:
>>> invalid jar file in the bootclasspath could lead to jvm fatal error
>>> removed offending EXCEPTION_MARK calls and code cleanup
>>>
>>> One code change necessary for the backport was in method 
>>> ClassLoader::load_classfile.
>>> The change was to use CHECK_(instanceKlassHandle()) instead of 
>>> CHECK_NULL.
>>> See the mail thread at 
>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
>>> for more information.
>>>
>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/
>>>
>>> Regards,
>>> Andreas
>>>
>>>
>>
>


From jiangli.zhou at oracle.com  Fri Nov  7 17:11:07 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 07 Nov 2014 09:11:07 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545C694A.9070004@oracle.com>
References: <545C21E6.90709@oracle.com> <545C6620.1040301@oracle.com>
	<545C694A.9070004@oracle.com>
Message-ID: <545CFD2B.8030108@oracle.com>

Thank you, David. I see Roland has some suggestions regarding the 
change. I'll explore those.

Thanks,
Jiangli

On 11/06/2014 10:40 PM, David Holmes wrote:
> On 7/11/2014 4:26 PM, David Holmes wrote:
>> Looks good to me! Glad to see this could be resolved with changing the
>> initialization sequence!
>
> s/with/without/ :)
>
> David
>
>>
>> Please update copyright year before pushing.
>>
>> Thanks,
>> David
>>
>> On 7/11/2014 11:35 AM, Jiangli Zhou wrote:
>>> Hi,
>>>
>>> Please review the following changes that fix the crash with
>>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only).
>>> During VM initialization,  current_stack_pointer() could be called
>>> before the VM generates stub routines. The generated get_previous_sp
>>> routine cannot be used during that time, use the estimated value for 
>>> the
>>> sp value instead. The x86 implementation is unaffected by the change 
>>> and
>>> always returns the estimated sp value as before.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>>
>>> Tested with JPRT and ExtBadJAR test.
>>>
>>> Background:
>>> As part of the VM initialization, classLoader_init() calls ZIP_Open 
>>> from
>>> the zip library for processing the boot class path when
>>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
>>> before
>>> returning from the zip library call. Following is the backtrace right
>>> before when the crash happens. The windows x64 version of
>>> current_stack_pointer() uses generated stub routine get_previous_sp
>>> (generated by generate_get_previous_sp()) to obtain the stack pointer
>>> value. Since classLoader_init() happens before stubRoutines_init1() and
>>> the stub routines are not generated at the time, the execution jumps to
>>> address 0 (referenced by _get_previous_sp_entry which should contain 
>>> the
>>> address of the generated routine after stubRoutines_init1()) when it's
>>> trying to call the stub routine and crashes.
>>>
>>>
>>>       jvm.dll!os::current_stack_pointer() Line 468 C++
>>>       jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>>       jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>>       zip.dll!000007feebc49de0()
>>>       [Frames below may be incorrect and/or missing, no symbols loaded
>>> for zip.dll]
>>>       zip.dll!000007feebc4af1d()
>>>       zip.dll!000007feebc4b004()
>>>       jvm.dll!ClassLoader::create_class_path_entry(const char * path,
>>> const stat * st, bool lazy, bool throw_exception, Thread *
>>> __the_thread__) Line 666 + 0x13 bytes C++
>>>       jvm.dll!ClassLoader::update_class_path_entry_list(const char *
>>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 0x2d
>>> bytes C++
>>>       jvm.dll!ClassLoader::setup_search_path(const char * class_path)
>>> Line 630 C++
>>>       jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>>       jvm.dll!ClassLoader::initialize() Line 1237 C++
>>>       jvm.dll!classLoader_init() Line 1291 C++
>>>       jvm.dll!init_globals() Line 100 C++
>>>       jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool *
>>> canTryAgain) Line 3414 + 0x5 bytes C++
>>>       jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void *
>>> args) Line 5199 + 0x12 bytes C++
>>>       java.exe!000000013f0520f6()
>>>       java.exe!000000013f05cb63()
>>>       java.exe!000000013f05cbf7()
>>>       kernel32.dll!0000000076ba59ed()
>>>       ntdll.dll!0000000076cdc541()
>>>
>>> Thanks,
>>> Jiangli
>>>


From jiangli.zhou at oracle.com  Fri Nov  7 17:29:11 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 07 Nov 2014 09:29:11 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>
References: <545C21E6.90709@oracle.com>
	<682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>
Message-ID: <545D0167.3070903@oracle.com>

Hi Roland,

Thank you for the review. Please see comments and questions below.

On 11/07/2014 05:16 AM, Roland Westrelin wrote:
> Hi Jiangli,
>
>> Please review the following changes that fix the crash with -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). During VM initialization,  current_stack_pointer() could be called before the VM generates stub routines. The generated get_previous_sp routine cannot be used during that time, use the estimated value for the sp value instead. The x86 implementation is unaffected by the change and always returns the estimated sp value as before.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>
>> Tested with JPRT and ExtBadJAR test.
> But if what os::current_stack_pointer() returns is no longer "accurate", aren't you at risk of hitting the assert in os::verify_stack_alignment()? Shouldn't you skip the assert entirely if the routine is not yet available?

For x64, it still returns the "accurate" value once the routine is 
generated. Before the routine is ready, it gives the estimate, which 
might have the risk of upsetting the assert as you suggested. I have a 
few questions. Have you run into the case where the estimate might 
trigger the assertion on x64? What about x86: why is that not handled the 
same as x64?

>
> Also, why not make that change on all platforms to improve robustness while you're doing this?

Thank you for the suggestion. Sounds good. I'll look into this. Is there 
a global flag that indicates the stub routines are generated?

Thanks,
Jiangli

>
> Roland.
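
The two ideas in this exchange (fall back to an estimate while the stub is missing, and skip the alignment check while only an estimate is available) can be sketched as follows. This is a simplified model under stated assumptions, not the webrev: g_get_sp_entry is a hypothetical stand-in for _get_previous_sp_entry, and the estimate is passed in by the caller rather than computed from a local variable.

```cpp
#include <cassert>
#include <cstdint>

using address = std::uintptr_t;
using get_sp_func = address (*)();

// Hypothetical stand-in for the generated stub's entry point. It stays null
// until stub generation has run, as the real entry does before
// stubRoutines_init1().
static get_sp_func g_get_sp_entry = nullptr;

// Return the precise value when the stub exists; otherwise return the
// caller-supplied estimate instead of calling through a null pointer.
inline address current_stack_pointer(address estimate) {
  return g_get_sp_entry != nullptr ? g_get_sp_entry() : estimate;
}

// Verify alignment only when the value is precise. While only an estimate
// is available there is nothing meaningful to check, so report success.
inline bool stack_alignment_ok(address estimate, address alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  if (g_get_sp_entry == nullptr) return true;
  return (current_stack_pointer(estimate) & (alignment - 1)) == 0;
}
```

Testing the entry pointer directly is one possible answer to the question about a global flag; whether HotSpot exposes a dedicated flag for this is not stated in the thread.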


From calvin.cheung at oracle.com  Fri Nov  7 19:28:22 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 07 Nov 2014 11:28:22 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545C68E7.4080807@oracle.com>
References: <545A770C.3030503@oracle.com>
	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>
	<545C1B31.3060901@oracle.com> <545C68E7.4080807@oracle.com>
Message-ID: <545D1D56.4050000@oracle.com>

On 11/6/2014 10:38 PM, David Holmes wrote:
> Hi Calvin,
>
> On 7/11/2014 11:06 AM, Calvin Cheung wrote:
>> I've updated the webrev at the same location:
>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>> I also re-ran the tests.
>>
>> Please take a look.
>
>  717         jio_snprintf(class_list_path_str + class_list_path_len,
>  718                      sizeof(class_list_path_str) - 
> class_list_path_len,
>  719                      "%slib", os::file_separator());
>  720       }
>  721     }
>  722     class_list_path_len = (int)strlen(class_list_path_str);
>
> The strlen recalculation at #722 should be moved inside the if-block 
> as that is the only time it is needed.
Agreed.
> Also can we not just do += 4 ?
I didn't want to use 4 to avoid another magic number but in this case I 
think it's obvious.

I've updated webrev at the same location:
     http://cr.openjdk.java.net/~ccheung/8060721/webrev/

thanks,
Calvin
>
> Thanks,
> David
>
>> thanks,
>> Calvin
>>
>> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>>> While upgrading the compiler on Mac for jdk9, we found this compiler
>>>>> bug
>>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>>> when
>>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>>> builds.
>>>>>      strcat(class_list_path_str, os::file_separator());
>>>>>      strcat(class_list_path_str, "classlist");
>>>>>
>>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>>
>>>>> A workaround fix is to rewrite an "if" block in the
>>>>> MetaspaceShared::preload_and_dump() method.
>>>>
>>>> Can't you simply replace the strcats with jio_snprintf and do away
>>>> with the sub_path array?
>>> The following works. I'll do more testing before sending an updated
>>> webrev.
>>>
>>> --- a/src/share/vm/memory/metaspaceShared.cpp
>>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>>> @@ -713,12 +713,15 @@
>>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>>      if (class_list_path_len >= 3) {
>>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>>> "lib") != 0) {
>>> -        strcat(class_list_path_str, os::file_separator());
>>> -        strcat(class_list_path_str, "lib");
>>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>>> +                     sizeof(class_list_path_str) - 
>>> class_list_path_len,
>>> +                     "%slib", os::file_separator());
>>>        }
>>>      }
>>> -    strcat(class_list_path_str, os::file_separator());
>>> -    strcat(class_list_path_str, "classlist");
>>> +    class_list_path_len = (int)strlen(class_list_path_str);
>>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>>> +                 sizeof(class_list_path_str) - class_list_path_len,
>>> +                 "%sclasslist", os::file_separator());
>>>      class_list_path = class_list_path_str;
>>>    } else {
>>>      class_list_path = SharedClassListFile;
>>>>
>>>> Or even try strncat instead of strcat?
>>> I think jio_snprintf is better because it null terminates the string.
>>> If I use strncat, I'll need to initialize the entire buffer to null.
>>>
>>> thanks,
>>> Calvin
>>>>
>>>> David
>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>>
>>>>> Testing:
>>>>>      JPRT
>>>>>      The affected testcase with product, fastdebug, and debug builds
>>>>> built with Xcode 5.1.1 and 6.1.
>>>>>
>>>>> thanks,
>>>>> Calvin
>>>
>>
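
For reference, the bounded-append pattern from the patch can be sketched in isolation. This is an illustration under assumptions, not the metaspaceShared.cpp code: std::snprintf stands in for jio_snprintf, and append_component and build_class_list_path are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Append "<sep><name>" to the NUL-terminated string in buf (capacity cap).
// Returns the new length, or the old length if the result would not fit.
inline std::size_t append_component(char* buf, std::size_t cap,
                                    const char* sep, const char* name) {
  assert(buf != nullptr && cap > 0);
  const std::size_t len = std::strlen(buf);
  assert(len < cap);
  const int n = std::snprintf(buf + len, cap - len, "%s%s", sep, name);
  if (n < 0 || static_cast<std::size_t>(n) >= cap - len) {
    buf[len] = '\0';  // undo a truncated append
    return len;
  }
  return len + static_cast<std::size_t>(n);
}

// Same shape as the patched logic: add "lib" only if the path does not
// already end in it, then always add "classlist".
inline void build_class_list_path(char* buf, std::size_t cap, const char* sep) {
  const std::size_t len = std::strlen(buf);
  if (len >= 3 && std::strcmp(buf + len - 3, "lib") != 0) {
    append_component(buf, cap, sep, "lib");
  }
  append_component(buf, cap, sep, "classlist");
}
```

Returning the new length from each append avoids both the second strlen and the literal 4 discussed above, at the cost of a helper function.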


From ioi.lam at oracle.com  Fri Nov  7 20:44:02 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 07 Nov 2014 12:44:02 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545D1D56.4050000@oracle.com>
References: <545A770C.3030503@oracle.com>	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>	<545C1B31.3060901@oracle.com>
	<545C68E7.4080807@oracle.com> <545D1D56.4050000@oracle.com>
Message-ID: <545D2F12.1000001@oracle.com>

Calvin, the new changes look good to me.

- Ioi

On 11/7/14, 11:28 AM, Calvin Cheung wrote:
> On 11/6/2014 10:38 PM, David Holmes wrote:
>> Hi Calvin,
>>
>> On 7/11/2014 11:06 AM, Calvin Cheung wrote:
>>> I've updated the webrev at the same location:
>>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>> I also re-ran the tests.
>>>
>>> Please take a look.
>>
>>  717         jio_snprintf(class_list_path_str + class_list_path_len,
>>  718                      sizeof(class_list_path_str) - 
>> class_list_path_len,
>>  719                      "%slib", os::file_separator());
>>  720       }
>>  721     }
>>  722     class_list_path_len = (int)strlen(class_list_path_str);
>>
>> The strlen recalculation at #722 should be moved inside the if-block 
>> as that is the only time it is needed.
> Agreed.
>> Also can we not just do += 4 ?
> I didn't want to use 4 to avoid another magic number but in this case 
> I think it's obvious.
>
> I've updated webrev at the same location:
>     http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>
> thanks,
> Calvin
>>
>> Thanks,
>> David
>>
>>> thanks,
>>> Calvin
>>>
>>> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>>>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>>>> While upgrading the compiler on Mac for jdk9, we found this compiler
>>>>>> bug
>>>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>>>> when
>>>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>>>> builds.
>>>>>>      strcat(class_list_path_str, os::file_separator());
>>>>>>      strcat(class_list_path_str, "classlist");
>>>>>>
>>>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>>>
>>>>>> A workaround fix is to rewrite an "if" block in the
>>>>>> MetaspaceShared::preload_and_dump() method.
>>>>>
>>>>> Can't you simply replace the strcats with jio_snprintf and do away
>>>>> with the sub_path array?
>>>> The following works. I'll do more testing before sending an updated
>>>> webrev.
>>>>
>>>> --- a/src/share/vm/memory/metaspaceShared.cpp
>>>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>>>> @@ -713,12 +713,15 @@
>>>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>>>      if (class_list_path_len >= 3) {
>>>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>>>> "lib") != 0) {
>>>> -        strcat(class_list_path_str, os::file_separator());
>>>> -        strcat(class_list_path_str, "lib");
>>>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>>>> +                     sizeof(class_list_path_str) - 
>>>> class_list_path_len,
>>>> +                     "%slib", os::file_separator());
>>>>        }
>>>>      }
>>>> -    strcat(class_list_path_str, os::file_separator());
>>>> -    strcat(class_list_path_str, "classlist");
>>>> +    class_list_path_len = (int)strlen(class_list_path_str);
>>>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>>>> +                 sizeof(class_list_path_str) - class_list_path_len,
>>>> +                 "%sclasslist", os::file_separator());
>>>>      class_list_path = class_list_path_str;
>>>>    } else {
>>>>      class_list_path = SharedClassListFile;
>>>>>
>>>>> Or even try strncat instead of strcat?
>>>> I think jio_snprintf is better because it null terminates the string.
>>>> If I use strncat, I'll need to initialize the entire buffer to null.
>>>>
>>>> thanks,
>>>> Calvin
>>>>>
>>>>> David
>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>>>
>>>>>> Testing:
>>>>>>      JPRT
>>>>>>      The affected testcase with product, fastdebug, and debug builds
>>>>>> built with Xcode 5.1.1 and 6.1.
>>>>>>
>>>>>> thanks,
>>>>>> Calvin
>>>>
>>>
>


From calvin.cheung at oracle.com  Fri Nov  7 21:06:05 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 07 Nov 2014 13:06:05 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545D2F12.1000001@oracle.com>
References: <545A770C.3030503@oracle.com>	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>	<545C1B31.3060901@oracle.com>
	<545C68E7.4080807@oracle.com> <545D1D56.4050000@oracle.com>
	<545D2F12.1000001@oracle.com>
Message-ID: <545D343D.4080001@oracle.com>

Thanks - Ioi.

On 11/7/2014 12:44 PM, Ioi Lam wrote:
> Calvin, the new changes look good to me.
>
> - Ioi
>
> On 11/7/14, 11:28 AM, Calvin Cheung wrote:
>> On 11/6/2014 10:38 PM, David Holmes wrote:
>>> Hi Calvin,
>>>
>>> On 7/11/2014 11:06 AM, Calvin Cheung wrote:
>>>> I've updated the webrev at the same location:
>>>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>> I also re-ran the tests.
>>>>
>>>> Please take a look.
>>>
>>>  717         jio_snprintf(class_list_path_str + class_list_path_len,
>>>  718                      sizeof(class_list_path_str) - 
>>> class_list_path_len,
>>>  719                      "%slib", os::file_separator());
>>>  720       }
>>>  721     }
>>>  722     class_list_path_len = (int)strlen(class_list_path_str);
>>>
>>> The strlen recalculation at #722 should be moved inside the if-block 
>>> as that is the only time it is needed.
>> Agreed.
>>> Also can we not just do += 4 ?
>> I didn't want to use 4 to avoid another magic number but in this case 
>> I think it's obvious.
>>
>> I've updated webrev at the same location:
>>     http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>
>> thanks,
>> Calvin
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>> Calvin
>>>>
>>>> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>>>>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>>>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>>>>> While upgrading the compiler on Mac for jdk9, we found this 
>>>>>>> compiler
>>>>>>> bug
>>>>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>>>>> when
>>>>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>>>>> builds.
>>>>>>>      strcat(class_list_path_str, os::file_separator());
>>>>>>>      strcat(class_list_path_str, "classlist");
>>>>>>>
>>>>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>>>>
>>>>>>> A workaround fix is to rewrite an "if" block in the
>>>>>>> MetaspaceShared::preload_and_dump() method.
>>>>>>
>>>>>> Can't you simply replace the strcats with jio_snprintf and do away
>>>>>> with the sub_path array?
>>>>> The following works. I'll do more testing before sending an updated
>>>>> webrev.
>>>>>
>>>>> --- a/src/share/vm/memory/metaspaceShared.cpp
>>>>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>>>>> @@ -713,12 +713,15 @@
>>>>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>>>>      if (class_list_path_len >= 3) {
>>>>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>>>>> "lib") != 0) {
>>>>> -        strcat(class_list_path_str, os::file_separator());
>>>>> -        strcat(class_list_path_str, "lib");
>>>>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>>>>> +                     sizeof(class_list_path_str) - 
>>>>> class_list_path_len,
>>>>> +                     "%slib", os::file_separator());
>>>>>        }
>>>>>      }
>>>>> -    strcat(class_list_path_str, os::file_separator());
>>>>> -    strcat(class_list_path_str, "classlist");
>>>>> +    class_list_path_len = (int)strlen(class_list_path_str);
>>>>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>>>>> +                 sizeof(class_list_path_str) - class_list_path_len,
>>>>> +                 "%sclasslist", os::file_separator());
>>>>>      class_list_path = class_list_path_str;
>>>>>    } else {
>>>>>      class_list_path = SharedClassListFile;
>>>>>>
>>>>>> Or even try strncat instead of strcat?
>>>>> I think jio_snprintf is better because it null terminates the string.
>>>>> If I use strncat, I'll need to initialize the entire buffer to null.
>>>>>
>>>>> thanks,
>>>>> Calvin
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>>>>
>>>>>>> Testing:
>>>>>>>      JPRT
>>>>>>>      The affected testcase with product, fastdebug, and debug 
>>>>>>> builds
>>>>>>> built with Xcode 5.1.1 and 6.1.
>>>>>>>
>>>>>>> thanks,
>>>>>>> Calvin
>>>>>
>>>>
>>
>


From david.r.chase at oracle.com  Fri Nov  7 21:14:38 2014
From: david.r.chase at oracle.com (David Chase)
Date: Fri, 7 Nov 2014 16:14:38 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
Message-ID: <39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>

New webrev:

bug: https://bugs.openjdk.java.net/browse/JDK-8013267

webrevs:
http://cr.openjdk.java.net/~drchase/8013267/jdk.06/
http://cr.openjdk.java.net/~drchase/8013267/hotspot.06/

Changes since last:

1) refactored to put ClassData under java.lang.invoke.MemberName

2) split the data structure into two parts; handshake with JVM uses a linked list,
which makes for a simpler backout-if-race, and Java side continues to use the
simple sorted array.  This should allow easier use of (for example) fancier
data structures (like ConcurrentHashMap) if this later proves necessary.

3) Cleaned up symbol references in the new hotspot code to go through vmSymbols.

4) renamed oldCapacity to oldSize

5) ran two different benchmarks and saw no change in performance.
  a) nashorn ScriptTest (see https://bugs.openjdk.java.net/browse/JDK-8014288 )
  b) JMH microbenchmarks
  (see bug comments for details)

And it continues to pass the previously-failing tests, as well as the new test
which has been added to hotspot/test/compiler/jsr292 .

David
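
The "backout-if-race" handshake in item 2 follows a publish, re-check, back out pattern. The sketch below shows that pattern only; it is written in C++ for illustration and is not the Java code in the webrev. InternTable, redef_count, vm_list and interned are hypothetical stand-ins for the declaring class's redefinition counter, the VM-visible list and the Java-side lookup table, and a single mutex replaces the finer-grained synchronization a real implementation would want.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_set>
#include <vector>

struct InternTable {
  std::atomic<int> redef_count{0};           // bumped by a (simulated) redefinition
  std::vector<std::string> vm_list;          // what the VM side would scan
  std::unordered_set<std::string> interned;  // lookup table for other threads
  std::mutex lock;

  // resolve is any callable returning the resolved name. It is re-run when
  // a redefinition is observed between resolution and publication.
  template <typename Resolve>
  const std::string& intern(Resolve resolve) {
    std::lock_guard<std::mutex> guard(lock);
    for (;;) {
      const int before = redef_count.load();
      std::string name = resolve();
      assert(!name.empty());
      auto found = interned.find(name);
      if (found != interned.end()) return *found;  // already interned
      vm_list.push_back(name);                     // make it VM-visible first
      if (redef_count.load() == before) {
        // No redefinition raced with us: safe to publish for lookup.
        return *interned.insert(std::move(name)).first;
      }
      vm_list.pop_back();                          // back out and try again
    }
  }
};
```

Appending to the VM-visible structure before publishing to the lookup table means a racing redefinition can be undone by dropping the last element, which is the simplification the two-structure split is meant to buy.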

On 2014-11-04, at 3:54 PM, David Chase <david.r.chase at oracle.com> wrote:

> I'm working on the initial benchmarking, and so far this arrangement (with synchronization
> and binary search for lookup, lots of barriers and linear cost insertion) has not yet been any
> slower.
> 
> I am nonetheless tempted by the 2-tables solution, because I think the simpler JVM-side
> interface that it allows is desirable.
> 
> David
> 
> On 2014-11-04, at 11:48 AM, Peter Levart <peter.levart at gmail.com> wrote:
> 
>> On 11/04/2014 04:19 PM, David Chase wrote:
>>> On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>>> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?
>>> It can't be an IdentityHashMap, because we are interning member names.
>> 
>> I know it can't be IdentityHashMap - I just wondered if you were thinking of an IdentityHashMap-like data structure in contrast to standard HashMap-like. Not in terms of equality/hashCode used, but in terms of internal data structure. IdentityHashMap is just an array of elements (well pairs of them - key, value are placed in two consecutive array slots). Lookup searches for element linearly in the array starting from hashCode based index to the element if found or 1st empty array slot. It's very easy to implement if the only operations are get() and put() and could be used for interning and as a shared structure for VM to scan, but array has to be sized to at least 3/2 the number of elements for performance to not degrade.
>> 
>>> In spite of my grumbling about benchmarking, I'm inclined to do that and try a couple of experiments.
>>> One possibility would be to use two data structures, one for interning, the other for communication with the VM.
>>> Because there's no lookup in the VM data structure it can just be an array that gets elements appended,
>>> and the synchronization dance is much simpler.
>>> 
>>> For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:
>>> 
>>> mn = resolve(args)
>>> // deal with any errors
>>> mn' = chm.get(mn)
>>> if (mn' != null) return mn' // hoped-for-common-case
>>> 
>>> synchronized (something) {
>>>   mn' = chm.get(mn)
>>>   if (mn' != null) return mn'
>>>   txn_class = mn.getDeclaringClass()
>>> 
>>>   while (true) {
>>>      redef_count = txn_class.redefCount()
>>>      mn = resolve(args)
>>> 
>>>      shared_array.add(mn);
>>>      // barrier, because we are paranoid
>>>      if (redef_count == txn_class.redefCount()) {
>>>          chm.add(mn); // safe to publish to other Java threads.
>>>          return mn;
>>>      }
>>>      shared_array.drop_last(); // Try again
>>>   }
>>> }
>>> 
>>> (Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).
>> 
>> Yes, that's similar to what I suggested by using a linked-list of MemberName(s) instead of the "shared_array" (easier to reason about ordering of writes) and a sorted array of MemberName(s) instead of the "chm" in your scheme above. ConcurrentHashMap would certainly be the most performant solution in terms of lookup/insertion-time and concurrent throughput, but it will use more heap than a simple packed array of MemberNames. CHM is much better now in JDK8 though regarding heap use.
>> 
>> A combination of the two approaches is also possible:
>> 
>> - instead of maintaining a "shared_array" of MemberName(s), have them form a linked-list (you trade a slot in array for 'next' pointer in MemberName)
>> - use ConcurrentHashMap for interning.
>> 
>> Regards, Peter
>> 
>>> 
>>> David
>>> 
>>>>> And another way to view this is that we're now quibbling about performance, when we still
>>>>> have an existing correctness problem that this patch solves, so maybe we should just get this
>>>>> done and then file an RFE.
>>>> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
>>>> 
>>>> Regards, Peter
>>>> 
>>>>> David
> 


From calvin.cheung at oracle.com  Fri Nov  7 21:38:23 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 07 Nov 2014 13:38:23 -0800
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the
	bootclasspath could lead to jvm fatal error
In-Reply-To: <545CE105.4020208@oracle.com>
References: <545B797D.70907@oracle.com> <545BA401.1070205@oracle.com>
	<545BBA1F.3040301@oracle.com> <545CDBB0.80700@oracle.com>
	<545CE105.4020208@oracle.com>
Message-ID: <545D3BCF.8090200@oracle.com>

The new webrev looks good.

On 11/7/2014 7:11 AM, Andreas Eriksson wrote:
> I think I need a jdk7u Reviewer to look at this as well, right?
For backport, you only need 1 reviewer and it doesn't have to be a 
capital R Reviewer.

Calvin

>
> New webrev where I added the 0 byte dummy.jar:
> http://cr.openjdk.java.net/~aeriksso/8020675/webrev.01/
>
> Checked so that the test fails on older versions and still passes on a 
> fixed version.
>
> Regards,
> Andreas
>
> On 2014-11-07 15:48, Andreas Eriksson wrote:
>> Oh, interesting.
>> The hsx25 changeset does not display the dummy.jar as being a part of 
>> the checkin:
>> http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/7e7dd25666da
>>
>> But when I navigate to the dummy.jar path I can see that it was 
>> checked in as part of that changeset:
>> http://hg.openjdk.java.net/hsx/hsx25/hotspot/log/7e7dd25666da/test/runtime/LoadClass/dummy.jar 
>>
>>
>> Is this a known issue with mercurial?
>>
>> Anyway, thanks for pointing this out, I would probably have missed it 
>> otherwise.
>> It seems that if the dummy.jar is not present the test always succeeds.
>>
>> Thanks,
>> Andreas
>>
>> On 2014-11-06 19:12, Calvin Cheung wrote:
>>> Hi Andreas,
>>>
>>> The change looks good.
>>> There should be a dummy.jar to go with the test cases.
>>> http://cr.openjdk.java.net/~ccheung/8020675/webrev.02/
>>>
>>> The webrev won't show any diffs for the jar file but don't forget to 
>>> include it when you push the fix.
>>>
>>> thanks,
>>> Calvin
>>>
>>> On 11/6/2014 8:38 AM, Andreas Eriksson wrote:
>>>> Hi,
>>>>
>>>> Could someone please review this jdk7 backport of JDK-8020675 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8020675>.
>>>> Summary:
>>>> invalid jar file in the bootclasspath could lead to jvm fatal error
>>>> removed offending EXCEPTION_MARK calls and code cleanup
>>>>
>>>> One code change necessary for the backport was in method 
>>>> ClassLoader::load_classfile.
>>>> The change was to use CHECK_(instanceKlassHandle()) instead of 
>>>> CHECK_NULL.
>>>> See the mail thread at 
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
>>>> for more information.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/
>>>>
>>>> Regards,
>>>> Andreas
>>>>
>>>>
>>>
>>
>


From jiangli.zhou at oracle.com  Fri Nov  7 22:46:54 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 07 Nov 2014 14:46:54 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545D0167.3070903@oracle.com>
References: <545C21E6.90709@oracle.com>	<682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>
	<545D0167.3070903@oracle.com>
Message-ID: <545D4BDE.9010908@oracle.com>

Hi Roland,

On 11/07/2014 09:29 AM, Jiangli Zhou wrote:
> Hi Roland,
>
> Thank you for the review. Please see comments and questions below.
>
> On 11/07/2014 05:16 AM, Roland Westrelin wrote:
>> Hi Jiangli,
>>
>>> Please review the following changes that fix the crash with 
>>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
>>> During VM initialization, current_stack_pointer() could be called 
>>> before the VM generates stub routines. The generated get_previous_sp 
>>> routine cannot be used during that time, use the estimated value for 
>>> the sp value instead. The x86 implementation is unaffected by the 
>>> change and always returns the estimated sp value as before.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>>
>>> Tested with JPRT and ExtBadJAR test.
>> But if what os::current_stack_pointer() returns is no longer 
>> "accurate", aren't you at risk of hitting the assert in 
>> os::verify_stack_alignment()? Shouldn't you skip the assert entirely 
>> if the routine is not yet available?
>
> For x64, it still returns the "accurate" value once the routine is 
> generated. Before the routine is ready, it gives the estimate, which 
> might have the risk of upsetting the assert as you suggested. I have a 
> few questions. Have you run into the case where the estimate might 
> trigger the assertion on x64? What about x86: why isn't it handled 
> the same as x64?

Answering my own question, verify_stack_alignment() is a nop on x86. 
That's probably why there was no need to obtain the "accurate" previous 
sp on x86 and an estimated value was always returned on x86.

>
>>
>> Also why not make that change on all platforms to improve robustness 
>> while you're doing this?
>
> Thank you for the suggestion. Sounds good. I'll look into this. Is 
> there a global flag that indicates the stub routines are generated?

I changed windows os::verify_stack_alignment() to skip the assert when 
StubRoutines::code1() is NULL. Please see the following updated webrev.

Regarding your question about other platforms, only windows x64 has this 
particular issue. The os::verify_stack_alignment() is a nop for sparc, 
ARM, and x86. The ppc, linux-x64, solaris-x64 verify_stack_alignment() 
implementations do use the generated routine to obtain the previous sp.

http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/

Thanks,
Jiangli

>
> Thanks,
> Jiangli
>
>>
>> Roland.
>


From chris.plummer at oracle.com  Sat Nov  8 03:53:01 2014
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 07 Nov 2014 19:53:01 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
Message-ID: <545D939D.2030308@oracle.com>

This is an initial review for 6762191. I'm guessing there will be 
recommendations to fix in a different way, but thought this would be a 
good time to start the discussion.

https://bugs.openjdk.java.net/browse/JDK-6762191
http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/

The bug is that if the -Xss size is set to something very small (like 
16k), on linux there will be a crash due to overwriting the end of the 
stack. This happens before hotspot can compute its stack needs and 
verify that the stack is big enough.

It didn't seem viable to move the hotspot stack size check earlier. It 
depends on too much other work done before that point, and the changes 
would have been disruptive. The stack size check is currently done in 
os::init_2().

What is needed is a check before the thread is created. That way we can 
create a thread with a big enough stack to handle all needs up to the 
point of the check in os::init_2(). This initial check does not need to 
be the final check. It just needs to confirm that we have enough stack 
to get us to the check in os::init_2().

I decided to check in java.c if the -Xss size is too small, and set it 
to a larger size if it is. I hard coded this size to 32k (I'll explain 
why 32k later). I suspect this is the part that will result in some 
debate. If you have better suggestions let me know. If it does stay 
here, then probably the 32k needs to be a #define, and maybe even an OS 
porting interface, but I'm not sure where to put it.

The reason I chose 32k is because this is big enough for all platforms 
to get to the stack size check in os::init_2(). It is also smaller than 
the actual minimum stack size allowed on any platform. 32-bit windows 
has the smallest requirement at 64k. I added some printfs to print the 
minimum stack requirement, and then ran a simple JTReg test on every 
JPRT-supported platform to get the results.

The TooSmallStackSize.sh test will run "java -version" with -Xss16k, -Xss32k, 
and -Xss<minsize>, where <minsize> is the size from the error message 
produced by the JVM, such as in the following:

$ java -Xss32k -version
The stack size specified is too small, Specify at least 100k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

I ran this test through JPRT on all platforms, and they all pass.

One thing to point out is that Windows behaves a bit differently than the 
other platforms. It always rounds the stack size up to a multiple of 64k, 
so even if you specify -Xss16k, you get a 64k stack. On 32-bit Windows 
with C1, 64k is also the minimum requirement, so there is no error 
produced in this case. However, on 32-bit Windows with C2, 68k is the 
minimum, so an error is produced since the stack will only be 64k. There 
is no bug here. It's just a bit confusing.
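
As a rough illustration of the Windows behaviour described above, here is a
minimal Java sketch of the rounding arithmetic. The class and method names
are hypothetical and are not taken from the webrev; only the 64k rounding
unit and the 64k (C1) and 68k (C2) minimums come from this message.

```java
// Illustrative sketch only: models the "round up to a multiple of 64k" rule
// and the resulting accept/reject decision. Names are hypothetical.
class StackSizeRounding {
    static final long K = 1024;
    // Rounding unit described in the message for Windows.
    static final long WINDOWS_GRANULE = 64 * K;

    // Round a requested size up to the next multiple of granule.
    static long roundUp(long requested, long granule) {
        return ((requested + granule - 1) / granule) * granule;
    }

    // True when the rounded-up stack is at least the platform minimum.
    static boolean accepted(long requested, long minimum) {
        return roundUp(requested, WINDOWS_GRANULE) >= minimum;
    }

    public static void main(String[] args) {
        // -Xss16k is rounded up to 64k.
        System.out.println(roundUp(16 * K, WINDOWS_GRANULE) / K + "k");
        // 64k minimum (32-bit Windows, C1): accepted.
        System.out.println(accepted(16 * K, 64 * K));
        // 68k minimum (32-bit Windows, C2): rejected.
        System.out.println(accepted(16 * K, 68 * K));
    }
}
```

With these numbers a 16k request passes against a 64k minimum and fails
against a 68k minimum, which matches the C1/C2 difference noted above.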

thanks,

Chris

From peter.levart at gmail.com  Sat Nov  8 15:07:38 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Sat, 08 Nov 2014 16:07:38 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
Message-ID: <545E31BA.3070500@gmail.com>

Hi David,

As before, I feel competent to comment only on the Java side of the patch.

Using a linked list to publish to the VM is a safer and easier way for the 
desired purpose, and using a separate data structure for interning makes 
it easier to change it in the future should the need arise. That need may 
be just around the corner, as there are people (Martin Buchholz, Duncan 
MacGregor, for example) that use classes with a huge number of methods...

Here are a few comments for the new webrev (MemberName):

    @SuppressWarnings("rawtypes") //Comparable in next line
    /*non-public*/ final class MemberName implements Member, Comparable, Cloneable {


Since MemberName is final and can only be compared to itself, you could 
make it implement Comparable<MemberName> and eliminate the @SuppressWarnings

more MemberName:

    private volatile MemberName next; // used for a linked list of MemberNames known to VM
...

    private static class ClassData {
        /**
         * This needs to be a simple data structure because we need to access
         * and update its elements from the JVM.  Note that the Java side controls
         * the allocation and order of elements in the array; the JVM modifies
         * fields of those elements during class redefinition.
         */
        private volatile MemberName[] elementData;
        private volatile MemberName publishedToVM;
        private volatile int size;

        /**
         * Interns a member name in the member name table.
         * Returns null if a race with the jvm occurred.  Races are detected
         * by checking for changes in the class redefinition count that occur
         * before an intern is complete.
         *
         * @param klass class whose redefinition count is checked.
         * @param memberName member name to be interned
         * @param redefined_count the value of classRedefinedCount() observed before
         *                         creation of the MemberName that is being interned.
         * @return null if a race occurred, otherwise the interned MemberName.
         */
        @SuppressWarnings({"unchecked","rawtypes"})
        public MemberName intern(Class<?> klass, MemberName memberName, int redefined_count) {
            if (elementData == null) {
                synchronized (this) {
                    if (elementData == null) {
                        elementData = new MemberName[1];
                    }
                }
            }
            synchronized (this) { // this == ClassData
                final int index = Arrays.binarySearch(elementData, 0, size, memberName);
                if (index >= 0) {
                    return elementData[index];
                }
                // Not found, add carefully.
                return add(klass, ~index, memberName, redefined_count);
            }
        }

...

        private MemberName add(Class<?> klass, int index, MemberName e, int redefined_count) {
            // First attempt publication to JVM, if that succeeds,
            // then record internally.
            e.next = publishedToVM;
            publishedToVM = e;
            storeFence();
            if (redefined_count != jla.getClassRedefinedCount(klass)) {
                // Lost a race, back out publication and report failure.
                publishedToVM = e.next;
                return null;
            }


Since you now synchronize lookup/add *and* lazy elementData construction 
on the same object (the ClassData instance), you can merge the two 
synchronized blocks and simplify the code. You can make MemberName.next, 
ClassData.elementData and ClassData.size non-volatile (only 
ClassData.publishedToVM needs to be volatile) and ClassData.intern() can 
look something like this:

public synchronized MemberName intern(Class<?> klass, MemberName memberName, int redefined_count) {
     final int index;
     if (elementData == null) {
         elementData = new MemberName[1];
         index = ~0;
     } else {
         index = Arrays.binarySearch(elementData, 0, size, memberName);
         if (index >= 0) return elementData[index];
     }
     // Not found, add carefully.
     return add(klass, ~index, memberName, redefined_count);
}

// Note: no need for additional storeFence() in add()...

private MemberName add(Class<?> klass, int index, MemberName e, int redefined_count) {
     // First attempt publication to JVM, if that succeeds,
     // then record internally.
     e.next = publishedToVM;   // volatile read of publishedToVM, followed by normal write of e.next...
     publishedToVM = e;        // ...which is ordered before volatile write of publishedToVM...
     if (redefined_count != jla.getClassRedefinedCount(klass)) { // ...which is ordered before volatile read of klass.classRedefinedCount.
       // Lost a race, back out publication and report failure.
       publishedToVM = e.next;
       return null;
     }
     ...
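
For readers unfamiliar with the `~index` idiom used above, here is a
self-contained sketch of a sorted-array intern table. It interns plain
Strings instead of MemberNames and leaves out the publication to the VM and
the redefinition-count check entirely; the class name SortedInternTable is
hypothetical. Arrays.binarySearch returns -(insertionPoint) - 1 on a miss,
so `~index` recovers the insertion point.

```java
import java.util.Arrays;

// Illustrative sketch only: sorted-array interning with binary search.
// Not the patch's ClassData; no JVM handshake is modelled here.
class SortedInternTable {
    private String[] elementData = new String[1];
    private int size;

    // Returns the canonical instance equal to s, inserting s if absent.
    synchronized String intern(String s) {
        int index = Arrays.binarySearch(elementData, 0, size, s);
        if (index >= 0) {
            return elementData[index];      // already interned
        }
        int insertAt = ~index;              // same as -(index + 1)
        if (size == elementData.length) {
            elementData = Arrays.copyOf(elementData, size * 2);
        }
        // Shift the tail right by one slot; this is the linear-cost insertion.
        System.arraycopy(elementData, insertAt, elementData, insertAt + 1,
                         size - insertAt);
        elementData[insertAt] = s;
        size++;
        return s;
    }

    synchronized int size() {
        return size;
    }
}
```

Lookup is O(log n), but every insertion shifts the tail of the array, which
is the linear insertion cost mentioned earlier in the thread.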


Now let's take for example one of the MemberName.make() methods that 
return interned MemberNames:

    public static MemberName make(Method m, boolean wantSpecial) {
        // Unreflected member names are resolved so intern them here.
        MemberName tmp0 = null;
        InternTransaction tx = new InternTransaction(m.getDeclaringClass());
        while (tmp0 == null) {
            MemberName tmp = new MemberName(m, wantSpecial);
            tmp0 = tx.tryIntern(tmp);
        }
        return tmp0;
    }


I'm trying to understand the workings of InternTransaction helper class 
(and find an example that breaks it). You create an instance of it, 
passing Method's declaringClass. You then (in retry loop) create a 
resolved MemberName from the Method and wantSpecial flag. This 
MemberName's clazz can apparently differ from Method's declaringClass. I 
don't know when and why this happens, but apparently it can (super 
method?), so in InternTransaction.tryIntern() you do...

            if (member_name.isResolved()) {
                if (member_name.clazz != tx_class) {
                    Class prev_tx_class = tx_class;
                    int prev_txn_token = txn_token;
                    tx_class = member_name.clazz;
                    txn_token = internTxnToken(tx_class);
                    // Zero is a special case.
                    if (txn_token != 0 ||
                        prev_txn_token != internTxnToken(prev_tx_class)) {
                        // Resolved class is different and at least one
                        // redef of it occurred, therefore repeat with
                        // proper class for race consistency checking.
                        return null;
                    }
                }
                member_name = member_name.intern(txn_token);
                if (member_name == null) {
                    // Update the token for the next try.
                    txn_token = internTxnToken(tx_class);
                }
            }


Now let's assume that the resolved member_name.clazz differs from 
Method's declaringClass. Let's assume also that either member_name.clazz 
has had at least one redefinition or Method's declaringClass has been 
redefined between creating InternTransaction and reading 
member_name.clazz's txn_token. You return 'null' in such a case, 
concluding that not only the resolved member_name.clazz redefinition 
matters, but Method's declaringClass redefinition can also invalidate 
the resolved MemberName, am I right? It would be helpful if I could 
understand when and how Method's declaringClass redefinition can affect 
member_name. Can it affect which clazz is resolved for member_name?

Anyway, you return null in such a case from an updated InternTransaction 
(tx_class and txn_token are now updated to have values for the resolved 
member_name.clazz). In the next round the checks of the newly constructed 
and resolved member_name are not performed against Method's declaringClass 
but against the previous round's member_name.clazz. Is this what is 
intended? I can see there has to be a stop condition for the loop to end, 
but shouldn't checks for Method's declaringClass redefinition be 
performed in every iteration (in addition to the check for 
member_name.clazz redefinition if it differs from Method's declaringClass)?


Regards, Peter


On 11/07/2014 10:14 PM, David Chase wrote:
> New webrev:
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8013267
>
> webrevs:
> http://cr.openjdk.java.net/~drchase/8013267/jdk.06/
> http://cr.openjdk.java.net/~drchase/8013267/hotspot.06/
>
> Changes since last:
>
> 1) refactored to put ClassData under java.lang.invoke.MemberName
>
> 2) split the data structure into two parts; handshake with JVM uses a linked list,
> which makes for a simpler backout-if-race, and Java side continues to use the
> simple sorted array.  This should allow easier use of (for example) fancier
> data structures (like ConcurrentHashMap) if this later proves necessary.
>
> 3) Cleaned up symbol references in the new hotspot code to go through vmSymbols.
>
> 4) renamed oldCapacity to oldSize
>
> 5) ran two different benchmarks and saw no change in performance.
>    a) nashorn ScriptTest (see https://bugs.openjdk.java.net/browse/JDK-8014288 )
>    b) JMH microbenchmarks
>    (see bug comments for details)
>
> And it continues to pass the previously-failing tests, as well as the new test
> which has been added to hotspot/test/compiler/jsr292 .
>
> David
>


From peter.levart at gmail.com  Sun Nov  9 12:55:10 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Sun, 09 Nov 2014 13:55:10 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
Message-ID: <545F642E.30205@gmail.com>

Hi David,

I played a little with the idea of having a hash table instead of a packed 
sorted array for interning. Using ConcurrentHashMap would add considerable 
memory overhead. A more compact representation is possible in the 
form of a linear-scan hash table where the elements of the array are the 
MemberNames themselves:

http://cr.openjdk.java.net/~plevart/misc/MemberName.intern/jdk.06.diff/

This is a drop-in replacement for MemberName on top of your jdk.06 
patch. If you have some time, you can run this with your performance 
tests to see if it makes any difference. If not, then perhaps this 
interning is not so performance critical after all.
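
To make the idea concrete without opening the diff, here is a minimal,
generic sketch of a linear-scan (open addressing) intern table. It is not
the code from the diff above: the class name, the 2/3 load limit (taken
from the "3/2 the number of elements" sizing mentioned earlier in the
thread) and the hash spreading step are all assumptions made for
illustration, and nothing here deals with publication to the VM.

```java
// Illustrative sketch only: elements are stored directly in the array,
// collisions are resolved by scanning to the next free slot.
class LinearProbeInternTable<T> {
    private Object[] table = new Object[8];   // length is always a power of two
    private int size;

    // Returns the canonical instance equal to e, inserting e if absent.
    @SuppressWarnings("unchecked")
    synchronized T intern(T e) {
        // Keep the table at most 2/3 full so probe sequences stay short
        // and at least one empty slot always terminates the scan.
        if ((size + 1) * 3 > table.length * 2) {
            resize();
        }
        int mask = table.length - 1;
        int i = spread(e.hashCode()) & mask;
        for (Object o; (o = table[i]) != null; i = (i + 1) & mask) {
            if (o.equals(e)) {
                return (T) o;                 // already interned
            }
        }
        table[i] = e;                         // first empty slot
        size++;
        return e;
    }

    // Doubles the array and re-inserts every element.
    private void resize() {
        Object[] old = table;
        table = new Object[old.length * 2];
        int mask = table.length - 1;
        for (Object o : old) {
            if (o == null) continue;
            int i = spread(o.hashCode()) & mask;
            while (table[i] != null) {
                i = (i + 1) & mask;
            }
            table[i] = o;
        }
    }

    // Mixes high bits into low bits before masking.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }

    synchronized int size() {
        return size;
    }
}
```

Compared with a sorted array, insertion no longer shifts elements, at the
price of keeping roughly one third of the slots empty.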

Regards, Peter

On 11/07/2014 10:14 PM, David Chase wrote:
> New webrev:
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8013267
>
> webrevs:
> http://cr.openjdk.java.net/~drchase/8013267/jdk.06/
> http://cr.openjdk.java.net/~drchase/8013267/hotspot.06/
>
> Changes since last:
>
> 1) refactored to put ClassData under java.lang.invoke.MemberName
>
> 2) split the data structure into two parts; handshake with JVM uses a linked list,
> which makes for a simpler backout-if-race, and Java side continues to use the
> simple sorted array.  This should allow easier use of (for example) fancier
> data structures (like ConcurrentHashMap) if this later proves necessary.
>
> 3) Cleaned up symbol references in the new hotspot code to go through vmSymbols.
>
> 4) renamed oldCapacity to oldSize
>
> 5) ran two different benchmarks and saw no change in performance.
>    a) nashorn ScriptTest (see https://bugs.openjdk.java.net/browse/JDK-8014288 )
>    b) JMH microbenchmarks
>    (see bug comments for details)
>
> And it continues to pass the previously-failing tests, as well as the new test
> which has been added to hotspot/test/compiler/jsr292 .
>
> David
>
> On 2014-11-04, at 3:54 PM, David Chase <david.r.chase at oracle.com> wrote:
>
>> I?m working on the initial benchmarking, and so far this arrangement (with synchronization
>> and binary search for lookup, lots of barriers and linear cost insertion) has not yet been any
>> slower.
>>
>> I am nonetheless tempted by the 2-tables solution, because I think the simpler JVM-side
>> interface that it allows is desirable.
>>
>> David
>>
>> On 2014-11-04, at 11:48 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>>> On 11/04/2014 04:19 PM, David Chase wrote:
>>>> On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>>>> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?
>>>> It can?t be an identityHashMap, because we are interning member names.
>>> I know it can't be IdentityHashMap - I just wondered if you were thinking of an IdentityHashMap-like data structure in contrast to standard HashMap-like. Not in terms of equality/hashCode used, but in terms of internal data structure. IdentityHashMap is just an array of elements (well pairs of them - key, value are placed in two consecutive array slots). Lookup searches for element linearly in the array starting from hashCode based index to the element if found or 1st empty array slot. It's very easy to implement if the only operations are get() and put() and could be used for interning and as a shared structure for VM to scan, but array has to be sized to at least 3/2 the number of elements for performance to not degrade.
>>>
>>>> In spite of my grumbling about benchmarking, I?m inclined to do that and try a couple of experiments.
>>>> One possibility would be to use two data structures, one for interning, the other for communication with the VM.
>>>> Because there?s no lookup in the VM data stucture it can just be an array that gets elements appended,
>>>> and the synchronization dance is much simpler.
>>>>
>>>> For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:
>>>>
>>>> mn = resolve(args)
>>>> // deal with any errors
>>>> mn' = chm.get(mn)
>>>> if (mn' != null) return mn' // hoped-for-common-case
>>>>
>>>> synchronized (something) {
>>>>   mn' = chm.get(mn)
>>>>   if (mn' != null) return mn'
>>>>   txn_class = mn.getDeclaringClass()
>>>>
>>>>   while (true) {
>>>>     redef_count = txn_class.redefCount()
>>>>     mn = resolve(args)
>>>>
>>>>     shared_array.add(mn);
>>>>     // barrier, because we are paranoid
>>>>     if (redef_count == txn_class.redefCount()) {
>>>>       chm.add(mn); // safe to publish to other Java threads.
>>>>       return mn;
>>>>     }
>>>>     shared_array.drop_last(); // Try again
>>>>   }
>>>> }
>>>>
>>>> (Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).
>>> Yes, that's similar to what I suggested by using a linked-list of MemberName(s) instead of the "shared_array" (easier to reason about ordering of writes) and a sorted array of MemberName(s) instead of the "chm" in your scheme above. ConcurrentHashMap would certainly be the most performant solution in terms of lookup/insertion-time and concurrent throughput, but it will use more heap than a simple packed array of MemberNames. CHM is much better now in JDK8 though regarding heap use.
>>>
>>> A combination of the two approaches is also possible:
>>>
>>> - instead of maintaining a "shared_array" of MemberName(s), have them form a linked-list (you trade a slot in array for 'next' pointer in MemberName)
>>> - use ConcurrentHashMap for interning.
>>>
>>> Regards, Peter
>>>
>>>> David
>>>>
>>>>>> And another way to view this is that we're now quibbling about performance, when we still
>>>>>> have an existing correctness problem that this patch solves, so maybe we should just get this
>>>>>> done and then file an RFE.
>>>>> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>> David
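
As a rough illustration of the two-structure interning idiom quoted above, here is a minimal Java sketch. Everything in it is a hypothetical stand-in: InternSketch, the Supplier used as the resolver, and the AtomicInteger that plays the role of the declaring class's redefinition counter are assumptions for the example, not names from the patch. The ArrayList merely stands in for the VM-visible array and says nothing about the JMM questions raised in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-ins only; none of these names come from the patch.
final class InternSketch<K> {
    private final ConcurrentHashMap<K, K> chm = new ConcurrentHashMap<>();
    // Stands in for the array shared with the VM; guarded by lock.
    private final List<K> sharedArray = new ArrayList<>();
    private final Object lock = new Object();

    /**
     * resolver models resolve(args); redefCount models the redefinition
     * counter of the declaring class.
     */
    K intern(Supplier<K> resolver, AtomicInteger redefCount) {
        K mn = resolver.get();
        K existing = chm.get(mn);
        if (existing != null) {
            return existing;                  // hoped-for common case, no lock
        }
        synchronized (lock) {
            existing = chm.get(mn);
            if (existing != null) {
                return existing;              // another thread interned it first
            }
            while (true) {
                int before = redefCount.get();
                mn = resolver.get();
                sharedArray.add(mn);          // visible to the "VM" side first
                if (before == redefCount.get()) {
                    chm.put(mn, mn);          // now safe to publish to Java threads
                    return mn;
                }
                // A redefinition raced with us: undo the append and try again.
                sharedArray.remove(sharedArray.size() - 1);
            }
        }
    }

    int sharedSize() {
        synchronized (lock) {
            return sharedArray.size();
        }
    }
}
```

The lock-free first lookup is what makes the common case cheap; the lock is only taken when a member name is seen for the first time.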


From aleksey.shipilev at oracle.com  Sun Nov  9 15:49:14 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Sun, 09 Nov 2014 18:49:14 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <545B9CC0.3080106@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>
	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>
	<545B9CC0.3080106@oracle.com>
Message-ID: <545F8CFA.80809@oracle.com>

Hi again,

No changes in webrev:
 http://cr.openjdk.java.net/~shade/8015272/webrev.01/

Please review and sponsor:
 http://cr.openjdk.java.net/~shade/8015272/8015272.changeset

As per Karen's request, I did more testing, running the tests on my Linux
x86_64/fastdebug build:

On 11/06/2014 07:07 PM, Aleksey Shipilev wrote:
> On 11/06/2014 06:01 PM, Karen Kinnear wrote:
>> - e.g. what about the vmtestbase vm/runtime/contended tests (and yes, some tests should be removed from that testlist)

vmtestbase vm/runtime/contended: no issues.
hotspot/test/runtime/ jtreg: no issues.

>> - vmtestbase: vm.quick.testlist (required for runtime changes)

vm.quick.testlist: no issues.

>> - and since @Contended annotation is used in the JDK core libraries - the jtreg jdk tests?

jdk/test/java/util/concurrent jtreg: no issues.
jdk/test/java/lang/Thread jtreg: no issues.


Thanks,
-Aleksey.


From aleksey.shipilev at oracle.com  Sun Nov  9 18:45:35 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Sun, 09 Nov 2014 21:45:35 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
Message-ID: <545FB64F.7090705@oracle.com>

Hi,

Thread.getName() returns String, and does new String instantiation every
time, because the thread name is stored in char[]. Even though we use a
private String constructor that shares the char[] array without copying
it, this still hurts some use cases (think extra-fast logging), to the
extent that some people actually maintain a Map<Thread, String> to avoid it.
 https://bugs.openjdk.java.net/browse/JDK-8059677

Here's the attempt to maintain String instead of char[]:
 http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
 http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/

JDK changes are trivial, but HS changes require some rewiring, since VM
treats Thread.name specially. However, it turns out we can make a
contained change, since the getter is used sparingly, and setter seems
to be not used at all. Any trouble with this change?
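
For context, the Map<Thread, String> workaround mentioned above could look something like the sketch below. ThreadNameCache and nameOf are hypothetical names invented for this example. The sketch has obvious drawbacks that a real logger would need to handle: it goes stale after Thread.setName(), and it keeps strong references to Thread objects.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical user-side cache; not part of the JDK or of this patch.
final class ThreadNameCache {
    // Thread does not override equals/hashCode, so keys compare by identity.
    private static final Map<Thread, String> NAMES = new ConcurrentHashMap<>();

    /** Returns one cached String per thread instead of a fresh one per call. */
    static String nameOf(Thread t) {
        return NAMES.computeIfAbsent(t, Thread::getName);
    }
}
```

A logging hot path would call ThreadNameCache.nameOf(Thread.currentThread()) instead of Thread.currentThread().getName().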

Testing: JPRT, manual tests, jdk/test/java/lang/Thread jtreg,
hotspot/test/runtime/ jtreg, vm.quick.testlist, nsk.jvmti.testlist,
svc.quick.testlist

Thanks,
-Aleksey.


From andreas.eriksson at oracle.com  Mon Nov 10 11:13:36 2014
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Mon, 10 Nov 2014 12:13:36 +0100
Subject: RFR: JDK7 backport of 8020675 - invalid jar file in the
	bootclasspath could lead to jvm fatal error
In-Reply-To: <545D3BCF.8090200@oracle.com>
References: <545B797D.70907@oracle.com> <545BA401.1070205@oracle.com>
	<545BBA1F.3040301@oracle.com> <545CDBB0.80700@oracle.com>
	<545CE105.4020208@oracle.com> <545D3BCF.8090200@oracle.com>
Message-ID: <54609DE0.3080107@oracle.com>


On 2014-11-07 22:38, Calvin Cheung wrote:
> The new webrev looks good.
>
> On 11/7/2014 7:11 AM, Andreas Eriksson wrote:
>> I think I need a jdk7u Reviewer to look at this as well, right?
> For backport, you only need 1 reviewer and it doesn't have to be a 
> capital R Reviewer.
>

OK, thanks!

- Andreas

> Calvin
>
>>
>> New webrev where I added the 0 byte dummy.jar:
>> http://cr.openjdk.java.net/~aeriksso/8020675/webrev.01/
>>
>> Checked so that the test fails on older versions and still passes on 
>> a fixed version.
>>
>> Regards,
>> Andreas
>>
>> On 2014-11-07 15:48, Andreas Eriksson wrote:
>>> Oh, interesting.
>>> The hsx25 changeset does not display the dummy.jar as being a part 
>>> of the checkin:
>>> http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/7e7dd25666da
>>>
>>> But when I navigate to the dummy.jar path I can see that it was 
>>> checked in as part of that changeset:
>>> http://hg.openjdk.java.net/hsx/hsx25/hotspot/log/7e7dd25666da/test/runtime/LoadClass/dummy.jar 
>>>
>>>
>>> Is this a known issue with Mercurial?
>>>
>>> Anyway, thanks for pointing this out, I would probably have missed 
>>> it otherwise.
>>> It seems that if the dummy.jar is not present the test always succeeds.
>>>
>>> Thanks,
>>> Andreas
>>>
>>> On 2014-11-06 19:12, Calvin Cheung wrote:
>>>> Hi Andreas,
>>>>
>>>> The change looks good.
>>>> There should be a dummy.jar to go with the test cases.
>>>> http://cr.openjdk.java.net/~ccheung/8020675/webrev.02/
>>>>
>>>> The webrev won't show any diffs for the jar file but don't forget 
>>>> to include it when you push the fix.
>>>>
>>>> thanks,
>>>> Calvin
>>>>
>>>> On 11/6/2014 8:38 AM, Andreas Eriksson wrote:
>>>>> Hi,
>>>>>
>>>>> Could someone please review this jdk7 backport of JDK-8020675 
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8020675>.
>>>>> Summary:
>>>>> invalid jar file in the bootclasspath could lead to jvm fatal error
>>>>> removed offending EXCEPTION_MARK calls and code cleanup
>>>>>
>>>>> One code change necessary for the backport was in method 
>>>>> ClassLoader::load_classfile.
>>>>> The change was to use CHECK_(instanceKlassHandle()) instead of 
>>>>> CHECK_NULL.
>>>>> See the mail thread at 
>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015825.html 
>>>>> for more information.
>>>>>
>>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8020675/webrev.00/
>>>>>
>>>>> Regards,
>>>>> Andreas
>>>>>
>>>>>
>>>>
>>>
>>
>


From david.holmes at oracle.com  Mon Nov 10 11:21:24 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 10 Nov 2014 21:21:24 +1000
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <545D1D56.4050000@oracle.com>
References: <545A770C.3030503@oracle.com>
	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>
	<545C1B31.3060901@oracle.com> <545C68E7.4080807@oracle.com>
	<545D1D56.4050000@oracle.com>
Message-ID: <54609FB4.6040203@oracle.com>

On 8/11/2014 5:28 AM, Calvin Cheung wrote:
> On 11/6/2014 10:38 PM, David Holmes wrote:
>> Hi Calvin,
>>
>> On 7/11/2014 11:06 AM, Calvin Cheung wrote:
>>> I've updated the webrev at the same location:
>>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>> I also re-ran the tests.
>>>
>>> Please take a look.
>>
>>  717         jio_snprintf(class_list_path_str + class_list_path_len,
>>  718                      sizeof(class_list_path_str) -
>> class_list_path_len,
>>  719                      "%slib", os::file_separator());
>>  720       }
>>  721     }
>>  722     class_list_path_len = (int)strlen(class_list_path_str);
>>
>> The strlen recalculation at #722 should be moved inside the if-block
>> as that is the only time it is needed.
> Agreed.
>> Also can we not just do += 4 ?
> I didn't want to use 4 to avoid another magic number but in this case I
> think it's obvious.
>
> I've updated webrev at the same location:
>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/

Looks good to me.

Thanks,
David

> thanks,
> Calvin
>>
>> Thanks,
>> David
>>
>>> thanks,
>>> Calvin
>>>
>>> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>>>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>>>> While upgrading the compiler on Mac for jdk9, we found this compiler
>>>>>> bug
>>>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>>>> when
>>>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>>>> builds.
>>>>>>      strcat(class_list_path_str, os::file_separator());
>>>>>>      strcat(class_list_path_str, "classlist");
>>>>>>
>>>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>>>
>>>>>> A workaround fix is to rewrite an "if" block in the
>>>>>> MetaspaceShared::preload_and_dump() method.
>>>>>
>>>>> Can't you simply replace the strcats with jio_snprintf and do away
>>>>> with the sub_path array?
>>>> The following works. I'll do more testing before sending an updated
>>>> webrev.
>>>>
>>>> --- a/src/share/vm/memory/metaspaceShared.cpp
>>>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>>>> @@ -713,12 +713,15 @@
>>>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>>>      if (class_list_path_len >= 3) {
>>>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>>>> "lib") != 0) {
>>>> -        strcat(class_list_path_str, os::file_separator());
>>>> -        strcat(class_list_path_str, "lib");
>>>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>>>> +                     sizeof(class_list_path_str) -
>>>> class_list_path_len,
>>>> +                     "%slib", os::file_separator());
>>>>        }
>>>>      }
>>>> -    strcat(class_list_path_str, os::file_separator());
>>>> -    strcat(class_list_path_str, "classlist");
>>>> +    class_list_path_len = (int)strlen(class_list_path_str);
>>>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>>>> +                 sizeof(class_list_path_str) - class_list_path_len,
>>>> +                 "%sclasslist", os::file_separator());
>>>>      class_list_path = class_list_path_str;
>>>>    } else {
>>>>      class_list_path = SharedClassListFile;
>>>>>
>>>>> Or even try strncat instead of strcat?
>>>> I think jio_snprintf is better because it null terminates the string.
>>>> If I use strncat, I'll need to initialize the entire buffer to null.
>>>>
>>>> thanks,
>>>> Calvin
>>>>>
>>>>> David
>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>>>
>>>>>> Testing:
>>>>>>      JPRT
>>>>>>      The affected testcase with product, fastdebug, and debug builds
>>>>>> built with Xcode 5.1.1 and 6.1.
>>>>>>
>>>>>> thanks,
>>>>>> Calvin
>>>>
>>>
>

From chris.hegarty at oracle.com  Mon Nov 10 11:52:08 2014
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Mon, 10 Nov 2014 11:52:08 +0000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <545FB64F.7090705@oracle.com>
References: <545FB64F.7090705@oracle.com>
Message-ID: <5460A6E8.9050506@oracle.com>

Aleksey,

I have only looked at the libraries changes, and I think they make sense.
As in, I can find no reason why the name cannot be changed to be a String.

Trivially, after your changes, will an NPE still be thrown for
setName(null), as it is today?

-Chris.

On 09/11/14 18:45, Aleksey Shipilev wrote:
> Hi,
>
> Thread.getName() returns String, and does new String instantiation every
> time, because the thread name is stored in char[]. Even though we use a
> private String constructor that shares the char[] array without copying
> it, this still hurts some use cases (think extra-fast logging). To the
> extent some people actually maintain Map<Thread, String> to avoid it.
>   https://bugs.openjdk.java.net/browse/JDK-8059677
>
> Here's the attempt to maintain String instead of char[]:
>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>
> JDK changes are trivial, but HS changes require some rewiring, since VM
> treats Thread.name specially. However, it turns out we can make a
> contained change, since the getter is used sparingly, and setter seems
> to be not used at all. Any trouble with this change?
>
> Testing: JPRT, manual tests, jdk/test/java/lang/Thread jtreg,
> hotspot/test/runtime/ jtreg, vm.quick.testlist, nsk.jvmti.testlist,
> svc.quick.testlist
>
> Thanks,
> -Aleksey.
>

From david.holmes at oracle.com  Mon Nov 10 12:56:40 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 10 Nov 2014 22:56:40 +1000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460A6E8.9050506@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
Message-ID: <5460B608.4050909@oracle.com>

On 10/11/2014 9:52 PM, Chris Hegarty wrote:
> Aleksey,
>
> I have only looked at the libraries changes, and I think they make sense.
> As in, I can find no reason why the name cannot be changed to be a
> String.

Very quick response, but IIRC this has been examined in the past and 
there were reasons why it can't/shouldn't be done. Will try to dig out 
more details in the morning.

If String construction is a bottleneck just cache it.

David
-----

> Trivially, after your changes will NPE be thrown if setName(null), as it
> is today ?
>
> -Chris.
>
> On 09/11/14 18:45, Aleksey Shipilev wrote:
>> Hi,
>>
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>>   https://bugs.openjdk.java.net/browse/JDK-8059677
>>
>> Here's the attempt to maintain String instead of char[]:
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>>
>> JDK changes are trivial, but HS changes require some rewiring, since VM
>> treats Thread.name specially. However, it turns out we can make a
>> contained change, since the getter is used sparingly, and setter seems
>> to be not used at all. Any trouble with this change?
>>
>> Testing: JPRT, manual tests, jdk/test/java/lang/Thread jtreg,
>> hotspot/test/runtime/ jtreg, vm.quick.testlist, nsk.jvmti.testlist,
>> svc.quick.testlist
>>
>> Thanks,
>> -Aleksey.
>>

From chris.hegarty at oracle.com  Mon Nov 10 13:53:24 2014
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Mon, 10 Nov 2014 13:53:24 +0000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460B608.4050909@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com>
Message-ID: <5460C354.5000605@oracle.com>

On 10/11/14 12:56, David Holmes wrote:
> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>> Aleksey,
>>
>> I have only looked at the libraries changes, and I think they make sense.
>> As in, I can find no reason why the name cannot be changed to be a
>> String.
>
> Very quick response, but IIRC this has been examined in the past and
> there were reasons why it can't/shouldn't be done. Will try to dig out
> more details in the morning.

If there was previous discussion on this, that revealed some substantial 
issue, that would be great, but I can't recall, or find, it now.

Hotspot express, and the desire for hotspot to run with different 
library versions, would certainly cause complication, but I don't 
believe that is an issue now.

Just on that, the library changes are minimal, and if this were to 
proceed then they can accompany the hotspot change, as they make their 
way into jdk9/dev.

Anyway, this should await your reply.

-Chris.

> If String construction is a bottleneck just cache it.
>
> David
> -----
>
>> Trivially, after your changes will NPE be thrown if setName(null), as it
>> is today ?
>>
>> -Chris.
>>
>> On 09/11/14 18:45, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> Thread.getName() returns String, and does new String instantiation every
>>> time, because the thread name is stored in char[]. Even though we use a
>>> private String constructor that shares the char[] array without copying
>>> it, this still hurts some use cases (think extra-fast logging). To the
>>> extent some people actually maintain Map<Thread, String> to avoid it.
>>>   https://bugs.openjdk.java.net/browse/JDK-8059677
>>>
>>> Here's the attempt to maintain String instead of char[]:
>>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>>>
>>> JDK changes are trivial, but HS changes require some rewiring, since VM
>>> treats Thread.name specially. However, it turns out we can make a
>>> contained change, since the getter is used sparingly, and setter seems
>>> to be not used at all. Any trouble with this change?
>>>
>>> Testing: JPRT, manual tests, jdk/test/java/lang/Thread jtreg,
>>> hotspot/test/runtime/ jtreg, vm.quick.testlist, nsk.jvmti.testlist,
>>> svc.quick.testlist
>>>
>>> Thanks,
>>> -Aleksey.
>>>

From vladimir.x.ivanov at oracle.com  Mon Nov 10 13:01:39 2014
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 10 Nov 2014 17:01:39 +0400
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <545A9BEB.8020507@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>
	<5452A786.7030000@oracle.com> <5452A077.2050903@oracle.com>
	<545A5F59.2020907@oracle.com> <545A5814.8000109@oracle.com>
	<545A9BEB.8020507@oracle.com>
Message-ID: <5460B733.4040500@oracle.com>

Vladimir, Coleen, Roland, Mikael, thanks for reviews!

On 11/6/14, 1:51 AM, Vladimir Kozlov wrote:
> I am fine with targeted fix only.
>
> One comment env->get_instance_klass() checks for NULL. Your new code in
> create_new_metadata() does not:
>
> ciInstanceKlass* holder =
> get_metadata(h_m()->method_holder())->as_instance_klass();
Good catch. I reverted to ciEnv::get_instance_klass().

FTR updated webrev:
http://cr.openjdk.java.net/~vlivanov/8060147/webrev.02

Best regards,
Vladimir Ivanov

>
> Thanks,
> Vladimir K
>
> On 11/5/14 9:02 AM, Vladimir Ivanov wrote:
>>
>> On 11/5/14, 9:33 PM, Coleen Phillimore wrote:
>>>
>>> On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
>>>> Coleen,
>>>>
>>>> I implemented 2 approaches of the fix.
>>>>
>>>> The fix with a special case for VM anon classes is:
>>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>>>
>>>> Both fix the bug, but have different properties.
>>>>
>>>> (1) Special case for VM anon class is very focused on the actual
>>>> cause, but more fragile - all the logic which keeps metadata from
>>>> being deallocated is non-trivial and scattered around the whole
>>>> ciMetadata hierarchy.
>>>>
>>>> (2) On the other hand, initial version, which forcibly creates
>>>> klass_holder ciObject for each ciMetadata, is much cleaner and
>>>> localized, but does unnecessary work.
>>>>
>>>> Am I right that you prefer (1) as a fix?
>>>
>>> Yes, I think this version does less unnecessary work and creates less
>>> ciObjects.   And the comment is useful for finding how we keep
>>> ciMetadata alive for anonymous classes.   You still have a UseNewCode in
>>> the webrev though that you want to take out.
>>
>> Thanks, Coleen.
>>
>> VladimirK, Roland, what do you think about (1)?
>>
>> Best regards,
>> Vladimir Ivanov

From aleksey.shipilev at oracle.com  Mon Nov 10 14:08:51 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 17:08:51 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460A6E8.9050506@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
Message-ID: <5460C6F3.1080201@oracle.com>

Hi Chris,

Thanks for taking a look!

On 11/10/2014 02:52 PM, Chris Hegarty wrote:
> Trivially, after your changes will NPE be thrown if setName(null), as it
> is today ?

With the patch applied there is no way it could throw NPE, therefore the
behavior is different. The spec says nothing about NPE, but it feels wrong
to pass the null String to setNativeName. I should add
Objects.requireNonNull there. Will wait for more feedback, and update
the webrev.
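
To make the difference concrete: with char[] storage, name.toCharArray() throws NPE implicitly on a null argument, while a plain String assignment does not. The sketch below shows an explicit check on a String field. NamedTask is a hypothetical stand-in class, not java.lang.Thread, and the message text is an assumption.

```java
import java.util.Objects;

// Hypothetical stand-in for a class that stores its name as a String.
final class NamedTask {
    private volatile String name = "unnamed";

    void setName(String name) {
        // Explicit null check: a plain assignment would silently accept null.
        this.name = Objects.requireNonNull(name, "name cannot be null");
    }

    String getName() {
        return name;   // no allocation, returns the stored String
    }
}
```

Because requireNonNull is evaluated before the assignment, a rejected call leaves the previous name in place.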

-Aleksey.


From aleksey.shipilev at oracle.com  Mon Nov 10 14:19:12 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 17:19:12 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460C354.5000605@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
Message-ID: <5460C960.9080509@oracle.com>

Hi David, Chris,

On 11/10/2014 04:53 PM, Chris Hegarty wrote:
> On 10/11/14 12:56, David Holmes wrote:
>> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>>> I have only looked at the libraries changes, and I think they make sense.
>>> As in, I can find no reason why the name cannot be changed to be a
>>> String.
>>
>> Very quick response, but IIRC this has been examined in the past and
>> there were reasons why it can't/shouldn't be done. Will try to dig out
>> more details in the morning.
> 
> If there was previous discussion on this, that revealed some substantial
> issue, that would be great, but I can't recall, or find, it now.
> 
> Hotspot express, and the desire for hotspot to run with different
> library versions, would certainly cause complication, but I don't
> believe that is an issue now.
> 
> Just on that, the library changes are minimal, and if this were to
> proceed then they can accompany the hotspot change, as they make their
> way into jdk9/dev.
> 
> Anyway, this should await your reply.

Alan had the same concern: there is an issue with JNI/JVMTI and
other power users that might break when exposed to an under-constructed
Thread, e.g.:
 https://bugs.openjdk.java.net/browse/JDK-6412693

This is why I ran jvmti and serviceability tests for this change,
yielding no failures. This reinforces my belief this patch does not
break the important invariant: if there is a problem with "Thread.name =
name.toCharArray()" anywhere in Thread code, then "Thread.name = name"
neither regresses it further nor fixes it.

Then I speculated that having char[] name would help VM initialize the
name if we wanted to switch to complete VM-side initialization of
Thread, but it seems we can do String oop instantiation in a similar vein.

Caching the name feels like a band-aid, that will probably complicate
the Thread initialization on VM side even more. Let's wait and see if
David can come up with some horror issue we are overlooking. :)

Thanks,
-Aleksey.


From Alan.Bateman at oracle.com  Mon Nov 10 14:30:33 2014
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 10 Nov 2014 14:30:33 +0000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460C354.5000605@oracle.com>
References: <545FB64F.7090705@oracle.com>
	<5460A6E8.9050506@oracle.com>	<5460B608.4050909@oracle.com>
	<5460C354.5000605@oracle.com>
Message-ID: <5460CC09.7030204@oracle.com>

On 10/11/2014 13:53, Chris Hegarty wrote:
>
> If there was previous discussion on this, that revealed some 
> substantial issue, that would be great, but I can't recall, or find, 
> it now.
>
> Hotspot express, and the desire for hotspot to run with different 
> library versions, would certainly cause complication, but I don't 
> believe that is an issue now.
>
> Just on that, the library changes are minimal, and if this were to 
> proceed then they can accompany the hotspot change, as they make their 
> way into jdk9/dev.
>
I remember the previous discussion on this and at the time it was just 
too troublesome to try to coordinate the change to hotspot + jdk. So a 
jdk-only change was pushed to address the last issue in this area, the 
issue of changing it from char[] to String was kicked down the road.

-Alan

From staffan.larsen at oracle.com  Mon Nov 10 14:51:13 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Nov 2014 15:51:13 +0100
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460C960.9080509@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
Message-ID: <B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>

I'm afraid this change requires changes in the Serviceability Agent as well. See OopUtilities.threadOopGetName() for example.

/Staffan

> On 10 nov 2014, at 15:19, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> 
> Hi David, Chris,
> 
> On 11/10/2014 04:53 PM, Chris Hegarty wrote:
>> On 10/11/14 12:56, David Holmes wrote:
>>> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>>>> I have only looked at the libraries changes, and I think they make sense.
>>>> As in, I can find no reason why the name cannot be changed to be a
>>>> String.
>>> 
>>> Very quick response, but IIRC this has been examined in the past and
>>> there were reasons why it can't/shouldn't be done. Will try to dig out
>>> more details in the morning.
>> 
>> If there was previous discussion on this, that revealed some substantial
>> issue, that would be great, but I can't recall, or find, it now.
>> 
>> Hotspot express, and the desire for hotspot to run with different
>> library versions, would certainly cause complication, but I don't
>> believe that is an issue now.
>> 
>> Just on that, the library changes are minimal, and if this were to
>> proceed then they can accompany the hotspot change, as they make their
>> way into jdk9/dev.
>> 
>> Anyway, this should await your reply.
> 
> Alan was having the same concern, there is an issue with JNI/JVMTI and
> other power users that might break when exposed to under-constructed
> Thread, e.g:
> https://bugs.openjdk.java.net/browse/JDK-6412693
> 
> This is why I ran jvmti and serviceability tests for this change,
> yielding no failures. This reinforces my belief this patch does not
> break the important invariant: if there is a problem with "Thread.name =
> name.toCharArray()" anywhere in Thread code, then "Thread.name = name"
> does neither regress it further nor fixes it.
> 
> Then I speculated that having char[] name would help VM initialize the
> name if we wanted to switch to complete VM-side initialization of
> Thread, but it seems we can do String oop instantiation in the similar vein.
> 
> Caching the name feels like a band-aid, that will probably complicate
> the Thread initialization on VM side even more. Let's wait and see if
> David can come up with some horror issue we are overlooking. :)
> 
> Thanks,
> -Aleksey.
> 


From aleksey.shipilev at oracle.com  Mon Nov 10 14:54:16 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 17:54:16 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460CC25.1000609@oracle.com>
References: <545FB64F.7090705@oracle.com>
	<5460A6E8.9050506@oracle.com>	<5460C6F3.1080201@oracle.com>
	<5460CC25.1000609@oracle.com>
Message-ID: <5460D198.4030905@oracle.com>

Hi Roger,

On 11/10/2014 05:31 PM, roger riggs wrote:
> 1) The Thread class javadoc says:
> " Unless otherwise noted, passing a {@code null} argument to a constructor
>  * or method in this class will cause a {@link NullPointerException} to be
>  * thrown."
> 
> So, NPE is already specified for setThreadName(null) or any other method.

Ah, thanks! It is odd to see this specified in a blanket fashion in the
class Javadoc, oh well. So we need to restore the null check.


> I'm not in favor of adding the Objects.requireNonNull, the NPE will
> be thrown soon enough and it is just noise in the source code in most
> cases that creates larger bytecodes and extra work for the compiler
> /interpreter.

Sorry, I have a hard time understanding what you are saying. How would
you guarantee NPE (as per Javadoc contract above) in the new version of
Thread.setName otherwise?


> 2) About not storing the name as a String, I have some vague 
> recollection of the issue being related to exposing an object
> settable by the application that can be used with synchronize and
> allows communication and sync issues between threads.

Again, I don't quite understand. Is it about storing the reference to
String as the thread name, that can potentially be used for external
synchronization?

If so, I have a hard time devising a sane test case that might fail with
this change. Internal code does not synchronize on Thread.name. Anyone
synchronizing on Thread.getName() result has broken synchronization with
current code. Anyone synchronizing on Thread.getName() result after this
patch will have that (ahem) fixed, plus a performance problem.
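
A small sketch of why synchronizing on the getName() result is broken with char[] storage: each call hands back a different object, hence a different monitor. CharArrayNamed and StringNamed are hypothetical models of the two storage schemes, not the real Thread code (the real code uses a sharing String constructor, while this model copies the array).

```java
// Hypothetical model of the pre-patch storage: the name lives in a char[].
final class CharArrayNamed {
    private final char[] name;

    CharArrayNamed(String name) {
        this.name = name.toCharArray();
    }

    // Every call allocates a fresh String, so every call yields a new monitor.
    String getName() {
        return new String(name);
    }
}

// Hypothetical model of the patched storage: the name lives in a String field.
final class StringNamed {
    private final String name;

    StringNamed(String name) {
        this.name = name;
    }

    // Every call returns the same object, so callers would share one monitor.
    String getName() {
        return name;
    }
}
```

With the first model, two threads entering synchronized (x.getName()) never contend with each other; with the second they do, which is the "fixed, plus a performance problem" outcome described above.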


> Just because some test doesn't fail, doesn't mean there isn't a
> design/implementation constraint.

I should have said, "I *also* run the jvmti and serviceability tests" to
confirm the change is innocuous. See the HS code change itself -- it
does seem contained.

Thanks,
-Aleksey.


From aleksey.shipilev at oracle.com  Mon Nov 10 14:55:22 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 17:55:22 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
	<B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
Message-ID: <5460D1DA.4050907@oracle.com>

Hi Staffan,

Ow, it seems very likely.
So, what testlist have I missed to catch this?

-Aleksey.

On 11/10/2014 05:51 PM, Staffan Larsen wrote:
> I'm afraid this change requires changes in the Serviceability Agent as well. See OopUtilities.threadOopGetName() for example.
> 
> /Staffan
> 
>> On 10 nov 2014, at 15:19, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>>
>> Hi David, Chris,
>>
>> On 11/10/2014 04:53 PM, Chris Hegarty wrote:
>>> On 10/11/14 12:56, David Holmes wrote:
>>>> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>>>>> I have only looked at the libraries changes, and I think they make sense.
>>>>> As in, I can find no reason why the name cannot be changed to be a
>>>>> String.
>>>>
>>>> Very quick response, but IIRC this has been examined in the past and
>>>> there were reasons why it can't/shouldn't be done. Will try to dig out
>>>> more details in the morning.
>>>
>>> If there was previous discussion on this, that revealed some substantial
>>> issue, that would be great, but I can't recall, or find, it now.
>>>
>>> Hotspot express, and the desire for hotspot to run with different
>>> library versions, would certainly cause complication, but I don't
>>> believe that is an issue now.
>>>
>>> Just on that, the library changes are minimal, and if this were to
>>> proceed then they can accompany the hotspot change, as they make their
>>> way into jdk9/dev.
>>>
>>> Anyway, this should await your reply.
>>
>> Alan was having the same concern, there is an issue with JNI/JVMTI and
>> other power users that might break when exposed to under-constructed
>> Thread, e.g:
>> https://bugs.openjdk.java.net/browse/JDK-6412693
>>
>> This is why I ran jvmti and serviceability tests for this change,
>> yielding no failures. This reinforces my belief this patch does not
>> break the important invariant: if there is a problem with "Thread.name =
>> name.toCharArray()" anywhere in Thread code, then "Thread.name = name"
>> neither regresses it further nor fixes it.
>>
>> Then I speculated that having char[] name would help VM initialize the
>> name if we wanted to switch to complete VM-side initialization of
>> Thread, but it seems we can do String oop instantiation in a similar vein.
>>
>> Caching the name feels like a band-aid, that will probably complicate
>> the Thread initialization on VM side even more. Let's wait and see if
>> David can come up with some horror issue we are overlooking. :)
>>
>> Thanks,
>> -Aleksey.
>>
> 


From vladimir.kozlov at oracle.com  Mon Nov 10 15:59:55 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 10 Nov 2014 07:59:55 -0800
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <5460B733.4040500@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>	<5452A786.7030000@oracle.com>
	<5452A077.2050903@oracle.com>	<545A5F59.2020907@oracle.com>
	<545A5814.8000109@oracle.com>	<545A9BEB.8020507@oracle.com>
	<5460B733.4040500@oracle.com>
Message-ID: <5460E0FB.6000704@oracle.com>

Good.

Thanks,
Vladimir

On 11/10/14 5:01 AM, Vladimir Ivanov wrote:
> Vladimir, Coleen, Roland, Mikael, thanks for reviews!
>
> On 11/6/14, 1:51 AM, Vladimir Kozlov wrote:
>> I am fine with targeted fix only.
>>
>> One comment: env->get_instance_klass() checks for NULL. Your new code in
>> create_new_metadata() does not:
>>
>> ciInstanceKlass* holder =
>> get_metadata(h_m()->method_holder())->as_instance_klass();
> Good catch. I reverted to ciEnv::get_instance_klass().
>
> FTR updated webrev:
> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.02
>
> Best regards,
> Vladimir Ivanov
>
>>
>> Thanks,
>> Vladimir K
>>
>> On 11/5/14 9:02 AM, Vladimir Ivanov wrote:
>>>
>>> On 11/5/14, 9:33 PM, Coleen Phillimore wrote:
>>>>
>>>> On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
>>>>> Coleen,
>>>>>
>>>>> I implemented 2 approaches of the fix.
>>>>>
>>>>> The fix with a special case for VM anon classes is:
>>>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>>>>
>>>>> Both fix the bug, but have different properties.
>>>>>
>>>>> (1) Special case for VM anon class is very focused on the actual
>>>>> cause, but more fragile - all the logic which keeps metadata from
>>>>> being deallocated is non-trivial and scattered around the whole
>>>>> ciMetadata hierarchy.
>>>>>
>>>>> (2) On the other hand, initial version, which forcibly creates
>>>>> klass_holder ciObject for each ciMetadata, is much cleaner and
>>>>> localized, but does unnecessary work.
>>>>>
>>>>> Am I right that you prefer (1) as a fix?
>>>>
>>>> Yes, I think this version does less unnecessary work and creates fewer
>>>> ciObjects.   And the comment is useful for finding how we keep
>>>> ciMetadata alive for anonymous classes.   You still have a
>>>> UseNewCode in
>>>> the webrev, though, that you want to take out.
>>>
>>> Thanks, Coleen.
>>>
>>> VladimirK, Roland, what do you think about (1)?
>>>
>>> Best regards,
>>> Vladimir Ivanov

From mikael.gerdin at oracle.com  Mon Nov 10 16:11:33 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Mon, 10 Nov 2014 17:11:33 +0100
Subject: [9] RFR (S): 8060147: SIGSEGV in Metadata::mark_on_stack() while
	marking metadata in ciEnv
In-Reply-To: <5460B733.4040500@oracle.com>
References: <5450F261.60400@oracle.com>	<545114DF.7040005@oracle.com>	<54511744.4060904@oracle.com>	<5451F43A.1010108@oracle.com>	<5452128C.4090408@oracle.com>	<54522805.5040701@oracle.com>	<1421AC31-CBEA-4AED-A657-3C43B258A6DB@oracle.com>	<54522357.4070705@oracle.com>	<BF7012CB-AA7D-44B5-9C4F-A7DC0142FA4E@oracle.com>	<5452425D.7040405@oracle.com>	<E635DB43-48F9-4C5E-869F-13366D4DFA0B@oracle.com>	<5452517C.4050104@oracle.com>	<54527E1E.1070507@oracle.com>	<5452A786.7030000@oracle.com>
	<5452A077.2050903@oracle.com>	<545A5F59.2020907@oracle.com>
	<545A5814.8000109@oracle.com>	<545A9BEB.8020507@oracle.com>
	<5460B733.4040500@oracle.com>
Message-ID: <5460E3B5.2010206@oracle.com>

Vladimir,

On 2014-11-10 14:01, Vladimir Ivanov wrote:
> Vladimir, Coleen, Roland, Mikael, thanks for reviews!
>
> On 11/6/14, 1:51 AM, Vladimir Kozlov wrote:
>> I am fine with targeted fix only.
>>
>> One comment: env->get_instance_klass() checks for NULL. Your new code in
>> create_new_metadata() does not:
>>
>> ciInstanceKlass* holder =
>> get_metadata(h_m()->method_holder())->as_instance_klass();
> Good catch. I reverted to ciEnv::get_instance_klass().
>
> FTR updated webrev:
> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.02

Looks good.

/Mikael

>
> Best regards,
> Vladimir Ivanov
>
>>
>> Thanks,
>> Vladimir K
>>
>> On 11/5/14 9:02 AM, Vladimir Ivanov wrote:
>>>
>>> On 11/5/14, 9:33 PM, Coleen Phillimore wrote:
>>>>
>>>> On 10/30/14, 4:32 PM, Vladimir Ivanov wrote:
>>>>> Coleen,
>>>>>
>>>>> I implemented 2 approaches of the fix.
>>>>>
>>>>> The fix with a special case for VM anon classes is:
>>>>> http://cr.openjdk.java.net/~vlivanov/8060147/webrev.anon.00/
>>>>>
>>>>> Both fix the bug, but have different properties.
>>>>>
>>>>> (1) Special case for VM anon class is very focused on the actual
>>>>> cause, but more fragile - all the logic which keeps metadata from
>>>>> being deallocated is non-trivial and scattered around the whole
>>>>> ciMetadata hierarchy.
>>>>>
>>>>> (2) On the other hand, initial version, which forcibly creates
>>>>> klass_holder ciObject for each ciMetadata, is much cleaner and
>>>>> localized, but does unnecessary work.
>>>>>
>>>>> Am I right that you prefer (1) as a fix?
>>>>
>>>> Yes, I think this version does less unnecessary work and creates fewer
>>>> ciObjects.   And the comment is useful for finding how we keep
>>>> ciMetadata alive for anonymous classes.   You still have a
>>>> UseNewCode in
>>>> the webrev, though, that you want to take out.
>>>
>>> Thanks, Coleen.
>>>
>>> VladimirK, Roland, what do you think about (1)?
>>>
>>> Best regards,
>>> Vladimir Ivanov

From staffan.larsen at oracle.com  Mon Nov 10 16:39:27 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Mon, 10 Nov 2014 17:39:27 +0100
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460D1DA.4050907@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
	<B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
	<5460D1DA.4050907@oracle.com>
Message-ID: <E62D4958-C26F-4BE3-A8F0-479BCAAE97D4@oracle.com>


> On 10 nov 2014, at 15:55, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> 
> Hi Staffan,
> 
> Ow, it seems very much like it.
> So, what testlist have I missed to catch this?

Probably vm.tmtools.testlist and/or nsk.sajdi.testlist. Just a warning that these tests are far from stable. Sorry about that.

/Staffan

> 
> -Aleksey.
> 
> On 11/10/2014 05:51 PM, Staffan Larsen wrote:
>> I'm afraid this change requires changes in the Serviceability Agent as well. See OopUtilities.threadOopGetName() for example.
>> 
>> /Staffan
>> 
>>> On 10 nov 2014, at 15:19, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>>> 
>>> Hi David, Chris,
>>> 
>>> On 11/10/2014 04:53 PM, Chris Hegarty wrote:
>>>> On 10/11/14 12:56, David Holmes wrote:
>>>>> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>>>>>> I have only looked at the libraries changes, and I think they make
>>>>>> sense. As in, I can find no reason why the name cannot be changed to be a
>>>>>> String.
>>>>> 
>>>>> Very quick response, but IIRC this has been examined in the past and
>>>>> there were reasons why it can't/shouldn't be done. Will try to dig out
>>>>> more details in the morning.
>>>> 
>>>> If there was previous discussion on this, that revealed some substantial
>>>> issue, that would be great, but I can't recall, or find, it now.
>>>> 
>>>> Hotspot express, and the desire for hotspot to run with different
>>>> library versions, would certainly cause complication, but I don't
>>>> believe that is an issue now.
>>>> 
>>>> Just on that, the library changes are minimal, and if this were to
>>>> proceed then they can accompany the hotspot change, as they make their
>>>> way into jdk9/dev.
>>>> 
>>>> Anyway, this should await your reply.
>>> 
>>> Alan was having the same concern, there is an issue with JNI/JVMTI and
>>> other power users that might break when exposed to under-constructed
>>> Thread, e.g:
>>> https://bugs.openjdk.java.net/browse/JDK-6412693
>>> 
>>> This is why I ran jvmti and serviceability tests for this change,
>>> yielding no failures. This reinforces my belief this patch does not
>>> break the important invariant: if there is a problem with "Thread.name =
>>> name.toCharArray()" anywhere in Thread code, then "Thread.name = name"
>>> neither regresses it further nor fixes it.
>>> 
>>> Then I speculated that having char[] name would help VM initialize the
>>> name if we wanted to switch to complete VM-side initialization of
>>> Thread, but it seems we can do String oop instantiation in a similar vein.
>>> 
>>> Caching the name feels like a band-aid, that will probably complicate
>>> the Thread initialization on VM side even more. Let's wait and see if
>>> David can come up with some horror issue we are overlooking. :)
>>> 
>>> Thanks,
>>> -Aleksey.
>>> 
>> 
> 
> 


From roger.riggs at oracle.com  Mon Nov 10 14:31:01 2014
From: roger.riggs at oracle.com (roger riggs)
Date: Mon, 10 Nov 2014 09:31:01 -0500
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460C6F3.1080201@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460C6F3.1080201@oracle.com>
Message-ID: <5460CC25.1000609@oracle.com>

Hi Aleksey,

1) The Thread class javadoc says:
" Unless otherwise noted, passing a {@code null} argument to a constructor
  * or method in this class will cause a {@link NullPointerException} to be
  * thrown."

So, NPE is already specified for setName(null) or any other method.
I'm not in favor of adding the Objects.requireNonNull; the NPE will be 
thrown soon enough,
and in most cases the check is just noise in the source code that creates 
larger bytecode
and extra work for the compiler/interpreter.


2) About not storing the name as a String, I have some vague 
recollection of the
issue being related to exposing an object, settable by the application, 
that can be used
with synchronized and so allows communication and sync issues between threads.

Just because some test doesn't fail, doesn't mean there isn't a 
design/implementation constraint.

Roger


On 11/10/2014 9:08 AM, Aleksey Shipilev wrote:
> Hi Chris,
>
> Thanks for taking a look!
>
> On 11/10/2014 02:52 PM, Chris Hegarty wrote:
>> Trivially, after your changes will NPE be thrown if setName(null), as it
>> is today ?
> There is no way it could throw NPE now, therefore the behavior is
> different. The spec says nothing about NPE though, but it feels wrong to
> pass the null String to setNativeName. I should add
> Objects.requireNonNull there. Will wait for more feedback, and update
> the webrev.
>
> -Aleksey.
>
>


From calvin.cheung at oracle.com  Mon Nov 10 17:17:46 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 10 Nov 2014 09:17:46 -0800
Subject: RFR(XS): 8060721: Test
	runtime/SharedArchiveFile/LimitSharedSizes.java
	fails in jdk 9 fcs new platforms/compiler
In-Reply-To: <54609FB4.6040203@oracle.com>
References: <545A770C.3030503@oracle.com>
	<545AC5D3.9090005@oracle.com>	<545AF8F2.1010106@oracle.com>
	<545C1B31.3060901@oracle.com> <545C68E7.4080807@oracle.com>
	<545D1D56.4050000@oracle.com> <54609FB4.6040203@oracle.com>
Message-ID: <5460F33A.1040000@oracle.com>

Thanks for your re-review, David.

Calvin

On 11/10/2014 3:21 AM, David Holmes wrote:
> On 8/11/2014 5:28 AM, Calvin Cheung wrote:
>> On 11/6/2014 10:38 PM, David Holmes wrote:
>>> Hi Calvin,
>>>
>>> On 7/11/2014 11:06 AM, Calvin Cheung wrote:
>>>> I've updated the webrev at the same location:
>>>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>> I also re-ran the tests.
>>>>
>>>> Please take a look.
>>>
>>>  717         jio_snprintf(class_list_path_str + class_list_path_len,
>>>  718                      sizeof(class_list_path_str) -
>>> class_list_path_len,
>>>  719                      "%slib", os::file_separator());
>>>  720       }
>>>  721     }
>>>  722     class_list_path_len = (int)strlen(class_list_path_str);
>>>
>>> The strlen recalculation at #722 should be moved inside the if-block
>>> as that is the only time it is needed.
>> Agreed.
>>> Also can we not just do += 4 ?
>> I didn't want to use 4 to avoid another magic number but in this case I
>> think it's obvious.
>>
>> I've updated webrev at the same location:
>>      http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>
> Looks good to me.
>
> Thanks,
> David
>
>> thanks,
>> Calvin
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>> Calvin
>>>>
>>>> On 11/5/2014 8:28 PM, Calvin Cheung wrote:
>>>>> On 11/5/2014 4:50 PM, David Holmes wrote:
>>>>>> On 6/11/2014 5:14 AM, Calvin Cheung wrote:
>>>>>>> While upgrading the compiler on Mac for jdk9, we found this 
>>>>>>> compiler
>>>>>>> bug
>>>>>>> where it skips the following 2 lines of code in metaspaceShared.cpp
>>>>>>> when
>>>>>>> optimization is enabled (set to -Os) for the fastdebug and product
>>>>>>> builds.
>>>>>>>      strcat(class_list_path_str, os::file_separator());
>>>>>>>      strcat(class_list_path_str, "classlist");
>>>>>>>
>>>>>>> The bug is reproducible with Xcode 5.1.1 and 6.1.
>>>>>>>
>>>>>>> A workaround fix is to rewrite an "if" block in the
>>>>>>> MetaspaceShared::preload_and_dump() method.
>>>>>>
>>>>>> Can't you simply replace the strcats with jio_snprintf and do away
>>>>>> with the sub_path array?
>>>>> The following works. I'll do more testing before sending an updated
>>>>> webrev.
>>>>>
>>>>> --- a/src/share/vm/memory/metaspaceShared.cpp
>>>>> +++ b/src/share/vm/memory/metaspaceShared.cpp
>>>>> @@ -713,12 +713,15 @@
>>>>>      int class_list_path_len = (int)strlen(class_list_path_str);
>>>>>      if (class_list_path_len >= 3) {
>>>>>        if (strcmp(class_list_path_str + class_list_path_len - 3,
>>>>> "lib") != 0) {
>>>>> -        strcat(class_list_path_str, os::file_separator());
>>>>> -        strcat(class_list_path_str, "lib");
>>>>> +        jio_snprintf(class_list_path_str + class_list_path_len,
>>>>> +                     sizeof(class_list_path_str) -
>>>>> class_list_path_len,
>>>>> +                     "%slib", os::file_separator());
>>>>>        }
>>>>>      }
>>>>> -    strcat(class_list_path_str, os::file_separator());
>>>>> -    strcat(class_list_path_str, "classlist");
>>>>> +    class_list_path_len = (int)strlen(class_list_path_str);
>>>>> +    jio_snprintf(class_list_path_str + class_list_path_len,
>>>>> +                 sizeof(class_list_path_str) - class_list_path_len,
>>>>> +                 "%sclasslist", os::file_separator());
>>>>>      class_list_path = class_list_path_str;
>>>>>    } else {
>>>>>      class_list_path = SharedClassListFile;
>>>>>>
>>>>>> Or even try strncat instead of strcat?
>>>>> I think jio_snprintf is better because it null terminates the string.
>>>>> If I use strncat, I'll need to initialize the entire buffer to null.
>>>>>
>>>>> thanks,
>>>>> Calvin
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8060721
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8060721/webrev/
>>>>>>>
>>>>>>> Testing:
>>>>>>>      JPRT
>>>>>>>      The affected testcase with product, fastdebug, and debug 
>>>>>>> builds
>>>>>>> built with Xcode 5.1.1 and 6.1.
>>>>>>>
>>>>>>> thanks,
>>>>>>> Calvin
>>>>>
>>>>
>>
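
The strcat-to-snprintf rewrite quoted above can be sketched in standalone
form. This is only an illustration under stated assumptions: it uses the
standard std::snprintf rather than HotSpot's jio_snprintf, and the helper
names append_component and class_list_path_for, the 64-byte buffer, and the
"/" separator are hypothetical, not taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Append sep + leaf to the NUL-terminated string in buf (capacity cap bytes).
// Returns false when the result was truncated; buf stays NUL-terminated.
inline bool append_component(char* buf, std::size_t cap,
                             const char* sep, const char* leaf) {
  std::size_t len = std::strlen(buf);
  assert(len < cap);
  int n = std::snprintf(buf + len, cap - len, "%s%s", sep, leaf);
  return n >= 0 && static_cast<std::size_t>(n) < cap - len;
}

// Hypothetical helper following the quoted logic: add "<sep>lib" unless the
// path already ends in "lib", then add "<sep>classlist".
// Returns an empty string if the path does not fit in the fixed buffer.
inline std::string class_list_path_for(const std::string& home,
                                       const char* sep) {
  char buf[64];
  if (home.size() >= sizeof(buf)) return std::string();
  std::memcpy(buf, home.c_str(), home.size() + 1);  // copy including NUL
  std::size_t len = home.size();
  if (len >= 3 && std::strcmp(buf + len - 3, "lib") != 0) {
    if (!append_component(buf, sizeof(buf), sep, "lib")) return std::string();
  }
  if (!append_component(buf, sizeof(buf), sep, "classlist")) {
    return std::string();
  }
  return std::string(buf);
}
```

Passing the remaining capacity on every call is what makes this form
attractive compared with strcat: a path that is too long is reported as
truncated instead of being written past the end of the buffer.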


From coleen.phillimore at oracle.com  Mon Nov 10 17:21:06 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 12:21:06 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64 assert(_count
	> 0) failed: Negative ,counter
Message-ID: <5460F402.4060507@oracle.com>

Summary: Signed bitfield size y can only have (1 << y)-1 values.

We were overflowing the _pos index and reusing the 0th element in 
the MallocSiteTable for two different stack traces which caused the 
assert for deallocation.
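
A minimal sketch of the failure mode, not the actual NMT code: the struct
NarrowHeader, its 8-bit width, and the helper names are hypothetical, and
the field is declared unsigned so that the wrap-around is well defined. It
shows how an index one past the field's range silently comes back as 0, and
how the largest storable value differs between unsigned and signed fields.

```cpp
#include <cassert>

// Hypothetical stand-in for a header with a narrow position index;
// the real MallocHeader declaration is not shown in this thread.
struct NarrowHeader {
  unsigned int pos_idx : 8;  // holds 0..255 only
};

// Store `requested` in the 8-bit field and return what the field kept.
inline unsigned int stored_pos(unsigned int requested) {
  NarrowHeader h;
  h.pos_idx = requested;     // unsigned bitfields keep the value modulo 2^8
  return h.pos_idx;
}

// Largest value an unsigned field of `bits` bits can hold.
inline unsigned int max_unsigned_field(unsigned int bits) {
  assert(bits >= 1 && bits < 32);
  return (1u << bits) - 1u;
}

// Largest value a signed two's-complement field of `bits` bits can hold.
inline int max_signed_field(unsigned int bits) {
  assert(bits >= 1 && bits < 32);
  return static_cast<int>((1u << (bits - 1)) - 1u);
}
```

With these definitions, stored_pos(256) yields 0, which is the kind of
aliasing onto the 0th element described in the summary.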

Tested with nsk.quick.testlist and jtreg runtime tests with 
-XX:NativeMemoryTracking=detail.

open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
bug link https://bugs.openjdk.java.net/browse/JDK-8062870

Thanks,
Coleen

From aleksey.shipilev at oracle.com  Mon Nov 10 17:35:01 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 20:35:01 +0300
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F402.4060507@oracle.com>
References: <5460F402.4060507@oracle.com>
Message-ID: <5460F745.4070808@oracle.com>

On 10.11.2014 20:21, Coleen Phillimore wrote:
> Summary: Signed bitfield size y can only have (1 << y)-1 values.
> 
> We were overflowing the _pos index and reusing the 0th element in
> the MallocSiteTable for two different stack traces which caused the
> assert for deallocation.
> 
> Tested with nsk.quick.testlist and jtreg runtime tests with
> -XX:NativeMemoryTracking=detail.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
> bug link https://bugs.openjdk.java.net/browse/JDK-8062870

Looks good, but made my head hurt a little. I think it deserves a more
bullet-proof rework, a la:

#ifdef _LP64
  #define SIZE_BITS 64
  #define FLAGS_BITS 8
  #define POS_BITS 16
  #define BUCKET_BITS 40
#else
  #define SIZE_BITS 32
  #define FLAGS_BITS 8
  #define POS_BITS 8
  #define BUCKET_BITS 16
#endif  // _LP64

#define MAX_MALLOCSITE_TABLE_SIZE ((size_t)((1 << BUCKET_BITS)-1))
#define MAX_BUCKET_LENGTH         ((size_t)((1 << POS_BITS)-1))

 class MallocHeader VALUE_OBJ_CLASS_SPEC {
   size_t           _size      : SIZE_BITS;
   size_t           _flags     : FLAGS_BITS;
   size_t           _pos_idx   : POS_BITS;
   size_t           _bucket_idx: BUCKET_BITS;
 }

...and assert (SIZE_BITS + FLAGS_BITS + BUCKET_BITS + POS_BITS <=
2*BitsPerWord) somewhere?
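
A rough sketch of the 64-bit variant of this layout with the check done at
compile time. Assumptions, all labeled in the code as well: C++11
static_assert and constexpr are available (HotSpot's own STATIC_ASSERT macro
is not used here), the namespace and identifier names are hypothetical, and
the size field is a plain member rather than a full-width bitfield. One
detail worth guarding in any such layout: a 40-bit mask has to be built from
a 64-bit one, since shifting a plain int 1 by 40 is undefined where int is
32 bits.

```cpp
#include <cassert>
#include <cstdint>

namespace layout_sketch {  // hypothetical names, not HotSpot's

// Field widths for a 64-bit build, taken from the proposal above.
constexpr unsigned kSizeBits   = 64;
constexpr unsigned kFlagsBits  = 8;
constexpr unsigned kPosBits    = 16;
constexpr unsigned kBucketBits = 40;

// Build the limits from a 64-bit one so the 40-bit shift stays in range.
constexpr std::uint64_t kMaxPosIdx    = (std::uint64_t(1) << kPosBits) - 1;
constexpr std::uint64_t kMaxBucketIdx = (std::uint64_t(1) << kBucketBits) - 1;

struct Header {
  std::uint64_t size;
  std::uint64_t flags      : kFlagsBits;
  std::uint64_t pos_idx    : kPosBits;
  std::uint64_t bucket_idx : kBucketBits;
};

// Compile-time checks with readable messages.
static_assert(kSizeBits + kFlagsBits + kPosBits + kBucketBits <= 2 * 64,
              "header fields must fit in two 64-bit words");
// Bitfield packing is implementation-defined; this holds on common ABIs.
static_assert(sizeof(Header) == 16, "header should occupy two 64-bit words");

// Fill a header, rejecting indices that the fields cannot represent.
inline Header make_header(std::uint64_t size, std::uint64_t flags,
                          std::uint64_t bucket, std::uint64_t pos) {
  assert(flags <= 0xFF && pos <= kMaxPosIdx && bucket <= kMaxBucketIdx);
  Header h;
  h.size = size;
  h.flags = flags;
  h.pos_idx = pos;
  h.bucket_idx = bucket;
  return h;
}

}  // namespace layout_sketch
```

Keeping the widths as namespace-scoped constants rather than macros avoids
adding names to the global namespace, at the cost of requiring C++11.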

-Aleksey.


From aleksey.shipilev at oracle.com  Mon Nov 10 17:39:10 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 20:39:10 +0300
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F745.4070808@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F745.4070808@oracle.com>
Message-ID: <5460F83E.9030901@oracle.com>

On 10.11.2014 20:35, Aleksey Shipilev wrote:
> On 10.11.2014 20:21, Coleen Phillimore wrote:
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the _pos index and reusing the 0th element in
>> the MallocSiteTable for two different stack traces which caused the
>> assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with
>> -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
> 
> Looks good, but made my head hurt a little. I think it deserves a more
> bullet-proof rework, a la:
> 
> #ifdef _LP64
>   #define SIZE_BITS 64
>   #define FLAGS_BITS 8
>   #define POS_BITS 16
>   #define BUCKET_BITS 40
> #else
>   #define SIZE_BITS 32
>   #define FLAGS_BITS 8
>   #define POS_BITS 8
>   #define BUCKET_BITS 16
> #endif  // _LP64
> 
> #define MAX_MALLOCSITE_TABLE_SIZE ((size_t)((1 << BUCKET_BITS)-1))
> #define MAX_BUCKET_LENGTH         ((size_t)((1 << POS_BITS)-1))

Also, probably these two guys should be MAX_BUCKET_IDX and MAX_POS_IDX,
respectively. (_pos_idx < MAX_BUCKET_LENGTH) looks more odd than
(_pos_idx < MAX_POS_IDX).


>  class MallocHeader VALUE_OBJ_CLASS_SPEC {
>    size_t           _size      : SIZE_BITS;
>    size_t           _flags     : FLAGS_BITS;
>    size_t           _pos_idx   : POS_BITS;
>    size_t           _bucket_idx: BUCKET_BITS;
>  }
> 
> ...and assert (SIZE_BITS + FLAGS_BITS + BUCKET_BITS + POS_BITS <=
> 2*BitsPerWord) somewhere?

-Aleksey.


From george.triantafillou at oracle.com  Mon Nov 10 17:44:41 2014
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Mon, 10 Nov 2014 12:44:41 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F402.4060507@oracle.com>
References: <5460F402.4060507@oracle.com>
Message-ID: <5460F989.3030807@oracle.com>

Hi Coleen,

This looks good.  Thanks for fixing this.

-George

On 11/10/2014 12:21 PM, Coleen Phillimore wrote:
> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>
> We were overflowing the _pos index and reusing the 0th element in 
> the MallocSiteTable for two different stack traces which caused the 
> assert for deallocation.
>
> Tested with nsk.quick.testlist and jtreg runtime tests with 
> -XX:NativeMemoryTracking=detail.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>
> Thanks,
> Coleen


From mikael.gerdin at oracle.com  Mon Nov 10 17:56:44 2014
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Mon, 10 Nov 2014 18:56:44 +0100
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F83E.9030901@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F745.4070808@oracle.com>
	<5460F83E.9030901@oracle.com>
Message-ID: <5460FC5C.9030306@oracle.com>


On 2014-11-10 18:39, Aleksey Shipilev wrote:
> On 10.11.2014 20:35, Aleksey Shipilev wrote:
>> On 10.11.2014 20:21, Coleen Phillimore wrote:
>>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>>
>>> We were overflowing the _pos index and reusing the 0th element in
>>> the MallocSiteTable for two different stack traces which caused the
>>> assert for deallocation.
>>>
>>> Tested with nsk.quick.testlist and jtreg runtime tests with
>>> -XX:NativeMemoryTracking=detail.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>
>> Looks good, but made my head hurt a little. I think it deserves a more
>> bullet-proof rework, a la:
>>
>> #ifdef _LP64
>>    #define SIZE_BITS 64
>>    #define FLAGS_BITS 8
>>    #define POS_BITS 16
>>    #define BUCKET_BITS 40
>> #else
>>    #define SIZE_BITS 32
>>    #define FLAGS_BITS 8
>>    #define POS_BITS 8
>>    #define BUCKET_BITS 16
>> #endif  // _LP64
>>
>> #define MAX_MALLOCSITE_TABLE_SIZE ((size_t)((1 << BUCKET_BITS)-1))
>> #define MAX_BUCKET_LENGTH         ((size_t)((1 << POS_BITS)-1))
>
> Also, probably these two guys should be MAX_BUCKET_IDX and MAX_POS_IDX,
> respectively. (_pos_idx < MAX_BUCKET_LENGTH) looks more odd than
> (_pos_idx < MAX_POS_IDX).
>
>
>>   class MallocHeader VALUE_OBJ_CLASS_SPEC {
>>     size_t           _size      : SIZE_BITS;
>>     size_t           _flags     : FLAGS_BITS;
>>     size_t           _pos_idx   : POS_BITS;
>>     size_t           _bucket_idx: BUCKET_BITS;
>>   }
>>
>> ...and assert (SIZE_BITS + FLAGS_BITS + BUCKET_BITS + POS_BITS <=
>> 2*BitsPerWord) somewhere?

Perhaps even STATIC_ASSERT(...)

/Mikael

>
> -Aleksey.
>
>

From aleksey.shipilev at oracle.com  Mon Nov 10 18:09:05 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 10 Nov 2014 21:09:05 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <E62D4958-C26F-4BE3-A8F0-479BCAAE97D4@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
	<B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
	<5460D1DA.4050907@oracle.com>
	<E62D4958-C26F-4BE3-A8F0-479BCAAE97D4@oracle.com>
Message-ID: <5460FF41.90208@oracle.com>

On 10.11.2014 19:39, Staffan Larsen wrote:
>> On 10 nov 2014, at 15:55, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>> Ow, it seems very much like it.
>> So, what testlist have I missed to catch this?
> 
> Probably vm.tmtools.testlist and/or nsk.sajdi.testlist. Just a warning that these tests are far from stable. Sorry about that.

Alas, both of these testlists pass with the current change without a hitch.
That probably says something about the test coverage. Any other ideas on
how to test for it? Maybe some manual way?

Anyhow, there is an analogous block in ThreadGroup handling, I can copy
the relevant bits from there. Updated webrev follows soon. Still need to
test if that change is safe.

-Aleksey.


From coleen.phillimore at oracle.com  Mon Nov 10 18:26:52 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 13:26:52 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F745.4070808@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F745.4070808@oracle.com>
Message-ID: <5461036C.7070408@oracle.com>


Yeah, that seems like an improvement.  I'll do it and send it out again.
Thanks,
Coleen

On 11/10/14, 12:35 PM, Aleksey Shipilev wrote:
> On 10.11.2014 20:21, Coleen Phillimore wrote:
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the _pos index and reusing the 0th element in
>> the MallocSiteTable for two different stack traces which caused the
>> assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with
>> -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
> Looks good, but made my head hurt a little. I think it deserves a more
> bullet-proof rework, a la:
>
> #ifdef _LP64
>    #define SIZE_BITS 64
>    #define FLAGS_BITS 8
>    #define POS_BITS 16
>    #define BUCKET_BITS 40
> #else
>    #define SIZE_BITS 32
>    #define FLAGS_BITS 8
>    #define POS_BITS 8
>    #define BUCKET_BITS 16
> #endif  // _LP64
>
> #define MAX_MALLOCSITE_TABLE_SIZE ((size_t)((1 << BUCKET_BITS)-1))
> #define MAX_BUCKET_LENGTH         ((size_t)((1 << POS_BITS)-1))
>
>   class MallocHeader VALUE_OBJ_CLASS_SPEC {
>     size_t           _size      : SIZE_BITS;
>     size_t           _flags     : FLAGS_BITS;
>     size_t           _pos_idx   : POS_BITS;
>     size_t           _bucket_idx: BUCKET_BITS;
>   }
>
> ...and assert (SIZE_BITS + FLAGS_BITS + BUCKET_BITS + POS_BITS <=
> 2*BitsPerWord) somewhere?
>
> -Aleksey.
>
>


From christian.tornqvist at oracle.com  Mon Nov 10 20:00:42 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Mon, 10 Nov 2014 15:00:42 -0500
Subject: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count	> 0) failed: Negative , counter
In-Reply-To: <5460F402.4060507@oracle.com>
References: <5460F402.4060507@oracle.com>
Message-ID: <008901cffd21$023660e0$06a322a0$@oracle.com>

Hi Coleen,

As mentioned offline, please make sure you remove the @ignore from
test/runtime/NMT/MallocTrackingVerify.java as well.

Otherwise this looks good, thanks for fixing this.

Thanks,
Christian

-----Original Message-----
From: hotspot-runtime-dev
[mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Coleen
Phillimore
Sent: Monday, November 10, 2014 12:21 PM
To: hotspot-runtime-dev
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
assert(_count > 0) failed: Negative ,counter

Summary: Signed bitfield size y can only have (1 << y)-1 values.

We were overflowing the _pos index and reusing the 0th element in the
MallocSiteTable for two different stack traces which caused the assert for
deallocation.

Tested with nsk.quick.testlist and jtreg runtime tests with
-XX:NativeMemoryTracking=detail.

open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
bug link https://bugs.openjdk.java.net/browse/JDK-8062870

Thanks,
Coleen


From jiangli.zhou at oracle.com  Mon Nov 10 20:21:48 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 10 Nov 2014 12:21:48 -0800
Subject: RFR 8064375: Change certain errors to warnings in AppCDS output
Message-ID: <54611E5C.6050605@oracle.com>

Please review the following simple fix that changes the non-fatal CDS 
preloading errors into warnings:

http://cr.openjdk.java.net/~jiangli/8064375/webrev.00/

Thanks,
Jiangli

From mikhailo.seledtsov at oracle.com  Mon Nov 10 20:32:51 2014
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Mon, 10 Nov 2014 12:32:51 -0800
Subject: RFR 8064375: Change certain errors to warnings in AppCDS output
In-Reply-To: <54611E5C.6050605@oracle.com>
References: <54611E5C.6050605@oracle.com>
Message-ID: <546120F3.2090507@oracle.com>

Hi Jiangli,

  The changes look good to me.

Misha

On 11/10/2014 12:21 PM, Jiangli Zhou wrote:
> Please review the following simple fix that changes the non-fatal CDS 
> preloading errors into warnings:
>
> http://cr.openjdk.java.net/~jiangli/8064375/webrev.00/
>
> Thanks,
> Jiangli


From jiangli.zhou at oracle.com  Mon Nov 10 20:36:02 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 10 Nov 2014 12:36:02 -0800
Subject: RFR 8064375: Change certain errors to warnings in AppCDS output
In-Reply-To: <546120F3.2090507@oracle.com>
References: <54611E5C.6050605@oracle.com> <546120F3.2090507@oracle.com>
Message-ID: <546121B2.4070705@oracle.com>

Thanks, Misha!

Jiangli

On 11/10/2014 12:32 PM, Mikhailo Seledtsov wrote:
> Hi Jiangli,
>
>  The changes look good to me.
>
> Misha
>
> On 11/10/2014 12:21 PM, Jiangli Zhou wrote:
>> Please review the following simple fix that changes the non-fatal CDS 
>> preloading errors into warnings:
>>
>> http://cr.openjdk.java.net/~jiangli/8064375/webrev.00/
>>
>> Thanks,
>> Jiangli
>


From coleen.phillimore at oracle.com  Mon Nov 10 22:53:29 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 17:53:29 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F745.4070808@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F745.4070808@oracle.com>
Message-ID: <546141E9.8060602@oracle.com>


Aleksey,

I made this change and I'm not happy with it.  Now I have 6 #defines to 
leak into the global hotspot namespace (can't undef because other code 
uses MAX_BUCKET_LENGTH, which relies on these #defines).

I think the differing constants within 10 lines of each other are less 
ugly and make better sense.  They're more direct and less visually 
disturbing than the upper case names.

Also MAX_BUCKET_LENGTH is used in other NMT code where its name makes a 
lot more sense, so I don't want to change that either.

Also the STATIC_ASSERT leads to the most unhelpful error message. I'm 
not a fan.

services/mallocTracker.hpp|264| error: aggregate 'StaticAssert<false> 
DUMMY_STATIC_ASSERT' has incomplete type and cannot be defined
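
For comparison, here is a minimal sketch of the same layout check written
with C++11 static_assert, which prints the supplied message when it fails.
It assumes a C++11 compiler, which the HotSpot build does not require
today. The struct name HeaderBits and the helper max_value are made up for
illustration; the widths are the LP64 values from Aleksey's proposal quoted
below. Scoping the constants in a struct also keeps them out of the global
namespace.

```cpp
#include <cstdint>

// Hypothetical holder for the field widths (LP64 values from the proposal).
struct HeaderBits {
  static constexpr unsigned size_bits   = 64;
  static constexpr unsigned flags_bits  = 8;
  static constexpr unsigned pos_bits    = 16;
  static constexpr unsigned bucket_bits = 40;

  // Largest value that fits in an n-bit unsigned field (valid for n < 64).
  static constexpr uint64_t max_value(unsigned n) {
    return (uint64_t(1) << n) - 1;
  }
};

// Fails at compile time, with a readable message, if the fields outgrow
// two 64-bit words.
static_assert(HeaderBits::size_bits + HeaderBits::flags_bits +
              HeaderBits::pos_bits + HeaderBits::bucket_bits <= 2 * 64,
              "MallocHeader fields must fit in two 64-bit words");
```

With this form a failing check reports the quoted string rather than an
incomplete-type error.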

Thank you for the comments, which I initially agreed with, but working 
with the code made me less happy with them, so I will leave it as is 
(except for one change which I'm going to put out shortly).

Thanks,
Coleen

On 11/10/14, 12:35 PM, Aleksey Shipilev wrote:
> On 10.11.2014 20:21, Coleen Phillimore wrote:
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the the _pos index and reusing the 0th element in
>> the MallocSiteTable for two different stack traces which caused the
>> assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with
>> -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
> Looks good, but made my head hurt a little. I think it deserves a more
> bullet-proof rework, a la:
>
> #ifdef _LP64
>    #define SIZE_BITS 64
>    #define FLAGS_BITS 8
>    #define POS_BITS 16
>    #define BUCKET_BITS 40
> #else
>    #define SIZE_BITS 32
>    #define FLAGS_BITS 8
>    #define POS_BITS 8
>    #define BUCKET_BITS 16
> #endif  // _LP64
>
> #define MAX_MALLOCSITE_TABLE_SIZE ((size_t)((1 << BUCKET_BITS)-1))
> #define MAX_BUCKET_LENGTH         ((size_t)((1 << POS_BITS)-1))
>
>   class MallocHeader VALUE_OBJ_CLASS_SPEC {
>     size_t           _size      : SIZE_BITS;
>     size_t           _flags     : FLAGS_BITS;
>     size_t           _pos_idx   : POS_BITS;
>     size_t           _bucket_idx: BUCKET_BITS;
>   }
>
> ...and assert (SIZE_BITS + FLAGS_BITS + BUCKET_BITS + POS_BITS <=
> 2*BitsPerWord) somewhere?
>
> -Aleksey.
>
>


From coleen.phillimore at oracle.com  Mon Nov 10 23:00:02 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 18:00:02 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5460F989.3030807@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F989.3030807@oracle.com>
Message-ID: <54614372.6080001@oracle.com>


Hi George,
Thanks for the review.  I didn't know there was another test from which 
I needed to remove @ignore.  I've run this test for a couple of hours in 
a loop and it always passes now.  The other bug number was the bug that 
Christian fixed.

open webrev at http://cr.openjdk.java.net/~coleenp/8062870_2/

Thanks,
Coleen

On 11/10/14, 12:44 PM, George Triantafillou wrote:
> Hi Coleen,
>
> This looks good.  Thanks for fixing this.
>
> -George
>
> On 11/10/2014 12:21 PM, Coleen Phillimore wrote:
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the the _pos index and reusing the 0th element in 
>> the MallocSiteTable for two different stack traces which caused the 
>> assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with 
>> -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>
>> Thanks,
>> Coleen
>


From coleen.phillimore at oracle.com  Mon Nov 10 23:10:53 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 18:10:53 -0500
Subject: 8062870: src/share/vm/services/mallocTracker.hpp:64 assert(_count
	> 0) failed: Negative ,counter
In-Reply-To: <008901cffd21$023660e0$06a322a0$@oracle.com>
References: <5460F402.4060507@oracle.com>
	<008901cffd21$023660e0$06a322a0$@oracle.com>
Message-ID: <546145FD.8000207@oracle.com>


Thanks Christian.  You took off the RFR so I couldn't find it!
Coleen

On 11/10/14, 3:00 PM, Christian Tornqvist wrote:
> Hi Coleen,
>
> As mentioned offline, please make sure you remove the @ignore from
> test/runtime/NMT/MallocTrackingVerify.java as well.
>
> Otherwise this looks good, thanks for fixing this.
>
> Thanks,
> Christian
>
> -----Original Message-----
> From: hotspot-runtime-dev
> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Coleen
> Phillimore
> Sent: Monday, November 10, 2014 12:21 PM
> To: hotspot-runtime-dev
> Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
> assert(_count > 0) failed: Negative ,counter
>
> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>
> We were overflowing the the _pos index and reusing the 0th element in the
> MallocSiteTable for two different stack traces which caused the assert for
> deallocation.
>
> Tested with nsk.quick.testlist and jtreg runtime tests with
> -XX:NativeMemoryTracking=detail.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>
> Thanks,
> Coleen
>


From daniel.daugherty at oracle.com  Tue Nov 11 00:00:41 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 10 Nov 2014 17:00:41 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and 8034005
Message-ID: <546151A9.1080100@oracle.com>

Greetings,

I have a Solaris Full Debug Symbols (FDS) fix ready for review.
Yes, it is a small fix, but it is in Makefiles so feel free to
run screaming from the room... :-)  On the plus side the fix does
delete two workaround source files (Coleen would say that's a
Good Thing (TM)!)

The fix is to detect the version of GNU objcopy that is being
used on the machine and only enable Full Debug Symbols when that
version is 2.21.1 or newer. If you don't have the right version,
then the build drops back to pre-FDS build configs with a message
like this:

WARNING: /usr/sfw/bin/gobjcopy --version info:
WARNING: GNU objcopy 2.15
WARNING: an objcopy version of 2.21.1 or newer is needed to create valid .debuginfo files.
WARNING: ignoring above objcopy command.
WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC version.
WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86 version.
WARNING: Solaris 11 Update 1 contains the correct version.
INFO: no objcopy cmd found so cannot create .debuginfo files.
INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
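
The version gate described above lives in the Makefiles; purely as an
illustration of the comparison it performs, here is a small C++ sketch.
The function names parse_version and objcopy_new_enough are made up, and
the parsing is deliberately simplistic: it assumes the first run of digits
on the line starts the version number.

```cpp
#include <cstdio>
#include <string>
#include <tuple>

// Extracts "major.minor[.patch]" from a line such as "GNU objcopy 2.15".
// Returns false if no version number is found; a missing patch level is 0.
inline bool parse_version(const std::string& line,
                          std::tuple<int, int, int>& out) {
  std::size_t pos = line.find_first_of("0123456789");
  if (pos == std::string::npos) return false;
  int major = 0, minor = 0, patch = 0;
  int n = std::sscanf(line.c_str() + pos, "%d.%d.%d", &major, &minor, &patch);
  if (n < 2) return false;  // need at least major.minor
  out = std::make_tuple(major, minor, patch);
  return true;
}

// True when the reported version is 2.21.1 or newer.
inline bool objcopy_new_enough(const std::string& version_line) {
  std::tuple<int, int, int> v;
  return parse_version(version_line, v) && v >= std::make_tuple(2, 21, 1);
}
```

The tuple comparison is lexicographic, so 2.15 is rejected while 2.21.1
and 2.22 are accepted.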

This work is being tracked by the following bug IDs:

     JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
     https://bugs.openjdk.java.net/browse/JDK-8033602

     JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on Solaris X86
     https://bugs.openjdk.java.net/browse/JDK-8034005

Here is the webrev URL:

http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/

Testing:

- JPRT test jobs to verify that the current JPRT Solaris hosts
   are happy
- local builds on my Solaris 10 X86 machine to verify that the
   wrong version of GNU objcopy is caught

Thanks, in advance, for any comments, questions or suggestions.

Dan

From coleen.phillimore at oracle.com  Tue Nov 11 00:12:47 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 19:12:47 -0500
Subject: RFR 8064375: Change certain errors to warnings in AppCDS output
In-Reply-To: <54611E5C.6050605@oracle.com>
References: <54611E5C.6050605@oracle.com>
Message-ID: <5461547F.3050809@oracle.com>


Yes, these messages are better as warnings, since the errors don't 
seem to cause -Xshare:dump to fail.

Coleen

On 11/10/14, 3:21 PM, Jiangli Zhou wrote:
> Please review following simple fix that changes the non-fatal CDS 
> preloading errors into warnings:
>
> http://cr.openjdk.java.net/~jiangli/8064375/webrev.00/
>
> Thanks,
> Jiangli


From jiangli.zhou at oracle.com  Tue Nov 11 00:13:40 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 10 Nov 2014 16:13:40 -0800
Subject: RFR 8064375: Change certain errors to warnings in AppCDS output
In-Reply-To: <5461547F.3050809@oracle.com>
References: <54611E5C.6050605@oracle.com> <5461547F.3050809@oracle.com>
Message-ID: <546154B4.2080002@oracle.com>

Thanks Coleen!

Jiangli

On 11/10/2014 04:12 PM, Coleen Phillimore wrote:
>
> Yes, these messages are better saying Warning since the Error doesn't 
> seem to cause the -Xshare:dump to fail.
>
> Coleen
>
> On 11/10/14, 3:21 PM, Jiangli Zhou wrote:
>> Please review following simple fix that changes the non-fatal CDS 
>> preloading errors into warnings:
>>
>> http://cr.openjdk.java.net/~jiangli/8064375/webrev.00/
>>
>> Thanks,
>> Jiangli
>


From john.r.rose at oracle.com  Tue Nov 11 00:21:17 2014
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 10 Nov 2014 16:21:17 -0800
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative , counter
In-Reply-To: <5460F402.4060507@oracle.com>
References: <5460F402.4060507@oracle.com>
Message-ID: <1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>

I think on many LP64 platforms the value of (1<<40) is 256, same as (1<<8).  It will seldom be the intended value of (int64_t)1<<40.

The C "<<" operator is notoriously devious (not to say shifty).

For shift/mask arithmetic we should be continuing to use macros from globalDefinitions.hpp.
They are far more reliable than C expressions.
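
To illustrate the hazard: the literal 1 has type int, so (1<<40) shifts a 32-bit value by more than its width, which is undefined behavior in C and C++. On x86 the hardware masks the shift count of a 32-bit shift to 5 bits, which is typically where the 256 comes from. A minimal sketch of a safe form follows; low_bits_mask is a made-up name for illustration, not one of the globalDefinitions.hpp macros.

```cpp
#include <cstdint>

// Mask with the low n bits set, computed entirely in 64-bit arithmetic.
// The n >= 64 case is handled separately because shifting a 64-bit value
// by 64 or more is itself undefined.
inline uint64_t low_bits_mask(unsigned n) {
  return n >= 64 ? ~uint64_t(0) : (uint64_t(1) << n) - 1;
}
```

The key point is that the left operand is widened before the shift, so the result does not depend on the width of int.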

- John

On Nov 10, 2014, at 9:21 AM, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:

> Summary: Signed bitfield size y can only have (1 << y)-1 values.
> 
> We were overflowing the the _pos index and reusing the 0th element in the MallocSiteTable for two different stack traces which caused the assert for deallocation.
> 
> Tested with nsk.quick.testlist and jtreg runtime tests with -XX:NativeMemoryTracking=detail.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
> 
> Thanks,
> Coleen


From coleen.phillimore at oracle.com  Tue Nov 11 00:37:23 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 19:37:23 -0500
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <545F8CFA.80809@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>
	<545F8CFA.80809@oracle.com>
Message-ID: <54615A43.10700@oracle.com>


Hi, I think this code looks correct.  Is there a test in the test 
system that exercises this code?  I think it would be hard to write a 
dedicated test, but is there one already in the test sets?

Secondly, could you use the word "adjacent" in the comments, e.g. "reuse 
oopmap for adjacent oops in the class" or something like that?  That 
would have saved me some jotting in my notebook.

I'll sponsor it if you get another reviewer.

Thanks,
Coleen


On 11/9/14, 10:49 AM, Aleksey Shipilev wrote:
> Hi again,
>
> No changes in webrev:
>   http://cr.openjdk.java.net/~shade/8015272/webrev.01/
>
> Please review and sponsor:
>   http://cr.openjdk.java.net/~shade/8015272/8015272.changeset
>
> As per Karen's request, more testing is done, ran the tests on my Linux
> x86_64/fastdebug:
>
> On 11/06/2014 07:07 PM, Aleksey Shipilev wrote:
>> On 11/06/2014 06:01 PM, Karen Kinnear wrote:
>>> - e.g. what about the vmtestbase vm/runtime/contended tests (and yes, some tests should be removed from that testlist)
> vmtestbase vm/runtime/contended: no issues.
> hotspot/test/runtime/ jtreg: no issues.
>
>>> - vmtestbase: vm.quick.testlist (required for runtime changes)
> vm.quick.testlist: no issues.
>
>>> - and since @Contended annotation is used in the JDK core libraries - the jtreg jdk tests?
> jdk/test/java/util/concurrent jtreg: no issues.
> jdk/test/java/lang/Thread jtreg: no issues.
>
>
> Thanks,
> -Aleksey.
>
>


From coleen.phillimore at oracle.com  Tue Nov 11 02:06:17 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 21:06:17 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
References: <5460F402.4060507@oracle.com>
	<1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
Message-ID: <54616F19.6050808@oracle.com>


You are right!  I didn't get 256 because my fix didn't compile on 64 bit 
due to the extra parentheses I added around 1<<40.

I'm changing it to use right_n_bits(40), 16 and 8.  Nothing good ever 
comes from C shifts.
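
The failure mode behind this bug can be sketched in a few lines. The 
struct below is illustrative only and is not the NMT header layout; 
DemoHeader, kMaxPosIdx and set_pos_idx are made-up names. Storing a value 
one past the maximum into an unsigned bitfield wraps to 0, which is how 
two different stack traces could end up sharing the 0th element.

```cpp
#include <cstdint>

// Illustrative header with an 8-bit position index (not the NMT layout).
struct DemoHeader {
  uint32_t flags   : 8;
  uint32_t pos_idx : 8;
  uint32_t bucket  : 16;
};

// Largest index an 8-bit field can hold: 255.
const uint32_t kMaxPosIdx = (uint32_t(1) << 8) - 1;

// Stores the index only if it fits; returns false instead of letting an
// out-of-range value wrap around (for example 256 would become 0).
inline bool set_pos_idx(DemoHeader& h, uint32_t idx) {
  if (idx > kMaxPosIdx) return false;
  h.pos_idx = idx;
  return true;
}
```

The limit has to be compared before the store; after the store the 
truncated value is indistinguishable from a legitimate one.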

Thanks!
Coleen

On 11/10/14, 7:21 PM, John Rose wrote:
> I think on many LP64 platforms the value of (1<<40) is 256, same as (1<<8).  It will seldom be the intended value of (int64_t)1<<40.
>
> The C "<<" operator is notoriously devious (not to say shifty).
>
> For shift/mask arithmetic we should be continuing to use macros from globalDefinitions.hpp.
> They are far more reliable than C expressions.
>
> - John
>
> On Nov 10, 2014, at 9:21 AM, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:
>
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the the _pos index and reusing the 0th element in the MallocSiteTable for two different stack traces which caused the assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>
>> Thanks,
>> Coleen


From coleen.phillimore at oracle.com  Tue Nov 11 02:28:30 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 10 Nov 2014 21:28:30 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
References: <5460F402.4060507@oracle.com>
	<1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
Message-ID: <5461744E.2000304@oracle.com>


I've made the change to use right_n_bits and ran the NMT tests 
(including the one that crashed, which I ran in a loop for a while).

open webrev at http://cr.openjdk.java.net/~coleenp/8062870_3/

Thanks.  This is a big improvement.

Coleen

On 11/10/14, 7:21 PM, John Rose wrote:
> I think on many LP64 platforms the value of (1<<40) is 256, same as (1<<8).  It will seldom be the intended value of (int64_t)1<<40.
>
> The C "<<" operator is notoriously devious (not to say shifty).
>
> For shift/mask arithmetic we should be continuing to use macros from globalDefinitions.hpp.
> They are far more reliable than C expressions.
>
> - John
>
> On Nov 10, 2014, at 9:21 AM, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:
>
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>
>> We were overflowing the the _pos index and reusing the 0th element in the MallocSiteTable for two different stack traces which caused the assert for deallocation.
>>
>> Tested with nsk.quick.testlist and jtreg runtime tests with -XX:NativeMemoryTracking=detail.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>
>> Thanks,
>> Coleen


From daniel.daugherty at oracle.com  Tue Nov 11 03:46:18 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 10 Nov 2014 20:46:18 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <545C2BC0.3080207@oracle.com>
References: <5452C0B4.4070601@oracle.com> <5457084B.6070808@oracle.com>
	<5458330E.1080207@oracle.com> <54591A3A.1090005@oracle.com>
	<545C2BC0.3080207@oracle.com>
Message-ID: <5461868A.8070308@oracle.com>

The webrev is now available! Sorry for any confusion.

Dan


On 11/6/14 7:17 PM, Daniel D. Daugherty wrote:
> The fix for JDK-8062851 has been reviewed, tested and pushed to
> RT_Baseline.
>
> Time to get back to this review thread so here's an updated webrev:
>
> http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/
>
> David H., I believe I've addressed all of your comments. Please
> let me know if I missed something...
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
>> The cleanup is turning into a bigger change than the fast enter
>> bucket itself so I'm spinning the cleanup into a new bug:
>>
>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Yes, this means that the Contended Locking cleanup bucket has reopened
>> for yet another change...
>>
>> We'll get back to "fast enter" after the dust has settled...
>>
>> Dan
>>
>>
>> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>>> David,
>>>
>>> Thanks for the review! As usual, replies are embedded below...
>>>
>>>
>>> On 11/2/14 9:44 PM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> Looks good.
>>>
>>> Thanks!
>>>
>>>
>>>> Couple of nits and one semantic query below ...
>>>>
>>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Yes, I have no idea what got into me. Normally I do formatting
>>> changes separately so the noise does not distract...
>>>
>>> It turns out there is a constant defined that should be used
>>> instead of all these literal '2's:
>>>
>>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>>
>>> Typically used as follows:
>>>
>>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset = 
>>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>>
>>> I will clean this up just for the files that I'm touching as
>>> part of this fix.
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Same reply as for macroAssembler_sparc.cpp.
>>>
>>>
>>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>>> 1930     movptr(Address(boxReg, 0), 
>>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>>
>>>> At 1870 we refer to box rather than stackBox. Also it takes some 
>>>> sleuthing to realize that "3" here is somehow a pseudonym for 
>>>> unused_mark(). Back up at 1808 we have a to-do:
>>>>
>>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>>
>>>> so the current change seems to be implementing that, even though 
>>>> other uses of "3" are left untouched.
>>>
>>> I'll take a look at cleaning those up also...
>>>
>>> In some cases markOopDesc::marked_value will work for the literal '3',
>>> but in other cases we'll use markOop::unused_mark():
>>>
>>>   static markOop unused_mark() {
>>>     return (markOop) marked_value;
>>>   }
>>>
>>> to save us the noise of the (markOop) cast.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/sharedRuntime.cpp
>>>>
>>>> 1794 JRT_BLOCK_ENTRY(void, 
>>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* 
>>>> lock, JavaThread* thread))
>>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) 
>>>> return;
>>>>
>>>> Is it necessary to check is_synchronizing? If we are executing this 
>>>> code we are not at a safepoint and the quick_enter wont change 
>>>> that, so I'm not sure what we are guarding against.
>>>
>>> So this first state checker:
>>>
>>> src/share/vm/runtime/safepoint.hpp:
>>> inline static bool is_synchronizing()  { return _state == 
>>> _synchronizing;  }
>>>
>>> means that we want to go to a safepoint and:
>>>
>>> inline static bool is_at_safepoint()   { return _state == 
>>> _synchronized;  }
>>>
>>> means that we are at a safepoint. Dice's optimization bails out if
>>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>>> code to be quick (and not go to a safepoint). I'm not seeing
>>> anything obvious....
>>>
>>> Sometimes we have to be careful with JavaThread suspend requests and
>>> monitor acquisition, but I don't think that's a problem here... In
>>> order for the "suspend requesting" thread to be surprised, the suspend
>>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>>> the suspend target has do something unexpected like acquire a monitor
>>> that it was previously blocked upon when it was suspended. We've had
>>> bugs like that in the past... In this optimization case, our target
>>> thread is not blocked on a contended monitor...
>>>
>>> In this particular case, the "suspend requesting" thread will set the
>>> suspend request state on the target thread, but the target thread is
>>> busy trying to enter this uncontended monitor (quickly). So the
>>> "suspend requesting" thread, will request a no-op safepoint, but it
>>> won't return from the suspend API until that safepoint completes.
>>> The safepoint won't complete until the target thread is done acquiring
>>> the previously uncontended monitor... so the target thread will be
>>> suspended while holding the previous uncontended monitor and the
>>> "suspend requesting" thread will return from the suspend API all
>>> happy...
>>>
>>> Well, I don't see the reason either so I'll have to ping Dave Dice
>>> and Karen Kinnear to see if either of them can fill in the history
>>> here. This could be an abundance of caution case.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/synchronizer.cpp
>>>>
>>>> Minor nit: line 153 the usual acronym is NPE (for 
>>>> NullPointerException) not NPX
>>>
>>> I'll do a search for uses of NPX and other uses of 'X' in exception
>>> acronyms...
>>>
>>>
>>>>
>>>> Nit:  159     Thread * const ox
>>>>
>>>> Please change ox to owner.
>>>
>>> Will do.
>>>
>>> Thanks again for the review!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>
>>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>>
>>>>> The code changes in this bucket are primarily a quick_enter()
>>>>> function that works on inflated but uncontended Java monitors.
>>>>> This quick_enter() function is used on the "slow path" for Java
>>>>> Monitor enter operations when the built-in "fast path" (read
>>>>> assembly code) doesn't work.
>>>>>
>>>>> This work is being tracked by the following bug ID:
>>>>>
>>>>>      JDK-8061553 Contended Locking fast enter bucket
>>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>>
>>>>> Here is the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>>
>>>>> Here is the JEP link:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>
>>>>> 8061553 summary of changes:
>>>>>
>>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>>
>>>>> - clean up spacing around some
>>>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>>> - remove optional (EmitSync & 64) code
>>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>>
>>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>>
>>>>> - remove optional (EmitSync & 2) code
>>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>>    the new owner value to be more efficient
>>>>>
>>>>> interfaceSupport.hpp:
>>>>>
>>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>>    JRT_BLOCK_ENTRY into two pieces.
>>>>>
>>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>>
>>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>>    to permit ObjectSynchronizer::quick_enter() call
>>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>>
>>>>> synchronizer.[ch]pp:
>>>>>
>>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>>    inflated but unowned Java monitor without thread state
>>>>>    changes
>>>>>
>>>>> Testing:
>>>>>
>>>>> - Aurora Adhoc RT/SVC baseline batch
>>>>> - JPRT test jobs
>>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>>> - CallTimerGrid stress testing (in process)
>>>>> - Aurora performance testing:
>>>>>    - out of the box for the "promotion" and 32-bit server configs
>>>>>    - heavy weight monitors for the "promotion" and 32-bit server 
>>>>> configs
>>>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>>      (in process)
>>>>>
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>
>>>
>>
>>
>>
>
>
>


From calvin.cheung at oracle.com  Tue Nov 11 06:12:46 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 10 Nov 2014 22:12:46 -0800
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified multiple
	times; using first specification
Message-ID: <5461A8DE.1050009@oracle.com>

This fixes link warnings on Windows such as the following:
jni.obj : warning LNK4197: export 'JNI_CreateJavaVM' specified multiple 
times; using first specification

The warning is reproducible with both VS2010 and VS2013.
It applies to 64-bit only, probably because on 32-bit 
__declspec(dllexport) exports the decorated function name with a 
leading underscore, which is not the case on 64-bit, as described in:
http://stackoverflow.com/questions/3572344/a-warning-with-building-64bit-dll

All those functions are declared with JNIEXPORT (#define JNIEXPORT 
__declspec(dllexport)) and we also add /export:<function name> to 
the link command. Therefore, on 64-bit platforms, we get the "specified 
multiple times" LNK4197 warning.

The fix is to not add those /export options to the link command when 
the platform is 64-bit.
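
As a rough illustration of why the attribute and the linker option 
collide, here is a small sketch. DEMO_EXPORT and demo_create_vm are 
made-up names standing in for JNIEXPORT and the real JNI entry points, 
and the non-Windows branch exists only so the sketch builds elsewhere.

```cpp
// Portable stand-in for JNIEXPORT; DEMO_EXPORT is a made-up name.
#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#else
#  define DEMO_EXPORT __attribute__((visibility("default")))
#endif

// The attribute alone places the symbol in the export table. If the link
// command also passes /export:demo_create_vm and the exported name is not
// decorated, the same export is named twice.
extern "C" DEMO_EXPORT int demo_create_vm(int version) {
  return version > 0 ? 0 : -1;  // placeholder body for illustration
}
```

When the decorated and undecorated names differ, as described above for 
32-bit, the /export option adds a second, distinct name instead of a 
duplicate.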

JBS: https://bugs.openjdk.java.net/browse/JDK-8043491

webrev: http://cr.openjdk.java.net/~ccheung/8043491/webrev/

Tests:
     (1) build jvm.dll via command line (both 32- and 64-bit)
           use configure.sh to setup and then do "make CONF=<config> 
hotspot"

     (2) generate visual studio project files using ProjectCreator (both 
32- and 64-bit)
           build jvm.dll via VS2013 (both 32- and 64-bit)

     (3) JPRT

thanks,
Calvin


From david.holmes at oracle.com  Tue Nov 11 07:06:59 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Nov 2014 17:06:59 +1000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460C960.9080509@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
Message-ID: <5461B593.1000104@oracle.com>

Hi Aleksey,

On 11/11/2014 12:19 AM, Aleksey Shipilev wrote:
> Hi David, Chris,
>
> On 11/10/2014 04:53 PM, Chris Hegarty wrote:
>> On 10/11/14 12:56, David Holmes wrote:
>>> On 10/11/2014 9:52 PM, Chris Hegarty wrote:
>>>> I have only looked at the libraries changes, and I think they make sense
>>>> . As in, I can find no reason why the name cannot be changed to be a
>>>> String.
>>>
>>> Very quick response, but IIRC this has been examined in the past and
>>> there were reasons why it can't/shouldn't be done. Will try to dig out
>>> more details in the morning.
>>
>> If there was previous discussion on this, that revealed some substantial
>> issue, that would be great, but I can't recall, or find, it now.
>>
>> Hotspot express, and the desire for hotspot to run with different
>> library versions, would certainly cause complication, but I don't
>> believe that is an issue now.
>>
>> Just on that, the library changes are minimal, and if this were to
>> proceed then they can accompany the hotspot change, as they make their
>> way into jdk9/dev.
>>
>> Anyway, this should await your reply.
>
> Alan was having the same concern, there is an issue with JNI/JVMTI and
> other power users that might break when exposed to under-constructed
> Thread, e.g:
>   https://bugs.openjdk.java.net/browse/JDK-6412693
>
> This is why I ran jvmti and serviceability tests for this change,
> yielding no failures. This reinforces my belief this patch does not
> break the important invariant: if there is a problem with "Thread.name =
> name.toCharArray()" anywhere in Thread code, then "Thread.name = name"
> does neither regress it further nor fixes it.

True.

> Then I speculated that having char[] name would help VM initialize the
> name if we wanted to switch to complete VM-side initialization of
> Thread, but it seems we can do String oop instantiation in the similar vein.

I think it really just came down to accessing the Thread name from 
things like JVMDI/PI (now JVM TI) - easier for C code to access a raw 
char[]. Maybe once upon a time (in a land not so far away) we even 
passed char[] to the Thread constructor? :) But having re-discovered 
past discussions etc there's really nothing to stop this from being a 
String (slight memory use increase per Thread object).

> Caching the name feels like a band-aid, that will probably complicate
> the Thread initialization on VM side even more. Let's wait and see if
> David can come up with some horror issue we are overlooking. :)

I don't see how a Java side cache affects anything on the VM 
initialization side - and as Strings can be published unsafely we don't 
even need sync/volatile to do so :)

That aside I think it is as Alan commented - a number of small things 
(some logistical I think) that made this change not worth the effort. 
Maybe now it is worth the effort if getName is a bottleneck (but again 
caching is the common fix for that kind of problem :)). I was concerned 
about executing even more Java code at thread attach time, but we 
already create a String to pass to the Thread constructor, so no change 
there.

So looking at your proposal ... some minor comments ...

JDK change is okay - but "name" doesn't need to be volatile when it is a 
String reference.

Hotspot side:

src/share/vm/classfile/javaClasses.hpp

This added assert seems overly cautious:

  134     oop value = java_string->obj_field(value_offset);
  135     assert((value->is_typeArray() && 
TypeArrayKlass::cast(value->klass())->element_type() == T_CHAR), "expect 
char[]");

you are basically checking that String.value is defined as a char[]. If 
warranted, this is a check needed once in the lifetime of a VM, not every 
time this method is called. (Yes I see we had something similarly odd in 
java_lang_Thread::name. :( )

---

src/share/vm/classfile/javaClasses.cpp

! oop java_lang_Thread::name(oop java_thread) {
     oop name = java_thread->obj_field(_name_offset);
!   assert(name != NULL, "thread name is NULL");

I'm not confident this can never be called before the name has been set. 
The original assertion allowed for NULL as does the JVM TI code.
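
To make the concern concrete, here is a sketch of an accessor that 
tolerates an unset name instead of asserting. The types are hypothetical 
stand-ins and not HotSpot code; DemoThread and name_or_placeholder are 
made-up names.

```cpp
#include <string>

// Hypothetical stand-in for a thread object whose name may not be set yet.
struct DemoThread {
  const std::string* name;  // null until initialization has stored a name
};

// Returns the thread name, or a placeholder when it has not been set,
// rather than asserting that the field is non-null.
inline std::string name_or_placeholder(const DemoThread& t) {
  return t.name != nullptr ? *t.name : std::string("<no-name>");
}
```

A caller that can run before the name is stored then sees the placeholder 
rather than tripping an assert.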

---

src/share/vm/prims/jvmtiTrace.cpp

Copyright year needs updating. :)

---

Aside: I wonder if we've inadvertently fixed 6771287 now. :) That was a 
fun one to debug.

Thanks,
David
-----

> Thanks,
> -Aleksey.
>

From david.holmes at oracle.com  Tue Nov 11 08:02:10 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Nov 2014 18:02:10 +1000
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <545C2BC0.3080207@oracle.com>
References: <5452C0B4.4070601@oracle.com>
	<5457084B.6070808@oracle.com>	<5458330E.1080207@oracle.com>
	<54591A3A.1090005@oracle.com> <545C2BC0.3080207@oracle.com>
Message-ID: <5461C282.1020806@oracle.com>

On 7/11/2014 12:17 PM, Daniel D. Daugherty wrote:
> The fix for JDK-8062851 has been reviewed, tested and pushed to
> RT_Baseline.
>
> Time to get back to this review thread so here's an updated webrev:
>
> http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/
>
> David H., I believe I've addressed all of your comments. Please
> let me know if I missed something...

Looks good to me - thanks Dan!

David
-----

> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
>> The cleanup is turning into a bigger change than the fast enter
>> bucket itself so I'm spinning the cleanup into a new bug:
>>
>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Yes, this means that the Contended Locking cleanup bucket has reopened
>> for yet another change...
>>
>> We'll get back to "fast enter" after the dust has settled...
>>
>> Dan
>>
>>
>> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>>> David,
>>>
>>> Thanks for the review! As usual, replies are embedded below...
>>>
>>>
>>> On 11/2/14 9:44 PM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> Looks good.
>>>
>>> Thanks!
>>>
>>>
>>>> Couple of nits and one semantic query below ...
>>>>
>>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Yes, I have no idea what got into me. Normally I do formatting
>>> changes separately so the noise does not distract...
>>>
>>> It turns out there is a constant defined that should be used
>>> instead of all these literal '2's:
>>>
>>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>>
>>> Typically used as follows:
>>>
>>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset =
>>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>>
>>> I will clean this up just for the files that I'm touching as
>>> part of this fix.
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Same reply as for macroAssembler_sparc.cpp.
>>>
>>>
>>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>>> 1930     movptr(Address(boxReg, 0),
>>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>>
>>>> At 1870 we refer to box rather than stackBox. Also it takes some
>>>> sleuthing to realize that "3" here is somehow a pseudonym for
>>>> unused_mark(). Back up at 1808 we have a to-do:
>>>>
>>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>>
>>>> so the current change seems to be implementing that, even though
>>>> other uses of "3" are left untouched.
>>>
>>> I'll take a look at cleaning those up also...
>>>
>>> In some cases markOopDesc::marked_value will work for the literal '3',
>>> but in other cases we'll use markOop::unused_mark():
>>>
>>>   static markOop unused_mark() {
>>>     return (markOop) marked_value;
>>>   }
>>>
>>> to save us the noise of the (markOop) cast.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/sharedRuntime.cpp
>>>>
>>>> 1794 JRT_BLOCK_ENTRY(void,
>>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>>>> lock, JavaThread* thread))
>>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock))
>>>> return;
>>>>
>>>> Is it necessary to check is_synchronizing? If we are executing this
>>>> code we are not at a safepoint and the quick_enter won't change that,
>>>> so I'm not sure what we are guarding against.
>>>
>>> So this first state checker:
>>>
>>> src/share/vm/runtime/safepoint.hpp:
>>> inline static bool is_synchronizing()  { return _state ==
>>> _synchronizing;  }
>>>
>>> means that we want to go to a safepoint and:
>>>
>>> inline static bool is_at_safepoint()   { return _state ==
>>> _synchronized;  }
>>>
>>> means that we are at a safepoint. Dice's optimization bails out if
>>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>>> code to be quick (and not go to a safepoint). I'm not seeing
>>> anything obvious....
>>>
>>> Sometimes we have to be careful with JavaThread suspend requests and
>>> monitor acquisition, but I don't think that's a problem here... In
>>> order for the "suspend requesting" thread to be surprised, the suspend
>>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>>> the suspend target has to do something unexpected like acquire a monitor
>>> that it was previously blocked upon when it was suspended. We've had
>>> bugs like that in the past... In this optimization case, our target
>>> thread is not blocked on a contended monitor...
>>>
>>> In this particular case, the "suspend requesting" thread will set the
>>> suspend request state on the target thread, but the target thread is
>>> busy trying to enter this uncontended monitor (quickly). So the
>>> "suspend requesting" thread, will request a no-op safepoint, but it
>>> won't return from the suspend API until that safepoint completes.
>>> The safepoint won't complete until the target thread is done acquiring
>>> the previously uncontended monitor... so the target thread will be
>>> suspended while holding the previous uncontended monitor and the
>>> "suspend requesting" thread will return from the suspend API all
>>> happy...
>>>
>>> Well, I don't see the reason either so I'll have to ping Dave Dice
>>> and Karen Kinnear to see if either of them can fill in the history
>>> here. This could be an abundance of caution case.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/synchronizer.cpp
>>>>
>>>> Minor nit: line 153 the usual acronym is NPE (for
>>>> NullPointerException) not NPX
>>>
>>> I'll do a search for uses of NPX and other uses of 'X' in exception
>>> acronyms...
>>>
>>>
>>>>
>>>> Nit:  159     Thread * const ox
>>>>
>>>> Please change ox to owner.
>>>
>>> Will do.
>>>
>>> Thanks again for the review!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>
>>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>>
>>>>> The code changes in this bucket are primarily a quick_enter()
>>>>> function that works on inflated but uncontended Java monitors.
>>>>> This quick_enter() function is used on the "slow path" for Java
>>>>> Monitor enter operations when the built-in "fast path" (read
>>>>> assembly code) doesn't work.
>>>>>
>>>>> This work is being tracked by the following bug ID:
>>>>>
>>>>>      JDK-8061553 Contended Locking fast enter bucket
>>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>>
>>>>> Here is the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>>
>>>>> Here is the JEP link:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>
>>>>> 8061553 summary of changes:
>>>>>
>>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>>
>>>>> - clean up spacing around some
>>>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>>> - remove optional (EmitSync & 64) code
>>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>>
>>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>>
>>>>> - remove optional (EmitSync & 2) code
>>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>>    the new owner value to be more efficient
>>>>>
>>>>> interfaceSupport.hpp:
>>>>>
>>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>>    JRT_BLOCK_ENTRY into two pieces.
>>>>>
>>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>>
>>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>>    to permit ObjectSynchronizer::quick_enter() call
>>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>>
>>>>> synchronizer.[ch]pp:
>>>>>
>>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>>    inflated but unowned Java monitor without thread state
>>>>>    changes
>>>>>
>>>>> Testing:
>>>>>
>>>>> - Aurora Adhoc RT/SVC baseline batch
>>>>> - JPRT test jobs
>>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>>> - CallTimerGrid stress testing (in process)
>>>>> - Aurora performance testing:
>>>>>    - out of the box for the "promotion" and 32-bit server configs
>>>>>    - heavy weight monitors for the "promotion" and 32-bit server
>>>>> configs
>>>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>>      (in process)
>>>>>
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>
>>>
>>
>>
>>
>
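
Editorial sketch of the tag arithmetic quoted above, where
"ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value"
replaces the literal 2. HotSpot does this in C++ and assembly; the Java
model below is only an illustration. The class name, the method names and
the OWNER_OFFSET value are assumptions made for the example. Only
monitor_value = 2 and the use of "3" for marked_value come from the thread.

```java
// Hypothetical Java model of the mark-word tag arithmetic; not HotSpot code.
final class MarkTagSketch {
    // From the thread: monitor_value = 2, and the literal "3" stands for marked_value.
    static final long MONITOR_VALUE = 2;
    static final long MARKED_VALUE = 3;

    // Assumption for illustration only: some byte offset of the owner field.
    static final long OWNER_OFFSET = 16;

    /** Tags an aligned monitor address with the monitor_value low bits. */
    static long tag(long monitorAddress) {
        return monitorAddress | MONITOR_VALUE;
    }

    /** Computes the owner field address directly from the tagged value. */
    static long ownerAddress(long taggedMark) {
        // Subtracting the tag folds the untagging into the displacement.
        return taggedMark + (OWNER_OFFSET - MONITOR_VALUE);
    }

    public static void main(String[] args) {
        long monitor = 0x7f00_1000L; // made-up aligned address
        System.out.println(ownerAddress(tag(monitor)) == monitor + OWNER_OFFSET);
    }
}
```

Naming the constant makes it visible that the "- 2" undoes the tag rather
than adjusting the field offset itself.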

From staffan.larsen at oracle.com  Tue Nov 11 08:03:18 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Nov 2014 09:03:18 +0100
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5460FF41.90208@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
	<B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
	<5460D1DA.4050907@oracle.com>
	<E62D4958-C26F-4BE3-A8F0-479BCAAE97D4@oracle.com>
	<5460FF41.90208@oracle.com>
Message-ID: <0A9ACBF1-F16F-444A-9CB4-5338C18F68E4@oracle.com>

I was able to provoke the failure with a "jstack -F". I think this patch solves the problem: http://cr.openjdk.java.net/~sla/8059677-thread.name.sa.patch. Feel free to not include the changes in StackTrace.java if you don't want to complicate your review.

Thanks,
/Staffan


> On 10 nov 2014, at 19:09, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> 
> On 10.11.2014 19:39, Staffan Larsen wrote:
>>> On 10 nov 2014, at 15:55, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>>> Ow, it seems very like it.
>>> So, what testlist have I missed to catch this?
>> 
>> Probably vm.tmtools.testlist and/or nsk.sajdi.testlist. Just a warning that these tests are far from stable. Sorry about that.
> 
> Alas, both these testlists pass with current change without a hitch.
> That probably tells something about the test coverage. Any other ideas
> how to test for it? Maybe some manual way?
> 
> Anyhow, there is a synonymous block in ThreadGroup handling, I can copy
> the relevant bits from there. Updated webrev follows soon. Still need to
> test if that change is safe.
> 
> -Aleksey.
> 


From david.holmes at oracle.com  Tue Nov 11 08:14:38 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Nov 2014 18:14:38 +1000
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5461744E.2000304@oracle.com>
References: <5460F402.4060507@oracle.com>	<1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
	<5461744E.2000304@oracle.com>
Message-ID: <5461C56E.9090301@oracle.com>

On 11/11/2014 12:28 PM, Coleen Phillimore wrote:
>
> I've made the change to use right_n_bits and run the NMT tests
> (including the one that crashed in a loop for a while).
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870_3/
>
> Thanks.  This is a big improvement.

Looks good to me too. I think "we" forget the goodies located in 
globalDefinitions.hpp sometimes :)

David

> Coleen
>
> On 11/10/14, 7:21 PM, John Rose wrote:
>> I think on many LP64 platforms the value of (1<<40) is 256, same as
>> (1<<8).  It will seldom be the intended value of (int64_t)1<<40.
>>
>> The C "<<" operator is notoriously devious (not to say shifty).
>>
>> For shift/mask arithmetic we should be continuing to use macros from
>> globalDefinitions.hpp.
>> They are far more reliable than C expressions.
>>
>> -- John
>>
>> On Nov 10, 2014, at 9:21 AM, Coleen Phillimore
>> <coleen.phillimore at oracle.com> wrote:
>>
>>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>>>
>>> We were overflowing the _pos index and reusing the 0th element in
>>> the MallocSiteTable for two different stack traces which caused the
>>> assert for deallocation.
>>>
>>> Tested with nsk.quick.testlist and jtreg runtime tests with
>>> -XX:NativeMemoryTracking=detail.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870/
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>>
>>> Thanks,
>>> Coleen
>
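
Editorial sketch of the shift pitfall John describes above. In C and C++,
shifting an int by 40 is undefined behavior, so the 256 he mentions is what
many platforms happen to produce. Java defines the same masking of the shift
distance, which makes it a convenient place to show the effect. rightNBits
is a hypothetical Java analogue of a "low n bits set" mask; it is not the
globalDefinitions.hpp macro itself.

```java
// Illustration only: Java masks an int shift distance to its low 5 bits.
final class ShiftSketch {
    /** Returns a mask with the low n bits set; all ones when n >= 64. */
    static long rightNBits(int n) {
        return n >= Long.SIZE ? -1L : (1L << n) - 1;
    }

    public static void main(String[] args) {
        System.out.println(1 << 40);   // 256, because 40 & 31 == 8
        System.out.println(1L << 40);  // 1099511627776, the intended value
        System.out.println(Long.toHexString(rightNBits(40))); // ffffffffff
    }
}
```

Routing shift and mask arithmetic through one helper keeps the operand
width explicit in a single place.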

From dmitry.samersoff at oracle.com  Tue Nov 11 08:35:40 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 11 Nov 2014 11:35:40 +0300
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <546151A9.1080100@oracle.com>
References: <546151A9.1080100@oracle.com>
Message-ID: <5461CA5C.30409@oracle.com>

Dan,

1. defs.make:

It might be better to join the objcopy version check and the condition at ll.190;

otherwise the user will get a wrong-version warning followed by the misleading
message "no objcopy cmd found".

2. Did you consider moving objcopy detection to configure?


-Dmitry


On 2014-11-11 03:00, Daniel D. Daugherty wrote:
> Greetings,
> 
> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
> Yes, it is a small fix, but it is in Makefiles so feel free to
> run screaming from the room... :-)  On the plus side the fix does
> delete two work around source files (Coleen would say that's a
> Good Thing (TM)!)
> 
> The fix is to detect the version of GNU objcopy that is being
> used on the machine and only enable Full Debug Symbols when that
> version is 2.21.1 or newer. If you don't have the right version,
> then the build drops back to pre-FDS build configs with a message
> like this:
> 
> WARNING: /usr/sfw/bin/gobjcopy --version info:
> WARNING: GNU objcopy 2.15
> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
> .debuginfo files.
> WARNING: ignoring above objcopy command.
> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
> version.
> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
> version.
> WARNING: Solaris 11 Update 1 contains the correct version.
> INFO: no objcopy cmd found so cannot create .debuginfo files.
> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
> 
> This work is being tracked by the following bug IDs:
> 
>     JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>     https://bugs.openjdk.java.net/browse/JDK-8033602
> 
>     JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
> Solaris X86
>     https://bugs.openjdk.java.net/browse/JDK-8034005
> 
> Here is the webrev URL:
> 
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
> 
> Testing:
> 
> - JPRT test jobs to verify that the current JPRT Solaris hosts
>   are happy
> - local builds on my Solaris 10 X86 machine to verify that the
>   wrong version of GNU objcopy is caught
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the source code.
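
Editorial sketch of the version gate discussed above, written in Java
purely to illustrate the comparison; the real check lives in the makefiles.
The class name, method names and message strings are hypothetical. Only the
2.21.1 threshold and the 2.15 example come from the thread. It follows one
reading of point 1: make a single decision so that a too-old objcopy is not
later reported as a missing one.

```java
// Hypothetical model of the objcopy version gate; not the makefile logic.
final class ObjcopyVersionSketch {
    static final String MINIMUM = "2.21.1";

    /** Compares dotted numeric versions; missing components count as 0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns one diagnosis per case so the messages cannot contradict. */
    static String diagnose(String version) {
        if (version == null) {
            return "no objcopy cmd found";
        }
        if (compareVersions(version, MINIMUM) < 0) {
            return "objcopy " + version + " is older than " + MINIMUM;
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(diagnose("2.15"));
        System.out.println(diagnose("2.21.1"));
        System.out.println(diagnose(null));
    }
}
```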

From aleksey.shipilev at oracle.com  Tue Nov 11 09:05:18 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 12:05:18 +0300
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <546141E9.8060602@oracle.com>
References: <5460F402.4060507@oracle.com> <5460F745.4070808@oracle.com>
	<546141E9.8060602@oracle.com>
Message-ID: <5461D14E.1080708@oracle.com>

On 11.11.2014 01:53, Coleen Phillimore wrote:
> Thank you for the comments which I initially agreed with but working
> with the code, makes me less happy and I will leave it as is (except one
> change which I'm going to put out shortly).

All right, that's your call.

-Aleksey.


From aleksey.shipilev at oracle.com  Tue Nov 11 09:10:09 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 12:10:09 +0300
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5461744E.2000304@oracle.com>
References: <5460F402.4060507@oracle.com>	<1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>
	<5461744E.2000304@oracle.com>
Message-ID: <5461D271.90204@oracle.com>

On 11.11.2014 05:28, Coleen Phillimore wrote:
> 
> I've made the change to use right_n_bits and run the NMT tests
> (including the one that crashed in a loop for a while).
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870_3/

Looks good.

What confused me initially is a conceptual impedance: _pos_idx is
bounded by MAX_BUCKET_LENGTH, and _bucket_idx is bounded by
MAX_MALLOCSITE_TABLE_SIZE. Notice the mention of "bucket" in both cases.
So it does not look correct at first glance, and I had to convince
myself that the defined values are not accidentally swapped.
Granted, you can get used to this oddity, but it only takes up valuable
space in the brain ;)

-Aleksey.
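
Editorial sketch of how an index stored in a signed bitfield can wrap, which
is the kind of overflow the bug summary in this thread describes. The
16-bit width is an assumption chosen for the example and is not the actual
NMT layout; the class and method names are hypothetical. Java has no
bitfields, so the store is emulated with a shift pair that sign-extends.

```java
// Illustration only: emulates storing an int into a signed field of 'bits' bits.
final class SignedFieldSketch {
    /** Truncates value to 'bits' bits and sign-extends it (1 <= bits <= 32). */
    static int storeSigned(int value, int bits) {
        int shift = Integer.SIZE - bits;
        return (value << shift) >> shift;
    }

    /** Largest non-negative value a signed field of 'bits' bits can hold. */
    static int maxPositive(int bits) {
        return (int) ((1L << (bits - 1)) - 1);
    }

    public static void main(String[] args) {
        System.out.println(storeSigned(32767, 16)); // fits
        System.out.println(storeSigned(32768, 16)); // wraps to a negative value
        System.out.println(storeSigned(65536, 16)); // wraps back to index 0
    }
}
```

Checking an index against a named bound before storing it avoids the silent
reuse of slot 0 seen in the last case.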


From aleksey.shipilev at oracle.com  Tue Nov 11 09:26:01 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 12:26:01 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <0A9ACBF1-F16F-444A-9CB4-5338C18F68E4@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com>
	<B5073FA9-5725-4CDE-AF75-678E4DD867FE@oracle.com>
	<5460D1DA.4050907@oracle.com>
	<E62D4958-C26F-4BE3-A8F0-479BCAAE97D4@oracle.com>
	<5460FF41.90208@oracle.com>
	<0A9ACBF1-F16F-444A-9CB4-5338C18F68E4@oracle.com>
Message-ID: <5461D629.3010001@oracle.com>

Thanks Staffan, your change is exactly what I (blindly) did in my
updated webrev. I will get David's comments in, respin some tests and
publish the update.

-Aleksey.

On 11.11.2014 11:03, Staffan Larsen wrote:
> I was able to provoke the failure with a "jstack -F". I think this patch
> solves the
> problem: http://cr.openjdk.java.net/~sla/8059677-thread.name.sa.patch.
> Feel free to not include the changes in StackTrace.java if you don't
> want to complicate your review.
> 
> Thanks,
> /Staffan
> 
> 
>> On 10 nov 2014, at 19:09, Aleksey Shipilev
>> <aleksey.shipilev at oracle.com> wrote:
>>
>> On 10.11.2014 19:39, Staffan Larsen wrote:
>>>> On 10 nov 2014, at 15:55, Aleksey Shipilev
>>>> <aleksey.shipilev at oracle.com> wrote:
>>>> Ow, it seems very like it.
>>>> So, what testlist have I missed to catch this?
>>>
>>> Probably vm.tmtools.testlist and/or nsk.sajdi.testlist. Just a
>>> warning that these tests are far from stable. Sorry about that.
>>
>> Alas, both these testlists pass with current change without a hitch.
>> That probably tells something about the test coverage. Any other ideas
>> how to test for it? Maybe some manual way?
>>
>> Anyhow, there is a synonymous block in ThreadGroup handling, I can copy
>> the relevant bits from there. Updated webrev follows soon. Still need to
>> test if that change is safe.
>>
>> -Aleksey.
>>
> 


From aleksey.shipilev at oracle.com  Tue Nov 11 09:38:46 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 12:38:46 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5461B593.1000104@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com> <5461B593.1000104@oracle.com>
Message-ID: <5461D926.6010008@oracle.com>

Hi David,

Updated webrevs will follow after I respin the tests. Meanwhile, some
comments below:

On 11.11.2014 10:06, David Holmes wrote:
> On 11/11/2014 12:19 AM, Aleksey Shipilev wrote:
>> Then I speculated that having char[] name would help VM initialize the
>> name if we wanted to switch to complete VM-side initialization of
>> Thread, but it seems we can do String oop instantiation in the similar
>> vein.
> 
> I think it really just came down to accessing the Thread name from
> things like JVMDI/PI (now JVM TI) - easier for C code to access a raw
> char[]. Maybe once upon a time (in a land not so far away) we even
> passed char[] to the Thread constructor? :) But having re-discovered
> past discussions etc there's really nothing to stop this from being a
> String (slight memory use increase per Thread object).

Yes. char[] does appear simpler from the native side, if not for that pesky
Unicode requirement that forces us to use Unicode routines within the
VM to deal with char[] exposed to the Java side. Not much of an
improvement compared to the String oop dance.


> JDK change is okay - but "name" doesn't need to be volatile when it is a
> String reference.

I understand the memory model reasoning about the correctness, but I
think users rightfully expect getName() to return the last "updated"
Thread.name, even though this requirement is not spelled out
specifically. Therefore, I believe "volatile" should stay.

(I would be violently disappointed about the JDK if I realized my
logging is garbled and the same thread "appears" under several names
back and forth within a short time window -- because of data race on
Thread.name)
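
Editorial sketch of the visibility argument above, using a stand-in class
rather than java.lang.Thread. The class name and the initial name are
hypothetical. With the field declared volatile, a read that follows a write
in synchronization order sees that write; without it, the Java memory model
would allow a reader to keep seeing an older name.

```java
// Illustration only: a stand-in for a renameable thread-like object.
final class NamedWorkerSketch {
    // volatile so readers on other threads observe the latest published name.
    private volatile String name = "worker-0";

    void setName(String newName) {
        if (newName == null) {
            throw new NullPointerException("name cannot be null");
        }
        name = newName;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) throws InterruptedException {
        NamedWorkerSketch worker = new NamedWorkerSketch();
        Thread renamer = new Thread(() -> worker.setName("worker-1"));
        renamer.start();
        renamer.join();
        System.out.println(worker.getName());
    }
}
```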

> Hotspot side:
> 
> src/share/vm/classfile/javaClasses.hpp
> 
> This added assert seems overly cautious:
> 
>  134     oop value = java_string->obj_field(value_offset);
>  135     assert((value->is_typeArray() &&
> TypeArrayKlass::cast(value->klass())->element_type() == T_CHAR), "expect
> char[]");
> 
> you are basically checking that String.value is defined as a char[]. If
> warranted, this is a check needed once in the lifetime of a VM not every
> time this method is called. (Yes I see we had something similarly odd in
> java_lang_thread::name. :( )

Agreed. Dropped the assert from here. I think we already check this for
the String.value field when we pre-compute the value_offset.


> ---
> 
> src/share/vm/classfile/javaClasses.cpp
> 
> ! oop java_lang_Thread::name(oop java_thread) {
>     oop name = java_thread->obj_field(_name_offset);
> !   assert(name != NULL, "thread name is NULL");
> 
> I'm not confident this can never be called before the name has been set.
> The original assertion allowed for NULL as does the JVM TI code.

Agreed. Dropped the assert altogether.


> ---
> 
> src/share/vm/prims/jvmtiTrace.cpp
> 
> Copyright year needs updating. :)

Done.


> ---
> 
> Aside: I wonder if we've inadvertently fixed 6771287 now. :) That was a
> fun one to debug.

Ouch.


-Aleksey.


From david.holmes at oracle.com  Tue Nov 11 10:29:22 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Nov 2014 20:29:22 +1000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5461D926.6010008@oracle.com>
References: <545FB64F.7090705@oracle.com> <5460A6E8.9050506@oracle.com>
	<5460B608.4050909@oracle.com> <5460C354.5000605@oracle.com>
	<5460C960.9080509@oracle.com> <5461B593.1000104@oracle.com>
	<5461D926.6010008@oracle.com>
Message-ID: <5461E502.4070700@oracle.com>

On 11/11/2014 7:38 PM, Aleksey Shipilev wrote:
> Hi David,
>
> Updated webrevs will follow after I respin the tests. Meanwhile, some
> comments below:
>
> On 11.11.2014 10:06, David Holmes wrote:
>> On 11/11/2014 12:19 AM, Aleksey Shipilev wrote:
>>> Then I speculated that having char[] name would help VM initialize the
>>> name if we wanted to switch to complete VM-side initialization of
>>> Thread, but it seems we can do String oop instantiation in the similar
>>> vein.
>>
>> I think it really just came down to accessing the Thread name from
>> things like JVMDI/PI (now JVM TI) - easier for C code to access a raw
>> char[]. Maybe once upon a time (in a land not so far away) we even
>> passed char[] to the Thread constructor? :) But having re-discovered
>> past discussions etc there's really nothing to stop this from being a
>> String (slight memory use increase per Thread object).
>
> Yes. char[] does appear simpler from the native side, if not for that pesky
> Unicode requirement that forces us to use Unicode routines within the
> VM to deal with char[] exposed to the Java side. Not much of an
> improvement compared to the String oop dance.
>
>
>> JDK change is okay - but "name" doesn't need to be volatile when it is a
>> String reference.
>
> I understand the memory model reasoning about the correctness, but I
> think users rightfully expect getName() to return the last "updated"
> Thread.name, even though this requirement is not spelled out
> specifically. Therefore, I believe "volatile" should stay.
>
> (I would be violently disappointed about the JDK if I realized my
> logging is garbled and the same thread "appears" under several names
> back and forth within a short time window -- because of data race on
> Thread.name)

Yes - silly of me. I was thinking the name is only set once but of 
course it can be set many times.

Cheers,
David
------


>> Hotspot side:
>>
>> src/share/vm/classfile/javaClasses.hpp
>>
>> This added assert seems overly cautious:
>>
>>   134     oop value = java_string->obj_field(value_offset);
>>   135     assert((value->is_typeArray() &&
>> TypeArrayKlass::cast(value->klass())->element_type() == T_CHAR), "expect
>> char[]");
>>
>> you are basically checking that String.value is defined as a char[]. If
>> warranted, this is a check needed once in the lifetime of a VM not every
>> time this method is called. (Yes I see we had something similarly odd in
>> java_lang_thread::name. :( )
>
> Agreed. Dropped the assert from here. I think we already check this for
> the String.value field when we pre-compute the value_offset.
>
>
>> ---
>>
>> src/share/vm/classfile/javaClasses.cpp
>>
>> ! oop java_lang_Thread::name(oop java_thread) {
>>      oop name = java_thread->obj_field(_name_offset);
>> !   assert(name != NULL, "thread name is NULL");
>>
>> I'm not confident this can never be called before the name has been set.
>> The original assertion allowed for NULL as does the JVM TI code.
>
> Agreed. Dropped the assert altogether.
>
>
>> ---
>>
>> src/share/vm/prims/jvmtiTrace.cpp
>>
>> Copyright year needs updating. :)
>
> Done.
>
>
>> ---
>>
>> Aside: I wonder if we've inadvertently fixed 6771287 now. :) That was a
>> fun one to debug.
>
> Ouch.
>
>
> -Aleksey.
>
>
>

From aleksey.shipilev at oracle.com  Tue Nov 11 11:38:20 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 14:38:20 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <54615A43.10700@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>	<545F8CFA.80809@oracle.com>
	<54615A43.10700@oracle.com>
Message-ID: <5461F52C.6020002@oracle.com>

Thanks for review, Coleen!

On 11.11.2014 03:37, Coleen Phillimore wrote:
> Hi, I think this code looks correct.  Was there a test in the test
> system that exercises this code?  I think it would be hard to test with
> a dedicated test but was there one already in the test sets?

Yes, there are @Contended tests in vmtestbase that exercise
@Contended placed over different fields. I added a targeted test that
also walks through the new code. There is nothing to check there, except
for the native assert in the new code.

> Secondly, could you use the word adjacent in the comments, reuse oopmap
> for adjacent oops in the class or something like that?  That would have
> saved me some jotting down on notebook.

Sure, see the update. In the previous change, I blindly copied the block
already available for non-@Contended oops. I remember the oop maps code
was tripping me up, which is why we have an explanation all the way at
the top of how oop maps are supposed to work.

> I'll sponsor it if you get another reviewer.

Here's the updated webrev:
 http://cr.openjdk.java.net/~shade/8015272/webrev.02/

I have only tested builds on Linux x86_64/fastdebug, where it passes the
runtime/contended jtreg tests. There were no changes in product code since
the last webrev, only in comments.

Thanks,
-Aleksey.


From david.holmes at oracle.com  Tue Nov 11 12:01:18 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Nov 2014 22:01:18 +1000
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <5461F52C.6020002@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>	<545F8CFA.80809@oracle.com>	<54615A43.10700@oracle.com>
	<5461F52C.6020002@oracle.com>
Message-ID: <5461FA8E.2000301@oracle.com>

On 11/11/2014 9:38 PM, Aleksey Shipilev wrote:
> Thanks for review, Coleen!
>
> On 11.11.2014 03:37, Coleen Phillimore wrote:
>> Hi, I think this code looks correct.  Was there a test in the test
>> system that exercises this code?  I think it would be hard to test with
>> a dedicated test but was there one already in the test sets?
>
> Yes, there are @Contended tests in vmtestbase that exercise
> @Contended placed over different fields. I added a targeted test that
> also walks through the new code. There is nothing to check there, except
> for the native assert in the new code.
>
>> Secondly, could you use the word adjacent in the comments, reuse oopmap
>> for adjacent oops in the class or something like that?  That would have
>> saved me some jotting down on notebook.
>
> Sure, see the update. In the previous change, I blindly copied the block
> already available for non-@Contended oops. I remember the oop maps code
> was tripping me up, which is why we have an explanation all the way at
> the top of how oop maps are supposed to work.
>
>> I'll sponsor it if you get another reviewer.

I'll add my Review. Changes seem okay.

Looks like the style-Police didn't pay enough attention to this section 
of code though as a lot of:

   if( XXX )

have crept in instead of:

   if (XXX)

;-)

Cheers,
David

> Here's the updated webrev:
>   http://cr.openjdk.java.net/~shade/8015272/webrev.02/
>
> I have only tested builds on Linux x86_64/fastdebug, where it passes the
> runtime/contended jtreg tests. There were no changes in product code since
> the last webrev, only in comments.
>
> Thanks,
> -Aleksey.
>

From aleksey.shipilev at oracle.com  Tue Nov 11 12:04:47 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 15:04:47 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <5461FA8E.2000301@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>	<545F8CFA.80809@oracle.com>	<54615A43.10700@oracle.com>
	<5461F52C.6020002@oracle.com> <5461FA8E.2000301@oracle.com>
Message-ID: <5461FB5F.6060406@oracle.com>

On 11.11.2014 15:01, David Holmes wrote:
>>> I'll sponsor it if you get another reviewer.
> 
> I'll add my Review. Changes seem okay.

Thanks!

> Looks like the style-Police didn't pay enough attention to this section
> of code though as a lot of:
> 
>   if( XXX )
> 
> have crept in instead of:
> 
>   if (XXX)
> 
> ;-)

Yes, but unfortunately, that is consistent with the code style in the
method. If we are to change that, we would probably need to change the
style consistently across the entire classLoader.cpp.

-Aleksey.


From aleksey.shipilev at oracle.com  Tue Nov 11 12:09:07 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 15:09:07 +0300
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <5461FB5F.6060406@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>	<545F8CFA.80809@oracle.com>	<54615A43.10700@oracle.com>	<5461F52C.6020002@oracle.com>
	<5461FA8E.2000301@oracle.com> <5461FB5F.6060406@oracle.com>
Message-ID: <5461FC63.6030001@oracle.com>

On 11.11.2014 15:04, Aleksey Shipilev wrote:
> On 11.11.2014 15:01, David Holmes wrote:
>>>> I'll sponsor it if you get another reviewer.
>>
>> I'll add my Review. Changes seem okay.
> 
> Thanks!

Changeset:
  http://cr.openjdk.java.net/~shade/8015272/8015272.changeset

-Aleksey


From gunter.haug at sap.com  Tue Nov 11 13:23:00 2014
From: gunter.haug at sap.com (Haug, Gunter)
Date: Tue, 11 Nov 2014 13:23:00 +0000
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output on
	Linux needs improvement to AIX 
Message-ID: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>

Hi All,

The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs improvement)' makes use of getrusage() to retrieve accurate per-thread data on resource usage. We can use exactly the same code on AIX to achieve this.

Please review the following change:

http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8064471

Thanks,
Gunter
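
For readers unfamiliar with the call mentioned above, here is a minimal sketch of reading per-thread CPU time via getrusage(). It is not taken from the webrev; the function name thread_cpu_seconds() is hypothetical, and the sketch assumes the platform headers expose RUSAGE_THREAD (it returns -1.0 where they do not).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc only exposes RUSAGE_THREAD with this defined
#endif
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper: user + system CPU time consumed by the calling
// thread, in seconds. Returns -1.0 if the value cannot be obtained.
static double thread_cpu_seconds() {
#ifdef RUSAGE_THREAD
  struct rusage usage;
  if (getrusage(RUSAGE_THREAD, &usage) != 0) {
    return -1.0;  // the call failed; the caller has to fall back
  }
  double sec  = double(usage.ru_utime.tv_sec)  + double(usage.ru_stime.tv_sec);
  double usec = double(usage.ru_utime.tv_usec) + double(usage.ru_stime.tv_usec);
  return sec + usec / 1e6;
#else
  return -1.0;    // no per-thread accounting visible on this platform
#endif
}
```

A caller would typically sample this before and after a unit of work and report the difference.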


From volker.simonis at gmail.com  Tue Nov 11 13:27:48 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 11 Nov 2014 14:27:48 +0100
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
Message-ID: <CA+3eh11FK0dXMw_qudC5io4hpcLg2DseW5tjqffemptjUhgMtA@mail.gmail.com>

Hi Gunter,

I think the change looks good.
I can also sponsor the change if we get one more review.

This is a good opportunity to test if we are really able to push
changes to the ppc/aix directories of the hotspot repositories.

Thank you and best regards,
Volker


On Tue, Nov 11, 2014 at 2:23 PM, Haug, Gunter <gunter.haug at sap.com> wrote:
> Hi All,
>
> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs improvement)' makes use of getrusage() to retrieve accurate per-thread data on resource usage. We can use exactly the same code on AIX to achieve this.
>
> Please review the following change:
>
> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8064471
>
> Thanks,
> Gunter
>

From daniel.daugherty at oracle.com  Tue Nov 11 13:55:50 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Nov 2014 06:55:50 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <5461C282.1020806@oracle.com>
References: <5452C0B4.4070601@oracle.com>
	<5457084B.6070808@oracle.com>	<5458330E.1080207@oracle.com>
	<54591A3A.1090005@oracle.com> <545C2BC0.3080207@oracle.com>
	<5461C282.1020806@oracle.com>
Message-ID: <54621566.9040805@oracle.com>

On 11/11/14 1:02 AM, David Holmes wrote:
> On 7/11/2014 12:17 PM, Daniel D. Daugherty wrote:
>> The fix for JDK-8062851 has been reviewed, tested and pushed to
>> RT_Baseline.
>>
>> Time to get back to this review thread so here's an updated webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/
>>
>> David H., I believe I've addressed all of your comments. Please
>> let me know if I missed something...
>
> Looks good to me - thanks Dan!

Thanks for the re-review!

Dan
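
For anyone following the is_synchronizing() question in the review quoted below, here is a small model of the guard being discussed. It is only a sketch under stated assumptions: SafepointModel, ToyMonitor and try_quick_enter() are hypothetical names, the three states mirror the _synchronizing/_synchronized checks quoted from safepoint.hpp, and the real quick_enter() does more than a single compare-and-swap.

```cpp
#include <atomic>

// Hypothetical model of the global safepoint state.
class SafepointModel {
 public:
  enum State { not_synchronized, synchronizing, synchronized };
  explicit SafepointModel(State s) : _state(s) {}
  // A safepoint has been requested but not yet reached.
  bool is_synchronizing() const { return _state == synchronizing; }
  // All threads have reached the safepoint.
  bool is_at_safepoint() const { return _state == synchronized; }
 private:
  State _state;
};

// Hypothetical stand-in for an inflated monitor: only the owner field.
struct ToyMonitor {
  std::atomic<const void*> owner{nullptr};
};

// Returns true if the monitor was taken on the quick path. Returns false
// when a safepoint is pending or the monitor is already owned, in which
// case the caller would continue on the regular slow path.
static bool try_quick_enter(const SafepointModel& sp, ToyMonitor& m,
                            const void* self) {
  if (sp.is_synchronizing()) {
    return false;  // bail out, as the guard under discussion does
  }
  const void* expected = nullptr;
  return m.owner.compare_exchange_strong(expected, self);
}
```

The model only shows where the bail-out sits relative to the owner update; it does not answer whether the guard is strictly required.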


>
> David
> -----
>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
>>> The cleanup is turning into a bigger change than the fast enter
>>> bucket itself so I'm spinning the cleanup into a new bug:
>>>
>>>     JDK-8062851 cleanup ObjectMonitor offset adjustments
>>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>>
>>> Yes, this means that the Contended Locking cleanup bucket has reopened
>>> for yet another change...
>>>
>>> We'll get back to "fast enter" after the dust has settled...
>>>
>>> Dan
>>>
>>>
>>> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>>>> David,
>>>>
>>>> Thanks for the review! As usual, replies are embedded below...
>>>>
>>>>
>>>> On 11/2/14 9:44 PM, David Holmes wrote:
>>>>> Hi Dan,
>>>>>
>>>>> Looks good.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>> Couple of nits and one semantic query below ...
>>>>>
>>>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>>>
>>>>> Formatting changes were a bit of a distraction.
>>>>
>>>> Yes, I have no idea what got into me. Normally I do formatting
>>>> changes separately so the noise does not distract...
>>>>
>>>> It turns out there is a constant defined that should be used
>>>> instead of all these literal '2's:
>>>>
>>>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>>>
>>>> Typically used as follows:
>>>>
>>>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset =
>>>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>>>
>>>> I will clean this up just for the files that I'm touching as
>>>> part of this fix.
>>>>
>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>>
>>>>> Formatting changes were a bit of a distraction.
>>>>
>>>> Same reply as for macroAssembler_sparc.cpp.
>>>>
>>>>
>>>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>>>> 1930     movptr(Address(boxReg, 0),
>>>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>>>
>>>>> At 1870 we refer to box rather than stackBox. Also it takes some
>>>>> sleuthing to realize that "3" here is somehow a pseudonym for
>>>>> unused_mark(). Back up at 1808 we have a to-do:
>>>>>
>>>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>>>
>>>>> so the current change seems to be implementing that, even though
>>>>> other uses of "3" are left untouched.
>>>>
>>>> I'll take a look at cleaning those up also...
>>>>
>>>> In some cases markOopDesc::marked_value will work for the literal '3',
>>>> but in other cases we'll use markOop::unused_mark():
>>>>
>>>>   static markOop unused_mark() {
>>>>     return (markOop) marked_value;
>>>>   }
>>>>
>>>> to save us the noise of the (markOop) cast.
>>>>
>>>>
>>>>> ---
>>>>>
>>>>> src/share/vm/runtime/sharedRuntime.cpp
>>>>>
>>>>> 1794 JRT_BLOCK_ENTRY(void,
>>>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>>>>> lock, JavaThread* thread))
>>>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock))
>>>>> return;
>>>>>
>>>>> Is it necessary to check is_synchronizing? If we are executing this
>>>>> code we are not at a safepoint and the quick_enter won't change that,
>>>>> so I'm not sure what we are guarding against.
>>>>
>>>> So this first state checker:
>>>>
>>>> src/share/vm/runtime/safepoint.hpp:
>>>> inline static bool is_synchronizing()  { return _state ==
>>>> _synchronizing;  }
>>>>
>>>> means that we want to go to a safepoint and:
>>>>
>>>> inline static bool is_at_safepoint()   { return _state ==
>>>> _synchronized;  }
>>>>
>>>> means that we are at a safepoint. Dice's optimization bails out if
>>>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>>>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>>>> code to be quick (and not go to a safepoint). I'm not seeing
>>>> anything obvious....
>>>>
>>>> Sometimes we have to be careful with JavaThread suspend requests and
>>>> monitor acquisition, but I don't think that's a problem here... In
>>>> order for the "suspend requesting" thread to be surprised, the suspend
>>>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>>>> the suspend target has to do something unexpected like acquire a monitor
>>>> that it was previously blocked upon when it was suspended. We've had
>>>> bugs like that in the past... In this optimization case, our target
>>>> thread is not blocked on a contended monitor...
>>>>
>>>> In this particular case, the "suspend requesting" thread will set the
>>>> suspend request state on the target thread, but the target thread is
>>>> busy trying to enter this uncontended monitor (quickly). So the
>>>> "suspend requesting" thread, will request a no-op safepoint, but it
>>>> won't return from the suspend API until that safepoint completes.
>>>> The safepoint won't complete until the target thread is done acquiring
>>>> the previously uncontended monitor... so the target thread will be
>>>> suspended while holding the previous uncontended monitor and the
>>>> "suspend requesting" thread will return from the suspend API all
>>>> happy...
>>>>
>>>> Well, I don't see the reason either so I'll have to ping Dave Dice
>>>> and Karen Kinnear to see if either of them can fill in the history
>>>> here. This could be an abundance of caution case.
>>>>
>>>>
>>>>> ---
>>>>>
>>>>> src/share/vm/runtime/synchronizer.cpp
>>>>>
>>>>> Minor nit: line 153 the usual acronym is NPE (for
>>>>> NullPointerException) not NPX
>>>>
>>>> I'll do a search for uses of NPX and other uses of 'X' in exception
>>>> acronyms...
>>>>
>>>>
>>>>>
>>>>> Nit:  159     Thread * const ox
>>>>>
>>>>> Please change ox to owner.
>>>>
>>>> Will do.
>>>>
>>>> Thanks again for the review!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>
>>>>>
>>>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>>>
>>>>>> The code changes in this bucket are primarily a quick_enter()
>>>>>> function that works on inflated but uncontended Java monitors.
>>>>>> This quick_enter() function is used on the "slow path" for Java
>>>>>> Monitor enter operations when the built-in "fast path" (read
>>>>>> assembly code) doesn't work.
>>>>>>
>>>>>> This work is being tracked by the following bug ID:
>>>>>>
>>>>>>      JDK-8061553 Contended Locking fast enter bucket
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>>>
>>>>>> Here is the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>>>
>>>>>> Here is the JEP link:
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>
>>>>>> 8061553 summary of changes:
>>>>>>
>>>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>>>
>>>>>> - clean up spacing around some
>>>>>>    'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>>>> - remove optional (EmitSync & 64) code
>>>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>>>
>>>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>>>
>>>>>> - remove optional (EmitSync & 2) code
>>>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>>>    the new owner value to be more efficient
>>>>>>
>>>>>> interfaceSupport.hpp:
>>>>>>
>>>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>>>    JRT_BLOCK_ENTRY into two pieces.
>>>>>>
>>>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>>>
>>>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>>>    to permit ObjectSynchronizer::quick_enter() call
>>>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>>>    to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>>>
>>>>>> synchronizer.[ch]pp:
>>>>>>
>>>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>>>    inflated but unowned Java monitor without thread state
>>>>>>    changes
>>>>>>
>>>>>> Testing:
>>>>>>
>>>>>> - Aurora Adhoc RT/SVC baseline batch
>>>>>> - JPRT test jobs
>>>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>>>> - CallTimerGrid stress testing (in process)
>>>>>> - Aurora performance testing:
>>>>>>    - out of the box for the "promotion" and 32-bit server configs
>>>>>>    - heavy weight monitors for the "promotion" and 32-bit server
>>>>>> configs
>>>>>>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>>>      (in process)
>>>>>>
>>>>>>
>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>
>>>>>> Dan
>>>>
>>>>
>>>
>>>
>>>
>>
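
As background for the 'owner_offset_in_bytes() - 2' cleanup quoted above: the subtraction compensates for a tag in the low bits of the mark word that points at the monitor. The sketch below illustrates that arithmetic only. ToyMonitor, tagged_mark() and owner_slot_from_mark() are hypothetical names, the two-field layout is an assumption, and monitor_value = 2 is the constant quoted from markOop.hpp.

```cpp
#include <cstddef>
#include <cstdint>

// Tag value quoted in the thread (markOopDesc::monitor_value).
const uintptr_t monitor_value = 2;

// Hypothetical monitor layout, for illustration only.
struct ToyMonitor {
  void* header;
  void* owner;
};

// Mark word for an inflated object: the monitor address with the tag
// in its low bits (the address is aligned, so those bits are free).
inline uintptr_t tagged_mark(ToyMonitor* m) {
  return reinterpret_cast<uintptr_t>(m) | monitor_value;
}

// Address of the owner field computed directly from the tagged mark.
// Subtracting the tag from the field offset avoids a separate untag step.
inline void** owner_slot_from_mark(uintptr_t mark) {
  const ptrdiff_t owner_offset =
      static_cast<ptrdiff_t>(offsetof(ToyMonitor, owner)) -
      static_cast<ptrdiff_t>(monitor_value);
  return reinterpret_cast<void**>(mark + owner_offset);
}
```

Using the named constant rather than a literal 2 keeps the connection to the tag visible at each use site.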


From aleksey.shipilev at oracle.com  Tue Nov 11 14:40:58 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 17:40:58 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <545FB64F.7090705@oracle.com>
References: <545FB64F.7090705@oracle.com>
Message-ID: <54621FFA.2070503@oracle.com>

Hi,

On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
> Thread.getName() returns String, and does new String instantiation every
> time, because the thread name is stored in char[]. Even though we use a
> private String constructor that shares the char[] array without copying
> it, this still hurts some use cases (think extra-fast logging). To the
> extent some people actually maintain Map<Thread, String> to avoid it.
>  https://bugs.openjdk.java.net/browse/JDK-8059677
> 
> Here's the attempt to maintain String instead of char[]:
>  http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>  http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/

Updated webrevs:
  http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
  http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/

This version incorporates feedback from Chris, Staffan and David. I
think it is very close to what we would like to push. Opinions?

Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
vm.tmtools.testlist

Thanks,
-Aleksey.


From coleen.phillimore at oracle.com  Tue Nov 11 14:59:12 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 11 Nov 2014 09:59:12 -0500
Subject: RFR: 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5461D271.90204@oracle.com>
References: <5460F402.4060507@oracle.com>	<1FA4267A-C001-48A5-9C04-1DA54890C3F1@oracle.com>	<5461744E.2000304@oracle.com>
	<5461D271.90204@oracle.com>
Message-ID: <54622440.6070607@oracle.com>


On 11/11/14, 4:10 AM, Aleksey Shipilev wrote:
> On 11.11.2014 05:28, Coleen Phillimore wrote:
>> I've made the change to use right_n_bits and run the NMT tests
>> (including the one that crashed in a loop for a while).
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870_3/
> Looks good.
>
> What confused me initially is a conceptual impedance: _pos_idx is
> bounded by MAX_BUCKET_LENGTH, and _bucket_idx is bounded by
> MAX_MALLOCSITE_TABLE_SIZE. Notice the mention of "bucket" in both cases.
> So it does not look correct from the first glance, and I had to push
> myself from believing the defined values are not accidentally swapped.
> Granted, you can get used to this oddity, but it only takes a valuable
> space in a brain ;)

Hi Aleksey,

Thanks for the code review.  The MAX_BUCKET_LENGTH concept is used in 
the MallocSiteTable::lookup_or_add function when we add things to the 
hash table for updating counters later.  It makes sense in that 
context.  If I were changing more things in this code, I might be likely 
to make changes to save brain cells.  We have to backport these changes 
to 8u though.

Thanks!
Coleen
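
To illustrate the general idea of bounding an index by the width of the bit field that stores it, here is a minimal sketch. It is not the code in the webrev: the field widths, SiteMarker and make_marker() are hypothetical, and right_n_bits() is reimplemented locally with the same meaning as the HotSpot helper of that name (a mask with the n lowest bits set).

```cpp
#include <cstdint>

// Mask with the n lowest bits set.
constexpr uint32_t right_n_bits(unsigned n) {
  return (n >= 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
}

// Hypothetical field widths, chosen only for this example.
constexpr unsigned kBucketIdxBits = 16;
constexpr unsigned kPosIdxBits    = 8;

// The largest values that fit are derived from the widths, so the
// limits and the storage cannot drift apart.
constexpr uint32_t kMaxBucketIdx = right_n_bits(kBucketIdxBits);
constexpr uint32_t kMaxPosIdx    = right_n_bits(kPosIdxBits);

struct SiteMarker {
  uint32_t bucket_idx : kBucketIdxBits;
  uint32_t pos_idx    : kPosIdxBits;
};

// Stores both indices if they fit; refuses instead of silently truncating.
static bool make_marker(uint32_t bucket, uint32_t pos, SiteMarker* out) {
  if (bucket > kMaxBucketIdx || pos > kMaxPosIdx) {
    return false;
  }
  out->bucket_idx = bucket;
  out->pos_idx    = pos;
  return true;
}
```

An index that silently wraps in a bit field can later point at the wrong table entry, which is one way a counter ends up being decremented more often than it was incremented.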

>
> -Aleksey.
>


From coleen.phillimore at oracle.com  Tue Nov 11 15:12:20 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 11 Nov 2014 10:12:20 -0500
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <5461FB5F.6060406@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>	<545B9CC0.3080106@oracle.com>	<545F8CFA.80809@oracle.com>	<54615A43.10700@oracle.com>
	<5461F52C.6020002@oracle.com> <5461FA8E.2000301@oracle.com>
	<5461FB5F.6060406@oracle.com>
Message-ID: <54622754.5060205@oracle.com>


On 11/11/14, 7:04 AM, Aleksey Shipilev wrote:
> On 11.11.2014 15:01, David Holmes wrote:
>>>> I'll sponsor it if you get another reviewer.
>> I'll add my Review. Changes seem okay.
> Thanks!
>
>> Looks like the style-Police didn't pay enough attention to this section
>> of code though as a lot of:
>>
>>    if( XXX )
>>
>> have crept in instead of:
>>
>>    if (XXX)
>>
>> ;-)
> Yes, but unfortunately, that is consistent with the code style in the
> method. If we are to change that, we would probably need to change the
> style consistently in the entire classLoader.cpp.

I didn't notice that this was a copy.  This code is badly in need of 
refactoring to remove all the duplication, someday.

Coleen

>
> -Aleksey.
>


From karen.kinnear at oracle.com  Tue Nov 11 15:36:54 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 11 Nov 2014 10:36:54 -0500
Subject: RFR (XS) JDK-8015272: Make @Contended within the same group to
	use	the same oop map
In-Reply-To: <545F8CFA.80809@oracle.com>
References: <525AC628.4020906@oracle.com>	<009001cec83e$5e9c6b40$1bd541c0$@oracle.com>	<525B0A18.8000105@oracle.com>
	<545B70F6.60801@oracle.com>	<7029DBDA-BBE6-420A-BE6E-2EA28B60B560@oracle.com>
	<545B9CC0.3080106@oracle.com> <545F8CFA.80809@oracle.com>
Message-ID: <E504F7E5-DEE7-4426-AFD3-EE3734C553D0@oracle.com>

Aleksey,

Many thanks for the additional testing and checking that there was no need for platform-specific
testing.

thanks,
Karen

On Nov 9, 2014, at 10:49 AM, Aleksey Shipilev wrote:

> Hi again,
> 
> No changes in webrev:
> http://cr.openjdk.java.net/~shade/8015272/webrev.01/
> 
> Please review and sponsor:
> http://cr.openjdk.java.net/~shade/8015272/8015272.changeset
> 
> As per Karen's request, more testing is done, ran the tests on my Linux
> x86_64/fastdebug:
> 
> On 11/06/2014 07:07 PM, Aleksey Shipilev wrote:
>> On 11/06/2014 06:01 PM, Karen Kinnear wrote:
>>> - e.g. what about the vmtestbase vm/runtime/contended tests (and yes, some tests should be removed from that testlist)
> 
> vmtestbase vm/runtime/contended: no issues.
> hotspot/test/runtime/ jtreg: no issues.
> 
>>> - vmtestbase: vm.quick.testlist (required for runtime changes)
> 
> vm.quick.testlist: no issues.
> 
>>> - and since @Contended annotation is used in the JDK core libraries - the jtreg jdk tests?
> 
> jdk/test/java/util/concurrent jtreg: no issues.
> jdk/test/java/lang/Thread jtreg: no issues.
> 
> 
> Thanks,
> -Aleksey.
> 
> 


From daniel.daugherty at oracle.com  Tue Nov 11 15:40:32 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Nov 2014 08:40:32 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5461CA5C.30409@oracle.com>
References: <546151A9.1080100@oracle.com> <5461CA5C.30409@oracle.com>
Message-ID: <54622DF0.5010800@oracle.com>

Dmitry,

Thanks for the quick review!

Replies embedded below...


On 11/11/14 1:35 AM, Dmitry Samersoff wrote:
> Dan,
>
> 1. defs.make:
>
> It might be better to join objcopy version check and condition at ll.190

I looked at that... The seemingly natural place to put the version check
is actually in the else branch on line 194... However, if the version
check is bad, then you have to make a second check for a reset OBJCOPY
value (along with indenting all the code another level or two).

It just looked ugly... it seemed better to keep the version check
separate from the other logic.


> otherwise the user will have a wrong version warning and then misleading
> message "no objcopy cmd found"

However, part of that wrong version warning is this line:

WARNING: ignoring above objcopy command.

so in reality that "no objcopy cmd found" is just confirming
that we are indeed ignoring the objcopy cmd that we found...


> 2. Did you consider moving objcopy detection to configure?

No because this fix has to be backported to JDK8u and JDK7 since
we support FDS in those releases...

Of course, the build-infra team is always welcome to use a new
bug to evolve this code for JDK9 and newer.

Again, thanks for the review!

Dan
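
The "2.21.1 or newer" test discussed here needs a numeric, field-by-field comparison rather than a string comparison. The actual check lives in the Makefiles; the sketch below only shows the comparison logic in C++, with hypothetical helper names parse_version() and version_at_least().

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Splits a dotted version such as "2.21.1" into numeric fields.
static std::vector<int> parse_version(const std::string& text) {
  std::vector<int> parts;
  std::istringstream in(text);
  std::string field;
  while (std::getline(in, field, '.')) {
    parts.push_back(field.empty() ? 0 : std::atoi(field.c_str()));
  }
  return parts;
}

// True if 'found' is the same as or newer than 'required'.
// Missing trailing fields count as 0, so "2.22" is treated as "2.22.0".
static bool version_at_least(const std::string& found,
                             const std::string& required) {
  std::vector<int> a = parse_version(found);
  std::vector<int> b = parse_version(required);
  const std::size_t n = std::max(a.size(), b.size());
  a.resize(n, 0);
  b.resize(n, 0);
  return a >= b;  // lexicographic comparison of the numeric fields
}
```

With the version from the sample warning, version_at_least("2.15", "2.21.1") is false, so that objcopy would be ignored.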


>
>
> -Dmitry
>
>
> On 2014-11-11 03:00, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>> Yes, it is a small fix, but it is in Makefiles so feel free to
>> run screaming from the room... :-)  On the plus side the fix does
>> delete two work around source files (Coleen would say that's a
>> Good Thing (TM)!)
>>
>> The fix is to detect the version of GNU objcopy that is being
>> used on the machine and only enable Full Debug Symbols when that
>> version is 2.21.1 or newer. If you don't have the right version,
>> then the build drops back to pre-FDS build configs with a message
>> like this:
>>
>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>> WARNING: GNU objcopy 2.15
>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
>> .debuginfo files.
>> WARNING: ignoring above objcopy command.
>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
>> version.
>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
>> version.
>> WARNING: Solaris 11 Update 1 contains the correct version.
>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>
>> This work is being tracked by the following bug IDs:
>>
>>      JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>      https://bugs.openjdk.java.net/browse/JDK-8033602
>>
>>      JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
>> Solaris X86
>>      https://bugs.openjdk.java.net/browse/JDK-8034005
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>
>> Testing:
>>
>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>    are happy
>> - local builds on my Solaris 10 X86 machine to verify that the
>>    wrong version of GNU objcopy is caught
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>


From coleen.phillimore at oracle.com  Tue Nov 11 15:59:30 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 11 Nov 2014 10:59:30 -0500
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54621FFA.2070503@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
Message-ID: <54623262.9090109@oracle.com>


The Hotspot changes look straightforward and correct to me.
thanks,
Coleen

On 11/11/14, 9:40 AM, Aleksey Shipilev wrote:
> Hi,
>
> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>>   https://bugs.openjdk.java.net/browse/JDK-8059677
>>
>> Here's the attempt to maintain String instead of char[]:
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
> Updated webrevs:
>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
>
> This version incorporates feedback from Chris, Staffan and David. I
> think it is very close to what we would like to push. Opinions?
>
> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
> vm.tmtools.testlist
>
> Thanks,
> -Aleksey.
>
>
>
>


From aleksey.shipilev at oracle.com  Tue Nov 11 16:04:17 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 19:04:17 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54623262.9090109@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
	<54623262.9090109@oracle.com>
Message-ID: <54623381.9080509@oracle.com>

Thanks for review, Coleen!

-Aleksey.

On 11/11/2014 06:59 PM, Coleen Phillimore wrote:
> 
> The Hotspot changes look straightforward and correct to me.
> thanks,
> Coleen
> 
> On 11/11/14, 9:40 AM, Aleksey Shipilev wrote:
>> Hi,
>>
>> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>>> Thread.getName() returns String, and does new String instantiation every
>>> time, because the thread name is stored in char[]. Even though we use a
>>> private String constructor that shares the char[] array without copying
>>> it, this still hurts some use cases (think extra-fast logging). To the
>>> extent some people actually maintain Map<Thread, String> to avoid it.
>>>   https://bugs.openjdk.java.net/browse/JDK-8059677
>>>
>>> Here's the attempt to maintain String instead of char[]:
>>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>> Updated webrevs:
>>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
>>
>> This version incorporates feedback from Chris, Staffan and David. I
>> think it is very close to what we would like to push. Opinions?
>>
>> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
>> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
>> vm.tmtools.testlist
>>
>> Thanks,
>> -Aleksey.
>>
>>
>>
>>
> 


From chris.hegarty at oracle.com  Tue Nov 11 16:10:35 2014
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Tue, 11 Nov 2014 16:10:35 +0000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54621FFA.2070503@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
Message-ID: <DFF0E4A2-1459-4FD6-8C84-C00FCE14D981@oracle.com>

On 11 Nov 2014, at 14:40, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:

> Hi,
> 
> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>> https://bugs.openjdk.java.net/browse/JDK-8059677
>> 
>> Here's the attempt to maintain String instead of char[]:
>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
> 
> Updated webrevs:
>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/

Looks good.

>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/

I skimmed this webrev, and it also looks fine to me.

-Chris.

> This version incorporates feedback from Chris, Staffan and David. I
> think it is very close to what we would like to push. Opinions?
> 
> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
> vm.tmtools.testlist
> 
> Thanks,
> -Aleksey.
> 
> 
> 
> 


From dmitry.samersoff at oracle.com  Tue Nov 11 16:21:36 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 11 Nov 2014 19:21:36 +0300
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <54622DF0.5010800@oracle.com>
References: <546151A9.1080100@oracle.com> <5461CA5C.30409@oracle.com>
	<54622DF0.5010800@oracle.com>
Message-ID: <54623790.4080103@oracle.com>

Dan,

Thank you for the explanation.

The fix looks good to me.

-Dmitry

On 2014-11-11 18:40, Daniel D. Daugherty wrote:
> Dmitry,
> 
> Thanks for the quick review!
> 
> Replies embedded below...
> 
> 
> On 11/11/14 1:35 AM, Dmitry Samersoff wrote:
>> Dan,
>>
>> 1. defs.make:
>>
>> It might be better to join objcopy version check and condition at ll.190
> 
> I looked at that... The seemingly natural place to put the version check
> is actually in the else branch on line 194... However, if the version
> check is bad, then you have to make a second check for a reset OBJCOPY
> value (along with indenting all the code another level or two).
> 
> It just looked ugly... it seemed better to keep the version check
> separate from the other logic.
> 
> 
>> otherwise the user will have a wrong version warning and then misleading
>> message "no objcopy cmd found"
> 
> However, part of that wrong version warning is this line:
> 
> WARNING: ignoring above objcopy command.
> 
> so in reality that "no objcopy cmd found" is just confirming
> that we are indeed ignoring the objcopy cmd that we found...
> 
> 
>> 2. Did you consider moving objcopy detection to configure?
> 
> No because this fix has to be backported to JDK8u and JDK7 since
> we support FDS in those releases...
> 
> Of course, the build-infra team is always welcome to use a new
> bug to evolve this code for JDK9 and newer.
> 
> Again, thanks for the review!
> 
> Dan
> 
> 
>>
>>
>> -Dmitry
>>
>>
>> On 2014-11-11 03:00, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>> run screaming from the room... :-)  On the plus side the fix does
>>> delete two work around source files (Coleen would say that's a
>>> Good Thing (TM)!)
>>>
>>> The fix is to detect the version of GNU objcopy that is being
>>> used on the machine and only enable Full Debug Symbols when that
>>> version is 2.21.1 or newer. If you don't have the right version,
>>> then the build drops back to pre-FDS build configs with a message
>>> like this:
>>>
>>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>>> WARNING: GNU objcopy 2.15
>>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
>>> .debuginfo files.
>>> WARNING: ignoring above objcopy command.
>>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
>>> version.
>>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
>>> version.
>>> WARNING: Solaris 11 Update 1 contains the correct version.
>>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>>
>>> This work is being tracked by the following bug IDs:
>>>
>>>      JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>>      https://bugs.openjdk.java.net/browse/JDK-8033602
>>>
>>>      JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
>>> Solaris X86
>>>      https://bugs.openjdk.java.net/browse/JDK-8034005
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>>
>>> Testing:
>>>
>>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>>    are happy
>>> - local builds on my Solaris 10 X86 machine to verify that the
>>>    wrong version of GNU objcopy is caught
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From daniel.daugherty at oracle.com  Tue Nov 11 17:25:42 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Nov 2014 10:25:42 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <54623790.4080103@oracle.com>
References: <546151A9.1080100@oracle.com> <5461CA5C.30409@oracle.com>
	<54622DF0.5010800@oracle.com> <54623790.4080103@oracle.com>
Message-ID: <54624696.2090201@oracle.com>

Thanks for closing the loop on this!

Dan


On 11/11/14 9:21 AM, Dmitry Samersoff wrote:
> Dan,
>
> Thank you for the explanation.
>
> The fix looks good to me.
>
> -Dmitry
>
> On 2014-11-11 18:40, Daniel D. Daugherty wrote:
>> Dmitry,
>>
>> Thanks for the quick review!
>>
>> Replies embedded below...
>>
>>
>> On 11/11/14 1:35 AM, Dmitry Samersoff wrote:
>>> Dan,
>>>
>>> 1. defs.make:
>>>
>>> It might be better to join objcopy version check and condition at ll.190
>> I looked at that... The seemingly natural place to put the version check
>> is actually in the else branch on line 194... However, if the version
>> check is bad, then you have to make a second check for a reset OBJCOPY
>> value (along with indenting all the code another level or two).
>>
>> It just looked ugly... it seemed better to keep the version check
>> separate from the other logic.
>>
>>
>>> otherwise the user will have a wrong version warning and then misleading
>>> message "no objcopy cmd found"
>> However, part of that wrong version warning is this line:
>>
>> WARNING: ignoring above objcopy command.
>>
>> so in reality that "no objcopy cmd found" is just confirming
>> that we are indeed ignoring the objcopy cmd that we found...
>>
>>
>>> 2. Did you consider moving objcopy detection to configure?
>> No because this fix has to be backported to JDK8u and JDK7 since
>> we support FDS in those releases...
>>
>> Of course, the build-infra team is always welcome to use a new
>> bug to evolve this code for JDK9 and newer.
>>
>> Again, thanks for the review!
>>
>> Dan
>>
>>
>>>
>>> -Dmitry
>>>
>>>
>>> On 2014-11-11 03:00, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>>> run screaming from the room... :-)  On the plus side the fix does
>>>> delete two work around source files (Coleen would say that's a
>>>> Good Thing (TM)!)
>>>>
>>>> The fix is to detect the version of GNU objcopy that is being
>>>> used on the machine and only enable Full Debug Symbols when that
>>>> version is 2.21.1 or newer. If you don't have the right version,
>>>> then the build drops back to pre-FDS build configs with a message
>>>> like this:
>>>>
>>>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>>>> WARNING: GNU objcopy 2.15
>>>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
>>>> .debuginfo files.
>>>> WARNING: ignoring above objcopy command.
>>>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
>>>> version.
>>>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
>>>> version.
>>>> WARNING: Solaris 11 Update 1 contains the correct version.
>>>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>>>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>>>
>>>> This work is being tracked by the following bug IDs:
>>>>
>>>>       JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>>>       https://bugs.openjdk.java.net/browse/JDK-8033602
>>>>
>>>>       JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
>>>> Solaris X86
>>>>       https://bugs.openjdk.java.net/browse/JDK-8034005
>>>>
>>>> Here is the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>>>
>>>> Testing:
>>>>
>>>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>>>     are happy
>>>> - local builds on my Solaris 10 X86 machine to verify that the
>>>>     wrong version of GNU objcopy is caught
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>
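
The version gate described in the original request (enable FDS only when GNU objcopy is 2.21.1 or newer) can be illustrated with a small sketch. This is not the Makefile logic from the webrev; it is a Java illustration of the comparison only. The class and method names are hypothetical, and it assumes the version is the last token on the first line of the `--version` output, as in the "GNU objcopy 2.15" warning quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decides whether an objcopy version string is new enough.
final class ObjcopyVersionCheck {
    // Minimum version named in the thread: 2.21.1
    private static final int[] MINIMUM = {2, 21, 1};

    // Assumption: the version is a dotted number at the end of the line.
    private static final Pattern VERSION =
        Pattern.compile("(\\d+(?:\\.\\d+)*)\\s*$");

    // Returns the numeric components, or null if no trailing version is found.
    static int[] parseVersion(String firstLine) {
        Matcher m = VERSION.matcher(firstLine);
        if (!m.find()) {
            return null;
        }
        String[] parts = m.group(1).split("\\.");
        int[] version = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            version[i] = Integer.parseInt(parts[i]);
        }
        return version;
    }

    // Compares component by component; missing components count as 0,
    // so "2.21" is treated as older than "2.21.1".
    static boolean isNewEnough(String firstLine) {
        int[] have = parseVersion(firstLine);
        if (have == null) {
            return false; // unknown version: fall back to the pre-FDS config
        }
        int n = Math.max(have.length, MINIMUM.length);
        for (int i = 0; i < n; i++) {
            int h = i < have.length ? have[i] : 0;
            int need = i < MINIMUM.length ? MINIMUM[i] : 0;
            if (h != need) {
                return h > need;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String line : new String[] {"GNU objcopy 2.15", "GNU objcopy 2.21.1"}) {
            System.out.println(line + " -> FDS allowed: " + isNewEnough(line));
        }
    }
}
```

Treating an unparseable version as "too old" mirrors the conservative behaviour described in the thread, where the build drops back to the pre-FDS configuration rather than risk invalid .debuginfo files.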


From lois.foltan at oracle.com  Tue Nov 11 18:22:56 2014
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 11 Nov 2014 13:22:56 -0500
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545C21E6.90709@oracle.com>
References: <545C21E6.90709@oracle.com>
Message-ID: <54625400.1000701@oracle.com>

Hi Jiangli,
Yes, this looks good, reviewed.
Lois

On 11/6/2014 8:35 PM, Jiangli Zhou wrote:
> Hi,
>
> Please review the following changes that fix the crash with 
> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
> During VM initialization,  current_stack_pointer() could be called 
> before the VM generates stub routines. The generated get_previous_sp 
> routine cannot be used during that time, use the estimated value for 
> the sp value instead. The x86 implementation is unaffected by the 
> change and always returns the estimated sp value as before.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>
> Tested with JPRT and ExtBadJAR test.
>
> Background:
> As part of the VM initialization, classLoader_init() calls ZIP_Open 
> from the zip library for processing the boot class path when 
> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
> before returning from the zip library call. Following is the backtrace 
> right before when the crash happens. The windows x64 version of 
> current_stack_pointer() uses generated stub routine get_previous_sp 
> (generated by generate_get_previous_sp()) to obtain the stack pointer 
> value. Since classLoader_init() happens before stubRoutines_init1() 
> and the stub routines are not generated at the time, the execution 
> jumps to address 0 (referenced by _get_previous_sp_entry which should 
> contain the address of the generated routine after 
> stubRoutines_init1()) when it's trying to call the stub routine and 
> crashes.
>
>
>      jvm.dll!os::current_stack_pointer() Line 468 C++
>      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>      jvm.dll!JVM_NativePath(char * path) Line 691 C++
>      zip.dll!000007feebc49de0()
>      [Frames below may be incorrect and/or missing, no symbols loaded 
> for zip.dll]
>      zip.dll!000007feebc4af1d()
>      zip.dll!000007feebc4b004()
>      jvm.dll!ClassLoader::create_class_path_entry(const char * path, 
> const stat * st, bool lazy, bool throw_exception, Thread * 
> __the_thread__) Line 666 + 0x13 bytes C++
>      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
> path, bool check_for_duplicates, bool throw_exception) Line 763 + 0x2d 
> bytes C++
>      jvm.dll!ClassLoader::setup_search_path(const char * class_path) 
> Line 630 C++
>      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>      jvm.dll!ClassLoader::initialize() Line 1237 C++
>      jvm.dll!classLoader_init() Line 1291 C++
>      jvm.dll!init_globals() Line 100 C++
>      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
> canTryAgain) Line 3414 + 0x5 bytes C++
>      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
> args) Line 5199 + 0x12 bytes C++
>      java.exe!000000013f0520f6()
>      java.exe!000000013f05cb63()
>      java.exe!000000013f05cbf7()
>      kernel32.dll!0000000076ba59ed()
>      ntdll.dll!0000000076cdc541()
>
> Thanks,
> Jiangli
>
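
The failure mode described above (calling through an entry point that is still 0 because the stub has not been generated yet) and the shape of the fix (fall back to an estimate until the generated routine exists) can be sketched as follows. This is a Java analogy only, not the C++ change in the webrev; the class name, the `install` method, and the use of `LongSupplier` are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Hypothetical analogy: a value source that prefers a "generated" routine
// but falls back to an estimate until that routine has been installed.
final class StackPointerSource {
    // Stays null until initialization installs the generated routine,
    // playing the role of the entry pointer that is 0 before stub generation.
    private volatile LongSupplier generated;

    // Always-available fallback, playing the role of the estimated sp value.
    private final LongSupplier estimate;

    StackPointerSource(LongSupplier estimate) {
        this.estimate = estimate;
    }

    // Called once the "stub routines" exist.
    void install(LongSupplier routine) {
        this.generated = routine;
    }

    // Never calls through an uninstalled entry: checks first, then falls back.
    long current() {
        LongSupplier routine = generated; // single read of the volatile field
        return routine != null ? routine.getAsLong() : estimate.getAsLong();
    }

    public static void main(String[] args) {
        StackPointerSource sp = new StackPointerSource(() -> 100L);
        System.out.println("before install: " + sp.current());
        sp.install(() -> 200L);
        System.out.println("after install:  " + sp.current());
    }
}
```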


From jiangli.zhou at oracle.com  Tue Nov 11 18:27:43 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 11 Nov 2014 10:27:43 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <54625400.1000701@oracle.com>
References: <545C21E6.90709@oracle.com> <54625400.1000701@oracle.com>
Message-ID: <5462551F.1010808@oracle.com>

Thank you for the review, Lois!

Jiangli

On 11/11/2014 10:22 AM, Lois Foltan wrote:
> Hi Jiangli,
> Yes, this looks good, reviewed.
> Lois
>
> On 11/6/2014 8:35 PM, Jiangli Zhou wrote:
>> Hi,
>>
>> Please review the following changes that fix the crash with 
>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
>> During VM initialization,  current_stack_pointer() could be called 
>> before the VM generates stub routines. The generated get_previous_sp 
>> routine cannot be used during that time, use the estimated value for 
>> the sp value instead. The x86 implementation is unaffected by the 
>> change and always returns the estimated sp value as before.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>
>> Tested with JPRT and ExtBadJAR test.
>>
>> Background:
>> As part of the VM initialization, classLoader_init() calls ZIP_Open 
>> from the zip library for processing the boot class path when 
>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
>> before returning from the zip library call. Following is the 
>> backtrace right before when the crash happens. The windows x64 
>> version of current_stack_pointer() uses generated stub routine 
>> get_previous_sp (generated by generate_get_previous_sp()) to obtain 
>> the stack pointer value. Since classLoader_init() happens before 
>> stubRoutines_init1() and the stub routines are not generated at the 
>> time, the execution jumps to address 0 (referenced by 
>> _get_previous_sp_entry which should contain the address of the 
>> generated routine after stubRoutines_init1()) when it's trying to 
>> call the stub routine and crashes.
>>
>>
>>      jvm.dll!os::current_stack_pointer() Line 468 C++
>>      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>      jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>      zip.dll!000007feebc49de0()
>>      [Frames below may be incorrect and/or missing, no symbols loaded 
>> for zip.dll]
>>      zip.dll!000007feebc4af1d()
>>      zip.dll!000007feebc4b004()
>>      jvm.dll!ClassLoader::create_class_path_entry(const char * path, 
>> const stat * st, bool lazy, bool throw_exception, Thread * 
>> __the_thread__) Line 666 + 0x13 bytes C++
>>      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 
>> 0x2d bytes C++
>>      jvm.dll!ClassLoader::setup_search_path(const char * class_path) 
>> Line 630 C++
>>      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>      jvm.dll!ClassLoader::initialize() Line 1237 C++
>>      jvm.dll!classLoader_init() Line 1291 C++
>>      jvm.dll!init_globals() Line 100 C++
>>      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
>> canTryAgain) Line 3414 + 0x5 bytes C++
>>      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
>> args) Line 5199 + 0x12 bytes C++
>>      java.exe!000000013f0520f6()
>>      java.exe!000000013f05cb63()
>>      java.exe!000000013f05cbf7()
>>      kernel32.dll!0000000076ba59ed()
>>      ntdll.dll!0000000076cdc541()
>>
>> Thanks,
>> Jiangli
>>
>


From jiangli.zhou at oracle.com  Tue Nov 11 18:30:49 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 11 Nov 2014 10:30:49 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <54625400.1000701@oracle.com>
References: <545C21E6.90709@oracle.com> <54625400.1000701@oracle.com>
Message-ID: <546255D9.6030600@oracle.com>

Hi Lois,

Actually there was an updated webrev based on Roland's feedback. Since 
you are replying to the original request, not sure if you reviewed the 
latest webrev. If not, here is the link: 
http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/.

Thanks,
Jiangli

On 11/11/2014 10:22 AM, Lois Foltan wrote:
> Hi Jiangli,
> Yes, this looks good, reviewed.
> Lois
>
> On 11/6/2014 8:35 PM, Jiangli Zhou wrote:
>> Hi,
>>
>> Please review the following changes that fix the crash with 
>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
>> During VM initialization,  current_stack_pointer() could be called 
>> before the VM generates stub routines. The generated get_previous_sp 
>> routine cannot be used during that time, use the estimated value for 
>> the sp value instead. The x86 implementation is unaffected by the 
>> change and always returns the estimated sp value as before.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>
>> Tested with JPRT and ExtBadJAR test.
>>
>> Background:
>> As part of the VM initialization, classLoader_init() calls ZIP_Open 
>> from the zip library for processing the boot class path when 
>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
>> before returning from the zip library call. Following is the 
>> backtrace right before when the crash happens. The windows x64 
>> version of current_stack_pointer() uses generated stub routine 
>> get_previous_sp (generated by generate_get_previous_sp()) to obtain 
>> the stack pointer value. Since classLoader_init() happens before 
>> stubRoutines_init1() and the stub routines are not generated at the 
>> time, the execution jumps to address 0 (referenced by 
>> _get_previous_sp_entry which should contain the address of the 
>> generated routine after stubRoutines_init1()) when it's trying to 
>> call the stub routine and crashes.
>>
>>
>>      jvm.dll!os::current_stack_pointer() Line 468 C++
>>      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>      jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>      zip.dll!000007feebc49de0()
>>      [Frames below may be incorrect and/or missing, no symbols loaded 
>> for zip.dll]
>>      zip.dll!000007feebc4af1d()
>>      zip.dll!000007feebc4b004()
>>      jvm.dll!ClassLoader::create_class_path_entry(const char * path, 
>> const stat * st, bool lazy, bool throw_exception, Thread * 
>> __the_thread__) Line 666 + 0x13 bytes C++
>>      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 
>> 0x2d bytes C++
>>      jvm.dll!ClassLoader::setup_search_path(const char * class_path) 
>> Line 630 C++
>>      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>      jvm.dll!ClassLoader::initialize() Line 1237 C++
>>      jvm.dll!classLoader_init() Line 1291 C++
>>      jvm.dll!init_globals() Line 100 C++
>>      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
>> canTryAgain) Line 3414 + 0x5 bytes C++
>>      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
>> args) Line 5199 + 0x12 bytes C++
>>      java.exe!000000013f0520f6()
>>      java.exe!000000013f05cb63()
>>      java.exe!000000013f05cbf7()
>>      kernel32.dll!0000000076ba59ed()
>>      ntdll.dll!0000000076cdc541()
>>
>> Thanks,
>> Jiangli
>>
>


From lois.foltan at oracle.com  Tue Nov 11 18:42:30 2014
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 11 Nov 2014 13:42:30 -0500
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <546255D9.6030600@oracle.com>
References: <545C21E6.90709@oracle.com> <54625400.1000701@oracle.com>
	<546255D9.6030600@oracle.com>
Message-ID: <54625896.9000005@oracle.com>


On 11/11/2014 1:30 PM, Jiangli Zhou wrote:
> Hi Lois,
>
> Actually there was an updated webrev based on Roland's feedback. Since 
> you are replying to the original request, not sure if you reviewed the 
> latest webrev. If not, here is the link: 
> http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/.

My apologies, I did review .02 but responded to your first RFR request.  
Looks fine.
Lois

>
> Thanks,
> Jiangli
>
> On 11/11/2014 10:22 AM, Lois Foltan wrote:
>> Hi Jiangli,
>> Yes, this looks good, reviewed.
>> Lois
>>
>> On 11/6/2014 8:35 PM, Jiangli Zhou wrote:
>>> Hi,
>>>
>>> Please review the following changes that fix the crash with 
>>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
>>> During VM initialization,  current_stack_pointer() could be called 
>>> before the VM generates stub routines. The generated get_previous_sp 
>>> routine cannot be used during that time, use the estimated value for 
>>> the sp value instead. The x86 implementation is unaffected by the 
>>> change and always returns the estimated sp value as before.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>>
>>> Tested with JPRT and ExtBadJAR test.
>>>
>>> Background:
>>> As part of the VM initialization, classLoader_init() calls ZIP_Open 
>>> from the zip library for processing the boot class path when 
>>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
>>> before returning from the zip library call. Following is the 
>>> backtrace right before when the crash happens. The windows x64 
>>> version of current_stack_pointer() uses generated stub routine 
>>> get_previous_sp (generated by generate_get_previous_sp()) to obtain 
>>> the stack pointer value. Since classLoader_init() happens before 
>>> stubRoutines_init1() and the stub routines are not generated at the 
>>> time, the execution jumps to address 0 (referenced by 
>>> _get_previous_sp_entry which should contain the address of the 
>>> generated routine after stubRoutines_init1()) when it's trying to 
>>> call the stub routine and crashes.
>>>
>>>
>>>      jvm.dll!os::current_stack_pointer() Line 468 C++
>>>      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>>      jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>>      zip.dll!000007feebc49de0()
>>>      [Frames below may be incorrect and/or missing, no symbols 
>>> loaded for zip.dll]
>>>      zip.dll!000007feebc4af1d()
>>>      zip.dll!000007feebc4b004()
>>>      jvm.dll!ClassLoader::create_class_path_entry(const char * path, 
>>> const stat * st, bool lazy, bool throw_exception, Thread * 
>>> __the_thread__) Line 666 + 0x13 bytes C++
>>>      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
>>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 
>>> 0x2d bytes C++
>>>      jvm.dll!ClassLoader::setup_search_path(const char * class_path) 
>>> Line 630 C++
>>>      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>>      jvm.dll!ClassLoader::initialize() Line 1237 C++
>>>      jvm.dll!classLoader_init() Line 1291 C++
>>>      jvm.dll!init_globals() Line 100 C++
>>>      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
>>> canTryAgain) Line 3414 + 0x5 bytes C++
>>>      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
>>> args) Line 5199 + 0x12 bytes C++
>>>      java.exe!000000013f0520f6()
>>>      java.exe!000000013f05cb63()
>>>      java.exe!000000013f05cbf7()
>>>      kernel32.dll!0000000076ba59ed()
>>>      ntdll.dll!0000000076cdc541()
>>>
>>> Thanks,
>>> Jiangli
>>>
>>
>


From jiangli.zhou at oracle.com  Tue Nov 11 18:44:10 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 11 Nov 2014 10:44:10 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <54625896.9000005@oracle.com>
References: <545C21E6.90709@oracle.com> <54625400.1000701@oracle.com>
	<546255D9.6030600@oracle.com> <54625896.9000005@oracle.com>
Message-ID: <546258FA.1060404@oracle.com>

Ok. Thank you for confirming that!

Jiangli

On 11/11/2014 10:42 AM, Lois Foltan wrote:
>
> On 11/11/2014 1:30 PM, Jiangli Zhou wrote:
>> Hi Lois,
>>
>> Actually there was an updated webrev based on Roland's feedback. 
>> Since you are replying to the original request, not sure if you 
>> reviewed the latest webrev. If not, here is the link: 
>> http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/.
>
> My apologies, I did review .02 but responded to your first RFR 
> request.  Looks fine.
> Lois
>
>>
>> Thanks,
>> Jiangli
>>
>> On 11/11/2014 10:22 AM, Lois Foltan wrote:
>>> Hi Jiangli,
>>> Yes, this looks good, reviewed.
>>> Lois
>>>
>>> On 11/6/2014 8:35 PM, Jiangli Zhou wrote:
>>>> Hi,
>>>>
>>>> Please review the following changes that fix the crash with 
>>>> -XX:-LazyBootClassLoader on windows x64 platforms (fastdebug only). 
>>>> During VM initialization,  current_stack_pointer() could be called 
>>>> before the VM generates stub routines. The generated 
>>>> get_previous_sp routine cannot be used during that time, use the 
>>>> estimated value for the sp value instead. The x86 implementation is 
>>>> unaffected by the change and always returns the estimated sp value 
>>>> as before.
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8054008
>>>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev/
>>>>
>>>> Tested with JPRT and ExtBadJAR test.
>>>>
>>>> Background:
>>>> As part of the VM initialization, classLoader_init() calls ZIP_Open 
>>>> from the zip library for processing the boot class path when 
>>>> -XX:-LazyBootClassLoader is specified. The call path re-enters VM 
>>>> before returning from the zip library call. Following is the 
>>>> backtrace right before when the crash happens. The windows x64 
>>>> version of current_stack_pointer() uses generated stub routine 
>>>> get_previous_sp (generated by generate_get_previous_sp()) to obtain 
>>>> the stack pointer value. Since classLoader_init() happens before 
>>>> stubRoutines_init1() and the stub routines are not generated at the 
>>>> time, the execution jumps to address 0 (referenced by 
>>>> _get_previous_sp_entry which should contain the address of the 
>>>> generated routine after stubRoutines_init1()) when it's trying to 
>>>> call the stub routine and crashes.
>>>>
>>>>
>>>>      jvm.dll!os::current_stack_pointer() Line 468 C++
>>>>      jvm.dll!os::verify_stack_alignment() Line 638 + 0x5 bytes C++
>>>>      jvm.dll!JVM_NativePath(char * path) Line 691 C++
>>>>      zip.dll!000007feebc49de0()
>>>>      [Frames below may be incorrect and/or missing, no symbols 
>>>> loaded for zip.dll]
>>>>      zip.dll!000007feebc4af1d()
>>>>      zip.dll!000007feebc4b004()
>>>>      jvm.dll!ClassLoader::create_class_path_entry(const char * 
>>>> path, const stat * st, bool lazy, bool throw_exception, Thread * 
>>>> __the_thread__) Line 666 + 0x13 bytes C++
>>>>      jvm.dll!ClassLoader::update_class_path_entry_list(const char * 
>>>> path, bool check_for_duplicates, bool throw_exception) Line 763 + 
>>>> 0x2d bytes C++
>>>>      jvm.dll!ClassLoader::setup_search_path(const char * 
>>>> class_path) Line 630 C++
>>>>      jvm.dll!ClassLoader::setup_bootstrap_search_path() Line 594 C++
>>>>      jvm.dll!ClassLoader::initialize() Line 1237 C++
>>>>      jvm.dll!classLoader_init() Line 1291 C++
>>>>      jvm.dll!init_globals() Line 100 C++
>>>>      jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * 
>>>> canTryAgain) Line 3414 + 0x5 bytes C++
>>>>      jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * 
>>>> args) Line 5199 + 0x12 bytes C++
>>>>      java.exe!000000013f0520f6()
>>>>      java.exe!000000013f05cb63()
>>>>      java.exe!000000013f05cbf7()
>>>>      kernel32.dll!0000000076ba59ed()
>>>>      ntdll.dll!0000000076cdc541()
>>>>
>>>> Thanks,
>>>> Jiangli
>>>>
>>>
>>
>


From david.r.chase at oracle.com  Tue Nov 11 18:58:30 2014
From: david.r.chase at oracle.com (David Chase)
Date: Tue, 11 Nov 2014 13:58:30 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <545E31BA.3070500@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
	<545E31BA.3070500@gmail.com>
Message-ID: <1D96462F-87D3-4794-A818-27DD2EE10046@oracle.com>


On 2014-11-08, at 10:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
> 
> Now let's take for example one of the MemberName.make() methods that return interned MemberNames:
> 
>     public static MemberName make(Method m, boolean wantSpecial) {
>         // Unreflected member names are resolved so intern them here.
>         MemberName tmp0 = null;
>         InternTransaction tx = new InternTransaction(m.getDeclaringClass());
>         while (tmp0 == null) {
>             MemberName tmp = new MemberName(m, wantSpecial);
>             tmp0 = tx.tryIntern(tmp);
>         }
>         return tmp0;
>     }
> 
> I'm trying to understand the workings of InternTransaction helper class (and find an example that breaks it). You create an instance of it, passing Method's declaringClass. You then (in retry loop) create a resolved MemberName from the Method and wantSpecial flag. This MemberName's clazz can apparently differ from Method's declaringClass. I don't know when and why this happens, but apparently it can (super method?), so in InternTransaction.tryIntern() you do...
> 
>     if (member_name.isResolved()) {
>         if (member_name.clazz != tx_class) {
>             Class prev_tx_class = tx_class;
>             int prev_txn_token = txn_token;
>             tx_class = member_name.clazz;
>             txn_token = internTxnToken(tx_class);
>             // Zero is a special case.
>             if (txn_token != 0 ||
>                 prev_txn_token != internTxnToken(prev_tx_class)) {
>                 // Resolved class is different and at least one
>                 // redef of it occurred, therefore repeat with
>                 // proper class for race consistency checking.
>                 return null;
>             }
>         }
>         member_name = member_name.intern(txn_token);
>         if (member_name == null) {
>             // Update the token for the next try.
>             txn_token = internTxnToken(tx_class);
>         }
>     }
> 
> 
> Now let's assume that the resolved member_name.clazz differs from Method's declaringClass. Let's assume also that either member_name.clazz has had at least one redefinition or Method's declaringClass has been redefined between creating InternTransaction and reading member_name.clazz's txn_token. You return 'null' in such a case, concluding that not only redefinition of the resolved member_name.clazz matters, but redefinition of Method's declaringClass can also invalidate the resolved MemberName. Am I right? It would be helpful if I could understand when and how Method's declaringClass redefinition can affect member_name. Can it affect which clazz is resolved for member_name?

If a declaring class is redefined before a MemberName is "published" to the VM, then there is a risk that its secret fields will have gone stale because the referenced VM methods changed but were not updated.  Therefore, the resolution must be retried to get a fresh resolution that is known not to be stale.  There is sort of a glitch in the race-checking protocol; I don't know for certain which class will be resolved, so if I guessed wrong (and the common-case check for "no redefinition at all" fails) then I am forced to retry and get a fresh, known-good resolution.

However, based on my understanding of what is (not) allowed in class redefinition, what differs after redefinition is only the code of the method, and not the owner; that is, if D.m resolved to B.m before redefinition of D, C, or B, then it will always resolve to B.m, but the definition of B.m itself might have changed (from the test cases, it might print "foo" instead of "bar").  Or to put it differently, the methods change, but their hierarchy does not.

> Anyway, you return null in such case from an updated InternTransaction (tx_class and txn_token are now updated to have values for resolved member_name.clazz). In next round the checks of newly constructed and resolved member_name are not performed against Method's declaringClass but against previous round's member_name.clazz. Is this what is intended?

> I can see there has to be a stop condition for loop to end, but shouldn't checks for Method's declaringClass redefinition be performed in every iteration (in addition to the check for member_name.clazz redefinition if it differs from Method's declaringClass)?

To the best of my understanding (see restrictions above) the tx_class ought to be wrong at most once;
all subsequent resolutions including those that span a class redefinition should return the same class,
so it suffices to detect redefinition of the method itself.
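
The retry protocol discussed here (read a token, resolve, then intern only if the token is unchanged, otherwise resolve again) can be sketched in isolation. This is a simplified illustration, not the InternTransaction code from the webrev: the class name, the single global redefinition counter, and the HashMap-backed table are all assumptions, and the sketch ignores the per-class tokens and the tx_class switch discussed above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of token-checked interning with retry.
final class TokenCheckedInterner<T> {
    private int redefinitionCount;                    // guarded by 'this'
    private final Map<T, T> table = new HashMap<>();  // guarded by 'this'

    synchronized int currentToken() {
        return redefinitionCount;
    }

    // Stands in for a class redefinition: any resolution that started
    // before this call is considered possibly stale.
    synchronized void redefine() {
        redefinitionCount++;
    }

    // Interns the candidate only if no redefinition happened since 'token'
    // was read; returns null to tell the caller to resolve again.
    synchronized T tryIntern(T candidate, int token) {
        if (token != redefinitionCount) {
            return null;
        }
        T existing = table.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    // The retry loop: re-run the resolver until a resolution survives
    // the token check. The resolver must not return null.
    T intern(Supplier<T> resolver) {
        T result = null;
        while (result == null) {
            int token = currentToken();
            T candidate = resolver.get();
            result = tryIntern(candidate, token);
        }
        return result;
    }

    public static void main(String[] args) {
        TokenCheckedInterner<String> interner = new TokenCheckedInterner<>();
        String a = interner.intern(() -> new String("B.m"));
        String b = interner.intern(() -> new String("B.m"));
        System.out.println("same instance: " + (a == b));
    }
}
```

The resolver runs outside the lock, so a redefinition can slip in between resolving and interning; the token comparison inside `tryIntern` is what detects that and forces a fresh resolution.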

I've incorporated your other changes (not yet the linear-scan hash table) and will be retesting.
One thing I wonder about for both hash table and binary search is if the first try should be attempted with no lock to avoid the overhead of synchronization; I expect that looking will be more common than interning, which in turn will be (vastly) more common than class redefinition.

David


From peter.levart at gmail.com  Tue Nov 11 19:30:10 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Tue, 11 Nov 2014 20:30:10 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <1D96462F-87D3-4794-A818-27DD2EE10046@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
	<545E31BA.3070500@gmail.com>
	<1D96462F-87D3-4794-A818-27DD2EE10046@oracle.com>
Message-ID: <546263C2.1060508@gmail.com>


On 11/11/2014 07:58 PM, David Chase wrote:
> I've incorporated your other changes (not yet the linear-scan hash table) and will be retesting.
> One thing I wonder about for both hash table and binary search is if the first try should be attempted with no lock to avoid the overhead of synchronization; I expect that looking will be more common than interning, which in turn will be (vastly) more common than class redefinition.

Hi David,

Yes, that's why I implemented the hash table in a way where lookups are 
lock-free. Binary-search would be trickier to implement without locking, 
but maybe not impossible. Surely not with Arrays.binarySearch() but 
perhaps with a separate implementation. The problem with 
Arrays.binarySearch() is that it returns an index. By the time you 
retrieve the element at that index, it may already have moved. I'm also 
not sure that "careful" concurrent insertion of a new element would not 
break the correctness of binary search. But there is another way I showed 
before: using StampedLock. It is a kind of optimistic/pessimistic 
read-write lock. Its beauty is in that optimistic read part is almost 
free (just a volatile read at start and a readFence followed by another 
volatile read at the end). You just have to be sure that the algorithm 
guarded by an optimistic read lock terminates normally (that it doesn't 
spin in an endless loop or throw exceptions) in the presence of 
arbitrary concurrent modifications of looked-up state. Well, binary 
search is such an algorithm.
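
A minimal sketch of that idea, assuming a sorted int array rather than the MemberName table from the webrev (the class name and the int keys are hypothetical; the StampedLock calls are the standard java.util.concurrent.locks API):

```java
import java.util.Arrays;
import java.util.concurrent.locks.StampedLock;

// Hypothetical sketch: sorted array with optimistic, normally lock-free lookups.
final class OptimisticSortedInts {
    private final StampedLock lock = new StampedLock();
    private int[] keys = new int[4];
    private int size;

    // Writers take the exclusive lock and insert in sorted position.
    void add(int key) {
        long stamp = lock.writeLock();
        try {
            int pos = search(keys, size, key);
            if (pos >= 0) {
                return; // already present
            }
            int at = -(pos + 1);
            if (size == keys.length) {
                keys = Arrays.copyOf(keys, size * 2);
            }
            System.arraycopy(keys, at, keys, at + 1, size - at);
            keys[at] = key;
            size++;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    // Readers first try an optimistic read; the result is trusted only if
    // validate() confirms that no write happened in the meantime.
    boolean contains(int key) {
        long stamp = lock.tryOptimisticRead();
        if (stamp != 0L) {
            int[] snapshot = keys;
            // Clamp so a racy 'size' can never index past the array we hold.
            int n = Math.min(size, snapshot.length);
            boolean found = search(snapshot, n, key) >= 0;
            if (lock.validate(stamp)) {
                return found;
            }
        }
        // Fall back to a real read lock if a writer interfered.
        stamp = lock.readLock();
        try {
            return search(keys, size, key) >= 0;
        } finally {
            lock.unlockRead(stamp);
        }
    }

    // Plain binary search; it terminates for any array contents because the
    // [lo, hi] range shrinks on every iteration, even if the data is torn.
    private static int search(int[] a, int n, int key) {
        int lo = 0;
        int hi = n - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int v = a[mid];
            if (v < key) {
                lo = mid + 1;
            } else if (v > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    public static void main(String[] args) {
        OptimisticSortedInts set = new OptimisticSortedInts();
        set.add(5);
        set.add(1);
        System.out.println("contains 5: " + set.contains(5));
    }
}
```

The optimistic path may observe a half-shifted array while a writer is active, which is why the search must tolerate arbitrary contents and why its answer is discarded unless validate() succeeds.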

Regards, Peter

> David


From aleksey.shipilev at oracle.com  Tue Nov 11 19:35:24 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 11 Nov 2014 22:35:24 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <DFF0E4A2-1459-4FD6-8C84-C00FCE14D981@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
	<DFF0E4A2-1459-4FD6-8C84-C00FCE14D981@oracle.com>
Message-ID: <546264FC.2060001@oracle.com>

On 11/11/2014 07:10 PM, Chris Hegarty wrote:
> On 11 Nov 2014, at 14:40, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>> Updated webrevs:
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
> 
> Looks good.
> 
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
> 
> I skimmed this webrev, and it also looks fine to me.
> 
> -Chris.

Thanks Chris!

-Aleksey.


From karen.kinnear at oracle.com  Tue Nov 11 19:35:45 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 11 Nov 2014 14:35:45 -0500
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <54621566.9040805@oracle.com>
References: <5452C0B4.4070601@oracle.com>
	<5457084B.6070808@oracle.com>	<5458330E.1080207@oracle.com>
	<54591A3A.1090005@oracle.com> <545C2BC0.3080207@oracle.com>
	<5461C282.1020806@oracle.com> <54621566.9040805@oracle.com>
Message-ID: <01FABD3E-D846-48F9-BDF9-F1AD3CA01090@oracle.com>

Dan,

Code looks good.  I like your choices of changes to pick up.

Couple of minor questions/comments:

1. synchronizer.cpp: What does TLE stand for?
2. in macroAssembler_x86.cpp - mind keeping the comment about // Without cast to int32_t a movptr will destroy R10 which is typically obj

thanks,
Karen

p.s. I've forgotten - is the fast_notify in a different bucket?

On Nov 11, 2014, at 8:55 AM, Daniel D. Daugherty wrote:

> On 11/11/14 1:02 AM, David Holmes wrote:
>> On 7/11/2014 12:17 PM, Daniel D. Daugherty wrote:
>>> The fix for JDK-8062851 has been reviewed, tested and pushed to
>>> RT_Baseline.
>>> 
>>> Time to get back to this review thread so here's an updated webrev:
>>> 
>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/
>>> 
>>> David H., I believe I've addressed all of your comments. Please
>>> let me know if I missed something...
>> 
>> Looks good to me - thanks Dan!
> 
> Thanks for the re-review!
> 
> Dan
> 
> 
>> 
>> David
>> -----
>> 
>>> Thanks, in advance, for any comments, questions or suggestions.
>>> 
>>> Dan
>>> 
>>> 
>>> On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
>>>> The cleanup is turning into a bigger change than the fast enter
>>>> bucket itself so I'm spinning the cleanup into a new bug:
>>>> 
>>>>    JDK-8062851 cleanup ObjectMonitor offset adjustments
>>>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>>> 
>>>> Yes, this means that the Contended Locking cleanup bucket has reopened
>>>> for yet another change...
>>>> 
>>>> We'll get back to "fast enter" after the dust has settled...
>>>> 
>>>> Dan
>>>> 
>>>> 
>>>> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>>>>> David,
>>>>> 
>>>>> Thanks for the review! As usual, replies are embedded below...
>>>>> 
>>>>> 
>>>>> On 11/2/14 9:44 PM, David Holmes wrote:
>>>>>> Hi Dan,
>>>>>> 
>>>>>> Looks good.
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> 
>>>>>> Couple of nits and one semantic query below ...
>>>>>> 
>>>>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>>>> 
>>>>>> Formatting changes were a bit of a distraction.
>>>>> 
>>>>> Yes, I have no idea what got into me. Normally I do formatting
>>>>> changes separately so the noise does not distract...
>>>>> 
>>>>> It turns out there is a constant defined that should be used
>>>>> instead of all these literal '2's:
>>>>> 
>>>>> src/share/vm/oops/markOop.hpp:         monitor_value = 2
>>>>> 
>>>>> Typically used as follows:
>>>>> 
>>>>> src/cpu/x86/vm/macroAssembler_x86.cpp:  int owner_offset =
>>>>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>>>> 
>>>>> I will clean this up just for the files that I'm touching as
>>>>> part of this fix.
>>>>> 
>>>>> 
>>>>>> 
>>>>>> ---
>>>>>> 
>>>>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>>> 
>>>>>> Formatting changes were a bit of a distraction.
>>>>> 
>>>>> Same reply as for macroAssembler_sparc.cpp.
>>>>> 
>>>>> 
>>>>>> 1929     // unconditionally set stackBox->_displaced_header = 3
>>>>>> 1930     movptr(Address(boxReg, 0),
>>>>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>>>> 
>>>>>> At 1870 we refer to box rather than stackBox. Also it takes some
>>>>>> sleuthing to realize that "3" here is somehow a pseudonym for
>>>>>> unused_mark(). Back up at 1808 we have a to-do:
>>>>>> 
>>>>>> 1808     //   use markOop::unused_mark() instead of "3".
>>>>>> 
>>>>>> so the current change seems to be implementing that, even though
>>>>>> other uses of "3" are left untouched.
>>>>> 
>>>>> I'll take a look at cleaning those up also...
>>>>> 
>>>>> In some cases markOopDesc::marked_value will work for the literal '3',
>>>>> but in other cases we'll use markOop::unused_mark():
>>>>> 
>>>>>  static markOop unused_mark() {
>>>>>    return (markOop) marked_value;
>>>>>  }
>>>>> 
>>>>> to save us the noise of the (markOop) cast.
>>>>> 
>>>>> 
>>>>>> ---
>>>>>> 
>>>>>> src/share/vm/runtime/sharedRuntime.cpp
>>>>>> 
>>>>>> 1794 JRT_BLOCK_ENTRY(void,
>>>>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>>>>>> lock, JavaThread* thread))
>>>>>> 1795   if (!SafepointSynchronize::is_synchronizing()) {
>>>>>> 1796     if (ObjectSynchronizer::quick_enter(_obj, thread, lock))
>>>>>> return;
>>>>>> 
>>>>>> Is it necessary to check is_synchronizing? If we are executing this
>>>>>> code we are not at a safepoint and the quick_enter won't change that,
>>>>>> so I'm not sure what we are guarding against.
>>>>> 
>>>>> So this first state checker:
>>>>> 
>>>>> src/share/vm/runtime/safepoint.hpp:
>>>>> inline static bool is_synchronizing()  { return _state ==
>>>>> _synchronizing;  }
>>>>> 
>>>>> means that we want to go to a safepoint and:
>>>>> 
>>>>> inline static bool is_at_safepoint()   { return _state ==
>>>>> _synchronized;  }
>>>>> 
>>>>> means that we are at a safepoint. Dice's optimization bails out if
>>>>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>>>>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>>>>> code to be quick (and not go to a safepoint). I'm not seeing
>>>>> anything obvious....
>>>>> 
>>>>> Sometimes we have to be careful with JavaThread suspend requests and
>>>>> monitor acquisition, but I don't think that's a problem here... In
>>>>> order for the "suspend requesting" thread to be surprised, the suspend
>>>>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>>>>> the suspend target has to do something unexpected like acquire a monitor
>>>>> that it was previously blocked upon when it was suspended. We've had
>>>>> bugs like that in the past... In this optimization case, our target
>>>>> thread is not blocked on a contended monitor...
>>>>> 
>>>>> In this particular case, the "suspend requesting" thread will set the
>>>>> suspend request state on the target thread, but the target thread is
>>>>> busy trying to enter this uncontended monitor (quickly). So the
>>>>> "suspend requesting" thread, will request a no-op safepoint, but it
>>>>> won't return from the suspend API until that safepoint completes.
>>>>> The safepoint won't complete until the target thread is done acquiring
>>>>> the previously uncontended monitor... so the target thread will be
>>>>> suspended while holding the previous uncontended monitor and the
>>>>> "suspend requesting" thread will return from the suspend API all
>>>>> happy...
>>>>> 
>>>>> Well, I don't see the reason either so I'll have to ping Dave Dice
>>>>> and Karen Kinnear to see if either of them can fill in the history
>>>>> here. This could be an abundance of caution case.
>>>>> 
>>>>> 
>>>>>> ---
>>>>>> 
>>>>>> src/share/vm/runtime/synchronizer.cpp
>>>>>> 
>>>>>> Minor nit: line 153 the usual acronym is NPE (for
>>>>>> NullPointerException) not NPX
>>>>> 
>>>>> I'll do a search for uses of NPX and other uses of 'X' in exception
>>>>> acronyms...
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Nit:  159     Thread * const ox
>>>>>> 
>>>>>> Please change ox to owner.
>>>>> 
>>>>> Will do.
>>>>> 
>>>>> Thanks again for the review!
>>>>> 
>>>>> Dan
>>>>> 
>>>>> 
>>>>>> 
>>>>>> ---
>>>>>> 
>>>>>> Thanks,
>>>>>> David
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>> 
>>>>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>>>> 
>>>>>>> The code changes in this bucket are primarily a quick_enter()
>>>>>>> function that works on inflated but uncontended Java monitors.
>>>>>>> This quick_enter() function is used on the "slow path" for Java
>>>>>>> Monitor enter operations when the built-in "fast path" (read
>>>>>>> assembly code) doesn't work.
>>>>>>> 
>>>>>>> This work is being tracked by the following bug ID:
>>>>>>> 
>>>>>>>     JDK-8061553 Contended Locking fast enter bucket
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>>>> 
>>>>>>> Here is the webrev URL:
>>>>>>> 
>>>>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>>>> 
>>>>>>> Here is the JEP link:
>>>>>>> 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>>> 
>>>>>>> 8061553 summary of changes:
>>>>>>> 
>>>>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>>>> 
>>>>>>> - clean up spacing around some
>>>>>>>   'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>>>>> - remove optional (EmitSync & 64) code
>>>>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>>>> 
>>>>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>>>> 
>>>>>>> - remove optional (EmitSync & 2) code
>>>>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>>>>   the new owner value to be more efficient
>>>>>>> 
>>>>>>> interfaceSupport.hpp:
>>>>>>> 
>>>>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>>>>   JRT_BLOCK_ENTRY into two pieces.
>>>>>>> 
>>>>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>>>> 
>>>>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>>>>   to permit ObjectSynchronizer::quick_enter() call
>>>>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>>>>   to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>>>> 
>>>>>>> synchronizer.[ch]pp:
>>>>>>> 
>>>>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>>>>   inflated but unowned Java monitor without thread state
>>>>>>>   changes
>>>>>>> 
>>>>>>> Testing:
>>>>>>> 
>>>>>>> - Aurora Adhoc RT/SVC baseline batch
>>>>>>> - JPRT test jobs
>>>>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>>>>> - CallTimerGrid stress testing (in process)
>>>>>>> - Aurora performance testing:
>>>>>>>   - out of the box for the "promotion" and 32-bit server configs
>>>>>>>   - heavy weight monitors for the "promotion" and 32-bit server
>>>>>>> configs
>>>>>>>     (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>>>>     (in process)
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>> 
>>>>>>> Dan
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
> 


From staffan.larsen at oracle.com  Tue Nov 11 20:38:15 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 11 Nov 2014 21:38:15 +0100
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54621FFA.2070503@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
Message-ID: <F8D84938-4D5F-4C9C-98A8-1912DC85741D@oracle.com>

SA changes look good.

/Staffan

> On 11 nov 2014, at 15:40, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> 
> Hi,
> 
> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>> https://bugs.openjdk.java.net/browse/JDK-8059677
>> 
>> Here's the attempt to maintain String instead of char[]:
>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
> 
> Updated webrevs:
>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
> 
> This version incorporates feedbacks from Chris, Staffan and David. I
> think it is very close to what we would like to push. Opinions?
> 
> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
> vm.tmtools.testlist
> 
> Thanks,
> -Aleksey.
> 
> 
> 
> 


From daniel.daugherty at oracle.com  Tue Nov 11 21:23:06 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Nov 2014 14:23:06 -0700
Subject: RFR(S) Contended Locking fast enter bucket (8061553)
In-Reply-To: <01FABD3E-D846-48F9-BDF9-F1AD3CA01090@oracle.com>
References: <5452C0B4.4070601@oracle.com>
	<5457084B.6070808@oracle.com>	<5458330E.1080207@oracle.com>
	<54591A3A.1090005@oracle.com> <545C2BC0.3080207@oracle.com>
	<5461C282.1020806@oracle.com> <54621566.9040805@oracle.com>
	<01FABD3E-D846-48F9-BDF9-F1AD3CA01090@oracle.com>
Message-ID: <54627E3A.7050306@oracle.com>

Thanks for the review!

As usual, replies embedded below...


On 11/11/14 12:35 PM, Karen Kinnear wrote:
> Dan,
>
> Code looks good.

Thanks! However, it is yours and Dice's code with a few tweaks
from my brain... This bucket will also have a triple contributed
by entry...


>    I like your choices of changes to pick up.

Thanks! This bucket was fairly easy to sift/tease out...


> Couple of minor questions/comments:
>
> 1. synchronizer.cpp: What does TLE stand for?

Transactional Lock Elision is my guess. If Dice confirms, then I'll
make sure the first use has it spelled out... Dave likes his TLAs!


> 2. in macroAssembler_x86.cpp - mind keeping the comment about // Without cast to int32_t a movptr will destroy R10 which is typically obj

Yes, I kept looking at that and wondering why the comment was removed...
I'll put it back...


> thanks,
> Karen
>
> p.s. I've forgotten - is the fast_notify in a different bucket?

fast_enter is optimization #3, bucket #7
fast_exit is optimization #4, bucket #8
fast_notify is optimization #5, bucket #2

Dan



From serguei.spitsyn at oracle.com  Tue Nov 11 22:04:29 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 11 Nov 2014 14:04:29 -0800
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <546151A9.1080100@oracle.com>
References: <546151A9.1080100@oracle.com>
Message-ID: <546287ED.9050708@oracle.com>

Dan,

The fix looks good.
Nice cleanup from workarounds: Good Thing (TM)! :)

Thanks,
Serguei

On 11/10/14 4:00 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
> Yes, it is a small fix, but it is in Makefiles so feel free to
> run screaming from the room... :-)  On the plus side the fix does
> delete two work around source files (Coleen would say that's a
> Good Thing (TM)!)
>
> The fix is to detect the version of GNU objcopy that is being
> used on the machine and only enable Full Debug Symbols when that
> version is 2.21.1 or newer. If you don't have the right version,
> then the build drops back to pre-FDS build configs with a message
> like this:
>
> WARNING: /usr/sfw/bin/gobjcopy --version info:
> WARNING: GNU objcopy 2.15
> WARNING: an objcopy version of 2.21.1 or newer is needed to create 
> valid .debuginfo files.
> WARNING: ignoring above objcopy command.
> WARNING: patch 149063-01 or newer contains the correct Solaris 10 
> SPARC version.
> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86 
> version.
> WARNING: Solaris 11 Update 1 contains the correct version.
> INFO: no objcopy cmd found so cannot create .debuginfo files.
> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>
> This work is being tracked by the following bug IDs:
>
>     JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>     https://bugs.openjdk.java.net/browse/JDK-8033602
>
>     JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on 
> Solaris X86
>     https://bugs.openjdk.java.net/browse/JDK-8034005
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>
> Testing:
>
> - JPRT test jobs to verify that the current JPRT Solaris hosts
>   are happy
> - local builds on my Solaris 10 X86 machine to verify that the
>   wrong version of GNU objcopy is caught
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan


From daniel.daugherty at oracle.com  Tue Nov 11 23:31:42 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 11 Nov 2014 16:31:42 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <546287ED.9050708@oracle.com>
References: <546151A9.1080100@oracle.com> <546287ED.9050708@oracle.com>
Message-ID: <54629C5E.8080305@oracle.com>

Thanks for the review!


On 11/11/14 3:04 PM, serguei.spitsyn at oracle.com wrote:
> Dan,
>
> The fix looks good.

Thanks!


> Nice cleanup from workarounds: Good Thing (TM)! :)

Yes, this has been in the queue for quite a while... :-)

Dan




From david.holmes at oracle.com  Wed Nov 12 04:48:53 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 12 Nov 2014 14:48:53 +1000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54621FFA.2070503@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
Message-ID: <5462E6B5.7080504@oracle.com>

On 12/11/2014 12:40 AM, Aleksey Shipilev wrote:
> Hi,
>
> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>>   https://bugs.openjdk.java.net/browse/JDK-8059677
>>
>> Here's the attempt to maintain String instead of char[]:
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>   http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>
> Updated webrevs:
>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>    http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
>
> This version incorporates feedbacks from Chris, Staffan and David. I
> think it is very close to what we would like to push. Opinions?

All looks good to me.

But I also noticed this strange (to me) assertion in javaClasses.cpp

  void java_lang_Thread::set_name(oop java_thread, oop name) {
    assert(java_thread->obj_field(_name_offset) == NULL, "name should be NULL");
    java_thread->obj_field_put(_name_offset, name);
  }

and on investigation it seems like this is dead code - I couldn't locate 
a call to java_lang_Thread::set_name ?? It would only be usable on an 
attaching thread (else name can't be null) and we pass the name to the 
Thread constructor in that case.

Cheers,
David

> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
> vm.tmtools.testlist
>
> Thanks,
> -Aleksey.
>
>
>
>

From david.holmes at oracle.com  Wed Nov 12 08:04:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 12 Nov 2014 18:04:42 +1000
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
Message-ID: <5463149A.6020506@oracle.com>

Hi Gunter,

On 11/11/2014 11:23 PM, Haug, Gunter wrote:
> Hi All,
>
> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs improvement)' makes use of getrusage() to retrieve accurate per-thread data on resource usage. We can use exactly the same code on AIX to achieve this.
>
> Please review the following change:
>
> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8064471

I have a couple of comments on this code which presumably also apply to
the original :(

First this comment is no longer applicable (actually it was never 
applicable to AIX!):

   // For now, we say that linux does not support vtime. I have no idea
   // whether it can actually be made to (DLD, 9/13/05).

Second this calculation seems wrong:

return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
(double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 * 1000);

To me this performs integer division (i.e. truncation) then converts the 
resulting integer to a double. I would expect to see additional 
parentheses (even if not needed, for clarity):

return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 * 1000);

or more simply divide by a floating-point value:

return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
(usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);

and you don't need two double casts regardless as the expression will be 
of type double as soon as there is one operand of type double. So that 
should reduce to:

return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec + 
(usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
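
As a sanity check on how these variants evaluate, here is a minimal 
sketch. It is written in Java rather than C++ (the cast-precedence and 
promotion rules that matter here are the same in both), the class and 
method names are made up, and plain longs with invented values stand in 
for the tv_sec / tv_usec fields:

```java
// Illustration only: hypothetical names, invented values. Plain longs
// stand in for the tv_sec / tv_usec members of struct rusage.
class VtimeArithmetic {
    // Form under review: each cast applies to its parenthesized sum first,
    // so the division is carried out on a double.
    static double asReviewed(long uSec, long sSec, long uUsec, long sUsec) {
        return (double) (uSec + sSec) + (double) (uUsec + sUsec) / (1000 * 1000);
    }

    // Reduced form: one floating-point divisor promotes the whole expression.
    static double reduced(long uSec, long sSec, long uUsec, long sUsec) {
        return uSec + sSec + (uUsec + sUsec) / (1000.0 * 1000);
    }

    // What truncation would look like: integer division happens before the cast.
    static double truncating(long uSec, long sSec, long uUsec, long sUsec) {
        return (double) (uSec + sSec) + (double) ((uUsec + sUsec) / (1000 * 1000));
    }

    public static void main(String[] args) {
        // 1s + 2s of whole seconds, 250000us + 500000us of microseconds
        System.out.println(asReviewed(1, 2, 250_000, 500_000));  // 3.75
        System.out.println(reduced(1, 2, 250_000, 500_000));     // 3.75
        System.out.println(truncating(1, 2, 250_000, 500_000));  // 3.0
    }
}
```

Since a cast binds more tightly than '/', the form under review already 
divides in double, so the extra parentheses or the 1000.0 divisor are 
there for clarity. Truncation only shows up when the division is applied 
to the integer sum before the cast.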

Cheers,
David

> Thanks,
> Gunter
>

From roland.westrelin at oracle.com  Wed Nov 12 09:55:21 2014
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 12 Nov 2014 10:55:21 +0100
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <545D4BDE.9010908@oracle.com>
References: <545C21E6.90709@oracle.com>
	<682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>
	<545D0167.3070903@oracle.com> <545D4BDE.9010908@oracle.com>
Message-ID: <10933301-E9F4-4BDF-B678-50FE846873BD@oracle.com>

> http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/

It looks good to me.

Roland.

From aleksey.shipilev at oracle.com  Wed Nov 12 10:18:41 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Wed, 12 Nov 2014 13:18:41 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <F8D84938-4D5F-4C9C-98A8-1912DC85741D@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
	<F8D84938-4D5F-4C9C-98A8-1912DC85741D@oracle.com>
Message-ID: <54633401.6040208@oracle.com>

Thanks Staffan!

-Aleksey.

On 11.11.2014 23:38, Staffan Larsen wrote:
> SA changes look good.
> 
> /Staffan
> 
>> On 11 nov 2014, at 15:40, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>>
>> Hi,
>>
>> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>>> Thread.getName() returns String, and does new String instantiation every
>>> time, because the thread name is stored in char[]. Even though we use a
>>> private String constructor that shares the char[] array without copying
>>> it, this still hurts some use cases (think extra-fast logging). To the
>>> extent some people actually maintain Map<Thread, String> to avoid it.
>>> https://bugs.openjdk.java.net/browse/JDK-8059677
>>>
>>> Here's the attempt to maintain String instead of char[]:
>>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>> http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
>>
>> Updated webrevs:
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/
>>
>> This version incorporates feedback from Chris, Staffan and David. I
>> think it is very close to what we would like to push. Opinions?
>>
>> Testing: JPRT, jdk/test/java/lang/Thread jtreg, hotspot/test/runtime/
>> jtreg, vm.quick.testlist, nsk.jvmti.testlist, svc.quick.testlist,
>> vm.tmtools.testlist
>>
>> Thanks,
>> -Aleksey.
>>
>>
>>
>>
> 


From aleksey.shipilev at oracle.com  Wed Nov 12 10:23:32 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Wed, 12 Nov 2014 13:23:32 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <5462E6B5.7080504@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
	<5462E6B5.7080504@oracle.com>
Message-ID: <54633524.8000607@oracle.com>

Hi David,

On 12.11.2014 07:48, David Holmes wrote:
> On 12/11/2014 12:40 AM, Aleksey Shipilev wrote:
> All looks good to me.

Thanks for the review!

> But I also noticed this strange (to me) assertion in javaClasses.cpp
> 
>  void java_lang_Thread::set_name(oop java_thread, oop name) {
>     assert(java_thread->obj_field(_name_offset) == NULL, "name should be NULL");
>     java_thread->obj_field_put(_name_offset, name);
>  }
> 
> and on investigation it seems like this is dead code - I couldn't locate
> a call to java_lang_Thread::set_name ?? It would only be usable on an
> attaching thread (else name can't be null) and we pass the name to the
> Thread constructor in that case.

set_name is not used, as I mentioned earlier -- that makes the change
even more "safe". I was even tempted to drop the setter completely, but
it would break the symmetry against other setters and getters. I dropped
the assert at set_name in this update:
  http://cr.openjdk.java.net/~shade/8059677/webrev.03.hs/
  http://cr.openjdk.java.net/~shade/8059677/webrev.03.jdk/

The only difference from the previous version is the dropped assert,
so I haven't re-spun the tests.

Thanks,
-Aleksey.


From gunter.haug at sap.com  Wed Nov 12 15:19:54 2014
From: gunter.haug at sap.com (Haug, Gunter)
Date: Wed, 12 Nov 2014 16:19:54 +0100
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <5463149A.6020506@oracle.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
	<5463149A.6020506@oracle.com>
Message-ID: <54637A9A.9040108@sap.com>


On 12.11.2014 09:04, David Holmes wrote:
> Hi Gunter,
>
> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>> Hi All,
>>
>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs 
>> improvement)' makes use of getrusage() to retrieve accurate 
>> per-thread data on resource usage. We can use exactly the same code 
>> on AIX to achieve this.
>>
>> Please review the following change:
>>
>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8064471
>
> I have a couple of comments on this code which presumably also apply 
> to the original :(
Yes, they apply to the original as well, see below.
>
> First this comment is no longer applicable (actually it was never 
> applicable to AIX!):
>
>   // For now, we say that linux does not support vtime. I have no idea
>   // whether it can actually be made to (DLD, 9/13/05).
>
You're right. I will remove it.
> Second this calculation seems wrong:
>
> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 * 
> 1000);
>
> To me this performs integer division (i.e. truncation) then converts 
> the resulting integer to a double. I would expect to see additional 
> parentheses (even if not needed, for clarity):
>
> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 * 
> 1000);
>
> or more simply divide by a floating-point value:
>
> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) + 
> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>
> and you don't need two double casts regardless as the expression will 
> be of type double as soon as there is one operand of type double. So 
> that should reduce to:
>
> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec + 
> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>
OK. Would you like us to change the Linux version as you proposed, too?

Thanks,
Gunter

> Cheers,
> David
>
>> Thanks,
>> Gunter
>>


From karen.kinnear at oracle.com  Wed Nov 12 16:27:54 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 12 Nov 2014 11:27:54 -0500
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <5456EADF.4050203@oracle.com>
References: <543C591E.8010602@oracle.com>	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>	<544E8844.1070907@oracle.com>	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com> <5456EADF.4050203@oracle.com>
Message-ID: <3CF28613-0C0F-44E2-869A-FF5B01D7E575@oracle.com>

I think there are three things we need to figure out. 

1. I reproduced a problem in TestThread2. Below is the information from that run, along with my analysis
   - all - comments on my analysis are very welcome
   - Yumin - please try the suggested test change below to see if it helps.

   - that is the only example I have seen the full details for.

2. Does the circularity error actually occur in the main thread and if so why?
   - need to catch in a debugger/hs_err file a situation in which this occurs in the main thread please.
   We need the full stack trace for this - native and java please
   - run this without the test change I suggested please
   - try to catch ClassCircularityError in the main thread

3. figure out why we see this problem more frequently
   - I am not convinced this problem didn't already exist - the test logic has some very odd comments and workarounds which seem to imply there
   were intermittent problems from the beginning
   - that said - worth figuring out if for instance, the sun.misc.URLClassPath logic was rewritten (and when) to add $JarLoader$2
   - and looking at the history of test failure
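
For reference, the test change mentioned in item 1 amounts to screening
out boot-loader loads in the transformer. A minimal, self-contained
sketch of that idea follows. It is not the actual test source: the class
name is made up, and loadClasses() is stubbed with a counter instead of
calling Class.forName on the test's URLClassLoader.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the test's TestTransformer.
class BootLoaderScreeningTransformer implements ClassFileTransformer {
    // Counts how often the nested class load would have been triggered.
    final AtomicInteger nestedLoads = new AtomicInteger();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        String tName = Thread.currentThread().getName();
        // loader == null means the boot loader is defining the class, e.g.
        // sun/misc/URLClassPath$JarLoader$2; leave those loads alone.
        if (loader != null && tName.equals("TestThread")) {
            loadClasses(3);
        }
        return null; // null tells the JVM that nothing was transformed
    }

    // Stub (assumption): the real test loads "TestClass" + index here.
    void loadClasses(int index) {
        nestedLoads.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        BootLoaderScreeningTransformer t = new BootLoaderScreeningTransformer();
        Thread worker = new Thread(() -> {
            // Boot loader load: screened out, no nested load.
            t.transform(null, "sun/misc/URLClassPath$JarLoader$2",
                        null, null, new byte[0]);
            // Non-boot loader load on TestThread: nested load happens.
            t.transform(ClassLoader.getSystemClassLoader(), "TestClass2",
                        null, null, new byte[0]);
        }, "TestThread");
        worker.start();
        worker.join();
        System.out.println("nested loads: " + t.nestedLoads.get()); // nested loads: 1
    }
}
```

With the boot loader screened out, the transformer never re-enters class
loading while sun.misc.URLClassPath$JarLoader$2 is still being loaded,
which is the re-entry that showed up in the stack trace I captured.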

thanks,
Karen

On Nov 2, 2014, at 9:39 PM, David Holmes wrote:

> On 1/11/2014 9:55 AM, Yumin Qi wrote:
>> Karen,
>> 
>>   Thanks for your detailed message for debugging. Yes, from my debugging,
>> the exception did happen in TestThread rather than the main thread. I have no
>> idea why in the end the exception was reported in the main thread.
> 
> Until that question is answered I will remain uneasy about simply tweaking the test until it no longer fails. I would also like to know when it started failing - Karen alludes to the possible introduction of a new inner class at some point.
> 
> Thanks,
> David
> 
>>    You mention
>> 
>> So that change to the test would be:
>>    in TestTransformer:
>>       if (loader != null) {
>>           if (tName.equals("TestThread")) {
>>              loadClasses(3);
>>           }
>>        }
>>        return null;
>>     }
>> 
>> 
>> The loader is the one defined in the test case, right? The system class
>> loader is never null.
>> I will try this change, let's see if it can work it out.
>> 
>> Thanks
>> Yumin
>> 
>> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>>> Yumin,
>>> 
>>> From your earlier exception stack trace (many thanks) you reported:
>>> 
>>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I
>>> don't know why this is in thread "main")
>>> sun/misc/URLClassPath$JarLoader$2
>>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>>> at java.lang.Class.forName0(Native Method)
>>> at java.lang.Class.forName(Class.java:340)
>>> at
>>> ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>>> 
>>> at
>>> ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>> 
>>> 
>>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError
>>> -XX:+ShowMessageBoxOnError to get
>>> a log file and stack trace. See my instructions below on how to do that.
>>> 
>>> I did this, attached a debugger, which didn't help enough since I
>>> needed to see the java stack frames,
>>>  and got an hs_err_log also, so the stack traces came from the error
>>> log.
>>> 
>>> The stack trace was on Thread 2, which in the hs_err_log was
>>> TestThread (which makes sense for what the test logic says).
>>> See later in email for stack traces from Thread 2.
>>> 
>>> Summary of stack trace:
>>> 
>>> TestThread:
>>>   loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>>     vm calls out to URLClassLoader.loadClass(String) which is
>>> inherited from java.lang.ClassLoader.loadClass(String)
>>>     ... calls java.net.URLClassLoader.findClass(...) which calls
>>>       DoPrivileged  java.net.URLClassLoader$1.run which calls
>>>          sun.misc.URLClassPath.getResource(name, false)  which calls
>>>              sun.misc.URLClassPath$JarLoader.getResource which calls
>>>                  sun.misc.URLClassPath$JarLoader.checkResource which
>>> tries to call sun.misc.URLClassPath$JarLoader$2
>>>    - and then the transformer jumps in with loadClasses(#) (which we
>>> know is 3) and walks the same logic which tries to load
>>> sun.misc.URLClassPath$JarLoader$2 again
>>> 
>>> Note that in the placeholder table information that Yumin printed, the
>>> circularity error is on sun.misc.URLClassPath$JarLoader$2 with the
>>> null == boot loader, which
>>> makes sense -- that is the appropriate defining loader, and therefore
>>> the one the CFLH would intercept during the defineClass phase.
>>> 
>>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the
>>> method checkResource
>>> ... return new Resource() { ... }
>>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1,
>>> $2 and $3 at build time or when that was added.
>>> I would guess that is when the bug started happening.
>>> 
>>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads
>>> before any TestClass1 loads.
>>> 
>>> My belief is that the point of the test is to test parallel class
>>> loading for URL class loaders.
>>> I don't think the point is to test the bootstrap class loader, nor to
>>> test bootstrapping - i.e. running the agent before
>>> we have loaded sufficient classes to allow loading URLClassLoader
>>> classes.
>>> 
>>> What I suggested to Yumin that he try would be to change the test to
>>> NOT intercept boot loader loads, so that
>>> sun.misc.URLClassPath$JarLoader$#
>>> can load which will in turn allow classes loaded by a URLClassLoader
>>> subclass to load.
>>> 
>>> So that change to the test would be:
>>>    in TestTransformer:
>>>       if (loader != null) {
>>>           if (tName.equals("TestThread")) {
>>>              loadClasses(3);
>>>           }
>>>        }
>>>        return null;
>>>     }
>>> // I also suspect with that change, we can remove the sleep loop
>>> Note: there was a printed message which said that the Thread "Signal
>>> Dispatcher" has called transform(), which I
>>> ignored, however it is good that we don't call loadClass on that
>>> thread  - which is part of what the sleep loop does -
>>> but that would be handled by the boot loader screening above
>>> 
>>> Alternatively we can preload the URLClassPath classes, but I don't
>>> think we want to do that, or
>>> we can have the agent explicitly screen on a variety of jdk
>>> bootstrapping classes. But I think the cleaner
>>> solution is to screen on the boot loader.
>>> 
>>> Does that make any sense to others?
>>> 
>>> thanks,
>>> Karen
>>> 
>>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option,
>>> but with a shell script in the test, this is more complex, so
>>> the following should be easier):
>>> 
>>> So what I did was run the test once for it to pass (not your script,
>>> but just once with jtreg) so that it generated
>>> the $DST/work directory.
>>> I then created a rerun.csh script - attached - you can modify for your
>>> own $DST directory.
>>> I used it to be able to quickly rerun the test without the jtreg
>>> framework and compile time etc. but mostly
>>> to be able to actually add hotspot command-line flags.
>>> 
>>> 
>>> 
>>> 
>>> p.p.s. details from the error log (let me know if you want me to
>>> attach the error log to the bug report)
>>> 
>>> note: error log shows last 10 events including:
>>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>>> Event: 0.928 loading class TestClass3
>>> Event: 0.929 loading class TestClass3 done
>>> Event: 0.929 loading class java/lang/ClassCircularityError
>>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>> 
>>> TestThread
>>> 
>>> java frames:
>>> 
>>> j
>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> 
>>> j
>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> 
>>> j
>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> 
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> j
>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> 
>>> j
>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j
>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> j
>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> 
>>> j
>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> 
>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>> j
>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>> 
>>> j
>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>> 
>>> j
>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>> 
>>> v  ~StubRoutines::call_stub
>>> j
>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> 
>>> j
>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> 
>>> j
>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> 
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> j
>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> 
>>> j
>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j
>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> j
>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> 
>>> j
>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> 
>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>>> v  ~StubRoutines::call_stub
>>> 
>>> 
>>> 
>>> detailed frames:
>>> 
>>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*,
>>> int, Symbol*, char const*)+0x7c
>>> V  [libjvm.so+0xce005c]
>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>> Handle, Thread*)+0x7d8
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>> Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>> Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x690fbc]
>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>> ConstantPool*, int)+0x14a
>>> j
>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> 
>>> j
>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> 
>>> j
>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> 
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>> JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>> j
>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> 
>>> j
>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j
>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>> JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>> V  [libjvm.so+0xce2096]
>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>> V  [libjvm.so+0xce00a8]
>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>> Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>> Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>> Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>> j
>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> 
>>> j
>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> 
>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>> j
>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>> 
>>> j
>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>> 
>>> j
>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>> 
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>> JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*,
>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>>> V  [libjvm.so+0xa04afa]
>>> JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>>> V  [libjvm.so+0xa0485e]
>>> JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>>> V  [libjvm.so+0x9fb6e1]
>>> JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle,
>>> unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*,
>>> ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*,
>>> TempNewSymbol&, bool, Thread*)+0x2af
>>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*,
>>> ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*,
>>> Thread*)+0x2ed
>>> V  [libjvm.so+0xce1cc4]
>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>>> V  [libjvm.so+0xce00a8]
>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>> Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>> Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>> Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x690fbc]
>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>> ConstantPool*, int)+0x14a
>>> j
>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>> 
>>> j
>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>> 
>>> j
>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>> 
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>> JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>> j
>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>> 
>>> j
>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>> j
>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>> JavaCallArguments*, Thread*)+0x7d
>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>> V  [libjvm.so+0xce2096]
>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>> V  [libjvm.so+0xce00a8]
>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>> Handle, Thread*)+0x824
>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>> Handle, Handle, Thread*)+0x26d
>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>> Handle, Handle, bool, Thread*)+0x39
>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>> j
>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>> 
>>> j
>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>> 
>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>> ...<more frames>...
>>> 
>>> 
>>> On Oct 27, 2014, at 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>> 
>>>> Ok.
>>>> 
>>>> Thanks, Dan!
>>>> Serguei
>>>> 
>>>> 
>>>> On 10/27/14 7:05 AM, Daniel D. Daugherty wrote:
>>>>>> The test case was added by Dan.
>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>> (added Dan to the to-list)
>>>>> Here's the changeset that added the test:
>>>>> 
>>>>> $ hg log -v -r bca8bf23ac59
>>>>> test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>> changeset:   132:bca8bf23ac59
>>>>> user:        dcubed
>>>>> date:        Mon Mar 24 15:05:09 2008 -0700
>>>>> files: test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>> test/java/lang/instrument/ParallelTransformerLoaderAgent.java
>>>>> test/java/lang/instrument/ParallelTransformerLoaderApp.java
>>>>> test/java/lang/instrument/TestClass1.java
>>>>> test/java/lang/instrument/TestClass2.java
>>>>> test/java/lang/instrument/TestClass3.java
>>>>> description:
>>>>> 5088398: 3/2 java.lang.instrument TCK test deadlock (test11)
>>>>> Summary: Add regression test for single-threaded bootstrap classloader.
>>>>> Reviewed-by: sspitsyn
>>>>> 
>>>>> 
>>>>> Based on my e-mail archive for this bug and from the bug report itself,
>>>>> it looks like we got this test from Wily Labs. The original bug was a
>>>>> deadlock that stopped being reproducible after:
>>>>> 
>>>>> Karen fixed the bootstrap class loader to work in parallel via:
>>>>> 
>>>>>    4997893 4/5 Investigate allowing bootstrap loader to work in
>>>>> parallel
>>>>> 
>>>>> with that fix in place the deadlock no longer reproduces.
>>>>> I'm planning to use this bug as the vehicle for getting
>>>>> the test program into the INSTRUMENT_REGRESSION test suite.
>>>>> 
>>>>> *** (#2 of 2): 2008-02-29 18:20:17 GMT+00:00 daniel.daugherty at sun.com
>>>>> 
>>>>> 
>>>>> A careful reading of JDK-5088398 might reveal the intentions of this
>>>>> test...
>>>>> 
>>>>> Dan
>>>>> 
>>>>> 
>>>>> On 10/24/14 5:57 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Yumin,
>>>>>> 
>>>>>> On 10/24/14 4:08 PM, Yumin Qi wrote:
>>>>>>> Serguei,
>>>>>>> 
>>>>>>>  Thanks for your comments.
>>>>>>>  This test happens intermittently, but now it can repeat with 8/9.
>>>>>>>  Loading TestClass1 in main thread while loading TestClass2 in
>>>>>>> TestThread in parallel. They both will call transform since
>>>>>>> TestClass[1-3] are loaded via agent. When loading TestClass2, it
>>>>>>> will call loading TestClass3 in TestThread.
>>>>>>>  Note in the main thread, for loop:
>>>>>>> 
>>>>>>>                for (int i = 0; i < kNumIterations; i++)
>>>>>>>                {
>>>>>>>                        // load some classes from multiple threads
>>>>>>>                        // (this thread and one other)
>>>>>>>                        Thread testThread = new TestThread(2);
>>>>>>>                        testThread.start();
>>>>>>>                        loadClasses(1);
>>>>>>> 
>>>>>>>                        // log that it completed and reset for the
>>>>>>>                        // next iteration
>>>>>>>                        testThread.join();
>>>>>>>                        System.out.print(".");
>>>>>>>                        ParallelTransformerLoaderAgent.generateNewClassLoader();
>>>>>>>                }
>>>>>>> 
>>>>>>> The loader got renewed after testThread.join(). So both threads
>>>>>>> are using the exact same class loader.
>>>>>> You are right, thanks.
>>>>>> It means that all three classes (TestClass1, TestClass2 and TestClass3)
>>>>>> are loaded by the same class loader in each iteration.
>>>>>> 
>>>>>> However, I see more cases when the TestClass3 gets loaded.
>>>>>> It happens in a CFLH event when any other class (not TestClass*) in
>>>>>> the system is loaded.
>>>>>> The class loading thread can be any thread, not only "main" or
>>>>>> "TestThread".
>>>>>> I suspect this test case mostly targets class loading that happens
>>>>>> on other threads.
>>>>>> It is because of the lines:
>>>>>>                        // In 160_03 and older, transform() is called
>>>>>>                        // with the "system_loader_lock" held and that
>>>>>>                        // prevents the bootstrap class loaded from
>>>>>>                        // running in parallel. If we add a slight sleep
>>>>>>                        // delay here when the transform() call is not
>>>>>>                        // main or TestThread, then the deadlock in
>>>>>>                        // 160_03 and older is much more reproducible.
>>>>>>                        if (!tName.equals("main") &&
>>>>>>                            !tName.equals("TestThread")) {
>>>>>>                            System.out.println("Thread '" + tName +
>>>>>>                                "' has called transform()");
>>>>>>                            try {
>>>>>>                                Thread.sleep(500);
>>>>>>                            } catch (InterruptedException ie) {
>>>>>>                            }
>>>>>>                        }
>>>>>> 
>>>>>> What about the following?
>>>>>> 
>>>>>> In the ParallelTransformerLoaderAgent.java  make this change:
>>>>>>              if (!tName.equals("main"))
>>>>>>                  => if (tName.equals("TestThread"))
>>>>>> 
>>>>>> Does the updated test still fail?
>>>>>> 
>>>>>>> After a new class loader is created, the next iteration uses it.
>>>>>>> This is why the stack trace quite often shows it resolving
>>>>>>> JarLoader$2.
>>>>>>> 
>>>>>>> I do not quite understand the test case either. Loading TestClass3
>>>>>>> inside transform using the same class loader causes another call to
>>>>>>> transform and forms a cycle. If TestClass2 is seen as already
>>>>>>> loaded, the cycle ends, but that is still a risk.
>>>>>> In fact, I don't like that the test loads the class TestClass3 at
>>>>>> the TestClass3 CFLH event.
>>>>>> However, it is interesting to know why we did not see (is it the
>>>>>> case?) this issue before.
>>>>>> Also, it is interesting why the test stops failing with your fix
>>>>>> (replacing loader with SystemClassLoader).
>>>>>> 
>>>>>> The test case was added by Dan.
>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>> (added Dan to the to-list)
>>>>>> 
>>>>>> Thanks,
>>>>>> Serguei
>>>>>> 
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>> 
>>>>>>> On 10/24/2014 1:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Yumin,
>>>>>>>> 
>>>>>>>> Below is some analysis to make sure I understand the test
>>>>>>>> scenario correctly.
>>>>>>>> 
>>>>>>>> The ParallelTransformerLoaderApp.main() executes a 1000 iteration
>>>>>>>> loop.
>>>>>>>> At each iteration it does:
>>>>>>>>  - creates and starts a new TestThread
>>>>>>>>  - loads TestClass1 with the current class loader:
>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader()
>>>>>>>>  - changes the current class loader with new one:
>>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader()
>>>>>>>> 
>>>>>>>> The TestThread loads the TestClass2 concurrently with the main
>>>>>>>> thread.
>>>>>>>> 
>>>>>>>> At the CFLH events, the ParallelTransformerLoaderAgent does the
>>>>>>>> class retransformation.
>>>>>>>> If the thread loading the class is not "main", it loads the class
>>>>>>>> TestClass3
>>>>>>>> with the current class loader
>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader().
>>>>>>>> 
>>>>>>>> Sometimes, the TestClass2 and TestClass3 are loaded by the same
>>>>>>>> class loader recursively.
>>>>>>>> It happens if the class loader has not been changed between
>>>>>>>> loading TestClass2 and TestClass3 classes.
>>>>>>>> 
>>>>>>>> I'm not convinced yet the test is incorrect.
>>>>>>>> And it is not clear why we get a ClassCircularityError.
>>>>>>>> 
>>>>>>>> Please, let me know if the above understanding is wrong.
>>>>>>>> I also see the reply from David and share his concerns.
>>>>>>>> 
>>>>>>>> It is not clear if this failure is a regression.
>>>>>>>> Did we observe this issue before?
>>>>>>>> If not, then when and why did this failure start to appear?
>>>>>>>> 
>>>>>>>> Unfortunately, it is impossible to look at the test run history
>>>>>>>> at the moment, because Aurora is under maintenance.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>> 
>>>>>>>> On 10/13/14 3:58 PM, Yumin Qi wrote:
>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>>>>>>>>> 
>>>>>>>>> The bug is marked as confidential, so I posted the webrev internally.
>>>>>>>>> 
>>>>>>>>> Problem: The test case tries to load a class from the same jar
>>>>>>>>> via the agent while another class from that jar is still being
>>>>>>>>> loaded by the same class loader in the same thread. The call
>>>>>>>>> happens in transform, which is a rare case: loading another class
>>>>>>>>> in the middle of loading a class. The result is a
>>>>>>>>> ClassCircularityError. While the first class is being loaded, the
>>>>>>>>> VM puts JarLoader$2 in the placeholder table, then starts
>>>>>>>>> defineClass, which calls transform and begins loading the second
>>>>>>>>> class. That load follows the same routine, tries to load
>>>>>>>>> JarLoader$2 first, and finds it already in the placeholder table,
>>>>>>>>> so a ClassCircularityError is thrown.
>>>>>>>>> Fix: The test case should not load a class with the same class
>>>>>>>>> loader, in the same thread, from the same jar inside the
>>>>>>>>> 'transform' method. I changed it to load with the system class
>>>>>>>>> loader, for which we expect to see a ClassNotFoundException. For
>>>>>>>>> details, see the bug comments.
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>> 
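
The placeholder scenario described in the original RFR can be modelled in a few lines of Java. This is only a toy sketch, not HotSpot code: PlaceholderSketch, its HELPER constant and the transform hook are hypothetical names, and the "placeholder table" is a plain HashSet for a single thread and a single loader. It illustrates why a nested load issued from transform() finds the helper class already in progress.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Toy model of one loader's placeholder table; not HotSpot's real data structure.
final class PlaceholderSketch {
    // Stand-in for the helper class that, per the RFR, is resolved on every load from the jar.
    static final String HELPER = "JarLoader$2";

    // Names whose loading is currently in progress on this thread.
    private final Set<String> placeholders = new HashSet<>();

    // Hypothetical stand-in for the agent's transform() callback; does nothing by default.
    private Consumer<String> transformHook = name -> { };

    void setTransformHook(Consumer<String> hook) {
        this.transformHook = hook;
    }

    void loadFromJar(String className) {
        // Entering the load: the helper gets a placeholder. A second entry while
        // the first is still active is reported as a circularity.
        if (!placeholders.add(HELPER)) {
            throw new ClassCircularityError(HELPER);
        }
        try {
            // Stands in for defineClass() calling the agent's transform().
            transformHook.accept(className);
        } finally {
            placeholders.remove(HELPER);
        }
    }

    public static void main(String[] args) {
        PlaceholderSketch loader = new PlaceholderSketch();
        // Mimic the test agent: while TestClass2 is being defined, load TestClass3.
        loader.setTransformHook(name -> {
            if (name.equals("TestClass2")) {
                loader.loadFromJar("TestClass3");
            }
        });
        loader.loadFromJar("TestClass1"); // no nested load, completes normally
        try {
            loader.loadFromJar("TestClass2");
        } catch (ClassCircularityError e) {
            System.out.println("ClassCircularityError: " + e.getMessage());
        }
    }
}
```

In this model the nested load fails only because it re-enters the same table on the same thread. A different table (for example, a different loader) would not see the placeholder.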


From jiangli.zhou at oracle.com  Wed Nov 12 16:33:18 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Wed, 12 Nov 2014 08:33:18 -0800
Subject: RFR 8054008: Using -XX:-LazyBootClassLoader crashes with
	ACCESS_VIOLATION on Win 64bit
In-Reply-To: <10933301-E9F4-4BDF-B678-50FE846873BD@oracle.com>
References: <545C21E6.90709@oracle.com>
	<682E3725-DDA0-4A1D-9F6E-94C6CF6045A1@oracle.com>
	<545D0167.3070903@oracle.com> <545D4BDE.9010908@oracle.com>
	<10933301-E9F4-4BDF-B678-50FE846873BD@oracle.com>
Message-ID: <54638BCE.5000607@oracle.com>

Thanks, Roland!

Jiangli

On 11/12/2014 01:55 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~jiangli/8054008/webrev.02/
> It looks good to me.
>
> Roland.


From tom.deneau at amd.com  Wed Nov 12 16:52:13 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Wed, 12 Nov 2014 16:52:13 +0000
Subject: hang when using -XX:-UseCompilerSafepoints
Message-ID: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>

Hi all --

Forwarding a thread which came about on the jmh-dev mail list, as recommended by Aleksey Shipilev (see below).  The JMH framework has a timing control thread which sleeps for a certain period, then sets a volatile isDone variable.  Meanwhile, the benchmark thread loops doing its benchmark code and also checking the isDone field.   A hang occurs if -XX:-UseCompilerSafepoints is used.

The original issue can be reproduced by the following steps

   hg clone http://hg.openjdk.java.net/code-tools/jmh
   cd jmh
   mvn clean install -DskipTests=true
   cd jmh-samples
   java  -server -XX:-UseCompilerSafepoints -jar target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0

-- Tom Deneau


-----Original Message-----
From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com] 
Sent: Wednesday, November 12, 2014 6:09 AM
To: Deneau, Tom; jmh-dev at openjdk.java.net
Subject: Re: using -XX:-UseCompilerSafepoints

Hi Tom,

On 11/11/2014 07:34 PM, Deneau, Tom wrote:
> It looks like a thread that calls Thread.sleep (as the timing control
> thread does in the harness) will eventually go thru
> SafepointSynchronize::block (as part of the ThreadBlockInVM
> destructor).  So if there is a looping benchmark thread compiled
> without Compiler Safepoints, the control thread will be blocked and
> will never set the isDone flag.

So, you are saying that without the safepoint in the while(!isDone)
loop in workload, control thread and workload thread will never
rendezvous on safepoint? I believe this is a bug with
-XX:-UseCompilerSafepoints, because the comment in safepoint.cpp calls this
out specifically for VMThread vs. Mutator threads:

 // In a pathological scenario such as that described in CR6415670
 // the VMthread may sleep just before the mutator(s) become safe.
 // In that case the mutators will be stalled waiting for the safepoint
 // to complete and the the VMthread will be sleeping, waiting for the
 // mutators to rendezvous. The VMthread will eventually wake up and
 // detect that all mutators are safe, at which point we'll again make
 // progress.

If this is the case, you probably need to report this to the runtime team.

> This is probably OK, just need to document that CompilerSafepoints
> cannot be turned off.

I think it is safe to presume something will go hairy if you are using
any special VM flag, therefore I am not inclined to document this.

Thanks,
-Aleksey.
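
The pattern discussed above (a control thread that sleeps and then sets a volatile flag, and a worker that spins on that flag) can be sketched as follows. This is not JMH code: IsDoneSketch and runFor are hypothetical names, and the sketch only shows the shape of the two threads. With default VM flags it terminates. It does not try to reproduce the hang itself.

```java
// Minimal model of the control-thread / workload-thread pattern; not the JMH harness.
final class IsDoneSketch {
    // Written by the control thread, polled by the workload loop.
    static volatile boolean isDone;

    // Spins on the calling thread until the control thread flips isDone.
    static long runFor(long millis) {
        isDone = false;
        Thread control = new Thread(() -> {
            try {
                Thread.sleep(millis); // the timing control thread sleeps first
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            isDone = true;            // and then signals the workload to stop
        }, "control");
        control.start();

        long iterations = 0;
        while (!isDone) {             // non-counted loop polling the flag
            iterations++;
        }

        try {
            control.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("iterations: " + runFor(100));
    }
}
```

The while loop here is the kind of non-counted loop the thread is about: whether the workload thread can be stopped at a safepoint while it spins depends on the poll the compiler places in that loop.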


From david.r.chase at oracle.com  Wed Nov 12 17:03:11 2014
From: david.r.chase at oracle.com (David Chase)
Date: Wed, 12 Nov 2014 12:03:11 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <545F642E.30205@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
	<545F642E.30205@gmail.com>
Message-ID: <BC4593AE-29E6-4A00-B747-79DB91F02A87@oracle.com>

Hello Peter,

I was looking at this (thinking it would be a useful thing to benchmark, looking for possible improvements)
and noticed that you rely on the hashed objects having a sensible value-dependent hashcode (as opposed
to the default Object hashcode).  Sadly, this seems not to be the case for MemberNames or for "Types".
I am sorely tempted to repair this glitch, not sure if it fits in the scope of the original bug, but there's a lot to
be said for future-performance-proofing.

David

On 2014-11-09, at 7:55 AM, Peter Levart <peter.levart at gmail.com> wrote:

> Hi David,
> 
> I played a little with the idea of having a hash table instead of packed sorted array for interning. Using ConcurrentHashMap would present quite some memory overhead. A more compact representation is possible in the form of a linear-scan hash table where elements of array are MemberNames themselves:
> 
> http://cr.openjdk.java.net/~plevart/misc/MemberName.intern/jdk.06.diff/
> 
> This is a drop-in replacement for MemberName on top of your jdk.06 patch. If you have some time, you can run this with your performance tests to see if it presents any difference. If not, then perhaps this interning is not so performance critical after all.
> 
> Regards, Peter
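
For readers who have not opened the diff, the general idea of a linear-scan hash table whose array slots hold the interned elements themselves can be sketched like this. It is not Peter's patch: LinearProbeInterner is a hypothetical name, the sketch is single-threaded and never removes entries, and it assumes the elements have value-based hashCode() and equals().

```java
// Open-addressing intern table: the array holds the elements directly,
// so there are no per-entry node objects as in a chained hash map.
final class LinearProbeInterner<T> {
    private Object[] table = new Object[16]; // length is always a power of two
    private int size;

    // Returns the canonical instance equal to candidate, adding candidate if none exists.
    @SuppressWarnings("unchecked")
    T intern(T candidate) {
        int mask = table.length - 1;
        int i = spread(candidate.hashCode()) & mask;
        // Linear scan from the home slot until a match or an empty slot is found.
        for (Object e; (e = table[i]) != null; i = (i + 1) & mask) {
            if (e.equals(candidate)) {
                return (T) e;
            }
        }
        table[i] = candidate;
        if (++size * 2 > table.length) { // keep the table at most half full
            grow();
        }
        return candidate;
    }

    int size() {
        return size;
    }

    // Doubles the array and re-inserts every element at its new position.
    private void grow() {
        Object[] old = table;
        table = new Object[old.length * 2];
        int mask = table.length - 1;
        for (Object e : old) {
            if (e == null) {
                continue;
            }
            int i = spread(e.hashCode()) & mask;
            while (table[i] != null) {
                i = (i + 1) & mask;
            }
            table[i] = e;
        }
    }

    // Mixes high bits into the low bits used for indexing.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }
}
```

Compared with a ConcurrentHashMap, the memory cost here is one array slot per element plus the empty slots, which is the compactness argument made above. Thread safety would have to be added separately.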


From david.r.chase at oracle.com  Wed Nov 12 18:27:33 2014
From: david.r.chase at oracle.com (David Chase)
Date: Wed, 12 Nov 2014 13:27:33 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <545F642E.30205@gmail.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
	<545F642E.30205@gmail.com>
Message-ID: <B7CBF555-A17C-498A-B259-4E28F4B2198E@oracle.com>


Hello Peter,

> Sadly, this seems not to be the case for MemberNames or for "Types".

That statement is inoperative.  Mistakes were made.
It's compareTo that they lack.

David


On 2014-11-09, at 7:55 AM, Peter Levart <peter.levart at gmail.com> wrote:

> Hi David,
> 
> I played a little with the idea of having a hash table instead of packed sorted array for interning. Using ConcurrentHashMap would present quite some memory overhead. A more compact representation is possible in the form of a linear-scan hash table where elements of array are MemberNames themselves:
> 
> http://cr.openjdk.java.net/~plevart/misc/MemberName.intern/jdk.06.diff/
> 
> This is a drop-in replacement for MemberName on top of your jdk.06 patch. If you have some time, you can run this with your performance tests to see if it presents any difference. If not, then perhaps this interning is not so performance critical after all.
> 
> Regards, Peter


From chris.plummer at oracle.com  Wed Nov 12 19:44:22 2014
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 12 Nov 2014 11:44:22 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <545D939D.2030308@oracle.com>
References: <545D939D.2030308@oracle.com>
Message-ID: <5463B896.10801@oracle.com>

Hi,

I'm still looking for reviewers.

thanks,

Chris

On 11/7/14 7:53 PM, Chris Plummer wrote:
> This is an initial review for 6762191. I'm guessing there will be 
> recommendations to fix in a different way, but thought this would be a 
> good time to start the discussion.
>
> https://bugs.openjdk.java.net/browse/JDK-6762191
> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>
> The bug is that if the -Xss size is set to something very small (like 
> 16k), on linux there will be a crash due to overwriting the end of the 
> stack. This happens before hotspot can compute its stack needs and 
> verify that the stack is big enough.
>
> It didn't seem viable to move the hotspot stack size check earlier. It 
> depends on too much other work done before that point, and the changes 
> would have been disruptive. The stack size check is currently done in 
> os::init_2().
>
> What is needed is a check before the thread is created. That way we 
> can create a thread with a big enough stack to handle all needs up to 
> the point of the check in os::init_2(). This initial check does not 
> need to be the final check. It just needs to confirm that we have 
> enough stack to get us to the check in os::init_2().
>
> I decided to check in java.c if the -Xss size is too small, and set it 
> to a larger size if it is. I hard coded this size to 32k (I'll explain 
> why 32k later). I suspect this is the part that will result in some 
> debate. If you have better suggestions let me know. If it does stay 
> here, then probably the 32k needs to be a #define, and maybe even an 
> OS porting interface, but I'm not sure where to put it.
>
> The reason I chose 32k is because this is big enough for all platforms 
> to get to the stack size check in os::init_2(). It is also smaller 
> than the actual minimum stack size allowed on any platform. 32-bit 
> Windows has the smallest requirement at 64k. I added some printfs to 
> print the minimum stack requirement, and then ran a simple JTReg test 
> with every JPRT supported platform to get the results.
>
> The TooSmallStackSize.sh will run "java -version" with -Xss16k, 
> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the 
> error message produced by the JVM, such as in the following:
>
> $ java -Xss32k -version
> The stack size specified is too small, Specify at least 100k
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.
>
> I ran this test through JPRT on all platforms, and they all pass.
>
> One thing to point out is that Windows behaves a bit differently than 
> the other platforms. It always rounds the stack size up to a multiple 
> of 64k , so even if you specify -Xss16k, you get a 64k stack. On 
> 32-bit Windows with C1, 64k is also the minimum requirement, so there 
> is no error produced in this case. However, on 32-bit Windows with C2, 
> 68k is the minimum, so an error is produced since the stack will only 
> be 64k. There is no bug here. It's just a bit confusing.
>
> thanks,
>
> Chris
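
The size arithmetic in the review request can be worked through with a small sketch. The real change is in java.c and the final check is in os::init_2(). The Java below is only a model with hypothetical names (StackSizeSketch, launcherStackSize, roundUpToGranule, passesVmCheck), and it assumes an explicit positive -Xss value. It does not model the case where -Xss is not given.

```java
// Worked model of the numbers quoted in the mail; not the launcher or HotSpot code.
final class StackSizeSketch {
    static final long K = 1024;
    static final long LAUNCHER_FLOOR = 32 * K;  // the hard-coded 32k from the mail
    static final long WINDOWS_GRANULE = 64 * K; // Windows rounds stacks up to a multiple of 64k

    // Early check: raise a too-small request so the VM can reach its own check.
    static long launcherStackSize(long requested) {
        return Math.max(requested, LAUNCHER_FLOOR);
    }

    // Rounds size up to the next multiple of granule.
    static long roundUpToGranule(long size, long granule) {
        return ((size + granule - 1) / granule) * granule;
    }

    // Final check: the actual stack must cover the VM's computed minimum.
    static boolean passesVmCheck(long actualStack, long vmMinimum) {
        return actualStack >= vmMinimum;
    }

    public static void main(String[] args) {
        long requested = 16 * K;                                    // -Xss16k
        long early = launcherStackSize(requested);                  // raised to 32k
        long onWindows = roundUpToGranule(early, WINDOWS_GRANULE);  // rounded up to 64k
        System.out.println("early=" + early / K + "k windows=" + onWindows / K + "k");
        System.out.println("C1 (min 64k): " + passesVmCheck(onWindows, 64 * K));
        System.out.println("C2 (min 68k): " + passesVmCheck(onWindows, 68 * K));
    }
}
```

With these numbers, -Xss16k on 32-bit Windows ends up as a 64k stack, which satisfies the 64k minimum quoted for C1 but not the 68k minimum quoted for C2, matching the behaviour described above.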


From aleksey.shipilev at oracle.com  Wed Nov 12 20:13:37 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Wed, 12 Nov 2014 23:13:37 +0300
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
Message-ID: <5463BF71.4080804@oracle.com>

Hi,

Still not sure if this is a runtime bug: stripping safepoints from the
non-counted loop seems to be a recipe for disaster.

Anyhow, I think it deserves a simpler example. Submitted the bug and
attached a simple test there:
 https://bugs.openjdk.java.net/browse/JDK-8064749

Thanks,
-Aleksey.

On 12.11.2014 19:52, Deneau, Tom wrote:
> Hi all --
> 
> Forwarding a thread which came about on the jmh-dev mail list, as recommended by Aleksey Shipilev (see below).  The JMH framework has a timing control thread which sleeps for a certain period, then sets a volatile isDone variable.  Meanwhile, the benchmark thread loops doing its benchmark code and also checking the isDone field.   A hang occurs if -XX:-UseCompilerSafepoints is used.
> 
> The original issue can be reproduced by the following steps
> 
>    hg clone http://hg.openjdk.java.net/code-tools/jmh
>    cd jmh
>    mvn clean install -DskipTests=true
>    cd jmh-samples
>    java  -server -XX:-UseCompilerSafepoints -jar target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
> 
> -- Tom Deneau
> 
> 
> -----Original Message-----
> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com] 
> Sent: Wednesday, November 12, 2014 6:09 AM
> To: Deneau, Tom; jmh-dev at openjdk.java.net
> Subject: Re: using -XX:-UseCompilerSafepoints
> 
> Hi Tom,
> 
> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>> It looks like a thread that calls Thread.sleep (as the timing control
>> thread does in the harness) will eventually go thru
>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>> destructor).  So if there is a looping benchmark thread compiled
>> without Compiler Safepoints, the control thread will be blocked and
>> will never set the isDone flag.
> 
> So, you are saying that without the safepoint in the while(!isDone)
> loop in workload, control thread and workload thread will never
> rendezvous on safepoint? I believe this is a bug with
> -XX:-UseCompilerSafepoints, because the comment in safepoint.cpp calls this
> out specifically for VMThread vs. Mutator threads:
> 
>  // In a pathological scenario such as that described in CR6415670
>  // the VMthread may sleep just before the mutator(s) become safe.
>  // In that case the mutators will be stalled waiting for the safepoint
>  // to complete and the the VMthread will be sleeping, waiting for the
>  // mutators to rendezvous. The VMthread will eventually wake up and
>  // detect that all mutators are safe, at which point we'll again make
>  // progress.
> 
> If this is the case, you probably need to report this to the runtime team.
> 
>> This is probably OK, just need to document that CompilerSafepoints
>> cannot be turned off.
> 
> I think it is safe to presume something will go hairy if you are using
> any special VM flag, therefore I am not inclined to document this.
> 
> Thanks,
> -Aleksey.
> 


From christian.tornqvist at oracle.com  Wed Nov 12 20:53:24 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Wed, 12 Nov 2014 15:53:24 -0500
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
	multiple	times; using first specification
In-Reply-To: <5461A8DE.1050009@oracle.com>
References: <5461A8DE.1050009@oracle.com>
Message-ID: <01ff01cffeba$b3aa2c40$1afe84c0$@oracle.com>

Hi Calvin,

Change looks good, thanks for fixing this.

Thanks,
Christian

-----Original Message-----
From: hotspot-runtime-dev
[mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Calvin
Cheung
Sent: Tuesday, November 11, 2014 1:13 AM
To: hotspot-runtime-dev at openjdk.java.net
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
multiple times; using first specification

This is for fixing link warnings on windows such as the following:
jni.obj : warning LNK4197: export 'JNI_CreateJavaVM' specified multiple
times; using first specification

The warning is reproducible with both VS2010 and VS2013.
It applies to 64-bit only, probably because on 32-bit
__declspec(dllexport) exports the decorated function name with a leading
underscore, which is not the case on 64-bit, as described in:
http://stackoverflow.com/questions/3572344/a-warning-with-building-64bit-dll

All those functions are declared with JNIEXPORT (#define JNIEXPORT
__declspec(dllexport)) and we're adding the /export:<function name> in the
link command. Therefore, on 64-bit platform, we get the "specified multiple
times" LNK4197 warning.

A fix is to check whether the platform is 64-bit and, if so, not add those
/export options to the link command.

JBS: https://bugs.openjdk.java.net/browse/JDK-8043491

webrev: http://cr.openjdk.java.net/~ccheung/8043491/webrev/

Tests:
     (1) build jvm.dll via command line (both 32- and 64-bit)
           use configure.sh to setup and then do "make CONF=<config>
hotspot"

     (2) generate visual studio project files using ProjectCreator (both
32- and 64-bit)
           build jvm.dll via VS2013 (both 32- and 64-bit)

     (3) JPRT

thanks,
Calvin


From calvin.cheung at oracle.com  Wed Nov 12 21:08:52 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 12 Nov 2014 13:08:52 -0800
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
	multiple	times; using first specification
In-Reply-To: <01ff01cffeba$b3aa2c40$1afe84c0$@oracle.com>
References: <5461A8DE.1050009@oracle.com>
	<01ff01cffeba$b3aa2c40$1afe84c0$@oracle.com>
Message-ID: <5463CC64.1000003@oracle.com>

Thanks for your review - Christian.

Calvin

On 11/12/2014 12:53 PM, Christian Tornqvist wrote:
> Hi Calvin,
>
> Change looks good, thanks for fixing this.
>
> Thanks,
> Christian
>
> -----Original Message-----
> From: hotspot-runtime-dev
> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Calvin
> Cheung
> Sent: Tuesday, November 11, 2014 1:13 AM
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
> multiple times; using first specification
>
> This is for fixing link warnings on windows such as the following:
> jni.obj : warning LNK4197: export 'JNI_CreateJavaVM' specified multiple
> times; using first specification
>
> The warning is reproducible with both VS2010 and VS2013.
> It is applicable to 64-bit only probably due to the
> __declspec(dllexport) on 32-bit, it exports the function decorated name with
> a leading underscore, but not the case on 64-bit as described in:
> http://stackoverflow.com/questions/3572344/a-warning-with-building-64bit-dll
>
> All those functions are declared with JNIEXPORT (#define JNIEXPORT
> __declspec(dllexport)) and we're adding the /export:<function name> in the
> link command. Therefore, on 64-bit platform, we get the "specified multiple
> times" LNK4197 warning.
>
> A fix is to check if the platform is 64-bit, we don't add those /export
> option to the link command.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8043491
>
> webrev: http://cr.openjdk.java.net/~ccheung/8043491/webrev/
>
> Tests:
>       (1) build jvm.dll via command line (both 32- and 64-bit)
>             use configure.sh to setup and then do "make CONF=<config>
> hotspot"
>
>       (2) generate visual studio project files using ProjectCreator (both
> 32- and 64-bit)
>             build jvm.dll via VS2013 (both 32- and 64-bit)
>
>       (3) JPRT
>
> thanks,
> Calvin
>
>
>
>


From david.holmes at oracle.com  Wed Nov 12 22:45:16 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 08:45:16 +1000
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54633524.8000607@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
	<5462E6B5.7080504@oracle.com> <54633524.8000607@oracle.com>
Message-ID: <5463E2FC.5010207@oracle.com>

On 12/11/2014 8:23 PM, Aleksey Shipilev wrote:
> Hi David,
>
> On 12.11.2014 07:48, David Holmes wrote:
>> On 12/11/2014 12:40 AM, Aleksey Shipilev wrote:
>> All looks good to me.
>
> Thanks for the review!
>
>> But I also noticed this strange (to me) assertion in javaClasses.cpp
>>
>>   void java_lang_Thread::set_name(oop java_thread, oop name) {
>>      assert(java_thread->obj_field(_name_offset) == NULL, "name should be
>> NULL");
>>      java_thread->obj_field_put(_name_offset, name);
>>    }
>>
>> and on investigation it seems like this is dead code - I couldn't locate
>> a call to java_lang_Thread::set_name ?? It would only be usable on an
>> attaching thread (else name can't be null) and we pass the name to the
>> Thread constructor in that case.
>
> set_name is not used, as I mentioned earlier -- that makes the change

Sorry, I missed that comment.

> even more "safe". I was even tempted to drop the setter completely, but
> it would break the symmetry against other setters and getters. I dropped
> the assert at set_name in this update:
>    http://cr.openjdk.java.net/~shade/8059677/webrev.03.hs/
>    http://cr.openjdk.java.net/~shade/8059677/webrev.03.jdk/
>
> The only difference against the previous version is the dropped assert,
> so I haven't re-spinned the tests.

OK. I'm more inclined to delete unused code but it is fine as is.

Thanks,
David


> Thanks,
> -Aleksey.
>

From yumin.qi at oracle.com  Wed Nov 12 22:45:42 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 12 Nov 2014 14:45:42 -0800
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
	multiple times; using first specification
In-Reply-To: <5461A8DE.1050009@oracle.com>
References: <5461A8DE.1050009@oracle.com>
Message-ID: <5463E316.30308@oracle.com>

Looks good to me.

Thanks
Yumin
On 11/10/2014 10:12 PM, Calvin Cheung wrote:
> This is for fixing link warnings on windows such as the following:
> jni.obj : warning LNK4197: export 'JNI_CreateJavaVM' specified 
> multiple times; using first specification
>
> The warning is reproducible with both VS2010 and VS2013.
> It is applicable to 64-bit only probably due to the 
> __declspec(dllexport) on 32-bit, it exports the function decorated 
> name with a leading underscore, but not the case on 64-bit as 
> described in:
> http://stackoverflow.com/questions/3572344/a-warning-with-building-64bit-dll 
>
>
> All those functions are declared with JNIEXPORT (#define JNIEXPORT 
> __declspec(dllexport)) and we're adding the /export:<function name> in 
> the link command. Therefore, on 64-bit platform, we get the "specified 
> multiple times" LNK4197 warning.
>
> A fix is to check if the platform is 64-bit, we don't add those 
> /export option to the link command.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8043491
>
> webrev: http://cr.openjdk.java.net/~ccheung/8043491/webrev/
>
> Tests:
>     (1) build jvm.dll via command line (both 32- and 64-bit)
>           use configure.sh to setup and then do "make CONF=<config> 
> hotspot"
>
>     (2) generate visual studio project files using ProjectCreator 
> (both 32- and 64-bit)
>           build jvm.dll via VS2013 (both 32- and 64-bit)
>
>     (3) JPRT
>
> thanks,
> Calvin
>
>
>


From calvin.cheung at oracle.com  Wed Nov 12 22:48:24 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 12 Nov 2014 14:48:24 -0800
Subject: RFR(S): 8043491: warning LNK4197: export '... ...' specified
	multiple times; using first specification
In-Reply-To: <5463E316.30308@oracle.com>
References: <5461A8DE.1050009@oracle.com> <5463E316.30308@oracle.com>
Message-ID: <5463E3B8.3050708@oracle.com>

Thanks for your review - Yumin.

On 11/12/2014 2:45 PM, Yumin Qi wrote:
> Looks good to me.
>
> Thanks
> Yumin
> On 11/10/2014 10:12 PM, Calvin Cheung wrote:
>> This is for fixing link warnings on windows such as the following:
>> jni.obj : warning LNK4197: export 'JNI_CreateJavaVM' specified 
>> multiple times; using first specification
>>
>> The warning is reproducible with both VS2010 and VS2013.
>> It is applicable to 64-bit only probably due to the 
>> __declspec(dllexport) on 32-bit, it exports the function decorated 
>> name with a leading underscore, but not the case on 64-bit as 
>> described in:
>> http://stackoverflow.com/questions/3572344/a-warning-with-building-64bit-dll 
>>
>>
>> All those functions are declared with JNIEXPORT (#define JNIEXPORT 
>> __declspec(dllexport)) and we're adding the /export:<function name> 
>> in the link command. Therefore, on 64-bit platform, we get the 
>> "specified multiple times" LNK4197 warning.
>>
>> A fix is to check if the platform is 64-bit, we don't add those 
>> /export option to the link command.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8043491
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8043491/webrev/
>>
>> Tests:
>>     (1) build jvm.dll via command line (both 32- and 64-bit)
>>           use configure.sh to setup and then do "make CONF=<config> 
>> hotspot"
>>
>>     (2) generate visual studio project files using ProjectCreator 
>> (both 32- and 64-bit)
>>           build jvm.dll via VS2013 (both 32- and 64-bit)
>>
>>     (3) JPRT
>>
>> thanks,
>> Calvin
>>
>>
>>
>


From aleksey.shipilev at oracle.com  Wed Nov 12 23:01:39 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 02:01:39 +0300
Subject: RFR (S) 8059677: Thread.getName() instantiates Strings
In-Reply-To: <54621FFA.2070503@oracle.com>
References: <545FB64F.7090705@oracle.com> <54621FFA.2070503@oracle.com>
Message-ID: <5463E6D3.6030806@oracle.com>

On 11.11.2014 17:40, Aleksey Shipilev wrote:
> On 11/09/2014 09:45 PM, Aleksey Shipilev wrote:
>> Thread.getName() returns String, and does new String instantiation every
>> time, because the thread name is stored in char[]. Even though we use a
>> private String constructor that shares the char[] array without copying
>> it, this still hurts some use cases (think extra-fast logging). To the
>> extent some people actually maintain Map<Thread, String> to avoid it.
>>  https://bugs.openjdk.java.net/browse/JDK-8059677
>>
>> Here's the attempt to maintain String instead of char[]:
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.01.jdk/
>>  http://cr.openjdk.java.net/~shade/8059677/webrev.01.hs/
> 
> Updated webrevs:
>   http://cr.openjdk.java.net/~shade/8059677/webrev.02.jdk/
>   http://cr.openjdk.java.net/~shade/8059677/webrev.02.hs/

All right, third time's a charm. All reviewers seem to be happy with these
changes:
  http://cr.openjdk.java.net/~shade/8059677/webrev.03.jdk/
  http://cr.openjdk.java.net/~shade/8059677/webrev.03.hs/

Coleen had volunteered to sponsor them (thanks!), here are the changesets:
 http://cr.openjdk.java.net/~shade/8059677/8059677-jdk.changeset
 http://cr.openjdk.java.net/~shade/8059677/8059677-hs.changeset
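
For readers without the webrevs at hand, here is a minimal sketch of the
allocation difference under discussion. It is not the java.lang.Thread
source: the classes NameAsChars and NameAsString are hypothetical, and the
public String(char[]) constructor used here copies the array, unlike the
private sharing constructor mentioned above.

```java
// Hypothetical illustration only; not the actual java.lang.Thread code.
final class ThreadNameDemo {
    public static void main(String[] args) {
        NameAsChars a = new NameAsChars("worker-1");
        NameAsString b = new NameAsString("worker-1");
        // char[]-backed holder: each call builds a fresh String object.
        System.out.println(a.getName() == a.getName());
        // String-backed holder: each call returns the stored instance.
        System.out.println(b.getName() == b.getName());
    }
}

final class NameAsChars {
    private volatile char[] name;

    NameAsChars(String n) {
        name = n.toCharArray();
    }

    // Allocates a new String on every call.
    String getName() {
        return new String(name);
    }
}

final class NameAsString {
    private volatile String name;

    NameAsString(String n) {
        name = n;
    }

    // No allocation: returns the same String instance every time.
    String getName() {
        return name;
    }
}
```

Callers that read the name on a hot path (for example a logger) see one
allocation per call with the first holder and none with the second.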

Thanks,
-Aleksey.


From david.holmes at oracle.com  Wed Nov 12 23:27:05 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 09:27:05 +1000
Subject: RFR (S) 8062307: 'Reference handler' thread triggers assert w/
	TraceThreadEvents
In-Reply-To: <5451BD59.4060202@oracle.com>
References: <5451BADF.8040203@oracle.com> <5451BD59.4060202@oracle.com>
Message-ID: <5463ECC9.10309@oracle.com>

The CCC for this trivial removal has been removed.

Still need two reviewers please.

David

On 30/10/2014 2:23 PM, David Holmes wrote:
> On 30/10/2014 2:13 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8062307
>>
>> Webrev: http://cr.openjdk.java.net/~dholmes/8062307/webrev/
>>
>> It turns out that the little known TraceThreadEvents logic has been
>> broken since at least very early in JDK 5. A develop-only option it was
>> intended to show when different Thread methods were called (the VM side
>> of certain java.lang.Thread methods). While that sounds potentially
>> useful for debugging it seems that in practice it is not - this has been
>> broken for over 10 years with nobody noticing: it is unused. So rather
>> than fix unused code it is proposed to simply delete it instead.
>
> Correction this has been noticed in the past:
>
> https://bugs.openjdk.java.net/browse/JDK-6757482 (2008-10-09 06:51)
>
> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2008-March/001586.html
>
>
> David
>
>> Thanks,
>> David

From coleen.phillimore at oracle.com  Wed Nov 12 23:34:06 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 12 Nov 2014 18:34:06 -0500
Subject: RFR (S) 8062307: 'Reference handler' thread triggers assert w/
	TraceThreadEvents
In-Reply-To: <5463ECC9.10309@oracle.com>
References: <5451BADF.8040203@oracle.com> <5451BD59.4060202@oracle.com>
	<5463ECC9.10309@oracle.com>
Message-ID: <5463EE6E.7060008@oracle.com>


Change looks good.   You mean the CCC is approved, not removed.

Coleen

On 11/12/14, 6:27 PM, David Holmes wrote:
> The CCC for this trivial removal has been removed.
>
> Still need two reviewers please.
>
> David
>
> On 30/10/2014 2:23 PM, David Holmes wrote:
>> On 30/10/2014 2:13 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8062307
>>>
>>> Webrev: http://cr.openjdk.java.net/~dholmes/8062307/webrev/
>>>
>>> It turns out that the little known TraceThreadEvents logic has been
>>> broken since at least very early in JDK 5. A develop-only option it was
>>> intended to show when different Thread methods were called (the VM side
>>> of certain java.lang.Thread methods). While that sounds potentially
>>> useful for debugging it seems that in practice it is not - this has 
>>> been
>>> broken for over 10 years with nobody noticing: it is unused. So rather
>>> than fix unused code it is proposed to simply delete it instead.
>>
>> Correction this has been noticed in the past:
>>
>> https://bugs.openjdk.java.net/browse/JDK-6757482 (2008-10-09 06:51)
>>
>> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2008-March/001586.html 
>>
>>
>>
>> David
>>
>>> Thanks,
>>> David


From jiangli.zhou at oracle.com  Wed Nov 12 23:37:09 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Wed, 12 Nov 2014 15:37:09 -0800
Subject: RFR (S) 8062307: 'Reference handler' thread triggers assert w/
	TraceThreadEvents
In-Reply-To: <5463ECC9.10309@oracle.com>
References: <5451BADF.8040203@oracle.com> <5451BD59.4060202@oracle.com>
	<5463ECC9.10309@oracle.com>
Message-ID: <5463EF25.40804@oracle.com>

Hi David,

The change looks good.

Thanks,
Jiangli

On 11/12/2014 03:27 PM, David Holmes wrote:
> The CCC for this trivial removal has been removed.
>
> Still need two reviewers please.
>
> David
>
> On 30/10/2014 2:23 PM, David Holmes wrote:
>> On 30/10/2014 2:13 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8062307
>>>
>>> Webrev: http://cr.openjdk.java.net/~dholmes/8062307/webrev/
>>>
>>> It turns out that the little known TraceThreadEvents logic has been
>>> broken since at least very early in JDK 5. A develop-only option it was
>>> intended to show when different Thread methods were called (the VM side
>>> of certain java.lang.Thread methods). While that sounds potentially
>>> useful for debugging it seems that in practice it is not - this has 
>>> been
>>> broken for over 10 years with nobody noticing: it is unused. So rather
>>> than fix unused code it is proposed to simply delete it instead.
>>
>> Correction this has been noticed in the past:
>>
>> https://bugs.openjdk.java.net/browse/JDK-6757482 (2008-10-09 06:51)
>>
>> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2008-March/001586.html 
>>
>>
>>
>> David
>>
>>> Thanks,
>>> David


From vladimir.kozlov at oracle.com  Wed Nov 12 23:38:48 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 12 Nov 2014 15:38:48 -0800
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5463BF71.4080804@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
	<5463BF71.4080804@oracle.com>
Message-ID: <5463EF88.1050100@oracle.com>

On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
> Hi,
>
> Still not sure if this is a runtime bug: stripping safepoints from the
> non-counted loop seems to be a recipe for disaster.

This flag does not affect compiled code - so it is not compiler issue. 
It is only used in runtime/safepoint.cpp and it guards the code which 
protects a polling page.

There are many bugs which show the current problem. For example:

https://bugs.openjdk.java.net/browse/JDK-6873333

I would say that we have to remove it or at least make it experimental 
flag if we want to do experiments with it.

We definitely should not allow it to be used in production!
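
For reference, the pattern from the original report (quoted below) can be
sketched roughly as follows. This is not JMH code: the class
IsDoneLoopSketch and the method runWorkload are hypothetical names. On a
default JVM the loop ends once the control thread sets the flag; the hang
in this thread was reported only with -XX:-UseCompilerSafepoints.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the control-thread / workload-loop pattern.
final class IsDoneLoopSketch {
    // Written by the control thread, polled by the workload loop.
    static volatile boolean isDone;

    // Spins until the control thread flips isDone; returns the loop count.
    static long runWorkload(long sleepMillis) {
        isDone = false;
        Thread control = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            isDone = true; // set even if interrupted, so the loop always ends
        }, "control");
        control.setDaemon(true);
        control.start();

        long iterations = 0;
        while (!isDone) {   // non-counted loop: exit depends only on the flag
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        long n = runWorkload(100);
        System.out.println("loop exited after " + n + " iterations");
    }
}
```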

Regards,
Vladimir

>
> Anyhow, I think it deserves a simpler example. Submitted the bug and
> attached a simple test there:
>   https://bugs.openjdk.java.net/browse/JDK-8064749
>
> Thanks,
> -Aleksey.
>
> On 12.11.2014 19:52, Deneau, Tom wrote:
>> Hi all --
>>
>> Forwarding a thread which came about on the jmh-dev mail list, as recommended by Aleksey Shipilev (see below).  The JMH framework has a timing control thread which sleeps for a certain period, then sets a volatile isDone variable.  Meanwhile, the benchmark thread loops doing its benchmark code and also checking the isDone field.   A hang occurs if -XX:-UseCompilerSafepoints is used.
>>
>> The original issue can be reproduced by the following steps
>>
>>     hg clone http://hg.openjdk.java.net/code-tools/jmh
>>     cd jmh
>>     mvn clean install -DskipTests=true
>>     cd jmh-samples
>>     java  -server -XX:-UseCompilerSafepoints -jar target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>
>> -- Tom Deneau
>>
>>
>> -----Original Message-----
>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>> Sent: Wednesday, November 12, 2014 6:09 AM
>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>> Subject: Re: using -XX:-UseCompilerSafepoints
>>
>> Hi Tom,
>>
>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>> It looks like a thread that calls Thread.sleep (as the timing control
>>> thread does in the harness) will eventually go thru
>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>> destructor).  So if there is a looping benchmark thread compiled
>>> without Compiler Safepoints, the control thread will be blocked and
>>> will never set the isDone flag.
>>
>> So, you are saying that without the safepoint in the while(!isDone)
>> loop in workload, control thread and workload thread will never
>> rendezvous on safepoint? I believe this is a bug with
>> -XX:-UseCompilerSafepoints, because the comment in safepoint.cpp calls this
>> out specifically for VMThread vs. Mutator threads:
>>
>>   // In a pathological scenario such as that described in CR6415670
>>   // the VMthread may sleep just before the mutator(s) become safe.
>>   // In that case the mutators will be stalled waiting for the safepoint
>>   // to complete and the the VMthread will be sleeping, waiting for the
>>   // mutators to rendezvous. The VMthread will eventually wake up and
>>   // detect that all mutators are safe, at which point we'll again make
>>   // progress.
>>
>> If this is a case, you probably need to report this to runtime guys.
>>
>>> This is probably OK, just need to document that CompilerSafepoints
>>> cannot be turned off.
>>
>> I think it is safe to presume something will go hairy if you are using
>> any special VM flag, therefore I am not inclined to document this.
>>
>> Thanks,
>> -Aleksey.
>>
>
>

From aleksey.shipilev at oracle.com  Wed Nov 12 23:55:42 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 02:55:42 +0300
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5463EF88.1050100@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
	<5463BF71.4080804@oracle.com> <5463EF88.1050100@oracle.com>
Message-ID: <5463F37E.7020804@oracle.com>

On 13.11.2014 02:38, Vladimir Kozlov wrote:
> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>> Still not sure if this is a runtime bug: stripping safepoints from the
>> non-counted loop seems to be a recipe for disaster.
> 
> This flag does not affect compiled code - so it is not compiler issue.
> It is only used in runtime/safepoint.cpp and it guards the code which
> protects a polling page.
> 
> There are many bugs which show the current problem. For example:
> 
> https://bugs.openjdk.java.net/browse/JDK-6873333
> 
> I would say that we have to remove it or at least make it experimental
> flag if we want to do experiments with it.
> 
> We definitely should not allow it to be used in production!

Yes, that's what I meant. By "runtime" I meant the JRE as a whole, not a
particular component. I am not sure why Tom played with this flag to
begin with, are there legitimate use cases that force users to mess with
safepoint internals? I sure hope there are no such use cases.

I agree that demoting this flag from "product" to "experimental" sets
the expectations about its impact right.

Thanks,
-Aleksey.


From david.holmes at oracle.com  Thu Nov 13 00:03:01 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 10:03:01 +1000
Subject: RFR (S) 8062307: 'Reference handler' thread triggers assert w/
	TraceThreadEvents
In-Reply-To: <5463EE6E.7060008@oracle.com>
References: <5451BADF.8040203@oracle.com>
	<5451BD59.4060202@oracle.com>	<5463ECC9.10309@oracle.com>
	<5463EE6E.7060008@oracle.com>
Message-ID: <5463F535.1070201@oracle.com>

On 13/11/2014 9:34 AM, Coleen Phillimore wrote:
>
> Change looks good.   You mean the CCC is approved, not removed.

Yep approved - must have been delayed keyboard stutter :)

Thanks,
David

> Coleen
>
> On 11/12/14, 6:27 PM, David Holmes wrote:
>> The CCC for this trivial removal has been removed.
>>
>> Still need two reviewers please.
>>
>> David
>>
>> On 30/10/2014 2:23 PM, David Holmes wrote:
>>> On 30/10/2014 2:13 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8062307
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8062307/webrev/
>>>>
>>>> It turns out that the little known TraceThreadEvents logic has been
>>>> broken since at least very early in JDK 5. A develop-only option it was
>>>> intended to show when different Thread methods were called (the VM side
>>>> of certain java.lang.Thread methods). While that sounds potentially
>>>> useful for debugging it seems that in practice it is not - this has
>>>> been
>>>> broken for over 10 years with nobody noticing: it is unused. So rather
>>>> than fix unused code it is proposed to simply delete it instead.
>>>
>>> Correction this has been noticed in the past:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6757482 (2008-10-09 06:51)
>>>
>>> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2008-March/001586.html
>>>
>>>
>>>
>>> David
>>>
>>>> Thanks,
>>>> David
>

From david.holmes at oracle.com  Thu Nov 13 00:03:17 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 10:03:17 +1000
Subject: RFR (S) 8062307: 'Reference handler' thread triggers assert w/
	TraceThreadEvents
In-Reply-To: <5463EF25.40804@oracle.com>
References: <5451BADF.8040203@oracle.com> <5451BD59.4060202@oracle.com>
	<5463ECC9.10309@oracle.com> <5463EF25.40804@oracle.com>
Message-ID: <5463F545.1090207@oracle.com>

Thanks Jiangli!

David

On 13/11/2014 9:37 AM, Jiangli Zhou wrote:
> Hi David,
>
> The change looks good.
>
> Thanks,
> Jiangli
>
> On 11/12/2014 03:27 PM, David Holmes wrote:
>> The CCC for this trivial removal has been removed.
>>
>> Still need two reviewers please.
>>
>> David
>>
>> On 30/10/2014 2:23 PM, David Holmes wrote:
>>> On 30/10/2014 2:13 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8062307
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8062307/webrev/
>>>>
>>>> It turns out that the little known TraceThreadEvents logic has been
>>>> broken since at least very early in JDK 5. A develop-only option it was
>>>> intended to show when different Thread methods were called (the VM side
>>>> of certain java.lang.Thread methods). While that sounds potentially
>>>> useful for debugging it seems that in practice it is not - this has
>>>> been
>>>> broken for over 10 years with nobody noticing: it is unused. So rather
>>>> than fix unused code it is proposed to simply delete it instead.
>>>
>>> Correction this has been noticed in the past:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6757482 (2008-10-09 06:51)
>>>
>>> http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2008-March/001586.html
>>>
>>>
>>>
>>> David
>>>
>>>> Thanks,
>>>> David
>

From david.holmes at oracle.com  Thu Nov 13 02:43:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 12:43:42 +1000
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <5463B896.10801@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
Message-ID: <54641ADE.8030504@oracle.com>

Hi Chris,

Sorry for the delay.

On 13/11/2014 5:44 AM, Chris Plummer wrote:
> Hi,
>
> I'm still looking for reviewers.

As the change is to the launcher it needs to be reviewed by the launcher 
owner - which I think is serviceability (though also cc'd Kumar :) ).

The launcher change, and your rationale, seem okay to me. I'd probably put
the test into jdk/test/tools/launcher/ though.
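
The arithmetic in the proposal quoted below can be sketched as follows.
This is an illustration only: the real change is C code in java.c, and the
class StackSizeSketch and its method names are hypothetical. The 32k floor,
the 64k Windows rounding, and the 64k/68k minimums are the figures from
Chris's mail.

```java
// Hypothetical sketch of the stack-size arithmetic; not the java.c change.
final class StackSizeSketch {
    static final long K = 1024;
    static final long LAUNCHER_FLOOR = 32 * K;      // floor proposed for the launcher
    static final long WINDOWS_GRANULARITY = 64 * K; // Windows rounds up to a multiple of this

    // Raise a too-small -Xss request to the launcher floor.
    static long clampForLauncher(long requested) {
        return Math.max(requested, LAUNCHER_FLOOR);
    }

    // Round size up to the next multiple of granularity.
    static long roundUp(long size, long granularity) {
        return ((size + granularity - 1) / granularity) * granularity;
    }

    // The later VM check: the actual stack must cover the platform minimum.
    static boolean passesMinCheck(long actual, long platformMin) {
        return actual >= platformMin;
    }

    public static void main(String[] args) {
        long requested = 16 * K;                        // -Xss16k
        long clamped = clampForLauncher(requested);     // raised to 32k
        long onWindows = roundUp(clamped, WINDOWS_GRANULARITY);
        System.out.println("clamped=" + clamped / K + "k windows=" + onWindows / K + "k");
        System.out.println("C1 (min 64k) ok: " + passesMinCheck(onWindows, 64 * K));
        System.out.println("C2 (min 68k) ok: " + passesMinCheck(onWindows, 68 * K));
    }
}
```

The floor only has to carry the VM as far as its own minimum-size check,
which is why a value below every platform's real minimum is acceptable.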

Thanks,
David

> thanks,
>
> Chris
>
> On 11/7/14 7:53 PM, Chris Plummer wrote:
>> This is an initial review for 6762191. I'm guessing there will be
>> recommendations to fix in a different way, but thought this would be a
>> good time to start the discussion.
>>
>> https://bugs.openjdk.java.net/browse/JDK-6762191
>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>
>> The bug is that if the -Xss size is set to something very small (like
>> 16k), on linux there will be a crash due to overwriting the end of the
>> stack. This happens before hotspot can compute its stack needs and
>> verify that the stack is big enough.
>>
>> It didn't seem viable to move the hotspot stack size check earlier. It
>> depends on too much other work done before that point, and the changes
>> would have been disruptive. The stack size check is currently done in
>> os::init_2().
>>
>> What is needed is a check before the thread is created. That way we
>> can create a thread with a big enough stack to handle all needs up to
>> the point of the check in os::init_2(). This initial check does not
>> need to be the final check. It just needs to confirm that we have
>> enough stack to get us to the check in os::init_2().
>>
>> I decided to check in java.c if the -Xss size is too small, and set it
>> to a larger size if it is. I hard coded this size to 32k (I'll explain
>> why 32k later). I suspect this is the part that will result in some
>> debate. If you have better suggestions let me know. If it does stay
>> here, then probably the 32k needs to be a #define, and maybe even an
>> OS porting interface, but I'm not sure where to put it.
>>
>> The reason I chose 32k is because this is big enough for all platforms
>> to get to the stack size check in os::init_2(). It is also smaller
>> than the actual minimum stack size allowed on any platform. 32-bit
>> windows has the smallest requirement at 64k. I added some printfs to
>> print the minimum stack requirement, and then ran a simple JTReg test
>> with every JPRT supported platform to get the results.
>>
>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>> error message produced by the JVM, such as in the following:
>>
>> $ java -Xss32k -version
>> The stack size specified is too small, Specify at least 100k
>> Error: Could not create the Java Virtual Machine.
>> Error: A fatal exception has occurred. Program will exit.
>>
>> I ran this test through JPRT on all platforms, and they all pass.
>>
>> One thing to point out is that Windows behaves a bit differently than
>> the other platforms. It always rounds the stack size up to a multiple
>> of 64k, so even if you specify -Xss16k, you get a 64k stack. On
>> 32-bit Windows with C1, 64k is also the minimum requirement, so there
>> is no error produced in this case. However, on 32-bit Windows with C2,
>> 68k is the minimum, so an error is produced since the stack will only
>> be 64k. There is no bug here. It's just a bit confusing.
>>
>> thanks,
>>
>> Chris
>

From david.holmes at oracle.com  Thu Nov 13 02:57:43 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 12:57:43 +1000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5463EF88.1050100@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>
	<5463EF88.1050100@oracle.com>
Message-ID: <54641E27.4090303@oracle.com>

On 13/11/2014 9:38 AM, Vladimir Kozlov wrote:
> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>> Hi,
>>
>> Still not sure if this is a runtime bug: stripping safepoints from the
>> non-counted loop seems to be a recipe for disaster.
>
> This flag does not affect compiled code - so it is not compiler issue.

Well, it disables the mechanism that the compiler inserts for checking 
if a safepoint has been requested. As I've added to the bug report, 
disabling compiler safepoints should go hand-in-hand with disabling the 
compilers (ie run with -Xint) - otherwise you have to know that the 
compiled code will eventually hit a non-compiler safepoint check.

> It is only used in runtime/safepoint.cpp and it guards the code which
> protects a polling page.
>
> There are many bugs which show the current problem. For example:
>
> https://bugs.openjdk.java.net/browse/JDK-6873333
>
> I would say that we have to remove it or at least make it experimental
> flag if we want to do experiments with it.
>
> We definitely should not allow it to be used in production!

If we assume there is a reason it was made a product flag then the 
correct fix in my opinion would be to fall back to interpreter-only mode 
when this flag is turned off.

If we don't make that assumption then we could still tie it to 
interpreter-only mode, but we definitely should not make it configurable 
in product mode without some effort.

Or if we can't ascertain a valid reason for ever wanting to do this, we 
could simply delete the flag altogether. :)

Cheers,
David

> Regards,
> Vladimir
>
>>
>> Anyhow, I think it deserves a simpler example. Submitted the bug and
>> attached a simple test there:
>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>
>> Thanks,
>> -Aleksey.
>>
>> On 12.11.2014 19:52, Deneau, Tom wrote:
>>> Hi all --
>>>
>>> Forwarding a thread which came about on the jmh-dev mail list, as
>>> recommended by Aleksey Shipilev (see below).  The JMH framework has a
>>> timing control thread which sleeps for a certain period, then sets a
>>> volatile isDone variable.  Meanwhile, the benchmark thread loops
>>> doing its benchmark code and also checking the isDone field.   A hang
>>> occurs if -XX:-UseCompilerSafepoints is used.
>>>
>>> The original issue can be reproduced by the following steps
>>>
>>>     hg clone http://hg.openjdk.java.net/code-tools/jmh
>>>     cd jmh
>>>     mvn clean install -DskipTests=true
>>>     cd jmh-samples
>>>     java  -server -XX:-UseCompilerSafepoints -jar
>>> target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>>
>>> -- Tom Deneau
>>>
>>>
>>> -----Original Message-----
>>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>>> Sent: Wednesday, November 12, 2014 6:09 AM
>>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>>> Subject: Re: using -XX:-UseCompilerSafepoints
>>>
>>> Hi Tom,
>>>
>>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>>> It looks like a thread that calls Thread.sleep (as the timing control
>>>> thread does in the harness) will eventually go thru
>>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>>> destructor).  So if there is a looping benchmark thread compiled
>>>> without Compiler Safepoints, the control thread will be blocked and
>>>> will never set the isDone flag.
>>>
>>> So, you are saying that without the safepoint in the while(!isDone)
>>> loop in workload, control thread and workload thread will never
>>> rendezvous on safepoint? I believe this is a bug with
>>> -XX:-UseCompilerSafepoints, because the comment in safepoint.cpp calls this
>>> out specifically for VMThread vs. Mutator threads:
>>>
>>>   // In a pathological scenario such as that described in CR6415670
>>>   // the VMthread may sleep just before the mutator(s) become safe.
>>>   // In that case the mutators will be stalled waiting for the safepoint
>>>   // to complete and the the VMthread will be sleeping, waiting for the
>>>   // mutators to rendezvous. The VMthread will eventually wake up and
>>>   // detect that all mutators are safe, at which point we'll again make
>>>   // progress.
>>>
>>> If this is a case, you probably need to report this to runtime guys.
>>>
>>>> This is probably OK, just need to document that CompilerSafepoints
>>>> cannot be turned off.
>>>
>>> I think it is safe to presume something will go hairy if you are using
>>> any special VM flag, therefore I am not inclined to document this.
>>>
>>> Thanks,
>>> -Aleksey.
>>>
>>
>>

From david.holmes at oracle.com  Thu Nov 13 03:10:34 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 13:10:34 +1000
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <546151A9.1080100@oracle.com>
References: <546151A9.1080100@oracle.com>
Message-ID: <5464212A.6070504@oracle.com>

Hi Dan,

If you still need a Reviewer, looks okay to me.
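
For anyone skimming, the version gate described in the quoted mail below
amounts to a component-wise comparison against 2.21.1. The real check lives
in the Makefiles; the following is only a sketch of the comparison, and the
class ObjcopyVersionSketch and method atLeast are hypothetical names.

```java
// Hypothetical sketch of a dotted-version comparison; not the Makefile logic.
final class ObjcopyVersionSketch {
    // Split "2.21.1" into {2, 21, 1}.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] nums = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // True if actual >= required, comparing component by component.
    // Missing trailing components count as 0, so "2.21" means 2.21.0.
    static boolean atLeast(String actual, String required) {
        int[] a = parse(actual);
        int[] r = parse(required);
        int n = Math.max(a.length, r.length);
        for (int i = 0; i < n; i++) {
            int ai = i < a.length ? a[i] : 0;
            int ri = i < r.length ? r[i] : 0;
            if (ai != ri) {
                return ai > ri;
            }
        }
        return true; // all components equal
    }

    public static void main(String[] args) {
        String required = "2.21.1";
        for (String v : new String[] {"2.15", "2.21.1", "2.22"}) {
            System.out.println(v + " enables FDS: " + atLeast(v, required));
        }
    }
}
```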

Thanks,
David

On 11/11/2014 10:00 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
> Yes, it is a small fix, but it is in Makefiles so feel free to
> run screaming from the room... :-)  On the plus side the fix does
> delete two work around source files (Coleen would say that's a
> Good Thing (TM)!)
>
> The fix is to detect the version of GNU objcopy that is being
> used on the machine and only enable Full Debug Symbols when that
> version is 2.21.1 or newer. If you don't have the right version,
> then the build drops back to pre-FDS build configs with a message
> like this:
>
> WARNING: /usr/sfw/bin/gobjcopy --version info:
> WARNING: GNU objcopy 2.15
> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
> .debuginfo files.
> WARNING: ignoring above objcopy command.
> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
> version.
> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
> version.
> WARNING: Solaris 11 Update 1 contains the correct version.
> INFO: no objcopy cmd found so cannot create .debuginfo files.
> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>
> This work is being tracked by the following bug IDs:
>
>      JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>      https://bugs.openjdk.java.net/browse/JDK-8033602
>
>      JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
> Solaris X86
>      https://bugs.openjdk.java.net/browse/JDK-8034005
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>
> Testing:
>
> - JPRT test jobs to verify that the current JPRT Solaris hosts
>    are happy
> - local builds on my Solaris 10 X86 machine to verify that the
>    wrong version of GNU objcopy is caught
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan

From vladimir.kozlov at oracle.com  Thu Nov 13 03:39:07 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 12 Nov 2014 19:39:07 -0800
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <54641E27.4090303@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>
	<5463EF88.1050100@oracle.com> <54641E27.4090303@oracle.com>
Message-ID: <546427DB.3070806@oracle.com>

I agree that the workaround is -Xint. But if we disable compilation with 
-UseCompilerSafepoints, the flag becomes useless. You can get the same 
result with just -Xint.

The history shows that it was added at the very beginning of Hotspot 
development, on day one. I can only speculate that it was used to 
find performance effects of safepoints in compiled code. It could be 
the case that we removed safepoints from Counted loops as a result of that 
investigation. I think it was never intended to be used in production.

Although we can fix compilers to generate a runtime call which does 
safepoint when -UseCompilerSafepoints is specified, it will be useless 
work, I think.

thanks,
Vladimir

On 11/12/14 6:57 PM, David Holmes wrote:
> On 13/11/2014 9:38 AM, Vladimir Kozlov wrote:
>> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> Still not sure if this is a runtime bug: stripping safepoints from the
>>> non-counted loop seems to be a recipe for disaster.
>>
>> This flag does not affect compiled code - so it is not compiler issue.
>
> Well, it disables the mechanism that the compiler inserts for checking
> if a safepoint has been requested. As I've added to the bug report,
> disabling compiler safepoints should go hand-in-hand with disabling the
> compilers (ie run with -Xint) - otherwise you have to know that the
> compiled code will eventually hit a non-compiler safepoint check.
>
>> It is only used in runtime/safepoint.cpp and it guards the code which
>> protects a polling page.
>>
>> There are many bugs which show the current problem. For example:
>>
>> https://bugs.openjdk.java.net/browse/JDK-6873333
>>
>> I would say that we have to remove it or at least make it experimental
>> flag if we want to do experiments with it.
>>
>> We definitely should not allow it to be used in production!
>
> If we assume there is a reason it was made a product flag then the
> correct fix in my opinion would be to fall back to interpreter-only mode
> when this flag is turned off.
>
> If we don't make that assumption then we could still tie it to
> interpreter-only mode, but we definitely should not make it configurable
> in product mode without some effort.
>
> Or if we can't ascertain a valid reason for ever wanting to do this, we
> could simply delete the flag altogether. :)
>
> Cheers,
> David
>
>> Regards,
>> Vladimir
>>
>>>
>>> Anyhow, I think it deserves a simpler example. Submitted the bug and
>>> attached a simple test there:
>>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>>
>>> Thanks,
>>> -Aleksey.
>>>
>>> On 12.11.2014 19:52, Deneau, Tom wrote:
>>>> Hi all --
>>>>
>>>> Forwarding a thread which came about on the jmh-dev mail list, as
>>>> recommended by Aleksey Shipilev (see below).  The JMH framework has a
>>>> timing control thread which sleeps for a certain period, then sets a
>>>> volatile isDone variable.  Meanwhile, the benchmark thread loops
>>>> doing its benchmark code and also checking the isDone field.   A hang
>>>> occurs if -XX:-UseCompilerSafepoints is used.
>>>>
>>>> The original issue can be reproduced by the following steps
>>>>
>>>>     hg clone http://hg.openjdk.java.net/code-tools/jmh
>>>>     cd jmh
>>>>     mvn clean install -DskipTests=true
>>>>     cd jmh-samples
>>>>     java  -server -XX:-UseCompilerSafepoints -jar
>>>> target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>>>
>>>> -- Tom Deneau
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>>>> Sent: Wednesday, November 12, 2014 6:09 AM
>>>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>>>> Subject: Re: using -XX:-UseCompilerSafepoints
>>>>
>>>> Hi Tom,
>>>>
>>>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>>>> It looks like a thread that calls Thread.sleep (as the timing control
>>>>> thread does in the harness) will eventually go thru
>>>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>>>> destructor).  So if there is a looping benchmark thread compiled
>>>>> without Compiler Safepoints, the control thread will be blocked and
>>>>> will never set the isDone flag.
>>>>
>>>> So, you are saying that without the safepoint in the while(!isDone)
>>>> loop in workload, control thread and workload thread will never
>>>> rendezvous on safepoint? I believe this is a bug with
>>>> -XX:-UseCompilerSafepoints, because the comment in safepoint.cpp calls
>>>> this
>>>> out specifically for VMThread vs. Mutator threads:
>>>>
>>>>   // In a pathological scenario such as that described in CR6415670
>>>>   // the VMthread may sleep just before the mutator(s) become safe.
>>>>   // In that case the mutators will be stalled waiting for the
>>>> safepoint
>>>>   // to complete and the the VMthread will be sleeping, waiting for the
>>>>   // mutators to rendezvous. The VMthread will eventually wake up and
>>>>   // detect that all mutators are safe, at which point we'll again make
>>>>   // progress.
>>>>
>>>> If this is a case, you probably need to report this to runtime guys.
>>>>
>>>>> This is probably OK, just need to document that CompilerSafepoints
>>>>> cannot be turned off.
>>>>
>>>> I think it is safe to presume something will go hairy if you are using
>>>> any special VM flag, therefore I am not inclined to document this.
>>>>
>>>> Thanks,
>>>> -Aleksey.
>>>>
>>>
>>>

From daniel.daugherty at oracle.com  Thu Nov 13 03:54:31 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 12 Nov 2014 20:54:31 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5464212A.6070504@oracle.com>
References: <546151A9.1080100@oracle.com> <5464212A.6070504@oracle.com>
Message-ID: <54642B77.3030607@oracle.com>

Thanks! I was still in need of a (R)eviewer and a Runtime
team member so thanks for covering both...

Dan


On 11/12/14 8:10 PM, David Holmes wrote:
> Hi Dan,
>
> If you still need a Reviewer, looks okay to me.
>
> Thanks,
> David
>
> On 11/11/2014 10:00 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>> Yes, it is a small fix, but it is in Makefiles so feel free to
>> run screaming from the room... :-)  On the plus side the fix does
>> delete two work around source files (Coleen would say that's a
>> Good Thing (TM)!)
>>
>> The fix is to detect the version of GNU objcopy that is being
>> used on the machine and only enable Full Debug Symbols when that
>> version is 2.21.1 or newer. If you don't have the right version,
>> then the build drops back to pre-FDS build configs with a message
>> like this:
>>
>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>> WARNING: GNU objcopy 2.15
>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid
>> .debuginfo files.
>> WARNING: ignoring above objcopy command.
>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC
>> version.
>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86
>> version.
>> WARNING: Solaris 11 Update 1 contains the correct version.
>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>
>> This work is being tracked by the following bug IDs:
>>
>>      JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>      https://bugs.openjdk.java.net/browse/JDK-8033602
>>
>>      JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on
>> Solaris X86
>>      https://bugs.openjdk.java.net/browse/JDK-8034005
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>
>> Testing:
>>
>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>    are happy
>> - local builds on my Solaris 10 X86 machine to verify that the
>>    wrong version of GNU objcopy is caught
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
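
The version gate described above (Full Debug Symbols only when GNU
objcopy is 2.21.1 or newer) comes down to a component-wise numeric
comparison rather than a string comparison. A minimal Java sketch of
that comparison follows. It is not the Makefile logic from the webrev:
the class and method names are hypothetical, and it assumes the version
is the last whitespace-separated token of the first --version line, as
in the "GNU objcopy 2.15" warning shown above.

```java
// Illustrative sketch only; names are hypothetical and not from the webrev.
class ObjcopyVersionCheck {

    // Parses the last whitespace-separated token of a line such as
    // "GNU objcopy 2.15" into numeric components, e.g. {2, 15}.
    // Assumption: the token is purely dotted decimal; a suffix such as
    // "2.21.1-5" would make Integer.parseInt throw NumberFormatException.
    static int[] parse(String versionLine) {
        String[] words = versionLine.trim().split("\\s+");
        String[] parts = words[words.length - 1].split("\\.");
        int[] nums = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // Returns true when the reported version is >= the minimum.
    // Components are compared as integers, so 2.9 sorts below 2.21.1;
    // missing trailing components are treated as 0.
    static boolean atLeast(String versionLine, String minimum) {
        int[] have = parse(versionLine);
        int[] need = parse(minimum);
        int n = Math.max(have.length, need.length);
        for (int i = 0; i < n; i++) {
            int h = i < have.length ? have[i] : 0;
            int m = i < need.length ? need[i] : 0;
            if (h != m) {
                return h > m;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String reported = "GNU objcopy 2.15";
        System.out.println("FDS enabled: " + atLeast(reported, "2.21.1"));
    }
}
```

Comparing the components as integers matters here because a plain
string comparison would rank "2.9" above "2.21.1".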


From david.holmes at oracle.com  Thu Nov 13 05:40:26 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 15:40:26 +1000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <546427DB.3070806@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>
	<5463EF88.1050100@oracle.com> <54641E27.4090303@oracle.com>
	<546427DB.3070806@oracle.com>
Message-ID: <5464444A.7030601@oracle.com>

Hi Vladimir,

On 13/11/2014 1:39 PM, Vladimir Kozlov wrote:
> I agree that workaround is -Xint. But if we disable compilation with
> -UseCompilerSafepoints, the flag becomes useless. You can get the same
> result with just -Xint.
>
> The history shows that it was added at the very beginning of Hotspot
> development, at the day one. I can only speculate that it was used to
> find performance effects of safepoints in compiled code. It could be
> the case that we removed safepoints from Counted loops as result of that
> investigation. I think it was never intended to be used in production.
>
> Although we can fix compilers to generate a runtime call which does
> safepoint when -UseCompilerSafepoints is specified, it will be useless
> work, I think.

There is some history in JDK-4974572 (which is non-public I'm afraid). 
To all intents and purposes the flag at that point was used to enable 
testing of workarounds if problems were suspected in the "new" 
safepointing code. I think it has outlived its usefulness by a few major 
releases so I'm happy to see it go.

Cheers,
David

> thanks,
> Vladimir
>
> On 11/12/14 6:57 PM, David Holmes wrote:
>> On 13/11/2014 9:38 AM, Vladimir Kozlov wrote:
>>> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>>>> Hi,
>>>>
>>>> Still not sure if this is a runtime bug: stripping safepoints from the
>>>> non-counted loop seems to be a recipe for disaster.
>>>
>>> This flag does not affect compiled code - so it is not compiler issue.
>>
>> Well, it disables the mechanism that the compiler inserts for checking
>> if a safepoint has been requested. As I've added to the bug report,
>> disabling compiler safepoints should go hand-in-hand with disabling the
>> compilers (ie run with -Xint) - otherwise you have to know that the
>> compiled code will eventually hit a non-compiler safepoint check.
>>
>>> It is only used in runtime/safepoint.cpp and it guards the code which
>>> protects a polling page.
>>>
>>> There are many bugs which show the current problem. For example:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6873333
>>>
>>> I would say that we have to remove it or at least make it experimental
>>> flag if we want to do experiments with it.
>>>
>>> We definitely should not allow to use it in production!
>>
>> If we assume there is a reason it was made a product flag then the
>> correct fix in my opinion would be to fall back to interpreter-only mode
>> when this flag is turned off.
>>
>> If we don't make that assumption then we could still tie it to
>> interpreter-only mode, but we definitely should not make it configurable
>> in product mode without some effort.
>>
>> Or if we can't ascertain a valid reason for ever wanting to do this, we
>> could simply delete the flag altogether. :)
>>
>> Cheers,
>> David
>>
>>> Regards,
>>> Vladimir
>>>
>>>>
>>>> Anyhow, I think it deserves a simpler example. Submitted the bug and
>>>> attached a simple test there:
>>>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>>>
>>>> Thanks,
>>>> -Aleksey.
>>>>
>>>> On 12.11.2014 19:52, Deneau, Tom wrote:
>>>>> Hi all --
>>>>>
>>>>> Forwarding a thread which came about on the jmh-dev mail list, as
>>>>> recommended by Aleksey Shipilev (see below).  The JMH framework has a
>>>>> timing control thread which sleeps for a certain period, then sets a
>>>>> volatile isDone variable.  Meanwhile, the benchmark thread loops
>>>>> doing its benchmark code and also checking the isDone field.   A hang
>>>>> occurs if -XX:-UseCompilerSafepoints is used.
>>>>>
>>>>> The original issue can be reproduced by the following steps
>>>>>
>>>>>     hg clone http://hg.openjdk.java.net/code-tools/jmh
>>>>>     cd jmh
>>>>>     mvn clean install -DskipTests=true
>>>>>     cd jmh-samples
>>>>>     java  -server -XX:-UseCompilerSafepoints -jar
>>>>> target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>>>>
>>>>> -- Tom Deneau
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>>>>> Sent: Wednesday, November 12, 2014 6:09 AM
>>>>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>>>>> Subject: Re: using -XX:-UseCompilerSafepoints
>>>>>
>>>>> Hi Tom,
>>>>>
>>>>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>>>>> It looks like a thread that calls Thread.sleep (as the timing control
>>>>>> thread does in the harness) will eventually go thru
>>>>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>>>>> destructor).  So if there is a looping benchmark thread compiled
>>>>>> without Compiler Safepoints, the control thread will be blocked and
>>>>>> will never set the isDone flag.
>>>>>
>>>>> So, you are saying that without the safepoint in the while(!isDone)
>>>>> loop in workload, control thread and workload thread will never
>>>>> rendezvous on safepoint? I believe this is a bug with
>>>>> -XX:-CompilerSafepoints, because the comment in safepoint.cpp calls
>>>>> this
>>>>> out specifically for VMThread vs. Mutator threads:
>>>>>
>>>>>   // In a pathological scenario such as that described in CR6415670
>>>>>   // the VMthread may sleep just before the mutator(s) become safe.
>>>>>   // In that case the mutators will be stalled waiting for the
>>>>> safepoint
>>>>>   // to complete and the the VMthread will be sleeping, waiting for
>>>>> the
>>>>>   // mutators to rendezvous. The VMthread will eventually wake up and
>>>>>   // detect that all mutators are safe, at which point we'll again
>>>>> make
>>>>>   // progress.
>>>>>
>>>>> If this is the case, you probably need to report this to the runtime guys.
>>>>>
>>>>>> This is probably OK, just need to document that CompilerSafepoints
>>>>>> cannot be turned off.
>>>>>
>>>>> I think it is safe to presume something will go hairy if you are using
>>>>> any special VM flag, therefore I am not inclined to document this.
>>>>>
>>>>> Thanks,
>>>>> -Aleksey.
>>>>>
>>>>
>>>>
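
The pattern described in the quoted report (a timing control thread
that sleeps and then sets a volatile isDone field, and a benchmark
thread that spins until it sees the flag) can be sketched in a few
lines of Java. This is a simplified stand-in, not the JMH harness code
and not the test attached to JDK-8064749; the class and method names
are hypothetical. With default flags it terminates. According to the
reports in this thread, running the same kind of loop with
-XX:-UseCompilerSafepoints could hang once the loop was compiled.

```java
// Illustrative sketch only; not the JMH harness or the JDK-8064749 test.
class IsDoneLoopSketch {

    // Written by the control thread, polled by the benchmark thread.
    static volatile boolean isDone;

    // Starts both threads, waits for them, and returns the number of
    // loop iterations the benchmark thread completed.
    static long runOnce(final long sleepMillis) {
        isDone = false;
        final long[] iterations = new long[1];

        Thread benchmark = new Thread(() -> {
            long n = 0;
            while (!isDone) {   // non-counted loop that polls the flag
                n++;
            }
            iterations[0] = n;
        }, "benchmark");

        Thread control = new Thread(() -> {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            isDone = true;      // ask the benchmark thread to stop
        }, "control");

        benchmark.start();
        control.start();
        try {
            control.join();
            benchmark.join();   // join makes iterations[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return iterations[0];
    }

    public static void main(String[] args) {
        System.out.println("iterations: " + runOnce(100));
    }
}
```

The control thread passes through Thread.sleep, which is the path the
quoted analysis identifies as blocking on the safepoint, so the flag is
never set if the spinning thread cannot be brought to a safepoint.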

From kirk at kodewerk.com  Thu Nov 13 06:41:32 2014
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Thu, 13 Nov 2014 07:41:32 +0100
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5463F37E.7020804@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
	<5463BF71.4080804@oracle.com> <5463EF88.1050100@oracle.com>
	<5463F37E.7020804@oracle.com>
Message-ID: <AD0DE7E1-BE4C-45A3-B4B7-4E71AED9FD02@kodewerk.com>


> 
> I agree that demoting this flag from "product" to "experimental" sets
> the expectations about its impact right.

+1

-- Kirk

From yumin.qi at oracle.com  Thu Nov 13 06:52:32 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 12 Nov 2014 22:52:32 -0800
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <3CF28613-0C0F-44E2-869A-FF5B01D7E575@oracle.com>
References: <543C591E.8010602@oracle.com>	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>	<544E8844.1070907@oracle.com>	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com> <5456EADF.4050203@oracle.com>
	<3CF28613-0C0F-44E2-869A-FF5B01D7E575@oracle.com>
Message-ID: <54645530.6010107@oracle.com>

Thanks, Karen

   Now I have a standalone test which makes the failure easy to
reproduce. I am trying to attach a debugger to trace the problem.
Meanwhile, I will try your suggested fix too.
   When AbortVMOnException is set, it is too late for the debugger to
attach since execution goes straight to abort. Currently it is not easy
to get a failure in a single run (we loop in the test script).

   Thanks
   Yumin
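
For reference, the test change suggested in the quoted thread below
(do not intercept boot loader loads, so that
sun.misc.URLClassPath$JarLoader$# can load) could look roughly like
this as a standalone transformer. This is a sketch, not the actual
ParallelTransformerLoaderAgent code: the class name is hypothetical,
loadClasses() is replaced by a counter so the sketch runs on its own,
and it assumes tName is the name of the current thread.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; not the real test agent.
class BootLoaderScreeningTransformer implements ClassFileTransformer {

    // Stand-in for the nested class loading done by loadClasses(3)
    // in the real test: here we only count how often it would run.
    final AtomicInteger nestedLoads = new AtomicInteger();

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // Assumption: tName is the current thread's name.
        String tName = Thread.currentThread().getName();

        // loader == null means the boot loader is defining the class.
        // Skipping those loads keeps the transformer from re-entering
        // class loading while a JDK class is still being loaded.
        if (loader != null) {
            if (tName.equals("TestThread")) {
                loadClasses(3);
            }
        }
        return null; // null tells the JVM the class bytes are unchanged
    }

    // Hypothetical stand-in for the test's loadClasses(int).
    void loadClasses(int index) {
        nestedLoads.incrementAndGet();
    }
}
```

With this screening, a call for sun/misc/URLClassPath$JarLoader$2
(boot loader, so loader is null) never triggers the nested load, while
classes defined by the test's URLClassLoader on TestThread still do.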

On 11/12/2014 8:27 AM, Karen Kinnear wrote:
> I think there are three things we need to figure out.
>
> 1. I reproduced a problem in TestThread2. Below was the information from that and my analysis
>     - all - comments on my analysis are very welcome
>     - Yumin - please try the suggested test change below to see if it helps.
>
>     - that is the only example I have seen the full details for.
>
> 2. Does the circularity error actually occur in the main thread and if so why?
>     - need to catch in a debugger/hs_err file a situation in which this occurs in the main thread please.
>     We need the full stack trace for this - native and java please
>     - run this without the test change I suggested please
>     - try to catch ClassCircularityError in the main thread
>
> 3. figure out why we see this problem more frequently
>     - I am not convinced this problem didn't already exist - the test logic has some very odd comments and workarounds which seem to imply there
>     were intermittent problems from the beginning
>     - that said - worth figuring out if for instance, the sun.misc.URLClassPath logic was rewritten (and when) to add $JarLoader$2
>     - and looking at the history of test failure
>
> thanks,
> Karen
>
> On Nov 2, 2014, at 9:39 PM, David Holmes wrote:
>
>> On 1/11/2014 9:55 AM, Yumin Qi wrote:
>>> Karen,
>>>
>>>    Thanks for your detailed message for debugging. Yes, from my debugging,
>>> the exception did happen in TestThread rather than the main thread. I have no
>>> idea why in the end the exception was reported in the main thread.
>> Until that question is answered I will remain uneasy about simply tweaking the test until it no longer fails. I would also like to know when it started failing - Karen alludes to the possible introduction of a new inner class at some point.
>>
>> Thanks,
>> David
>>
>>>     You mention
>>>
>>> So that change to the test would be:
>>>     in TestTransformer:
>>>        if (loader != null) {
>>>            if (tName.equals("TestThread")) {
>>>               loadClasses(3);
>>>            }
>>>        }
>>>         return null;
>>>      }
>>>
>>>
>>> The loader is the one defined in the test case, right? The system class
>>> loader is never null.
>>> I will try this change, let's see if it can work it out.
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>>>> Yumin,
>>>>
>>>>  From your earlier exception stack trace (many thanks) you reported:
>>>>
>>>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I
>>>> don't know why this is in thread "main")
>>>> sun/misc/URLClassPath$JarLoader$2
>>>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>>>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>>>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>>>> at java.lang.Class.forName0(Native Method)
>>>> at java.lang.Class.forName(Class.java:340)
>>>> at
>>>> ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>>>>
>>>> at
>>>> ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>>>
>>>>
>>>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError
>>>> -XX:+ShowMessageBoxOnError to get
>>>> a log file and stack trace. See my instructions below on how to do that.
>>>>
>>>> I did this, attached a debugger, which didn't help enough since I
>>>> needed to see the java stack frames,
>>>>   and got an hs_err_log also, so the stack traces came from the error
>>>> log.
>>>>
>>>> The stack trace was on Thread 2, which in the hs_err_log was
>>>> TestThread (which makes sense for what the test logic says).
>>>> See later in email for stack traces from Thread 2.
>>>>
>>>> Summary of stack trace:
>>>>
>>>> TestThread:
>>>>    loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>>>      vm calls out to URLClassLoader.loadClass(String) which is
>>>> inherited from java.lang.ClassLoader.loadClass(String)
>>>>      ... calls java.net.URLClassLoader.findClass(...) which calls
>>>>        DoPrivileged  java.net.URLClassLoader$1.run which calls
>>>>           sun.misc.URLClassPath.getResource(name, false)  which calls
>>>>               sun.misc.URLClassPath$JarLoader.getResource which calls
>>>>                   sun.misc.URLClassPath$JarLoader.checkResource which
>>>> tries to call sun.misc.URLClassPath$JarLoader$2
>>>>     - and then the transformer jumps in with loadClasses(#) (where we
>>>> know # is 3) and walks the same logic, which tries to load
>>>> sun.misc.URLClassPath$JarLoader$2 again
>>>>
>>>> Note that in the placeholder table information that Yumin printed, the
>>>> circularity error is on sun.misc.URLClassPath$JarLoader$2 with the
>>>> null == boot loader, which
>>>> makes sense -- that is the appropriate defining loader, and therefore
>>>> the one the CFLH would intercept during the defineClass phase.
>>>>
>>>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the
>>>> method checkResource
>>>> ... return new Resource() { ... }
>>>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1,
>>>> $2 and $3 at build time or when that was added.
>>>> I would guess that is when the bug started happening.
>>>>
>>>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads
>>>> before any TestClass1 loads.
>>>>
>>>> My belief is that the point of the test is to test parallel class
>>>> loading for URL class loaders.
>>>> I don't think the point is to test the bootstrap class loader, nor to
>>>> test bootstrapping - i.e. running the agent before
>>>> we have loaded sufficient classes to allow loading URLClassLoader
>>>> classes.
>>>>
>>>> What I suggested to Yumin that he try would be to change the test to
>>>> NOT intercept boot loader loads, so that
>>>> sun.misc.URLClassPath$JarLoader$#
>>>> can load which will in turn allow classes loaded by a URLClassLoader
>>>> subclass to load.
>>>>
>>>> So that change to the test would be:
>>>>     in TestTransformer:
>>>>        if (loader != null) {
>>>>            if (tName.equals("TestThread")) {
>>>>               loadClasses(3);
>>>>            }
>>>>        }
>>>>         return null;
>>>>      }
>>>> // I also suspect with that change, we can remove the sleep loop
>>>> Note: there was a printed message which said that the Thread "Signal
>>>> Dispatcher" has called transform(), which I
>>>> ignored, however it is good that we don't call loadClass on that
>>>> thread  - which is part of what the sleep loop does -
>>>> but that would be handled by the boot loader screening above
>>>>
>>>> Alternatively we can preload the URLClassPath classes, but I don't
>>>> think we want to do that, or
>>>> we can have the agent explicitly screen on a variety of jdk
>>>> bootstrapping classes. But I think the cleaner
>>>> solution is to screen on the boot loader.
>>>>
>>>> Does that make any sense to others?
>>>>
>>>> thanks,
>>>> Karen
>>>>
>>>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option,
>>>> but with a shell script in the test, this is more complex, so
>>>> the following should be easier):
>>>>
>>>> So what I did was run the test once for it to pass (not your script,
>>>> but just once with jtreg) so that it generated
>>>> the $DST/work directory.
>>>> I then created a rerun.csh script - attached - you can modify for your
>>>> own $DST directory.
>>>> I used it to be able to quickly rerun the test without the jtreg
>>>> framework and compile time etc. but mostly
>>>> to be able to actually add hotspot command-line flags.
>>>>
>>>>
>>>>
>>>>
>>>> p.p.s. details from the error log (let me know if you want me to
>>>> attach the error log to the bug report)
>>>>
>>>> note: error log shows last 10 events including:
>>>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>>>> Event: 0.928 loading class TestClass3
>>>> Event: 0.929 loading class TestClass3 done
>>>> Event: 0.929 loading class java/lang/ClassCircularityError
>>>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>>>
>>>> TestThread
>>>>
>>>> java frames:
>>>>
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>
>>>> j
>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>> v  ~StubRoutines::call_stub
>>>> j
>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>
>>>> j
>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>> j
>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>> v  ~StubRoutines::call_stub
>>>> j
>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>
>>>> j
>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>
>>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>>> j
>>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>>>
>>>> j
>>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>>>
>>>> j
>>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>>>
>>>> v  ~StubRoutines::call_stub
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>
>>>> j
>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>> v  ~StubRoutines::call_stub
>>>> j
>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>
>>>> j
>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>> j
>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>> v  ~StubRoutines::call_stub
>>>> j
>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>
>>>> j
>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>
>>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>>>> v  ~StubRoutines::call_stub
>>>>
>>>>
>>>>
>>>> detailed frames:
>>>>
>>>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*,
>>>> int, Symbol*, char const*)+0x7c
>>>> V  [libjvm.so+0xce005c]
>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>> Handle, Thread*)+0x7d8
>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>> Handle, Handle, Thread*)+0x26d
>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>> Handle, Handle, bool, Thread*)+0x39
>>>> V  [libjvm.so+0x690fbc]
>>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>>> ConstantPool*, int)+0x14a
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>
>>>> j
>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>> JavaCallArguments*, Thread*)+0x7d
>>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>>> j
>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>
>>>> j
>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>> j
>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>> JavaCallArguments*, Thread*)+0x7d
>>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>>> V  [libjvm.so+0xce2096]
>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>>> V  [libjvm.so+0xce00a8]
>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>> Handle, Thread*)+0x824
>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>> Handle, Handle, Thread*)+0x26d
>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>> Handle, Handle, bool, Thread*)+0x39
>>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>>> j
>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>
>>>> j
>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>
>>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>>> j
>>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>>>
>>>> j
>>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>>>
>>>> j
>>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>>>
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>> JavaCallArguments*, Thread*)+0x7d
>>>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*,
>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>>>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>>>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>>>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>>>> V  [libjvm.so+0xa04afa]
>>>> JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>>>> V  [libjvm.so+0xa0485e]
>>>> JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>>>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>>>> V  [libjvm.so+0x9fb6e1]
>>>> JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle,
>>>> unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>>>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*,
>>>> ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*,
>>>> TempNewSymbol&, bool, Thread*)+0x2af
>>>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*,
>>>> ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>>>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*,
>>>> Thread*)+0x2ed
>>>> V  [libjvm.so+0xce1cc4]
>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>>>> V  [libjvm.so+0xce00a8]
>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>> Handle, Thread*)+0x824
>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>> Handle, Handle, Thread*)+0x26d
>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>> Handle, Handle, bool, Thread*)+0x39
>>>> V  [libjvm.so+0x690fbc]
>>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>>> ConstantPool*, int)+0x14a
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>
>>>> j
>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>
>>>> j
>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>> JavaCallArguments*, Thread*)+0x7d
>>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>>> j
>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>
>>>> j
>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>> j
>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>> JavaCallArguments*, Thread*)+0x7d
>>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>>> V  [libjvm.so+0xce2096]
>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>>> V  [libjvm.so+0xce00a8]
>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>> Handle, Thread*)+0x824
>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>> Handle, Handle, Thread*)+0x26d
>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>> Handle, Handle, bool, Thread*)+0x39
>>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>>> j
>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>
>>>> j
>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>
>>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>>> ...<more frames>...
>>>>
>>>>
>>>> On Oct 27, 2014, at 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>>> Ok.
>>>>>
>>>>> Thanks, Dan!
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 10/27/14 7:05 AM, Daniel D. Daugherty wrote:
>>>>>>> The test case was added by Dan.
>>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>>> (added Dan to the to-list)
>>>>>> Here's the changeset that added the test:
>>>>>>
>>>>>> $ hg log -v -r bca8bf23ac59
>>>>>> test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>>> changeset:   132:bca8bf23ac59
>>>>>> user:        dcubed
>>>>>> date:        Mon Mar 24 15:05:09 2008 -0700
>>>>>> files: test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>>> test/java/lang/instrument/ParallelTransformerLoaderAgent.java
>>>>>> test/java/lang/instrument/ParallelTransformerLoaderApp.java
>>>>>> test/java/lang/instrument/TestClass1.java
>>>>>> test/java/lang/instrument/TestClass2.java
>>>>>> test/java/lang/instrument/TestClass3.java
>>>>>> description:
>>>>>> 5088398: 3/2 java.lang.instrument TCK test deadlock (test11)
>>>>>> Summary: Add regression test for single-threaded bootstrap classloader.
>>>>>> Reviewed-by: sspitsyn
>>>>>>
>>>>>>
>>>>>> Based on my e-mail archive for this bug and from the bug report itself,
>>>>>> it looks like we got this test from Wily Labs. The original bug was a
>>>>>> deadlock that stopped being reproducible after:
>>>>>>
>>>>>> Karen fixed the bootstrap class loader to work in parallel via:
>>>>>>
>>>>>>     4997893 4/5 Investigate allowing bootstrap loader to work in
>>>>>> parallel
>>>>>>
>>>>>> with that fix in place the deadlock no longer reproduces.
>>>>>> I'm planning to use this bug as the vehicle for getting
>>>>>> the test program into the INSTRUMENT_REGRESSION test suite.
>>>>>>
>>>>>> *** (#2 of 2): 2008-02-29 18:20:17 GMT+00:00 daniel.daugherty at sun.com
>>>>>>
>>>>>>
>>>>>> A careful reading of JDK-5088398 might reveal the intentions of this
>>>>>> test...
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 10/24/14 5:57 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Yumin,
>>>>>>>
>>>>>>> On 10/24/14 4:08 PM, Yumin Qi wrote:
>>>>>>>> Serguei,
>>>>>>>>
>>>>>>>>   Thanks for your comments.
>>>>>>>>   This failure happens intermittently, but now it can be reproduced
>>>>>>>> with 8/9.
>>>>>>>>   The main thread loads TestClass1 while TestThread loads TestClass2
>>>>>>>> in parallel. Both loads call transform, since TestClass[1-3] are
>>>>>>>> loaded via the agent. While loading TestClass2, transform in turn
>>>>>>>> loads TestClass3 in TestThread.
>>>>>>>>   Note in the main thread, for loop:
>>>>>>>>
>>>>>>>>                   for (int i = 0; i < kNumIterations; i++)
>>>>>>>>                 {
>>>>>>>>                         // load some classes from multiple threads
>>>>>>>> (this thread and one other)
>>>>>>>>                         Thread testThread = new TestThread(2);
>>>>>>>>                         testThread.start();
>>>>>>>>                         loadClasses(1);
>>>>>>>>
>>>>>>>>                         // log that it completed and reset for the
>>>>>>>> next iteration
>>>>>>>>                         testThread.join();
>>>>>>>>                         System.out.print(".");
>>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader();
>>>>>>>>                 }
>>>>>>>>
>>>>>>>> The loader got renewed after testThread.join(). So both threads
>>>>>>>> are using the exact same class loader.
>>>>>>> You are right, thanks.
>>>>>>> It means that all three classes (TestClass1, TestClass2 and TestClass3)
>>>>>>> are loaded by the same class loader in each iteration.
>>>>>>>
>>>>>>> However, I see more cases when the TestClass3 gets loaded.
>>>>>>> It happens in a CFLH event when any other class (not TestClass*) in
>>>>>>> the system is loaded.
>>>>>>> The class loading thread can be any thread, not only "main" or
>>>>>>> "TestThread".
>>>>>>> I suspect this test case mostly targets class loading that happens
>>>>>>> on other threads.
>>>>>>> It is because of the lines:
>>>>>>>                         // In 160_03 and older, transform() is called
>>>>>>>                         // with the "system_loader_lock" held and that
>>>>>>>                         // prevents the bootstrap class loaded from
>>>>>>>                         // running in parallel. If we add a slight
>>>>>>> sleep
>>>>>>>                         // delay here when the transform() call is not
>>>>>>>                         // main or TestThread, then the deadlock in
>>>>>>>                         // 160_03 and older is much more reproducible.
>>>>>>>                         if (!tName.equals("main") &&
>>>>>>> !tName.equals("TestThread")) {
>>>>>>>                             System.out.println("Thread '" + tName +
>>>>>>>                                 "' has called transform()");
>>>>>>>                             try {
>>>>>>>                                 Thread.sleep(500);
>>>>>>>                             } catch (InterruptedException ie) {
>>>>>>>                             }
>>>>>>>                         }
>>>>>>>
>>>>>>> What about the following?
>>>>>>>
>>>>>>> In the ParallelTransformerLoaderAgent.java  make this change:
>>>>>>>               if (!tName.equals("main"))
>>>>>>>                   => if (tName.equals("TestThread"))
>>>>>>>
>>>>>>> Does the updated test still fail?
>>>>>>>
>>>>>>>> After a new class loader is created, the next loop iteration uses
>>>>>>>> it. This is why quite often on the stack trace we can see it
>>>>>>>> resolving JarLoader$2.
>>>>>>>>
>>>>>>>> I do not quite understand the test case either. Loading TestClass3
>>>>>>>> inside transform using the same class loader will cause another call
>>>>>>>> to transform and form a cycle. Nonetheless, if we see TestClass2
>>>>>>>> already loaded, the loop will end, but that still is a risk.
>>>>>>> In fact, I don't like that the test loads the class TestClass3 at
>>>>>>> the TestClass3 CFLH event.
>>>>>>> However, it is interesting to know why we did not see (is it the
>>>>>>> case?) this issue before.
>>>>>>> Also, it is interesting why the test stops failing with your fix
>>>>>>> (replacing loader with SystemClassLoader).
>>>>>>>
>>>>>>> The test case was added by Dan.
>>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>>> (added Dan to the to-list)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>> On 10/24/2014 1:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Yumin,
>>>>>>>>>
>>>>>>>>> Below is some analysis to make sure I understand the test
>>>>>>>>> scenario correctly.
>>>>>>>>>
>>>>>>>>> The ParallelTransformerLoaderApp.main() executes a 1000 iteration
>>>>>>>>> loop.
>>>>>>>>> At each iteration it does:
>>>>>>>>>   - creates and starts a new TestThread
>>>>>>>>>   - loads TestClass1 with the current class loader:
>>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader()
>>>>>>>>>   - changes the current class loader with new one:
>>>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader()
>>>>>>>>>
>>>>>>>>> The TestThread loads the TestClass2 concurrently with the main
>>>>>>>>> thread.
>>>>>>>>>
>>>>>>>>> At the CFLH events, the ParallelTransformerLoaderAgent does the
>>>>>>>>> class retransformation.
>>>>>>>>> If the thread loading the class is not "main", it loads the class
>>>>>>>>> TestClass3
>>>>>>>>> with the current class loader
>>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader().
>>>>>>>>>
>>>>>>>>> Sometimes, the TestClass2 and TestClass3 are loaded by the same
>>>>>>>>> class loader recursively.
>>>>>>>>> It happens if the class loader has not been changed between
>>>>>>>>> loading TestClass2 and TestClass3 classes.
>>>>>>>>>
>>>>>>>>> I'm not convinced yet the test is incorrect.
>>>>>>>>> And it is not clear why we get a ClassCircularityError.
>>>>>>>>>
>>>>>>>>> Please, let me know if the above understanding is wrong.
>>>>>>>>> I also see the reply from David and share his concerns.
>>>>>>>>>
>>>>>>>>> It is not clear if this failure is a regression.
>>>>>>>>> Did we observe this issue before?
>>>>>>>>> If not, then when and why did this failure start to appear?
>>>>>>>>>
>>>>>>>>> Unfortunately, it is impossible to look at the test run history
>>>>>>>>> at the moment.
>>>>>>>>> Aurora is under maintenance.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>> On 10/13/14 3:58 PM, Yumin Qi wrote:
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>>>>>>>>>>
>>>>>>>>>> The bug is marked as confidential, so the webrev is posted internally.
>>>>>>>>>>
>>>>>>>>>> Problem: The test case tries to load a class from the same jar
>>>>>>>>>> via the agent while another class from that jar is being loaded
>>>>>>>>>> by the same class loader in the same thread. The call happens in
>>>>>>>>>> transform, which is a rare case: loading one class in the middle
>>>>>>>>>> of loading another. The result is a ClassCircularityError. While
>>>>>>>>>> the first class is being loaded, the VM puts JarLoader$2 in the
>>>>>>>>>> placeholder table and then starts defineClass, which calls
>>>>>>>>>> transform. That begins loading the second class, which follows
>>>>>>>>>> the same path, tries to load JarLoader$2 first, and finds it
>>>>>>>>>> already in the placeholder table, so a ClassCircularityError is
>>>>>>>>>> thrown.
>>>>>>>>>> Fix: The test case should not load a class with the same class
>>>>>>>>>> loader, in the same thread, from the same jar inside the
>>>>>>>>>> 'transform' method. I modified it to load with the system class
>>>>>>>>>> loader, and we expect to see ClassNotFoundException. For details,
>>>>>>>>>> see the bug comments.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
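
The placeholder-table behaviour described in the original review request
above can be illustrated with a toy model. This is only a sketch under
stated assumptions: the class PlaceholderSketch, its methods, and the
skipBootClasses switch are hypothetical names invented for illustration,
and none of this is HotSpot code. It assumes that loading any test class
first needs the helper class sun.misc.URLClassPath$JarLoader$2, and that
the agent's transform() loads TestClass3 from inside class definition.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only; all names here are hypothetical and this is not HotSpot code.
class PlaceholderSketch {
    static final String HELPER = "sun.misc.URLClassPath$JarLoader$2";

    private final Set<String> inProgress = new HashSet<>(); // "placeholder table"
    private final Set<String> loaded = new HashSet<>();
    private final boolean skipBootClasses;

    PlaceholderSketch(boolean skipBootClasses) {
        this.skipBootClasses = skipBootClasses;
    }

    // Stand-in for the agent's transform(), which loads TestClass3.
    private void transform(String name) {
        if (skipBootClasses && name.equals(HELPER)) {
            return; // models ignoring classes defined by the boot loader
        }
        if (!name.equals("TestClass3")) { // simplification: no self-recursion
            loadTestClass("TestClass3");
        }
    }

    // Load one class: register a placeholder, run transform, then publish.
    private void load(String name) {
        if (loaded.contains(name)) {
            return;
        }
        if (!inProgress.add(name)) {
            // The same class is already being loaded further up the stack.
            throw new ClassCircularityError(name);
        }
        try {
            transform(name);
            loaded.add(name);
        } finally {
            inProgress.remove(name);
        }
    }

    // Loading a test class through the URL loader first needs the helper class.
    void loadTestClass(String name) {
        load(HELPER);
        load(name);
    }

    // Returns the name reported by ClassCircularityError, or null if none.
    static String run(boolean skipBootClasses) {
        try {
            new PlaceholderSketch(skipBootClasses).loadTestClass("TestClass2");
            return null;
        } catch (ClassCircularityError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("unmodified test:   " + run(false));
        System.out.println("skip boot classes: " + run(true));
    }
}
```

In this model the nested load of the helper class finds its own
placeholder and fails, while skipping boot-loader classes in transform()
lets the helper finish loading before any test class is defined.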


From aleksey.shipilev at oracle.com  Thu Nov 13 07:43:06 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 10:43:06 +0300
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5464444A.7030601@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>
	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>
	<5464444A.7030601@oracle.com>
Message-ID: <5464610A.20901@oracle.com>

On 13.11.2014 08:40, David Holmes wrote:
> There is some history in JDK-4974572 (which is non-public I'm afraid).
> To all intents and purposes the flag at that point was used to enable
> testing of workarounds if problems were suspected in the "new"
> safepointing code. I think it has outlived its usefulness by a few major
> releases so I'm happy to see it go.

Filed:
  https://bugs.openjdk.java.net/browse/JDK-8064777

I'll do a patch to remove the flag.

-Aleksey.


From peter.levart at gmail.com  Thu Nov 13 08:24:53 2014
From: peter.levart at gmail.com (Peter Levart)
Date: Thu, 13 Nov 2014 09:24:53 +0100
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <B7CBF555-A17C-498A-B259-4E28F4B2198E@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>
	<632A5C98-B386-4625-BE12-355241581955@oracle.com>
	<5457AA75.8090103@gmail.com>
	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>
	<5457E0F9.8090004@gmail.com>
	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>
	<5458A57C.4060208@gmail.com>
	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>
	<5459034E.8070809@gmail.com>
	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
	<545F642E.30205@gmail.com>
	<B7CBF555-A17C-498A-B259-4E28F4B2198E@oracle.com>
Message-ID: <54646AD5.4000404@gmail.com>


On 11/12/2014 07:27 PM, David Chase wrote:
> Hello Peter,
>
>> Sadly, this seems not to be the case for MemberNames or for "Types".
> That statement is inoperative.  Mistakes were made.
> It's compareTo that they lack.

Yes, I saw your quite tricky implementation of MemberName.compareTo,
based on hashCode(s), String representations, etc... The hash-table 
based interning does not need it though.

Regards, Peter

> David
>
>
> On 2014-11-09, at 7:55 AM, Peter Levart <peter.levart at gmail.com> wrote:
>
>> Hi David,
>>
>> I played a little with the idea of having a hash table instead of packed sorted array for interning. Using ConcurrentHashMap would present quite some memory overhead. A more compact representation is possible in the form of a linear-scan hash table where elements of array are MemberNames themselves:
>>
>> http://cr.openjdk.java.net/~plevart/misc/MemberName.intern/jdk.06.diff/
>>
>> This is a drop-in replacement for MemberName on top of your jdk.06 patch. If you have some time, you can run this with your performance tests to see if it presents any difference. If not, then perhaps this interning is not so performance critical after all.
>>
>> Regards, Peter
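
For readers unfamiliar with the idea, a linear-scan (open addressing)
intern table whose array slots hold the interned objects themselves might
look roughly like the sketch below. It is a minimal illustration, not the
code from the jdk.06 diff: the class InternTableSketch and its methods are
hypothetical, it is not thread-safe, and it never removes entries.

```java
// Hypothetical sketch of a linear-scan intern table; not the jdk.06 patch.
final class InternTableSketch<T> {
    private Object[] table = new Object[8]; // capacity is always a power of two
    private int size;

    // Returns the canonical instance equal to candidate, adding it if absent.
    @SuppressWarnings("unchecked")
    T intern(T candidate) {
        if ((size + 1) * 4 > table.length * 3) { // keep load factor <= 3/4
            resize();
        }
        int mask = table.length - 1;
        int i = spread(candidate.hashCode()) & mask;
        while (table[i] != null) {
            if (table[i].equals(candidate)) {
                return (T) table[i]; // an equal element is already interned
            }
            i = (i + 1) & mask; // linear scan to the next slot
        }
        table[i] = candidate;
        size++;
        return candidate;
    }

    int size() {
        return size;
    }

    // Doubles the array and re-inserts every element at its new position.
    private void resize() {
        Object[] old = table;
        table = new Object[old.length * 2];
        int mask = table.length - 1;
        for (Object o : old) {
            if (o == null) {
                continue;
            }
            int i = spread(o.hashCode()) & mask;
            while (table[i] != null) {
                i = (i + 1) & mask;
            }
            table[i] = o;
        }
    }

    // Mixes high bits into low bits so that masking uses more of the hash.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        InternTableSketch<String> t = new InternTableSketch<>();
        String a = new String("invoke");
        String b = new String("invoke");
        System.out.println(t.intern(a) == t.intern(b));
    }
}
```

Compared with a ConcurrentHashMap, such a table has no per-entry node
objects, which is where the memory saving would come from; a real version
would still need a concurrency strategy.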


From david.holmes at oracle.com  Thu Nov 13 08:47:19 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 18:47:19 +1000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5464610A.20901@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>	<5464444A.7030601@oracle.com>
	<5464610A.20901@oracle.com>
Message-ID: <54647017.8000704@oracle.com>

On 13/11/2014 5:43 PM, Aleksey Shipilev wrote:
> On 13.11.2014 08:40, David Holmes wrote:
>> There is some history in JDK-4974572 (which is non-public I'm afraid).
>> To all intents and purposes the flag at that point was used to enable
>> testing of workarounds if problems were suspected in the "new"
>> safepointing code. I think it has outlived its usefulness by a few major
>> releases so I'm happy to see it go.
>
> Filed:
>    https://bugs.openjdk.java.net/browse/JDK-8064777

You actually filed 8064776 first :)

But neither is needed as removal can be the solution of 8064749.

David

> I'll do a patch to remove the flag.
>
> -Aleksey.
>

From david.holmes at oracle.com  Thu Nov 13 08:50:05 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 18:50:05 +1000
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <54637A9A.9040108@sap.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
	<5463149A.6020506@oracle.com> <54637A9A.9040108@sap.com>
Message-ID: <546470BD.9050303@oracle.com>

On 13/11/2014 1:19 AM, Haug, Gunter wrote:
>
> On 12.11.2014 09:04, David Holmes wrote:
>> Hi Gunter,
>>
>> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>>> Hi All,
>>>
>>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs
>>> improvement)' makes use of getrusage() to retrieve accurate
>>> per-thread data on resource usage. We can use exactly the same code
>>> on AIX to achieve this.
>>>
>>> Please review the following change:
>>>
>>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8064471
>>
>> I have a couple of comments on this code which presumably also apply
>> to the original :(
> Yes, they apply to the original as well, see below.
>>
>> First this comment is no longer applicable (actually it was never
>> applicable to AIX!):
>>
>>   // For now, we say that linux does not support vtime. I have no idea
>>   // whether it can actually be made to (DLD, 9/13/05).
>>
> You're right. I will remove it.
>> Second this calculation seems wrong:
>>
>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 *
>> 1000);
>>
>> To me this performs integer division (i.e. truncation) then converts
>> the resulting integer to a double. I would expect to see additional
>> parentheses (even if not needed, for clarity):
>>
>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 *
>> 1000);
>>
>> or more simply divide by a floating-point value:
>>
>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>
>> and you don't need two double casts regardless as the expression will
>> be of type double as soon as there is one operand of type double. So
>> that should reduce to:
>>
>> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>
> OK. Do you want us to also change the Linux version as you proposed?

I'll leave it up to you. If you leave this as AIX only then it tests the 
new process :) There can be a follow up cleanup bug for linux.
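
For reference, the effect of dividing by a floating-point constant, as in
the simplified form quoted above, can be sketched in Java (the arithmetic
rules for mixed integer and floating-point operands are the same as in
C++ here). The class VtimeSketch and its method names are hypothetical;
the parameters merely mirror the tv_sec and tv_usec fields of ru_utime
and ru_stime.

```java
// Hypothetical helper names; parameters mirror ru_utime/ru_stime fields.
class VtimeSketch {
    // Simplified form: the 1000.0 operand makes the division floating-point.
    static double elapsedSeconds(long uSec, long uUsec, long sSec, long sUsec) {
        return uSec + sSec + (uUsec + sUsec) / (1000.0 * 1000);
    }

    // Pure integer division: the sub-second part is truncated before
    // the result is widened to double.
    static double truncatedSeconds(long uSec, long uUsec, long sSec, long sUsec) {
        return uSec + sSec + (uUsec + sUsec) / (1000 * 1000);
    }

    public static void main(String[] args) {
        // 1.25 s of user time plus 2.5 s of system time.
        System.out.println(elapsedSeconds(1, 250_000, 2, 500_000));
        System.out.println(truncatedSeconds(1, 250_000, 2, 500_000));
    }
}
```

With 1.25 s of user time and 2.5 s of system time, the first method
returns 3.75 while the integer-division variant returns 3.0.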

Thanks,
David

> Thanks,
> Gunter
>
>> Cheers,
>> David
>>
>>> Thanks,
>>> Gunter
>>>
>

From aleksey.shipilev at oracle.com  Thu Nov 13 08:55:28 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 11:55:28 +0300
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <54647017.8000704@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>	<5464444A.7030601@oracle.com>
	<5464610A.20901@oracle.com> <54647017.8000704@oracle.com>
Message-ID: <54647200.5000103@oracle.com>

On 13.11.2014 11:47, David Holmes wrote:
> On 13/11/2014 5:43 PM, Aleksey Shipilev wrote:
>> On 13.11.2014 08:40, David Holmes wrote:
>>> There is some history in JDK-4974572 (which is non-public I'm afraid).
>>> To all intents and purposes the flag at that point was used to enable
>>> testing of workarounds if problems were suspected in the "new"
>>> safepointing code. I think it has outlived its usefulness by a few major
>>> releases so I'm happy to see it go.
>>
>> Filed:
>>    https://bugs.openjdk.java.net/browse/JDK-8064777
> 
> You actually filed 8064776 first :)

O_o. The submit timestamps are the same. JIRA is funky today, huh.

> But neither is needed as removal can be the solution of 8064749.

I thought we are better off tracking this separately, and then close
all/any pending bugs about UseCompilerSafepoints as WNF citing 8064776.
Still want to do this in 8064749?

Also, I wonder if we want to demote the flag to experimental in 8u. This
does not sound like a backport of 8064749 at all, but rather a separate
change.

-Aleksey.


From david.holmes at oracle.com  Thu Nov 13 09:12:08 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Nov 2014 19:12:08 +1000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <54647200.5000103@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>	<5464444A.7030601@oracle.com>
	<5464610A.20901@oracle.com> <54647017.8000704@oracle.com>
	<54647200.5000103@oracle.com>
Message-ID: <546475E8.6000908@oracle.com>

On 13/11/2014 6:55 PM, Aleksey Shipilev wrote:
> On 13.11.2014 11:47, David Holmes wrote:
>> On 13/11/2014 5:43 PM, Aleksey Shipilev wrote:
>>> On 13.11.2014 08:40, David Holmes wrote:
>>>> There is some history in JDK-4974572 (which is non-public I'm afraid).
>>>> To all intents and purposes the flag at that point was used to enable
>>>> testing of workarounds if problems were suspected in the "new"
>>>> safepointing code. I think it has outlived its usefulness by a few major
>>>> releases so I'm happy to see it go.
>>>
>>> Filed:
>>>     https://bugs.openjdk.java.net/browse/JDK-8064777
>>
>> You actually filed 8064776 first :)
>
> O_o. The submit timestamps are the same. JIRA is funky today, huh.
>
>> But neither is needed as removal can be the solution of 8064749.
>
> I thought we are better off tracking this separately, and then close
> all/any pending bugs about UseCompilerSafepoints as WNF citing 8064776.
> Still want to do this in 8064749?

It is what I did in 8062307 for the TraceThreadEvents flag. 8064749 
contains all the pertinent comments.

> Also, I wonder if we want to demote the flag to experimental in 8u. This
> does not sound like a backport of 8064749 at all, but rather a separate
> change.

Any change requires CCC. I don't see any point in making the flag 
experimental as it doesn't really provide any "experimentation".

Happy to let others weigh in.

Cheers,
David


> -Aleksey.
>

From aleksey.shipilev at oracle.com  Thu Nov 13 12:41:16 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 15:41:16 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
Message-ID: <5464A6EC.6090804@oracle.com>

Hi,

This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
 https://bugs.openjdk.java.net/browse/JDK-8064749
 http://cr.openjdk.java.net/~shade/8064749/webrev.01/

Do I understand it right we need a CCC to remove the product flag?

Testing: JPRT, vm.quick.testlist

Thanks,
-Aleksey.


From daniel.daugherty at oracle.com  Thu Nov 13 13:53:57 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 06:53:57 -0700
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <546475E8.6000908@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>	<5464444A.7030601@oracle.com>
	<5464610A.20901@oracle.com> <54647017.8000704@oracle.com>
	<54647200.5000103@oracle.com> <546475E8.6000908@oracle.com>
Message-ID: <5464B7F5.9060000@oracle.com>

 > Happy to let others weigh in.

Please use 8064749 to remove the flag; David H is correct that all the
right info is there.

Dan


On 11/13/14 2:12 AM, David Holmes wrote:
> On 13/11/2014 6:55 PM, Aleksey Shipilev wrote:
>> On 13.11.2014 11:47, David Holmes wrote:
>>> On 13/11/2014 5:43 PM, Aleksey Shipilev wrote:
>>>> On 13.11.2014 08:40, David Holmes wrote:
>>>>> There is some history in JDK-4974572 (which is non-public I'm 
>>>>> afraid).
>>>>> To all intents and purposes the flag at that point was used to enable
>>>>> testing of workarounds if problems were suspected in the "new"
>>>>> safepointing code. I think it has outlived its usefulness by a few 
>>>>> major
>>>>> releases so I'm happy to see it go.
>>>>
>>>> Filed:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8064777
>>>
>>> You actually filed 8064776 first :)
>>
>> O_o. The submit timestamps are the same. JIRA is funky today, huh.
>>
>>> But neither is needed as removal can be the solution of 8064749.
>>
>> I thought we are better off tracking this separately, and then close
>> all/any pending bugs about UseCompilerSafepoints as WNF citing 8064776.
>> Still want to do this in 8064749?
>
> It is what I did in 8062307 for the TraceThreadEvents flag. 8064749 
> contains all the pertinent comments.
>
>> Also, I wonder if we want to demote the flag to experimental in 8u. This
>> does not sound like a backport of 8064749 at all, but rather a separate
>> change.
>
> Any change requires CCC. I don't see any point in making the flag 
> experimental as it doesn't really provide any "experimentation".
>
> Happy to let others weigh in.
>
> Cheers,
> David
>
>
>
>> -Aleksey.
>>


From magnus.ihse.bursie at oracle.com  Thu Nov 13 14:44:54 2014
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Thu, 13 Nov 2014 15:44:54 +0100
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <546151A9.1080100@oracle.com>
References: <546151A9.1080100@oracle.com>
Message-ID: <5464C3E6.5000309@oracle.com>

On 2014-11-11 01:00, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
> Yes, it is a small fix, but it is in Makefiles so feel free to
> run screaming from the room... :-)  On the plus side the fix does
> delete two work around source files (Coleen would say that's a
> Good Thing (TM)!)

... but you're only deleting the make files?

src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and 
src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c could 
be deleted as well, right?

Good idea for the fix, anyway. I opened 
https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a similar 
solution in configure.

/Magnus

From daniel.daugherty at oracle.com  Thu Nov 13 14:53:06 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 07:53:06 -0700
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464A6EC.6090804@oracle.com>
References: <5464A6EC.6090804@oracle.com>
Message-ID: <5464C5D2.2050007@oracle.com>

On 11/13/14 5:41 AM, Aleksey Shipilev wrote:
> Hi,
>
> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>   https://bugs.openjdk.java.net/browse/JDK-8064749
>   http://cr.openjdk.java.net/~shade/8064749/webrev.01/

src/share/vm/runtime/arguments.cpp
     Not your problem, but that list is formatted quite
     inconsistently.

     It seems like new entries should be added at the bottom
     so your line 309 should be between these two lines:

     line 313: #endif // ZERO
     line 314:   { NULL, JDK_Version(0), JDK_Version(0) }

src/share/vm/runtime/globals.hpp
     No comments.

src/share/vm/runtime/safepoint.cpp
     No comments.


> Do I understand it right we need a CCC to remove the product flag?

Yes. Since this was a product flag, it needs a CCC to remove it.

Dan


>
> Testing: JPRT, vm.quick.testlist
>
> Thanks,
> -Aleksey.
>
>
>


From karen.kinnear at oracle.com  Thu Nov 13 14:54:14 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 13 Nov 2014 09:54:14 -0500
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <54645530.6010107@oracle.com>
References: <543C591E.8010602@oracle.com>	<544AB477.4000204@oracle.com>	<544ADC07.6080904@oracle.com>	<544AE76A.9030701@oracle.com>	<544E5123.1060202@oracle.com>	<544E8844.1070907@oracle.com>	<0FB37288-5995-4E0B-B005-E00A4FB3B22A@oracle.com>
	<5454218D.40009@oracle.com> <5456EADF.4050203@oracle.com>
	<3CF28613-0C0F-44E2-869A-FF5B01D7E575@oracle.com>
	<54645530.6010107@oracle.com>
Message-ID: <589B2E5D-9B9E-4945-BC13-A0025B99AABF@oracle.com>

Yumin,

If you run with -XX:AbortVMOnException=java.lang.ClassCircularityError and -XX:+ShowMessageBoxOnError, you can attach a debugger
and get the correct stack trace.

My notes below show a way to run a faster test script loop - i.e. see the earlier attached rerun.csh script - that way you can just run
the test, not the recompile etc. each time, and you can add your flags. Hopefully that will make it fail sooner.

hth,
Karen

On Nov 13, 2014, at 1:52 AM, Yumin Qi wrote:

> Thanks, Karen
> 
>  Now I have a standalone test which reproduces the failure easily. I am trying to set up a debugger to trace the problem. Meanwhile, I will try your suggested fix too.
>  When AbortVMOnException is set, it is too late for the debugger to attach, since execution goes straight to abort. Currently it is not easy to get a failure in a single run (we loop in the test script).
> 
>  Thanks
>  Yumin
> 
> On 11/12/2014 8:27 AM, Karen Kinnear wrote:
>> I think there are three things we need to figure out.
>> 
>> 1. I reproduced a problem in TestThread2. Below was the information from that and my analysis
>>    - all - comments on my analysis are very welcome
>>    - Yumin - please try the suggested test change below to see if it helps.
>> 
>>    - that is the only example I have seen the full details for.
>> 
>> 2. Does the circularity error actually occur in the main thread and if so why?
>>    - need to catch in a debugger/hs_err file a situation in which this occurs in the main thread please.
>>    We need the full stack trace for this - native and java please
>>    - run this without the test change I suggested please
>>    - try to catch ClassCircularityError in the main thread
>> 
>> 3. figure out why we see this problem more frequently
>>    - I am not convinced this problem didn't already exist - the test logic has some very odd comments and workarounds which seem to imply there
>>    were intermittent problems from the beginning
>>    - that said - worth figuring out if for instance, the sun.misc.URLClassPath logic was rewritten (and when) to add $JarLoader$2
>>    - and looking at the history of test failure
>> 
>> thanks,
>> Karen
>> 
>> On Nov 2, 2014, at 9:39 PM, David Holmes wrote:
>> 
>>> On 1/11/2014 9:55 AM, Yumin Qi wrote:
>>>> Karen,
>>>> 
>>>>   Thanks for your detailed message for debugging. Yes, from my debugging,
>>>> the exception did happen in TestThread rather than the main thread. I have no
>>>> idea why in the end the exception was reported in the main thread.
>>> Until that question is answered I will remain uneasy about simply tweaking the test until it no longer fails. I would also like to know when it started failing - Karen alludes to the possible introduction of a new inner class at some point.
>>> 
>>> Thanks,
>>> David
>>> 
>>>>    You mention
>>>> 
>>>> So that change to the test would be:
>>>>    in TestTransformer:
>>>>       if (loader != null) {
>>>>           if (tName.equals("TestThread")) {
>>>>              loadClasses(3);
>>>>           }
>>>>        }
>>>>        return null;
>>>>     }
>>>> 
>>>> 
>>>> The loader is the one defined in the test case, right? The system class
>>>> loader is never null.
>>>> I will try this change, let's see if it can work it out.
>>>> 
>>>> Thanks
>>>> Yumin
>>>> 
>>>> On 10/31/2014 3:29 PM, Karen Kinnear wrote:
>>>>> Yumin,
>>>>> 
>>>>> From your earlier exception stack trace (many thanks) you reported:
>>>>> 
>>>>> Exception in thread "main" java.lang.ClassCircularityError:  (no - I
>>>>> don't know why this is in thread "main")
>>>>> sun/misc/URLClassPath$JarLoader$2
>>>>> at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:771)
>>>>> at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:843)
>>>>> at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:364)
>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:426)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:359)
>>>>> at java.lang.Class.forName0(Native Method)
>>>>> at java.lang.Class.forName(Class.java:340)
>>>>> at
>>>>> ParallelTransformerLoaderApp.loadClasses(ParallelTransformerLoaderApp.java:83)
>>>>> 
>>>>> at
>>>>> ParallelTransformerLoaderApp.main(ParallelTransformerLoaderApp.java:45)
>>>>> 
>>>>> 
>>>>> So I ran  with -XX:AbortVMOnException=java.lang.ClassCircularityError
>>>>> -XX:+ShowMessageBoxOnError to get
>>>>> a log file and stack trace. See my instructions below on how to do that.
>>>>> 
>>>>> I did this, attached a debugger, which didn't help enough since I
>>>>> needed to see the java stack frames,
>>>>>  and got an hs_err_log also, so the stack traces came from the error
>>>>> log.
>>>>> 
>>>>> The stack trace was on Thread 2, which in the hs_err_log was
>>>>> TestThread (which makes sense for what the test logic says).
>>>>> See later in email for stack traces from Thread 2.
>>>>> 
>>>>> Summary of stack trace:
>>>>> 
>>>>> TestThread:
>>>>>   loadClasses(#) -> forName(TestClass#, URLClassLoader)
>>>>>     vm calls out to URLClassLoader.loadClass(String) which is
>>>>> inherited from java.lang.ClassLoader.loadClass(String)
>>>>>     ... calls java.net.URLClassLoader.findClass(...) which calls
>>>>>       DoPrivileged  java.net.URLClassLoader$1.run which calls
>>>>>          sun.misc.URLClassPath.getResource(name, false)  which calls
>>>>>              sun.misc.URLClassPath$JarLoader.getResource which calls
>>>>>                  sun.misc.URLClassPath$JarLoader.checkResource which
>>>>> tries to call sun.misc.URLClassPath$JarLoader$2
>>>>>    - and then the transformer jumps in with loadClasses(#) (where we
>>>>> know # is 3) and walks the same logic, which tries to load
>>>>> sun.misc.URLClassPath$JarLoader$2 again
>>>>> 
>>>>> Note that in the placeholder table information that Yumin printed, the
>>>>> circularity error is on sun.misc.URLClassPath$JarLoader$2 with the
>>>>> null == boot loader, which
>>>>> makes sense -- that is the appropriate defining loader, and therefore
>>>>> the one the CFLH would intercept during the defineClass phase.
>>>>> 
>>>>> In the sun.misc.URLClassPath.java file, in the class JarLoader, in the
>>>>> method checkResource
>>>>> ... return new Resource() { ... }
>>>>> -- I do not know why that generates sun.misc.URLClassPath$JarLoader$1,
>>>>> $2 and $3 at build time or when that was added.
>>>>> I would guess that is when the bug started happening.
>>>>> 
>>>>> When I have a successful run, sun.misc.URLClassPath$JarLoader$2 loads
>>>>> before any TestClass1 loads.
>>>>> 
>>>>> My belief is that the point of the test is to test parallel class
>>>>> loading for URL class loaders.
>>>>> I don't think the point is to test the bootstrap class loader, nor to
>>>>> test bootstrapping - i.e. running the agent before
>>>>> we have loaded sufficient classes to allow loading URLClassLoader
>>>>> classes.
>>>>> 
>>>>> What I suggested to Yumin that he try would be to change the test to
>>>>> NOT intercept boot loader loads, so that
>>>>> sun.misc.URLClassPath$JarLoader$#
>>>>> can load which will in turn allow classes loaded by a URLClassLoader
>>>>> subclass to load.
>>>>> 
>>>>> So that change to the test would be:
>>>>>    in TestTransformer:
>>>>>       if (loader != null) {
>>>>>           if (tName.equals("TestThread")) {
>>>>>               loadClasses(3);
>>>>>           }
>>>>>       }
>>>>>       return null;
>>>>>    }
>>>>> // I also suspect with that change, we can remove the sleep loop
>>>>> Note: there was a printed message which said that the Thread "Signal
>>>>> Dispatcher" has called transform(), which I
>>>>> ignored, however it is good that we don't call loadClass on that
>>>>> thread  - which is part of what the sleep loop does -
>>>>> but that would be handled by the boot loader screening above
>>>>> 
>>>>> Alternatively we can preload the URLClassPath classes, but I don't
>>>>> think we want to do that, or
>>>>> we can have the agent explicitly screen on a variety of jdk
>>>>> bootstrapping classes. But I think the cleaner
>>>>> solution is to screen on the boot loader.
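
A minimal sketch of the boot-loader screening suggested above, assuming a
transformer shaped like the test's TestTransformer. The class name
BootLoaderScreeningTransformer, the nestedLoads counter and the stubbed
loadClasses body are hypothetical stand-ins, not the actual test source.
Only the loader != null check and the "TestThread" name come from the
discussion.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the test's TestTransformer; not the real test code.
class BootLoaderScreeningTransformer implements ClassFileTransformer {
    // Counts how many times the nested class load would have been triggered.
    final AtomicInteger nestedLoads = new AtomicInteger();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        String tName = Thread.currentThread().getName();
        // loader == null means the class is being defined by the boot loader.
        // Skipping those lets JDK bootstrap classes finish loading untouched.
        if (loader != null) {
            if (tName.equals("TestThread")) {
                loadClasses(3);
            }
        }
        return null; // null tells the JVM that no transformation was performed
    }

    // Placeholder for the test's loadClasses(int); here it only records the call.
    void loadClasses(int index) {
        nestedLoads.incrementAndGet();
    }
}
```

With this shape, a CFLH callback for sun.misc.URLClassPath$JarLoader$2
(loader == null) returns without starting a nested load, while a callback
for a class defined by the test's URLClassLoader on TestThread still does.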
>>>>> 
>>>>> Does that make any sense to others?
>>>>> 
>>>>> thanks,
>>>>> Karen
>>>>> 
>>>>> p.s. How to run with hotspot flags (jtreg has a -show:rerun option,
>>>>> but with a shell script in the test, this is more complex, so
>>>>> the following should be easier):
>>>>> 
>>>>> So what I did was run the test once for it to pass (not your script,
>>>>> but just once with jtreg) so that it generated
>>>>> the $DST/work directory.
>>>>> I then created a rerun.csh script - attached - you can modify for your
>>>>> own $DST directory.
>>>>> I used it to be able to quickly rerun the test without the jtreg
>>>>> framework and compile time etc. but mostly
>>>>> to be able to actually add hotspot command-line flags.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> p.p.s. details from the error log (let me know if you want me to
>>>>> attach the error log to the bug report)
>>>>> 
>>>>> note: error log shows last 10 events including:
>>>>> Event: 0.928 loading class sun/misc/URLClassPath$JarLoader$2
>>>>> Event: 0.928 loading class TestClass3
>>>>> Event: 0.929 loading class TestClass3 done
>>>>> Event: 0.929 loading class java/lang/ClassCircularityError
>>>>> Event: 0.929 loading class java/lang/ClassCircularityError done
>>>>> 
>>>>> TestThread
>>>>> 
>>>>> java frames:
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>> 
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>>> v  ~StubRoutines::call_stub
>>>>> j
>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>> 
>>>>> j
>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>>> j
>>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>>> v  ~StubRoutines::call_stub
>>>>> j
>>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>> 
>>>>> j
>>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>> 
>>>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>>>> j
>>>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>>>> 
>>>>> j
>>>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>>>> 
>>>>> j
>>>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>>>> 
>>>>> v  ~StubRoutines::call_stub
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>> 
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>>> v  ~StubRoutines::call_stub
>>>>> j
>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>> 
>>>>> j
>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>>> j
>>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>>> v  ~StubRoutines::call_stub
>>>>> j
>>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>> 
>>>>> j
>>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>> 
>>>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>>>> j  ParallelTransformerLoaderApp$TestThread.run()V+4
>>>>> v  ~StubRoutines::call_stub
>>>>> 
>>>>> 
>>>>> 
>>>>> detailed frames:
>>>>> 
>>>>> V  [libjvm.so+0x760f5a]  Exceptions::_throw_msg(Thread*, char const*,
>>>>> int, Symbol*, char const*)+0x7c
>>>>> V  [libjvm.so+0xce005c]
>>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>>> Handle, Thread*)+0x7d8
>>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>>> Handle, Handle, Thread*)+0x26d
>>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>>> Handle, Handle, bool, Thread*)+0x39
>>>>> V  [libjvm.so+0x690fbc]
>>>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>>>> ConstantPool*, int)+0x14a
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>> 
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>>> v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>>> JavaCallArguments*, Thread*)+0x7d
>>>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>>>> j
>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>> 
>>>>> j
>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>>> j
>>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3
>>>>> v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>>> JavaCallArguments*, Thread*)+0x7d
>>>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>>>> V  [libjvm.so+0xce2096]
>>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>>>> V  [libjvm.so+0xce00a8]
>>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>>> Handle, Thread*)+0x824
>>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>>> Handle, Handle, Thread*)+0x26d
>>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>>> Handle, Handle, bool, Thread*)+0x39
>>>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>>>> j
>>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>> 
>>>>> j
>>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>> 
>>>>> j  ParallelTransformerLoaderAgent$TestTransformer.loadClasses(I)V+25
>>>>> j
>>>>> ParallelTransformerLoaderAgent$TestTransformer.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+81
>>>>> 
>>>>> j
>>>>> sun.instrument.TransformerManager.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[B)[B+50
>>>>> 
>>>>> j
>>>>> sun.instrument.InstrumentationImpl.transform(Ljava/lang/ClassLoader;Ljava/lang/String;Ljava/lang/Class;Ljava/security/ProtectionDomain;[BZ)[B+34
>>>>> 
>>>>> v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>>> JavaCallArguments*, Thread*)+0x7d
>>>>> V  [libjvm.so+0x911bfb]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*,
>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x3cd
>>>>> V  [libjvm.so+0x916918]  jni_CallObjectMethod+0x388
>>>>> C  [libinstrument.so+0x4eb5]  transformClassFile+0x1e5
>>>>> C  [libinstrument.so+0x1e06]  eventHandlerClassFileLoadHook+0x96
>>>>> V  [libjvm.so+0xa04afa]
>>>>> JvmtiClassFileLoadHookPoster::post_to_env(JvmtiEnv*, bool)+0x1a8
>>>>> V  [libjvm.so+0xa0485e]
>>>>> JvmtiClassFileLoadHookPoster::post_all_envs()+0x8a
>>>>> V  [libjvm.so+0xa047c6]  JvmtiClassFileLoadHookPoster::post()+0x18
>>>>> V  [libjvm.so+0x9fb6e1]
>>>>> JvmtiExport::post_class_file_load_hook(Symbol*, Handle, Handle,
>>>>> unsigned char**, unsigned char**, JvmtiCachedClassFileData**)+0x85
>>>>> V  [libjvm.so+0x5cd17d]  ClassFileParser::parseClassFile(Symbol*,
>>>>> ClassLoaderData*, Handle, KlassHandle, GrowableArray<Handle>*,
>>>>> TempNewSymbol&, bool, Thread*)+0x2af
>>>>> V  [libjvm.so+0x5dd441]  ClassFileParser::parseClassFile(Symbol*,
>>>>> ClassLoaderData*, Handle, TempNewSymbol&, bool, Thread*)+0x95
>>>>> V  [libjvm.so+0x5daf03]  ClassLoader::load_classfile(Symbol*,
>>>>> Thread*)+0x2ed
>>>>> V  [libjvm.so+0xce1cc4]
>>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x118
>>>>> V  [libjvm.so+0xce00a8]
>>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>>> Handle, Thread*)+0x824
>>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>>> Handle, Handle, Thread*)+0x26d
>>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>>> Handle, Handle, bool, Thread*)+0x39
>>>>> V  [libjvm.so+0x690fbc]
>>>>> ConstantPool::klass_at_impl(constantPoolHandle, int, Thread*)+0x3cc
>>>>> V  [libjvm.so+0x5398cb]  ConstantPool::klass_at(int, Thread*)+0x55
>>>>> V  [libjvm.so+0x8b1f3c]  InterpreterRuntime::_new(JavaThread*,
>>>>> ConstantPool*, int)+0x14a
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.checkResource(Ljava/lang/String;ZLjava/util/jar/JarEntry;)Lsun/misc/Resource;+42
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath$JarLoader.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+54
>>>>> 
>>>>> j
>>>>> sun.misc.URLClassPath.getResource(Ljava/lang/String;Z)Lsun/misc/Resource;+53
>>>>> 
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Class;+26
>>>>> j  java.net.URLClassLoader$1.run()Ljava/lang/Object;+1
>>>>> v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>>> JavaCallArguments*, Thread*)+0x7d
>>>>> V  [libjvm.so+0x972a80]  JVM_DoPrivileged+0x63d
>>>>> j
>>>>> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;+0
>>>>> 
>>>>> j
>>>>> java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class;+13
>>>>> j
>>>>> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+70
>>>>> j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
>>>>> v  ~StubRoutines::call_stub
>>>>> V  [libjvm.so+0x8c3060]  JavaCalls::call_helper(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x6b2
>>>>> V  [libjvm.so+0xba06bc]  os::os_exception_wrapper(void (*)(JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*), JavaValue*,
>>>>> methodHandle*, JavaCallArguments*, Thread*)+0x3a
>>>>> V  [libjvm.so+0x8c29a7]  JavaCalls::call(JavaValue*, methodHandle,
>>>>> JavaCallArguments*, Thread*)+0x7d
>>>>> V  [libjvm.so+0x8c1ec7]  JavaCalls::call_virtual(JavaValue*,
>>>>> KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x1cb
>>>>> V  [libjvm.so+0x8c2086]  JavaCalls::call_virtual(JavaValue*, Handle,
>>>>> KlassHandle, Symbol*, Symbol*, Handle, Thread*)+0xb0
>>>>> V  [libjvm.so+0xce2096]
>>>>> SystemDictionary::load_instance_class(Symbol*, Handle, Thread*)+0x4ea
>>>>> V  [libjvm.so+0xce00a8]
>>>>> SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle,
>>>>> Handle, Thread*)+0x824
>>>>> V  [libjvm.so+0xcde9e5]  SystemDictionary::resolve_or_null(Symbol*,
>>>>> Handle, Handle, Thread*)+0x26d
>>>>> V  [libjvm.so+0xcde435]  SystemDictionary::resolve_or_fail(Symbol*,
>>>>> Handle, Handle, bool, Thread*)+0x39
>>>>> V  [libjvm.so+0x98c89e]  find_class_from_class_loader(JNIEnv_*,
>>>>> Symbol*, unsigned char, Handle, Handle, unsigned char, Thread*)+0x49
>>>>> V  [libjvm.so+0x96f681]  JVM_FindClassFromCaller+0x39d
>>>>> C  [libjava.so+0xdfd0]  Java_java_lang_Class_forName0+0x130
>>>>> j
>>>>> java.lang.Class.forName0(Ljava/lang/String;ZLjava/lang/ClassLoader;Ljava/lang/Class;)Ljava/lang/Class;+0
>>>>> 
>>>>> j
>>>>> java.lang.Class.forName(Ljava/lang/String;ZLjava/lang/ClassLoader;)Ljava/lang/Class;+49
>>>>> 
>>>>> j  ParallelTransformerLoaderApp.loadClasses(I)V+25
>>>>> ...<more frames>...
>>>>> 
>>>>> On Oct 27, 2014, at 2:00 PM, serguei.spitsyn at oracle.com wrote:
>>>>> 
>>>>>> Ok.
>>>>>> 
>>>>>> Thanks, Dan!
>>>>>> Serguei
>>>>>> 
>>>>>> 
>>>>>> On 10/27/14 7:05 AM, Daniel D. Daugherty wrote:
>>>>>>>> The test case was added by Dan.
>>>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>>>> (added Dan to the to-list)
>>>>>>> Here's the changeset that added the test:
>>>>>>> 
>>>>>>> $ hg log -v -r bca8bf23ac59
>>>>>>> test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>>>> changeset:   132:bca8bf23ac59
>>>>>>> user:        dcubed
>>>>>>> date:        Mon Mar 24 15:05:09 2008 -0700
>>>>>>> files: test/java/lang/instrument/ParallelTransformerLoader.sh
>>>>>>> test/java/lang/instrument/ParallelTransformerLoaderAgent.java
>>>>>>> test/java/lang/instrument/ParallelTransformerLoaderApp.java
>>>>>>> test/java/lang/instrument/TestClass1.java
>>>>>>> test/java/lang/instrument/TestClass2.java
>>>>>>> test/java/lang/instrument/TestClass3.java
>>>>>>> description:
>>>>>>> 5088398: 3/2 java.lang.instrument TCK test deadlock (test11)
>>>>>>> Summary: Add regression test for single-threaded bootstrap classloader.
>>>>>>> Reviewed-by: sspitsyn
>>>>>>> 
>>>>>>> 
>>>>>>> Based on my e-mail archive for this bug and from the bug report itself,
>>>>>>> it looks like we got this test from Wily Labs. The original bug was a
>>>>>>> deadlock that stopped being reproducible after:
>>>>>>> 
>>>>>>> Karen fixed the bootstrap class loader to work in parallel via:
>>>>>>> 
>>>>>>>    4997893 4/5 Investigate allowing bootstrap loader to work in
>>>>>>> parallel
>>>>>>> 
>>>>>>> with that fix in place the deadlock no longer reproduces.
>>>>>>> I'm planning to use this bug as the vehicle for getting
>>>>>>> the test program into the INSTRUMENT_REGRESSION test suite.
>>>>>>> 
>>>>>>> *** (#2 of 2): 2008-02-29 18:20:17 GMT+00:00 daniel.daugherty at sun.com
>>>>>>> 
>>>>>>> 
>>>>>>> A careful reading of JDK-5088398 might reveal the intentions of this
>>>>>>> test...
>>>>>>> 
>>>>>>> Dan
>>>>>>> 
>>>>>>> 
>>>>>>> On 10/24/14 5:57 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Yumin,
>>>>>>>> 
>>>>>>>> On 10/24/14 4:08 PM, Yumin Qi wrote:
>>>>>>>>> Serguei,
>>>>>>>>> 
>>>>>>>>>  Thanks for your comments.
>>>>>>>>>  This failure is intermittent, but it can now be reproduced with 8/9.
>>>>>>>>>  TestClass1 is loaded in the main thread while TestClass2 is loaded
>>>>>>>>> in TestThread in parallel. Both loads call transform, since
>>>>>>>>> TestClass[1-3] are loaded via the agent. While loading TestClass2,
>>>>>>>>> transform also loads TestClass3 in TestThread.
>>>>>>>>>  Note the for loop in the main thread:
>>>>>>>>> 
>>>>>>>>>                  for (int i = 0; i < kNumIterations; i++)
>>>>>>>>>                {
>>>>>>>>>                        // load some classes from multiple threads
>>>>>>>>> (this thread and one other)
>>>>>>>>>                        Thread testThread = new TestThread(2);
>>>>>>>>>                        testThread.start();
>>>>>>>>>                        loadClasses(1);
>>>>>>>>> 
>>>>>>>>>                        // log that it completed and reset for the
>>>>>>>>> next iteration
>>>>>>>>>                        testThread.join();
>>>>>>>>>                        System.out.print(".");
>>>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader();
>>>>>>>>>                }
>>>>>>>>> 
>>>>>>>>> The loader is renewed only after testThread.join(), so both threads
>>>>>>>>> are using the exact same class loader.
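
The point about loader renewal can be sketched as follows. This is an
illustration only: SharedLoaderSketch, sLoader and runIteration are
hypothetical names, and an empty URLClassLoader stands in for the agent's
loader. It records which loader each thread observes during one iteration
and which loader is current after the renewal.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical model of one iteration of the test's main loop.
class SharedLoaderSketch {
    // Stands in for the agent's current class loader.
    private static volatile ClassLoader sLoader = newLoader();

    static ClassLoader newLoader() {
        return new URLClassLoader(new URL[0]);
    }

    // Returns {loader seen by main, loader seen by TestThread, loader after renewal}.
    static ClassLoader[] runIteration() {
        final ClassLoader[] seen = new ClassLoader[3];
        Thread testThread = new Thread(() -> seen[1] = sLoader, "TestThread");
        testThread.start();
        seen[0] = sLoader;          // main thread reads the loader concurrently
        try {
            testThread.join();      // renewal happens only after the join
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        sLoader = newLoader();      // models generateNewClassLoader()
        seen[2] = sLoader;
        return seen;
    }
}
```

Because the renewal is ordered after join(), the first two entries always
refer to the same loader and the third is a fresh one.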
>>>>>>>> You are right, thanks.
>>>>>>>> It means that all three classes (TestClass1, TestClass2 and TestClass3)
>>>>>>>> are loaded by the same class loader in each iteration.
>>>>>>>> 
>>>>>>>> However, I see more cases when the TestClass3 gets loaded.
>>>>>>>> It happens in a CFLH event when any other class (not TestClass*) in
>>>>>>>> the system is loaded.
>>>>>>>> The class loading thread can be any thread, not only "main" or
>>>>>>>> "TestThread".
>>>>>>>> I suspect this test case mostly targets class loading that happens
>>>>>>>> on other threads.
>>>>>>>> It is because of the lines:
>>>>>>>>                        // In 160_03 and older, transform() is called
>>>>>>>>                        // with the "system_loader_lock" held and that
>>>>>>>>                        // prevents the bootstrap class loader from
>>>>>>>>                        // running in parallel. If we add a slight
>>>>>>>> sleep
>>>>>>>>                        // delay here when the transform() call is not
>>>>>>>>                        // main or TestThread, then the deadlock in
>>>>>>>>                        // 160_03 and older is much more reproducible.
>>>>>>>>                        if (!tName.equals("main") &&
>>>>>>>> !tName.equals("TestThread")) {
>>>>>>>>                            System.out.println("Thread '" + tName +
>>>>>>>>                                "' has called transform()");
>>>>>>>>                            try {
>>>>>>>>                                Thread.sleep(500);
>>>>>>>>                            } catch (InterruptedException ie) {
>>>>>>>>                            }
>>>>>>>>                        }
>>>>>>>> 
>>>>>>>> What about the following?
>>>>>>>> 
>>>>>>>> In the ParallelTransformerLoaderAgent.java  make this change:
>>>>>>>>              if (!tName.equals("main"))
>>>>>>>>                  => if (tName.equals("TestThread"))
>>>>>>>> 
>>>>>>>> Does the updated test still fail?
>>>>>>>> 
>>>>>>>>> After a new class loader is created, the next iteration uses it.
>>>>>>>>> This is why we so often see JarLoader$2 being resolved in the
>>>>>>>>> stack trace.
>>>>>>>>> 
>>>>>>>>> I do not quite understand the test case either. Loading TestClass3
>>>>>>>>> inside transform with the same class loader calls transform again
>>>>>>>>> and forms a cycle. If TestClass2 is seen as already loaded, the
>>>>>>>>> cycle ends, but that is still a risk.
>>>>>>>> In fact, I don't like that the test loads the class TestClass3 at
>>>>>>>> the TestClass3 CFLH event.
>>>>>>>> However, it is interesting to know why we did not see (is it the
>>>>>>>> case?) this issue before.
>>>>>>>> Also, it is interesting why the test stops failing with your fix
>>>>>>>> (replacing loader with SystemClassLoader).
>>>>>>>> 
>>>>>>>> The test case was added by Dan.
>>>>>>>> We may want to ask him to clarify the test case purpose.
>>>>>>>> (added Dan to the to-list)
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>> 
>>>>>>>>> On 10/24/2014 1:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Yumin,
>>>>>>>>>> 
>>>>>>>>>> Below is some analysis to make sure I understand the test
>>>>>>>>>> scenario correctly.
>>>>>>>>>> 
>>>>>>>>>> The ParallelTransformerLoaderApp.main() executes a 1000 iteration
>>>>>>>>>> loop.
>>>>>>>>>> At each iteration it does:
>>>>>>>>>>  - creates and starts a new TestThread
>>>>>>>>>>  - loads TestClass1 with the current class loader:
>>>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader()
>>>>>>>>>>  - changes the current class loader with new one:
>>>>>>>>>> ParallelTransformerLoaderAgent.generateNewClassLoader()
>>>>>>>>>> 
>>>>>>>>>> The TestThread loads the TestClass2 concurrently with the main
>>>>>>>>>> thread.
>>>>>>>>>> 
>>>>>>>>>> At the CFLH events, the ParallelTransformerLoaderAgent does the
>>>>>>>>>> class retransformation.
>>>>>>>>>> If the thread loading the class is not "main", it loads the class
>>>>>>>>>> TestClass3
>>>>>>>>>> with the current class loader
>>>>>>>>>> ParallelTransformerLoaderAgent.getClassLoader().
>>>>>>>>>> 
>>>>>>>>>> Sometimes, the TestClass2 and TestClass3 are loaded by the same
>>>>>>>>>> class loader recursively.
>>>>>>>>>> It happens if the class loader has not been changed between
>>>>>>>>>> loading TestClass2 and TestClass3 classes.
>>>>>>>>>> 
>>>>>>>>>> I'm not convinced yet the test is incorrect.
>>>>>>>>>> And it is not clear why we get a ClassCircularityError.
>>>>>>>>>> 
>>>>>>>>>> Please, let me know if the above understanding is wrong.
>>>>>>>>>> I also see the reply from David and share his concerns.
>>>>>>>>>> 
>>>>>>>>>> It is not clear if this failure is a regression.
>>>>>>>>>> Did we observe this issue before?
>>>>>>>>>> If not, then when and why did this failure start to appear?
>>>>>>>>>> 
>>>>>>>>>> Unfortunately, it is impossible to look at the test run history
>>>>>>>>>> at the moment.
>>>>>>>>>> Aurora is under maintenance.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>> 
>>>>>>>>>> On 10/13/14 3:58 PM, Yumin Qi wrote:
>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>>>>>>>>>>> 
>>>>>>>>>>> The bug is marked confidential, so the webrev is posted internally.
>>>>>>>>>>> 
>>>>>>>>>>> Problem: The test case tries to load a class from the same jar
>>>>>>>>>>> via the agent while another class from that jar is being loaded
>>>>>>>>>>> by the same class loader in the same thread. The call happens in
>>>>>>>>>>> transform, which is a rare case: loading one class in the middle
>>>>>>>>>>> of loading another. The result is a ClassCircularityError. While
>>>>>>>>>>> the first class is loading, the VM puts JarLoader$2 in the
>>>>>>>>>>> placeholder table and then starts defineClass, which calls
>>>>>>>>>>> transform. That begins loading the second class, which follows
>>>>>>>>>>> the same path, tries to load JarLoader$2 first, and finds it
>>>>>>>>>>> already in the placeholder table, so a ClassCircularityError is
>>>>>>>>>>> thrown.
>>>>>>>>>>> Fix: The test case should not load a class with the same class
>>>>>>>>>>> loader, in the same thread, from the same jar inside the
>>>>>>>>>>> 'transform' method. I changed it to load with the system class
>>>>>>>>>>> loader, and we expect to see a ClassNotFoundException. See the
>>>>>>>>>>> bug comments for details.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
> 
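
The placeholder-table sequence described in the original RFR can be
modelled with a toy sketch. This is not HotSpot code: PlaceholderSketch and
its methods are hypothetical, the table is keyed by class name only (the VM
also keys on the loader), and it models a single thread. It shows only why a
re-entrant load of a class that is still in progress ends in a
ClassCircularityError.

```java
import java.util.HashSet;
import java.util.Set;

// Toy, single-threaded model; HotSpot's real placeholder table is far more involved.
class PlaceholderSketch {
    // Names of classes whose loading is currently in progress.
    private final Set<String> inProgress = new HashSet<>();

    // Loads `name`. `hook` models the transformer callback that runs while the
    // class is still being defined; it may be null.
    void load(String name, Runnable hook) {
        if (!inProgress.add(name)) {
            // The same class is already being loaded further up the stack.
            throw new ClassCircularityError(name);
        }
        try {
            if (hook != null) {
                hook.run();
            }
        } finally {
            inProgress.remove(name);
        }
    }

    // Reproduces the reported pattern: the hook re-enters the same load.
    static boolean reentrantLoadFails() {
        PlaceholderSketch table = new PlaceholderSketch();
        String jarLoader2 = "sun/misc/URLClassPath$JarLoader$2";
        try {
            table.load(jarLoader2, () -> table.load(jarLoader2, null));
            return false;
        } catch (ClassCircularityError expected) {
            return true;
        }
    }
}
```

Sequential loads of the same name, and nested loads of different names, go
through in this model. Only the nested load of a name that is still in
progress fails.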


From coleen.phillimore at oracle.com  Thu Nov 13 15:17:50 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 13 Nov 2014 10:17:50 -0500
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5464B7F5.9060000@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>	<5463EF88.1050100@oracle.com>	<54641E27.4090303@oracle.com>	<546427DB.3070806@oracle.com>	<5464444A.7030601@oracle.com>	<5464610A.20901@oracle.com>
	<54647017.8000704@oracle.com>	<54647200.5000103@oracle.com>
	<546475E8.6000908@oracle.com> <5464B7F5.9060000@oracle.com>
Message-ID: <5464CB9E.60107@oracle.com>


On 11/13/14, 8:53 AM, Daniel D. Daugherty wrote:
> > Happy to let others weigh in.
>
> Please use 8064749 to remove the flag; David H is correct that all the
> right info is there.

Agreed.  There is also no point in having an experimental flag that is 
broken.

This is not worth backporting to 8u.

Coleen
>
> Dan
>
>
> On 11/13/14 2:12 AM, David Holmes wrote:
>> On 13/11/2014 6:55 PM, Aleksey Shipilev wrote:
>>> On 13.11.2014 11:47, David Holmes wrote:
>>>> On 13/11/2014 5:43 PM, Aleksey Shipilev wrote:
>>>>> On 13.11.2014 08:40, David Holmes wrote:
>>>>>> There is some history in JDK-4974572 (which is non-public I'm 
>>>>>> afraid).
>>>>>> To all intents and purposes the flag at that point was used to 
>>>>>> enable
>>>>>> testing of workarounds if problems were suspected in the "new"
>>>>>> safepointing code. I think it has outlived its usefulness by a 
>>>>>> few major
>>>>>> releases so I'm happy to see it go.
>>>>>
>>>>> Filed:
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8064777
>>>>
>>>> You actually filed 8064776 first :)
>>>
>>> O_o. The submit timestamps are the same. JIRA is funky today, huh.
>>>
>>>> But neither is needed as removal can be the solution of 8064749.
>>>
>>> I thought we are better off tracking this separately, and then close
>>> all/any pending bugs about UseCompilerSafepoints as WNF citing 8064776.
>>> Still want to do this in 8064749?
>>
>> It is what I did in 8062307 for the TraceThreadEvents flag. 8064749 
>> contains all the pertinent comments.
>>
>>> Also, I wonder if we want to demote the flag to experimental in 8u. 
>>> This
>>> does not sound like a backport of 8064749 at all, but rather a separate
>>> change.
>>
>> Any change requires CCC. I don't see any point in making the flag 
>> experimental as it doesn't really provide any "experimentation".
>>
>> Happy to let others weigh in.
>>
>> Cheers,
>> David
>>
>>
>>
>>> -Aleksey.
>>>
>


From coleen.phillimore at oracle.com  Thu Nov 13 15:25:18 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 13 Nov 2014 10:25:18 -0500
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464A6EC.6090804@oracle.com>
References: <5464A6EC.6090804@oracle.com>
Message-ID: <5464CD5E.4060800@oracle.com>


Yes, you have to file a CCC first.
Coleen

On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
> Hi,
>
> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>   https://bugs.openjdk.java.net/browse/JDK-8064749
>   http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>
> Do I understand it right we need a CCC to remove the product flag?
>
> Testing: JPRT, vm.quick.testlist
>
> Thanks,
> -Aleksey.
>


From aleksey.shipilev at oracle.com  Thu Nov 13 15:43:36 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 18:43:36 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464C5D2.2050007@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464C5D2.2050007@oracle.com>
Message-ID: <5464D1A8.7050901@oracle.com>

Thanks for the review, Dan!

Updated webrev:
 http://cr.openjdk.java.net/~shade/8064749/webrev.02/

On 11/13/2014 05:53 PM, Daniel D. Daugherty wrote:
> On 11/13/14 5:41 AM, Aleksey Shipilev wrote:
>> Hi,
>>
>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>   http://cr.openjdk.java.net/~shade/8064749/webrev.01/
> 
> src/share/vm/runtime/arguments.cpp
>     Not your problem, but that list is formatted quite
>     inconsistently.
> 
>     It seems like new entries should be added at the bottom
>     so your line 309 should be between these two lines:
> 
>     line 313: #endif // ZERO
>     line 314:   { NULL, JDK_Version(0), JDK_Version(0) }

Moved.

>> Do I understand it right we need a CCC to remove the product flag?
> 
> Yes. Since this was a product flag, it needs a CCC to remove it.

Filed.

-Aleksey.


From aleksey.shipilev at oracle.com  Thu Nov 13 15:43:58 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 18:43:58 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464CD5E.4060800@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
Message-ID: <5464D1BE.4090204@oracle.com>

Got it, filed.

Any problems with the code change?
 http://cr.openjdk.java.net/~shade/8064749/webrev.02/

-Aleksey.

On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
> 
> Yes, you have to file a CCC first.
> Coleen
> 
> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>> Hi,
>>
>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>   http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>
>> Do I understand it right we need a CCC to remove the product flag?
>>
>> Testing: JPRT, vm.quick.testlist
>>
>> Thanks,
>> -Aleksey.
>>
> 


From coleen.phillimore at oracle.com  Thu Nov 13 16:02:40 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 13 Nov 2014 11:02:40 -0500
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464D1BE.4090204@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com>
Message-ID: <5464D620.80008@oracle.com>


On 11/13/14, 10:43 AM, Aleksey Shipilev wrote:
> Got it, filed.
>
> Any problems with the code change?
>   http://cr.openjdk.java.net/~shade/8064749/webrev.02/

No, the code change looks fine.

Coleen
>
> -Aleksey.
>
> On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
>> Yes, you have to file a CCC first.
>> Coleen
>>
>> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>    http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>>
>>> Do I understand it right we need a CCC to remove the product flag?
>>>
>>> Testing: JPRT, vm.quick.testlist
>>>
>>> Thanks,
>>> -Aleksey.
>>>
>


From aleksey.shipilev at oracle.com  Thu Nov 13 16:14:17 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 19:14:17 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464D620.80008@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com> <5464D620.80008@oracle.com>
Message-ID: <5464D8D9.70606@oracle.com>

Thanks, Coleen!

Changeset:
 http://cr.openjdk.java.net/~shade/8064749/8064749.changeset

(Patiently waiting for CCC to be approved).

-Aleksey.

On 11/13/2014 07:02 PM, Coleen Phillimore wrote:
> 
> On 11/13/14, 10:43 AM, Aleksey Shipilev wrote:
>> Got it, filed.
>>
>> Any problems with the code change?
>>   http://cr.openjdk.java.net/~shade/8064749/webrev.02/
> 
> No, the code change looks fine.
> 
> Coleen
>>
>> -Aleksey.
>>
>> On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
>>> Yes, you have to file a CCC first.
>>> Coleen
>>>
>>> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>>>> Hi,
>>>>
>>>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>>    http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>>>
>>>> Do I understand it right we need a CCC to remove the product flag?
>>>>
>>>> Testing: JPRT, vm.quick.testlist
>>>>
>>>> Thanks,
>>>> -Aleksey.
>>>>
>>
> 


From gunter.haug at sap.com  Thu Nov 13 16:39:48 2014
From: gunter.haug at sap.com (Haug, Gunter)
Date: Thu, 13 Nov 2014 17:39:48 +0100
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <546470BD.9050303@oracle.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
	<5463149A.6020506@oracle.com> <54637A9A.9040108@sap.com>
	<546470BD.9050303@oracle.com>
Message-ID: <5464DED4.9040909@sap.com>


On 13.11.2014 09:50, David Holmes wrote:
> On 13/11/2014 1:19 AM, Haug, Gunter wrote:
>>
>> On 12.11.2014 09:04, David Holmes wrote:
>>> Hi Gunter,
>>>
>>> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>>>> Hi All,
>>>>
>>>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs
>>>> improvement)' makes use of getrusage() to retrieve accurate
>>>> per-thread data on resource usage. We can use exactly the same code
>>>> on AIX to achieve this.
>>>>
>>>> Please review the following change:
>>>>
>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8064471
>>>
>>> I have a couple of comments on this code which presumably also apply
>>> to the original :(
>> Yes, they apply to the original as well, see below.
>>>
>>> First this comment is no longer applicable (actually it was never
>>> applicable to AIX!):
>>>
>>>   // For now, we say that linux does not support vtime. I have no idea
>>>   // whether it can actually be made to (DLD, 9/13/05).
>>>
>> You're right. I will remove it.
>>> Second this calculation seems wrong:
>>>
>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 *
>>> 1000);
>>>
>>> To me this performs integer division (i.e. truncation) then converts
>>> the resulting integer to a double. I would expect to see additional
>>> parentheses (even if not needed, for clarity):
>>>
>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 *
>>> 1000);
>>>
>>> or more simply divide by a floating-point value:
>>>
>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>
>>> and you don't need two double casts regardless as the expression will
>>> be of type double as soon as there is one operand of type double. So
>>> that should reduce to:
>>>
>>> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>
>> OK. Do you want that we also change the Linux version like you proposed?
>
> I'll leave it up to you. If you leave this as AIX only then it tests 
> the new process :) There can be a follow up cleanup bug for linux.

Hi David,

I don't think it's worth the effort to make two separate changes for Linux
and AIX, so I fixed Linux as well. Please find the new webrev below.
There will probably be more opportunities to test the new process in the 
future.

http://cr.openjdk.java.net/~simonis/webrevs/8064471.v2/ 


Now we need a sponsor, as it is not aix only anymore.
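
For reference, the arithmetic under discussion can be sketched in Java,
whose cast precedence and numeric promotion rules match C++ for this
expression. This is only an illustration, not the HotSpot code: the class
name, the method names and the long parameters standing in for the tv_sec
and tv_usec fields of struct rusage are assumptions. In both languages a
cast binds more tightly than division, so the form with the two casts
should already divide in floating point; the fraction is lost only when
the division itself is parenthesized ahead of the cast.

```java
public class CpuTimeSeconds {
    // Form with two casts: the cast applies to the microsecond sum before
    // the division, so the division is carried out in double.
    static double withCasts(long userSec, long sysSec, long userUsec, long sysUsec) {
        return (double) (userSec + sysSec)
             + (double) (userUsec + sysUsec) / (1000 * 1000);
    }

    // Simplified form: the floating-point divisor promotes the division,
    // and the resulting double promotes the final addition.
    static double withDoubleDivisor(long userSec, long sysSec, long userUsec, long sysUsec) {
        return userSec + sysSec + (userUsec + sysUsec) / (1000.0 * 1000);
    }

    // Counter-example: integer division happens first, then the cast,
    // so the sub-second part is dropped.
    static double truncating(long userSec, long sysSec, long userUsec, long sysUsec) {
        return (double) (userSec + sysSec)
             + (double) ((userUsec + sysUsec) / (1000 * 1000));
    }

    public static void main(String[] args) {
        // 1 s + 2 s of whole seconds, 250000 us + 500000 us of microseconds.
        System.out.println(withCasts(1, 2, 250_000, 500_000));         // prints 3.75
        System.out.println(withDoubleDivisor(1, 2, 250_000, 500_000)); // prints 3.75
        System.out.println(truncating(1, 2, 250_000, 500_000));        // prints 3.0
    }
}
```

The simplified form is arguably the easiest to read, since a reader does
not have to recall the precedence of the cast.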

Thanks,
Gunter


>
> Thanks,
> David
>
>> Thanks,
>> Gunter
>>
>>> Cheers,
>>> David
>>>
>>>> Thanks,
>>>> Gunter
>>>>
>>


From tom.deneau at amd.com  Thu Nov 13 16:43:25 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 13 Nov 2014 16:43:25 +0000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <5464444A.7030601@oracle.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>
	<5463BF71.4080804@oracle.com> <5463EF88.1050100@oracle.com>
	<54641E27.4090303@oracle.com> <546427DB.3070806@oracle.com>
	<5464444A.7030601@oracle.com>
Message-ID: <BC97738F8E7C8742BABED7F06FB9DF915550688F@SATLEXDAG01.amd.com>

Just as an aside, I got involved in this because I wanted to see the effect of the CompilerSafepoints poll instruction when comparing performance of small JMH benchmarks across a couple of different architectures. 

But I'm fine with getting rid of the flag.

-- Tom

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com] 
Sent: Wednesday, November 12, 2014 11:40 PM
To: Vladimir Kozlov; hotspot-runtime-dev at openjdk.java.net
Cc: Deneau, Tom
Subject: Re: hang when using -XX:-UseCompilerSafepoints

Hi Vladimir,

On 13/11/2014 1:39 PM, Vladimir Kozlov wrote:
> I agree that the workaround is -Xint. But if we disable compilation with
> -UseCompilerSafepoints, the flag becomes useless. You can get the same
> result with just -Xint.
>
> The history shows that it was added at the very beginning of Hotspot
> development, at the day one. I can only speculate that it was used to
> find performance effects of safepoints in compiled code. It could be
> the case that we removed safepoints from Counted loops as result of that
> investigation. I think it was never intended to be used in production.
>
> Although we can fix compilers to generate a runtime call which does
> safepoint when -UseCompilerSafepoints is specified, it will be useless
> work, I think.

There is some history in JDK-4974572 (which is non-public I'm afraid). 
To all intents and purposes the flag at that point was used to enable 
testing of workarounds if problems were suspected in the "new" 
safepointing code. I think it has outlived its usefulness by a few major 
releases so I'm happy to see it go.

Cheers,
David

> thanks,
> Vladimir
>
> On 11/12/14 6:57 PM, David Holmes wrote:
>> On 13/11/2014 9:38 AM, Vladimir Kozlov wrote:
>>> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>>>> Hi,
>>>>
>>>> Still not sure if this is a runtime bug: stripping safepoints from the
>>>> non-counted loop seems to be a recipe for disaster.
>>>
>>> This flag does not affect compiled code - so it is not compiler issue.
>>
>> Well, it disables the mechanism that the compiler inserts for checking
>> if a safepoint has been requested. As I've added to the bug report,
>> disabling compiler safepoints should go hand-in-hand with disabling the
>> compilers (ie run with -Xint) - otherwise you have to know that the
>> compiled code will eventually hit a non-compiler safepoint check.
>>
>>> It is only used in runtime/safepoint.cpp and it guards the code which
>>> protects a polling page.
>>>
>>> There are many bugs which shows current problem. For example:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6873333
>>>
>>> I would say that we have to remove it or at least make it experimental
>>> flag if we want to do experiments with it.
>>>
>>> We definitely should not allow to use it in production!
>>
>> If we assume there is a reason it was made a product flag then the
>> correct fix in my opinion would be to fall back to interpreter-only mode
>> when this flag is turned off.
>>
>> If we don't make that assumption then we could still tie it to
>> interpreter-only mode, but we definitely should not make it configurable
>> in product mode without some effort.
>>
>> Or if we can't ascertain a valid reason for ever wanting to do this, we
>> could simply delete the flag altogether. :)
>>
>> Cheers,
>> David
>>
>>> Regards,
>>> Vladimir
>>>
>>>>
>>>> Anyhow, I think it deserves a simpler example. Submitted the bug and
>>>> attached a simple test there:
>>>>   https://bugs.openjdk.java.net/browse/JDK-8064749
>>>>
>>>> Thanks,
>>>> -Aleksey.
>>>>
>>>> On 12.11.2014 19:52, Deneau, Tom wrote:
>>>>> Hi all --
>>>>>
>>>>> Forwarding a thread which came about on the jmh-dev mail list, as
>>>>> recommended by Aleksey Shipilev (see below).  The JMH framework has a
>>>>> timing control thread which sleeps for a certain period, then sets a
>>>>> volatile isDone variable.  Meanwhile, the benchmark thread loops
>>>>> doing its benchmark code and also checking the isDone field.   A hang
>>>>> occurs if -XX:-UseCompilerSafepoints is used.
>>>>>
>>>>> The original issue can be reproduced by the following steps
>>>>>
>>>>>     hg clone http://hg.openjdk.java.net/code-tools/jmh
>>>>>     cd jmh
>>>>>     mvn clean install -DskipTests=true
>>>>>     cd jmh-samples
>>>>>     java  -server -XX:-UseCompilerSafepoints -jar
>>>>> target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>>>>
>>>>> -- Tom Deneau
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>>>>> Sent: Wednesday, November 12, 2014 6:09 AM
>>>>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>>>>> Subject: Re: using -XX:-UseCompilerSafepoints
>>>>>
>>>>> Hi Tom,
>>>>>
>>>>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>>>>> It looks like a thread that calls Thread.sleep (as the timing control
>>>>>> thread does in the harness) will eventually go thru
>>>>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>>>>> destructor).  So if there is a looping benchmark thread compiled
>>>>>> without Compiler Safepoints, the control thread will be blocked and
>>>>>> will never set the isDone flag.
>>>>>
>>>>> So, you are saying that without the safepoint in the while(!isDone)
>>>>> loop in workload, control thread and workload thread will never
>>>>> rendezvous on safepoint? I believe this is a bug with
>>>>> -XX:-CompilerSafepoints, because the comment in safepoint.cpp calls
>>>>> this
>>>>> out specifically for VMThread vs. Mutator threads:
>>>>>
>>>>>   // In a pathological scenario such as that described in CR6415670
>>>>>   // the VMthread may sleep just before the mutator(s) become safe.
>>>>>   // In that case the mutators will be stalled waiting for the
>>>>> safepoint
>>>>>   // to complete and the the VMthread will be sleeping, waiting for
>>>>> the
>>>>>   // mutators to rendezvous. The VMthread will eventually wake up and
>>>>>   // detect that all mutators are safe, at which point we'll again
>>>>> make
>>>>>   // progress.
>>>>>
>>>>> If this is a case, you probably need to report this to runtime guys.
>>>>>
>>>>>> This is probably OK, just need to document that CompilerSafepoints
>>>>>> cannot be turned off.
>>>>>
>>>>> I think it is safe to presume something will go hairy if you are using
>>>>> any special VM flag, therefore I am not inclined to document this.
>>>>>
>>>>> Thanks,
>>>>> -Aleksey.
>>>>>
>>>>
>>>>

From vladimir.kozlov at oracle.com  Thu Nov 13 16:45:02 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 13 Nov 2014 08:45:02 -0800
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464D1BE.4090204@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com>
Message-ID: <5464E00E.4070503@oracle.com>

Good.

Thanks,
Vladimir

On 11/13/14 7:43 AM, Aleksey Shipilev wrote:
> Got it, filed.
>
> Any problems with the code change?
>   http://cr.openjdk.java.net/~shade/8064749/webrev.02/
>
> -Aleksey.
>
> On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
>>
>> Yes, you have to file a CCC first.
>> Coleen
>>
>> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>    http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>>
>>> Do I understand it right we need a CCC to remove the product flag?
>>>
>>> Testing: JPRT, vm.quick.testlist
>>>
>>> Thanks,
>>> -Aleksey.
>>>
>>
>
>

From aleksey.shipilev at oracle.com  Thu Nov 13 16:48:30 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 19:48:30 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464E00E.4070503@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com> <5464E00E.4070503@oracle.com>
Message-ID: <5464E0DE.6030105@oracle.com>

Thanks!

Added you as the reviewer to the changeset.

-Aleksey.

On 11/13/2014 07:45 PM, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 11/13/14 7:43 AM, Aleksey Shipilev wrote:
>> Got it, filed.
>>
>> Any problems with the code change?
>>   http://cr.openjdk.java.net/~shade/8064749/webrev.02/
>>
>> -Aleksey.
>>
>> On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
>>>
>>> Yes, you have to file a CCC first.
>>> Coleen
>>>
>>> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>>>> Hi,
>>>>
>>>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>>    http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>>>
>>>> Do I understand it right we need a CCC to remove the product flag?
>>>>
>>>> Testing: JPRT, vm.quick.testlist
>>>>
>>>> Thanks,
>>>> -Aleksey.
>>>>
>>>
>>
>>


From daniel.daugherty at oracle.com  Thu Nov 13 18:18:40 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 13 Nov 2014 11:18:40 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5464C3E6.5000309@oracle.com>
References: <546151A9.1080100@oracle.com> <5464C3E6.5000309@oracle.com>
Message-ID: <5464F600.7040601@oracle.com>

Magnus,

Thanks for the review!

Replies embedded below...

On 11/13/14 7:44 AM, Magnus Ihse Bursie wrote:
> On 2014-11-11 01:00, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>> Yes, it is a small fix, but it is in Makefiles so feel free to
>> run screaming from the room... :-)  On the plus side the fix does
>> delete two work around source files (Coleen would say that's a
>> Good Thing (TM)!)
>
> ... but you're only deleting the make files?

Good catch! Looks like when I resurrected this fix from my JDK8
queue I missed a couple of deletes.


> src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and 
> src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c could 
> be deleted as well, right?

Yes, these should be deleted and I'll do that in this fix.
Since these are two deletes of files that can no longer be
built anyway, I presume I don't need to send out another
webrev...


>
> Good idea for the fix, anyway. I opened 
> https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a 
> similar solution in configure.

Sounds good to me.

Dan


>
> /Magnus


From david.holmes at oracle.com  Thu Nov 13 20:15:55 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 14 Nov 2014 06:15:55 +1000
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5464D1BE.4090204@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com>
Message-ID: <5465117B.1030205@oracle.com>

On 14/11/2014 1:43 AM, Aleksey Shipilev wrote:
> Got it, filed.

Did you fast-track it?

> Any problems with the code change?
>   http://cr.openjdk.java.net/~shade/8064749/webrev.02/

Looks okay. I would have pushed for outright removal rather than the 
"deprecation" mechanism given it is likely an unusable flag. :)

Thanks,
David

> -Aleksey.
>
> On 11/13/2014 06:25 PM, Coleen Phillimore wrote:
>>
>> Yes, you have to file a CCC first.
>> Coleen
>>
>> On 11/13/14, 7:41 AM, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> This is a patch that removes -XX:+-UseCompilerSafepoints from Hotspot:
>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>    http://cr.openjdk.java.net/~shade/8064749/webrev.01/
>>>
>>> Do I understand it right we need a CCC to remove the product flag?
>>>
>>> Testing: JPRT, vm.quick.testlist
>>>
>>> Thanks,
>>> -Aleksey.
>>>
>>
>
>

From aleksey.shipilev at oracle.com  Thu Nov 13 20:27:14 2014
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 13 Nov 2014 23:27:14 +0300
Subject: RFR (S) 8064749: -XX:-UseCompilerSafepoints breaks safepoint
	rendezvous
In-Reply-To: <5465117B.1030205@oracle.com>
References: <5464A6EC.6090804@oracle.com> <5464CD5E.4060800@oracle.com>
	<5464D1BE.4090204@oracle.com> <5465117B.1030205@oracle.com>
Message-ID: <54651422.2000603@oracle.com>

On 13.11.2014 23:15, David Holmes wrote:
> On 14/11/2014 1:43 AM, Aleksey Shipilev wrote:
>> Got it, filed.
> 
> Did you fast-track it?

<experiencing a TPS report cover sheet moment here :)>

Alas, I wasn't aware I needed a "fast-track", and got just to "submit".
Anyhow, this is not a pressing issue, and Coleen volunteered (thanks!)
to push the changeset as soon as CCC is approved.

Added you to the changeset as the reviewer as well.

Thanks,
-Aleksey.


From david.holmes at oracle.com  Fri Nov 14 02:02:02 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 14 Nov 2014 12:02:02 +1000
Subject: hang when using -XX:-UseCompilerSafepoints
In-Reply-To: <BC97738F8E7C8742BABED7F06FB9DF915550688F@SATLEXDAG01.amd.com>
References: <BC97738F8E7C8742BABED7F06FB9DF915550616D@SATLEXDAG01.amd.com>	<5463BF71.4080804@oracle.com>
	<5463EF88.1050100@oracle.com> <54641E27.4090303@oracle.com>
	<546427DB.3070806@oracle.com> <5464444A.7030601@oracle.com>
	<BC97738F8E7C8742BABED7F06FB9DF915550688F@SATLEXDAG01.amd.com>
Message-ID: <5465629A.9010204@oracle.com>

Hi Tom,

On 14/11/2014 2:43 AM, Deneau, Tom wrote:
> Just as an aside, I got involved in this because I wanted to see the effect of the CompilerSafepoints poll instruction when comparing performance of small JMH benchmarks across a couple of different architectures.

Assuming you are interested in the cost of accessing the page, not the 
cost of the trap, you've probably already deduced that the flag simply 
disabled the arming of the page, as opposed to disabling generation of 
the polling instructions - and so would be of no help.
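
The hang scenario in this thread has the following shape. This is a
minimal sketch, not JMH's code: the class name, the method name and the
sleep duration are made up for illustration. On a JVM with the usual
safepoint polling the loop ends once the control thread sets the flag;
the report in this thread is that with -XX:-UseCompilerSafepoints the
compiled loop never reached a safepoint, so the control thread stayed
blocked and the flag was never set.

```java
public class IsDoneLoop {
    // Flag written by the control thread and read by the workload loop.
    static volatile boolean isDone;

    // Starts a control thread that sleeps and then sets isDone, while the
    // calling thread spins on the flag. Returns the number of iterations.
    static long runWorkload(long sleepMillis) {
        isDone = false;
        Thread control = new Thread(() -> {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            isDone = true; // set even if the sleep was interrupted
        }, "control");
        control.start();

        long iterations = 0;
        while (!isDone) {   // non-counted loop: exit depends only on the flag
            iterations++;
        }

        try {
            control.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("iterations: " + runWorkload(100));
    }
}
```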

> But I'm fine with getting rid of the flag.

Thanks for confirming.

David

> -- Tom
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Wednesday, November 12, 2014 11:40 PM
> To: Vladimir Kozlov; hotspot-runtime-dev at openjdk.java.net
> Cc: Deneau, Tom
> Subject: Re: hang when using -XX:-UseCompilerSafepoints
>
> Hi Vladimir,
>
> On 13/11/2014 1:39 PM, Vladimir Kozlov wrote:
>> I agree that the workaround is -Xint. But if we disable compilation with
>> -UseCompilerSafepoints, the flag becomes useless. You can get the same
>> result with just -Xint.
>>
>> The history shows that it was added at the very beginning of Hotspot
>> development, at the day one. I can only speculate that it was used to
>> find performance effects of safepoints in compiled code. It could be
>> the case that we removed safepoints from Counted loops as result of that
>> investigation. I think it was never intended to be used in production.
>>
>> Although we can fix compilers to generate a runtime call which does
>> safepoint when -UseCompilerSafepoints is specified, it will be useless
>> work, I think.
>
> There is some history in JDK-4974572 (which is non-public I'm afraid).
> To all intents and purposes the flag at that point was used to enable
> testing of workarounds if problems were suspected in the "new"
> safepointing code. I think it has outlived its usefulness by a few major
> releases so I'm happy to see it go.
>
> Cheers,
> David
>
>> thanks,
>> Vladimir
>>
>> On 11/12/14 6:57 PM, David Holmes wrote:
>>> On 13/11/2014 9:38 AM, Vladimir Kozlov wrote:
>>>> On 11/12/14 12:13 PM, Aleksey Shipilev wrote:
>>>>> Hi,
>>>>>
>>>>> Still not sure if this is a runtime bug: stripping safepoints from the
>>>>> non-counted loop seems to be a recipe for disaster.
>>>>
>>>> This flag does not affect compiled code - so it is not compiler issue.
>>>
>>> Well, it disables the mechanism that the compiler inserts for checking
>>> if a safepoint has been requested. As I've added to the bug report,
>>> disabling compiler safepoints should go hand-in-hand with disabling the
>>> compilers (ie run with -Xint) - otherwise you have to know that the
>>> compiled code will eventually hit a non-compiler safepoint check.
>>>
>>>> It is only used in runtime/safepoint.cpp and it guards the code which
>>>> protects a polling page.
>>>>
>>>> There are many bugs which shows current problem. For example:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-6873333
>>>>
>>>> I would say that we have to remove it or at least make it experimental
>>>> flag if we want to do experiments with it.
>>>>
>>>> We definitely should not allow to use it in production!
>>>
>>> If we assume there is a reason it was made a product flag then the
>>> correct fix in my opinion would be to fall back to interpreter-only mode
>>> when this flag is turned off.
>>>
>>> If we don't make that assumption then we could still tie it to
>>> interpreter-only mode, but we definitely should not make it configurable
>>> in product mode without some effort.
>>>
>>> Or if we can't ascertain a valid reason for ever wanting to do this, we
>>> could simply delete the flag altogether. :)
>>>
>>> Cheers,
>>> David
>>>
>>>> Regards,
>>>> Vladimir
>>>>
>>>>>
>>>>> Anyhow, I think it deserves a simpler example. Submitted the bug and
>>>>> attached a simple test there:
>>>>>    https://bugs.openjdk.java.net/browse/JDK-8064749
>>>>>
>>>>> Thanks,
>>>>> -Aleksey.
>>>>>
>>>>> On 12.11.2014 19:52, Deneau, Tom wrote:
>>>>>> Hi all --
>>>>>>
>>>>>> Forwarding a thread which came about on the jmh-dev mail list, as
>>>>>> recommended by Aleksey Shipilev (see below).  The JMH framework has a
>>>>>> timing control thread which sleeps for a certain period, then sets a
>>>>>> volatile isDone variable.  Meanwhile, the benchmark thread loops
>>>>>> doing its benchmark code and also checking the isDone field.   A hang
>>>>>> occurs if -XX:-UseCompilerSafepoints is used.
>>>>>>
>>>>>> The original issue can be reproduced by the following steps
>>>>>>
>>>>>>      hg clone http://hg.openjdk.java.net/code-tools/jmh
>>>>>>      cd jmh
>>>>>>      mvn clean install -DskipTests=true
>>>>>>      cd jmh-samples
>>>>>>      java  -server -XX:-UseCompilerSafepoints -jar
>>>>>> target/benchmarks.jar 'JMHSample_01_.*' -t 1 -wi 5 -i 5 -f 0
>>>>>>
>>>>>> -- Tom Deneau
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>>>>>> Sent: Wednesday, November 12, 2014 6:09 AM
>>>>>> To: Deneau, Tom; jmh-dev at openjdk.java.net
>>>>>> Subject: Re: using -XX:-UseCompilerSafepoints
>>>>>>
>>>>>> Hi Tom,
>>>>>>
>>>>>> On 11/11/2014 07:34 PM, Deneau, Tom wrote:
>>>>>>> It looks like a thread that calls Thread.sleep (as the timing control
>>>>>>> thread does in the harness) will eventually go thru
>>>>>>> SafepointSynchronize::block (as part of the ThreadBlockInVM
>>>>>>> destructor).  So if there is a looping benchmark thread compiled
>>>>>>> without Compiler Safepoints, the control thread will be blocked and
>>>>>>> will never set the isDone flag.
>>>>>>
>>>>>> So, you are saying that without the safepoint in the while(!isDone)
>>>>>> loop in workload, control thread and workload thread will never
>>>>>> rendezvous on safepoint? I believe this is a bug with
>>>>>> -XX:-CompilerSafepoints, because the comment in safepoint.cpp calls
>>>>>> this
>>>>>> out specifically for VMThread vs. Mutator threads:
>>>>>>
>>>>>>    // In a pathological scenario such as that described in CR6415670
>>>>>>    // the VMthread may sleep just before the mutator(s) become safe.
>>>>>>    // In that case the mutators will be stalled waiting for the
>>>>>> safepoint
>>>>>>    // to complete and the the VMthread will be sleeping, waiting for
>>>>>> the
>>>>>>    // mutators to rendezvous. The VMthread will eventually wake up and
>>>>>>    // detect that all mutators are safe, at which point we'll again
>>>>>> make
>>>>>>    // progress.
>>>>>>
>>>>>> If this is a case, you probably need to report this to runtime guys.
>>>>>>
>>>>>>> This is probably OK, just need to document that CompilerSafepoints
>>>>>>> cannot be turned off.
>>>>>>
>>>>>> I think it is safe to presume something will go hairy if you are using
>>>>>> any special VM flag, therefore I am not inclined to document this.
>>>>>>
>>>>>> Thanks,
>>>>>> -Aleksey.
>>>>>>
>>>>>
>>>>>

From david.holmes at oracle.com  Fri Nov 14 02:20:50 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 14 Nov 2014 12:20:50 +1000
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5464F600.7040601@oracle.com>
References: <546151A9.1080100@oracle.com> <5464C3E6.5000309@oracle.com>
	<5464F600.7040601@oracle.com>
Message-ID: <54656702.4090102@oracle.com>

On 14/11/2014 4:18 AM, Daniel D. Daugherty wrote:
> Magnus,
>
> Thanks for the review!
>
> Replies embedded below...
>
> On 11/13/14 7:44 AM, Magnus Ihse Bursie wrote:
>> On 2014-11-11 01:00, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>> run screaming from the room... :-)  On the plus side the fix does
>>> delete two work around source files (Coleen would say that's a
>>> Good Thing (TM)!)
>>
>> ... but you're only deleting the make files?
>
> Good catch! Looks like when I resurrected this fix from my JDK8
> queue I missed a couple of deletes.
>
>
>> src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and
>> src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c could
>> be deleted as well, right?
>
> Yes, these should be deleted and I'll do that in this fix.
> Since these are two deletes of files that can no longer be
> built anyway, I presume I don't need to send out another
> webrev...

I don't need to see an updated webrev :)

Thanks,
David


>
>>
>> Good idea for the fix, anyway. I opened
>> https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a
>> similar solution in configure.
>
> Sounds good to me.
>
> Dan
>
>
>>
>> /Magnus
>

From ivan.gerasimov at oracle.com  Fri Nov 14 12:35:29 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Fri, 14 Nov 2014 15:35:29 +0300
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
Message-ID: <5465F711.9090605@oracle.com>

Hello!

The recent fix for JDK-8059533 ((process) Make exiting process wait for 
exiting threads [win]) caused the warning message to be printed in some 
test environments:
-----------
os_windows.cpp:3844 is in the newly updated 
os::win32::exit_process_or_thread(Ept what, int exit_code)
-----------

This has been observed with debug builds on highly loaded systems.


To address the issue it is proposed to do three things:
1) increase the timeout for debug builds,
2) increase the maximum number of the thread handles to be stored,
3) raise the priority of the exiting threads if we need to wait for them.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/

The fix was tested on all available platforms, with the hotspot testset. 
No failures.

Sincerely yours,
Ivan


From sergey.gabdurakhmanov at oracle.com  Fri Nov 14 13:07:41 2014
From: sergey.gabdurakhmanov at oracle.com (Sergey Gabdurakhmanov)
Date: Fri, 14 Nov 2014 16:07:41 +0300
Subject: RFR(XS): 8048050: Agent NullPointerException when rmi.port in use
Message-ID: <5465FE9D.5060003@oracle.com>

Hi,

Could I please have a review of this small fix.

webrev: http://cr.openjdk.java.net/~sgabdura/8048050/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8048050

Problem description:
If the com.sun.management.jmxremote.rmi.port option names a port that is
already in use by a different JVM, the agent fails with an NPE. It is
expected to fail, but it should report an appropriate exception.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run two instances in different JVMs at same time with the following 
options:
-Dcom.sun.management.jmxremote.port=2222 
-Dcom.sun.management.jmxremote.rmi.port=2223 
-Dcom.sun.management.jmxremote.authenticate=false

Root cause:
When we try to start the JMXConnectorServer (see method exportMBeanServer
of class sun.management.jmxremote.ConnectorBootstrap) on a port that is
already in use, it throws an IOException. The call to
connServer.getAddress().toString() in the exception handler then causes a
NullPointerException, because connServer.getAddress() returns null.

Solution:
Provide url.toString() if connServer.getAddress() is null
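
The fallback described above can be sketched as follows. This is an
illustration only, not the code in the webrev: the class name, the helper
names and the port number are assumptions; only JMXServiceURL, its
three-argument constructor and toString() are existing API.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

public class AddressFallback {
    // 'bound' stands for what connServer.getAddress() returns (null when
    // the server never started); 'requested' is the URL it was asked to use.
    static String describe(JMXServiceURL bound, JMXServiceURL requested) {
        return (bound != null ? bound : requested).toString();
    }

    // Hypothetical helper: builds an RMI service URL for a local port and
    // turns the checked exception into an unchecked one.
    static JMXServiceURL localRmiUrl(int port) {
        try {
            return new JMXServiceURL("rmi", "localhost", port);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        JMXServiceURL requested = localRmiUrl(2223);
        // No bound address is available, so the requested URL is reported
        // instead of dereferencing null.
        System.out.println(describe(null, requested));
    }
}
```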

I'm going to push this fix into JDK9, 8 and 7.

BR,
Sergey


From jaroslav.bachorik at oracle.com  Fri Nov 14 13:10:25 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 14 Nov 2014 14:10:25 +0100
Subject: RFR(XS): 8048050: Agent NullPointerException when rmi.port in use
In-Reply-To: <5465FE9D.5060003@oracle.com>
References: <5465FE9D.5060003@oracle.com>
Message-ID: <5465FF41.2050507@oracle.com>


Good to go.

-JB-

On 11/14/2014 02:07 PM, Sergey Gabdurakhmanov wrote:
> Hi,
>
> Could I please have a review of this small fix.
>
> webrev: http://cr.openjdk.java.net/~sgabdura/8048050/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8048050
>
> Problem description:
> If the com.sun.management.jmxremote.rmi.port option is provided it will
> give a NPE if already in use by a different JVM. Its expected to fail
> but should provide an appropriate exception.
>
> STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
> Run two instances in different JVMs at same time with the following
> options:
> -Dcom.sun.management.jmxremote.port=2222
> -Dcom.sun.management.jmxremote.rmi.port=2223
> -Dcom.sun.management.jmxremote.authenticate=false
>
> Root cause:
> Then we trying to start JMXConnectorServer (see method exportMBeanServer
> of class sun.management.jmxremote.ConnectorBootstrap on already used
> port it cause IOException. Call of connServer.getAddress().toString() in
> the exception handler cause NullPointerException because
> connServer.getAddress() returns null.
>
> Solution:
> Provide url.toString() if connServer.getAddress() is null
>
> I'm going to push this fix into JDK9, 8 and 7.
>
> BR,
> Sergey
>


From daniel.fuchs at oracle.com  Fri Nov 14 13:32:26 2014
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Fri, 14 Nov 2014 14:32:26 +0100
Subject: RFR(XS): 8048050: Agent NullPointerException when rmi.port in use
In-Reply-To: <5465FF41.2050507@oracle.com>
References: <5465FE9D.5060003@oracle.com> <5465FF41.2050507@oracle.com>
Message-ID: <5466046A.2060309@oracle.com>

Hi Sergey,

The fix looks fine.
I wonder whether there should be a testcase for that?

best regards,

-- daniel

On 14/11/14 14:10, Jaroslav Bachorik wrote:
>
> Good to go.
>
> -JB-
>
> On 11/14/2014 02:07 PM, Sergey Gabdurakhmanov wrote:
>> Hi,
>>
>> Could I please have a review of this small fix.
>>
>> webrev: http://cr.openjdk.java.net/~sgabdura/8048050/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8048050
>>
>> Problem description:
>> If the com.sun.management.jmxremote.rmi.port option is provided it will
>> give a NPE if already in use by a different JVM. Its expected to fail
>> but should provide an appropriate exception.
>>
>> STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
>> Run two instances in different JVMs at same time with the following
>> options:
>> -Dcom.sun.management.jmxremote.port=2222
>> -Dcom.sun.management.jmxremote.rmi.port=2223
>> -Dcom.sun.management.jmxremote.authenticate=false
>>
>> Root cause:
>> Then we trying to start JMXConnectorServer (see method exportMBeanServer
>> of class sun.management.jmxremote.ConnectorBootstrap on already used
>> port it cause IOException. Call of connServer.getAddress().toString() in
>> the exception handler cause NullPointerException because
>> connServer.getAddress() returns null.
>>
>> Solution:
>> Provide url.toString() if connServer.getAddress() is null
>>
>> I'm going to push this fix into JDK9, 8 and 7.
>>
>> BR,
>> Sergey
>>
>


From sergey.gabdurakhmanov at oracle.com  Fri Nov 14 16:23:21 2014
From: sergey.gabdurakhmanov at oracle.com (Sergey Gabdurakhmanov)
Date: Fri, 14 Nov 2014 08:23:21 -0800 (PST)
Subject: RFR(XS): 8048050: Agent NullPointerException when rmi.port in use
Message-ID: <8c93fff6-4bdd-4293-ba0d-f49649489ecf@default>

Hi Daniel,

Our documentation does not specify the exact exception that should be thrown in this scenario, but the exception should be reasonable.
That makes a testcase very difficult to implement, because "reasonable" is not something a test can check clearly.
E.g. "divided by zero" is not reasonable, but "illegal argument" is...
I would prefer not to add any tests here.

BR,
Sergey

----- Original Message -----
From: daniel.fuchs at oracle.com
To: jaroslav.bachorik at oracle.com, sergey.gabdurakhmanov at oracle.com, hotspot-runtime-dev at openjdk.java.net, dmitry.samersoff at oracle.com, serviceability-dev at openjdk.java.net
Sent: Friday, November 14, 2014 4:32:38 PM (GMT+0300) Auto-Detected
Subject: Re: RFR(XS): 8048050: Agent NullPointerException when rmi.port in use

Hi Sergey,

The fix looks fine.
I wonder whether there should be a testcase for that?

best regards,

-- daniel

On 14/11/14 14:10, Jaroslav Bachorik wrote:
>
> Good to go.
>
> -JB-
>
> On 11/14/2014 02:07 PM, Sergey Gabdurakhmanov wrote:
>> Hi,
>>
>> Could I please have a review of this small fix.
>>
>> webrev: http://cr.openjdk.java.net/~sgabdura/8048050/webrev.00/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8048050
>>
>> Problem description:
>> If the com.sun.management.jmxremote.rmi.port option is provided it will
>> give a NPE if already in use by a different JVM. Its expected to fail
>> but should provide an appropriate exception.
>>
>> STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
>> Run two instances in different JVMs at same time with the following
>> options:
>> -Dcom.sun.management.jmxremote.port=2222
>> -Dcom.sun.management.jmxremote.rmi.port=2223
>> -Dcom.sun.management.jmxremote.authenticate=false
>>
>> Root cause:
>> Then we trying to start JMXConnectorServer (see method exportMBeanServer
>> of class sun.management.jmxremote.ConnectorBootstrap on already used
>> port it cause IOException. Call of connServer.getAddress().toString() in
>> the exception handler cause NullPointerException because
>> connServer.getAddress() returns null.
>>
>> Solution:
>> Provide url.toString() if connServer.getAddress() is null
>>
>> I'm going to push this fix into JDK9, 8 and 7.
>>
>> BR,
>> Sergey
>>
>


From daniel.daugherty at oracle.com  Fri Nov 14 21:30:11 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 14 Nov 2014 14:30:11 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5464F600.7040601@oracle.com>
References: <546151A9.1080100@oracle.com> <5464C3E6.5000309@oracle.com>
	<5464F600.7040601@oracle.com>
Message-ID: <54667463.7000405@oracle.com>

 > I presume I don't need to sent out another webrev...

I have to change my mind on this because this fix needs to be
backported to JDK8u-hs-dev.

Here's the updated JDK9 webrev:

http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/

And here's the JDK8u-hs-dev backport:

http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/

Because of improvements to the JDK9 makefiles, a bunch of the
anchor text has changed. The best way to sanity check the backport
is to download the two patch files and look at them in your favorite
diff tool:

http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/hotspot.patch
http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/8033602_for_jdk8u_hs_dev.patch

I need just one sanity check on the backport...

Thanks, in advance, for any comments, questions or suggestions.

Dan


On 11/13/14 11:18 AM, Daniel D. Daugherty wrote:
> Magnus,
>
> Thanks for the review!
>
> Replies embedded below...
>
> On 11/13/14 7:44 AM, Magnus Ihse Bursie wrote:
>> On 2014-11-11 01:00, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>> run screaming from the room... :-)  On the plus side the fix does
>>> delete two work around source files (Coleen would say that's a
>>> Good Thing (TM)!)
>>
>> ... but you're only deleting the make files?
>
> Good catch! Looks like when I resurrected this fix from my JDK8
> queue I missed a couple of deletes.
>
>
>> src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and 
>> src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c 
>> could be deleted as well, right?
>
> Yes, these should be deleted and I'll do that in this fix.
> Since these are two deletes of files that can no longer be
> built anyway, I presume I don't need to sent out another
> webrev...
>
>
>>
>> Good idea for the fix, anyway. I opened 
>> https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a 
>> similar solution in configure.
>
> Sounds good to me.
>
> Dan
>
>
>>
>> /Magnus


On 11/10/14 5:00 PM, Daniel D. Daugherty wrote:
 > Greetings,
 >
 > I have a Solaris Full Debug Symbols (FDS) fix ready for review.
 > Yes, it is a small fix, but it is in Makefiles so feel free to
 > run screaming from the room... :-)  On the plus side the fix does
 > delete two work around source files (Coleen would say that's a
 > Good Thing (TM)!)
 >
 > The fix is to detect the version of GNU objcopy that is being
 > used on the machine and only enable Full Debug Symbols when that
 > version is 2.21.1 or newer. If you don't have the right version,
 > then the build drops back to pre-FDS build configs with a message
 > like this:
 >
 > WARNING: /usr/sfw/bin/gobjcopy --version info:
 > WARNING: GNU objcopy 2.15
 > WARNING: an objcopy version of 2.21.1 or newer is needed to create valid .debuginfo files.
 > WARNING: ignoring above objcopy command.
 > WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC version.
 > WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86 version.
 > WARNING: Solaris 11 Update 1 contains the correct version.
 > INFO: no objcopy cmd found so cannot create .debuginfo files.
 > INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
 >
 > This work is being tracked by the following bug IDs:
 >
 >     JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
 >     https://bugs.openjdk.java.net/browse/JDK-8033602
 >
 >     JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on Solaris X86
 >     https://bugs.openjdk.java.net/browse/JDK-8034005
 >
 > Here is the webrev URL:
 >
 > http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
 >
 > Testing:
 >
 > - JPRT test jobs to verify that the current JPRT Solaris hosts
 >   are happy
 > - local builds on my Solaris 10 X86 machine to verify that the
 >   wrong version of GNU objcopy is caught
 >
 > Thanks, in advance, for any comments, questions or suggestions.
 >
 > Dan

From coleen.phillimore at oracle.com  Fri Nov 14 22:47:43 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Fri, 14 Nov 2014 17:47:43 -0500
Subject: [8u40] RFR 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
Message-ID: <5466868F.2020802@oracle.com>

Please approve the backport of the bug fix for this bug.  The fix has 
been tested all week with no problems.  The patch didn't import because 
MallocTrackingVerify.java didn't have an @ignore tag, which was removed 
by the jdk9 fix.  Everything else applied cleanly.

Summary: Signed bitfield size y can only have (1 << y)-1 values.
Reviewed-by: shade, dholmes, jrose, ctornqvi, gtriantafill

open webrev at http://cr.openjdk.java.net/~coleenp/8062870_8u40/
bug link https://bugs.openjdk.java.net/browse/JDK-8062870
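
As a general illustration of the summary line (not the actual mallocTracker
code), the sketch below shows how the sign bit reduces the largest value a
signed bitfield can hold compared with an unsigned one of the same width.
The struct `Counters`, its 4-bit fields and the helper names are all
hypothetical and chosen only to keep the numbers small.

```cpp
// Hypothetical 4-bit fields; not taken from mallocTracker.hpp.
struct Counters {
  signed int   s : 4;   // one of the four bits is the sign bit
  unsigned int u : 4;   // all four bits carry magnitude
};

// Largest and smallest values representable in a two's-complement
// signed field of the given width.
constexpr int signed_max(int bits)   { return (1 << (bits - 1)) - 1; }
constexpr int signed_min(int bits)   { return -(1 << (bits - 1)); }

// Largest value representable in an unsigned field of the given width.
constexpr int unsigned_max(int bits) { return (1 << bits) - 1; }

// Stores the largest in-range value into each field and returns the struct.
inline Counters largest_values() {
  Counters c;
  c.s = signed_max(4);
  c.u = unsigned_max(4);
  return c;
}
```

Storing anything larger than `signed_max(4)` into `s` would not fit; before
C++20 the result of such a store is implementation-defined, so the sketch
deliberately stays within range.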

Thanks!
Coleen


From daniel.daugherty at oracle.com  Fri Nov 14 23:22:23 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 14 Nov 2014 16:22:23 -0700
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <5465F711.9090605@oracle.com>
References: <5465F711.9090605@oracle.com>
Message-ID: <54668EAF.9070807@oracle.com>

On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
> Hello!
>
> The recent fix for JDK-8059533 ((process) Make exiting process wait 
> for exiting threads [win]) caused the warning message to be printed in 
> some test environments:
> -----------
> os_windows.cpp:3844 is in the newly updated 
> os::win32::exit_process_or_thread(Ept what, int exit_code)
> -----------
>
> This has been observed with debug builds on highly loaded systems.
>
>
> To address the issue it is proposed to do three things:
> 1) increase the timeout for debug builds,
> 2) increase the maximum number of the thread handles to be stored,
> 3) rise the priority of the exiting threads, if we need to wait for them.
>
> Would you please help review the fix?
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/

src/os/windows/vm/os_windows.cpp

   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
     Instead of DEBUG_ONLY can you use NOT_PRODUCT?

     That uses the smaller value for only one build config (PRODUCT).

   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
     Please add spaces between the comment delimiters and the comment text.

     That uses the smaller timeout for only one build config (PRODUCT).

   line 3836           // Rise the priority...
     Typo: 'Rise' -> 'Raise'

     About the general idea of raising the exiting thread's priority,
     if the exiting thread is looping in some Win* OS code after this
     point, will raising the priority make the machine unusable?
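
The difference between the two macro pairs in the comments on lines 3784
and 3785 can be sketched as follows. This is a standalone re-creation of
the four macros for illustration only; in HotSpot they live in
utilities/macros.hpp and are driven by the PRODUCT and ASSERT defines. The
constant names (`kMaxExitHandlesByDebug` and so on) are hypothetical.

```cpp
// Standalone re-creation of the HotSpot build macros, for illustration only.
#ifdef ASSERT
#define DEBUG_ONLY(code) code
#define NOT_DEBUG(code)
#else
#define DEBUG_ONLY(code)
#define NOT_DEBUG(code) code
#endif

#ifdef PRODUCT
#define PRODUCT_ONLY(code) code
#define NOT_PRODUCT(code)
#else
#define PRODUCT_ONLY(code)
#define NOT_PRODUCT(code) code
#endif

// Record which of the two defines this translation unit was compiled with.
#ifdef ASSERT
constexpr bool kAssertBuild = true;
#else
constexpr bool kAssertBuild = false;
#endif
#ifdef PRODUCT
constexpr bool kProductBuild = true;
#else
constexpr bool kProductBuild = false;
#endif

// Form under review: the small value is used whenever ASSERT is not defined.
constexpr int kMaxExitHandlesByDebug = NOT_DEBUG(32) DEBUG_ONLY(128);

// Suggested form: the small value is used only when PRODUCT is defined.
constexpr int kMaxExitHandlesByProduct = PRODUCT_ONLY(32) NOT_PRODUCT(128);
```

The two forms agree when exactly one of PRODUCT or ASSERT is defined. They
differ only for a build that defines neither, where the first form yields
32 and the second yields 128.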

Dan


>
> The fix was tested on all available platforms, with the hotspot 
> testset. No failures.
>
> Sincerely yours,
> Ivan
>


From dmitry.samersoff at oracle.com  Sat Nov 15 18:57:10 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Sat, 15 Nov 2014 21:57:10 +0300
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <54667463.7000405@oracle.com>
References: <546151A9.1080100@oracle.com>
	<5464C3E6.5000309@oracle.com>	<5464F600.7040601@oracle.com>
	<54667463.7000405@oracle.com>
Message-ID: <5467A206.7020105@oracle.com>

Dan,

The fix looks good to me.

-Dmitry


On 2014-11-15 00:30, Daniel D. Daugherty wrote:
>> I presume I don't need to sent out another webrev...
> 
> I have to change my mind on this because this fix needs to be
> backported to JDK8u-hs-dev.
> 
> Here's the updated JDK9 webrev:
> 
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/
> 
> And here's the JDK8u-hs-dev backport:
> 
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/
> 
> Because of improvements to the JDK9 makefiles, a bunch of the
> anchor text has changed. The best way to sanity check the backport
> is to download the two patch files and look at them in your favorite
> diff tool:
> 
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/hotspot.patch
> 
> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/8033602_for_jdk8u_hs_dev.patch
> 
> 
> I need just one sanity check on the backport...
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan
> 
> 
> On 11/13/14 11:18 AM, Daniel D. Daugherty wrote:
>> Magnus,
>>
>> Thanks for the review!
>>
>> Replies embedded below...
>>
>> On 11/13/14 7:44 AM, Magnus Ihse Bursie wrote:
>>> On 2014-11-11 01:00, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>>> run screaming from the room... :-)  On the plus side the fix does
>>>> delete two work around source files (Coleen would say that's a
>>>> Good Thing (TM)!)
>>>
>>> ... but you're only deleting the make files?
>>
>> Good catch! Looks like when I resurrected this fix from my JDK8
>> queue I missed a couple of deletes.
>>
>>
>>> src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and
>>> src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c
>>> could be deleted as well, right?
>>
>> Yes, these should be deleted and I'll do that in this fix.
>> Since these are two deletes of files that can no longer be
>> built anyway, I presume I don't need to sent out another
>> webrev...
>>
>>
>>>
>>> Good idea for the fix, anyway. I opened
>>> https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a
>>> similar solution in configure.
>>
>> Sounds good to me.
>>
>> Dan
>>
>>
>>>
>>> /Magnus
> 
> 
> 
> On 11/10/14 5:00 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>> Yes, it is a small fix, but it is in Makefiles so feel free to
>> run screaming from the room... :-)  On the plus side the fix does
>> delete two work around source files (Coleen would say that's a
>> Good Thing (TM)!)
>>
>> The fix is to detect the version of GNU objcopy that is being
>> used on the machine and only enable Full Debug Symbols when that
>> version is 2.21.1 or newer. If you don't have the right version,
>> then the build drops back to pre-FDS build configs with a message
>> like this:
>>
>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>> WARNING: GNU objcopy 2.15
>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid .debuginfo files.
>> WARNING: ignoring above objcopy command.
>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC version.
>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86 version.
>> WARNING: Solaris 11 Update 1 contains the correct version.
>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>
>> This work is being tracked by the following bug IDs:
>>
>>     JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>     https://bugs.openjdk.java.net/browse/JDK-8033602
>>
>>     JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on Solaris X86
>>     https://bugs.openjdk.java.net/browse/JDK-8034005
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>
>> Testing:
>>
>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>   are happy
>> - local builds on my Solaris 10 X86 machine to verify that the
>>   wrong version of GNU objcopy is caught
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From daniel.daugherty at oracle.com  Sat Nov 15 19:06:59 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Sat, 15 Nov 2014 12:06:59 -0700
Subject: RFR(S) Solaris Full Debug Symbols (FDS) fix for 8033602 and
	8034005
In-Reply-To: <5467A206.7020105@oracle.com>
References: <546151A9.1080100@oracle.com>
	<5464C3E6.5000309@oracle.com>	<5464F600.7040601@oracle.com>
	<54667463.7000405@oracle.com> <5467A206.7020105@oracle.com>
Message-ID: <5467A453.8020001@oracle.com>

Thanks!

Dan


On 11/15/14 11:57 AM, Dmitry Samersoff wrote:
> Dan,
>
> The fix looks good for me.
>
> -Dmitry
>
>
> On 2014-11-15 00:30, Daniel D. Daugherty wrote:
>>> I presume I don't need to sent out another webrev...
>> I have to change my mind on this because this fix needs to be
>> backported to JDK8u-hs-dev.
>>
>> Here's the updated JDK9 webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/
>>
>> And here's the JDK8u-hs-dev backport:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/
>>
>> Because of improvements to the JDK9 makefiles, a bunch of the
>> anchor text has changed. The best way to sanity check the backport
>> is to download the two patch files and look at them in your favorite
>> diff tool:
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk9-hs-rt/hotspot.patch
>>
>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/1-jdk8u-hs-dev/8033602_for_jdk8u_hs_dev.patch
>>
>>
>> I need just one sanity check on the backport...
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 11/13/14 11:18 AM, Daniel D. Daugherty wrote:
>>> Magnus,
>>>
>>> Thanks for the review!
>>>
>>> Replies embedded below...
>>>
>>> On 11/13/14 7:44 AM, Magnus Ihse Bursie wrote:
>>>> On 2014-11-11 01:00, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>>>> run screaming from the room... :-)  On the plus side the fix does
>>>>> delete two work around source files (Coleen would say that's a
>>>>> Good Thing (TM)!)
>>>> ... but you're only deleting the make files?
>>> Good catch! Looks like when I resurrected this fix from my JDK8
>>> queue I missed a couple of deletes.
>>>
>>>
>>>> src/os/solaris/add_gnu_debuglink/add_gnu_debuglink.c and
>>>> src/os/solaris/fix_empty_sec_hdr_flags/fix_empty_sec_hdr_flags.c
>>>> could be deleted as well, right?
>>> Yes, these should be deleted and I'll do that in this fix.
>>> Since these are two deletes of files that can no longer be
>>> built anyway, I presume I don't need to sent out another
>>> webrev...
>>>
>>>
>>>> Good idea for the fix, anyway. I opened
>>>> https://bugs.openjdk.java.net/browse/JDK-8064808 to implement a
>>>> similar solution in configure.
>>> Sounds good to me.
>>>
>>> Dan
>>>
>>>
>>>> /Magnus
>>
>>
>> On 11/10/14 5:00 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a Solaris Full Debug Symbols (FDS) fix ready for review.
>>> Yes, it is a small fix, but it is in Makefiles so feel free to
>>> run screaming from the room... :-)  On the plus side the fix does
>>> delete two work around source files (Coleen would say that's a
>>> Good Thing (TM)!)
>>>
>>> The fix is to detect the version of GNU objcopy that is being
>>> used on the machine and only enable Full Debug Symbols when that
>>> version is 2.21.1 or newer. If you don't have the right version,
>>> then the build drops back to pre-FDS build configs with a message
>>> like this:
>>>
>>> WARNING: /usr/sfw/bin/gobjcopy --version info:
>>> WARNING: GNU objcopy 2.15
>>> WARNING: an objcopy version of 2.21.1 or newer is needed to create valid .debuginfo files.
>>> WARNING: ignoring above objcopy command.
>>> WARNING: patch 149063-01 or newer contains the correct Solaris 10 SPARC version.
>>> WARNING: patch 149064-01 or newer contains the correct Solaris 10 X86 version.
>>> WARNING: Solaris 11 Update 1 contains the correct version.
>>> INFO: no objcopy cmd found so cannot create .debuginfo files.
>>> INFO: ENABLE_FULL_DEBUG_SYMBOLS=0
>>>
>>> This work is being tracked by the following bug IDs:
>>>
>>>      JDK-8033602 wrong stabs data in libjvm.debuginfo on JDK 8 - SPARC
>>>      https://bugs.openjdk.java.net/browse/JDK-8033602
>>>
>>>      JDK-8034005 cannot debug in synchronizer.o or objectMonitor.o on Solaris X86
>>>      https://bugs.openjdk.java.net/browse/JDK-8034005
>>>
>>> Here is the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8033602-webrev/0-jdk9-hs-rt/
>>>
>>> Testing:
>>>
>>> - JPRT test jobs to verify that the current JPRT Solaris hosts
>>>    are happy
>>> - local builds on my Solaris 10 X86 machine to verify that the
>>>    wrong version of GNU objcopy is caught
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>


From ivan.gerasimov at oracle.com  Sun Nov 16 21:23:43 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Mon, 17 Nov 2014 00:23:43 +0300
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <54668EAF.9070807@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
Message-ID: <546915DF.7080106@oracle.com>

Thank you Daniel!

Please find the updated webrev with your suggestions incorporated here:
http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/

Concerning the thread priority: If the application is of 
NORMAL_PRIORITY_CLASS, then setting the thread's priority level to 
THREAD_PRIORITY_HIGHEST will result in a priority value of only 10 
(of a maximum of 31).
http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx

And if the process is of HIGH_PRIORITY_CLASS, then a thread with the 
HIGHEST priority level will have a priority value of 15 (of 31).

I believe this should not be too much, and the machine will not become 
busy with only those closing threads.
However, I hope it would be enough to make them complete faster than 
other threads of the NORMAL priority level within the same application.
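
The two values mentioned above (10 and 15) can be reproduced with a small
lookup that mirrors the scheduling-priority table on the MSDN page linked
above, restricted to the two priority classes discussed here. This is only
a portable sketch: the enum and function names are hypothetical and it does
not call any Windows API.

```cpp
// Hypothetical stand-ins for NORMAL_PRIORITY_CLASS and HIGH_PRIORITY_CLASS.
enum class PriorityClass { Normal, High };

// Hypothetical stand-ins for the THREAD_PRIORITY_* levels.
enum class ThreadLevel {
  Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical
};

// Base priority (1..31 scale) for a thread of the given level in a process
// of the given class, following the documented table for these two classes.
inline int base_priority(PriorityClass pc, ThreadLevel level) {
  // The two extreme levels map to fixed values for both classes.
  if (level == ThreadLevel::Idle)         return 1;
  if (level == ThreadLevel::TimeCritical) return 15;

  // The remaining levels are offsets around the class's normal priority.
  const int class_base = (pc == PriorityClass::High) ? 13 : 8;
  int offset = 0;
  switch (level) {
    case ThreadLevel::Lowest:      offset = -2; break;
    case ThreadLevel::BelowNormal: offset = -1; break;
    case ThreadLevel::AboveNormal: offset =  1; break;
    case ThreadLevel::Highest:     offset =  2; break;
    default:                       offset =  0; break;
  }
  return class_base + offset;
}
```

With this table, a HIGHEST thread in a normal-class process sits two steps
above that process's normal threads (10 versus 8), which is the margin the
argument above relies on.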

Sincerely yours,
Ivan


On 15.11.2014 2:22, Daniel D. Daugherty wrote:
> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>> Hello!
>>
>> The recent fix for JDK-8059533 ((process) Make exiting process wait 
>> for exiting threads [win]) caused the warning message to be printed 
>> in some test environments:
>> -----------
>> os_windows.cpp:3844 is in the newly updated 
>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>> -----------
>>
>> This has been observed with debug builds on highly loaded systems.
>>
>>
>> To address the issue it is proposed to do three things:
>> 1) increase the timeout for debug builds,
>> 2) increase the maximum number of the thread handles to be stored,
>> 3) rise the priority of the exiting threads, if we need to wait for 
>> them.
>>
>> Would you please help review the fix?
>>
>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>
> src/os/windows/vm/os_windows.cpp
>
>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>     Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>
>     That uses the smaller value for only one build config (PRODUCT).
>
>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>     Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>     Please add spaces between the comment delimiters and the comment text.
>
>     That uses the smaller timeout for only one build config (PRODUCT).
>
>   line 3836           // Rise the priority...
>     Typo: 'Rise' -> 'Raise'
>
>     About the general idea of raising the exiting thread's priority,
>     if the exiting thread is looping in some Win* OS code after this
>     point, will raising the priority make the machine unusable?
>
> Dan
>
>
>>
>> The fix was tested on all available platforms, with the hotspot 
>> testset. No failures.
>>
>> Sincerely yours,
>> Ivan
>>
>
>
>


From david.holmes at oracle.com  Mon Nov 17 06:40:05 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 17 Nov 2014 16:40:05 +1000
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546915DF.7080106@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com>
Message-ID: <54699845.5010901@oracle.com>

On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
> Thank you Daniel!
>
> Please find the updated webrev with your suggestions incorporated here:
> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>
> Concerning the thread priority: If the application is of
> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
> THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10
> (of maximum 31).
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>
>
> And if the process is HIGH_PRIORITY_CLASS, then the tread with the
> HIGHEST priority level will have priority value == 15 of 31.
>
> I believe, it should not be too much, and the machine will not become
> busy with only those closing threads.
> However, I hope it would be enough to make them complete faster than
> other threads of the NORMAL priority level withing the same application.

I don't think this is necessary or desirable. Under normal usage we're 
giving priority to exiting threads and that may disrupt the usual 
scheduling patterns that applications see. You may posit that it is 
"harmless" but we can't say that for sure. Nor can we actually know that 
this will help with this particular bug. I would not add in this new code.

David

> Sincerely yours,
> Ivan
>
>
> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>> Hello!
>>>
>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>> for exiting threads [win]) caused the warning message to be printed
>>> in some test environments:
>>> -----------
>>> os_windows.cpp:3844 is in the newly updated
>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>> -----------
>>>
>>> This has been observed with debug builds on highly loaded systems.
>>>
>>>
>>> To address the issue it is proposed to do three things:
>>> 1) increase the timeout for debug builds,
>>> 2) increase the maximum number of the thread handles to be stored,
>>> 3) rise the priority of the exiting threads, if we need to wait for
>>> them.
>>>
>>> Would you please help review the fix?
>>>
>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>
>> src/os/windows/vm/os_windows.cpp
>>
>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>     Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>
>>     That uses the smaller value for only one build config (PRODUCT).
>>
>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>     Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>     Please add spaces between the comment delimiters and the comment text.
>>
>>     That uses the smaller timeout for only one build config (PRODUCT).
>>
>>   line 3836           // Rise the priority...
>>     Typo: 'Rise' -> 'Raise'
>>
>>     About the general idea of raising the exiting thread's priority,
>>     if the exiting thread is looping in some Win* OS code after this
>>     point, will raising the priority make the machine unusable?
>>
>> Dan
>>
>>
>>>
>>> The fix was tested on all available platforms, with the hotspot
>>> testset. No failures.
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>
>>
>>
>

From david.holmes at oracle.com  Mon Nov 17 06:44:29 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 17 Nov 2014 16:44:29 +1000
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <5464DED4.9040909@sap.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
	<5463149A.6020506@oracle.com> <54637A9A.9040108@sap.com>
	<546470BD.9050303@oracle.com> <5464DED4.9040909@sap.com>
Message-ID: <5469994D.3070208@oracle.com>

On 14/11/2014 2:39 AM, Haug, Gunter wrote:
>
> On 13.11.2014 09:50, David Holmes wrote:
>> On 13/11/2014 1:19 AM, Haug, Gunter wrote:
>>>
>>> On 12.11.2014 09:04, David Holmes wrote:
>>>> Hi Gunter,
>>>>
>>>> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>>>>> Hi All,
>>>>>
>>>>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs
>>>>> improvement)' makes use of getrusage() to retrieve accurate
>>>>> per-thread data on resource usage. We can use exactly the same code
>>>>> on AIX to achieve this.
>>>>>
>>>>> Please review the following change:
>>>>>
>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8064471
>>>>
>>>> I have a couple of comments on this code which presumably also apply
>>>> to the orginal :(
>>> Yes, they apply to the original as well, see below.
>>>>
>>>> First this comment is no longer applicable (actually it was never
>>>> applicable to AIX!):
>>>>
>>>>   // For now, we say that linux does not support vtime. I have no idea
>>>>   // whether it can actually be made to (DLD, 9/13/05).
>>>>
>>> You're right. I will remove it.
>>>> Second this calculation seems wrong:
>>>>
>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 *
>>>> 1000);
>>>>
>>>> To me this performs integer division (ie truncation_) then converts
>>>> the resulting integer to a double. I would expect to see additional
>>>> parentheses (even if not needed, for clarity):
>>>>
>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 *
>>>> 1000);
>>>>
>>>> or more simply divide by a floating-point value:
>>>>
>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>
>>>> and you don't need two double casts regardless as the expression will
>>>> be of type double as soon as there is one operand of type double. So
>>>> that should reduce to:
>>>>
>>>> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>
>>> OK. Do you want that we also change the Linux version like you proposed?
>>
>> I'll leave it up to you. If you leave this as AIX only then it tests
>> the new process :) There can be a follow up cleanup bug for linux.
>
> Hi David,
>
> I think it's not worth the effort to make two separate changes on linux
> and aix, so I fixed linux as well. Please find the new webrev below.
> There will probably be more opportunities to test the new process in the
> future.
>
> http://cr.openjdk.java.net/~simonis/webrevs/8064471.v2/
>
>
> Now we need a sponsor, as it is not aix only anymore.

I guess that will have to be me. :) I will try to look at this again 
tomorrow.

David

> Thanks,
> Gunter
>
>
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Gunter
>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Gunter
>>>>>
>>>
>
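
For reference, a small C++ sketch of the two forms of the getrusage()
calculation quoted above. One point worth checking against the language
rules: in C and C++ a cast binds more tightly than '/', so the original
expression already divides in double, and the simplified form computes
the same value while being easier to read. The function names are made
up for illustration and are not taken from the webrev.

```cpp
#include <sys/resource.h>  // struct rusage (POSIX)

// Form from the original change: each integer sum is cast separately.
// The cast is applied before '/', so the division happens in double.
inline double cpu_seconds_original(const rusage& usage) {
  return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
         (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 * 1000);
}

// Simplified form from the review: a single floating-point divisor
// promotes the rest of the expression to double.
inline double cpu_seconds_simplified(const rusage& usage) {
  return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
         (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
}
```

Either function can be fed the struct filled in by getrusage(); the
choice between them is about clarity rather than the result.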

From markus.gronlund at oracle.com  Mon Nov 17 08:33:45 2014
From: markus.gronlund at oracle.com (=?utf-8?B?TWFya3VzIEdyw7ZubHVuZA==?=)
Date: Mon, 17 Nov 2014 00:33:45 -0800 (PST)
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <54699845.5010901@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
Message-ID: <a68e8e09-8efa-4186-8207-d6b4b02a8831@default>

I agree with David.

The side effects will be unknown and very hard to debug.

Is there another way to accomplish the results without manipulating base services?

Thanks
Markus

-----Original Message-----
From: David Holmes 
Sent: den 17 november 2014 07:40
To: Ivan Gerasimov; Daniel Daugherty
Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844

On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
> Thank you Daniel!
>
> Please find the updated webrev with your suggestions incorporated here:
> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>
> Concerning the thread priority: If the application is of 
> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to 
> THREAD_PRIORITY_HIGHEST will result in its priority value to be only 
> 10 (of maximum 31).
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>
>
> And if the process is HIGH_PRIORITY_CLASS, then the thread with the 
> HIGHEST priority level will have priority value == 15 of 31.
>
> I believe, it should not be too much, and the machine will not become 
> busy with only those closing threads.
> However, I hope it would be enough to make them complete faster than 
> other threads of the NORMAL priority level within the same application.

I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is "harmless" but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code.

David

> Sincerely yours,
> Ivan
>
>
> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>> Hello!
>>>
>>> The recent fix for JDK-8059533 ((process) Make exiting process wait 
>>> for exiting threads [win]) caused the warning message to be printed 
>>> in some test environments:
>>> -----------
>>> os_windows.cpp:3844 is in the newly updated 
>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>> -----------
>>>
>>> This has been observed with debug builds on highly loaded systems.
>>>
>>>
>>> To address the issue it is proposed to do three things:
>>> 1) increase the timeout for debug builds,
>>> 2) increase the maximum number of the thread handles to be stored,
>>> 3) raise the priority of the exiting threads, if we need to wait for 
>>> them.
>>>
>>> Would you please help review the fix?
>>>
>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>
>> src/os/windows/vm/os_windows.cpp
>>
>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>
>>     That uses the smaller value for only one build config (PRODUCT).
>>
>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000)
>> /*1 sec in product, 4 sec in debug*/
>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>     Please add spaces between the comment delimiters and the comment 
>> text.
>>
>>     That uses the smaller timeout for only one build config (PRODUCT).
>>
>>   line 3836           // Rise the priority...
>>     Typo: 'Rise' -> 'Raise'
>>
>>     About the general idea of raising the exiting thread's priority,
>>     if the exiting thread is looping in some Win* OS code after this
>>     point, will raising the priority make the machine unusable?
>>
>> Dan
>>
>>
>>>
>>> The fix was tested on all available platforms, with the hotspot 
>>> testset. No failures.
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>
>>
>>
>
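
To illustrate the NOT_DEBUG/DEBUG_ONLY versus PRODUCT_ONLY/NOT_PRODUCT
comment quoted above, here is a self-contained C++ sketch. The four
macros are redefined locally as stand-ins, on the assumption that they
key off ASSERT and PRODUCT respectively; the constant names are
hypothetical. The snippet simulates a build where neither symbol is
defined (presumably what HotSpot calls an "optimized" build), which is
the configuration where the two spellings disagree.

```cpp
// Simulate a build with neither ASSERT nor PRODUCT defined.
#undef ASSERT
#undef PRODUCT

// Local stand-ins for the HotSpot macros (assumed semantics).
#ifdef ASSERT
#define DEBUG_ONLY(code) code
#define NOT_DEBUG(code)
#else
#define DEBUG_ONLY(code)
#define NOT_DEBUG(code) code
#endif

#ifdef PRODUCT
#define PRODUCT_ONLY(code) code
#define NOT_PRODUCT(code) code_unused
#undef NOT_PRODUCT
#define NOT_PRODUCT(code)
#else
#define PRODUCT_ONLY(code)
#define NOT_PRODUCT(code) code
#endif

// Spelling from webrev 0: the small value applies whenever ASSERT is off.
constexpr int kMaxExitHandlesByAssert  = NOT_DEBUG(32) DEBUG_ONLY(128);
// Spelling suggested in review: the small value applies only with PRODUCT.
constexpr int kMaxExitHandlesByProduct = PRODUCT_ONLY(32) NOT_PRODUCT(128);
```

With neither symbol defined, the first constant is 32 and the second is
128, which matches the remark that the suggested spelling uses the
smaller value for only one build config.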

From ivan.gerasimov at oracle.com  Mon Nov 17 09:00:16 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Mon, 17 Nov 2014 12:00:16 +0300
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <54699845.5010901@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
Message-ID: <5469B920.4040300@oracle.com>

Thanks David!

On 17.11.2014 9:40, David Holmes wrote:
> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>> Thank you Daniel!
>>
>> Please find the updated webrev with your suggestions incorporated here:
>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>
>> Concerning the thread priority: If the application is of
>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10
>> (of maximum 31).
>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx 
>>
>>
>>
>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>> HIGHEST priority level will have priority value == 15 of 31.
>>
>> I believe, it should not be too much, and the machine will not become
>> busy with only those closing threads.
>> However, I hope it would be enough to make them complete faster than
>> other threads of the NORMAL priority level within the same application.
>
> I don't think this is necessary or desirable. Under normal usage we're 
> giving priority to exiting threads and that may disrupt the usual 
> scheduling patterns that applications see. You may posit that it is 
> "harmless" but we can't say that for sure. Nor can we actually know 
> that this will help with this particular bug. I would not add in this 
> new code.
>

There are two places where I adjust the thread's priority:

1) The array of handles has filled up.

If we end up in this code branch, the exit pattern is unfortunately 
already broken, because the current thread has to make a blocking call 
while owning a critical section.
A full array of handles means that many threads are exiting at that 
time, so every thread that starts to exit after the current one will 
block when it tries to take ownership of the critical section.

Raising the priority of one thread that has already reached 
_endthreadex() seems appropriate to me in such a situation, because it 
helps shorten the time during which the other threads remain blocked.

Choosing the oldest exiting thread keeps the period during which one 
thread runs at a raised priority as short as possible.

2) The process exit branch.

That's the main part of the fix -- here we make the process wait for 
all the threads that have called _endthreadex() to complete, while 
preventing any other threads from starting the exit procedure.
The execution flow is already changed here (I don't want to say 
disrupted, because it was meant to fix the issue).

All running threads are about to be terminated by ending the process, 
so raising the priority of some of the threads should not have any bad 
impact on the program flow.
Instead, it may shorten the time the process has to wait before calling 
exit().


I can certainly remove the priority adjustment, as it's not an 
essential part of the fix.
However, I think it's a useful hint to the scheduler that can improve 
things in some situations, and I don't really see how it can do harm.


Sincerely yours,
Ivan


> David
>
>> Sincerely yours,
>> Ivan
>>
>>
>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>> Hello!
>>>>
>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>> for exiting threads [win]) caused the warning message to be printed
>>>> in some test environments:
>>>> -----------
>>>> os_windows.cpp:3844 is in the newly updated
>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>> -----------
>>>>
>>>> This has been observed with debug builds on highly loaded systems.
>>>>
>>>>
>>>> To address the issue it is proposed to do three things:
>>>> 1) increase the timeout for debug builds,
>>>> 2) increase the maximum number of the thread handles to be stored,
>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>> them.
>>>>
>>>> Would you please help review the fix?
>>>>
>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>
>>> src/os/windows/vm/os_windows.cpp
>>>
>>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>
>>>     That uses the smaller value for only one build config (PRODUCT).
>>>
>>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000)
>>> /*1 sec in product, 4 sec in debug*/
>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>     Please add spaces between the comment delimiters and the comment
>>> text.
>>>
>>>     That uses the smaller timeout for only one build config (PRODUCT).
>>>
>>>   line 3836           // Rise the priority...
>>>     Typo: 'Rise' -> 'Raise'
>>>
>>>     About the general idea of raising the exiting thread's priority,
>>>     if the exiting thread is looping in some Win* OS code after this
>>>     point, will raising the priority make the machine unusable?
>>>
>>> Dan
>>>
>>>
>>>>
>>>> The fix was tested on all available platforms, with the hotspot
>>>> testset. No failures.
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>
>>>
>>>
>>
>
>
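
The priority values mentioned in the thread (10 of 31 and 15 of 31) can
be reproduced with a small C++ sketch. It assumes the base values from
the MSDN scheduling-priorities table linked above, covers only the
non-realtime classes and the LOWEST..HIGHEST levels, and uses
hypothetical enum names rather than the Windows API constants.

```cpp
// Base priority of each process priority class (non-realtime only),
// as listed in the MSDN table referenced in the thread.
enum class PriorityClass { Idle = 4, BelowNormal = 6, Normal = 8,
                           AboveNormal = 10, High = 13 };

// Relative thread levels THREAD_PRIORITY_LOWEST..HIGHEST as offsets -2..+2.
enum class ThreadLevel { Lowest = -2, BelowNormal = -1, Normal = 0,
                         AboveNormal = 1, Highest = 2 };

// Resulting base priority of a thread: class base plus level offset.
constexpr int base_priority(PriorityClass cls, ThreadLevel level) {
  return static_cast<int>(cls) + static_cast<int>(level);
}
```

Under these assumptions a HIGHEST thread in a NORMAL_PRIORITY_CLASS
process ends up at 10 and in a HIGH_PRIORITY_CLASS process at 15, both
well below the maximum of 31.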


From george.triantafillou at oracle.com  Mon Nov 17 12:52:40 2014
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Mon, 17 Nov 2014 07:52:40 -0500
Subject: [8u40] RFR 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5466868F.2020802@oracle.com>
References: <5466868F.2020802@oracle.com>
Message-ID: <5469EF98.8010207@oracle.com>

Hi Coleen,

This looks good.

-George

On 11/14/2014 5:47 PM, Coleen Phillimore wrote:
> Please approve the backport of the bug fix for this bug.  The fix has 
> been tested all week with no problems.  The patch didn't import 
> because MallocTrackingVerify.java didn't have an @ignore tag, which 
> was removed by the jdk9 fix.  Everything else applied cleanly.
>
> Summary: Signed bitfield size y can only have (1 << y)-1 values.
> Reviewed-by: shade, dholmes, jrose, ctornqvi, gtriantafill
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8062870_8u40/
> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>
> Thanks!
> Coleen
>
>
>


From coleen.phillimore at oracle.com  Mon Nov 17 16:21:05 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 17 Nov 2014 11:21:05 -0500
Subject: [8u40] RFR 8062870: src/share/vm/services/mallocTracker.hpp:64
	assert(_count > 0) failed: Negative ,counter
In-Reply-To: <5469EF98.8010207@oracle.com>
References: <5466868F.2020802@oracle.com> <5469EF98.8010207@oracle.com>
Message-ID: <546A2071.2060204@oracle.com>


Thank you, George!
Coleen

On 11/17/14, 7:52 AM, George Triantafillou wrote:
> Hi Coleen,
>
> This looks good.
>
> -George
>
> On 11/14/2014 5:47 PM, Coleen Phillimore wrote:
>> Please approve the backport of the bug fix for this bug.  The fix has 
>> been tested all week with no problems.  The patch didn't import 
>> because MallocTrackingVerify.java didn't have an @ignore tag, which 
>> was removed by the jdk9 fix.  Everything else applied cleanly.
>>
>> Summary: Signed bitfield size y can only have (1 << y)-1 values.
>> Reviewed-by: shade, dholmes, jrose, ctornqvi, gtriantafill
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8062870_8u40/
>> bug link https://bugs.openjdk.java.net/browse/JDK-8062870
>>
>> Thanks!
>> Coleen
>>
>>
>>
>


From mandy.chung at oracle.com  Mon Nov 17 16:57:23 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 17 Nov 2014 08:57:23 -0800
Subject: [8u40] Review request  8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
Message-ID: <546A28F3.1010802@oracle.com>

This requests both code review and 8u40 approval for:
    https://bugs.openjdk.java.net/browse/JDK-8064667

Webrev:
http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/

JEP 220 [1] proposes to remove the endorsed standards override mechanism 
and the extension mechanism. This patch adds a VM flag in 8u40 that users 
can turn on to find out whether they depend on either mechanism, so that 
they can plan the migration to a newer JDK release early on. When 
-XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the system 
property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if 
${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any system 
extension directory contains JAR files.

Thanks
Mandy
[1] http://openjdk.java.net/jeps/220


From vladimir.kempik at oracle.com  Mon Nov 17 16:20:58 2014
From: vladimir.kempik at oracle.com (Vladimir Kempik)
Date: Mon, 17 Nov 2014 20:20:58 +0400
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads per
	core in Amazon EC2 environment
Message-ID: <546A206A.4070604@oracle.com>

Hi,

Please review the patch adding a sanity check to cores_per_cpu():

http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8058935

A few months ago we got reports of Java crashing in the Amazon EC2 
environment (they use Xen).
https://bugs.openjdk.java.net/browse/JDK-8058935
https://bugs.openjdk.java.net/browse/JDK-8058937

The JVM args used to trigger the crash: -XX:+UnlockCommercialFeatures 
-XX:+FlightRecorder

After investigation I think the crash could only have happened if 
support_processor_topology() returned true and 
_cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.

I wasn't able to reproduce the bug on the Amazon EC2 cloud recently.

The patch adds a sanity check: if CPU topology was used and resulted in 
0 cores per cpu, it falls back to the non-topology variant, which can't 
result in 0 cores per cpu.

Testing: JPRT.

Thanks,
Vladimir.
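
A minimal C++ sketch of the fallback described above, not the code from
the webrev. The struct and its field names are hypothetical stand-ins
for the CPUID data, and the extra guard against a zero divisor is an
assumption of this sketch rather than something stated in the mail.

```cpp
// Hypothetical, trimmed-down view of the CPUID data involved.
struct CpuidSnapshot {
  bool     supports_topology;    // result of the processor-topology check
  unsigned b1_logical_cpus;      // leaf 0xB level 1: logical CPUs per package
  unsigned b0_logical_cpus;      // leaf 0xB level 0: logical CPUs per core
  unsigned cpuid4_cores_minus1;  // leaf 4: (cores per package) - 1
};

unsigned cores_per_cpu(const CpuidSnapshot& c) {
  unsigned result = 0;
  if (c.supports_topology && c.b0_logical_cpus != 0) {
    result = c.b1_logical_cpus / c.b0_logical_cpus;
  }
  if (result == 0) {
    // Topology data missing or implausible: use the non-topology value,
    // which is at least 1 because of the "+ 1".
    result = c.cpuid4_cores_minus1 + 1;
  }
  return result;
}
```

A snapshot with topology reported but a zero logical-CPU count then
yields the leaf 4 value instead of 0.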

From sean.coffey at oracle.com  Mon Nov 17 18:06:17 2014
From: sean.coffey at oracle.com (=?windows-1252?Q?Se=E1n_Coffey?=)
Date: Mon, 17 Nov 2014 18:06:17 +0000
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546A28F3.1010802@oracle.com>
References: <546A28F3.1010802@oracle.com>
Message-ID: <546A3919.7020804@oracle.com>

Looks good to me Mandy. Best to get a runtime reviewer to look at that I 
guess.

A few other comments:
a) Will you be filing a CCC for the new flag?
b) Maybe a testcase would be useful (a simple one launching in ovm mode 
with java.endorsed.dirs etc.).
c) Will you be pushing this to the hs-dev or jdk8u-dev forest? It seems 
most relevant for the hotspot team forest. No approval required in that case.

regards,
Sean.

On 17/11/14 16:57, Mandy Chung wrote:
> This requests both code review and 8u40 approval for:
>    https://bugs.openjdk.java.net/browse/JDK-8064667
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>
> JEP 220 [1] proposes to remove the endorsed standards override 
> mechanism and extension mechanism. This patch adds a VM flag in 8u40 
> to help identify any existing uses of these mechanisms so that users 
> can turn on the VM flag to help identify if they depend on the 
> endorsed standards override mechanism and extension mechanism and can 
> plan to prepare for the migration to a newer JDK release early on. 
> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the 
> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if 
> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any 
> system extension directory contains JAR files.
>
> Thanks
> Mandy
> [1] http://openjdk.java.net/jeps/220
>
>
>


From coleen.phillimore at oracle.com  Mon Nov 17 19:10:36 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 17 Nov 2014 14:10:36 -0500
Subject: [8u40] RFR 8048169: Change 8037816 breaks HS build on PPC64 and
	CPP-Interpreter platforms
Message-ID: <546A482C.7030508@oracle.com>

Summary: Fix the matching of format string parameter types to the actual 
argument types for the PPC64 and CPP-Interpreter files in the same way 
as 8037816 already did it for all the other files
Reviewed-by: stefank, coleenp, dholmes


This is a 8u40 backport of the changes that Volker did for 9. Please 
approve.

Coleen
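
As a generic illustration of matching a format string to its argument
type (the kind of mismatch the summary above refers to), here is a small
C++ sketch. It uses the standard <cinttypes> macros rather than
HotSpot's own format macros, and the helper name is hypothetical.

```cpp
#include <cinttypes>  // PRIu64
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit unsigned value with a conversion that matches its type.
std::string format_u64(std::uint64_t value) {
  char buf[32];
  // A hard-coded "%d" or "%ld" is only right where uint64_t happens to
  // have that type; PRIu64 expands to the matching conversion.
  std::snprintf(buf, sizeof(buf), "%" PRIu64, value);
  return buf;
}
```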

From christian.tornqvist at oracle.com  Mon Nov 17 19:21:23 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Mon, 17 Nov 2014 14:21:23 -0500
Subject: [8u40] RFR 8048169: Change 8037816 breaks HS build on PPC64
	and	CPP-Interpreter platforms
In-Reply-To: <546A482C.7030508@oracle.com>
References: <546A482C.7030508@oracle.com>
Message-ID: <042701d0029b$adb04060$0910c120$@oracle.com>

Sounds like a good idea to backport this.

Thanks,
Christian

-----Original Message-----
From: hotspot-runtime-dev
[mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Coleen
Phillimore
Sent: Monday, November 17, 2014 2:11 PM
To: hotspot-runtime-dev
Subject: [8u40] RFR 8048169: Change 8037816 breaks HS build on PPC64 and
CPP-Interpreter platforms

Summary: Fix the matching of format string parameter types to the actual
argument types for the PPC64 and CPP-Interpreter files in the same way as
8037816 already did it for all the other files
Reviewed-by: stefank, coleenp, dholmes


This is a 8u40 backport of the changes that Volker did for 9. Please
approve.

Coleen


From coleen.phillimore at oracle.com  Mon Nov 17 19:23:01 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 17 Nov 2014 14:23:01 -0500
Subject: [8u40] RFR 8048169: Change 8037816 breaks HS build on PPC64 and
	CPP-Interpreter platforms
In-Reply-To: <042701d0029b$adb04060$0910c120$@oracle.com>
References: <546A482C.7030508@oracle.com>
	<042701d0029b$adb04060$0910c120$@oracle.com>
Message-ID: <546A4B15.1030909@oracle.com>


Thanks Christian.  I forgot to mention that this change imported cleanly 
from the jdk9 changes.

Coleen

On 11/17/14, 2:21 PM, Christian Tornqvist wrote:
> Sounds like a good idea to backport this.
>
> Thanks,
> Christian
>
> -----Original Message-----
> From: hotspot-runtime-dev
> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Coleen
> Phillimore
> Sent: Monday, November 17, 2014 2:11 PM
> To: hotspot-runtime-dev
> Subject: [8u40] RFR 8048169: Change 8037816 breaks HS build on PPC64 and
> CPP-Interpreter platforms
>
> Summary: Fix the matching of format string parameter types to the actual
> argument types for the PPC64 and CPP-Interpreter files in the same way as
> 8037816 already did it for all the other files
> Reviewed-by: stefank, coleenp, dholmes
>
>
> This is a 8u40 backport of the changes that Volker did for 9. Please
> approve.
>
> Coleen
>


From calvin.cheung at oracle.com  Mon Nov 17 19:40:33 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 17 Nov 2014 11:40:33 -0800
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546A28F3.1010802@oracle.com>
References: <546A28F3.1010802@oracle.com>
Message-ID: <546A4F31.2040406@oracle.com>

Hi Mandy,

In 8u40, the jre/lib/ext dir still exists and contains nashorn.jar, 
javafx.jar, etc.
Would those jar files be moved to a different dir?

Some minor comments in arguments.cpp:
lines 3470 and 3472 can be combined as follows:
int nonEmptyDirs = check_non_empty_dirs(Arguments::get_endorsed_dir(), 
"endorsed");

before return JNI_ERR; at lines 3493 and 3503, the dir should be closed:
os::closedir(dir);

Calvin

On 11/17/2014 8:57 AM, Mandy Chung wrote:
> This requests both code review and 8u40 approval for:
>    https://bugs.openjdk.java.net/browse/JDK-8064667
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>
> JEP 220 [1] proposes to remove the endorsed standards override 
> mechanism and extension mechanism. This patch adds a VM flag in 8u40 
> to help identify any existing uses of these mechanisms so that users 
> can turn on the VM flag to help identify if they depend on the 
> endorsed standards override mechanism and extension mechanism and can 
> plan to prepare for the migration to a newer JDK release early on. 
> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the 
> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if 
> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any 
> system extension directory contains JAR files.
>
> Thanks
> Mandy
> [1] http://openjdk.java.net/jeps/220
>
>
>
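
The closedir remark above can be sketched as follows: a directory scan
for JAR files that closes the handle on every path after a successful
open. This is an assumption-laden illustration using plain POSIX calls
instead of the os:: wrappers, and both function names are hypothetical.

```cpp
#include <dirent.h>  // opendir, readdir, closedir (POSIX)
#include <cstring>   // std::strlen, std::strcmp

// True if the file name ends in ".jar".
bool has_jar_suffix(const char* name) {
  const std::size_t len = std::strlen(name);
  return len >= 4 && std::strcmp(name + len - 4, ".jar") == 0;
}

// Returns 1 if the directory contains a JAR file, 0 if it does not,
// and -1 if the directory cannot be opened.
int contains_jar(const char* path) {
  DIR* dir = opendir(path);
  if (dir == nullptr) return -1;  // nothing to close yet
  int found = 0;
  while (dirent* entry = readdir(dir)) {
    if (has_jar_suffix(entry->d_name)) {
      found = 1;
      break;
    }
  }
  closedir(dir);  // single exit point, so the handle is never leaked
  return found;
}
```

Funnelling all results through one exit point is one way to avoid the
early-return leak; closing before each return works as well.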


From vladimir.kozlov at oracle.com  Mon Nov 17 19:47:47 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 17 Nov 2014 11:47:47 -0800
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546A206A.4070604@oracle.com>
References: <546A206A.4070604@oracle.com>
Message-ID: <546A50E3.6010200@oracle.com>

According to the following document, the CPU has 10 cores (and 2 threads per core):

http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz

The hs_err in the bug report shows only 2 processors, and the following 
lines are missing:

physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0

I assume it is some kind of virtual environment in which cpuid 
topology does not work (at least our code does not work).
We may be missing some checks which indicate that topology is not supported.
It would be nice if you could put all topology and related cpuid bits from 
Amazon EC2 in the bug report.
Checking for 0 could be fine, but a nonzero value could still be wrong 
if topology info is not supported.

Thanks,
Vladimir

On 11/17/14 8:20 AM, Vladimir Kempik wrote:
> Hi,
>
> Please review patch adding sanity check to cores_per_cpu():
>
> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8058935
>
> Few months ago we've got reports of java crashing in amazon ec2
> environment (they use Xen).
> https://bugs.openjdk.java.net/browse/JDK-8058935
> https://bugs.openjdk.java.net/browse/JDK-8058937
>
> JVM args was used to make the crash: -XX:+UnlockCommercialFeatures
> -XX:+FlightRecorder
>
> After investigation I think the crash could only have happened if
> support_processor_topology() returned true and
> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>
> I wasn't able to reproduce the bug on amazon ec2 cloud in present days.
>
> The patch adds sanity check, if cpu topology was used and resulted in 0
> cores per cpu, then fallback to non-topology variant, which can't result
> in 0 cores per cpu.
>
> Testing: JPRT.
>
> Thanks,
> Vladimir.

From coleen.phillimore at oracle.com  Mon Nov 17 21:21:01 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Mon, 17 Nov 2014 16:21:01 -0500
Subject: [9] RFR(L) 8013267 : move MemberNameTable from native code to
	Java heap, use to intern MemberNames
In-Reply-To: <39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
References: <594FE416-D445-426E-B7A9-C75F012ADE16@oracle.com>	<A7147834-CFBA-4CB7-BDA9-F5F7FEB33B35@oracle.com>	<0BDCD5DD-CBFB-4A3D-9BAD-7F9912E9237C@oracle.com>	<5453C230.8010709@oracle.com>	<9747A3A3-B7F4-407A-8E00-1E647D9DC2D1@oracle.com>	<CAHjP37EEaD1oyYW47twcAgxp_+3Q399fHv4Cg9oeiZU0BNr7yw@mail.gmail.com>	<1D195ED3-8BD2-46CB-9D70-29CF809D9F5A@oracle.com>	<5456FB59.60905@oracle.com>	<632A5C98-B386-4625-BE12-355241581955@oracle.com>	<5457AA75.8090103@gmail.com>	<F5B0B719-8B7B-4FDF-96E3-4058E223E74B@oracle.com>	<5457E0F9.8090004@gmail.com>	<F30775CD-7FAB-4C69-9150-769DF273C3D9@oracle.com>	<5458A57C.4060208@gmail.com>	<260D49F5-6380-4FC3-A900-6CD9AB3ED6F7@oracle.com>	<5459034E.8070809@gmail.com>	<D1D7CFCF-D5C0-4A05-A636-80685CF153B0@oracle.com>
	<39826508-110B-4FCE-9A58-8C3D1B9FC7DE@oracle.com>
Message-ID: <546A66BD.7090904@oracle.com>


Hi,

I recommend that we split this bug into three changes.  One to fix the 
class redefinition problem (RFR coming shortly), one to intern 
MemberNames for performance, and the third to potentially (maybe) make 
class redefinition work with the new member name table.  In this change, 
I don't like how class redefinition has leaked into the java code to 
intern member names.

Thanks,
Coleen

On 11/7/14, 4:14 PM, David Chase wrote:
> New webrev:
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8013267
>
> webrevs:
> http://cr.openjdk.java.net/~drchase/8013267/jdk.06/
> http://cr.openjdk.java.net/~drchase/8013267/hotspot.06/
>
> Changes since last:
>
> 1) refactored to put ClassData under java.lang.invoke.MemberName
>
> 2) split the data structure into two parts; handshake with JVM uses a linked list,
> which makes for a simpler backout-if-race, and Java side continues to use the
> simple sorted array.  This should allow easier use of (for example) fancier
> data structures (like ConcurrentHashMap) if this later proves necessary.
>
> 3) Cleaned up symbol references in the new hotspot code to go through vmSymbols.
>
> 4) renamed oldCapacity to oldSize
>
> 5) ran two different benchmarks and saw no change in performance.
>    a) nashorn ScriptTest (see https://bugs.openjdk.java.net/browse/JDK-8014288 )
>    b) JMH microbenchmarks
>    (see bug comments for details)
>
> And it continues to pass the previously-failing tests, as well as the new test
> which has been added to hotspot/test/compiler/jsr292 .
>
> David
>
> On 2014-11-04, at 3:54 PM, David Chase <david.r.chase at oracle.com> wrote:
>
>> I'm working on the initial benchmarking, and so far this arrangement (with synchronization
>> and binary search for lookup, lots of barriers and linear cost insertion) has not yet been any
>> slower.
>>
>> I am nonetheless tempted by the 2-tables solution, because I think the simpler JVM-side
>> interface that it allows is desirable.
>>
>> David
>>
>> On 2014-11-04, at 11:48 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>>> On 11/04/2014 04:19 PM, David Chase wrote:
>>>> On 2014-11-04, at 5:07 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>>>> Are you thinking of an IdentityHashMap type of hash table (no linked-list of elements for same bucket, just search for 1st free slot on insert)? The problem would be how to pre-size the array. Count declared members?
>>>> It can't be an IdentityHashMap, because we are interning member names.
>>> I know it can't be IdentityHashMap - I just wondered if you were thinking of an IdentityHashMap-like data structure in contrast to standard HashMap-like. Not in terms of equality/hashCode used, but in terms of internal data structure. IdentityHashMap is just an array of elements (well pairs of them - key, value are placed in two consecutive array slots). Lookup searches for element linearly in the array starting from hashCode based index to the element if found or 1st empty array slot. It's very easy to implement if the only operations are get() and put() and could be used for interning and as a shared structure for VM to scan, but array has to be sized to at least 3/2 the number of elements for performance to not degrade.
>>>
>>>> In spite of my grumbling about benchmarking, I'm inclined to do that and try a couple of experiments.
>>>> One possibility would be to use two data structures, one for interning, the other for communication with the VM.
>>>> Because there's no lookup in the VM data structure it can just be an array that gets elements appended,
>>>> and the synchronization dance is much simpler.
>>>>
>>>> For interning, maybe I use a ConcurrentHashMap, and I try the following idiom:
>>>>
>>>> mn = resolve(args)
>>>> // deal with any errors
>>>> mn' = chm.get(mn)
>>>> if (mn' != null) return mn' // hoped-for-common-case
>>>>
>>>> synchronized (something) {
>>>>   mn' = chm.get(mn)
>>>>   if (mn' != null) return mn'
>>>>   txn_class = mn.getDeclaringClass()
>>>>
>>>>   while (true) {
>>>>     redef_count = txn_class.redefCount()
>>>>     mn = resolve(args)
>>>>
>>>>     shared_array.add(mn);
>>>>     // barrier, because we are paranoid
>>>>     if (redef_count == txn_class.redefCount()) {
>>>>       chm.add(mn); // safe to publish to other Java threads.
>>>>       return mn;
>>>>     }
>>>>     shared_array.drop_last(); // Try again
>>>>   }
>>>> }
>>>>
>>>> (Idiom gets slightly revised for the one or two other intern use cases, but this is the basic idea).
>>> Yes, that's similar to what I suggested by using a linked-list of MemberName(s) instead of the "shared_array" (easier to reason about ordering of writes) and a sorted array of MemberName(s) instead of the "chm" in your scheme above. ConcurrentHashMap would certainly be the most performant solution in terms of lookup/insertion-time and concurrent throughput, but it will use more heap than a simple packed array of MemberNames. CHM is much better now in JDK8 though regarding heap use.
>>>
>>> A combination of the two approaches is also possible:
>>>
>>> - instead of maintaining a "shared_array" of MemberName(s), have them form a linked-list (you trade a slot in array for 'next' pointer in MemberName)
>>> - use ConcurrentHashMap for interning.
>>>
>>> Regards, Peter
>>>
>>>> David
>>>>
>>>>>> And another way to view this is that we're now quibbling about performance, when we still
>>>>>> have an existing correctness problem that this patch solves, so maybe we should just get this
>>>>>> done and then file an RFE.
>>>>> Perhaps, yes. But note that questions about JMM and ordering of writes to array elements are about correctness, not performance.
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>> David


From jiangli.zhou at oracle.com  Mon Nov 17 21:52:27 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 17 Nov 2014 13:52:27 -0800
Subject: [8u40] RFR 8054008 & 8064375 backports
Message-ID: <546A6E1B.2010705@oracle.com>

Hi,

Please approve the backport of the following bugs to 8u40:

JDK-8054008 <https://bugs.openjdk.java.net/browse/JDK-8054008>: Using 
-XX:-LazyBootClassLoader crashes with ACCESS_VIOLATION on Win 64bit 
(http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/9dd17854c570)
JDK-8064735 <https://bugs.openjdk.java.net/browse/JDK-8064735>: Change 
certain errors to warnings in CDS output 
(http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6155ff53a422)

webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev.backport/

Thanks,
Jiangli

From christian.tornqvist at oracle.com  Mon Nov 17 22:02:21 2014
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Mon, 17 Nov 2014 17:02:21 -0500
Subject: [8u40] RFR 8054008 & 8064375 backports
In-Reply-To: <546A6E1B.2010705@oracle.com>
References: <546A6E1B.2010705@oracle.com>
Message-ID: <065a01d002b2$29f1c890$7dd559b0$@oracle.com>

Hi Jiangli,

Sounds like a good idea to backport these.

Thanks,
Christian

-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Jiangli Zhou
Sent: Monday, November 17, 2014 4:52 PM
To: hotspot-runtime-dev at openjdk.java.net
Subject: [8u40] RFR 8054008 & 8064375 backports

Hi,

Please approve the backport for following bugs to 8u40:

JDK-8054008 <https://bugs.openjdk.java.net/browse/JDK-8054008>: Using -XX:-LazyBootClassLoader crashes with ACCESS_VIOLATION on Win 64bit
(http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/9dd17854c570)
JDK-8064735 <https://bugs.openjdk.java.net/browse/JDK-8064735>: Change certain errors to warnings in CDS output
(http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6155ff53a422)

webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev.backport/

Thanks,
Jiangli


From jiangli.zhou at oracle.com  Mon Nov 17 22:04:12 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 17 Nov 2014 14:04:12 -0800
Subject: [8u40] RFR 8054008 & 8064375 backports
In-Reply-To: <065a01d002b2$29f1c890$7dd559b0$@oracle.com>
References: <546A6E1B.2010705@oracle.com>
	<065a01d002b2$29f1c890$7dd559b0$@oracle.com>
Message-ID: <546A70DC.7040708@oracle.com>

Thank you Christian for the quick response!

Jiangli

On 11/17/2014 02:02 PM, Christian Tornqvist wrote:
> Hi Jiangli,
>
> Sounds like a good idea to backport these.
>
> Thanks,
> Christian
>
> -----Original Message-----
> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Jiangli Zhou
> Sent: Monday, November 17, 2014 4:52 PM
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: [8u40] RFR 8054008 & 8064375 backports
>
> Hi,
>
> Please approve the backport for following bugs to 8u40:
>
> JDK-8054008 <https://bugs.openjdk.java.net/browse/JDK-8054008>: Using -XX:-LazyBootClassLoader crashes with ACCESS_VIOLATION on Win 64bit
> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/9dd17854c570)
> JDK-8064735 <https://bugs.openjdk.java.net/browse/JDK-8064735>: Change certain errors to warnings in CDS output
> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6155ff53a422)
>
> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev.backport/
>
> Thanks,
> Jiangli
>


From mikhailo.seledtsov at oracle.com  Mon Nov 17 23:19:52 2014
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Mon, 17 Nov 2014 15:19:52 -0800
Subject: [8u40] RFR 8054008 & 8064375 backports
In-Reply-To: <546A70DC.7040708@oracle.com>
References: <546A6E1B.2010705@oracle.com>
	<065a01d002b2$29f1c890$7dd559b0$@oracle.com>
	<546A70DC.7040708@oracle.com>
Message-ID: <546A8298.9040208@oracle.com>

I agree, sounds like a good idea.

Misha

On 11/17/2014 2:04 PM, Jiangli Zhou wrote:
> Thank you Christian for the quick response!
>
> Jiangli
>
> On 11/17/2014 02:02 PM, Christian Tornqvist wrote:
>> Hi Jiangli,
>>
>> Sounds like a good idea to backport these.
>>
>> Thanks,
>> Christian
>>
>> -----Original Message-----
>> From: hotspot-runtime-dev 
>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of 
>> Jiangli Zhou
>> Sent: Monday, November 17, 2014 4:52 PM
>> To: hotspot-runtime-dev at openjdk.java.net
>> Subject: [8u40] RFR 8054008 & 8064375 backports
>>
>> Hi,
>>
>> Please approve the backport for following bugs to 8u40:
>>
>> JDK-8054008 <https://bugs.openjdk.java.net/browse/JDK-8054008>: Using 
>> -XX:-LazyBootClassLoader crashes with ACCESS_VIOLATION on Win 64bit
>> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/9dd17854c570)
>> JDK-8064735 <https://bugs.openjdk.java.net/browse/JDK-8064735>: 
>> Change certain errors to warnings in CDS output
>> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6155ff53a422)
>>
>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev.backport/
>>
>> Thanks,
>> Jiangli
>>
>


From jiangli.zhou at oracle.com  Tue Nov 18 00:52:47 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 17 Nov 2014 16:52:47 -0800
Subject: [8u40] RFR 8054008 & 8064375 backports
In-Reply-To: <546A8298.9040208@oracle.com>
References: <546A6E1B.2010705@oracle.com>
	<065a01d002b2$29f1c890$7dd559b0$@oracle.com>
	<546A70DC.7040708@oracle.com> <546A8298.9040208@oracle.com>
Message-ID: <546A985F.7030504@oracle.com>

Thank you, Misha.

Jiangli

On 11/17/2014 03:19 PM, Mikhailo Seledtsov wrote:
> I agree, sounds like a good idea.
>
> Misha
>
> On 11/17/2014 2:04 PM, Jiangli Zhou wrote:
>> Thank you Christian for the quick response!
>>
>> Jiangli
>>
>> On 11/17/2014 02:02 PM, Christian Tornqvist wrote:
>>> Hi Jiangli,
>>>
>>> Sounds like a good idea to backport these.
>>>
>>> Thanks,
>>> Christian
>>>
>>> -----Original Message-----
>>> From: hotspot-runtime-dev 
>>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of 
>>> Jiangli Zhou
>>> Sent: Monday, November 17, 2014 4:52 PM
>>> To: hotspot-runtime-dev at openjdk.java.net
>>> Subject: [8u40] RFR 8054008 & 8064375 backports
>>>
>>> Hi,
>>>
>>> Please approve the backport for following bugs to 8u40:
>>>
>>> JDK-8054008 <https://bugs.openjdk.java.net/browse/JDK-8054008>: 
>>> Using -XX:-LazyBootClassLoader crashes with ACCESS_VIOLATION on Win 
>>> 64bit
>>> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/9dd17854c570)
>>> JDK-8064735 <https://bugs.openjdk.java.net/browse/JDK-8064735>: 
>>> Change certain errors to warnings in CDS output
>>> (http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/6155ff53a422)
>>>
>>> webrev: http://cr.openjdk.java.net/~jiangli/8054008/webrev.backport/
>>>
>>> Thanks,
>>> Jiangli
>>>
>>
>


From david.holmes at oracle.com  Tue Nov 18 01:50:14 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Nov 2014 11:50:14 +1000
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <5469B920.4040300@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<5469B920.4040300@oracle.com>
Message-ID: <546AA5D6.4050203@oracle.com>

Hi Ivan,

On 17/11/2014 7:00 PM, Ivan Gerasimov wrote:
> Thanks David!
>
> On 17.11.2014 9:40, David Holmes wrote:
>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>> Thank you Daniel!
>>>
>>> Please find the updated webrev with your suggestions incorporated here:
>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>
>>> Concerning the thread priority: If the application is of
>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10
>>> (of maximum 31).
>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>>
>>>
>>>
>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>> HIGHEST priority level will have priority value == 15 of 31.
>>>
>>> I believe, it should not be too much, and the machine will not become
>>> busy with only those closing threads.
>>> However, I hope it would be enough to make them complete faster than
>>> other threads of the NORMAL priority level within the same application.
>>
>> I don't think this is necessary or desirable. Under normal usage we're
>> giving priority to exiting threads and that may disrupt the usual
>> scheduling patterns that applications see. You may posit that it is
>> "harmless" but we can't say that for sure. Nor can we actually know
>> that this will help with this particular bug. I would not add in this
>> new code.
>>
>
> There are two places where I put adjusting the thread's priority:
>
> 1) We've the array of handles filled up.
>
> If we're found in this code branch, it'll mean that unfortunately we've
> already got broken exit pattern, because the current thread has to do a
> blocking call, having the ownership of a critical section.
> The full array of handles means that many threads are exiting at that
> time, thus all the threads that are starting to exit after the current
> one will block at the attempt to grab ownership of the critical section.
>
> Raising the priority of one thread that had already reached
> _endthreadex(), seems appropriate to me in such a situation, because it
> helps shorten the period of time when the threads remain blocked.
>
> Choosing the oldest exiting thread ensures that the period of time when
> the priority of one thread is higher is the smallest possible.
>
> 2) The process exit branch.
>
> That's the main part of the fix -- here we make the process to wait for
> all the threads having called _endthreadex() to complete, at the same
> time preventing any other threads from starting the exiting procedure.
> The execution flow is already changed here (I don't want to say
> disrupted, because it was meant to fix the issue).
>
> All running threads are about to be terminated soon by ending the
> process, so raising the priority of some of the threads should not have
> any bad impact on the program flow.
> Instead, it may make the time the process has to wait before calling
> exit() shorter.
>
>
> I can surely remove that playing with the threads' priority, as it's not
> the essential part of the fix.
> However, I think it's a useful hint to the scheduler, which can improve
> things in some situations, and I'm not really sure how it can harm.

Okay. You've convinced me. I'm okay with the priority changes to try to 
minimize the exit time blocking.

Thanks,
David

>
> Sincerely yours,
> Ivan
>
>
>> David
>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>> Hello!
>>>>>
>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>> in some test environments:
>>>>> -----------
>>>>> os_windows.cpp:3844 is in the newly updated
>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>> -----------
>>>>>
>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>
>>>>>
>>>>> To address the issue it is proposed to do three things:
>>>>> 1) increase the timeout for debug builds,
>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>> them.
>>>>>
>>>>> Would you please help review the fix?
>>>>>
>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>
>>>> src/os/windows/vm/os_windows.cpp
>>>>
>>>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>
>>>>     That uses the smaller value for only one build config (PRODUCT).
>>>>
>>>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>     Please add spaces between the comment delimiters and the comment text.
>>>>
>>>>     That uses the smaller timeout for only one build config (PRODUCT).
>>>>
>>>>   line 3836           // Rise the priority...
>>>>     Typo: 'Rise' -> 'Raise'
>>>>
>>>>     About the general idea of raising the exiting thread's priority,
>>>>     if the exiting thread is looping in some Win* OS code after this
>>>>     point, will raising the priority make the machine unusable?
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> The fix was tested on all available platforms, with the hotspot
>>>>> testset. No failures.
>>>>>
>>>>> Sincerely yours,
>>>>> Ivan
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>
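
The priority values mentioned in the quoted discussion follow from the
Windows base-priority table: the process priority class fixes a base
value, and the thread priority level is an offset from it. A minimal
Java sketch of that arithmetic, covering only the two classes discussed
and the levels from LOWEST (-2) to HIGHEST (+2); the class and method
names are made up for illustration, and the special IDLE and
TIME_CRITICAL levels are deliberately left out:

```java
final class WinPrioritySketch {
    // Base priority of a THREAD_PRIORITY_NORMAL thread in each process class.
    static final int NORMAL_PRIORITY_CLASS_BASE = 8;
    static final int HIGH_PRIORITY_CLASS_BASE = 13;

    // Relative thread priority levels (offsets from the class base).
    static final int THREAD_PRIORITY_NORMAL = 0;
    static final int THREAD_PRIORITY_HIGHEST = 2;

    // Valid only for offsets in [-2, 2]; other levels are special-cased by Windows.
    static int basePriority(int classBase, int levelOffset) {
        if (levelOffset < -2 || levelOffset > 2) {
            throw new IllegalArgumentException("level not covered by this sketch");
        }
        return classBase + levelOffset;
    }

    public static void main(String[] args) {
        // NORMAL class, HIGHEST level: 8 + 2 = 10 (of a maximum of 31)
        System.out.println(basePriority(NORMAL_PRIORITY_CLASS_BASE, THREAD_PRIORITY_HIGHEST));
        // HIGH class, HIGHEST level: 13 + 2 = 15
        System.out.println(basePriority(HIGH_PRIORITY_CLASS_BASE, THREAD_PRIORITY_HIGHEST));
    }
}
```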

From daniel.daugherty at oracle.com  Tue Nov 18 02:01:18 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 17 Nov 2014 19:01:18 -0700
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546AA5D6.4050203@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<5469B920.4040300@oracle.com> <546AA5D6.4050203@oracle.com>
Message-ID: <546AA86E.3030308@oracle.com>

Ivan,

Please coordinate with Staffan Larsen about when he is planning to
take this week's snapshot of JDK9-hs-rt (RT_Baseline). Please push
your fix after Staffan's snapshot so we can have a week of soak
time for this version of the fix...

Dan


On 11/17/14 6:50 PM, David Holmes wrote:
> Hi Ivan,
>
> On 17/11/2014 7:00 PM, Ivan Gerasimov wrote:
>> Thanks David!
>>
>> On 17.11.2014 9:40, David Holmes wrote:
>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>> Thank you Daniel!
>>>>
>>>> Please find the updated webrev with your suggestions incorporated 
>>>> here:
>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>
>>>> Concerning the thread priority: If the application is of
>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be 
>>>> only 10
>>>> (of maximum 31).
>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx 
>>>>
>>>>
>>>>
>>>>
>>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>
>>>> I believe, it should not be too much, and the machine will not become
>>>> busy with only those closing threads.
>>>> However, I hope it would be enough to make them complete faster than
>>>> other threads of the NORMAL priority level within the same
>>>> application.
>>>
>>> I don't think this is necessary or desirable. Under normal usage we're
>>> giving priority to exiting threads and that may disrupt the usual
>>> scheduling patterns that applications see. You may posit that it is
>>> "harmless" but we can't say that for sure. Nor can we actually know
>>> that this will help with this particular bug. I would not add in this
>>> new code.
>>>
>>
>> There are two places where I put adjusting the thread's priority:
>>
>> 1) We've the array of handles filled up.
>>
>> If we're found in this code branch, it'll mean that unfortunately we've
>> already got broken exit pattern, because the current thread has to do a
>> blocking call, having the ownership of a critical section.
>> The full array of handles means that many threads are exiting at that
>> time, thus all the threads that are starting to exit after the current
>> one will block at the attempt to grab ownership of the critical section.
>>
>> Raising the priority of one thread that had already reached
>> _endthreadex(), seems appropriate to me in such a situation, because it
>> helps shorten the period of time when the threads remain blocked.
>>
>> Choosing the oldest exiting thread ensures that the period of time when
>> the priority of one thread is higher is the smallest possible.
>>
>> 2) The process exit branch.
>>
>> That's the main part of the fix -- here we make the process to wait for
>> all the threads having called _endthreadex() to complete, at the same
>> time preventing any other threads from starting the exiting procedure.
>> The execution flow is already changed here (I don't want to say
>> disrupted, because it was meant to fix the issue).
>>
>> All running threads are about to be terminated soon by ending the
>> process, so raising the priority of some of the threads should not have
>> any bad impact on the program flow.
>> Instead, it may make the time the process has to wait before calling
>> exit() shorter.
>>
>>
>> I can surely remove that playing with the threads' priority, as it's not
>> the essential part of the fix.
>> However, I think it's a useful hint to the scheduler, which can improve
>> things in some situations, and I'm not really sure how it can harm.
>
> Okay. You've convinced me. I'm okay with the priority changes to try 
> to minimize the exit time blocking.
>
> Thanks,
> David
>
>>
>> Sincerely yours,
>> Ivan
>>
>>
>>> David
>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>>
>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>> Hello!
>>>>>>
>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>> in some test environments:
>>>>>> -----------
>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>> -----------
>>>>>>
>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>
>>>>>>
>>>>>> To address the issue it is proposed to do three things:
>>>>>> 1) increase the timeout for debug builds,
>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>>> them.
>>>>>>
>>>>>> Would you please help review the fix?
>>>>>>
>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>>
>>>>> src/os/windows/vm/os_windows.cpp
>>>>>
>>>>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>
>>>>>     That uses the smaller value for only one build config (PRODUCT).
>>>>>
>>>>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>     Please add spaces between the comment delimiters and the comment text.
>>>>>
>>>>>     That uses the smaller timeout for only one build config (PRODUCT).
>>>>>
>>>>>   line 3836           // Rise the priority...
>>>>>     Typo: 'Rise' -> 'Raise'
>>>>>
>>>>>     About the general idea of raising the exiting thread's priority,
>>>>>     if the exiting thread is looping in some Win* OS code after this
>>>>>     point, will raising the priority make the machine unusable?
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>> testset. No failures.
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Ivan
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>


From mandy.chung at oracle.com  Tue Nov 18 02:02:58 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 17 Nov 2014 18:02:58 -0800
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546A28F3.1010802@oracle.com>
References: <546A28F3.1010802@oracle.com>
Message-ID: <546AA8D2.1050600@oracle.com>

Updated webrev:
http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.01/

This addresses Calvin's comment.  It now keeps a list of the JAR files
shipped in jre/lib/ext and determines whether jre/lib/ext has any other,
non-JDK JAR files installed.

Mandy

On 11/17/2014 8:57 AM, Mandy Chung wrote:
> This requests both code review and 8u40 approval for:
>    https://bugs.openjdk.java.net/browse/JDK-8064667
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>
> JEP 220 [1] proposes to remove the endorsed standards override
> mechanism and the extension mechanism. This patch adds a VM flag in
> 8u40 that helps users identify whether they depend on either
> mechanism, so that they can plan their migration to a newer JDK
> release early on.
> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the
> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if
> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any
> system extension directory contains JAR files.
>
> Thanks
> Mandy
> [1] http://openjdk.java.net/jeps/220
>
>
>


From daniel.daugherty at oracle.com  Tue Nov 18 02:05:30 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 17 Nov 2014 19:05:30 -0700
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546AA86E.3030308@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<5469B920.4040300@oracle.com> <546AA5D6.4050203@oracle.com>
	<546AA86E.3030308@oracle.com>
Message-ID: <546AA96A.1020203@oracle.com>

Ivan,

I spoke too soon.

There's a review comment from Markus G that hasn't been addressed.
We need to see if you've convinced Markus in addition to David H.

Dan

P.S.
Look for Markus' reply to David H's e-mail; it's not in this
fork of the review thread...


On 11/17/14 7:01 PM, Daniel D. Daugherty wrote:
> Ivan,
>
> Please coordinate with Staffan Larsen about when he is planning to
> take this week's snapshot of JDK9-hs-rt (RT_Baseline). Please push
> your fix after Staffan's snapshot so we can have a week of soak
> time for this version of the fix...
>
> Dan
>
>
> On 11/17/14 6:50 PM, David Holmes wrote:
>> Hi Ivan,
>>
>> On 17/11/2014 7:00 PM, Ivan Gerasimov wrote:
>>> Thanks David!
>>>
>>> On 17.11.2014 9:40, David Holmes wrote:
>>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>>> Thank you Daniel!
>>>>>
>>>>> Please find the updated webrev with your suggestions incorporated 
>>>>> here:
>>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>>
>>>>> Concerning the thread priority: If the application is of
>>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be 
>>>>> only 10
>>>>> (of maximum 31).
>>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx 
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>>
>>>>> I believe, it should not be too much, and the machine will not become
>>>>> busy with only those closing threads.
>>>>> However, I hope it would be enough to make them complete faster than
>>>>> other threads of the NORMAL priority level within the same
>>>>> application.
>>>>
>>>> I don't think this is necessary or desirable. Under normal usage we're
>>>> giving priority to exiting threads and that may disrupt the usual
>>>> scheduling patterns that applications see. You may posit that it is
>>>> "harmless" but we can't say that for sure. Nor can we actually know
>>>> that this will help with this particular bug. I would not add in this
>>>> new code.
>>>>
>>>
>>> There are two places where I put adjusting the thread's priority:
>>>
>>> 1) We've the array of handles filled up.
>>>
>>> If we're found in this code branch, it'll mean that unfortunately we've
>>> already got broken exit pattern, because the current thread has to do a
>>> blocking call, having the ownership of a critical section.
>>> The full array of handles means that many threads are exiting at that
>>> time, thus all the threads that are starting to exit after the current
>>> one will block at the attempt to grab ownership of the critical 
>>> section.
>>>
>>> Raising the priority of one thread that had already reached
>>> _endthreadex(), seems appropriate to me in such a situation, because it
>>> helps shorten the period of time when the threads remain blocked.
>>>
>>> Choosing the oldest exiting thread ensures that the period of time when
>>> the priority of one thread is higher is the smallest possible.
>>>
>>> 2) The process exit branch.
>>>
>>> That's the main part of the fix -- here we make the process to wait for
>>> all the threads having called _endthreadex() to complete, at the same
>>> time preventing any other threads from starting the exiting procedure.
>>> The execution flow is already changed here (I don't want to say
>>> disrupted, because it was meant to fix the issue).
>>>
>>> All running threads are about to be terminated soon by ending the
>>> process, so raising the priority of some of the threads should not have
>>> any bad impact on the program flow.
>>> Instead, it may make the time the process has to wait before calling
>>> exit() shorter.
>>>
>>>
>>> I can surely remove that playing with the threads' priority, as it's 
>>> not
>>> the essential part of the fix.
>>> However, I think it's a useful hint to the scheduler, which can improve
>>> things in some situations, and I'm not really sure how it can harm.
>>
>> Okay. You've convinced me. I'm okay with the priority changes to try 
>> to minimize the exit time blocking.
>>
>> Thanks,
>> David
>>
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>>> David
>>>>
>>>>> Sincerely yours,
>>>>> Ivan
>>>>>
>>>>>
>>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>>> Hello!
>>>>>>>
>>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>>> in some test environments:
>>>>>>> -----------
>>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>>> -----------
>>>>>>>
>>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>>
>>>>>>>
>>>>>>> To address the issue it is proposed to do three things:
>>>>>>> 1) increase the timeout for debug builds,
>>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>>>> them.
>>>>>>>
>>>>>>> Would you please help review the fix?
>>>>>>>
>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>>>
>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>
>>>>>>   line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>>
>>>>>>     That uses the smaller value for only one build config (PRODUCT).
>>>>>>
>>>>>>   line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>>>>>>     Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>>     Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>>     Please add spaces between the comment delimiters and the comment text.
>>>>>>
>>>>>>     That uses the smaller timeout for only one build config (PRODUCT).
>>>>>>
>>>>>>   line 3836           // Rise the priority...
>>>>>>     Typo: 'Rise' -> 'Raise'
>>>>>>
>>>>>>     About the general idea of raising the exiting thread's priority,
>>>>>>     if the exiting thread is looping in some Win* OS code after this
>>>>>>     point, will raising the priority make the machine unusable?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>>> testset. No failures.
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>
>
>


From ioi.lam at oracle.com  Tue Nov 18 02:13:48 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 18 Nov 2014 10:13:48 +0800
Subject: RFR (XS) 8064701: Some CDS optimizations should be disabled if
	bootclasspath is modified by JVMTI
Message-ID: <546AAB5C.1070009@oracle.com>

Please review a very small fix:

http://cr.openjdk.java.net/~iklam/8064701-append-boot-v1/hotspot/

Bug: Some CDS optimizations should be disabled if bootclasspath is 
modified by JVMTI
     https://bugs.openjdk.java.net/browse/JDK-8064701


Summary of fix:

     This change adds an API so that the class loader is notified when
     JVMTI modifies the boot classpath. Further CDS optimizations can use
     this API to disable optimizations that may be invalidated by boot
     classpath modifications.

     Also added a white box testing API for invoking JVMTI boot/system
     classpath modifications for further CDS testing needs.

Tests:

     JPRT

Thanks
- Ioi

From jiangli.zhou at oracle.com  Tue Nov 18 02:54:41 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 17 Nov 2014 18:54:41 -0800
Subject: RFR (XS) 8064701: Some CDS optimizations should be disabled if
	bootclasspath is modified by JVMTI
In-Reply-To: <546AAB5C.1070009@oracle.com>
References: <546AAB5C.1070009@oracle.com>
Message-ID: <546AB4F1.4010104@oracle.com>

Hi Ioi,

Looks good.

Thanks,
Jiangli

On 11/17/2014 06:13 PM, Ioi Lam wrote:
> Please review a very small fix:
>
> http://cr.openjdk.java.net/~iklam/8064701-append-boot-v1/hotspot/
>
> Bug: Some CDS optimizations should be disabled if bootclasspath is 
> modified by JVMTI
>     https://bugs.openjdk.java.net/browse/JDK-8064701
>
>
> Summary of fix:
>
>     This change adds an API so that the class loader is notified when 
> JVMTI modifies
>     the boot classpath. Further CDS optimizations can use this API to 
> disable
>     optimizations that may be invalidated by boot classpath 
> modifications.
>
> Also added white box testing API for invoking JVMTI boot/system classpath
>     modifications for further CDS testing needs.
>
> Tests:
>
>     JPRT
>
> Thanks
> - Ioi


From yumin.qi at oracle.com  Tue Nov 18 03:59:58 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Mon, 17 Nov 2014 19:59:58 -0800
Subject: RFR (XS) 8064701: Some CDS optimizations should be disabled if
	bootclasspath is modified by JVMTI
In-Reply-To: <546AAB5C.1070009@oracle.com>
References: <546AAB5C.1070009@oracle.com>
Message-ID: <546AC43E.8070800@oracle.com>

Looks good.
Not "R"eviewer.

Thanks
Yumin

On 11/17/2014 6:13 PM, Ioi Lam wrote:
> Please review a very small fix:
>
> http://cr.openjdk.java.net/~iklam/8064701-append-boot-v1/hotspot/
>
> Bug: Some CDS optimizations should be disabled if bootclasspath is 
> modified by JVMTI
>     https://bugs.openjdk.java.net/browse/JDK-8064701
>
>
> Summary of fix:
>
>     This change adds an API so that the class loader is notified when 
> JVMTI modifies
>     the boot classpath. Further CDS optimizations can use this API to 
> disable
>     optimizations that may be invalidated by boot classpath 
> modifications.
>
> Also added white box testing API for invoking JVMTI boot/system classpath
>     modifications for further CDS testing needs.
>
> Tests:
>
>     JPRT
>
> Thanks
> - Ioi


From david.holmes at oracle.com  Tue Nov 18 04:04:44 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 18 Nov 2014 14:04:44 +1000
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <5469994D.3070208@oracle.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>	<5463149A.6020506@oracle.com>
	<54637A9A.9040108@sap.com>	<546470BD.9050303@oracle.com>
	<5464DED4.9040909@sap.com> <5469994D.3070208@oracle.com>
Message-ID: <546AC55C.4090201@oracle.com>

Gunter,

On 17/11/2014 4:44 PM, David Holmes wrote:
> On 14/11/2014 2:39 AM, Haug, Gunter wrote:
>>
>> On 13.11.2014 09:50, David Holmes wrote:
>>> On 13/11/2014 1:19 AM, Haug, Gunter wrote:
>>>>
>>>> On 12.11.2014 09:04, David Holmes wrote:
>>>>> Hi Gunter,
>>>>>
>>>>> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs
>>>>>> improvement)' makes use of getrusage() to retrieve accurate
>>>>>> per-thread data on resource usage. We can use exactly the same code
>>>>>> on AIX to achieve this.
>>>>>>
>>>>>> Please review the following change:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8064471
>>>>>
>>>>> I have a couple of comments on this code which presumably also apply
>>>>> to the original :(
>>>> Yes, they apply to the original as well, see below.
>>>>>
>>>>> First this comment is no longer applicable (actually it was never
>>>>> applicable to AIX!):
>>>>>
>>>>>   // For now, we say that linux does not support vtime. I have no idea
>>>>>   // whether it can actually be made to (DLD, 9/13/05).
>>>>>
>>>> You're right. I will remove it.
>>>>> Second this calculation seems wrong:
>>>>>
>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 *
>>>>> 1000);
>>>>>
>>>>> To me this performs integer division (i.e. truncation) then converts
>>>>> the resulting integer to a double. I would expect to see additional
>>>>> parentheses (even if not needed, for clarity):
>>>>>
>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 *
>>>>> 1000);
>>>>>
>>>>> or more simply divide by a floating-point value:
>>>>>
>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>>
>>>>> and you don't need two double casts regardless as the expression will
>>>>> be of type double as soon as there is one operand of type double. So
>>>>> that should reduce to:
>>>>>
>>>>> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
>>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>>
>>>> OK. Do you want that we also change the Linux version like you
>>>> proposed?
>>>
>>> I'll leave it up to you. If you leave this as AIX only then it tests
>>> the new process :) There can be a follow up cleanup bug for linux.
>>
>> Hi David,
>>
>> I think it's not worth the effort to make two separate changes on linux
>> and aix, so I fixed linux as well. Please find the new webrev below.
>> There will probably be more opportunities to test the new process in the
>> future.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/8064471.v2/
>> <http://cr.openjdk.java.net/%7Esimonis/webrevs/8064471.v2/>
>>
>>
>> Now we need a sponsor, as it is not aix only anymore.
>
> I guess that will have to be me. :) I will try to look at this again
> tomorrow.

The original code was in fact correct - the double cast binds to the 
summation before the division is applied. Given that and the fact the 
linux code doesn't contain the incorrect comment, I don't see any need 
to modify the linux code. You can simply push the AIX change by itself.
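
For anyone following along, here is a minimal sketch of why the cast form is safe. In C and C++ a cast binds more tightly than '/', so the microsecond sum is converted to double before the division happens. FakeTimeval and both function names are made up for illustration and are not taken from the os_aix.cpp or os_linux.cpp sources.

```cpp
// Hypothetical stand-in for the tv_sec/tv_usec pair found in struct rusage.
struct FakeTimeval { long tv_sec; long tv_usec; };

// Same shape as the expression under review: the cast applies to the
// parenthesized sum, so the division is carried out in double.
double with_cast(FakeTimeval u, FakeTimeval s) {
  return (double) (u.tv_sec + s.tv_sec) +
         (double) (u.tv_usec + s.tv_usec) / (1000 * 1000);
}

// The behaviour that was initially suspected: integer division first,
// conversion to double afterwards, which drops the fractional part.
double truncating(FakeTimeval u, FakeTimeval s) {
  return (double) (u.tv_sec + s.tv_sec) +
         (double) ((u.tv_usec + s.tv_usec) / (1000 * 1000));
}
```

With 1.25 s of user time and 2.5 s of system time, with_cast() keeps the 0.75 s fraction while truncating() loses it.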

Sorry for messing you around on this.

David

> David
>
>> Thanks,
>> Gunter
>>
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Gunter
>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Gunter
>>>>>>
>>>>
>>

From ivan.gerasimov at oracle.com  Tue Nov 18 07:29:30 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Tue, 18 Nov 2014 10:29:30 +0300
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
Message-ID: <546AF55A.8090203@oracle.com>

Hi Markus!

The priority of the exiting thread will be raised for quite a short 
period of time -- right before the thread finishes exiting.

There are two places where the priority is adjusted.

Under normal conditions we should never see the first place hit. 
However, if we do, this means we have a huge number of threads.
Raising the priority of one of them is a hint about which thread we want 
the scheduler to focus on.

The second place is a bit different.
We have several threads running immediately before ending the process.
Some of them are at the exiting path and block exiting of the whole process.
Raising the priority of those threads is a way to say we're not 
interested in all the other threads, as they are going to be terminated 
anyway.

I just noticed that in the second scenario it may be appropriate to set the 
priority of the current thread to the same level as for the exiting threads.
This way it'll be given a fair chance to continue if the timeout expires.

I also think it should be enough to set the priority level to 
THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST.
It will give just +1 to the priority value -- should be enough for the hint.
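
A rough, portable sketch of the arithmetic behind these numbers, assuming the usual Windows mapping where the base priority is the priority class base plus the thread level offset. The enum and function names are local stand-ins, not the real windows.h macros; the actual change would go through SetThreadPriority().

```cpp
// Base values of the two priority classes discussed in this thread.
enum PriorityClassBase { kNormalClassBase = 8, kHighClassBase = 13 };

// Offsets contributed by the thread priority level.
enum ThreadLevelOffset {
  kLevelNormal = 0,
  kLevelAboveNormal = 1,
  kLevelHighest = 2
};

// Resulting base priority for the levels above (the overall range is 1..31).
// Idle and time-critical levels do not follow this simple sum and are
// deliberately left out of the sketch.
constexpr int base_priority(PriorityClassBase cls, ThreadLevelOffset level) {
  return static_cast<int>(cls) + static_cast<int>(level);
}
```

This gives 9 for ABOVE_NORMAL and 10 for HIGHEST in a NORMAL_PRIORITY_CLASS process, and 15 for HIGHEST in a HIGH_PRIORITY_CLASS process.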

Would you please take a look at the updated webrev:
http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/

Sincerely yours,
Ivan


On 17.11.2014 11:33, Markus Grönlund wrote:
> I agree with David.
>
> The side effects will be unknown and very hard to debug.
>
> Is there another way to accomplish the results without manipulating base services?
>
> Thanks
> Markus
>
> -----Original Message-----
> From: David Holmes
> Sent: den 17 november 2014 07:40
> To: Ivan Gerasimov; Daniel Daugherty
> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
> Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
>
> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>> Thank you Daniel!
>>
>> Please find the updated webrev with your suggestions incorporated here:
>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>
>> Concerning the thread priority: If the application is of
>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only
>> 10 (of maximum 31).
>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>
>>
>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>> HIGHEST priority level will have priority value == 15 of 31.
>>
>> I believe, it should not be too much, and the machine will not become
>> busy with only those closing threads.
>> However, I hope it would be enough to make them complete faster than
>> other threads of the NORMAL priority level within the same application.
> I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is "harmless" but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code.
>
> David
>
>> Sincerely yours,
>> Ivan
>>
>>
>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>> Hello!
>>>>
>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>> for exiting threads [win]) caused the warning message to be printed
>>>> in some test environments:
>>>> -----------
>>>> os_windows.cpp:3844 is in the newly updated
>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>> -----------
>>>>
>>>> This has been observed with debug builds on highly loaded systems.
>>>>
>>>>
>>>> To address the issue it is proposed to do three things:
>>>> 1) increase the timeout for debug builds,
>>>> 2) increase the maximum number of the thread handles to be stored,
>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>> them.
>>>>
>>>> Would you please help review the fix?
>>>>
>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>> src/os/windows/vm/os_windows.cpp
>>>
>>>    line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>
>>>      That uses the smaller value for only one build config (PRODUCT).
>>>
>>>    line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000)
>>> /*1 sec in product, 4 sec in debug*/
>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>      Please add spaces between the comment delimiters and the comment
>>> text.
>>>
>>>      That uses the smaller timeout for only one build config (PRODUCT).
>>>
>>>    line 3836           // Rise the priority...
>>>      Typo: 'Rise' -> 'Raise'
>>>
>>>      About the general idea of raising the exiting thread's priority,
>>>      if the exiting thread is looping in some Win* OS code after this
>>>      point, will raising the priority make the machine unusable?
>>>
>>> Dan
>>>
>>>
>>>> The fix was tested on all available platforms, with the hotspot
>>>> testset. No failures.
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>
>>>
>


From volker.simonis at gmail.com  Tue Nov 18 09:50:37 2014
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 18 Nov 2014 10:50:37 +0100
Subject: RFR(XS): 8064471: Port 8013895: G1: G1SummarizeRSetStats output
	on Linux needs improvement to AIX
In-Reply-To: <546AC55C.4090201@oracle.com>
References: <44EB1DA1040EEB44B7F343A782AD30789FEBA3E3@DEWDFEMB11B.global.corp.sap>
	<5463149A.6020506@oracle.com> <54637A9A.9040108@sap.com>
	<546470BD.9050303@oracle.com> <5464DED4.9040909@sap.com>
	<5469994D.3070208@oracle.com> <546AC55C.4090201@oracle.com>
Message-ID: <CA+3eh13YynvbA+hYV8FuBXJgB14pqdPpWaOaRvn1Nm6EuazJQA@mail.gmail.com>

OK, thanks.

Just pushed it to hotspot-rt and it worked!


Regards,
Volker


On Tue, Nov 18, 2014 at 5:04 AM, David Holmes <david.holmes at oracle.com> wrote:
> Gunter,
>
>
> On 17/11/2014 4:44 PM, David Holmes wrote:
>>
>> On 14/11/2014 2:39 AM, Haug, Gunter wrote:
>>>
>>>
>>> On 13.11.2014 09:50, David Holmes wrote:
>>>>
>>>> On 13/11/2014 1:19 AM, Haug, Gunter wrote:
>>>>>
>>>>>
>>>>> On 12.11.2014 09:04, David Holmes wrote:
>>>>>>
>>>>>> Hi Gunter,
>>>>>>
>>>>>> On 11/11/2014 11:23 PM, Haug, Gunter wrote:
>>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> The change '8013895:  (G1: G1SummarizeRSetStats output on Linux needs
>>>>>>> improvement)' makes use of getrusage() to retrieve accurate
>>>>>>> per-thread data on resource usage. We can use exactly the same code
>>>>>>> on AIX to achieve this.
>>>>>>>
>>>>>>> Please review the following change:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8064471-aixTime/webrev.00/
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8064471
>>>>>>
>>>>>>
>>>>>> I have a couple of comments on this code which presumably also apply
>>>>>> to the original :(
>>>>>
>>>>> Yes, they apply to the original as well, see below.
>>>>>>
>>>>>>
>>>>>> First this comment is no longer applicable (actually it was never
>>>>>> applicable to AIX!):
>>>>>>
>>>>>>   // For now, we say that linux does not support vtime. I have no idea
>>>>>>   // whether it can actually be made to (DLD, 9/13/05).
>>>>>>
>>>>> You're right. I will remove it.
>>>>>>
>>>>>> Second this calculation seems wrong:
>>>>>>
>>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>>> (double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000 *
>>>>>> 1000);
>>>>>>
>>>>>> To me this performs integer division (i.e. truncation) then converts
>>>>>> the resulting integer to a double. I would expect to see additional
>>>>>> parentheses (even if not needed, for clarity):
>>>>>>
>>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>>> ((double) (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec)) / (1000 *
>>>>>> 1000);
>>>>>>
>>>>>> or more simply divide by a floating-point value:
>>>>>>
>>>>>> return (double) (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
>>>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>>>
>>>>>> and you don't need two double casts regardless as the expression will
>>>>>> be of type double as soon as there is one operand of type double. So
>>>>>> that should reduce to:
>>>>>>
>>>>>> return usage.ru_utime.tv_sec + usage.ru_stime.tv_sec +
>>>>>> (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / (1000.0 * 1000);
>>>>>>
>>>>> OK. Do you want that we also change the Linux version like you
>>>>> proposed?
>>>>
>>>>
>>>> I'll leave it up to you. If you leave this as AIX only then it tests
>>>> the new process :) There can be a follow up cleanup bug for linux.
>>>
>>>
>>> Hi David,
>>>
>>> I think it's not worth the effort to make two separate changes on linux
>>> and aix, so I fixed linux as well. Please find the new webrev below.
>>> There will probably be more opportunities to test the new process in the
>>> future.
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/8064471.v2/
>>> <http://cr.openjdk.java.net/%7Esimonis/webrevs/8064471.v2/>
>>>
>>>
>>> Now we need a sponsor, as it is not aix only anymore.
>>
>>
>> I guess that will have to be me. :) I will try to look at this again
>> tomorrow.
>
>
> The original code was in fact correct - the double cast binds to the
> summation before the division is applied. Given that and the fact the linux
> code doesn't contain the incorrect comment, I don't see any need to modify
> the linux code. You can simply push the AIX change by itself.
>
> Sorry for messing you around on this.
>
> David
>
>
>> David
>>
>>> Thanks,
>>> Gunter
>>>
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Gunter
>>>>>
>>>>>> Cheers,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Gunter
>>>>>>>
>>>>>
>>>
>

From markus.gronlund at oracle.com  Tue Nov 18 13:02:45 2014
From: markus.gronlund at oracle.com (=?utf-8?B?TWFya3VzIEdyw7ZubHVuZA==?=)
Date: Tue, 18 Nov 2014 05:02:45 -0800 (PST)
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546AF55A.8090203@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
	<546AF55A.8090203@oracle.com>
Message-ID: <31e5d701-c75f-478b-b4a1-3585c40ba274@default>

Hi Ivan,

I don't want to block you from getting this in - I need to get the full story behind all these changes (backtracking now).

If I find something that I think we should revisit, we can always do that later.

So pls go ahead.

Thanks
Markus

PS.
I have some concerns (but will need to get back to you on that after tracing down the exact details).

Do you have a particular test case that you have been working on for these changes?


-----Original Message-----
From: Ivan Gerasimov 
Sent: den 18 november 2014 08:30
To: Markus Grönlund; David Holmes; Daniel Daugherty
Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844

Hi Markus!

The priority of the exiting thread will be raised for quite a short period of time -- right before the thread finishes exiting.

There are two places where the priority is adjusted.

Under normal conditions we should never see the first place hit. 
However, if we do, this means we have a huge number of threads.
Raising the priority of one of them is a hint about which thread we want the scheduler to focus on.

The second place is a bit different.
We have several threads running immediately before ending the process.
Some of them are at the exiting path and block exiting of the whole process.
Raising the priority of those threads is a way to say we're not interested in all the other threads, as they are going to be terminated anyway.

I just noticed that in the second scenario it may be appropriate to set the priority of the current thread to the same level as for the exiting threads.
This way it'll be given a fair chance to continue if the timeout expires.

I also think it should be enough to set the priority level to THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST.
It will give just +1 to the priority value -- should be enough for the hint.

Would you please take a look at the updated webrev:
http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/

Sincerely yours,
Ivan


On 17.11.2014 11:33, Markus Grönlund wrote:
> I agree with David.
>
> The side effects will be unknown and very hard to debug.
>
> Is there another way to accomplish the results without manipulating base services?
>
> Thanks
> Markus
>
> -----Original Message-----
> From: David Holmes
> Sent: den 17 november 2014 07:40
> To: Ivan Gerasimov; Daniel Daugherty
> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
> Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed 
> in hotspot\src\os\windows\vm\os_windows.cpp: 3844
>
> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>> Thank you Daniel!
>>
>> Please find the updated webrev with your suggestions incorporated here:
>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>
>> Concerning the thread priority: If the application is of 
>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to 
>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only
>> 10 (of maximum 31).
>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>
>>
>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the 
>> HIGHEST priority level will have priority value == 15 of 31.
>>
>> I believe, it should not be too much, and the machine will not become 
>> busy with only those closing threads.
>> However, I hope it would be enough to make them complete faster than 
>> other threads of the NORMAL priority level within the same application.
> I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is "harmless" but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code.
>
> David
>
>> Sincerely yours,
>> Ivan
>>
>>
>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>> Hello!
>>>>
>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait 
>>>> for exiting threads [win]) caused the warning message to be printed 
>>>> in some test environments:
>>>> -----------
>>>> os_windows.cpp:3844 is in the newly updated 
>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>> -----------
>>>>
>>>> This has been observed with debug builds on highly loaded systems.
>>>>
>>>>
>>>> To address the issue it is proposed to do three things:
>>>> 1) increase the timeout for debug builds,
>>>> 2) increase the maximum number of the thread handles to be stored,
>>>> 3) raise the priority of the exiting threads, if we need to wait for 
>>>> them.
>>>>
>>>> Would you please help review the fix?
>>>>
>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>> src/os/windows/vm/os_windows.cpp
>>>
>>>    line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>
>>>      That uses the smaller value for only one build config (PRODUCT).
>>>
>>>    line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000)
>>> /*1 sec in product, 4 sec in debug*/
>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>      Please add spaces between the comment delimiters and the 
>>> comment text.
>>>
>>>      That uses the smaller timeout for only one build config (PRODUCT).
>>>
>>>    line 3836           // Rise the priority...
>>>      Typo: 'Rise' -> 'Raise'
>>>
>>>      About the general idea of raising the exiting thread's priority,
>>>      if the exiting thread is looping in some Win* OS code after this
>>>      point, will raising the priority make the machine unusable?
>>>
>>> Dan
>>>
>>>
>>>> The fix was tested on all available platforms, with the hotspot 
>>>> testset. No failures.
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>
>>>
>


From daniel.daugherty at oracle.com  Tue Nov 18 15:27:36 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 18 Nov 2014 08:27:36 -0700
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546AF55A.8090203@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
	<546AF55A.8090203@oracle.com>
Message-ID: <546B6568.7040701@oracle.com>

 > http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/

src/os/windows/vm/os_windows.cpp
     No comments.

Thumbs up.

Dan


On 11/18/14 12:29 AM, Ivan Gerasimov wrote:
> Hi Markus!
>
> The priority of the exiting thread will be raised for quite a short 
> period of time -- right before the thread finishes exiting.
>
> There are two places where the priority is adjusted.
>
> Under normal conditions we should never see the first place hit. 
> However, if we do, this means we have a huge number of threads.
> Raising the priority of one of them is a hint about which thread we 
> want the scheduler to focus on.
>
> The second place is a bit different.
> We have several threads running immediately before ending the process.
> Some of them are at the exiting path and block exiting of the whole 
> process.
> Raising the priority of those threads is a way to say we're not 
> interested in all the other threads, as they are going to be 
> terminated anyway.
>
> I just noticed that in the second scenario it may be appropriate to set 
> the priority of the current thread to the same level as for the 
> exiting threads.
> This way it'll be given a fair chance to continue if the timeout expires.
>
> I also think it should be enough to set the priority level to 
> THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST.
> It will give just +1 to the priority value -- should be enough for the 
> hint.
>
> Would you please take a look at the updated webrev:
> http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/
>
> Sincerely yours,
> Ivan
>
>
> On 17.11.2014 11:33, Markus Grönlund wrote:
>> I agree with David.
>>
>> The side effects will be unknown and very hard to debug.
>>
>> Is there another way to accomplish the results without manipulating 
>> base services?
>>
>> Thanks
>> Markus
>>
>> -----Original Message-----
>> From: David Holmes
>> Sent: den 17 november 2014 07:40
>> To: Ivan Gerasimov; Daniel Daugherty
>> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
>> Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed 
>> in hotspot\src\os\windows\vm\os_windows.cpp: 3844
>>
>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>> Thank you Daniel!
>>>
>>> Please find the updated webrev with your suggestions incorporated here:
>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>
>>> Concerning the thread priority: If the application is of
>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only
>>> 10 (of maximum 31).
>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>>
>>>
>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>> HIGHEST priority level will have priority value == 15 of 31.
>>>
>>> I believe, it should not be too much, and the machine will not become
>>> busy with only those closing threads.
>>> However, I hope it would be enough to make them complete faster than
>>> other threads of the NORMAL priority level within the same 
>>> application.
>> I don't think this is necessary or desirable. Under normal usage 
>> we're giving priority to exiting threads and that may disrupt the 
>> usual scheduling patterns that applications see. You may posit that 
>> it is "harmless" but we can't say that for sure. Nor can we actually 
>> know that this will help with this particular bug. I would not add in 
>> this new code.
>>
>> David
>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>> Hello!
>>>>>
>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>> in some test environments:
>>>>> -----------
>>>>> os_windows.cpp:3844 is in the newly updated
>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>> -----------
>>>>>
>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>
>>>>>
>>>>> To address the issue it is proposed to do three things:
>>>>> 1) increase the timeout for debug builds,
>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>> them.
>>>>>
>>>>> Would you please help review the fix?
>>>>>
>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>> src/os/windows/vm/os_windows.cpp
>>>>
>>>>    line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>
>>>>      That uses the smaller value for only one build config (PRODUCT).
>>>>
>>>>    line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) 
>>>> DEBUG_ONLY(4000)
>>>> /*1 sec in product, 4 sec in debug*/
>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>      Please add spaces between the comment delimiters and the comment
>>>> text.
>>>>
>>>>      That uses the smaller timeout for only one build config 
>>>> (PRODUCT).
>>>>
>>>>    line 3836           // Rise the priority...
>>>>      Typo: 'Rise' -> 'Raise'
>>>>
>>>>      About the general idea of raising the exiting thread's priority,
>>>>      if the exiting thread is looping in some Win* OS code after this
>>>>      point, will raising the priority make the machine unusable?
>>>>
>>>> Dan
>>>>
>>>>
>>>>> The fix was tested on all available platforms, with the hotspot
>>>>> testset. No failures.
>>>>>
>>>>> Sincerely yours,
>>>>> Ivan
>>>>>
>>>>
>>>>
>>
>


From chris.plummer at oracle.com  Tue Nov 18 22:08:30 2014
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 18 Nov 2014 14:08:30 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <54641ADE.8030504@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
	<54641ADE.8030504@oracle.com>
Message-ID: <546BC35E.4070402@oracle.com>

Adding core-libs-dev at openjdk.java.net, since one of the changes is in 
java.c.

Chris

On 11/12/14 6:43 PM, David Holmes wrote:
> Hi Chris,
>
> Sorry for the delay.
>
> On 13/11/2014 5:44 AM, Chris Plummer wrote:
>> Hi,
>>
>> I'm still looking for reviewers.
>
> As the change is to the launcher it needs to be reviewed by the 
> launcher owner - which I think is serviceability (though also cc'd 
> Kumar :) ).
>
> Launcher change, and your rationale, seems okay to me. I'd probably 
> put the test in to jdk/test/tools/launcher/ though.
>
> Thanks,
> David
>
>> thanks,
>>
>> Chris
>>
>> On 11/7/14 7:53 PM, Chris Plummer wrote:
>>> This is an initial review for 6762191. I'm guessing there will be
>>> recommendations to fix in a different way, but thought this would be a
>>> good time to start the discussion.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6762191
>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>>
>>> The bug is that if the -Xss size is set to something very small (like
>>> 16k), on linux there will be a crash due to overwriting the end of the
>>> stack. This happens before hotspot can compute its stack needs and
>>> verify that the stack is big enough.
>>>
>>> It didn't seem viable to move the hotspot stack size check earlier. It
>>> depends on too much other work done before that point, and the changes
>>> would have been disruptive. The stack size check is currently done in
>>> os::init_2().
>>>
>>> What is needed is a check before the thread is created. That way we
>>> can create a thread with a big enough stack to handle all needs up to
>>> the point of the check in os::init_2(). This initial check does not
>>> need to be the final check. It just needs to confirm that we have
>>> enough stack to get us to the check in os::init_2().
>>>
>>> I decided to check in java.c if the -Xss size is too small, and set it
>>> to a larger size if it is. I hard coded this size to 32k (I'll explain
>>> why 32k later). I suspect this is the part that will result in some
>>> debate. If you have better suggestions let me know. If it does stay
>>> here, then probably the 32k needs to be a #define, and maybe even an
>>> OS porting interface, but I'm not sure where to put it.
>>>
>>> The reason I chose 32k is because this is big enough for all platforms
>>> to get to the stack size check in os::init_2(). It is also smaller
>>> than the actual minimum stack size allowed on any platform. 32-bit
>>> windows has the smallest requirement at 64k. I added some printfs to
>>> print the minimum stack requirement, and then ran a simple JTReg test
>>> with every JPRT supported platform to get the results.
>>>
>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>>> error message produced by the JVM, such as in the following:
>>>
>>> $ java -Xss32k -version
>>> The stack size specified is too small, Specify at least 100k
>>> Error: Could not create the Java Virtual Machine.
>>> Error: A fatal exception has occurred. Program will exit.
>>>
>>> I ran this test through JPRT on all platforms, and they all pass.
>>>
>>> One thing to point out is that Windows behaves a bit differently than
>>> the other platforms. It always rounds the stack size up to a multiple
>>> of 64k, so even if you specify -Xss16k, you get a 64k stack. On
>>> 32-bit Windows with C1, 64k is also the minimum requirement, so there
>>> is no error produced in this case. However, on 32-bit Windows with C2,
>>> 68k is the minimum, so an error is produced since the stack will only
>>> be 64k. There is no bug here. It's just a bit confusing.
>>>
>>> thanks,
>>>
>>> Chris
>>


From coleen.phillimore at oracle.com  Tue Nov 18 22:33:09 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 18 Nov 2014 17:33:09 -0500
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546AA8D2.1050600@oracle.com>
References: <546A28F3.1010802@oracle.com> <546AA8D2.1050600@oracle.com>
Message-ID: <546BC925.6030000@oracle.com>

Mandy,

In arguments.cpp, I think this should be snprintf in case java_home is 
MAXPATHLEN long.

+  char endorsedDir[JVM_MAXPATHLEN];
+  char extDir[JVM_MAXPATHLEN];
+  const char* fileSep = os::file_separator();
+  sprintf(endorsedDir, "%s%slib%sendorsed", Arguments::get_java_home(), fileSep, fileSep);
+  sprintf(extDir, "%s%slib%sext", Arguments::get_java_home(), fileSep, fileSep);
+
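
For illustration, a minimal sketch of the bounded formatting suggested
above, using the standard C snprintf. The helper name build_lib_path and
its parameters are hypothetical and not taken from the webrev; the actual
change may use a different wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the webrev): formats
 * "<home><sep>lib<sep><leaf>" into buf.
 * Returns the path length on success, or -1 if it would not fit.
 */
static int build_lib_path(char *buf, size_t buflen,
                          const char *home, const char *sep,
                          const char *leaf) {
    assert(buf != NULL && buflen > 0);
    int n = snprintf(buf, buflen, "%s%slib%s%s", home, sep, sep, leaf);
    /* snprintf returns the length it needed; n >= buflen means truncation. */
    if (n < 0 || (size_t)n >= buflen) {
        buf[0] = '\0';  /* do not leave a truncated path behind */
        return -1;
    }
    return n;
}
```

With a buffer that is too small, the helper reports failure instead of
writing past the end of the array, which is the case sprintf cannot detect.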

This list could be hard to maintain.  I have no alternatives to suggest 
though.

+  // List of JAR files installed in the default lib/ext directory.
+  // -XX:+CheckEndorsedAndExtDirs checks if any non-JDK file installed

The code looks correct, though searching through directories is a bit 
painful.  At least it's optional.

Coleen

On 11/17/14, 9:02 PM, Mandy Chung wrote:
> Updated webrev:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.01/
>
> This addresses Calvin's comment.  It now keeps a list of the jar files 
> shipped with jre/lib/ext and determine if jre/lib/ext has any other 
> non-JDK jar files installed.
>
> Mandy
>
> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>> This requests both code review and 8u40 approval for:
>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>
>> JEP 220 [1] proposes to remove the endorsed standards override 
>> mechanism and extension mechanism. This patch adds a VM flag in 8u40 
>> to help identify any existing uses of these mechanisms so that users 
>> can turn on the VM flag to help identify if they depend on the 
>> endorsed standards override mechanism and extension mechanism and can 
>> plan to prepare for the migration to a newer JDK release early on. 
>> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the 
>> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if 
>> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any 
>> system extension directory contains JAR files.
>>
>> Thanks
>> Mandy
>> [1] http://openjdk.java.net/jeps/220
>>
>>
>>
>


From mandy.chung at oracle.com  Tue Nov 18 22:55:41 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 18 Nov 2014 14:55:41 -0800
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546BC925.6030000@oracle.com>
References: <546A28F3.1010802@oracle.com> <546AA8D2.1050600@oracle.com>
	<546BC925.6030000@oracle.com>
Message-ID: <546BCE6D.3020104@oracle.com>

On 11/18/14 2:33 PM, Coleen Phillimore wrote:
> Mandy,
>
> In arguments.cpp, I think this should be snprintf in case java_home is 
> MAXPATHLEN long.
>

That's what I was wondering as I copied from the existing code in 
arguments.cpp.

> +  char endorsedDir[JVM_MAXPATHLEN];
> +  char extDir[JVM_MAXPATHLEN];
> +  const char* fileSep = os::file_separator();
> +  sprintf(endorsedDir, "%s%slib%sendorsed", Arguments::get_java_home(), fileSep, fileSep);
> +  sprintf(extDir, "%s%slib%sext", Arguments::get_java_home(), fileSep, fileSep);
> +

I will fix them to use snprintf.  I assume there is a bug to fix the 
existing use of sprintf; if not you may want to file one.

>
> This list could be hard to maintain.  I have no alternatives to 
> suggest though.

I expect this list will rarely change in 8 updates.
>
> +  // List of JAR files installed in the default lib/ext directory.
> +  // -XX:+CheckEndorsedAndExtDirs checks if any non-JDK file installed
>
> The code looks correct though but a bit painful searching through 
> directories.  At least it's optional.

It's off by default.  This is to help users on 8u40 prepare for 
migration, and scanning the directories should not be an issue.

thanks for the review.
Mandy

>
> Coleen
>
> On 11/17/14, 9:02 PM, Mandy Chung wrote:
>> Updated webrev:
>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.01/
>>
>> This addresses Calvin's comment.  It now keeps a list of the jar 
>> files shipped with jre/lib/ext and determine if jre/lib/ext has any 
>> other non-JDK jar files installed.
>>
>> Mandy
>>
>> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>>> This requests both code review and 8u40 approval for:
>>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>>
>>> JEP 220 [1] proposes to remove the endorsed standards override 
>>> mechanism and extension mechanism. This patch adds a VM flag in 8u40 
>>> to help identify any existing uses of these mechanisms so that users 
>>> can turn on the VM flag to help identify if they depend on the 
>>> endorsed standards override mechanism and extension mechanism and 
>>> can plan to prepare for the migration to a newer JDK release early 
>>> on. When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if 
>>> the system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, 
>>> or if ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or 
>>> any system extension directory contains JAR files.
>>>
>>> Thanks
>>> Mandy
>>> [1] http://openjdk.java.net/jeps/220
>>>
>>>
>>>
>>
>


From mandy.chung at oracle.com  Tue Nov 18 23:06:38 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 18 Nov 2014 15:06:38 -0800
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546BCE6D.3020104@oracle.com>
References: <546A28F3.1010802@oracle.com>
	<546AA8D2.1050600@oracle.com>	<546BC925.6030000@oracle.com>
	<546BCE6D.3020104@oracle.com>
Message-ID: <546BD0FE.101@oracle.com>

Coleen, Calvin,

Thanks for the review.  Here is the updated webrev and a new test:
http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02/

Mandy


From calvin.cheung at oracle.com  Tue Nov 18 23:25:16 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 18 Nov 2014 15:25:16 -0800
Subject: [8u40] Review request 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546BD0FE.101@oracle.com>
References: <546A28F3.1010802@oracle.com>
	<546AA8D2.1050600@oracle.com>	<546BC925.6030000@oracle.com>
	<546BCE6D.3020104@oracle.com> <546BD0FE.101@oracle.com>
Message-ID: <546BD55C.7060509@oracle.com>

On 11/18/2014 3:06 PM, Mandy Chung wrote:
> Coleen, Calvin,
>
> Thanks for the review.  Here is the updated webrev and a new test:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02/

Looks good and thanks for adding the testcase.

Minor nit about the testcase - the following import statements are extra:
34 import java.io.*;
36 import java.util.concurrent.TimeUnit;

I don't need to see another webrev for the testcase change.

thanks,
Calvin
>
> Mandy
>


From chris.plummer at oracle.com  Wed Nov 19 08:49:10 2014
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 19 Nov 2014 00:49:10 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <546BC35E.4070402@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
	<54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com>
Message-ID: <546C5986.6010500@oracle.com>

I've updated the webrev to add STACK_SIZE_MINIMUM in place of the 32k 
references, and also moved the test from hotspot/test/runtime to 
jdk/test/tools/launcher as David requested. That required some 
adjustments to the test script, since test_env.sh does not exist in 
jdk/test, so I had to pull in the bits I needed into the script.

http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/
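
As a rough sketch of the launcher-side check described in this thread
(not the code in the webrev): STACK_SIZE_MINIMUM is the name used above
and 32k is the value discussed earlier, while the helper names
clamp_stack_size and round_up, the KB macro, and the use of long are
assumptions made for illustration only.

```c
#include <assert.h>

#define KB 1024L
/* Name from the thread; the value follows the 32k discussed there. */
#define STACK_SIZE_MINIMUM (32 * KB)

/* Hypothetical helper: raise a too-small -Xss request to the launcher floor. */
static long clamp_stack_size(long requested) {
    assert(requested >= 0);
    return requested < STACK_SIZE_MINIMUM ? STACK_SIZE_MINIMUM : requested;
}

/* Hypothetical helper: round size up to a multiple of granule
 * (the thread mentions 64k rounding on Windows). */
static long round_up(long size, long granule) {
    assert(granule > 0);
    return ((size + granule - 1) / granule) * granule;
}
```

Under these assumptions a 16k request becomes 32k in the launcher. With
the 64k rounding reported for Windows it ends up as a 64k stack, which is
still below the 68k minimum mentioned for 32-bit Windows with C2, so the
later check in os::init_2() would still report an error there.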

I still need to rerun through JPRT. I'll do so once there are no more 
suggested changes.

thanks,

Chris

On 11/18/14 2:08 PM, Chris Plummer wrote:
> Adding core-libs-dev at openjdk.java.net, since one of the changes is in 
> java.c.
>
> Chris
>
> On 11/12/14 6:43 PM, David Holmes wrote:
>> Hi Chris,
>>
>> Sorry for the delay.
>>
>> On 13/11/2014 5:44 AM, Chris Plummer wrote:
>>> Hi,
>>>
>>> I'm still looking for reviewers.
>>
>> As the change is to the launcher it needs to be reviewed by the 
>> launcher owner - which I think is serviceability (though also cc'd 
>> Kumar :) ).
>>
>> Launcher change, and your rationale, seems okay to me. I'd probably 
>> put the test in to jdk/test/tools/launcher/ though.
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 11/7/14 7:53 PM, Chris Plummer wrote:
>>>> This is an initial review for 6762191. I'm guessing there will be
>>>> recommendations to fix in a different way, but thought this would be a
>>>> good time to start the discussion.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-6762191
>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>>>
>>>> The bug is that if the -Xss size is set to something very small (like
>>>> 16k), on linux there will be a crash due to overwriting the end of the
>>>> stack. This happens before hotspot can compute its stack needs and
>>>> verify that the stack is big enough.
>>>>
>>>> It didn't seem viable to move the hotspot stack size check earlier. It
>>>> depends on too much other work done before that point, and the changes
>>>> would have been disruptive. The stack size check is currently done in
>>>> os::init_2().
>>>>
>>>> What is needed is a check before the thread is created. That way we
>>>> can create a thread with a big enough stack to handle all needs up to
>>>> the point of the check in os::init_2(). This initial check does not
>>>> need to be the final check. It just needs to confirm that we have
>>>> enough stack to get us to the check in os::init_2().
>>>>
>>>> I decided to check in java.c if the -Xss size is too small, and set it
>>>> to a larger size if it is. I hard coded this size to 32k (I'll explain
>>>> why 32k later). I suspect this is the part that will result in some
>>>> debate. If you have better suggestions let me know. If it does stay
>>>> here, then probably the 32k needs to be a #define, and maybe even an
>>>> OS porting interface, but I'm not sure where to put it.
>>>>
>>>> The reason I chose 32k is because this is big enough for all platforms
>>>> to get to the stack size check in os::init_2(). It is also smaller
>>>> than the actual minimum stack size allowed on any platform. 32-bit
>>>> windows has the smallest requirement at 64k. I add some printfs to
>>>> print the minimum stack requirement, and then ran a simple JTReg test
>>>> with every JPRT supported platform to get the results.
>>>>
>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>>>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>>>> error message produced by the JVM, such as in the following:
>>>>
>>>> $ java -Xss32k -version
>>>> The stack size specified is too small, Specify at least 100k
>>>> Error: Could not create the Java Virtual Machine.
>>>> Error: A fatal exception has occurred. Program will exit.
>>>>
>>>> I ran this test through JPRT on all platforms, and they all pass.
>>>>
>>>> One thing to point out is that Windows behaves a bit different than
>>>> the other platforms. It always rounds the stack size up to a multiple
>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On
>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so there
>>>> is no error produced in this case. However, on 32-bit Windows with C2,
>>>> 68k is the minimum, so an error is produced since the stack will only
>>>> be 64k. There is no bug here. It's just a bit confusing.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>
>


From serguei.spitsyn at oracle.com  Wed Nov 19 08:54:48 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 19 Nov 2014 00:54:48 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <546C5986.6010500@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
	<54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com>
	<546C5986.6010500@oracle.com>
Message-ID: <546C5AD8.2090701@oracle.com>

Reviewed

Thanks,
Serguei

On 11/19/14 12:49 AM, Chris Plummer wrote:
> I've updated the webrev to add STACK_SIZE_MINIMUM in place of the 32k 
> references, and also moved the test from hotspot/test/runtime to 
> jdk/test/tools/launcher as David requested. That required some 
> adjustments to the test script, since test_env.sh does not exist in 
> jdk/test, so I had to pull in the bits I needed into the script.
>
> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/
>
> I still need to rerun through JPRT. I'll do so once there are no more 
> suggested changes.
>
> thanks,
>
> Chris
>
> On 11/18/14 2:08 PM, Chris Plummer wrote:
>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in 
>> java.c.
>>
>> Chris
>>
>> On 11/12/14 6:43 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> Sorry for the delay.
>>>
>>> On 13/11/2014 5:44 AM, Chris Plummer wrote:
>>>> Hi,
>>>>
>>>> I'm still looking for reviewers.
>>>
>>> As the change is to the launcher it needs to be reviewed by the 
>>> launcher owner - which I think is serviceability (though also cc'd 
>>> Kumar :) ).
>>>
>>> Launcher change, and your rationale, seems okay to me. I'd probably 
>>> put the test in to jdk/test/tools/launcher/ though.
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 11/7/14 7:53 PM, Chris Plummer wrote:
>>>>> This is an initial review for 6762191. I'm guessing there will be
>>>>> recommendations to fix in a different way, but thought this would 
>>>>> be a
>>>>> good time to start the discussion.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191
>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>>>>
>>>>> The bug is that if the -Xss size is set to something very small (like
>>>>> 16k), on linux there will be a crash due to overwriting the end of 
>>>>> the
>>>>> stack. This happens before hotspot can compute its stack needs and
>>>>> verify that the stack is big enough.
>>>>>
>>>>> It didn't seem viable to move the hotspot stack size check 
>>>>> earlier. It
>>>>> depends on too much other work done before that point, and the 
>>>>> changes
>>>>> would have been disruptive. The stack size check is currently done in
>>>>> os::init_2().
>>>>>
>>>>> What is needed is a check before the thread is created. That way we
>>>>> can create a thread with a big enough stack to handle all needs up to
>>>>> the point of the check in os::init_2(). This initial check does not
>>>>> need to be the final check. It just needs to confirm that we have
>>>>> enough stack to get us to the check in os::init_2().
>>>>>
>>>>> I decided to check in java.c if the -Xss size is too small, and 
>>>>> set it
>>>>> to a larger size if it is. I hard coded this size to 32k (I'll 
>>>>> explain
>>>>> why 32k later). I suspect this is the part that will result in some
>>>>> debate. If you have better suggestions let me know. If it does stay
>>>>> here, then probably the 32k needs to be a #define, and maybe even an
>>>>> OS porting interface, but I'm not sure where to put it.
>>>>>
>>>>> The reason I chose 32k is because this is big enough for all 
>>>>> platforms
>>>>> to get to the stack size check in os::init_2(). It is also smaller
>>>>> than the actual minimum stack size allowed on any platform. 32-bit
>>>>> windows has the smallest requirement at 64k. I add some printfs to
>>>>> print the minimum stack requirement, and then ran a simple JTReg test
>>>>> with every JPRT supported platform to get the results.
>>>>>
>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>>>>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>>>>> error message produced by the JVM, such as in the following:
>>>>>
>>>>> $ java -Xss32k -version
>>>>> The stack size specified is too small, Specify at least 100k
>>>>> Error: Could not create the Java Virtual Machine.
>>>>> Error: A fatal exception has occurred. Program will exit.
>>>>>
>>>>> I ran this test through JPRT on all platforms, and they all pass.
>>>>>
>>>>> One thing to point out is that Windows behaves a bit different than
>>>>> the other platforms. It always rounds the stack size up to a multiple
>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On
>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so there
>>>>> is no error produced in this case. However, on 32-bit Windows with 
>>>>> C2,
>>>>> 68k is the minimum, so an error is produced since the stack will only
>>>>> be 64k. There is no bug here. It's just a bit confusing.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>
>>
>


From david.holmes at oracle.com  Wed Nov 19 10:12:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 19 Nov 2014 20:12:42 +1000
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <546C5986.6010500@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
	<54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com>
	<546C5986.6010500@oracle.com>
Message-ID: <546C6D1A.8050903@oracle.com>

On 19/11/2014 6:49 PM, Chris Plummer wrote:
> I've updated the webrev to add STACK_SIZE_MINIMUM in place of the 32k
> references, and also moved the test from hotspot/test/runtime to
> jdk/test/tools/launcher as David requested. That required some
> adjustments to the test script, since test_env.sh does not exist in
> jdk/test, so I had to pull in the bits I needed into the script.

Is there a reason this needs a shell script instead of using the 
testlibrary tools to launch the VM and check the output?

Sorry that should have been mentioned much earlier.

David


> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/
>
> I still need to rerun through JPRT. I'll do so once there are no more
> suggested changes.
>
> thanks,
>
> Chris
>
> On 11/18/14 2:08 PM, Chris Plummer wrote:
>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in
>> java.c.
>>
>> Chris
>>
>> On 11/12/14 6:43 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> Sorry for the delay.
>>>
>>> On 13/11/2014 5:44 AM, Chris Plummer wrote:
>>>> Hi,
>>>>
>>>> I'm still looking for reviewers.
>>>
>>> As the change is to the launcher it needs to be reviewed by the
>>> launcher owner - which I think is serviceability (though also cc'd
>>> Kumar :) ).
>>>
>>> Launcher change, and your rationale, seems okay to me. I'd probably
>>> put the test in to jdk/test/tools/launcher/ though.
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 11/7/14 7:53 PM, Chris Plummer wrote:
>>>>> This is an initial review for 6762191. I'm guessing there will be
>>>>> recommendations to fix in a different way, but thought this would be a
>>>>> good time to start the discussion.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191
>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>>>>
>>>>> The bug is that if the -Xss size is set to something very small (like
>>>>> 16k), on linux there will be a crash due to overwriting the end of the
>>>>> stack. This happens before hotspot can compute its stack needs and
>>>>> verify that the stack is big enough.
>>>>>
>>>>> It didn't seem viable to move the hotspot stack size check earlier. It
>>>>> depends on too much other work done before that point, and the changes
>>>>> would have been disruptive. The stack size check is currently done in
>>>>> os::init_2().
>>>>>
>>>>> What is needed is a check before the thread is created. That way we
>>>>> can create a thread with a big enough stack to handle all needs up to
>>>>> the point of the check in os::init_2(). This initial check does not
>>>>> need to be the final check. It just needs to confirm that we have
>>>>> enough stack to get us to the check in os::init_2().
>>>>>
>>>>> I decided to check in java.c if the -Xss size is too small, and set it
>>>>> to a larger size if it is. I hard coded this size to 32k (I'll explain
>>>>> why 32k later). I suspect this is the part that will result in some
>>>>> debate. If you have better suggestions let me know. If it does stay
>>>>> here, then probably the 32k needs to be a #define, and maybe even an
>>>>> OS porting interface, but I'm not sure where to put it.
>>>>>
>>>>> The reason I chose 32k is because this is big enough for all platforms
>>>>> to get to the stack size check in os::init_2(). It is also smaller
>>>>> than the actual minimum stack size allowed on any platform. 32-bit
>>>>> windows has the smallest requirement at 64k. I add some printfs to
>>>>> print the minimum stack requirement, and then ran a simple JTReg test
>>>>> with every JPRT supported platform to get the results.
>>>>>
>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>>>>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>>>>> error message produced by the JVM, such as in the following:
>>>>>
>>>>> $ java -Xss32k -version
>>>>> The stack size specified is too small, Specify at least 100k
>>>>> Error: Could not create the Java Virtual Machine.
>>>>> Error: A fatal exception has occurred. Program will exit.
>>>>>
>>>>> I ran this test through JPRT on all platforms, and they all pass.
>>>>>
>>>>> One thing to point out is that Windows behaves a bit different than
>>>>> the other platforms. It always rounds the stack size up to a multiple
>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On
>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so there
>>>>> is no error produced in this case. However, on 32-bit Windows with C2,
>>>>> 68k is the minimum, so an error is produced since the stack will only
>>>>> be 64k. There is no bug here. It's just a bit confusing.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>
>>
>

From ioi.lam at oracle.com  Wed Nov 19 14:08:32 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 19 Nov 2014 22:08:32 +0800
Subject: RFR(XS) 8065346 - WB_AddToBootstrapClassLoaderSearch calls
	JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
Message-ID: <546CA460.6080501@oracle.com>

Hi,

Please review a simple fix for whitebox test API:

http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash/
     https://bugs.openjdk.java.net/browse/JDK-8065346

Summary of fix:

     The JVMTI calls expect the current thread to be in VM state, but
     JNI GetStringUTFChars expects the thread to be in Native state.

     So I moved the ThreadToNativeFromVM constructors accordingly to
     make everyone happy.

Tests:

     I ran the tests with a debug hotspot build and the tests passed
     after the fix.

Thanks
- Ioi

From chris.plummer at oracle.com  Wed Nov 19 15:52:11 2014
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 19 Nov 2014 07:52:11 -0800
Subject: [9] RFR (S) 6762191: Setting stack size to 16K causes segmentation
	fault
In-Reply-To: <546C6D1A.8050903@oracle.com>
References: <545D939D.2030308@oracle.com> <5463B896.10801@oracle.com>
	<54641ADE.8030504@oracle.com> <546BC35E.4070402@oracle.com>
	<546C5986.6010500@oracle.com> <546C6D1A.8050903@oracle.com>
Message-ID: <546CBCAB.7040101@oracle.com>

On 11/19/14 2:12 AM, David Holmes wrote:
> On 19/11/2014 6:49 PM, Chris Plummer wrote:
>> I've updated the webrev to add STACK_SIZE_MINIMUM in place of the 32k
>> references, and also moved the test from hotspot/test/runtime to
>> jdk/test/tools/launcher as David requested. That required some
>> adjustments to the test script, since test_env.sh does not exist in
>> jdk/test, so I had to pull in the bits I needed into the script.
>
> Is there a reason this needs a shell script instead of using the 
> testlibrary tools to launch the VM and check the output?
Not that I'm aware of. I guess I just didn't really look at what it 
would take to do it all in Java. I'll have a look at Java examples and 
convert it.

Chris
>
> Sorry that should have been mentioned much earlier.
>
> David
>
>
>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.01/
>>
>> I still need to rerun through JPRT. I'll do so once there are no more
>> suggested changes.
>>
>> thanks,
>>
>> Chris
>>
>> On 11/18/14 2:08 PM, Chris Plummer wrote:
>>> Adding core-libs-dev at openjdk.java.net, since one of the changes is in
>>> java.c.
>>>
>>> Chris
>>>
>>> On 11/12/14 6:43 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> Sorry for the delay.
>>>>
>>>> On 13/11/2014 5:44 AM, Chris Plummer wrote:
>>>>> Hi,
>>>>>
>>>>> I'm still looking for reviewers.
>>>>
>>>> As the change is to the launcher it needs to be reviewed by the
>>>> launcher owner - which I think is serviceability (though also cc'd
>>>> Kumar :) ).
>>>>
>>>> Launcher change, and your rationale, seems okay to me. I'd probably
>>>> put the test in to jdk/test/tools/launcher/ though.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 11/7/14 7:53 PM, Chris Plummer wrote:
>>>>>> This is an initial review for 6762191. I'm guessing there will be
>>>>>> recommendations to fix in a different way, but thought this would 
>>>>>> be a
>>>>>> good time to start the discussion.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-6762191
>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.jdk/
>>>>>> http://cr.openjdk.java.net/~cjplummer/6762191/webrev.00.hotspot/
>>>>>>
>>>>>> The bug is that if the -Xss size is set to something very small 
>>>>>> (like
>>>>>> 16k), on linux there will be a crash due to overwriting the end 
>>>>>> of the
>>>>>> stack. This happens before hotspot can compute its stack needs and
>>>>>> verify that the stack is big enough.
>>>>>>
>>>>>> It didn't seem viable to move the hotspot stack size check 
>>>>>> earlier. It
>>>>>> depends on too much other work done before that point, and the 
>>>>>> changes
>>>>>> would have been disruptive. The stack size check is currently 
>>>>>> done in
>>>>>> os::init_2().
>>>>>>
>>>>>> What is needed is a check before the thread is created. That way we
>>>>>> can create a thread with a big enough stack to handle all needs 
>>>>>> up to
>>>>>> the point of the check in os::init_2(). This initial check does not
>>>>>> need to be the final check. It just needs to confirm that we have
>>>>>> enough stack to get us to the check in os::init_2().
>>>>>>
>>>>>> I decided to check in java.c if the -Xss size is too small, and 
>>>>>> set it
>>>>>> to a larger size if it is. I hard coded this size to 32k (I'll 
>>>>>> explain
>>>>>> why 32k later). I suspect this is the part that will result in some
>>>>>> debate. If you have better suggestions let me know. If it does stay
>>>>>> here, then probably the 32k needs to be a #define, and maybe even an
>>>>>> OS porting interface, but I'm not sure where to put it.
>>>>>>
>>>>>> The reason I chose 32k is because this is big enough for all 
>>>>>> platforms
>>>>>> to get to the stack size check in os::init_2(). It is also smaller
>>>>>> than the actual minimum stack size allowed on any platform. 32-bit
>>>>>> windows has the smallest requirement at 64k. I add some printfs to
>>>>>> print the minimum stack requirement, and then ran a simple JTReg 
>>>>>> test
>>>>>> with every JPRT supported platform to get the results.
>>>>>>
>>>>>> The TooSmallStackSize.sh will run "java -version" with -Xss16k,
>>>>>> -Xss32k, and -Xss<minsize>, where <minsize> is the size from the
>>>>>> error message produced by the JVM, such as in the following:
>>>>>>
>>>>>> $ java -Xss32k -version
>>>>>> The stack size specified is too small, Specify at least 100k
>>>>>> Error: Could not create the Java Virtual Machine.
>>>>>> Error: A fatal exception has occurred. Program will exit.
>>>>>>
>>>>>> I ran this test through JPRT on all platforms, and they all pass.
>>>>>>
>>>>>> One thing to point out is that Windows behaves a bit different than
>>>>>> the other platforms. It always rounds the stack size up to a 
>>>>>> multiple
>>>>>> of 64k , so even if you specify -Xss16k, you get a 64k stack. On
>>>>>> 32-bit Windows with C1, 64k is also the minimum requirement, so 
>>>>>> there
>>>>>> is no error produced in this case. However, on 32-bit Windows 
>>>>>> with C2,
>>>>>> 68k is the minimum, so an error is produced since the stack will 
>>>>>> only
>>>>>> be 64k. There is no bug here. It's just a bit confusing.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>
>>>
>>


From yumin.qi at oracle.com  Wed Nov 19 17:28:29 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 19 Nov 2014 09:28:29 -0800
Subject: RFR(XS) 8065346 - WB_AddToBootstrapClassLoaderSearch calls
	JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
In-Reply-To: <546CA460.6080501@oracle.com>
References: <546CA460.6080501@oracle.com>
Message-ID: <546CD33D.5030903@oracle.com>

Ioi,

   In fact you can use

  const char* seg = java_lang_String::as_utf8_string(JNIHandles::resolve_non_null(segment));

   which does not need a state transition since it is in native (Coleen 
pointed this out in a code review for my change to whitebox).
   But you need a ResourceMark first.

Thanks
Yumin

On 11/19/2014 6:08 AM, Ioi Lam wrote:
> Hi,
>
> Please review a simple fix for whitebox test API:
>
> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash/
>     https://bugs.openjdk.java.net/browse/JDK-8065346
>
> Summary of fix:
>
>     The JVMTI calls expect the current thread to be in VM state, but 
> JNI GetStringUTFChars
>     expects the thread to be in Native state.
>
>     So I moved the ThreadToNativeFromVM constructors accordingly to 
> make everyone happy.
>
> Tests:
>
>     I ran the tests with a debug hotspot build and the tests passed 
> after the fix.
>
> Thanks
> - Ioi


From mandy.chung at oracle.com  Wed Nov 19 21:10:54 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 19 Nov 2014 13:10:54 -0800
Subject: [8u40] Putback request for 8064667: Provide support to help identify
	use of endorsed standards and extension mechanism
In-Reply-To: <546A28F3.1010802@oracle.com>
References: <546A28F3.1010802@oracle.com>
Message-ID: <546D075D.8050601@oracle.com>

Coleen and Calvin from runtime team have reviewed and approved this 
fix.  I notice that jdk8u-dev got dropped in their review [1].

May I get the 8u40 approval to putback this change?

Updated webrev:
   http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02

Thanks
Mandy
[1] 
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-November/013312.html

On 11/17/2014 8:57 AM, Mandy Chung wrote:
> This requests both code review and 8u40 approval for:
>    https://bugs.openjdk.java.net/browse/JDK-8064667
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>
> JEP 220 [1] proposes to remove the endorsed standards override 
> mechanism and extension mechanism. This patch adds a VM flag in 8u40 
> to help identify any existing uses of these mechanisms so that users 
> can turn on the VM flag to help identify if they depend on the 
> endorsed standards override mechanism and extension mechanism and can 
> plan to prepare for the migration to a newer JDK release early on. 
> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the 
> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if 
> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any 
> system extension directory contains JAR files.
>
> Thanks
> Mandy
> [1] http://openjdk.java.net/jeps/220
>
>
>


From naoto.sato at oracle.com  Wed Nov 19 21:37:30 2014
From: naoto.sato at oracle.com (Naoto Sato)
Date: Wed, 19 Nov 2014 13:37:30 -0800
Subject: [8u40] Putback request for 8064667: Provide support to help
	identify use of endorsed standards and extension mechanism
In-Reply-To: <546D075D.8050601@oracle.com>
References: <546A28F3.1010802@oracle.com> <546D075D.8050601@oracle.com>
Message-ID: <546D0D9A.70406@oracle.com>

Since you've already got the code review done specifically for 8u, you 
are good to go.

Naoto

On 11/19/14, 1:10 PM, Mandy Chung wrote:
> Coleen and Calvin from runtime team have reviewed and approved this
> fix.  I notice that jdk8u-dev got dropped in their review [1].
>
> May I get the 8u40 approval to putback this change?
>
> Updated webrev:
>    http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02
>
> Thanks
> Mandy
> [1]
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-November/013312.html
>
>
> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>> This requests both code review and 8u40 approval for:
>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>
>> JEP 220 [1] proposes to remove the endorsed standards override
>> mechanism and extension mechanism. This patch adds a VM flag in 8u40
>> to help identify any existing uses of these mechanisms so that users
>> can turn on the VM flag to help identify if they depend on the
>> endorsed standards override mechanism and extension mechanism and can
>> plan to prepare for the migration to a newer JDK release early on.
>> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the
>> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if
>> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any
>> system extension directory contains JAR files.
>>
>> Thanks
>> Mandy
>> [1] http://openjdk.java.net/jeps/220
>>
>>
>>
>

From naoto.sato at oracle.com  Wed Nov 19 22:33:31 2014
From: naoto.sato at oracle.com (Naoto Sato)
Date: Wed, 19 Nov 2014 14:33:31 -0800
Subject: [8u40] Putback request for 8064667: Provide support to help
	identify use of endorsed standards and extension mechanism
In-Reply-To: <546D0D9A.70406@oracle.com>
References: <546A28F3.1010802@oracle.com> <546D075D.8050601@oracle.com>
	<546D0D9A.70406@oracle.com>
Message-ID: <546D1ABB.6090709@oracle.com>

My previous reply was somewhat misleading, but please take it as an "approval."

Naoto

On 11/19/14, 1:37 PM, Naoto Sato wrote:
> Since you've already got the code review done specifically for 8u, you
> are good to go.
>
> Naoto
>
> On 11/19/14, 1:10 PM, Mandy Chung wrote:
>> Coleen and Calvin from runtime team have reviewed and approved this
>> fix.  I notice that jdk8u-dev got dropped in their review [1].
>>
>> May I get the 8u40 approval to putback this change?
>>
>> Updated webrev:
>>    http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02
>>
>> Thanks
>> Mandy
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-November/013312.html
>>
>>
>>
>> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>>> This requests both code review and 8u40 approval for:
>>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>>
>>> JEP 220 [1] proposes to remove the endorsed standards override
>>> mechanism and extension mechanism. This patch adds a VM flag in 8u40
>>> to help identify any existing uses of these mechanisms so that users
>>> can turn on the VM flag to help identify if they depend on the
>>> endorsed standards override mechanism and extension mechanism and can
>>> plan to prepare for the migration to a newer JDK release early on.
>>> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the
>>> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if
>>> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any
>>> system extension directory contains JAR files.
>>>
>>> Thanks
>>> Mandy
>>> [1] http://openjdk.java.net/jeps/220
>>>
>>>
>>>
>>

From ioi.lam at oracle.com  Thu Nov 20 01:54:13 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 20 Nov 2014 09:54:13 +0800
Subject: RFR(XS) 8065346 - WB_AddToBootstrapClassLoaderSearch calls
	JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
In-Reply-To: <546CD33D.5030903@oracle.com>
References: <546CA460.6080501@oracle.com> <546CD33D.5030903@oracle.com>
Message-ID: <546D49C5.4060400@oracle.com>

Hi Yumin,

Thanks for the review. I have updated the webrev at

http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash-v2/

Thanks
- Ioi

On 11/20/14, 1:28 AM, Yumin Qi wrote:
> Ioi,
>
>   In fact you can use
> *  const char* seg = java_lang_String::as_utf8_string(JNIHandles::resolve_non_null(segment));*
>   which does not need state transition since it is in native. (Coleen 
> pointed out in a codereview for my change to whitebox)
>   But need ResourceMark first.
>
> Thanks
> Yumin
>
> On 11/19/2014 6:08 AM, Ioi Lam wrote:
>> Hi,
>>
>> Please review a simple fix for whitebox test API:
>>
>> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash/
>> https://bugs.openjdk.java.net/browse/JDK-8065346
>>
>> Summary of fix:
>>
>>     The JVMTI calls expect the current thread to be in VM state, but 
>> JNI GetStringUTFChars
>>     expects the thread to be in Native state.
>>
>>     So I moved the ThreadToNativeFromVM constructors accordingly to 
>> make everyone happy.
>>
>> Tests:
>>
>>     I ran the tests with a debug hotspot build and the tests passed 
>> after the fix.
>>
>> Thanks
>> - Ioi
>
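
The first version of the fix, as summarized in the quoted message, scopes the transition to native state so that it covers only the string read and has ended before the JVMTI call. A toy C++ model of that scoping idea follows. Every name here (ThreadState, ScopedToNative, get_string_chars_like, jvmti_like_call, add_to_search_path) is hypothetical and only mimics the shape of the problem; none of it is HotSpot code.

```cpp
#include <stdexcept>
#include <string>

enum class ThreadState { InVM, InNative };

// Toy stand-in for the current thread's state.
thread_local ThreadState current_state = ThreadState::InVM;

// RAII helper: switches to native state for the lifetime of the object and
// restores VM state on scope exit (similar in spirit to ThreadToNativeFromVM).
class ScopedToNative {
 public:
  ScopedToNative() {
    if (current_state != ThreadState::InVM) throw std::logic_error("expected VM state");
    current_state = ThreadState::InNative;
  }
  ~ScopedToNative() { current_state = ThreadState::InVM; }
  ScopedToNative(const ScopedToNative&) = delete;
  ScopedToNative& operator=(const ScopedToNative&) = delete;
};

// Stand-in for a JNI-style call that must run in native state.
std::string get_string_chars_like(const std::string& s) {
  if (current_state != ThreadState::InNative) throw std::logic_error("must be in native state");
  return s;
}

// Stand-in for a JVMTI-style call that must run in VM state.
void jvmti_like_call(const std::string& segment) {
  if (current_state != ThreadState::InVM) throw std::logic_error("must be in VM state");
  (void)segment;
}

void add_to_search_path(const std::string& segment) {
  std::string copy;
  {
    ScopedToNative to_native;               // native only while reading the string
    copy = get_string_chars_like(segment);
  }                                         // back in VM state here
  jvmti_like_call(copy);
}
```

The alternative suggested in the review avoids the transition altogether by reading the string without a JNI call; the model above covers only the scoped-transition variant.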


From coleen.phillimore at oracle.com  Thu Nov 20 02:10:32 2014
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 19 Nov 2014 21:10:32 -0500
Subject: RFR(XS) 8065346 - WB_AddToBootstrapClassLoaderSearch calls
	JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
In-Reply-To: <546D49C5.4060400@oracle.com>
References: <546CA460.6080501@oracle.com> <546CD33D.5030903@oracle.com>
	<546D49C5.4060400@oracle.com>
Message-ID: <546D4D98.1080903@oracle.com>


I agree, this second version looks better.  There are a bunch of bizarre 
transitions to native in whitebox.cpp that seem unnecessary.

Coleen

On 11/19/14, 8:54 PM, Ioi Lam wrote:
> Hi Yumin,
>
> Thanks for the review. I have updated the webrev at
>
> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash-v2/
>
> Thanks
> - Ioi
>
> On 11/20/14, 1:28 AM, Yumin Qi wrote:
>> Ioi,
>>
>>   In fact you can use
>> *  const char* seg = 
>> java_lang_String::as_utf8_string(JNIHandles::resolve_non_null(segment));*
>>   which does not need state transition since it is in native. (Coleen 
>> pointed out in a codereview for my change to whitebox)
>>   But need ResourceMark first.
>>
>> Thanks
>> Yumin
>>
>> On 11/19/2014 6:08 AM, Ioi Lam wrote:
>>> Hi,
>>>
>>> Please review a simple fix for whitebox test API:
>>>
>>> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash/
>>> https://bugs.openjdk.java.net/browse/JDK-8065346
>>>
>>> Summary of fix:
>>>
>>>     The JVMTI calls expect the current thread to be in VM state, but 
>>> JNI GetStringUTFChars
>>>     expects the thread to be in Native state.
>>>
>>>     So I moved the ThreadToNativeFromVM constructors accordingly to 
>>> make everyone happy.
>>>
>>> Tests:
>>>
>>>     I ran the tests with a debug hotspot build and the tests passed 
>>> after the fix.
>>>
>>> Thanks
>>> - Ioi
>>
>


From david.holmes at oracle.com  Thu Nov 20 02:20:58 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 20 Nov 2014 12:20:58 +1000
Subject: RFR(XS) 8065346 - WB_AddToBootstrapClassLoaderSearch calls
	JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
In-Reply-To: <546D49C5.4060400@oracle.com>
References: <546CA460.6080501@oracle.com> <546CD33D.5030903@oracle.com>
	<546D49C5.4060400@oracle.com>
Message-ID: <546D500A.4000601@oracle.com>

On 20/11/2014 11:54 AM, Ioi Lam wrote:
> Hi Yumin,
>
> Thanks for the review. I have updated the webrev at
>
> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash-v2/

Yep this looks good.

Thanks,
David

> Thanks
> - Ioi
>
> On 11/20/14, 1:28 AM, Yumin Qi wrote:
>> Ioi,
>>
>>   In fact you can use
>> *  const char* seg =
>> java_lang_String::as_utf8_string(JNIHandles::resolve_non_null(segment));*
>>   which does not need state transition since it is in native. (Coleen
>> pointed out in a codereview for my change to whitebox)
>>   But need ResourceMark first.
>>
>> Thanks
>> Yumin
>>
>> On 11/19/2014 6:08 AM, Ioi Lam wrote:
>>> Hi,
>>>
>>> Please review a simple fix for whitebox test API:
>>>
>>> http://cr.openjdk.java.net/~iklam/8065346-jvmti-test-crash/
>>> https://bugs.openjdk.java.net/browse/JDK-8065346
>>>
>>> Summary of fix:
>>>
>>>     The JVMTI calls expect the current thread to be in VM state, but
>>> JNI GetStringUTFChars
>>>     expects the thread to be in Native state.
>>>
>>>     So I moved the ThreadToNativeFromVM constructors accordingly to
>>> make everyone happy.
>>>
>>> Tests:
>>>
>>>     I ran the tests with a debug hotspot build and the tests passed
>>> after the fix.
>>>
>>> Thanks
>>> - Ioi
>>
>

From sean.coffey at oracle.com  Thu Nov 20 09:21:41 2014
From: sean.coffey at oracle.com (Seán Coffey)
Date: Thu, 20 Nov 2014 09:21:41 +0000
Subject: [8u40] Putback request for 8064667: Provide support to help
	identify use of endorsed standards and extension mechanism
In-Reply-To: <546D1ABB.6090709@oracle.com>
References: <546A28F3.1010802@oracle.com>
	<546D075D.8050601@oracle.com>	<546D0D9A.70406@oracle.com>
	<546D1ABB.6090709@oracle.com>
Message-ID: <546DB2A5.7060209@oracle.com>

Is a CCC required for this change Mandy ?

regards,
Sean.

On 19/11/2014 22:33, Naoto Sato wrote:
> This was somewhat misleading, but take it as an "approval."
>
> Naoto
>
> On 11/19/14, 1:37 PM, Naoto Sato wrote:
>> Since you've already got the code review done specifically for 8u, you
>> are good to go.
>>
>> Naoto
>>
>> On 11/19/14, 1:10 PM, Mandy Chung wrote:
>>> Coleen and Calvin from runtime team have reviewed and approved this
>>> fix.  I notice that jdk8u-dev got dropped in their review [1].
>>>
>>> May I get the 8u40 approval to putback this change?
>>>
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02
>>>
>>> Thanks
>>> Mandy
>>> [1]
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-November/013312.html 
>>>
>>>
>>>
>>>
>>> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>>>> This requests both code review and 8u40 approval for:
>>>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>>>
>>>> JEP 220 [1] proposes to remove the endorsed standards override
>>>> mechanism and extension mechanism. This patch adds a VM flag in 8u40
>>>> to help identify any existing uses of these mechanisms so that users
>>>> can turn on the VM flag to help identify if they depend on the
>>>> endorsed standards override mechanism and extension mechanism and can
>>>> plan to prepare for the migration to a newer JDK release early on.
>>>> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the
>>>> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if
>>>> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any
>>>> system extension directory contains JAR files.
>>>>
>>>> Thanks
>>>> Mandy
>>>> [1] http://openjdk.java.net/jeps/220
>>>>
>>>>
>>>>
>>>


From ivan.gerasimov at oracle.com  Thu Nov 20 12:51:18 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Thu, 20 Nov 2014 15:51:18 +0300
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546B6568.7040701@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
	<546AF55A.8090203@oracle.com> <546B6568.7040701@oracle.com>
Message-ID: <546DE3C6.5050200@oracle.com>

Thank you Daniel!

David, are you still Okay with the updated webrev?

Compared to the previous one, I've added setting the priority of the 
current thread at line 3880 and changed the priority level
from HIGHEST to ABOVE_NORMAL.

Sincerely yours,
Ivan
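
The priority values cited in this thread (10 and 15 for HIGHEST, and "+1" for ABOVE_NORMAL) follow from adding the thread's level offset to the base value of the process priority class. A small worked sketch of that arithmetic, covering only the NORMAL and HIGH classes and the five relative levels; the enum and function names are hypothetical and the Windows API itself is not called:

```cpp
// Hypothetical enums mirroring the Windows names used in the thread.
enum class PriorityClass { Normal, High };  // NORMAL_PRIORITY_CLASS, HIGH_PRIORITY_CLASS

enum class PriorityLevel {
  Lowest = -2,
  BelowNormal = -1,
  Normal = 0,
  AboveNormal = 1,
  Highest = 2
};

// Base priority (on the 0..31 scale) for the two classes discussed here:
// the class base value plus the level offset. The IDLE and TIME_CRITICAL
// levels and the other priority classes are deliberately left out.
constexpr int base_priority(PriorityClass cls, PriorityLevel level) {
  const int class_base = (cls == PriorityClass::Normal) ? 8 : 13;
  return class_base + static_cast<int>(level);
}
```

With these values, a NORMAL-class thread moves from 8 to 9 at ABOVE_NORMAL and to 10 at HIGHEST, while a HIGH-class thread reaches 15 at HIGHEST.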

On 18.11.2014 18:27, Daniel D. Daugherty wrote:
> > http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/
>
> src/os/windows/vm/os_windows.cpp
>     No comments.
>
> Thumbs up.
>
> Dan
>
>
> On 11/18/14 12:29 AM, Ivan Gerasimov wrote:
>> Hi Markus!
>>
>> The priority of the exiting thread will be raised for quite a short 
>> period of time -- right before the thread finishes exiting.
>>
>> There are two places where the priority is adjusted.
>>
>> Under normal conditions we should never see the first place hit. 
>> However, if we do, this means we have a huge number of threads.
>> Raising the priority of one of them is a hint about which thread we 
>> want the scheduler to focus on.
>>
>> The second place is a bit different.
>> We have several threads running immediately before ending the process.
>> Some of them are at the exiting path and block exiting of the whole 
>> process.
>> Raising the priority of those threads is a way to say we're not 
>> interested in all the other threads, as they are going to be 
>> terminated anyway.
>>
>> I just noticed that in second scenario it may be appropriate to set 
>> the priority of the current thread to the same level as for the 
>> exiting threads.
>> This way it'll be given a fair chance to continue if the timeout 
>> expires.
>>
>> I also think it should be enough to set the priority level to 
>> THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST.
>> It will give just +1 to the priority value -- should be enough for 
>> the hint.
>>
>> Would you please take a look at the updated webrev:
>> http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/
>>
>> Sincerely yours,
>> Ivan
>>
>>
>> On 17.11.2014 11:33, Markus Grönlund wrote:
>>> I agree with David.
>>>
>>> The side effects will be unknown and very hard to debug.
>>>
>>> Is there another way to accomplish the results without manipulating 
>>> base services?
>>>
>>> Thanks
>>> Markus
>>>
>>> -----Original Message-----
>>> From: David Holmes
>>> Sent: den 17 november 2014 07:40
>>> To: Ivan Gerasimov; Daniel Daugherty
>>> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
>>> Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed 
>>> in hotspot\src\os\windows\vm\os_windows.cpp: 3844
>>>
>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>> Thank you Daniel!
>>>>
>>>> Please find the updated webrev with your suggestions incorporated 
>>>> here:
>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>
>>>> Concerning the thread priority: If the application is of
>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be only
>>>> 10 (of maximum 31).
>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.
>>>> 85).aspx
>>>>
>>>>
>>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>
>>>> I believe, it should not be too much, and the machine will not become
>>>> busy with only those closing threads.
>>>> However, I hope it would be enough to make them complete faster than
>>>> other threads of the NORMAL priority level within the same 
>>>> application.
>>> I don't think this is necessary or desirable. Under normal usage 
>>> we're giving priority to exiting threads and that may disrupt the 
>>> usual scheduling patterns that applications see. You may posit that 
>>> it is "harmless" but we can't say that for sure. Nor can we actually 
>>> know that this will help with this particular bug. I would not add 
>>> in this new code.
>>>
>>> David
>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>>
>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>> Hello!
>>>>>>
>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>> in some test environments:
>>>>>> -----------
>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>> -----------
>>>>>>
>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>
>>>>>>
>>>>>> To address the issue it is proposed to do three things:
>>>>>> 1) increase the timeout for debug builds,
>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>>> them.
>>>>>>
>>>>>> Would you please help review the fix?
>>>>>>
>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>> src/os/windows/vm/os_windows.cpp
>>>>>
>>>>>    line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>
>>>>>      That uses the smaller value for only one build config (PRODUCT).
>>>>>
>>>>>    line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) 
>>>>> DEBUG_ONLY(4000)
>>>>> /*1 sec in product, 4 sec in debug*/
>>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>      Please add spaces between the comment delimiters and the comment
>>>>> text.
>>>>>
>>>>>      That uses the smaller timeout for only one build config 
>>>>> (PRODUCT).
>>>>>
>>>>>    line 3836           // Rise the priority...
>>>>>      Typo: 'Rise' -> 'Raise'
>>>>>
>>>>>      About the general idea of raising the exiting thread's priority,
>>>>>      if the exiting thread is looping in some Win* OS code after this
>>>>>      point, will raising the priority make the machine unusable?
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>> testset. No failures.
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Ivan
>>>>>>
>>>>>
>>>>>
>>>
>>
>
>
>
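
The review comment quoted above about MAX_EXIT_HANDLES and EXIT_TIMEOUT asks for the smaller values to apply to product builds only. A minimal sketch of that macro pattern follows, assuming (as the comment implies) that PRODUCT_ONLY and NOT_PRODUCT are keyed on a PRODUCT define. The macro definitions below are simplified stand-ins written for this example, and max_exit_handles and exit_timeout_ms are hypothetical helpers, not part of the webrev.

```cpp
// Simplified stand-ins for the build macros named in the review.
#ifdef PRODUCT
#define PRODUCT_ONLY(code) code
#define NOT_PRODUCT(code)
#else
#define PRODUCT_ONLY(code)
#define NOT_PRODUCT(code) code
#endif

// Exactly one of the two arguments survives preprocessing.
#define MAX_EXIT_HANDLES PRODUCT_ONLY(32)   NOT_PRODUCT(128)
#define EXIT_TIMEOUT     PRODUCT_ONLY(1000) NOT_PRODUCT(4000) /* 1 sec in product, 4 sec otherwise */

// Hypothetical accessors so the selected values can be inspected.
constexpr int max_exit_handles() { return MAX_EXIT_HANDLES; }
constexpr int exit_timeout_ms() { return EXIT_TIMEOUT; }
```

Compiling with -DPRODUCT selects 32 handles and a 1000 ms timeout; any other configuration gets 128 and 4000.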


From mandy.chung at oracle.com  Thu Nov 20 17:25:09 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 20 Nov 2014 09:25:09 -0800
Subject: [8u40] Putback request for 8064667: Provide support to help
	identify use of endorsed standards and extension mechanism
In-Reply-To: <546DB2A5.7060209@oracle.com>
References: <546A28F3.1010802@oracle.com>
	<546D075D.8050601@oracle.com>	<546D0D9A.70406@oracle.com>
	<546D1ABB.6090709@oracle.com> <546DB2A5.7060209@oracle.com>
Message-ID: <546E23F5.6010006@oracle.com>

I should file a CCC (thanks for the reminder) and this option should be 
documented in the release note or some document.

Mandy

On 11/20/14 1:21 AM, Seán Coffey wrote:
> Is a CCC required for this change Mandy ?
>
> regards,
> Sean.
>
> On 19/11/2014 22:33, Naoto Sato wrote:
>> This was somewhat misleading, but take it as an "approval."
>>
>> Naoto
>>
>> On 11/19/14, 1:37 PM, Naoto Sato wrote:
>>> Since you've already got the code review done specifically for 8u, you
>>> are good to go.
>>>
>>> Naoto
>>>
>>> On 11/19/14, 1:10 PM, Mandy Chung wrote:
>>>> Coleen and Calvin from runtime team have reviewed and approved this
>>>> fix.  I notice that jdk8u-dev got dropped in their review [1].
>>>>
>>>> May I get the 8u40 approval to putback this change?
>>>>
>>>> Updated webrev:
>>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.02
>>>>
>>>> Thanks
>>>> Mandy
>>>> [1]
>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2014-November/013312.html 
>>>>
>>>>
>>>>
>>>>
>>>> On 11/17/2014 8:57 AM, Mandy Chung wrote:
>>>>> This requests both code review and 8u40 approval for:
>>>>>    https://bugs.openjdk.java.net/browse/JDK-8064667
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~mchung/jdk8u/webrevs/8064667/webrev.00/
>>>>>
>>>>> JEP 220 [1] proposes to remove the endorsed standards override
>>>>> mechanism and extension mechanism. This patch adds a VM flag in 8u40
>>>>> to help identify any existing uses of these mechanisms so that users
>>>>> can turn on the VM flag to help identify if they depend on the
>>>>> endorsed standards override mechanism and extension mechanism and can
>>>>> plan to prepare for the migration to a newer JDK release early on.
>>>>> When -XX:+CheckEndorsedAndExtDirs is set, the VM will exit if the
>>>>> system property -Djava.endorsed.dirs or -Djava.ext.dirs is set, or if
>>>>> ${java.home}/lib/endorsed or ${java.home}/lib/ext exists, or any
>>>>> system extension directory contains JAR files.
>>>>>
>>>>> Thanks
>>>>> Mandy
>>>>> [1] http://openjdk.java.net/jeps/220
>>>>>
>>>>>
>>>>>
>>>>
>


From vladimir.kempik at oracle.com  Fri Nov 21 15:31:20 2014
From: vladimir.kempik at oracle.com (Vladimir Kempik)
Date: Fri, 21 Nov 2014 18:31:20 +0300
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546A50E3.6010200@oracle.com>
References: <546A206A.4070604@oracle.com> <546A50E3.6010200@oracle.com>
Message-ID: <546F5AC8.3050705@oracle.com>

Hello

Thanks for looking into this.

The needed data can't be collected at the moment because the bug isn't 
reproducible now. The cpuid dump I've collected from an EC2 virtual machine 
says that supports_processor_topology() should now report false:

static bool supports_processor_topology() {
   return (_cpuid_info.std_max_function >= 0xB) &&
   // eax[4:0] | ebx[0:15] == 0 indicates invalid topology level.
   // Some cpus have max cpuid >= 0xB but do not support processor topology.
   (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
}


  which comes from this being false:

(((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);

The check I've added is a sanity check to prevent the same crashes in the future.

Thanks. Vladimir


On 17.11.2014 22:47, Vladimir Kozlov wrote:
> According to next document the cpu has 10 cores (and 2 threads per core):
>
> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz 
>
>
> hs_err in the bug report reports only 2 processors and next lines are 
> missing:
>
> physical id    : 0
> siblings    : 4
> core id        : 0
> cpu cores    : 4
> apicid        : 0
> initial apicid    : 0
>
> I assume it is some kind of virtual environment with which cpuid 
> topology is not working (at least our code does not work).
> We may missing some checks which indicates that topology is not 
> supported.
> It would be nice if you can put all topology and related cpuid bits 
> from amazon ec2 in bug report.
> Checking for 0 could be fine but if it is not 0 it could be still 
> wrong if topology info is not supported.
>
> Thanks,
> Vladimir
>
> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>> Hi,
>>
>> Please review patch adding sanity check to cores_per_cpu():
>>
>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>
>> Few months ago we've got reports of java crashing in amazon ec2
>> environment (they use Xen).
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>
>> JVM args was used to make the crash: -XX:+UnlockCommercialFeatures
>> -XX:+FlightRecorder
>>
>> After investigation I think the crash could only have happened if
>> support_processor_topology() returned true and
>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>
>> I wasn't able to reproduce the bug on amazon ec2 cloud in present days.
>>
>> The patch adds sanity check, if cpu topology was used and resulted in 0
>> cores per cpu, then fallback to non-topology variant, which can't result
>> in 0 cores per cpu.
>>
>> Testing: JPRT.
>>
>> Thanks,
>> Vladimir.
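
The fallback described in the quoted review request can be sketched as below. This is a simplified model, not the webrev: CpuidInfo is a hypothetical flattened struct, legacy_cores_minus_1 is an assumed stand-in for the non-topology source of the core count (stored as cores minus one, which is why the fallback cannot yield zero), and the explicit guard against a zero divisor is an addition made for this sketch.

```cpp
#include <cstdint>

// Hypothetical, flattened view of the cpuid fields named in the thread.
struct CpuidInfo {
  uint32_t std_max_function;      // highest supported standard cpuid leaf
  uint32_t b0_eax;                // leaf 0xB, level 0, eax
  uint32_t b0_logical_cpus;       // leaf 0xB, level 0, ebx[15:0]
  uint32_t b1_logical_cpus;       // leaf 0xB, level 1, ebx[15:0]
  uint32_t legacy_cores_minus_1;  // assumed non-topology field: cores - 1
};

// Same shape as the predicate quoted in the message:
// eax[4:0] | ebx[15:0] == 0 indicates an invalid topology level.
bool supports_processor_topology(const CpuidInfo& c) {
  return c.std_max_function >= 0xB &&
         ((c.b0_eax & 0x1f) | c.b0_logical_cpus) != 0;
}

uint32_t cores_per_cpu(const CpuidInfo& c) {
  uint32_t result = 0;
  // Extra guard (not from the thread): never divide by a zero level-0 count.
  if (supports_processor_topology(c) && c.b0_logical_cpus != 0) {
    result = c.b1_logical_cpus / c.b0_logical_cpus;
  }
  // Sanity check: a zero result falls back to the non-topology variant.
  if (result == 0) {
    result = c.legacy_cores_minus_1 + 1;
  }
  return result;
}
```

For the 10-core, 2-threads-per-core part mentioned in the thread, level-1 and level-0 counts of 20 and 2 give 10; a level-1 count of 0 takes the fallback instead of returning 0.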


From dmitry.samersoff at oracle.com  Fri Nov 21 15:47:22 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 21 Nov 2014 18:47:22 +0300
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546A50E3.6010200@oracle.com>
References: <546A206A.4070604@oracle.com> <546A50E3.6010200@oracle.com>
Message-ID: <546F5E8A.9090007@oracle.com>

Vladimir,

If my memory serves, the Xen hypervisor used to alter the cpuinfo it provides.

So as long as we can't detect Xen and use the Xen API to get CPU
capabilities, Vladimir K.'s approach looks reasonable to me.

-Dmitry


On 2014-11-17 22:47, Vladimir Kozlov wrote:
> According to next document the cpu has 10 cores (and 2 threads per core):
> 
> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz
> 
> 
> hs_err in the bug report reports only 2 processors and next lines are
> missing:
> 
> physical id    : 0
> siblings    : 4
> core id        : 0
> cpu cores    : 4
> apicid        : 0
> initial apicid    : 0
> 
> I assume it is some kind of virtual environment with which cpuid
> topology is not working (at least our code does not work).
> We may missing some checks which indicates that topology is not supported.
> It would be nice if you can put all topology and related cpuid bits from
> amazon ec2 in bug report.
> Checking for 0 could be fine but if it is not 0 it could be still wrong
> if topology info is not supported.
> 
> Thanks,
> Vladimir
> 
> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>> Hi,
>>
>> Please review patch adding sanity check to cores_per_cpu():
>>
>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>
>> Few months ago we've got reports of java crashing in amazon ec2
>> environment (they use Xen).
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>
>> JVM args was used to make the crash: -XX:+UnlockCommercialFeatures
>> -XX:+FlightRecorder
>>
>> After investigation I think the crash could only have happened if
>> support_processor_topology() returned true and
>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>
>> I wasn't able to reproduce the bug on amazon ec2 cloud in present days.
>>
>> The patch adds sanity check, if cpu topology was used and resulted in 0
>> cores per cpu, then fallback to non-topology variant, which can't result
>> in 0 cores per cpu.
>>
>> Testing: JPRT.
>>
>> Thanks,
>> Vladimir.


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From vladimir.kozlov at oracle.com  Fri Nov 21 17:08:06 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 21 Nov 2014 09:08:06 -0800
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546F5AC8.3050705@oracle.com>
References: <546A206A.4070604@oracle.com> <546A50E3.6010200@oracle.com>
	<546F5AC8.3050705@oracle.com>
Message-ID: <546F7176.5020508@oracle.com>

 > (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);

That check was added long ago for 6968646 and is present in jdk7 and 6update. And the failure happened in a JDK that has it:

# JRE version: Java(TM) SE Runtime Environment (7.0_51-b13) (build 1.7.0_51-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.51-b03 mixed mode linux-amd64 compressed oops)

But if Dmitry is right, there is nothing more we can do here, so your change seems valid in that case.

One note - do you need to check (result == 0) in threads_per_core() too?

Thanks,
Vladimir

On 11/21/14 7:31 AM, Vladimir Kempik wrote:
> Hello
>
> Thanks for looking into this.
>
> It's impossible to collect needed data at the moment, the bug isn't reproducible now. And cpuid dump I've collected from
> ec2 virtual machine says that supports_processor_topology() should report false now:
>
> static bool supports_processor_topology() {
>    return (_cpuid_info.std_max_function >= 0xB) &&
>    // eax[4:0] | ebx[0:15] == 0 indicates invalid topology level.
>    // Some cpus have max cpuid >= 0xB but do not support processor topology.
>    (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
> }
>
>
>   which comes from this being false:
>
> (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>
> The check I've added is sanity check to prevent same crashes in future.
>
> Thanks. Vladimir
>
>
> On 17.11.2014 22:47, Vladimir Kozlov wrote:
>> According to next document the cpu has 10 cores (and 2 threads per core):
>>
>> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz
>>
>> hs_err in the bug report reports only 2 processors and next lines are missing:
>>
>> physical id    : 0
>> siblings    : 4
>> core id        : 0
>> cpu cores    : 4
>> apicid        : 0
>> initial apicid    : 0
>>
>> I assume it is some kind of virtual environment with which cpuid topology is not working (at least our code does not
>> work).
>> We may missing some checks which indicates that topology is not supported.
>> It would be nice if you can put all topology and related cpuid bits from amazon ec2 in bug report.
>> Checking for 0 could be fine but if it is not 0 it could be still wrong if topology info is not supported.
>>
>> Thanks,
>> Vladimir
>>
>> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>>> Hi,
>>>
>>> Please review patch adding sanity check to cores_per_cpu():
>>>
>>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>>
>>> Few months ago we've got reports of java crashing in amazon ec2
>>> environment (they use Xen).
>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>>
>>> JVM args was used to make the crash: -XX:+UnlockCommercialFeatures
>>> -XX:+FlightRecorder
>>>
>>> After investigation I think the crash could only have happened if
>>> support_processor_topology() returned true and
>>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>>
>>> I wasn't able to reproduce the bug on amazon ec2 cloud in present days.
>>>
>>> The patch adds sanity check, if cpu topology was used and resulted in 0
>>> cores per cpu, then fallback to non-topology variant, which can't result
>>> in 0 cores per cpu.
>>>
>>> Testing: JPRT.
>>>
>>> Thanks,
>>> Vladimir.
>

From vladimir.kempik at oracle.com  Fri Nov 21 17:19:18 2014
From: vladimir.kempik at oracle.com (Vladimir Kempik)
Date: Fri, 21 Nov 2014 20:19:18 +0300
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546F7176.5020508@oracle.com>
References: <546A206A.4070604@oracle.com> <546A50E3.6010200@oracle.com>
	<546F5AC8.3050705@oracle.com> <546F7176.5020508@oracle.com>
Message-ID: <546F7416.5080102@oracle.com>

Hello


 >That check was added long ago for 6968646 and is present in jdk7 and 
6update. And the failure happened in jdk which have it:

I meant that this check failed to do its job; there is no other way to get 
cores_per_cpu == 0 on an Intel CPU in this function.


 >One note - do you need to check (result == 0) in threads_per_core() too?

for result to be 0 in cores_per_cpu()

result = _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus /
  _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus;

_cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus needs to be zero and 
_cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus needs to be nonzero. In this 
case threads_per_core() isn't affected:

if (is_intel() && supports_processor_topology()) {
result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus;

If _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus == 0, then we would 
crash in cores_per_cpu() with a division by zero anyway.

That was my reason not to edit threads_per_core().

Thanks, Vladimir
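
The case analysis above can be written out as a small sketch. The names CoresResult, classify_cores_per_cpu and threads_per_core_topology are hypothetical; the two parameters stand for the level-1 and level-0 logical_cpus fields. One detail worth noting: because the division is an unsigned integer division, the quotient is also zero whenever the level-1 count is merely smaller than the level-0 count, not only when it is exactly zero.

```cpp
#include <cstdint>

enum class CoresResult { DivideByZero, Zero, Positive };

// Classifies what b1 / b0 would do, without ever dividing by zero.
CoresResult classify_cores_per_cpu(uint32_t b1_logical_cpus, uint32_t b0_logical_cpus) {
  if (b0_logical_cpus == 0) return CoresResult::DivideByZero;
  return (b1_logical_cpus / b0_logical_cpus == 0) ? CoresResult::Zero
                                                  : CoresResult::Positive;
}

// On the topology path, threads per core is just the level-0 count, so it is
// zero only in the case where the division in cores_per_cpu would already fail.
uint32_t threads_per_core_topology(uint32_t b0_logical_cpus) {
  return b0_logical_cpus;
}
```
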
On 21.11.2014 20:08, Vladimir Kozlov wrote:
> > (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | 
> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>
> That check was added long ago for 6968646 and is present in jdk7 and 
> 6update. And the failure happened in jdk which have it:
>
> # JRE version: Java(TM) SE Runtime Environment (7.0_51-b13) (build 
> 1.7.0_51-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.51-b03 mixed mode 
> linux-amd64 compressed oops)
>
> But if Dmitry is right we can do nothing here. So your change seems 
> valid in such case.
>
> One note - do you need to check (result == 0) in threads_per_core() too?
>
> Thanks,
> Vladimir
>
> On 11/21/14 7:31 AM, Vladimir Kempik wrote:
>> Hello
>>
>> Thanks for looking into this.
>>
>> It's impossible to collect the needed data at the moment; the bug isn't
>> reproducible now. And the cpuid dump I've collected from an
>> EC2 virtual machine says that supports_processor_topology() should
>> report false now:
>>
>> static bool supports_processor_topology() {
>>    return (_cpuid_info.std_max_function >= 0xB) &&
>>    // eax[4:0] | ebx[0:15] == 0 indicates invalid topology level.
>>    // Some cpus have max cpuid >= 0xB but do not support processor 
>> topology.
>>    (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | 
>> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>> }
>>
>>
>>   which comes from this being false:
>>
>> (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | 
>> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>>
>> The check I've added is a sanity check to prevent the same crashes in the future.
>>
>> Thanks. Vladimir
>>
>>
>> On 17.11.2014 22:47, Vladimir Kozlov wrote:
>>> According to the following document the CPU has 10 cores (and 2 threads per
>>> core):
>>>
>>> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz 
>>>
>>>
>>> hs_err in the bug report reports only 2 processors and the following lines
>>> are missing:
>>>
>>> physical id    : 0
>>> siblings    : 4
>>> core id        : 0
>>> cpu cores    : 4
>>> apicid        : 0
>>> initial apicid    : 0
>>>
>>> I assume it is some kind of virtual environment in which cpuid
>>> topology is not working (at least our code does not
>>> work).
>>> We may be missing some checks which indicate that topology is not
>>> supported.
>>> It would be nice if you could put all topology and related cpuid bits
>>> from Amazon EC2 in the bug report.
>>> Checking for 0 could be fine, but if it is not 0 it could still be
>>> wrong if topology info is not supported.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>>>> Hi,
>>>>
>>>> Please review patch adding sanity check to cores_per_cpu():
>>>>
>>>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>>>
>>>> A few months ago we got reports of Java crashing in the Amazon EC2
>>>> environment (they use Xen).
>>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>>>
>>>> JVM args used to trigger the crash: -XX:+UnlockCommercialFeatures
>>>> -XX:+FlightRecorder
>>>>
>>>> After investigation I think the crash could only have happened if
>>>> supports_processor_topology() returned true and
>>>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>>>
>>>> I wasn't able to reproduce the bug on the Amazon EC2 cloud at
>>>> present.
>>>>
>>>> The patch adds a sanity check: if CPU topology was used and resulted
>>>> in 0 cores per cpu, then fall back to the non-topology variant, which
>>>> can't result in 0 cores per cpu.
>>>>
>>>> Testing: JPRT.
>>>>
>>>> Thanks,
>>>> Vladimir.
>>


From vladimir.kozlov at oracle.com  Fri Nov 21 17:40:57 2014
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 21 Nov 2014 09:40:57 -0800
Subject: RFR: 8058935:  CPU detection gives 0 cores per cpu, 2 threads
	per core in Amazon EC2 environment
In-Reply-To: <546F7416.5080102@oracle.com>
References: <546A206A.4070604@oracle.com> <546A50E3.6010200@oracle.com>
	<546F5AC8.3050705@oracle.com> <546F7176.5020508@oracle.com>
	<546F7416.5080102@oracle.com>
Message-ID: <546F7929.60909@oracle.com>

Okay. Looks good.

Thanks,
Vladimir

On 11/21/14 9:19 AM, Vladimir Kempik wrote:
> Hello
>
>
>  >That check was added long ago for 6968646 and is present in jdk7 and
> 6update. And the failure happened in jdk which have it:
>
> I meant that this check failed to do its job; there is no other way to get
> cores_per_cpu == 0 on an Intel CPU in this function.
>
>
>  >One note - do you need to check (result == 0) in threads_per_core() too?
>
> For result to be 0 in cores_per_cpu(), given
>
> result = _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus /
>   _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus;
>
> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus needs to be zero and
> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus needs to be non-zero. In this
> case threads_per_core() isn't affected:
>
> if (is_intel() && supports_processor_topology()) {
> result = _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus;
>
> If _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus == 0, then we would
> crash in cores_per_cpu() with a division by zero anyway.
>
> That was my reason not to edit threads_per_core().
>
> Thanks, Vladimir
> On 21.11.2014 20:08, Vladimir Kozlov wrote:
>> > (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) |
>> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>>
>> That check was added long ago for 6968646 and is present in jdk7 and
>> 6update. And the failure happened in jdk which have it:
>>
>> # JRE version: Java(TM) SE Runtime Environment (7.0_51-b13) (build
>> 1.7.0_51-b13)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.51-b03 mixed mode
>> linux-amd64 compressed oops)
>>
>> But if Dmitry is right we can do nothing here. So your change seems
>> valid in such case.
>>
>> One note - do you need to check (result == 0) in threads_per_core() too?
>>
>> Thanks,
>> Vladimir
>>
>> On 11/21/14 7:31 AM, Vladimir Kempik wrote:
>>> Hello
>>>
>>> Thanks for looking into this.
>>>
>>> It's impossible to collect the needed data at the moment; the bug isn't
>>> reproducible now. And the cpuid dump I've collected from an
>>> EC2 virtual machine says that supports_processor_topology() should
>>> report false now:
>>>
>>> static bool supports_processor_topology() {
>>>    return (_cpuid_info.std_max_function >= 0xB) &&
>>>    // eax[4:0] | ebx[0:15] == 0 indicates invalid topology level.
>>>    // Some cpus have max cpuid >= 0xB but do not support processor
>>> topology.
>>>    (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) |
>>> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>>> }
>>>
>>>
>>>   which comes from this being false:
>>>
>>> (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) |
>>> _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
>>>
>>> The check I've added is a sanity check to prevent the same crashes in the future.
>>>
>>> Thanks. Vladimir
>>>
>>>
>>> On 17.11.2014 22:47, Vladimir Kozlov wrote:
>>>> According to the following document the CPU has 10 cores (and 2 threads per
>>>> core):
>>>>
>>>> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz
>>>>
>>>>
>>>> hs_err in the bug report reports only 2 processors and the following lines
>>>> are missing:
>>>>
>>>> physical id    : 0
>>>> siblings    : 4
>>>> core id        : 0
>>>> cpu cores    : 4
>>>> apicid        : 0
>>>> initial apicid    : 0
>>>>
>>>> I assume it is some kind of virtual environment in which cpuid
>>>> topology is not working (at least our code does not
>>>> work).
>>>> We may be missing some checks which indicate that topology is not
>>>> supported.
>>>> It would be nice if you could put all topology and related cpuid bits
>>>> from Amazon EC2 in the bug report.
>>>> Checking for 0 could be fine, but if it is not 0 it could still be
>>>> wrong if topology info is not supported.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>>>>> Hi,
>>>>>
>>>>> Please review patch adding sanity check to cores_per_cpu():
>>>>>
>>>>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>>>>
>>>>> A few months ago we got reports of Java crashing in the Amazon EC2
>>>>> environment (they use Xen).
>>>>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>>>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>>>>
>>>>> JVM args used to trigger the crash: -XX:+UnlockCommercialFeatures
>>>>> -XX:+FlightRecorder
>>>>>
>>>>> After investigation I think the crash could only have happened if
>>>>> supports_processor_topology() returned true and
>>>>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>>>>
>>>>> I wasn't able to reproduce the bug on the Amazon EC2 cloud at
>>>>> present.
>>>>>
>>>>> The patch adds a sanity check: if CPU topology was used and resulted
>>>>> in 0 cores per cpu, then fall back to the non-topology variant, which
>>>>> can't result in 0 cores per cpu.
>>>>>
>>>>> Testing: JPRT.
>>>>>
>>>>> Thanks,
>>>>> Vladimir.
>>>
>

From david.holmes at oracle.com  Mon Nov 24 05:07:19 2014
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 24 Nov 2014 15:07:19 +1000
Subject: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in
	hotspot\src\os\windows\vm\os_windows.cpp: 3844
In-Reply-To: <546DE3C6.5050200@oracle.com>
References: <5465F711.9090605@oracle.com> <54668EAF.9070807@oracle.com>
	<546915DF.7080106@oracle.com> <54699845.5010901@oracle.com>
	<a68e8e09-8efa-4186-8207-d6b4b02a8831@default>
	<546AF55A.8090203@oracle.com> <546B6568.7040701@oracle.com>
	<546DE3C6.5050200@oracle.com>
Message-ID: <5472BD07.6050405@oracle.com>

On 20/11/2014 10:51 PM, Ivan Gerasimov wrote:
> Thank you Daniel!
>
> David, are you still Okay with the updated webrev?

Yes.

Thanks,
David

> Compared to the previous one, I've added setting the priority of the
> current thread at line 3880 and changed the priority level
> from HIGHEST to ABOVE_NORMAL.
>
> Sincerely yours,
> Ivan
>
> On 18.11.2014 18:27, Daniel D. Daugherty wrote:
>> > http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/
>>
>> src/os/windows/vm/os_windows.cpp
>>     No comments.
>>
>> Thumbs up.
>>
>> Dan
>>
>>
>> On 11/18/14 12:29 AM, Ivan Gerasimov wrote:
>>> Hi Markus!
>>>
>>> The priority of the exiting thread will be raised for quite a short
>>> period of time -- right before the thread finishes exiting.
>>>
>>> There are two places where the priority is adjusted.
>>>
>>> Under normal conditions we should never see the first place hit.
>>> However, if we do, this means we have a huge number of threads.
>>> Raising the priority of one of them is a hint about which thread we
>>> want the scheduler to focus on.
>>>
>>> The second place is a bit different.
>>> We have several threads running immediately before ending the process.
>>> Some of them are at the exiting path and block exiting of the whole
>>> process.
>>> Raising the priority of those threads is a way to say we're not
>>> interested in all the other threads, as they are going to be
>>> terminated anyway.
>>>
>>> I just noticed that in the second scenario it may be appropriate to set
>>> the priority of the current thread to the same level as for the
>>> exiting threads.
>>> This way it'll be given a fair chance to continue if the timeout
>>> expires.
>>>
>>> I also think it should be enough to set the priority level to
>>> THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST.
>>> It will give just +1 to the priority value -- should be enough for
>>> the hint.
>>>
>>> Would you please take a look at the updated webrev:
>>> http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>> On 17.11.2014 11:33, Markus Grönlund wrote:
>>>> I agree with David.
>>>>
>>>> The side effects will be unknown and very hard to debug.
>>>>
>>>> Is there another way to accomplish the results without manipulating
>>>> base services?
>>>>
>>>> Thanks
>>>> Markus
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes
>>>> Sent: den 17 november 2014 07:40
>>>> To: Ivan Gerasimov; Daniel Daugherty
>>>> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev
>>>> Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed
>>>> in hotspot\src\os\windows\vm\os_windows.cpp: 3844
>>>>
>>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>>> Thank you Daniel!
>>>>>
>>>>> Please find the updated webrev with your suggestions incorporated
>>>>> here:
>>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>>
>>>>> Concerning the thread priority: If the application is of
>>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>>> THREAD_PRIORITY_HIGHEST will result in a priority value of only
>>>>> 10 (out of a maximum of 31).
>>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.
>>>>> 85).aspx
>>>>>
>>>>>
>>>>> And if the process is HIGH_PRIORITY_CLASS, then the thread with the
>>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>>
>>>>> I believe it should not be too much, and the machine will not become
>>>>> busy with only those closing threads.
>>>>> However, I hope it would be enough to make them complete faster than
>>>>> other threads of the NORMAL priority level within the same
>>>>> application.
>>>> I don't think this is necessary or desirable. Under normal usage
>>>> we're giving priority to exiting threads and that may disrupt the
>>>> usual scheduling patterns that applications see. You may posit that
>>>> it is "harmless" but we can't say that for sure. Nor can we actually
>>>> know that this will help with this particular bug. I would not add
>>>> in this new code.
>>>>
>>>> David
>>>>
>>>>> Sincerely yours,
>>>>> Ivan
>>>>>
>>>>>
>>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>>> Hello!
>>>>>>>
>>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>>> in some test environments:
>>>>>>> -----------
>>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>>> -----------
>>>>>>>
>>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>>
>>>>>>>
>>>>>>> To address the issue it is proposed to do three things:
>>>>>>> 1) increase the timeout for debug builds,
>>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>>> 3) raise the priority of the exiting threads, if we need to wait for
>>>>>>> them.
>>>>>>>
>>>>>>> Would you please help review the fix?
>>>>>>>
>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>
>>>>>>    line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>>
>>>>>>      That uses the smaller value for only one build config (PRODUCT).
>>>>>>
>>>>>>    line 3785: #define EXIT_TIMEOUT     NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/
>>>>>>      Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>>      Instead of DEBUG_ONLY can you use NOT_PRODUCT?
>>>>>>      Please add spaces between the comment delimiters and the comment
>>>>>> text.
>>>>>>
>>>>>>      That uses the smaller timeout for only one build config
>>>>>> (PRODUCT).
>>>>>>
>>>>>>    line 3836           // Rise the priority...
>>>>>>      Typo: 'Rise' -> 'Raise'
>>>>>>
>>>>>>      About the general idea of raising the exiting thread's priority,
>>>>>>      if the exiting thread is looping in some Win* OS code after this
>>>>>>      point, will raising the priority make the machine unusable?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>>> testset. No failures.
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>>
>

From ioi.lam at oracle.com  Mon Nov 24 11:58:52 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 24 Nov 2014 19:58:52 +0800
Subject: RFR (S) [8u40] backport request 8065346 and 8064701
Message-ID: <54731D7C.4010504@oracle.com>

Hi,

Please review the backport of these two bugs from 9 to 8u40. The patches 
applied cleanly.

http://cr.openjdk.java.net/~iklam/8065346_8064701_backport_8u40/

8064701: Some CDS optimizations should be disabled if bootclasspath is 
modified by JVMTI
Summary: Added API to track bootclasspath modification

8065346: WB_AddToBootstrapClassLoaderSearch calls 
JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
Summary: Removed ThreadToNativeFromVM and use 
java_lang_String::as_utf8_string instead

Thanks
- Ioi

From yasuenag at gmail.com  Mon Nov 24 13:21:41 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Mon, 24 Nov 2014 22:21:41 +0900
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <543E80F8.3080204@gmail.com>
References: <542C8274.3010809@gmail.com>	<54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
Message-ID: <547330E5.1050708@gmail.com>

Hi all,

I uploaded a webrev for this issue about a month ago.
Could you review it and sponsor it?


Thanks,

Yasumasa


On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
> Hi David,
>
> I've uploaded new webrev:
> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>
>
>> I wasn't suggesting that you make such a change though because it is large and disruptive.
>
>> Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>
> I restored check_or_create_dump() to os_posix.cpp .
> And I changed get_core_path() to create message which represents core dump path
> (including filename) in each OS.
>
>
>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything).
>
> I implemented all parameters in Linux kernel documentation:
> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>
> So I think that parameters which are processed are enough.
>
>
> Thanks,
>
> Yasumasa
>
>
>
> (2014/10/15 9:41), David Holmes wrote:
>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> Thank you for comments!
>>> I've uploaded new webrev. Could you review it again?
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>>>
>>> I am an author of jdk9. So I cannot commit it.
>>> Could you be a sponsor for this enhancement?
>>>
>>>
>>>> In which case that should be handled by the linux specific
>>>> get_core_path() function.
>>>
>>> Agree.
>>> So I implemented it in os_linux.cpp .
>>> But part of format characters (%P: global pid, %s: signal, %t dump time)
>>> are not processed
>>> in this function because I think these parameters are difficult to
>>> handle in it.
>>>
>>>    %P: I could not find API for this.
>>>    %s: We have to change arguments of get_core_path() .
>>>    %t: This parameter means timestamp of coredump. It is decided in Kernel.
>>>
>>>
>>>> Fixing this means changing all the os_posix using platforms. But your
>>>> patch is not about this part. :)
>>>
>>> I moved os::check_or_create_dump() to each OS implementations (AIX, BSD,
>>> Solaris, Linux) .
>>> So I can write Linux specific code to check_or_create_dump() .
>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>>
>> I wasn't suggesting that you make such a change though because it is large and disruptive. The simple handling of the | part of core_pattern was basically ok. Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>>
>> Sorry this has grown too large for me to deal with right now.
>>
>> David
>> -----
>>
>>>
>>>> Though I'm unclear whether it both invokes the program and creates a
>>>> core dump file; or just invokes the program?
>>>
>>> If '|' is set, Linux kernel will just redirect core image to user process.
>>> Kernel documentation says as below:
>>> ------------
>>> . If the first character of the pattern is a '|', the kernel will treat
>>>    the rest of the pattern as a command to run.  The core dump will be
>>>    written to the standard input of that program instead of to a file.
>>> ------------
>>>
>>> And the implementation of coredump (do_coredump()) follows it.
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
>>>
>>>
>>> In case of ABRT, ABRT dumps core image to default location
>>> (<CWD>/core.<PID>)
>>> if user set unlimited to resource limit of core (ulimit -c) .
>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>>>
>>>
>>>> A few style nits - you need spaces around keywords and before braces
>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>> than "treated".
>>>> And as you don't do anything in the non-redirect case I suggest
>>>> collapsing this:
>>>
>>> I've fixed them.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> (2014/10/13 9:41), David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> Sorry for my English.
>>>>>
>>>>> I want to propose that JVM should create message according to core
>>>>> pattern (/proc/sys/kernel/core_pattern) .
>>>>> So I filed it to JBS and created a patch.
>>>>
>>>> So I've had a quick look at this core_pattern business and it seems to
>>>> me that there are two aspects to this.
>>>>
>>>> First, without the leading |, the entry in the core_pattern file is a
>>>> naming pattern for the core file. In which case that should be handled
>>>> by the linux specific get_core_path() function. Though that in itself
>>>> can't fully report the expected name, as part of it is provided in the
>>>> shared code in os::check_or_create_dump. Fixing this means changing
>>>> all the os_posix using platforms. But your patch is not about this
>>>> part. :)
>>>>
>>>> Second, with a leading | the core_pattern is actually the name of a
>>>> program to execute when the program is about to core dump, and that is
>>>> what you report with your patch. Though I'm unclear whether it both
>>>> invokes the program and creates a core dump file; or just invokes the
>>>> program?
>>>>
>>>> So with regards to this second part your patch seems functionally ok.
>>>> I do dislike having a big chunk of linux specific code in this "posix"
>>>> support file but ...
>>>>
>>>> A few style nits - you need spaces around keywords and before braces eg:
>>>>
>>>>    if(x){
>>>>
>>>> should be
>>>>
>>>>    if (x) {
>>>>
>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>> than "treated".
>>>>
>>>> And as you don't do anything in the non-redirect case I suggest
>>>> collapsing this:
>>>>
>>>>    83           is_redirect = core_pattern[0] == '|';
>>>>    84         }
>>>>    85
>>>>    86         if(is_redirect){
>>>>    87           jio_snprintf(buffer, bufferSize,
>>>>    88                    "Core dumps may be treated with \"%s\"",
>>>> &core_pattern[1]);
>>>>    89         }
>>>>
>>>> to just
>>>>
>>>>    83           if (core_pattern[0] == '|') {  // redirect
>>>>    84             jio_snprintf(buffer, bufferSize, "Core dumps may be
>>>> processed with \"%s\"", &core_pattern[1]);
>>>>    85            }
>>>>    86         }
>>>>
>>>> Comments from other runtime folk appreciated.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
>>>>> <mailto:david.holmes at oracle.com>>:
>>>>>
>>>>>     Hi Yasumasa,
>>>>>
>>>>>     I'm sorry but I don't understand what you are proposing. When you
>>>>> say
>>>>>     "treat" do you mean "create"? Otherwise what do you mean by
>>>>> "treated"?
>>>>>
>>>>>     Thanks,
>>>>>     David
>>>>>
>>>>>     On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>>>>>      > I'm in Hackergarten @ JavaOne :-)
>>>>>      >
>>>>>      >
>>>>>      > Hi all,
>>>>>      >
>>>>>      > I would like to enhance the messages in hs_err report.
>>>>>      > Modern Linux kernel can treat core dump with user process
>>>>> (e.g. ABRT)
>>>>>      > However, hs_err report cannot detect it.
>>>>>      >
>>>>>      > I think that hs_err report should output messages as below:
>>>>>      > -------------
>>>>>      >     Failed to write core dump. Core dumps may be treated with
>>>>>     "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p
>>>>>     %u %g %t e"
>>>>>      > -------------
>>>>>      >
>>>>>      > I've uploaded webrev of this enhancement.
>>>>>      > Could you review it?
>>>>>      >
>>>>>      > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>>>>>      >
>>>>>      > This patch works fine on Fedora20 x86_64.
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      > Thanks,
>>>>>      >
>>>>>      > Yasumasa
>>>>>      >
>>>>>

From jiangli.zhou at oracle.com  Mon Nov 24 18:00:12 2014
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Mon, 24 Nov 2014 10:00:12 -0800
Subject: RFR (S) [8u40] backport request 8065346 and 8064701
In-Reply-To: <54731D7C.4010504@oracle.com>
References: <54731D7C.4010504@oracle.com>
Message-ID: <5473722C.2020907@oracle.com>

Hi Ioi,

Looks good for backport.

Thanks,
Jiangli

On 11/24/2014 03:58 AM, Ioi Lam wrote:
> Hi,
>
> Please review the backport of these two bugs from 9 to 8u40. The 
> patches applied cleanly.
>
> http://cr.openjdk.java.net/~iklam/8065346_8064701_backport_8u40/
>
> 8064701: Some CDS optimizations should be disabled if bootclasspath is 
> modified by JVMTI
> Summary: Added API to track bootclasspath modification
>
> 8065346: WB_AddToBootstrapClassLoaderSearch calls 
> JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
> Summary: Removed ThreadToNativeFromVM and use 
> java_lang_String::as_utf8_string instead
>
> Thanks
> - Ioi


From ioi.lam at oracle.com  Tue Nov 25 01:32:27 2014
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 25 Nov 2014 09:32:27 +0800
Subject: RFR (S) [8u40] backport request 8065346 and 8064701
In-Reply-To: <5473722C.2020907@oracle.com>
References: <54731D7C.4010504@oracle.com> <5473722C.2020907@oracle.com>
Message-ID: <5473DC2B.7000903@oracle.com>

Thanks Jiangli!

- Ioi

On 11/25/14, 2:00 AM, Jiangli Zhou wrote:
> Hi Ioi,
>
> Looks good for backport.
>
> Thanks,
> Jiangli
>
> On 11/24/2014 03:58 AM, Ioi Lam wrote:
>> Hi,
>>
>> Please review the backport of these two bugs from 9 to 8u40. The 
>> patches applied cleanly.
>>
>> http://cr.openjdk.java.net/~iklam/8065346_8064701_backport_8u40/
>>
>> 8064701: Some CDS optimizations should be disabled if bootclasspath 
>> is modified by JVMTI
>> Summary: Added API to track bootclasspath modification
>>
>> 8065346: WB_AddToBootstrapClassLoaderSearch calls 
>> JvmtiEnv::create_a_jvmti when not in _thread_in_vm state
>> Summary: Removed ThreadToNativeFromVM and use 
>> java_lang_String::as_utf8_string instead
>>
>> Thanks
>> - Ioi
>


From yasuenag at gmail.com  Tue Nov 25 03:34:44 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Tue, 25 Nov 2014 12:34:44 +0900
Subject: guarantee(PageArmed == 0) failed: invaliant
Message-ID: <5473F8D4.8000107@gmail.com>

Hi all,

My customer encountered a crash with the messages below:
--------
Internal Error (safepoint.cpp:309)
guarantee(PageArmed == 0) failed: invaliant
--------
 - JDK: JDK6u37 x64
 -  OS: RHEL 5.4 x86_64

I found similar issues in JBS:
 - JDK-7116986
 - JDK-7156454
 - JDK-8033717
  
I read safepoint.cpp in jdk9, and I guess this error is caused by the code below:
--------
     if (int(iterations) == DeferPollingPageLoopCount) {
        guarantee (PageArmed == 0, "invariant") ;
        PageArmed = 1 ;
        os::make_polling_page_unreadable();
     }
--------

"iterations" is defined as "unsigned int", and is incremented in each loop.
On the other hand, DeferPollingPageLoopCount is defined as intx, and its
default value is "-1".

When DeferPollingPageLoopCount is negative, "PageArmed" is already set to 1 here:
--------
  if (DeferPollingPageLoopCount < 0) {
    // Make polling safepoint aware
    guarantee (PageArmed == 0, "invariant") ;
    PageArmed = 1 ;
    os::make_polling_page_unreadable();
  }
--------
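
The arithmetic behind this question can be sketched in isolation. The
following is a minimal standalone illustration, not HotSpot code: intx_t is
assumed to be a pointer-sized signed integer standing in for intx, and the
function names original_check, proposed_check and demo are hypothetical. It
only shows how the cast and comparison behave; it says nothing about
whether the guarantee should fire.

```cpp
#include <cassert>
#include <climits>
#include <stdint.h>

// Stand-in for HotSpot's intx; assumed to be a pointer-sized signed integer.
typedef intptr_t intx_t;

// The comparison in the form quoted from safepoint.cpp.
inline bool original_check(unsigned int iterations, intx_t defer_count) {
  return int(iterations) == defer_count;
}

// The comparison in the form suggested in this message.
inline bool proposed_check(unsigned int iterations, intx_t defer_count) {
  return (defer_count >= 0) && (int(iterations) == defer_count);
}

inline void demo() {
  const intx_t defer_count = -1;  // default flag value
  // Early iterations never match a negative flag value.
  assert(!original_check(0, defer_count));
  // Once the counter reaches UINT_MAX the cast yields -1 on two's-complement
  // targets (guaranteed by the language only since C++20), so it matches.
  assert(original_check(UINT_MAX, defer_count));
  // With the extra sign test the negative default can never match.
  assert(!proposed_check(UINT_MAX, defer_count));
}
```

In other words, with the default of -1 the original comparison can only
become true after roughly 2^32 loop iterations.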


If "iterations" overflows, do we hit this guarantee?
I think this "if" statement should be rewritten as below:
--------
diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp
--- a/src/share/vm/runtime/safepoint.cpp	Mon Nov 24 09:57:02 2014 +0100
+++ b/src/share/vm/runtime/safepoint.cpp	Tue Nov 25 12:19:58 2014 +0900
@@ -288,7 +288,8 @@
       // 9. On windows consider using the return value from SwitchThreadTo()
       //    to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions.
 
-      if (int(iterations) == DeferPollingPageLoopCount) {
+      if ((DeferPollingPageLoopCount >= 0) &&
+                  (int(iterations) == DeferPollingPageLoopCount)) {
          guarantee (PageArmed == 0, "invariant") ;
          PageArmed = 1 ;
          os::make_polling_page_unreadable();
--------


If this is correct, I will file it in JBS and upload a webrev.
Could you help me to resolve this issue?


Thanks,

Yasumasa


From david.holmes at oracle.com  Tue Nov 25 07:04:20 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 25 Nov 2014 17:04:20 +1000
Subject: guarantee(PageArmed == 0) failed: invaliant
In-Reply-To: <5473F8D4.8000107@gmail.com>
References: <5473F8D4.8000107@gmail.com>
Message-ID: <547429F4.2020803@oracle.com>

Hi Yasumasa,

On 25/11/2014 1:34 PM, Yasumasa Suenaga wrote:
> Hi all,
> 
> My customer encountered crash with below messages:
> --------
> Internal Error (safepoint.cpp:309)
> guarantee(PageArmed == 0) failed: invaliant
> --------
>   - JDK: JDK6u37 x64
>   -  OS: RHEL 5.4 x86_64
> 
> I found similar issues in JBS:
>   - JDK-7116986
>   - JDK-7156454
>   - JDK-8033717
>    
> I read safepoint.cpp in jdk9, I guess this error is caused in below:
> --------
>       if (int(iterations) == DeferPollingPageLoopCount) {
>          guarantee (PageArmed == 0, "invariant") ;
>          PageArmed = 1 ;
>          os::make_polling_page_unreadable();
>       }
> --------
> 
> "iterations" is defined as "unsigned int", and increments in each loop.
> On the other hand, DeferPollingPageLoopCount is defined intx and default
> value is "-1" .
> 
> "PageArmed" sets to 1.
> --------
>    if (DeferPollingPageLoopCount < 0) {
>      // Make polling safepoint aware
>      guarantee (PageArmed == 0, "invariant") ;
>      PageArmed = 1 ;
>      os::make_polling_page_unreadable();
>    }
> --------
> 
> 
> If "iterations" is overflowed, do we encounter this guarantee ?
> I think this "if" statement should rewrite as below:

No, we want this overflow to trigger the guarantee failure: it indicates
a problem elsewhere in the VM, because a thread is not reaching the
requested safepoint in a timely manner.

When crashes like this occur you need to examine all the running threads
to find out which are not safepoint-safe and then determine what they
are doing and why they have not performed a safepoint check.

David
------

> --------
> diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp
> --- a/src/share/vm/runtime/safepoint.cpp	Mon Nov 24 09:57:02 2014 +0100
> +++ b/src/share/vm/runtime/safepoint.cpp	Tue Nov 25 12:19:58 2014 +0900
> @@ -288,7 +288,8 @@
>         // 9. On windows consider using the return value from SwitchThreadTo()
>         //    to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions.
>   
> -      if (int(iterations) == DeferPollingPageLoopCount) {
> +      if ((DeferPollingPageLoopCount >= 0) &&
> +                  (int(iterations) == DeferPollingPageLoopCount)) {
>            guarantee (PageArmed == 0, "invariant") ;
>            PageArmed = 1 ;
>            os::make_polling_page_unreadable();
> --------
> 
> 
> If it is correct, I will file it to JBS and upload webrev.
> Could you help me to resolve this issue?
> 
> 
> Thanks,
> 
> Yasumasa
> 

From david.holmes at oracle.com  Tue Nov 25 08:38:58 2014
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 25 Nov 2014 18:38:58 +1000
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <547330E5.1050708@gmail.com>
References: <542C8274.3010809@gmail.com>	<54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
	<547330E5.1050708@gmail.com>
Message-ID: <54744022.2030208@oracle.com>

Sorry Yasumasa, this fell off my radar and I was hoping for others to 
comment. We still need a second reviewer.

The change in:
  src/os/aix/vm/os_aix.cpp
  src/os/solaris/vm/os_solaris.cpp

   jio_snprintf(buffer, bufferSize, "%s/core or core.%d", current_process_id());

has no argument for the %s - presumably p was intended.

Thanks,
David

On 24/11/2014 11:21 PM, Yasumasa Suenaga wrote:
> Hi all,
>
> I've uploaded webrev for this issue about a month ago.
> Could you review it and sponsor it?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> I've uploaded new webrev:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>>
>>
>>> I wasn't suggesting that you make such a change though because it is
>>> large and disruptive.
>>
>>> Unfactoring check_or_create_dump is a step backwards in terms of code
>>> sharing.
>>
>> I restored check_or_create_dump() to os_posix.cpp .
>> And I changed get_core_path() to create message which represents core
>> dump path
>> (including filename) in each OS.
>>
>>
>>> Expanding the get_core_path in os_linux.cpp to handle the
>>> core_pattern may be okay (but I don't know enough about it to
>>> validate everything).
>>
>> I implemented all parameters in Linux kernel documentation:
>> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>>
>> So I think that parameters which are processed are enough.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>
>> (2014/10/15 9:41), David Holmes wrote:
>>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> Thank you for comments!
>>>> I've uploaded new webrev. Could you review it again?
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>>>>
>>>> I am an author of jdk9. So I cannot commit it.
>>>> Could you be a sponsor for this enhancement?
>>>>
>>>>
>>>>> In which case that should be handled by the linux specific
>>>>> get_core_path() function.
>>>>
>>>> Agree.
>>>> So I implemented it in os_linux.cpp .
>>>> But part of format characters (%P: global pid, %s: signal, %t dump
>>>> time)
>>>> are not processed
>>>> in this function because I think these parameters are difficult to
>>>> handle in it.
>>>>
>>>>    %P: I could not find API for this.
>>>>    %s: We have to change arguments of get_core_path() .
>>>>    %t: This parameter means timestamp of coredump. It is decided in
>>>> Kernel.
>>>>
>>>>
>>>>> Fixing this means changing all the os_posix using platforms. But your
>>>>> patch is not about this part. :)
>>>>
>>>> I moved os::check_or_create_dump() to each OS implementations (AIX,
>>>> BSD,
>>>> Solaris, Linux) .
>>>> So I can write Linux specific code to check_or_create_dump() .
>>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>>>
>>> I wasn't suggesting that you make such a change though because it is
>>> large and disruptive. The simple handling of the | part of
>>> core_pattern was basically ok. Expanding the get_core_path in
>>> os_linux.cpp to handle the core_pattern may be okay (but I don't know
>>> enough about it to validate everything). Unfactoring
>>> check_or_create_dump is a step backwards in terms of code sharing.
>>>
>>> Sorry this has grown too large for me to deal with right now.
>>>
>>> David
>>> -----
>>>
>>>>
>>>>> Though I'm unclear whether it both invokes the program and creates a
>>>>> core dump file; or just invokes the program?
>>>>
>>>> If '|' is set, Linux kernel will just redirect core image to user
>>>> process.
>>>> Kernel documentation says as below:
>>>> ------------
>>>> . If the first character of the pattern is a '|', the kernel will treat
>>>>    the rest of the pattern as a command to run.  The core dump will be
>>>>    written to the standard input of that program instead of to a file.
>>>> ------------
>>>>
>>>> And implementation of coredump (do_coredump()) follows to it.
>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
>>>>
>>>>
>>>>
>>>> In case of ABRT, ABRT dumps core image to default location
>>>> (<CWD>/core.<PID>)
>>>> if user set unlimited to resource limit of core (ulimit -c) .
>>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>>>>
>>>>
>>>>> A few style nits - you need spaces around keywords and before braces
>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>> than "treated".
>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>> collapsing this:
>>>>
>>>> I've fixed them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> (2014/10/13 9:41), David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Sorry for my English.
>>>>>>
>>>>>> I want to propose that JVM should create message according to core
>>>>>> pattern (/proc/sys/kernel/core_pattern) .
>>>>>> So I filed it to JBS and created a patch.
>>>>>
>>>>> So I've had a quick look at this core_pattern business and it seems to
>>>>> me that there are two aspects to this.
>>>>>
>>>>> First, without the leading |, the entry in the core_pattern file is a
>>>>> naming pattern for the core file. In which case that should be handled
>>>>> by the linux specific get_core_path() function. Though that in itself
>>>>> can't fully report the expected name, as part of it is provided in the
>>>>> shared code in os::check_or_create_dump. Fixing this means changing
>>>>> all the os_posix using platforms. But your patch is not about this
>>>>> part. :)
>>>>>
>>>>> Second, with a leading | the core_pattern is actually the name of a
>>>>> program to execute when the program is about to core dump, and that is
>>>>> what you report with your patch. Though I'm unclear whether it both
>>>>> invokes the program and creates a core dump file; or just invokes the
>>>>> program?
>>>>>
>>>>> So with regards to this second part your patch seems functionally ok.
>>>>> I do dislike having a big chunk of linux specific code in this "posix"
>>>>> support file but ...
>>>>>
>>>>> A few style nits - you need spaces around keywords and before
>>>>> braces eg:
>>>>>
>>>>>    if(x){
>>>>>
>>>>> should be
>>>>>
>>>>>    if (x) {
>>>>>
>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>> than "treated".
>>>>>
>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>> collapsing this:
>>>>>
>>>>>    83           is_redirect = core_pattern[0] == '|';
>>>>>    84         }
>>>>>    85
>>>>>    86         if(is_redirect){
>>>>>    87           jio_snprintf(buffer, bufferSize,
>>>>>    88                    "Core dumps may be treated with \"%s\"",
>>>>> &core_pattern[1]);
>>>>>    89         }
>>>>>
>>>>> to just
>>>>>
>>>>>    83           if (core_pattern[0] == '|') {  // redirect
>>>>>    84             jio_snprintf(buffer, bufferSize, "Core dumps may be
>>>>> processed with \"%s\"", &core_pattern[1]);
>>>>>    85            }
>>>>>    86         }
>>>>>
>>>>> Comments from other runtime folk appreciated.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>>:
>>>>>>
>>>>>>     Hi Yasumasa,
>>>>>>
>>>>>>     I'm sorry but I don't understand what you are proposing. When you
>>>>>> say
>>>>>>     "treat" do you mean "create"? Otherwise what do you mean by
>>>>>> "treated"?
>>>>>>
>>>>>>     Thanks,
>>>>>>     David
>>>>>>
>>>>>>     On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>>>>>>      > I'm in Hackergarten @ JavaOne :-)
>>>>>>      >
>>>>>>      >
>>>>>>      > Hi all,
>>>>>>      >
>>>>>>      > I would like to enhance the messages in hs_err report.
>>>>>>      > Modern Linux kernel can treat core dump with user process
>>>>>> (e.g. ABRT)
>>>>>>      > However, hs_err report cannot detect it.
>>>>>>      >
>>>>>>      > I think that hs_err report should output messages as below:
>>>>>>      > -------------
>>>>>>      >     Failed to write core dump. Core dumps may be treated with
>>>>>>     "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s
>>>>>> %c %p
>>>>>>     %u %g %t e"
>>>>>>      > -------------
>>>>>>      >
>>>>>>      > I've uploaded webrev of this enhancement.
>>>>>>      > Could you review it?
>>>>>>      >
>>>>>>      > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>>>>>>      >
>>>>>>      > This patch works fine on Fedora20 x86_64.
>>>>>>      >
>>>>>>      >
>>>>>>      >
>>>>>>      > Thanks,
>>>>>>      >
>>>>>>      > Yasumasa
>>>>>>      >
>>>>>>

From yasuenag at gmail.com  Tue Nov 25 08:48:33 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Tue, 25 Nov 2014 17:48:33 +0900
Subject: guarantee(PageArmed == 0) failed: invaliant
In-Reply-To: <547429F4.2020803@oracle.com>
References: <5473F8D4.8000107@gmail.com>
	<547429F4.2020803@oracle.com>
Message-ID: <CAGFVN2AnNi-xreqoKXAAMm6jFg9OjjJ1zpXqnOD9XYCkGwjfRg@mail.gmail.com>

Hi David,
Thank you for the details.

I understand the purpose of this guarantee now.

I read the hs_err log again and found a thread whose state is _thread_new.
I guess that is the cause of this issue, but I cannot confirm it because the
core image was not available.

If this crash is reproduced, I will try to check the details.

Thanks,

Yasumasa
 2014/11/25 16:04 "David Holmes" <david.holmes at oracle.com>:

> Hi Yasumasa,
>
> On 25/11/2014 1:34 PM, Yasumasa Suenaga wrote:
> > Hi all,
> >
> > My customer encountered crash with below messages:
> > --------
> > Internal Error (safepoint.cpp:309)
> > guarantee(PageArmed == 0) failed: invaliant
> > --------
> >   - JDK: JDK6u37 x64
> >   -  OS: RHEL 5.4 x86_64
> >
> > I found similar issues in JBS:
> >   - JDK-7116986
> >   - JDK-7156454
> >   - JDK-8033717
> >
> > I read safepoint.cpp in jdk9, I guess this error is caused in below:
> > --------
> >       if (int(iterations) == DeferPollingPageLoopCount) {
> >          guarantee (PageArmed == 0, "invariant") ;
> >          PageArmed = 1 ;
> >          os::make_polling_page_unreadable();
> >       }
> > --------
> >
> > "iterations" is defined as "unsigned int", and increments in each loop.
> > On the other hand, DeferPollingPageLoopCount is defined intx and default
> > value is "-1" .
> >
> > "PageArmed" sets to 1.
> > --------
> >    if (DeferPollingPageLoopCount < 0) {
> >      // Make polling safepoint aware
> >      guarantee (PageArmed == 0, "invariant") ;
> >      PageArmed = 1 ;
> >      os::make_polling_page_unreadable();
> >    }
> > --------
> >
> >
> > If "iterations" is overflowed, do we encounter this guarantee ?
> > I think this "if" statement should rewrite as below:
>
> No we want this overflow to trigger the guarantee failure - it indicates
> a problem elsewhere in the VM because a thread is not reaching the
> safepoint that has been requested, in a timely manner.
>
> When crashes like this occur you need to examine all the running threads
> to find out which are not safepoint-safe and then determine what they
> are doing and why they have not performed a safepoint check.
>
> David
> ------
>
> > --------
> > diff -r 7e08ae41ddbe src/share/vm/runtime/safepoint.cpp
> > --- a/src/share/vm/runtime/safepoint.cpp	Mon Nov 24 09:57:02 2014 +0100
> > +++ b/src/share/vm/runtime/safepoint.cpp	Tue Nov 25 12:19:58 2014 +0900
> > @@ -288,7 +288,8 @@
> >         // 9. On windows consider using the return value from SwitchThreadTo()
> >         //    to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions.
> >
> > -      if (int(iterations) == DeferPollingPageLoopCount) {
> > +      if ((DeferPollingPageLoopCount >= 0) &&
> > +                  (int(iterations) == DeferPollingPageLoopCount)) {
> >            guarantee (PageArmed == 0, "invariant") ;
> >            PageArmed = 1 ;
> >            os::make_polling_page_unreadable();
> > --------
> >
> >
> > If it is correct, I will file it to JBS and upload webrev.
> > Could you help me to resolve this issue?
> >
> >
> > Thanks,
> >
> > Yasumasa
> >
>

From staffan.larsen at oracle.com  Tue Nov 25 09:15:34 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 25 Nov 2014 10:15:34 +0100
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <547330E5.1050708@gmail.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
	<547330E5.1050708@gmail.com>
Message-ID: <FE1302A7-A228-43E9-BCB6-74558268E296@oracle.com>

src/os/linux/vm/os_linux.cpp:
I'm inclined to think this is too complicated and hard to test and maintain (and I see no tests in the webrev). Could we not simplify this to print a helpful message instead? Something that prints the core_pattern and perhaps some of the values that could be used for substitution, but does not do the actual substitution? I think that would go a long way and be a lot more maintainable.
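
A rough sketch of that simpler approach, assuming the contents of /proc/sys/kernel/core_pattern have already been read into a string. The function name describe_core_pattern and the wording of the second message are illustrative assumptions, not taken from the webrev:

```cpp
#include <string>

// Hypothetical helper: report the core_pattern as-is instead of expanding
// the format specifiers ourselves.
std::string describe_core_pattern(const std::string& core_pattern) {
  if (!core_pattern.empty() && core_pattern[0] == '|') {
    // A leading '|' means the kernel pipes the core image to this command.
    return "Core dumps may be processed with \"" +
           core_pattern.substr(1) + "\"";
  }
  // Otherwise the pattern names the core file; the kernel expands
  // specifiers such as %p when it writes the dump.
  return "Core dumps are named using the pattern \"" + core_pattern + "\"";
}
```

Because nothing is substituted, the only platform-specific part is reading the file, and the message stays correct if the kernel gains new specifiers.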

src/os/bsd/vm/os_bsd.cpp:
On OS X cores are by default written to /cores/core.<pid>. This is configurable with the kern.corefile sysctl variable, although it is rare to do so.

 /Staffan

> On 24 nov 2014, at 14:21, Yasumasa Suenaga <yasuenag at gmail.com> wrote:
> 
> Hi all,
> 
> I've uploaded webrev for this issue about a month ago.
> Could you review it and sponsor it?
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
>> Hi David,
>> 
>> I've uploaded new webrev:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>> 
>> 
>>> I wasn't suggesting that you make such a change though because it is large and disruptive.
>> 
>>> Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>> 
>> I restored check_or_create_dump() to os_posix.cpp .
>> And I changed get_core_path() to create message which represents core dump path
>> (including filename) in each OS.
>> 
>> 
>>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything).
>> 
>> I implemented all parameters in Linux kernel documentation:
>> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>> 
>> So I think that parameters which are processed are enough.
>> 
>> 
>> Thanks,
>> 
>> Yasumasa
>> 
>> 
>> 
>> (2014/10/15 9:41), David Holmes wrote:
>>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>> 
>>>> Thank you for comments!
>>>> I've uploaded new webrev. Could you review it again?
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>>>> 
>>>> I am an author of jdk9. So I cannot commit it.
>>>> Could you be a sponsor for this enhancement?
>>>> 
>>>> 
>>>>> In which case that should be handled by the linux specific
>>>>> get_core_path() function.
>>>> 
>>>> Agree.
>>>> So I implemented it in os_linux.cpp .
>>>> But part of format characters (%P: global pid, %s: signal, %t dump time)
>>>> are not processed
>>>> in this function because I think these parameters are difficult to
>>>> handle in it.
>>>> 
>>>>   %P: I could not find API for this.
>>>>   %s: We have to change arguments of get_core_path() .
>>>>   %t: This parameter means timestamp of coredump. It is decided in Kernel.
>>>> 
>>>> 
>>>>> Fixing this means changing all the os_posix using platforms. But your
>>>>> patch is not about this part. :)
>>>> 
>>>> I moved os::check_or_create_dump() to each OS implementations (AIX, BSD,
>>>> Solaris, Linux) .
>>>> So I can write Linux specific code to check_or_create_dump() .
>>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>>> 
>>> I wasn't suggesting that you make such a change though because it is large and disruptive. The simple handling of the | part of core_pattern was basically ok. Expanding the get_core_path in os_linux.cpp to handle the core_pattern may be okay (but I don't know enough about it to validate everything). Unfactoring check_or_create_dump is a step backwards in terms of code sharing.
>>> 
>>> Sorry this has grown too large for me to deal with right now.
>>> 
>>> David
>>> -----
>>> 
>>>> 
>>>>> Though I'm unclear whether it both invokes the program and creates a
>>>>> core dump file; or just invokes the program?
>>>> 
>>>> If '|' is set, Linux kernel will just redirect core image to user process.
>>>> Kernel documentation says as below:
>>>> ------------
>>>> . If the first character of the pattern is a '|', the kernel will treat
>>>>   the rest of the pattern as a command to run.  The core dump will be
>>>>   written to the standard input of that program instead of to a file.
>>>> ------------
>>>> 
>>>> And implementation of coredump (do_coredump()) follows to it.
>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
>>>> 
>>>> 
>>>> In case of ABRT, ABRT dumps core image to default location
>>>> (<CWD>/core.<PID>)
>>>> if user set unlimited to resource limit of core (ulimit -c) .
>>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>>>> 
>>>> 
>>>>> A few style nits - you need spaces around keywords and before braces
>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>> than "treated".
>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>> collapsing this:
>>>> 
>>>> I've fixed them.
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Yasumasa
>>>> 
>>>> 
>>>> (2014/10/13 9:41), David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>> 
>>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>> 
>>>>>> Sorry for my English.
>>>>>> 
>>>>>> I want to propose that JVM should create message according to core
>>>>>> pattern (/proc/sys/kernel/core_pattern) .
>>>>>> So I filed it to JBS and created a patch.
>>>>> 
>>>>> So I've had a quick look at this core_pattern business and it seems to
>>>>> me that there are two aspects to this.
>>>>> 
>>>>> First, without the leading |, the entry in the core_pattern file is a
>>>>> naming pattern for the core file. In which case that should be handled
>>>>> by the linux specific get_core_path() function. Though that in itself
>>>>> can't fully report the expected name, as part of it is provided in the
>>>>> shared code in os::check_or_create_dump. Fixing this means changing
>>>>> all the os_posix using platforms. But your patch is not about this
>>>>> part. :)
>>>>> 
>>>>> Second, with a leading | the core_pattern is actually the name of a
>>>>> program to execute when the program is about to core dump, and that is
>>>>> what you report with your patch. Though I'm unclear whether it both
>>>>> invokes the program and creates a core dump file; or just invokes the
>>>>> program?
>>>>> 
>>>>> So with regards to this second part your patch seems functionally ok.
>>>>> I do dislike having a big chunk of linux specific code in this "posix"
>>>>> support file but ...
>>>>> 
>>>>> A few style nits - you need spaces around keywords and before braces eg:
>>>>> 
>>>>>   if(x){
>>>>> 
>>>>> should be
>>>>> 
>>>>>   if (x) {
>>>>> 
>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>> than "treated".
>>>>> 
>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>> collapsing this:
>>>>> 
>>>>>   83           is_redirect = core_pattern[0] == '|';
>>>>>   84         }
>>>>>   85
>>>>>   86         if(is_redirect){
>>>>>   87           jio_snprintf(buffer, bufferSize,
>>>>>   88                    "Core dumps may be treated with \"%s\"",
>>>>> &core_pattern[1]);
>>>>>   89         }
>>>>> 
>>>>> to just
>>>>> 
>>>>>   83           if (core_pattern[0] == '|') {  // redirect
>>>>>   84             jio_snprintf(buffer, bufferSize, "Core dumps may be
>>>>> processed with \"%s\"", &core_pattern[1]);
>>>>>   85            }
>>>>>   86         }
>>>>> 
>>>>> Comments from other runtime folk appreciated.
>>>>> 
>>>>> Thanks,
>>>>> David
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Yasumasa
>>>>>> 
>>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>>:
>>>>>> 
>>>>>>    Hi Yasumasa,
>>>>>> 
>>>>>>    I'm sorry but I don't understand what you are proposing. When you
>>>>>> say
>>>>>>    "treat" do you mean "create"? Otherwise what do you mean by
>>>>>> "treated"?
>>>>>> 
>>>>>>    Thanks,
>>>>>>    David
>>>>>> 
>>>>>>    On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>>>>>>     > I'm in Hackergarten @ JavaOne :-)
>>>>>>     >
>>>>>>     >
>>>>>>     > Hi all,
>>>>>>     >
>>>>>>     > I would like to enhance the messages in hs_err report.
>>>>>>     > Modern Linux kernel can treat core dump with user process
>>>>>> (e.g. ABRT)
>>>>>>     > However, hs_err report cannot detect it.
>>>>>>     >
>>>>>>     > I think that hs_err report should output messages as below:
>>>>>>     > -------------
>>>>>>     >     Failed to write core dump. Core dumps may be treated with
>>>>>>    "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p
>>>>>>    %u %g %t e"
>>>>>>     > -------------
>>>>>>     >
>>>>>>     > I've uploaded webrev of this enhancement.
>>>>>>     > Could you review it?
>>>>>>     >
>>>>>>     > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>>>>>>     >
>>>>>>     > This patch works fine on Fedora20 x86_64.
>>>>>>     >
>>>>>>     >
>>>>>>     >
>>>>>>     > Thanks,
>>>>>>     >
>>>>>>     > Yasumasa
>>>>>>     >
>>>>>> 


From thomas.stuefe at gmail.com  Tue Nov 25 14:12:19 2014
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 25 Nov 2014 15:12:19 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
Message-ID: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>

Hi all,

I'd like to contribute a fix to error handling to improve stability of
error reporting.


Bug Report:
https://bugs.openjdk.java.net/browse/JDK-8065895


Webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.00/


Problem:

When a synchronous error signal is raised during error reporting, and that
signal differs from the original signal which triggered error reporting,
the VM may die or hang (depending on the platform). This leaves empty or
almost-empty hs-err files.

Example: we first crash with a SIGILL (e.g. in compiled code), then a
SIGSEGV happens when printing the stack trace.

Secondary error handling should catch the SIGSEGV and continue error
reporting with the next step. But that does not work in this case.

Causes:
  - hotspot blocks all signals when installing signal handlers. Within the
secondary signal handler, only the original signal gets unblocked; the rest
remain blocked. If a different synchronous error signal is then raised, it
is still blocked, and the OS would terminate the process right away because
there is no way to defer synchronous error signals.
  - when installing signal handlers for secondary error handling, only
signal handlers for SIGBUS and SIGSEGV were added; but more signals may
happen during error handling (we saw SIGTRAP, SIGILL, etc.).

Fix:
The secondary signal handler is now installed for all synchronous error
signals (which is now a list and easily expandable in vmError_<os>.cpp). All
those signals are unblocked.
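
The shape of the fix can be sketched roughly as follows. This is a
standalone POSIX illustration, not the code from the webrev: the names
SYNC_ERROR_SIGNALS, secondary_crash_handler and
install_secondary_handlers are hypothetical, the exact signal list is an
assumption, and the handler here simply exits where the VM would continue
with the next error reporting step.

```cpp
#include <csignal>
#include <cstddef>
#include <cstring>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Assumed list of synchronous error signals; the real list lives in
// vmError_<os>.cpp and may differ.
static const int SYNC_ERROR_SIGNALS[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP };
static const std::size_t NUM_SYNC_ERROR_SIGNALS =
    sizeof(SYNC_ERROR_SIGNALS) / sizeof(SYNC_ERROR_SIGNALS[0]);

// Placeholder handler. _exit() is async-signal-safe and avoids returning
// from a handler for a hardware-generated signal.
extern "C" void secondary_crash_handler(int sig, siginfo_t*, void*) {
  _exit(128 + sig);
}

// Install the same handler for every listed signal, then unblock all of
// them in the calling thread so a secondary fault is actually delivered.
bool install_secondary_handlers() {
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  sa.sa_sigaction = secondary_crash_handler;

  sigset_t to_unblock;
  sigemptyset(&to_unblock);
  for (std::size_t i = 0; i < NUM_SYNC_ERROR_SIGNALS; i++) {
    if (sigaction(SYNC_ERROR_SIGNALS[i], &sa, NULL) != 0) {
      return false;
    }
    sigaddset(&to_unblock, SYNC_ERROR_SIGNALS[i]);
  }
  return pthread_sigmask(SIG_UNBLOCK, &to_unblock, NULL) == 0;
}
```

The sketch uses pthread_sigmask rather than sigprocmask so that the mask
change applies to the thread doing the error reporting.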

In order to test the fix, some test code was added too:

a) debug.cpp: changed "test_error_handler()" to a more generic
"controlled_crash(int how)", which can be called at arbitrary places, not
only at initialization time. "test_error_handler()" still exists and just
calls "controlled_crash(ErrorHandlerTest)", so its behaviour did not change.

b) expand controlled_crash():
  - added option 14, a guaranteed crash with a SIGSEGV at a predefined
address, which is printed out and can later be tested against. Note that I
realize that this is a bit redundant to option 12 or 13, but the crash is
guaranteed and it crashes with a not-null address which should turn up in
hs-err file (to check that hs-err file is correct).
  - added option 15, a guaranteed crash with a SIGILL at a predefined
instruction address. Here, the point is to get a real-world SIGILL (not
just raising it) at a not-null known pc.

c) Add a parameter "-XX:TestCrashDuringErrorHandler=<n>", which works the
same as "-XX:ErrorHandlerTest=<n>". This parameter is used to trigger
controlled crashes inside the error handler. That way secondary error
handling can be tested.

(a)-(c) allow us to test the fixes manually, for example:

java -XX:ErrorHandlerTest=15  -XX:TestCrashDuringErrorHandler=14

causes a SIGILL during initialization, and a secondary SIGSEGV inside error
handling. This demonstrates the effect of the bug. Without the fix, the VM
will abort right away without finishing the hs-err file.

--

I am in the process of writing some JTreg Tests, but I would like to put
those into a separate change. This is because there are more fixes to error
reporting in our pipeline and I'd like to bundle the jtreg tests in one
change.

Kind Regards,

Thomas Stuefe

From david.holmes at oracle.com  Wed Nov 26 01:29:28 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Nov 2014 11:29:28 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
Message-ID: <54752CF8.5070408@oracle.com>

Hi Thomas,

A few quick comments as I need to think more about this:

- On Solaris we use the UI thread API thr_* not pthreads
- In debug.cpp for the SIGILL can you define the  all zero case as a 
default so we only need to add platform specific definitions when all 
zeroes doesn't work. I really hate seeing all that CPU selection in 
shared code. :(
- Style nit: please use i++ rather than i ++

Aside: we should eradicate the use of sigprocmask and replace with the 
thread specific version.

Getting back to the "thinking more about this" ... If a synchronous 
signal is blocked at the time it is generated then it should remain 
pending on the thread (POSIX spec) but that doesn't tell us what the 
thread will then do - retry the faulting instruction? Become 
unschedulable? So I can easily imagine that a hang or process 
termination may result. In that sense unblocking those signals whilst 
handling the initial signal may well allow the error reporting process 
to continue further. But I'm unclear exactly how this plays out:

- synchronous signal encountered
- crash_handler invoked
- VMError::report_and_die executes
- secondary signal encountered
- crash_handler invoked again
- VMError::report_and_die executes again and sees the recursion and 
returns (ignoring abort due to excessive recursive errors)

Is that right? So we actually return from the crash_handler? Because 
this puts us in undefined territory according to POSIX:

"The behavior of a process is undefined after it returns normally from a 
signal-catching function for a SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal 
that was not generated by kill(), sigqueue(), or raise()."

On top of that you also have the issue that error reporting does a whole 
bunch of things that are not async-signal-safe so we can easily 
encounter hangs or aborts.

But we're dying anyway so I guess none of this really matters. If 
re-enabling these signals allows error reporting to progress further in 
some cases then that is a win.

Cheers,
David

On 26/11/2014 12:12 AM, Thomas Stüfe wrote:
> Hi all,
>
> I'd like to contribute a fix to error handling to improve stability of
> error reporting.
>
>
> Bug Report:
> https://bugs.openjdk.java.net/browse/JDK-8065895
>
>
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.00/
>
>
> Problem:
>
> When a synchronous error signal happens during error reporting, and the
> signal is different from the original signal which triggered error
> reporting, VM may die or hang (depends on platform). This causes empty or
> almost-empty hs-err files.
>
> Example: we first crash with a SIGILL (e.g in compiled code), then a
> SIGSEGV happens when printing stack trace.
>
> Secondary error handling should catch the SIGSEGV and continue error
> reporting with the next step. But that does not work in this case.
>
> Causes:
>    - hotspot blocks all signals when installing signal handlers. Within the
> secondary signal handler, only the original signal gets unblocked, the rest
> remained blocked. If another synchronous error signal happens, it is still
> blocked. If the second signal is a synchronous signal, the OS would
> terminate the process right away because there is no way to defer
> synchronous error signals.
>    - when installing signal handlers for secondary error handling, only
> signal handlers for SIGBUS and SIGSEGV were added; but more signals may
> happen during error handling (we saw SIGTRAP, SIGILL, ..etc).
>
> Fix:
> secondary signal handler is installed for all synchronous error signals
> (which is now a list and easily expandable in vmError_<os>.cpp). All those
> signals are unblocked.
>
> In order to test the fix, some test code was added too:
>
> a) debug.cpp: changed "test_error_handler()" to a more generic
> "controlled_crash(int how)", which can be called at arbitrary places, not
> only at initialization time. "test_error_handler()" still exists and just
> calls "controlled_crash(ErrorHandlerTest)", so its behaviour did not change.
>
> b) expand controlled_crash():
>    - added option 14, a guaranteed crash with a SIGSEGV at a predefined
> address, which is printed out and can later be tested against. Note that I
> realize that this is a bit redundant to option 12 or 13, but the crash is
> guaranteed and it crashes with a not-null address which should turn up in
> hs-err file (to check that hs-err file is correct).
>    - added option 15, a guaranteed crash with a SIGILL at a predefined
> instruction address. Here, the point is to get a real-world SIGILL (not
> just raising it) at a not-null known pc.
>
> c) Add a parameter "-XX:TestCrashDuringErrorHandler=<n>", which works the
> same as "-XX:ErrorHandlerTest=<n>". This parameter is used to trigger
> controlled crashes inside the error handler. That way secondary error
> handling can be tested.
>
> (a)-(c) allow us to test the fixes manually, for example:
>
> java -XX:ErrorHandlerTest=15  -XX:TestCrashDuringErrorHandler=14
>
> causes a SIGILL during initialization, and a secondary SIGSEGV inside error
> handling. This demonstrates the effect of the bug. Without the fix, the VM
> will abort right away without finishing the hs-err file.
>
> --
>
> I am in the process of writing some JTreg Tests, but I would like to put
> those into a separate change. This is because there are more fixes to error
> reporting in our pipeline and I'd like to bundle the jtreg tests in one
> change.
>
> Kind Regards,
>
> Thomas Stuefe
>

From yumin.qi at oracle.com  Wed Nov 26 01:36:47 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Tue, 25 Nov 2014 17:36:47 -0800
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <53DAC336.6050302@oracle.com>
References: <53DAC336.6050302@oracle.com>
Message-ID: <54752EAF.4020404@oracle.com>

Please review

bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/

The API is now used in an internal test case; see the separate email for
that webrev.

It is the same as the previous version (webrev00).

Thanks
Yumin

On 7/31/14, 3:29 PM, Yumin Qi wrote:
> Please review:
>
> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>
> Summary: Currently there is no java API to get underlying OS native VM 
> page size unless using Unsafe which is not recommended.  The new added 
> method to WhiteBox can read this property and used in test.
>
>
> Tests: JPRT and  jtreg.
>
> Thanks
> Yumin

From david.holmes at oracle.com  Wed Nov 26 01:54:08 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Nov 2014 11:54:08 +1000
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <54752EAF.4020404@oracle.com>
References: <53DAC336.6050302@oracle.com> <54752EAF.4020404@oracle.com>
Message-ID: <547532C0.4080500@oracle.com>

Hi Yumin,

On 26/11/2014 11:36 AM, Yumin Qi wrote:
> Please review
>
> bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
> webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/

The test also needs to ensure the testlibrary gets built.

Otherwise seems okay.

Thanks,
David

> Now the API usage is in internal test case, see separate email for the
> webrev.
>
> It is same as previous version (webrev00).
>
> Thanks
> Yumin
>
> On 7/31/14, 3:29 PM, Yumin Qi wrote:
>> Please review:
>>
>> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>>
>> Summary: Currently there is no java API to get underlying OS native VM
>> page size unless using Unsafe which is not recommended.  The new added
>> method to WhiteBox can read this property and used in test.
>>
>>
>> Tests: JPRT and  jtreg.
>>
>> Thanks
>> Yumin

From yasuenag at gmail.com  Wed Nov 26 03:39:33 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 26 Nov 2014 12:39:33 +0900
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <54744022.2030208@oracle.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
	<547330E5.1050708@gmail.com> <54744022.2030208@oracle.com>
Message-ID: <CAGFVN2AC3ArpYunqPvSXRYiedFAekZVa-j=9XwFvCPwdSz1Gjw@mail.gmail.com>

Hi David,
Thank you for reviewing!

I will fix it after discussion with Staffan.
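
To make the format-string problem David points out below concrete, here is a
minimal sketch with both arguments supplied. It uses the standard snprintf
rather than HotSpot's jio_snprintf, and the helper name core_hint and the
parameter p are assumptions for illustration, not the code in the webrev.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats the core-file hint with an argument for each
// conversion. 'p' stands for the directory the %s was presumably meant to
// print; its exact meaning in the webrev is an assumption.
std::string core_hint(const char* p, int pid) {
  char buffer[256];
  // "%s" consumes p and "%d" consumes pid; omitting p (as in the reviewed
  // code) would make "%s" read an int as a pointer, which is undefined.
  std::snprintf(buffer, sizeof(buffer), "%s/core or core.%d", p, pid);
  return buffer;
}
```

For example, core_hint("/tmp", 42) yields "/tmp/core or core.42".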

Thanks

Yasumasa
2014/11/25 17:39 "David Holmes" <david.holmes at oracle.com>:

> Sorry Yasumasa, this fell off my radar and I was hoping for others to
> comment. We still need a second reviewer.
>
> The change in:
>  src/os/aix/vm/os_aix.cpp
>  src/os/solaris/vm/os_solaris.cpp
>
>   jio_snprintf(buffer, bufferSize, "%s/core or core.%d",
> current_process_id());
>
> has no argument for the %s - presumably p was intended.
>
> Thanks,
> David
>
> On 24/11/2014 11:21 PM, Yasumasa Suenaga wrote:
>
>> Hi all,
>>
>> I've uploaded webrev for this issue about a month ago.
>> Could you review it and sponsor it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
>>
>>> Hi David,
>>>
>>> I've uploaded new webrev:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>>>
>>>
>>>  I wasn't suggesting that you make such a change though because it is
>>>> large and disruptive.
>>>>
>>>
>>>  Unfactoring check_or_create_dump is a step backwards in terms of code
>>>> sharing.
>>>>
>>>
>>> I restored check_or_create_dump() to os_posix.cpp .
>>> And I changed get_core_path() to create message which represents core
>>> dump path
>>> (including filename) in each OS.
>>>
>>>
>>>  Expanding the get_core_path in os_linux.cpp to handle the
>>>> core_pattern may be okay (but I don't know enough about it to
>>>> validate everything).
>>>>
>>>
>>> I implemented all parameters in Linux kernel documentation:
>>> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>>>
>>> So I think that parameters which are processed are enough.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>
>>> (2014/10/15 9:41), David Holmes wrote:
>>>
>>>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> Thank you for comments!
>>>>> I've uploaded new webrev. Could you review it again?
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>>>>>
>>>>> I am an author of jdk9. So I cannot commit it.
>>>>> Could you be a sponsor for this enhancement?
>>>>>
>>>>>
>>>>>  In which case that should be handled by the linux specific
>>>>>> get_core_path() function.
>>>>>>
>>>>>
>>>>> Agree.
>>>>> So I implemented it in os_linux.cpp .
>>>>> But part of format characters (%P: global pid, %s: signal, %t dump
>>>>> time)
>>>>> are not processed
>>>>> in this function because I think these parameters are difficult to
>>>>> handle in it.
>>>>>
>>>>>    %P: I could not find API for this.
>>>>>    %s: We have to change arguments of get_core_path() .
>>>>>    %t: This parameter means timestamp of coredump. It is decided in
>>>>> Kernel.
>>>>>
>>>>>
>>>>>  Fixing this means changing all the os_posix using platforms. But your
>>>>>> patch is not about this part. :)
>>>>>>
>>>>>
>>>>> I moved os::check_or_create_dump() to each OS implementations (AIX,
>>>>> BSD,
>>>>> Solaris, Linux) .
>>>>> So I can write Linux specific code to check_or_create_dump() .
>>>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>>>>>
>>>>
>>>> I wasn't suggesting that you make such a change though because it is
>>>> large and disruptive. The simple handling of the | part of
>>>> core_pattern was basically ok. Expanding the get_core_path in
>>>> os_linux.cpp to handle the core_pattern may be okay (but I don't know
>>>> enough about it to validate everything). Unfactoring
>>>> check_or_create_dump is a step backwards in terms of code sharing.
>>>>
>>>> Sorry this has grown too large for me to deal with right now.
>>>>
>>>> David
>>>> -----
>>>>
>>>>
>>>>>  Though I'm unclear whether it both invokes the program and creates a
>>>>>> core dump file; or just invokes the program?
>>>>>>
>>>>>
>>>>> If '|' is set, Linux kernel will just redirect core image to user
>>>>> process.
>>>>> Kernel documentation says as below:
>>>>> ------------
>>>>> . If the first character of the pattern is a '|', the kernel will treat
>>>>>    the rest of the pattern as a command to run.  The core dump will be
>>>>>    written to the standard input of that program instead of to a file.
>>>>> ------------
>>>>>
>>>>> And implementation of coredump (do_coredump()) follows to it.
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/
>>>>> linux.git/tree/fs/coredump.c
>>>>>
>>>>>
>>>>>
>>>>> In case of ABRT, ABRT dumps core image to default location
>>>>> (<CWD>/core.<PID>)
>>>>> if user set unlimited to resource limit of core (ulimit -c) .
>>>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>>>>>
>>>>>
>>>>>  A few style nits - you need spaces around keywords and before braces
>>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>>> than "treated".
>>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>>> collapsing this:
>>>>>>
>>>>>
>>>>> I've fixed them.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> (2014/10/13 9:41), David Holmes wrote:
>>>>>
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Sorry for my English.
>>>>>>>
>>>>>>> I want to propose that JVM should create message according to core
>>>>>>> pattern (/proc/sys/kernel/core_pattern) .
>>>>>>> So I filed it to JBS and created a patch.
>>>>>>>
>>>>>>
>>>>>> So I've had a quick look at this core_pattern business and it seems to
>>>>>> me that there are two aspects to this.
>>>>>>
>>>>>> First, without the leading |, the entry in the core_pattern file is a
>>>>>> naming pattern for the core file. In which case that should be handled
>>>>>> by the linux specific get_core_path() function. Though that in itself
>>>>>> can't fully report the expected name, as part of it is provided in the
>>>>>> shared code in os::check_or_create_dump. Fixing this means changing
>>>>>> all the os_posix using platforms. But your patch is not about this
>>>>>> part. :)
>>>>>>
>>>>>> Second, with a leading | the core_pattern is actually the name of a
>>>>>> program to execute when the program is about to core dump, and that is
>>>>>> what you report with your patch. Though I'm unclear whether it both
>>>>>> invokes the program and creates a core dump file; or just invokes the
>>>>>> program?
>>>>>>
>>>>>> So with regards to this second part your patch seems functionally ok.
>>>>>> I do dislike having a big chunk of linux specific code in this "posix"
>>>>>> support file but ...
>>>>>>
>>>>>> A few style nits - you need spaces around keywords and before
>>>>>> braces eg:
>>>>>>
>>>>>>    if(x){
>>>>>>
>>>>>> should be
>>>>>>
>>>>>>    if (x) {
>>>>>>
>>>>>> I also suggest saying "Core dumps may be processed with ..." rather
>>>>>> than "treated".
>>>>>>
>>>>>> And as you don't do anything in the non-redirect case I suggest
>>>>>> collapsing this:
>>>>>>
>>>>>>    83           is_redirect = core_pattern[0] == '|';
>>>>>>    84         }
>>>>>>    85
>>>>>>    86         if(is_redirect){
>>>>>>    87           jio_snprintf(buffer, bufferSize,
>>>>>>    88                    "Core dumps may be treated with \"%s\"",
>>>>>> &core_pattern[1]);
>>>>>>    89         }
>>>>>>
>>>>>> to just
>>>>>>
>>>>>>    83           if (core_pattern[0] == '|') {  // redirect
>>>>>>    84             jio_snprintf(buffer, bufferSize, "Core dumps may be
>>>>>> processed with \"%s\"", &core_pattern[1]);
>>>>>>    85            }
>>>>>>    86         }
>>>>>>
>>>>>> Comments from other runtime folk appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>  Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
>>>>>>> <mailto:david.holmes at oracle.com>>:
>>>>>>>
>>>>>>>     Hi Yasumasa,
>>>>>>>
>>>>>>>     I'm sorry but I don't understand what you are proposing. When you
>>>>>>> say
>>>>>>>     "treat" do you mean "create"? Otherwise what do you mean by
>>>>>>> "treated"?
>>>>>>>
>>>>>>>     Thanks,
>>>>>>>     David
>>>>>>>
>>>>>>>     On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>>>>>>>      > I'm in Hackergarten @ JavaOne :-)
>>>>>>>      >
>>>>>>>      >
>>>>>>>      > Hi all,
>>>>>>>      >
>>>>>>>      > I would like to enhance the messages in hs_err report.
>>>>>>>      > Modern Linux kernel can treat core dump with user process
>>>>>>> (e.g. ABRT)
>>>>>>>      > However, hs_err report cannot detect it.
>>>>>>>      >
>>>>>>>      > I think that hs_err report should output messages as below:
>>>>>>>      > -------------
>>>>>>>      >     Failed to write core dump. Core dumps may be treated with
>>>>>>>     "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s
>>>>>>> %c %p
>>>>>>>     %u %g %t e"
>>>>>>>      > -------------
>>>>>>>      >
>>>>>>>      > I've uploaded webrev of this enhancement.
>>>>>>>      > Could you review it?
>>>>>>>      >
>>>>>>>      > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>>>>>>>      >
>>>>>>>      > This patch works fine on Fedora20 x86_64.
>>>>>>>      >
>>>>>>>      >
>>>>>>>      >
>>>>>>>      > Thanks,
>>>>>>>      >
>>>>>>>      > Yasumasa
>>>>>>>      >
>>>>>>>
>>>>>>>

From yasuenag at gmail.com  Wed Nov 26 03:54:48 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 26 Nov 2014 12:54:48 +0900
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <FE1302A7-A228-43E9-BCB6-74558268E296@oracle.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
	<547330E5.1050708@gmail.com>
	<FE1302A7-A228-43E9-BCB6-74558268E296@oracle.com>
Message-ID: <CAGFVN2CPxXC9bhWvdq-wfmbyBGTZCS2gz5dq9jWf6dMCXGMYvg@mail.gmail.com>

Hi Staffan,

Thank you for reviewing!

os_linux.cpp:
I want hs_err to report the core dump location correctly, so I want to print
whether the core dump is processed by another process or written to a file.
If os::get_core_path() should be simpler, I will print the raw core_pattern
string instead.
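
A rough sketch of that simpler variant could look like the following. This is
only an illustration under my own assumptions: the function names
describe_core_pattern and read_core_pattern and the message wording for the
non-redirect case are hypothetical, and no % specifier is expanded.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: builds the hs_err hint from the raw core_pattern
// string without expanding any % specifiers.
std::string describe_core_pattern(const std::string& pattern) {
  if (pattern.empty()) {
    return "Core dump location is unknown (core_pattern is empty)";
  }
  if (pattern[0] == '|') {
    // A leading '|' means the kernel pipes the core image to this command.
    return "Core dumps may be processed with \"" + pattern.substr(1) + "\"";
  }
  // Otherwise the pattern names the core file; print it unexpanded.
  return "Core dumps are written using the pattern \"" + pattern + "\"";
}

// Reads the first line of /proc/sys/kernel/core_pattern (Linux only).
// Returns an empty string if the file cannot be read.
std::string read_core_pattern() {
  std::ifstream in("/proc/sys/kernel/core_pattern");
  std::string line;
  std::getline(in, line);
  return line;
}
```

Calling describe_core_pattern(read_core_pattern()) would then give the text
to print, leaving the substitution of %p, %s and so on to the reader of the
hs_err file.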

os_bsd.cpp:
I don't have OS X, so I cannot check it.
I am focusing on Linux in this enhancement. Could you file it as a separate
enhancement if it is needed?

Thanks,

Yasumasa

 2014/11/25 18:15 "Staffan Larsen" <staffan.larsen at oracle.com>:

> src/os/linux/vm/os_linux.cpp:
> I'm inclined to think this is too complicated and hard to test and
> maintain (and I see no tests in the webrev). Could we not simplify this to
> print a helpful message instead? Something that prints the core_pattern and
> perhaps some of the values that could be used for substitution, but does
> not do the actual substitution? I think that would go a long way but be a
> lot more maintainable.
>
> src/os/bsd/vm/os_bsd.cpp:
> On OS X cores are by default written to /cores/core.<pid>. This is
> configureable with the kern.corefile sysctl variable, although it is rare
> to do so.
>
>  /Staffan
>
> > On 24 nov 2014, at 14:21, Yasumasa Suenaga <yasuenag at gmail.com> wrote:
> >
> > Hi all,
> >
> > I've uploaded webrev for this issue about a month ago.
> > Could you review it and sponsor it?
> >
> >
> > Thanks,
> >
> > Yasumasa
> >
> >
> > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
> >> Hi David,
> >>
> >> I've uploaded new webrev:
> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
> >>
> >>
> >>> I wasn't suggesting that you make such a change though because it is
> large and disruptive.
> >>
> >>> Unfactoring check_or_create_dump is a step backwards in terms of code
> sharing.
> >>
> >> I restored check_or_create_dump() to os_posix.cpp .
> >> And I changed get_core_path() to create message which represents core
> dump path
> >> (including filename) in each OS.
> >>
> >>
> >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern
> may be okay (but I don't know enough about it to validate everything).
> >>
> >> I implemented all parameters in Linux kernel documentation:
> >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
> >>
> >> So I think that parameters which are processed are enough.
> >>
> >>
> >> Thanks,
> >>
> >> Yasumasa
> >>
> >>
> >>
> >> (2014/10/15 9:41), David Holmes wrote:
> >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
> >>>> Hi David,
> >>>>
> >>>> Thank you for comments!
> >>>> I've uploaded new webrev. Could you review it again?
> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
> >>>>
> >>>> I am an author of jdk9. So I cannot commit it.
> >>>> Could you be a sponsor for this enhancement?
> >>>>
> >>>>
> >>>>> In which case that should be handled by the linux specific
> >>>>> get_core_path() function.
> >>>>
> >>>> Agree.
> >>>> So I implemented it in os_linux.cpp .
> >>>> But part of format characters (%P: global pid, %s: signal, %t dump
> time)
> >>>> are not processed
> >>>> in this function because I think these parameters are difficult to
> >>>> handle in it.
> >>>>
> >>>>   %P: I could not find API for this.
> >>>>   %s: We have to change arguments of get_core_path() .
> >>>>   %t: This parameter means timestamp of coredump. It is decided in
> Kernel.
> >>>>
> >>>>
> >>>>> Fixing this means changing all the os_posix using platforms. But your
> >>>>> patch is not about this part. :)
> >>>>
> >>>> I moved os::check_or_create_dump() to each OS implementations (AIX,
> BSD,
> >>>> Solaris, Linux) .
> >>>> So I can write Linux specific code to check_or_create_dump() .
> >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
> >>>
> >>> I wasn't suggesting that you make such a change though because it is
> large and disruptive. The simple handling of the | part of core_pattern was
> basically ok. Expanding the get_core_path in os_linux.cpp to handle the
> core_pattern may be okay (but I don't know enough about it to validate
> everything). Unfactoring check_or_create_dump is a step backwards in terms
> of code sharing.
> >>>
> >>> Sorry this has grown too large for me to deal with right now.
> >>>
> >>> David
> >>> -----
> >>>
> >>>>
> >>>>> Though I'm unclear whether it both invokes the program and creates a
> >>>>> core dump file; or just invokes the program?
> >>>>
> >>>> If '|' is set, Linux kernel will just redirect core image to user
> process.
> >>>> Kernel documentation says as below:
> >>>> ------------
> >>>> . If the first character of the pattern is a '|', the kernel will
> treat
> >>>>   the rest of the pattern as a command to run.  The core dump will be
> >>>>   written to the standard input of that program instead of to a file.
> >>>> ------------
> >>>>
> >>>> And implementation of coredump (do_coredump()) follows to it.
> >>>>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
> >>>>
> >>>>
> >>>> In case of ABRT, ABRT dumps core image to default location
> >>>> (<CWD>/core.<PID>)
> >>>> if user set unlimited to resource limit of core (ulimit -c) .
> >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
> >>>>
> >>>>
> >>>>> A few style nits - you need spaces around keywords and before braces
> >>>>> I also suggest saying "Core dumps may be processed with ..." rather
> >>>>> than "treated".
> >>>>> And as you don't do anything in the non-redirect case I suggest
> >>>>> collapsing this:
> >>>>
> >>>> I've fixed them.
> >>>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Yasumasa
> >>>>
> >>>>
> >>>> (2014/10/13 9:41), David Holmes wrote:
> >>>>> Hi Yasumasa,
> >>>>>
> >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
> >>>>>> Hi David,
> >>>>>>
> >>>>>> Sorry for my English.
> >>>>>>
> >>>>>> I want to propose that JVM should create message according to core
> >>>>>> pattern (/proc/sys/kernel/core_pattern) .
> >>>>>> So I filed it to JBS and created a patch.
> >>>>>
> >>>>> So I've had a quick look at this core_pattern business and it seems
> to
> >>>>> me that there are two aspects to this.
> >>>>>
> >>>>> First, without the leading |, the entry in the core_pattern file is a
> >>>>> naming pattern for the core file. In which case that should be
> handled
> >>>>> by the linux specific get_core_path() function. Though that in itself
> >>>>> can't fully report the expected name, as part of it is provided in
> the
> >>>>> shared code in os::check_or_create_dump. Fixing this means changing
> >>>>> all the os_posix using platforms. But your patch is not about this
> >>>>> part. :)
> >>>>>
> >>>>> Second, with a leading | the core_pattern is actually the name of a
> >>>>> program to execute when the program is about to core dump, and that
> is
> >>>>> what you report with your patch. Though I'm unclear whether it both
> >>>>> invokes the program and creates a core dump file; or just invokes the
> >>>>> program?
> >>>>>
> >>>>> So with regards to this second part your patch seems functionally ok.
> >>>>> I do dislike having a big chunk of linux specific code in this
> "posix"
> >>>>> support file but ...
> >>>>>
> >>>>> A few style nits - you need spaces around keywords and before braces
> eg:
> >>>>>
> >>>>>   if(x){
> >>>>>
> >>>>> should be
> >>>>>
> >>>>>   if (x) {
> >>>>>
> >>>>> I also suggest saying "Core dumps may be processed with ..." rather
> >>>>> than "treated".
> >>>>>
> >>>>> And as you don't do anything in the non-redirect case I suggest
> >>>>> collapsing this:
> >>>>>
> >>>>>   83           is_redirect = core_pattern[0] == '|';
> >>>>>   84         }
> >>>>>   85
> >>>>>   86         if(is_redirect){
> >>>>>   87           jio_snprintf(buffer, bufferSize,
> >>>>>   88                    "Core dumps may be treated with \"%s\"",
> >>>>> &core_pattern[1]);
> >>>>>   89         }
> >>>>>
> >>>>> to just
> >>>>>
> >>>>>   83           if (core_pattern[0] == '|') {  // redirect
> >>>>>   84             jio_snprintf(buffer, bufferSize, "Core dumps may be
> >>>>> processed with \"%s\"", &core_pattern[1]);
> >>>>>   85            }
> >>>>>   86         }
> >>>>>
> >>>>> Comments from other runtime folk appreciated.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Yasumasa
> >>>>>>
> >>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
> >>>>>> <mailto:david.holmes at oracle.com>>:
> >>>>>>
> >>>>>>    Hi Yasumasa,
> >>>>>>
> >>>>>>    I'm sorry but I don't understand what you are proposing. When you
> >>>>>> say
> >>>>>>    "treat" do you mean "create"? Otherwise what do you mean by
> >>>>>> "treated"?
> >>>>>>
> >>>>>>    Thanks,
> >>>>>>    David
> >>>>>>
> >>>>>>    On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
> >>>>>>     > I'm in Hackergarten @ JavaOne :-)
> >>>>>>     >
> >>>>>>     >
> >>>>>>     > Hi all,
> >>>>>>     >
> >>>>>>     > I would like to enhance the messages in hs_err report.
> >>>>>>     > Modern Linux kernel can treat core dump with user process
> >>>>>> (e.g. ABRT)
> >>>>>>     > However, hs_err report cannot detect it.
> >>>>>>     >
> >>>>>>     > I think that hs_err report should output messages as below:
> >>>>>>     > -------------
> >>>>>>     >     Failed to write core dump. Core dumps may be treated with
> >>>>>>    "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s
> %c %p
> >>>>>>    %u %g %t e"
> >>>>>>     > -------------
> >>>>>>     >
> >>>>>>     > I've uploaded webrev of this enhancement.
> >>>>>>     > Could you review it?
> >>>>>>     >
> >>>>>>     > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
> >>>>>>     >
> >>>>>>     > This patch works fine on Fedora20 x86_64.
> >>>>>>     >
> >>>>>>     >
> >>>>>>     >
> >>>>>>     > Thanks,
> >>>>>>     >
> >>>>>>     > Yasumasa
> >>>>>>     >
> >>>>>>
>
>

From thomas.stuefe at gmail.com  Wed Nov 26 07:06:52 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 26 Nov 2014 08:06:52 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <54752CF8.5070408@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
Message-ID: <CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>

Hi David,

thanks for looking at this. Here is the updated webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/

See my comments below.

On Wed, Nov 26, 2014 at 2:29 AM, David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> A few quick comments as I need to think more about this:
>
> - On Solaris we use the UI thread API thr_* not pthreads
>

Fixed, now I use thr_sigsetmask() (though both sigprocmask and
pthread_sigmask seemed to work too)


> - In debug.cpp for the SIGILL can you define the  all zero case as a
> default so we only need to add platform specific definitions when all
> zeroes doesn't work. I really hate seeing all that CPU selection in shared
> code. :(
>

Agreed and fixed, moved the CPU-specific sections into CPU-specific files.


> - Style nit: please use i++ rather than i ++
>
>
Fixed.

> Aside: we should eradicate the use of sigprocmask and replace with the
> thread specific version.
>
>
Agree. Though I never saw any errors stemming from the use of
sigprocmask(). According to POSIX, sigprocmask() is undefined in a
multithreaded environment, and I guess most OSes just default to
pthread_sigmask.


> Getting back to the "thinking more about this" ... If a synchronous signal
> is blocked at the time it is generated then it should remain pending on the
> thread (POSIX spec) but that doesn't tell us what the thread will then do -
> retry the faulting instruction? Become unschedulable? So I can easily
> imagine that a hang or process termination may result.


This is exactly what happens, but it is actually covered by POSIX, see doc
on pthread_sigmask: "If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS
signals are generated while they are blocked, the result is undefined,
unless the signal was generated by the *kill*()
<http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
function, the *sigqueue*()
<http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
function, or the *raise*()
<http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
function."

In reality, the process usually terminates abnormally with the default action
for the signal, e.g. printing out "Illegal Instruction". On MacOS, we hang
(until the WatcherThread finally kills the VM). On old AIX versions, we die
without a trace.

This also can be easily tried out by removing SIGILL from the list of
signals in vmError_<os>.cpp and executing:

java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15

which will crash first with a SIGSEGV, then in error handling with a
secondary SIGILL. This will interrupt error reporting and kill or hang the
process.
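
The unblocking step itself can be sketched as follows. This is a minimal
sketch and not the webrev code: the signal list and the function names
unblock_sync_error_signals and is_blocked are assumptions for illustration.
It uses pthread_sigmask(); on Solaris the corresponding call would be
thr_sigsetmask().

```cpp
#include <pthread.h>
#include <signal.h>

// Signals treated as synchronous error signals in this sketch; the list in
// the actual vmError_<os>.cpp may differ per platform.
static const int kSyncErrorSignals[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP };

// Unblocks the signals above for the calling thread only.
// Returns 0 on success, otherwise the error number from pthread_sigmask().
int unblock_sync_error_signals() {
  sigset_t set;
  sigemptyset(&set);
  for (int sig : kSyncErrorSignals) {
    sigaddset(&set, sig);
  }
  return pthread_sigmask(SIG_UNBLOCK, &set, nullptr);
}

// Returns true if sig is currently blocked for the calling thread.
bool is_blocked(int sig) {
  sigset_t current;
  sigemptyset(&current);
  // With a null new set this only queries the mask; 'how' is ignored.
  pthread_sigmask(SIG_SETMASK, nullptr, &current);
  return sigismember(&current, sig) == 1;
}
```

Calling unblock_sync_error_signals() at the start of the secondary handler
means a further SIGILL or SIGSEGV on the same thread is delivered to the
handler instead of hitting a blocked signal.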


> In that sense unblocking those signals whilst handling the initial signal
> may well allow the error reporting process to continue further. But I'm
> unclear exactly how this plays out:
>
> - synchronous signal encountered
> - crash_handler invoked
> - VMError::report_and_die executes
> - secondary signal encountered
> - crash_handler invoked again
>

Almost: not "again", it is a different signal handler now. The first signal
was handled by "JVM_handle_<os>_signal()".


> - VMError::report_and_die executes again and sees the recursion and
> returns (ignoring abort due to excessive recursive errors)
>
>
No.

> Is that right? So we actually return from the crash_handler?


Oh, but we don't return. VMError::report_and_die() will just create a new
frame and re-execute VMError::report() of the first VMError object. Which
then will continue with the next STEP. We never return, for each secondary
error signal a new frame is created.

This all happens in VMError::report_and_die:
-> first error ? anchor VMError object in a static variable and execute
VMError::report()
-> secondary error?
   -> different thread? just sleep forever
   -> same thread? new frame, re-enter VMError::report(). Once done, abort.
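
As a toy model of the same-thread path (no real signals involved, and not
the HotSpot code): the struct ToyVMError and its members are hypothetical
names. A "crashing" step simulates the secondary signal by calling
report_and_die() again, the way the secondary handler would; the
different-thread case is not modelled.

```cpp
#include <string>
#include <vector>

// Toy model of the re-entrant reporting flow described above.
struct ToyVMError {
  std::vector<std::string> log;   // what ends up in the "hs-err file"
  int next_step = 0;              // progress survives re-entry
  int crashing_step = -1;         // step that faults, -1 for none
  bool first_error_seen = false;
  bool aborted = false;

  void report() {
    const int total_steps = 4;
    while (next_step < total_steps) {
      int step = next_step++;     // advance first, so a crash skips this step
      if (step == crashing_step) {
        log.push_back("step " + std::to_string(step) + ": crashed");
        report_and_die();         // stand-in for the secondary signal handler
        return;                   // not reached in the real VM, which aborts
      }
      log.push_back("step " + std::to_string(step) + ": ok");
    }
  }

  void report_and_die() {
    if (!first_error_seen) {
      first_error_seen = true;    // first error: anchor the error state
    }
    // First or secondary error on the same thread: (re-)enter report() in a
    // new frame; it continues with the next step.
    report();
    aborted = true;               // stand-in for abort()
  }
};
```

With crashing_step = 1, the log contains "step 0: ok", "step 1: crashed",
"step 2: ok" and "step 3: ok": the faulting step is lost, but the steps
after it are still reported.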

I always found that rather neat, but in fact that is not our invention but
Sun's :) Anyway, my fix does not change this behaviour for better or worse,
it only makes it usable for more cases.


> Because this puts us in undefined territory according to POSIX:
>
> "The behavior of a process is undefined after it returns normally from a
> signal-catching function for a SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal
> that was not generated by kill(), sigqueue(), or raise()."
>
True, but we don't return.


> On top of that you also have the issue that error reporting does a whole
> bunch of things that are not async-signal-safe so we can easily encounter
> hangs or aborts.
>
> But we're dying anyway so I guess none of this really matters. If
> re-enabling these signals allows error reporting to progress further in
> some cases then that is a win.
>
>
Actually, this covers a lot of cases, mostly because SIGSEGV during error
reporting is common, so if the original error was not SIGSEGV, but e.g.
SIGILL, this would always result in broken hs-err files.

The back story is that at SAP, we rely heavily on the hs-err files. They
are our main tool for support, because working with cores is often not
feasible. So, we put a lot of work in making error reporting reliable
across all platforms. This is also covered by many tests which crash the VM
in exciting ways and check the hs-err files for completeness.

Kind Regards, Thomas


> Cheers,
> David
>
>
> On 26/11/2014 12:12 AM, Thomas Stüfe wrote:
>
>> Hi all,
>>
>> I'd like to contribute a fix to error handling to improve stability of
>> error reporting.
>>
>>
>> Bug Report:
>> https://bugs.openjdk.java.net/browse/JDK-8065895
>>
>>
>> Webrev:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.00/
>>
>>
>> Problem:
>>
>> When a synchronous error signal happens during error reporting, and the
>> signal is different from the original signal which triggered error
>> reporting, VM may die or hang (depends on platform). This causes empty or
>> almost-empty hs-err files.
>>
>> Example: we first crash with a SIGILL (e.g in compiled code), then a
>> SIGSEGV happens when printing stack trace.
>>
>> Secondary error handling should catch the SIGSEGV and continue error
>> reporting with the next step. But that does not work in this case.
>>
>> Causes:
>>    - hotspot blocks all signals when installing signal handlers. Within
>> the
>> secondary signal handler, only the original signal gets unblocked, the
>> rest
>> remained blocked. If another synchronous error signal happens, it is still
>> blocked. If the second signal is a synchronous signal, the OS would
>> terminate the process right away because there is no way to defer
>> synchronous error signals.
>>    - when installing signal handlers for secondary error handling, only
>> signal handlers for SIGBUS and SIGSEGV were added; but more signals may
>> happen during error handling (we saw SIGTRAP, SIGILL, ..etc).
>>
>> Fix:
>> secondary signal handler is installed for all synchronous error signals
>> (which is now a list and easily expandable in vmError_<os>.cpp). All those
>> signals are unblocked.
>>
>> In order to test the fix, some test code was added too:
>>
>> a) debug.cpp: changed "test_error_handler()" to a more generic
>> "controlled_crash(int how)", which can be called at arbitrary places, not
>> only at initialization time. "test_error_handler()" still exists and just
>> calls "controlled_crash(ErrorHandlerTest)", so its behaviour did not
>> change.
>>
>> b) expand controlled_crash():
>>    - added option 14, a guaranteed crash with a SIGSEGV at a predefined
>> address, which is printed out and can later be tested against. Note that I
>> realize that this is a bit redundant to option 12 or 13, but the crash is
>> guaranteed and it crashes with a not-null address which should turn up in
>> hs-err file (to check that hs-err file is correct).
>>    - added option 15, a guaranteed crash with a SIGILL at a predefined
>> instruction address. Here, the point is to get a real-world SIGILL (not
>> just raising it) at a not-null known pc.
>>
>> c) Add a parameter "-XX:TestCrashDuringErrorHandler=<n>", which works the
>> same as "-XX:ErrorHandlerTest=<n>". This parameter is used to trigger
>> controlled crashes inside the error handler. That way secondary error
>> handling can be tested.
>>
>> (a)-(c) allow us to test the fixes manually, for example:
>>
>> java -XX:ErrorHandlerTest=15  -XX:TestCrashDuringErrorHandler=14
>>
>> causes a SIGILL during initialization, and a secondary SIGSEGV inside
>> error
>> handling. This demonstrates the effect of the bug. Without the fix, the VM
>> will abort right away without finishing the hs-err file.
>>
>> --
>>
>> I am in the process of writing some JTreg Tests, but I would like to put
>> those into a separate change. This is because there are more fixes to
>> error
>> reporting in our pipeline and I'd like to bundle the jtreg tests in one
>> change.
>>
>> Kind Regards,
>>
>> Thomas Stuefe
>>
>>

From david.holmes at oracle.com  Wed Nov 26 09:31:42 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Nov 2014 19:31:42 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
Message-ID: <54759DFE.7020300@oracle.com>

Hi Thomas,

On 26/11/2014 5:06 PM, Thomas Stüfe wrote:
> Hi David,
>
> thanks for looking at this. Here is the updated webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.01/
>
> See my comments below.
>
> On Wed, Nov 26, 2014 at 2:29 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     Hi Thomas,
>
>     A few quick comments as I need to think more about this:
>
>     - On Solaris we use the UI thread API thr_* not pthreads
>
>
> Fixed, now I use thr_sigsetmask() (though both sigprocmask and
> pthread_sigmask seemed to work too)

Thanks. They are interchangeable semantically but for consistency I 
prefer not to mix them.
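
For readers of the archive, a minimal sketch of unblocking the synchronous error signals for the calling thread with pthread_sigmask(). The function names and the signal list are assumptions made for illustration; the list used by the fix lives in vmError_<os>.cpp and may differ per platform, and on Solaris thr_sigsetmask() would take the place of pthread_sigmask().

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Hypothetical list of synchronous error signals; the real list is
// platform-specific and may contain different entries.
static const int kSyncErrorSignals[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP };

// Unblocks the listed signals for the calling thread only.
// Returns 0 on success, otherwise the error number from pthread_sigmask().
int unblock_sync_error_signals() {
  sigset_t set;
  sigemptyset(&set);
  for (int sig : kSyncErrorSignals) {
    sigaddset(&set, sig);
  }
  return pthread_sigmask(SIG_UNBLOCK, &set, nullptr);
}

// Returns true if none of the listed signals is blocked in the calling thread.
bool sync_error_signals_unblocked() {
  sigset_t current;
  sigemptyset(&current);
  // With a null "set" argument the mask is only queried, not changed.
  int rc = pthread_sigmask(SIG_BLOCK, nullptr, &current);
  assert(rc == 0);
  (void)rc;  // keeps builds with NDEBUG free of unused-variable warnings
  for (int sig : kSyncErrorSignals) {
    if (sigismember(&current, sig) == 1) {
      return false;
    }
  }
  return true;
}
```

Only the calling thread's mask changes, which is the point of preferring the thread-specific call over sigprocmask().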

>     - In debug.cpp for the SIGILL can you define the  all zero case as a
>     default so we only need to add platform specific definitions when
>     all zeroes doesn't work. I really hate seeing all that CPU selection
>     in shared code. :(
>
>
> Agreed and fixed, moved the CPU-specific sections into CPU-specific files.

I'd really like to see a way to share the all-zeroes case so that we 
don't need to add platform specific code unnecessarily.

>     - Style nit: please use i++ rather than i ++
>
>
> Fixed.
>
>     Aside: we should eradicate the use of sigprocmask and replace with
>     the thread specific version.
>
>
> Agree. Though I never saw any errors stemming from the use of
> sigprocmask(). According to POSIX, sigprocmask() is undefined in
> multithreaded environment, and I guess most OSes just default to
> pthread_sigmask.

Yes "probably" works okay but I hate to see us using something with 
undefined semantics. That's future clean up though.

>     Getting back to the "thinking more about this" ... If a synchronous
>     signal is blocked at the time it is generated then it should remain
>     pending on the thread (POSIX spec) but that doesn't tell us what the
>     thread will then do - retry the faulting instruction? Become
>     unschedulable? So I can easily imagine that a hang or process
>     termination may result.
>
>
> This is exactly what happens, but it is actually covered by POSIX, see
> doc on pthread_sigmask: "If any of the SIGFPE, SIGILL, SIGSEGV, or
> SIGBUS signals are generated while they are blocked, the result is
> undefined, unless the signal was generated by the /kill/()
> <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
> function, the /sigqueue/()
> <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
> function, or the /raise/()
> <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
> function."

Thanks - I managed to miss that part even though I found the other part 
about the signal handling function returning. :(

> In reality, process usually aborts abnormally with the default action
> for the signal, e.g. printing out "Illegal Instruction". On MacOS, we
> hang (until the Watcherthread finally kills the VM). On old AIXes, we
> die without a trace.
>
> This also can be easily tried out by removing SIGILL from the list of
> signals in vmError_<os>.cpp and executing:
>
> java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15
>
> which will crash first with a SIGSEGV, then in error handling with a
> secondary SIGILL. This will interrupt error reporting and kill or hang
> the process.
>
>
>     In that sense unblocking those signals whilst handling the initial
>     signal may well allow the error reporting process to continue
>     further. But I'm unclear exactly how this plays out:
>
>     - synchronous signal encountered
>     - crash_handler invoked
>
>     - VMError::report_and_die executes
>     - secondary signal encountered
>
>     - crash_handler invoked again
>
>
> almost: not again, different signal handler now. First signal was
> handled by "JVM_handle_<os>_signal()"

Ah missed that - thanks - not that it makes much difference :)

>     - VMError::report_and_die executes again and sees the recursion and
>     returns (ignoring abort due to excessive recursive errors)
>
>
> No..
>
>     Is that right? So we actually return from the crash_handler?
>
>
> Oh, but we dont return. VMError::report_and_die() will just create a new
> frame and re-execute VMError::report() of the first VMError object.
> Which then will continue with the next STEP. We never return, for each
> secondary error signal a new frame is created.

I had trouble tracing through exactly what might happen on the recursive 
call to report_and_die. I see now that report comes from:

     staticBufferStream sbs(buffer, O_BUFLEN, &log);
     first_error->report(&sbs);
     first_error->_current_step = 0;         // reset current_step
     first_error->_current_step_info = "";   // reset current_step string

so the second time through we will call report and _current_step should 
indicate where to start executing from.
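
To make the control flow concrete, here is a toy model of the resumable step mechanism. It is a sketch under stated assumptions, not HotSpot code: all names are hypothetical, and a "crash" is modelled as a plain nested call rather than a real signal, so unlike the VM the nested calls do unwind at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy model of resumable error reporting. All names are hypothetical;
// a "crash" is modelled as a plain nested call instead of a real signal.
struct ToyErrorReporter {
  static const int kTotalSteps = 5;

  int current_step = 0;             // last step that was started
  int depth = 0;                    // live report_and_die() frames
  int max_depth = 0;                // deepest nesting seen
  std::vector<int> crashing_steps;  // steps that simulate a secondary error
  std::vector<std::string> log;     // stands in for the hs-err file

  bool crashes(int step) const {
    return std::find(crashing_steps.begin(), crashing_steps.end(), step)
           != crashing_steps.end();
  }

  // Runs the remaining steps. current_step is advanced before a step runs,
  // so a step that crashed is not retried on re-entry.
  void report() {
    while (current_step < kTotalSteps) {
      int step = ++current_step;
      if (crashes(step)) {
        log.push_back("step " + std::to_string(step) + ": [error occurred]");
        report_and_die();  // models the secondary signal handler
        return;            // the real VM never gets back here
      }
      log.push_back("step " + std::to_string(step) + ": ok");
    }
  }

  // Entry point for the first error and for every secondary error.
  void report_and_die() {
    ++depth;
    max_depth = std::max(max_depth, depth);
    // Nesting is bounded: one frame for the first error, one per crashed step.
    assert(depth <= kTotalSteps + 1);
    report();
    --depth;  // the real VM would abort here instead of unwinding
  }
};
```

With crashing_steps set to {2, 4}, the log still contains an entry for all five steps, and the nesting depth peaks at 3: one frame for the first error plus one for each secondary error.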

> This all happens in VMError::report_and_die:
> -> first error ? anchor VMError object in a static variable and execute
> VMError::report()
> -> secondary error?
>     -> different thread? just sleep forever
>     -> same thread? new frame, re-enter VMError::report(). Once done, abort.
>
> I always found that rather neat, but in fact that is not our invention
> but Sun's :) Anyway, my fix does not change this behaviour for better or
> worse, it only makes it usable for more cases.
>
>     Because this puts us in undefined territory according to POSIX:
>
>     "The behavior of a process is undefined after it returns normally
>     from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or
>     SIGSEGV signal that was not generated by kill(), sigqueue(), or
>     raise()."
>
> true, but we dont return...
>
>     On top of that you also have the issue that error reporting does a
>     whole bunch of things that are not async-signal-safe so we can
>     easily encounter hangs or aborts.
>
>     But we're dying anyway so I guess none of this really matters. If
>     re-enabling these signals allows error reporting to progress further
>     in some cases then that is a win.
>
>
> Actually, this covers a lot of cases, mostly because SIGSEGV during
> error reporting is common, so if the original error was not SIGSEGV, but
> e.g. SIGILL, this would always result in broken hs-err files.
>
> The back story is that at SAP, we rely heavily on the hs-err files. They
> are our main tool for support, because working with cores is often not
> feasible. So, we put a lot of work in making error reporting reliable
> across all platforms. This is also covered by many tests which crash the
> VM in exciting ways and check the hs-err files for completeness.

OK. Modulo the cpu specific SIGILL part everything else seems fine.

Thanks,
David

> Kind Regards, Thomas
>
>     Cheers,
>     David
>
>
>     On 26/11/2014 12:12 AM, Thomas Stüfe wrote:
>
>         Hi all,
>
>         I'd like to contribute a fix to error handling to improve
>         stability of
>         error reporting.
>
>
>         Bug Report:
>         https://bugs.openjdk.java.net/browse/JDK-8065895
>
>
>         Webrev:
>         http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.00/
>
>
>         Problem:
>
>         When a synchronous error signal happens during error reporting,
>         and the
>         signal is different from the original signal which triggered error
>         reporting, VM may die or hang (depends on platform). This causes
>         empty or
>         almost-empty hs-err files.
>
>         Example: we first crash with a SIGILL (e.g in compiled code), then a
>         SIGSEGV happens when printing stack trace.
>
>         Secondary error handling should catch the SIGSEGV and continue error
>         reporting with the next step. But that does not work in this case.
>
>         Causes:
>             - hotspot blocks all signals when installing signal
>         handlers. Within the
>         secondary signal handler, only the original signal gets
>         unblocked, the rest
>         remained blocked. If another synchronous error signal happens,
>         it is still
>         blocked. If the second signal is a synchronous signal, the OS would
>         terminate the process right away because there is no way to defer
>         synchronous error signals.
>             - when installing signal handlers for secondary error
>         handling, only
>         signal handlers for SIGBUS and SIGSEGV were added; but more
>         signals may
>         happen during error handling (we saw SIGTRAP, SIGILL, ..etc).
>
>         Fix:
>         secondary signal handler is installed for all synchronous error
>         signals
>         (which is now a list and easily expandable in vmError_<os>.cpp).
>         All those
>         signals are unblocked.
>
>         In order to test the fix, some test code was added too:
>
>         a) debug.cpp: changed "test_error_handler()" to a more generic
>         "controlled_crash(int how)", which can be called at arbitrary
>         places, not
>         only at initialization time. "test_error_handler()" still exists
>         and just
>         calls "controlled_crash(ErrorHandlerTest)", so its behaviour
>         did not change.
>
>         b) expand controlled_crash():
>             - added option 14, a guaranteed crash with a SIGSEGV at a
>         predefined
>         address, which is printed out and can later be tested against.
>         Note that I
>         realize that this is a bit redundant to option 12 or 13, but the
>         crash is
>         guaranteed and it crashes with a not-null address which should
>         turn up in
>         hs-err file (to check that hs-err file is correct).
>             - added option 15, a guaranteed crash with a SIGILL at a
>         predefined
>         instruction address. Here, the point is to get a real-world
>         SIGILL (not
>         just raising it) at a not-null known pc.
>
>         c) Add a parameter "-XX:TestCrashDuringErrorHandler=<n>",
>         which works the
>         same as "-XX:ErrorHandlerTest=<n>". This parameter is used to
>         trigger
>         controlled crashes inside the error handler. That way secondary
>         error
>         handling can be tested.
>
>         (a)-(c) allow us to test the fixes manually, for example:
>
>         java -XX:ErrorHandlerTest=15  -XX:TestCrashDuringErrorHandler=14
>
>         causes a SIGILL during initialization, and a secondary SIGSEGV
>         inside error
>         handling. This demonstrates the effect of the bug. Without the
>         fix, the VM
>         will abort right away without finishing the hs-err file.
>
>         --
>
>         I am in the process of writing some JTreg Tests, but I would
>         like to put
>         those into a separate change. This is because there are more
>         fixes to error
>         reporting in our pipeline and I'd like to bundle the jtreg tests
>         in one
>         change.
>
>         Kind Regards,
>
>         Thomas Stuefe
>
>

From thomas.stuefe at gmail.com  Wed Nov 26 11:37:44 2014
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 26 Nov 2014 12:37:44 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <54759DFE.7020300@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
	<54759DFE.7020300@oracle.com>
Message-ID: <CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>

Hi David,
...


>      - In debug.cpp for the SIGILL can you define the  all zero case as a
>>     default so we only need to add platform specific definitions when
>>     all zeroes doesn't work. I really hate seeing all that CPU selection
>>     in shared code. :(
>>
>>
>> Agreed and fixed, moved the CPU-specific sections into CPU-specific files.
>>
>
> I'd really like to see a way to share the all-zeroes case so that we don't
> need to add platform specific code unnecessarily.
>
>
Sooo.. back to the original code then, with the #ifdef, just with the
zero cases all folded into the #else path? Or do you prefer something
else?


>      - Style nit: please use i++ rather than i ++
>>
>>
>> Fixed.
>>
>>     Aside: we should eradicate the use of sigprocmask and replace with
>>     the thread specific version.
>>
>>
>> Agree. Though I never saw any errors stemming from the use of
>> sigprocmask(). According to POSIX, sigprocmask() is undefined in
>> multithreaded environment, and I guess most OSes just default to
>> pthread_sigmask.
>>
>
> Yes "probably" works okay but I hate to see us using something with
> undefined semantics. That's future clean up though.
>
>
We (SAP JVM) already use pthread_sigmask() / thr_sigsetmask() instead of
sigprocmask. Works fine. We can port this to the OpenJDK.


>      Getting back to the "thinking more about this" ... If a synchronous
>>     signal is blocked at the time it is generated then it should remain
>>     pending on the thread (POSIX spec) but that doesn't tell us what the
>>     thread will then do - retry the faulting instruction? Become
>>     unschedulable? So I can easily imagine that a hang or process
>>     termination may result.
>>
>>
>> This is exactly what happens, but it is actually covered by POSIX, see
>> doc on pthread_sigmask: "If any of the SIGFPE, SIGILL, SIGSEGV, or
>> SIGBUS signals are generated while they are blocked, the result is
>> undefined, unless the signal was generated by the /kill/()
>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>> function, the /sigqueue/()
>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>> function, or the /raise/()
>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>> function."
>>
>
> Thanks - I managed to miss that part even though I found the other part
> about the signal handling function returning. :(


It is well hidden, I found it by accident :) To me it looks like they kept
it intentionally vague, to not block platforms where those signals could be
somehow dealt with automatically? Hard to see though how this would work.


>
>
>  In reality, process usually aborts abnormally with the default action
>> for the signal, e.g. printing out "Illegal Instruction". On MacOS, we
>> hang (until the Watcherthread finally kills the VM). On old AIXes, we
>> die without a trace.
>>
>> This also can be easily tried out by removing SIGILL from the list of
>> signals in vmError_<os>.cpp and executing:
>>
>> java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15
>>
>> which will crash first with a SIGSEGV, then in error handling with a
>> secondary SIGILL. This will interrupt error reporting and kill or hang
>> the process.
>>
>>
>>     In that sense unblocking those signals whilst handling the initial
>>     signal may well allow the error reporting process to continue
>>     further. But I'm unclear exactly how this plays out:
>>
>>     - synchronous signal encountered
>>     - crash_handler invoked
>>
>>     - VMError::report_and_die executes
>>     - secondary signal encountered
>>
>>     - crash_handler invoked again
>>
>>
>> almost: not again, different signal handler now. First signal was
>> handled by "JVM_handle_<os>_signal()"
>>
>
> Ah missed that - thanks - not that it makes much difference :)
>
>
I just like nitpicking :)


>      - VMError::report_and_die executes again and sees the recursion and
>>     returns (ignoring abort due to excessive recursive errors)
>>
>>
>> No..
>>
>>     Is that right? So we actually return from the crash_handler?
>>
>>
>> Oh, but we dont return. VMError::report_and_die() will just create a new
>> frame and re-execute VMError::report() of the first VMError object.
>> Which then will continue with the next STEP. We never return, for each
>> secondary error signal a new frame is created.
>>
>
> I had trouble tracing through exactly what might happen on the recursive
> call to report_and_die. I see now that report comes from:
>
>     staticBufferStream sbs(buffer, O_BUFLEN, &log);
>     first_error->report(&sbs);
>     first_error->_current_step = 0;         // reset current_step
>     first_error->_current_step_info = "";   // reset current_step string
>
> so the second time through we will call report and _current_step should
> indicate where to start executing from.
>
>
Exactly. There is also a catch: stack usage goes up with each secondary
error. Not endlessly, though; it is limited by the number of error
reporting steps. The more stack VMError::report() consumes, the less well
this works, especially in stack overflow scenarios.

Which is why we extended SafeFetch and enabled it for use in the error
handler; that will be one of the next patches I'd like to port to the
OpenJDK, once this one is through.


>
>  This all happens in VMError::report_and_die:
>> -> first error ? anchor VMError object in a static variable and execute
>> VMError::report()
>> -> secondary error?
>>     -> different thread? just sleep forever
>>     -> same thread? new frame, re-enter VMError::report(). Once done,
>> abort.
>>
>> I always found that rather neat, but in fact that is not our invention
>> but Sun's :) Anyway, my fix does not change this behaviour for better or
>> worse, it only makes it usable for more cases.
>>
>>     Because this puts us in undefined territory according to POSIX:
>>
>>     "The behavior of a process is undefined after it returns normally
>>     from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or
>>     SIGSEGV signal that was not generated by kill(), sigqueue(), or
>>     raise()."
>>
>> true, but we dont return...
>>
>>     On top of that you also have the issue that error reporting does a
>>     whole bunch of things that are not async-signal-safe so we can
>>     easily encounter hangs or aborts.
>>
>>     But we're dying anyway so I guess none of this really matters. If
>>     re-enabling these signals allows error reporting to progress further
>>     in some cases then that is a win.
>>
>>
>> Actually, this covers a lot of cases, mostly because SIGSEGV during
>> error reporting is common, so if the original error was not SIGSEGV, but
>> e.g. SIGILL, this would always result in broken hs-err files.
>>
>> The back story is that at SAP, we rely heavily on the hs-err files. They
>> are our main tool for support, because working with cores is often not
>> feasible. So, we put a lot of work in making error reporting reliable
>> across all platforms. This is also covered by many tests which crash the
>> VM in exciting ways and check the hs-err files for completeness.
>>
>
> OK. Modulo the cpu specific SIGILL part everything else seems fine.
>
Great. Just tell me how you want that part.

Kind regards, Thomas


> Thanks,
> David
>

From david.holmes at oracle.com  Wed Nov 26 12:02:38 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Nov 2014 22:02:38 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>
	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
Message-ID: <5475C15E.30207@oracle.com>

On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
> Hi David,
> ...
>
>              - In debug.cpp for the SIGILL can you define the  all zero
>         case as a
>              default so we only need to add platform specific
>         definitions when
>              all zeroes doesn't work. I really hate seeing all that CPU
>         selection
>              in shared code. :(
>
>
>         Agreed and fixed, moved the CPU-specific sections into
>         CPU-specific files.
>
>
>     I'd really like to see a way to share the all-zeroes case so that we
>     don't need to add platform specific code unnecessarily.
>
>
> sooo.. back to the original code then, just with the #ifdef, just with
> the zero-cases all folded in into the #else path? Or do you prefer
> something else?

Elsewhere there is a pattern of defining per-platform values that can 
override the shared definition. eg:

#ifndef HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
   Foo XXX = ...;  //shared/default initialization
#endif

but this assumes a platform specific header has already been included 
that can do:

#define HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
Foo XXX = ... ; // platform specific initialization

But that is not the case for debug.hpp.

So I guess folding the zero-case into the else path is the best we can 
do. However I'm assuming the zero case will work for our internal 
platforms ... if it doesn't then we'd have to pollute the shared code 
with info for the closed platforms. :(
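
A sketch of what folding the zero case into the #else path could look like. The x86 bytes are the UD2 opcode (0x0F 0x0B); the array name and the helper are hypothetical, and whether an all-zero word really decodes as an illegal instruction has to be checked for each platform that ends up in the #else branch. Nothing here executes the bytes.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__i386__) || defined(__x86_64__)
// x86 gets its own definition: UD2 is the architecturally reserved
// undefined instruction.
static const unsigned char kIllegalInstruction[] = { 0x0F, 0x0B };
#else
// Shared default for every platform where an all-zero word is assumed
// to be an illegal instruction.
static const unsigned char kIllegalInstruction[] = { 0x00, 0x00, 0x00, 0x00 };
#endif

// Returns true if the selected pattern is the shared all-zero default.
bool uses_all_zero_default() {
  for (std::size_t i = 0; i < sizeof(kIllegalInstruction); i++) {
    if (kIllegalInstruction[i] != 0) {
      return false;
    }
  }
  return true;
}
```

A platform only needs its own branch when the all-zero default does not work there, which keeps the CPU selection in shared code as small as possible.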

David
-----

>
>              - Style nit: please use i++ rather than i ++
>
>
>         Fixed.
>
>              Aside: we should eradicate the use of sigprocmask and
>         replace with
>              the thread specific version.
>
>
>         Agree. Though I never saw any errors stemming from the use of
>         sigprocmask(). According to POSIX, sigprocmask() is undefined in
>         multithreaded environment, and I guess most OSes just default to
>         pthread_sigmask.
>
>
>     Yes "probably" works okay but I hate to see us using something with
>     undefined semantics. That's future clean up though.
>
>
> We (SAP JVM) already use pthread_sigmask() / thr_sigsetmask() instead of
> sigprocmask. Works fine. We can port this to the OpenJDK.
>
>              Getting back to the "thinking more about this" ... If a
>         synchronous
>              signal is blocked at the time it is generated then it
>         should remain
>              pending on the thread (POSIX spec) but that doesn't tell us
>         what the
>              thread will then do - retry the faulting instruction? Become
>              unschedulable? So I can easily imagine that a hang or process
>              termination may result.
>
>
>         This is exactly what happens, but it is actually covered by
>         POSIX, see
>         doc on pthread_sigmask: "If any of the SIGFPE, SIGILL, SIGSEGV, or
>         SIGBUS signals are generated while they are blocked, the result is
>         undefined, unless the signal was generated by the /kill/()
>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>         function, the /sigqueue/()
>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>         function, or the /raise/()
>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>         function."
>
>
>     Thanks - I managed to miss that part even though I found the other
>     part about the signal handling function returning. :(
>
>
> It is well hidden, I found it by accident :) To me it looks like they
> kept it intentionally vague, to not block platforms where those signals
> could be somehow dealt with automatically? Hard to see though how this
> would work.
>
>
>
>         In reality, process usually aborts abnormally with the default
>         action
>         for the signal, e.g. printing out "Illegal Instruction". On
>         MacOS, we
>         hang (until the Watcherthread finally kills the VM). On old
>         AIXes, we
>         die without a trace.
>
>         This also can be easily tried out by removing SIGILL from the
>         list of
>         signals in vmError_<os>.cpp and executing:
>
>         java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15
>
>         which will crash first with a SIGSEGV, then in error handling with a
>         secondary SIGILL. This will interrupt error reporting and kill
>         or hang
>         the process.
>
>
>              In that sense unblocking those signals whilst handling the
>         initial
>              signal may well allow the error reporting process to continue
>              further. But I'm unclear exactly how this plays out:
>
>              - synchronous signal encountered
>              - crash_handler invoked
>
>              - VMError::report_and_die executes
>              - secondary signal encountered
>
>              - crash_handler invoked again
>
>
>         almost: not again, different signal handler now. First signal was
>         handled by "JVM_handle_<os>_signal()"
>
>
>     Ah missed that - thanks - not that it makes much difference :)
>
>
> I just like nitpicking :)
>
>              - VMError::report_and_die executes again and sees the
>         recursion and
>              returns (ignoring abort due to excessive recursive errors)
>
>
>         No..
>
>              Is that right? So we actually return from the crash_handler?
>
>
>         Oh, but we dont return. VMError::report_and_die() will just
>         create a new
>         frame and re-execute VMError::report() of the first VMError object.
>         Which then will continue with the next STEP. We never return,
>         for each
>         secondary error signal a new frame is created.
>
>
>     I had trouble tracing through exactly what might happen on the
>     recursive call to report_and_die. I see now that report comes from:
>
>          staticBufferStream sbs(buffer, O_BUFLEN, &log);
>          first_error->report(&sbs);
>          first_error->_current_step = 0;         // reset current_step
>          first_error->_current_step_info = "";   // reset current_step string
>
>     so the second time through we will call report and _current_step
>     should indicate where to start executing from.
>
>
> Exactly. There is also a catch, in that the stack usage goes up. Not
> endlessly, it is limited by the number of error reporting steps.
> The more stack VMError::report() does cost, the less well this works,
> especially in stack overflow scenarios.
>
> Which is why we extended SafeFetch and enabled it for the use in the
> error handler, which will be one of the next patches I'd like to
> port to the OpenJDK, once this one is thru.
>
>
>         This all happens in VMError::report_and_die:
>         -> first error ? anchor VMError object in a static variable and
>         execute
>         VMError::report()
>         -> secondary error?
>              -> different thread? just sleep forever
>              -> same thread? new frame, re-enter VMError::report(). Once
>         done, abort.
>
>         I always found that rather neat, but in fact that is not our
>         invention
>         but Sun's :) Anyway, my fix does not change this behaviour for
>         better or
>         worse, it only makes it usable for more cases.
>
>              Because this puts us in undefined territory according to POSIX:
>
>              "The behavior of a process is undefined after it returns
>         normally
>              from a signal-catching function for a SIGBUS, SIGFPE,
>         SIGILL, or
>              SIGSEGV signal that was not generated by kill(), sigqueue(), or
>              raise()."
>
>         true, but we dont return...
>
>              On top of that you also have the issue that error reporting
>         does a
>              whole bunch of things that are not async-signal-safe so we can
>              easily encounter hangs or aborts.
>
>              But we're dying anyway so I guess none of this really
>         matters. If
>              re-enabling these signals allows error reporting to
>         progress further
>              in some cases then that is a win.
>
>
>         Actually, this covers a lot of cases, mostly because SIGSEGV during
>         error reporting is common, so if the original error was not
>         SIGSEGV, but
>         e.g. SIGILL, this would always result in broken hs-err files.
>
>         The back story is that at SAP, we rely heavily on the hs-err
>         files. They
>         are our main tool for support, because working with cores is
>         often not
>         feasible. So, we put a lot of work in making error reporting
>         reliable
>         across all platforms. This is also covered by many tests which
>         crash the
>         VM in exciting ways and check the hs-err files for completeness.
>
>
>     OK. Modulo the cpu specific SIGILL part everything else seems fine.
>
> Great. just tell me how you want that part.
>
> Kind regards, Thomas
>
>     Thanks,
>     David
>

From thomas.stuefe at gmail.com  Wed Nov 26 13:33:27 2014
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 26 Nov 2014 14:33:27 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5475C15E.30207@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
	<54759DFE.7020300@oracle.com>
	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
	<5475C15E.30207@oracle.com>
Message-ID: <CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>

Hi David,

here you go: http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/

Reverted SIGILL-generating function back to its original form, plus the
folding of the 000 case.

I can only guess what your closed platforms are, but if it is ARM, I
believe opcodes 0-31 are undefined. For ia64, 0 is undefined as well.

Kind regards, Thomas


On Wed, Nov 26, 2014 at 1:02 PM, David Holmes <david.holmes at oracle.com>
wrote:

> On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
>
>> Hi David,
>> ...
>>
>>              - In debug.cpp for the SIGILL can you define the  all zero
>>         case as a
>>              default so we only need to add platform specific
>>         definitions when
>>              all zeroes doesn't work. I really hate seeing all that CPU
>>         selection
>>              in shared code. :(
>>
>>
>>         Agreed and fixed, moved the CPU-specific sections into
>>         CPU-specific files.
>>
>>
>>     I'd really like to see a way to share the all-zeroes case so that we
>>     don't need to add platform specific code unnecessarily.
>>
>>
>> sooo.. back to the original code then, just with the #ifdef, just with
>> the zero-cases all folded in into the #else path? Or do you prefer
>> something else?
>>
>
> Elsewhere there is a pattern of defining per-platform values that can
> override the shared definition. eg:
>
> #ifndef HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
>   Foo XXX = ...;  //shared/default initialization
> #endif
>
> but this assumes a platform specific header has already been included that
> can do:
>
> #define HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
> Foo XXX = ... ; // platform specific initialization
>
> But that is not the case for debug.hpp.
>
> So I guess folding the zero-case into the else path is the best we can do.
> However I'm assuming the zero case will work for our internal platforms ...
> if it doesn't then we'd have to pollute the shared code with info for the
> closed platforms. :(
>
> David
> -----
>
>
>>              - Style nit: please use i++ rather than i ++
>>
>>
>>         Fixed.
>>
>>              Aside: we should eradicate the use of sigprocmask and
>>         replace with
>>              the thread specific version.
>>
>>
>>         Agree. Though I never saw any errors stemming from the use of
>>         sigprocmask(). According to POSIX, sigprocmask() is undefined in
>>         multithreaded environment, and I guess most OSes just default to
>>         pthread_sigmask.
>>
>>
>>     Yes "probably" works okay but I hate to see us using something with
>>     undefined semantics. That's future clean up though.
>>
>>
>> We (SAP JVM) already use pthread_sigmask() / thr_sigsetmask() instead of
>> sigprocmask. Works fine. We can port this to the OpenJDK.
>>
>>              Getting back to the "thinking more about this" ... If a
>>         synchronous
>>              signal is blocked at the time it is generated then it
>>         should remain
>>              pending on the thread (POSIX spec) but that doesn't tell us
>>         what the
>>              thread will then do - retry the faulting instruction? Become
>>              unschedulable? So I can easily imagine that a hang or process
>>              termination may result.
>>
>>
>>         This is exactly what happens, but it is actually covered by
>>         POSIX, see
>>         doc on pthread_sigmask: "If any of the SIGFPE, SIGILL, SIGSEGV, or
>>         SIGBUS signals are generated while they are blocked, the result is
>>         undefined, unless the signal was generated by the kill()
>>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>>         function, the sigqueue()
>>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>>         function, or the raise()
>>         <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>>         function."
>>
>>
>>     Thanks - I managed to miss that part even though I found the other
>>     part about the signal handling function returning. :(
>>
>>
>> It is well hidden, I found it by accident :) To me it looks like they
>> kept it intentionally vague, to not block platforms where those signals
>> could be somehow dealt with automatically? Hard to see though how this
>> would work.
>>
>>
>>
>>         In reality, the process usually aborts abnormally with the
>>         default action for the signal, e.g. printing out "Illegal
>>         Instruction". On MacOS, we hang (until the WatcherThread finally
>>         kills the VM). On old AIXes, we die without a trace.
>>
>>         This also can be easily tried out by removing SIGILL from the
>>         list of
>>         signals in vmError_<os>.cpp and executing:
>>
>>         java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15
>>
>>         which will crash first with a SIGSEGV, then in error handling
>> with a
>>         secondary SIGILL. This will interrupt error reporting and kill
>>         or hang
>>         the process.
>>
>>
>>              In that sense unblocking those signals whilst handling the
>>         initial
>>              signal may well allow the error reporting process to continue
>>              further. But I'm unclear exactly how this plays out:
>>
>>              - synchronous signal encountered
>>              - crash_handler invoked
>>
>>              - VMError::report_and_die executes
>>              - secondary signal encountered
>>
>>              - crash_handler invoked again
>>
>>
>>         almost: not again, different signal handler now. First signal was
>>         handled by "JVM_handle_<os>_signal()"
>>
>>
>>     Ah missed that - thanks - not that it makes much difference :)
>>
>>
>> I just like nitpicking :)
>>
>>              - VMError::report_and_die executes again and sees the
>>         recursion and
>>              returns (ignoring abort due to excessive recursive errors)
>>
>>
>>         No..
>>
>>              Is that right? So we actually return from the crash_handler?
>>
>>
>>         Oh, but we don't return. VMError::report_and_die() will just
>>         create a new frame and re-execute VMError::report() of the first
>>         VMError object, which then will continue with the next STEP. We
>>         never return; for each secondary error signal a new frame is
>>         created.
>>
>>
>>     I had trouble tracing through exactly what might happen on the
>>     recursive call to report_and_die. I see now that report comes from:
>>
>>          staticBufferStream sbs(buffer, O_BUFLEN, &log);
>>          first_error->report(&sbs);
>>          first_error->_current_step = 0;         // reset current_step
>>          first_error->_current_step_info = "";   // reset current_step string
>>
>>     so the second time through we will call report and _current_step
>>     should indicate where to start executing from.
>>
>>
>> Exactly. There is also a catch, in that the stack usage goes up. Not
>> endlessly, it is limited by the number of error reporting steps.
>> The more stack VMError::report() costs, the less well this works,
>> especially in stack overflow scenarios.
>>
>> Which is why we extended SafeFetch and enabled it for use in the
>> error handler, which will be one of the next patches I'd like to
>> port to the OpenJDK, once this one is through.
>>
>>
>>         This all happens in VMError::report_and_die:
>>         -> first error ? anchor VMError object in a static variable and
>>         execute
>>         VMError::report()
>>         -> secondary error?
>>              -> different thread? just sleep forever
>>              -> same thread? new frame, re-enter VMError::report(). Once
>>         done, abort.
>>
>>         I always found that rather neat, but in fact that is not our
>>         invention
>>         but Sun's :) Anyway, my fix does not change this behaviour for
>>         better or
>>         worse, it only makes it usable for more cases.
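
The re-entry scheme described above can be modelled in a few lines. This
is a toy sketch only: C++ exceptions stand in for secondary signals, all
names (ToyReport, toy_report_and_die) are hypothetical, and advancing the
step counter before doing the work is an assumption about how a faulting
step gets skipped. Unlike the real signal case, the exception unwinds the
stack, so the stack growth mentioned in the thread is not reproduced here.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only: exceptions stand in for secondary signals, and all
// names here are hypothetical rather than taken from vmError.cpp.
struct ToyReport {
  int total_steps;
  int faulty_step;                 // step that "crashes" (-1 for none)
  int current_step = 0;            // survives re-entry, like a static anchor
  std::vector<std::string> log;

  ToyReport(int total, int faulty) : total_steps(total), faulty_step(faulty) {}

  void report() {
    while (current_step < total_steps) {
      int step = current_step++;   // advance first, so a fault skips this step
      if (step == faulty_step) throw std::runtime_error("secondary fault");
      log.push_back("step " + std::to_string(step));
    }
  }
};

// Re-enters report() in a new frame after each fault until all steps ran.
static void toy_report_and_die(ToyReport& r) {
  try {
    r.report();
  } catch (const std::runtime_error&) {
    r.log.push_back("[error occurred during error reporting]");
    toy_report_and_die(r);
  }
}
```

With five steps and a fault in step 2, the log holds steps 0 and 1, one
error marker, then steps 3 and 4: the report loses one section instead of
ending at the first secondary error.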
>>
>>              Because this puts us in undefined territory according to
>> POSIX:
>>
>>              "The behavior of a process is undefined after it returns
>>         normally
>>              from a signal-catching function for a SIGBUS, SIGFPE,
>>         SIGILL, or
>>              SIGSEGV signal that was not generated by kill(), sigqueue(),
>> or
>>              raise()."
>>
>>         true, but we don't return...
>>
>>              On top of that you also have the issue that error reporting
>>         does a
>>              whole bunch of things that are not async-signal-safe so we
>> can
>>              easily encounter hangs or aborts.
>>
>>              But we're dying anyway so I guess none of this really
>>         matters. If
>>              re-enabling these signals allows error reporting to
>>         progress further
>>              in some cases then that is a win.
>>
>>
>>         Actually, this covers a lot of cases, mostly because SIGSEGV
>> during
>>         error reporting is common, so if the original error was not
>>         SIGSEGV, but
>>         e.g. SIGILL, this would always result in broken hs-err files.
>>
>>         The back story is that at SAP, we rely heavily on the hs-err
>>         files. They
>>         are our main tool for support, because working with cores is
>>         often not
>>         feasible. So, we put a lot of work in making error reporting
>>         reliable
>>         across all platforms. This is also covered by many tests which
>>         crash the
>>         VM in exciting ways and check the hs-err files for completeness.
>>
>>
>>     OK. Modulo the cpu specific SIGILL part everything else seems fine.
>>
>> Great. just tell me how you want that part.
>>
>> Kind regards, Thomas
>>
>>     Thanks,
>>     David
>>
>>
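
For reference, the thread-specific unblocking discussed in this thread
could look roughly like the sketch below. It assumes a POSIX system and
uses pthread_sigmask() rather than sigprocmask(); the helper names
(unblock_sync_error_signals, is_blocked) are hypothetical and are not the
functions changed in the webrev.

```cpp
#include <pthread.h>
#include <signal.h>

// Unblocks the synchronous error signals for the calling thread only.
// Returns true on success. Hypothetical helper, not HotSpot code.
static bool unblock_sync_error_signals() {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGILL);
  sigaddset(&set, SIGBUS);
  sigaddset(&set, SIGFPE);
  sigaddset(&set, SIGSEGV);
  return pthread_sigmask(SIG_UNBLOCK, &set, nullptr) == 0;
}

// Reports whether sig is currently blocked in the calling thread.
static bool is_blocked(int sig) {
  sigset_t current;
  sigemptyset(&current);
  // With a null new set, the call only queries the current mask.
  pthread_sigmask(SIG_SETMASK, nullptr, &current);
  return sigismember(&current, sig) == 1;
}
```

Calling the first helper at the start of a crash handler would let a
secondary SIGSEGV or SIGILL be delivered instead of leaving the thread in
the undefined state POSIX describes for blocked synchronous signals.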

From thomas.stuefe at gmail.com  Wed Nov 26 14:12:52 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 26 Nov 2014 15:12:52 +0100
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <CAGFVN2CPxXC9bhWvdq-wfmbyBGTZCS2gz5dq9jWf6dMCXGMYvg@mail.gmail.com>
References: <542C8274.3010809@gmail.com> <54338B70.9080709@oracle.com>
	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>
	<543B1FD6.3000200@oracle.com> <543CF553.80601@gmail.com>
	<543DC2BF.9050407@oracle.com> <543E80F8.3080204@gmail.com>
	<547330E5.1050708@gmail.com>
	<FE1302A7-A228-43E9-BCB6-74558268E296@oracle.com>
	<CAGFVN2CPxXC9bhWvdq-wfmbyBGTZCS2gz5dq9jWf6dMCXGMYvg@mail.gmail.com>
Message-ID: <CAA-vtUx5iVr6=CoywjdnyH4mkP1_zdpOceW0TMeYSTsND_PUDg@mail.gmail.com>

Hi Yasumasa,

I am not a Reviewer. Barring the general decision of the real reviewers,
here are some thoughts:

os_linux.cpp

- jio_snprintf() returns -1 on truncation. n+=written may walk backwards. I
would probably check for (written >= 0) and also, at the start of the loop,
for (n < sizeof(core_path)).
- code is used in error reporting. I would be hesitant to create larger
buffers on the stack. malloc may be better.
- code does not detect truncation of core_path (unlikely but possible)

the rest is more matter of taste:
- I would prefer sizeof(core_path) over PATH_MAX at all places where you
refer to the size of the buffer. So you could make the buffer very small
and test e.g. how your code behaves with truncation.
- when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets
may be a tiny bit simpler.
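
To illustrate the first three points, here is a minimal sketch of an
append helper that cannot walk backwards and that reports truncation. It
is an assumption-laden example: append_checked is a hypothetical name, and
it is built on standard vsnprintf(), which returns the length that would
have been written, not on jio_snprintf() with its -1-on-truncation
behaviour.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Appends formatted text at buf[*pos]; returns false on error or truncation.
// Hypothetical helper built on vsnprintf, whose return value is the
// length that would have been written (negative on error).
static bool append_checked(char* buf, std::size_t size, std::size_t* pos,
                           const char* fmt, ...) {
  if (*pos >= size) return false;            // no room left, do not write
  va_list ap;
  va_start(ap, fmt);
  int written = std::vsnprintf(buf + *pos, size - *pos, fmt, ap);
  va_end(ap);
  if (written < 0) return false;             // formatting error, pos unchanged
  if (static_cast<std::size_t>(written) >= size - *pos) {
    *pos = size - 1;                         // truncated: buffer is full
    return false;
  }
  *pos += static_cast<std::size_t>(written); // never moves backwards
  return true;
}
```

Because the size is passed in rather than hard-coded, the same loop can be
exercised with a deliberately tiny buffer to see how truncation is handled.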

Kind Regards, Thomas


On Wed, Nov 26, 2014 at 4:54 AM, Yasumasa Suenaga <yasuenag at gmail.com>
wrote:

> Hi Staffan,
>
> Thank you for reviewing!
>
> os_linux.cpp:
> I want to print the coredump location correctly in hs_err, so I want to
> output whether the coredump is processed by another process or written to
> a file. If os::get_core_path() should be simpler, I will print the raw
> core_pattern string.
>
> os_bsd.cpp:
> I don't have OS X, so I cannot check it.
> I am focusing on Linux in this enhancement. Could you file it as a separate
> enhancement if it is needed?
>
> Thanks,
>
> Yasumasa
>
>  2014/11/25 18:15 "Staffan Larsen" <staffan.larsen at oracle.com>:
>
> > src/os/linux/vm/os_linux.cpp:
> > I'm inclined to think this is too complicated and hard to test and
> > maintain (and I see no tests in the webrev). Could we not simplify this to
> > print a helpful message instead? Something that prints the core_pattern and
> > perhaps some of the values that could be used for substitution, but does
> > not do the actual substitution? I think that would go a long way but be a
> > lot more maintainable.
> >
> > src/os/bsd/vm/os_bsd.cpp:
> > On OS X cores are by default written to /cores/core.<pid>. This is
> > configurable with the kern.corefile sysctl variable, although it is rare
> > to do so.
> >
> >  /Staffan
> >
> > > On 24 nov 2014, at 14:21, Yasumasa Suenaga <yasuenag at gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > I've uploaded webrev for this issue about a month ago.
> > > Could you review it and sponsor it?
> > >
> > >
> > > Thanks,
> > >
> > > Yasumasa
> > >
> > >
> > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
> > >> Hi David,
> > >>
> > >> I've uploaded new webrev:
> > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
> > >>
> > >>
> > >>> I wasn't suggesting that you make such a change though because it is
> > >>> large and disruptive.
> > >>
> > >>> Unfactoring check_or_create_dump is a step backwards in terms of code
> > >>> sharing.
> > >>
> > >> I restored check_or_create_dump() to os_posix.cpp .
> > >> And I changed get_core_path() to create message which represents core
> > dump path
> > >> (including filename) in each OS.
> > >>
> > >>
> > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern
> > >>> may be okay (but I don't know enough about it to validate everything).
> > >>
> > >> I implemented all parameters listed in the Linux kernel documentation:
> > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
> > >>
> > >> So I think the set of parameters that are processed is sufficient.
> > >>
> > >>
> > >> Thanks,
> > >>
> > >> Yasumasa
> > >>
> > >>
> > >>
> > >> (2014/10/15 9:41), David Holmes wrote:
> > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
> > >>>> Hi David,
> > >>>>
> > >>>> Thank you for comments!
> > >>>> I've uploaded new webrev. Could you review it again?
> > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
> > >>>>
> > >>>> I am an author of jdk9. So I cannot commit it.
> > >>>> Could you be a sponsor for this enhancement?
> > >>>>
> > >>>>
> > >>>>> In which case that should be handled by the linux specific
> > >>>>> get_core_path() function.
> > >>>>
> > >>>> Agree.
> > >>>> So I implemented it in os_linux.cpp .
> > >>>> But some of the format specifiers (%P: global pid, %s: signal, %t: dump
> > >>>> time) are not processed in this function because I think these
> > >>>> parameters are difficult to handle in it.
> > >>>>
> > >>>>   %P: I could not find an API for this.
> > >>>>   %s: We would have to change the arguments of get_core_path().
> > >>>>   %t: This parameter is the timestamp of the coredump. It is decided in
> > >>>>       the kernel.
> > >>>>
> > >>>>
> > >>>>> Fixing this means changing all the os_posix using platforms. But your
> > >>>>> patch is not about this part. :)
> > >>>>
> > >>>> I moved os::check_or_create_dump() to each OS implementation (AIX, BSD,
> > >>>> Solaris, Linux).
> > >>>> So I can write Linux specific code in check_or_create_dump().
> > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
> > >>>
> > >>> I wasn't suggesting that you make such a change though because it is
> > >>> large and disruptive. The simple handling of the | part of core_pattern
> > >>> was basically ok. Expanding the get_core_path in os_linux.cpp to handle
> > >>> the core_pattern may be okay (but I don't know enough about it to
> > >>> validate everything). Unfactoring check_or_create_dump is a step
> > >>> backwards in terms of code sharing.
> > >>>
> > >>> Sorry this has grown too large for me to deal with right now.
> > >>>
> > >>> David
> > >>> -----
> > >>>
> > >>>>
> > >>>>> Though I'm unclear whether it both invokes the program and creates a
> > >>>>> core dump file; or just invokes the program?
> > >>>>
> > >>>> If '|' is set, the Linux kernel will just redirect the core image to
> > >>>> the user process.
> > >>>> The kernel documentation says:
> > >>>> ------------
> > >>>> . If the first character of the pattern is a '|', the kernel will treat
> > >>>>   the rest of the pattern as a command to run.  The core dump will be
> > >>>>   written to the standard input of that program instead of to a file.
> > >>>> ------------
> > >>>>
> > >>>> The implementation of coredump (do_coredump()) follows it:
> > >>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
> > >>>>
> > >>>>
> > >>>> In the case of ABRT, ABRT dumps the core image to the default location
> > >>>> (<CWD>/core.<PID>)
> > >>>> if the user has set the core resource limit to unlimited (ulimit -c).
> > >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
> > >>>>
> > >>>>
> > >>>>> A few style nits - you need spaces around keywords and before braces
> > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
> > >>>>> than "treated".
> > >>>>> And as you don't do anything in the non-redirect case I suggest
> > >>>>> collapsing this:
> > >>>>
> > >>>> I've fixed them.
> > >>>>
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Yasumasa
> > >>>>
> > >>>>
> > >>>> (2014/10/13 9:41), David Holmes wrote:
> > >>>>> Hi Yasumasa,
> > >>>>>
> > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
> > >>>>>> Hi David,
> > >>>>>>
> > >>>>>> Sorry for my English.
> > >>>>>>
> > >>>>>> I want to propose that JVM should create message according to core
> > >>>>>> pattern (/proc/sys/kernel/core_pattern) .
> > >>>>>> So I filed it to JBS and created a patch.
> > >>>>>
> > >>>>> So I've had a quick look at this core_pattern business and it seems to
> > >>>>> me that there are two aspects to this.
> > >>>>>
> > >>>>> First, without the leading |, the entry in the core_pattern file is a
> > >>>>> naming pattern for the core file. In which case that should be handled
> > >>>>> by the linux specific get_core_path() function. Though that in itself
> > >>>>> can't fully report the expected name, as part of it is provided in the
> > >>>>> shared code in os::check_or_create_dump. Fixing this means changing
> > >>>>> all the os_posix using platforms. But your patch is not about this
> > >>>>> part. :)
> > >>>>>
> > >>>>> Second, with a leading | the core_pattern is actually the name of a
> > >>>>> program to execute when the program is about to core dump, and that is
> > >>>>> what you report with your patch. Though I'm unclear whether it both
> > >>>>> invokes the program and creates a core dump file; or just invokes the
> > >>>>> program?
> > >>>>>
> > >>>>> So with regards to this second part your patch seems functionally ok.
> > >>>>> I do dislike having a big chunk of linux specific code in this "posix"
> > >>>>> support file but ...
> > >>>>>
> > >>>>> A few style nits - you need spaces around keywords and before braces eg:
> > >>>>>
> > >>>>>   if(x){
> > >>>>>
> > >>>>> should be
> > >>>>>
> > >>>>>   if (x) {
> > >>>>>
> > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
> > >>>>> than "treated".
> > >>>>>
> > >>>>> And as you don't do anything in the non-redirect case I suggest
> > >>>>> collapsing this:
> > >>>>>
> > >>>>>           is_redirect = core_pattern[0] == '|';
> > >>>>>         }
> > >>>>>
> > >>>>>         if(is_redirect){
> > >>>>>           jio_snprintf(buffer, bufferSize,
> > >>>>>                    "Core dumps may be treated with \"%s\"",
> > >>>>>                    &core_pattern[1]);
> > >>>>>         }
> > >>>>>
> > >>>>> to just
> > >>>>>
> > >>>>>           if (core_pattern[0] == '|') {  // redirect
> > >>>>>             jio_snprintf(buffer, bufferSize,
> > >>>>>                          "Core dumps may be processed with \"%s\"",
> > >>>>>                          &core_pattern[1]);
> > >>>>>           }
> > >>>>>         }
> > >>>>>
> > >>>>> Comments from other runtime folk appreciated.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> David
> > >>>>>
> > >>>>>> Thanks,
> > >>>>>>
> > >>>>>> Yasumasa
> > >>>>>>
> > >>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com
> > >>>>>> <mailto:david.holmes at oracle.com>>:
> > >>>>>>
> > >>>>>>    Hi Yasumasa,
> > >>>>>>
> > >>>>>>    I'm sorry but I don't understand what you are proposing. When you say
> > >>>>>>    "treat" do you mean "create"? Otherwise what do you mean by "treated"?
> > >>>>>>
> > >>>>>>    Thanks,
> > >>>>>>    David
> > >>>>>>
> > >>>>>>    On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
> > >>>>>>     > I'm in Hackergarten @ JavaOne :-)
> > >>>>>>     >
> > >>>>>>     >
> > >>>>>>     > Hi all,
> > >>>>>>     >
> > >>>>>>     > I would like to enhance the messages in hs_err report.
> > >>>>>>     > Modern Linux kernel can treat core dump with user process (e.g. ABRT)
> > >>>>>>     > However, hs_err report cannot detect it.
> > >>>>>>     >
> > >>>>>>     > I think that hs_err report should output messages as below:
> > >>>>>>     > -------------
> > >>>>>>     >     Failed to write core dump. Core dumps may be treated with
> > >>>>>>     >     "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s %c %p
> > >>>>>>     >     %u %g %t e"
> > >>>>>>     > -------------
> > >>>>>>     >
> > >>>>>>     > I've uploaded webrev of this enhancement.
> > >>>>>>     > Could you review it?
> > >>>>>>     >
> > >>>>>>     > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
> > >>>>>>     >
> > >>>>>>     > This patch works fine on Fedora20 x86_64.
> > >>>>>>     >
> > >>>>>>     >
> > >>>>>>     >
> > >>>>>>     > Thanks,
> > >>>>>>     >
> > >>>>>>     > Yasumasa
> > >>>>>>     >
> > >>>>>>
> >
> >
>
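
The redirect check settled on earlier in this thread can be sketched as a
small standalone function. This is an illustration under stated
assumptions, not the webrev code: describe_core_pattern is a hypothetical
name, it takes the pattern as a string instead of reading
/proc/sys/kernel/core_pattern, and the wording of the non-redirect message
is invented for the example.

```cpp
#include <string>

// Builds the hint for an hs_err-style report from a core_pattern value.
// Hypothetical helper; the redirect message follows the wording suggested
// in the review ("processed" rather than "treated").
static std::string describe_core_pattern(const std::string& core_pattern) {
  if (!core_pattern.empty() && core_pattern[0] == '|') {
    // Leading '|': the kernel pipes the core image to this command.
    return "Core dumps may be processed with \"" + core_pattern.substr(1) + "\"";
  }
  // Otherwise the value is a file-name template (possibly with % specifiers).
  // The text of this message is an assumption made for the example.
  return "Core dump file name pattern: \"" + core_pattern + "\"";
}
```

Printing the raw pattern without substituting the % specifiers is the
simpler variant suggested in the review.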

From yumin.qi at oracle.com  Wed Nov 26 17:36:16 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 26 Nov 2014 09:36:16 -0800
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <547532C0.4080500@oracle.com>
References: <53DAC336.6050302@oracle.com> <54752EAF.4020404@oracle.com>
	<547532C0.4080500@oracle.com>
Message-ID: <54760F90.6040100@oracle.com>

Thanks for the review. Yes, the test will build testlibrary with

@library /testlibrary /testlibrary/whitebox


Thanks
Yumin


On 11/25/14, 5:54 PM, David Holmes wrote:
> Hi Yumin,
>
> On 26/11/2014 11:36 AM, Yumin Qi wrote:
>> Please review
>>
>> bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
>> webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/
>
> The test also needs to ensure the testlibrary gets built.
>
> Otherwise seems okay.
>
> Thanks,
> David
>
>> Now the API usage is in internal test case, see separate email for the
>> webrev.
>>
>> It is same as previous version (webrev00).
>>
>> Thanks
>> Yumin
>>
>> On 7/31/14, 3:29 PM, Yumin Qi wrote:
>>> Please review:
>>>
>>> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>>>
>>> Summary: Currently there is no Java API to get the underlying OS native VM
>>> page size other than Unsafe, which is not recommended. The newly added
>>> WhiteBox method reads this property so that it can be used in tests.
>>>
>>>
>>> Tests: JPRT and  jtreg.
>>>
>>> Thanks
>>> Yumin

From calvin.cheung at oracle.com  Wed Nov 26 18:07:29 2014
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 26 Nov 2014 10:07:29 -0800
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <54760F90.6040100@oracle.com>
References: <53DAC336.6050302@oracle.com> <54752EAF.4020404@oracle.com>
	<547532C0.4080500@oracle.com> <54760F90.6040100@oracle.com>
Message-ID: <547616E1.5090706@oracle.com>

Looks good to me too.

Calvin

On 11/26/2014 9:36 AM, Yumin Qi wrote:
> Thanks for the review. Yes, the test will build testlibrary with
>
> @library /testlibrary /testlibrary/whitebox
>
>
> Thanks
> Yumin
>
>
>
> On 11/25/14, 5:54 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> On 26/11/2014 11:36 AM, Yumin Qi wrote:
>>> Please review
>>>
>>> bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
>>> webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/
>>
>> The test also needs to ensure the testlibrary gets built.
>>
>> Otherwise seems okay.
>>
>> Thanks,
>> David
>>
>>> Now the API usage is in internal test case, see separate email for the
>>> webrev.
>>>
>>> It is same as previous version (webrev00).
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 7/31/14, 3:29 PM, Yumin Qi wrote:
>>>> Please review:
>>>>
>>>> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>>>>
>>>> Summary: Currently there is no Java API to get the underlying OS native VM
>>>> page size other than Unsafe, which is not recommended. The newly added
>>>> WhiteBox method reads this property so that it can be used in tests.
>>>>
>>>>
>>>> Tests: JPRT and  jtreg.
>>>>
>>>> Thanks
>>>> Yumin


From yumin.qi at oracle.com  Wed Nov 26 22:36:53 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 26 Nov 2014 14:36:53 -0800
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <543C591E.8010602@oracle.com>
References: <543C591E.8010602@oracle.com>
Message-ID: <54765605.8030909@oracle.com>

Hi, please review the new change for fixing the
ClassCircularityError (CCE) in this test case.

More debugging revealed that the CCE always happened at the beginning
of the loop, before the real loading of TestClass[1-3]: transform
is also called against system classes (though they are not loaded by the
agent). The loader passed to transform is now checked before
loading 'TestClass3'; if it is null, the loading is skipped. This
prevents loading the loader itself before loading 'TestClass3', and thus
avoids seeing $JarLoader$2 twice on the PlaceHolderTable. The change also
removes the 'sleep' block that was used to work around a deadlock at the
beginning of transform. With the change that loads TestClass3 only when the
loader is not null, this workaround is no longer needed. Loading the
loader itself caused both issues.

new URL:
http://cr.openjdk.java.net/~minqi/8038468/webrev02/


On 10/13/14, 3:58 PM, Yumin Qi wrote:
> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>
> the bug marked as confidential so post the webrev internally.
>
> Problem: The test case tries to load a class from the same jar via 
> agent in the middle of loading another class from the jar via same 
> class loader in same thread. The call happens in transform which is a 
> rare case --- in middle of loading class, loading another class. The 
> result is a CircularityError. When first class is in loading, in vm we 
> put JarLoader$2 on place holder table, then we start the defineClass, 
> which calls transform, begins loading the second class so go along the 
> same routine for loading JarLoader$2 first, found it already in 
> placeholder table. A CircularityError is thrown.
> Fix: The test case should not call loading class with same class 
> loader in same thread from same jar in 'transform' method. I modify it 
> loading with system class loader and we expect see 
> ClassNotFoundException. Detail see bug comments.
>
> Thanks
> Yumin

From karen.kinnear at oracle.com  Wed Nov 26 22:55:38 2014
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 26 Nov 2014 17:55:38 -0500
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <54765605.8030909@oracle.com>
References: <543C591E.8010602@oracle.com> <54765605.8030909@oracle.com>
Message-ID: <D3E973CE-D51D-4603-94C3-455F500B4254@oracle.com>

Yumin,

Looks good. thanks very much,
Karen

On Nov 26, 2014, at 5:36 PM, Yumin Qi wrote:

> Hi, please review again for new change for fixing the ClassCircularityError (CCE) in this test case.
> 
> More debug tails revealed that the CCE always happened at the beginning of the loop, before the real loading of TestClass[1-3] loaded, transform is called against system classes too (though they did not get loaded by agent). The check for loader which passed to transform is done before calling loading 'TestClass3', if it is null skip loading. This can prevent from loading loader itself before loading 'TestClass3', thus avoid seeing $JarLoader$2 twice on PlaceHolderTable. Meanwhile remove the block 'sleep' which is used to workaround deadlock at the beginning of transform. With the change which only loads class TestClass3 when loader is not null, this workaround is not needed. It is the loader loading caused both the issues here.
> 
> new URL:
> http://cr.openjdk.java.net/~minqi/8038468/webrev02/
> 
> 
> On 10/13/14, 3:58 PM, Yumin Qi wrote:
>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>> 
>> the bug marked as confidential so post the webrev internally.
>> 
>> Problem: The test case tries to load a class from the same jar via agent in the middle of loading another class from the jar via same class loader in same thread. The call happens in transform which is a rare case --- in middle of loading class, loading another class. The result is a CircularityError. When first class is in loading, in vm we put JarLoader$2 on place holder table, then we start the defineClass, which calls transform, begins loading the second class so go along the same routine for loading JarLoader$2 first, found it already in placeholder table. A CircularityError is thrown.
>> Fix: The test case should not call loading class with same class loader in same thread from same jar in 'transform' method. I modify it loading with system class loader and we expect see ClassNotFoundException. Detail see bug comments.
>> 
>> Thanks
>> Yumin


From serguei.spitsyn at oracle.com  Wed Nov 26 23:01:20 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 26 Nov 2014 15:01:20 -0800
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh
	fails with ClassCircularityError
In-Reply-To: <54765605.8030909@oracle.com>
References: <543C591E.8010602@oracle.com> <54765605.8030909@oracle.com>
Message-ID: <54765BC0.30700@oracle.com>

The fix looks good to me.
The class loading condition change is reasonable.

Thanks,
Serguei

On 11/26/14 2:36 PM, Yumin Qi wrote:
> Hi, please review again for new change for fixing the 
> ClassCircularityError (CCE) in this test case.
>
> More debug tails revealed that the CCE always happened at the 
> beginning of the loop, before the real loading of TestClass[1-3] 
> loaded, transform is called against system classes too (though they 
> did not get loaded by agent). The check for loader which passed to 
> transform is done before calling loading 'TestClass3', if it is null 
> skip loading. This can prevent from loading loader itself before 
> loading 'TestClass3', thus avoid seeing $JarLoader$2 twice on 
> PlaceHolderTable. Meanwhile remove the block 'sleep' which is used to 
> workaround deadlock at the beginning of transform. With the change 
> which only loads class TestClass3 when loader is not null, this 
> workaround is not needed. It is the loader loading caused both the 
> issues here.
>
> new URL:
> http://cr.openjdk.java.net/~minqi/8038468/webrev02/
>
>
> On 10/13/14, 3:58 PM, Yumin Qi wrote:
>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468
>> webrev: http://cr.openjdk.java.net/~minqi/8038468/webrev00/
>>
>> the bug marked as confidential so post the webrev internally.
>>
>> Problem: The test case tries to load a class from the same jar via 
>> agent in the middle of loading another class from the jar via same 
>> class loader in same thread. The call happens in transform which is a 
>> rare case --- in middle of loading class, loading another class. The 
>> result is a CircularityError. When first class is in loading, in vm 
>> we put JarLoader$2 on place holder table, then we start the 
>> defineClass, which calls transform, begins loading the second class 
>> so go along the same routine for loading JarLoader$2 first, found it 
>> already in placeholder table. A CircularityError is thrown.
>> Fix: The test case should not call loading class with same class 
>> loader in same thread from same jar in 'transform' method. I modify 
>> it loading with system class loader and we expect see 
>> ClassNotFoundException. Detail see bug comments.
>>
>> Thanks
>> Yumin


From david.holmes at oracle.com  Thu Nov 27 00:49:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Nov 2014 10:49:06 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>
	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
Message-ID: <54767502.6010907@oracle.com>

On 26/11/2014 11:33 PM, Thomas Stüfe wrote:
> Hi David,
>
> here you go: http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/
>
> Reverted SIGILL-generating function back to its original form, plus the
> folding of the 000 case.

Thanks Thomas! While we are awaiting a second reviewer I will test this 
out internally. It may take a day or two sorry.

David

> I only can guess what your closed platforms are, but if it is ARM, I
> believe opcodes 0-31 are undefined. For ia64, 0 is undefined as well.
>
> Kind regards, Thomas
>
>
> On Wed, Nov 26, 2014 at 1:02 PM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
>
>         Hi David,
>         ...
>
>                       - In debug.cpp for the SIGILL can you define the
>         all zero
>                  case as a
>                       default so we only need to add platform specific
>                  definitions when
>                       all zeroes doesn't work. I really hate seeing all
>         that CPU
>                  selection
>                       in shared code. :(
>
>
>                  Agreed and fixed, moved the CPU-specific sections into
>                  CPU-specific files.
>
>
>              I'd really like to see a way to share the all-zeroes case so that
>              we don't need to add platform specific code unnecessarily.
>
>
>         sooo.. back to the original code then, just with the #ifdef, just with
>         the zero-cases all folded into the #else path? Or do you prefer
>         something else?
>
>
>     Elsewhere there is a pattern of defining per-platform values that
>     can override the shared definition. eg:
>
>     #ifndef HAS_SPECIAL_PLATFORM_VALUE___FOR_XXXX
>        Foo XXX = ...;  // shared/default initialization
>     #endif
>
>     but this assumes a platform specific header has already been
>     included that can do:
>
>     #define HAS_SPECIAL_PLATFORM_VALUE___FOR_XXXX
>     Foo XXX = ... ; // platform specific initialization
>
>     But that is not the case for debug.hpp.
>
>     So I guess folding the zero-case into the else path is the best we
>     can do. However I'm assuming the zero case will work for our
>     internal platforms ... if it doesn't then we'd have to pollute the
>     shared code with info for the closed platforms. :(
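
A minimal sketch of the override pattern described above, in C++. The macro
and variable names (HAS_PLATFORM_ILLEGAL_INSTRUCTION_PATTERN,
illegal_instruction_pattern, fill_with_illegal_instructions) are hypothetical
and are not taken from the webrev. The only point is that shared code supplies
the all-zeroes default unless a platform header, included earlier, has already
claimed the definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A platform header (hypothetical) that needs a different pattern would be
// included before this point and would contain something like:
//
//   #define HAS_PLATFORM_ILLEGAL_INSTRUCTION_PATTERN
//   static const std::uint32_t illegal_instruction_pattern = /* port-specific */;

#ifndef HAS_PLATFORM_ILLEGAL_INSTRUCTION_PATTERN
// Shared default: all zeroes, used by every platform that did not override it.
static const std::uint32_t illegal_instruction_pattern = 0;
#endif

// Fills `buf` with `count` copies of the selected pattern, as a test helper
// might do before jumping into the buffer. Returns the number of words written.
std::size_t fill_with_illegal_instructions(std::uint32_t* buf, std::size_t count) {
  assert(buf != nullptr);
  for (std::size_t i = 0; i < count; i++) {
    buf[i] = illegal_instruction_pattern;
  }
  return count;
}
```

As the message notes, this only works when a platform-specific header is
guaranteed to be included before the shared definition is seen.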
>
>     David
>     -----
>
>
>                       - Style nit: please use i++ rather than i ++
>
>
>                  Fixed.
>
>                       Aside: we should eradicate the use of sigprocmask and
>                       replace with the thread specific version.
>
>
>                  Agree. Though I never saw any errors stemming from the use
>                  of sigprocmask(). According to POSIX, sigprocmask() is
>                  undefined in a multithreaded environment, and I guess most
>                  OSes just default to pthread_sigmask.
>
>
>              Yes "probably" works okay but I hate to see us using something
>              with undefined semantics. That's future clean up though.
>
>
>         We (SAP JVM) already use pthread_sigmask() / thr_sigsetmask() instead
>         of sigprocmask. Works fine. We can port this to the OpenJDK.
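
For reference, a small sketch of the thread-specific call discussed above,
using only standard POSIX functions. The helper names
(unblock_synchronous_signals, block_signal, is_blocked) are made up for this
illustration and do not come from the webrev or from HotSpot. POSIX itself
words the sigprocmask() restriction as "unspecified" in a multi-threaded
process, while pthread_sigmask() is defined to act on the calling thread.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Unblocks the four synchronous error signals for the calling thread only.
// Returns 0 on success, otherwise the error number from pthread_sigmask().
int unblock_synchronous_signals() {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGILL);
  sigaddset(&set, SIGBUS);
  sigaddset(&set, SIGFPE);
  sigaddset(&set, SIGSEGV);
  return pthread_sigmask(SIG_UNBLOCK, &set, nullptr);
}

// Blocks a single signal in the calling thread (used here only to set up
// a state that the function above can then undo).
int block_signal(int sig) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, sig);
  return pthread_sigmask(SIG_BLOCK, &set, nullptr);
}

// Returns true if `sig` is currently blocked in the calling thread.
bool is_blocked(int sig) {
  sigset_t current;
  sigemptyset(&current);
  // With a null new set the mask is not changed; only the current mask
  // is stored in `current`.
  pthread_sigmask(SIG_SETMASK, nullptr, &current);
  return sigismember(&current, sig) == 1;
}
```

On older C libraries this may need to be linked with the thread library
(for example -pthread).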
>
>                       Getting back to the "thinking more about this" ... If a
>                       synchronous signal is blocked at the time it is generated
>                       then it should remain pending on the thread (POSIX spec)
>                       but that doesn't tell us what the thread will then do -
>                       retry the faulting instruction? Become unschedulable? So
>                       I can easily imagine that a hang or process termination
>                       may result.
>
>
>                  This is exactly what happens, but it is actually covered by
>                  POSIX, see doc on pthread_sigmask: "If any of the SIGFPE,
>                  SIGILL, SIGSEGV, or SIGBUS signals are generated while they
>                  are blocked, the result is undefined, unless the signal was
>                  generated by the kill()
>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>                  function, the sigqueue()
>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>                  function, or the raise()
>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>                  function."
>
>
>              Thanks - I managed to miss that part even though I found the
>              other part about the signal handling function returning. :(
>
>
>         It is well hidden, I found it by accident :) To me it looks like they
>         kept it intentionally vague, to not block platforms where those
>         signals could be somehow dealt with automatically? Hard to see though
>         how this would work.
>
>
>
>                  In reality, the process usually aborts abnormally with the
>                  default action for the signal, e.g. printing out "Illegal
>                  Instruction". On MacOS, we hang (until the WatcherThread
>                  finally kills the VM). On old AIX versions, we die without
>                  a trace.
>
>                  This also can be easily tried out by removing SIGILL from the
>                  list of signals in vmError_<os>.cpp and executing:
>
>                  java -XX:ErrorHandlerTest=14 -XX:TestCrashInErrorHandler=15
>
>                  which will crash first with a SIGSEGV, then in error handling
>                  with a secondary SIGILL. This will interrupt error reporting
>                  and kill or hang the process.
>
>
>                       In that sense unblocking those signals whilst handling
>                       the initial signal may well allow the error reporting
>                       process to continue further. But I'm unclear exactly how
>                       this plays out:
>
>                       - synchronous signal encountered
>                       - crash_handler invoked
>
>                       - VMError::report_and_die executes
>                       - secondary signal encountered
>
>                       - crash_handler invoked again
>
>
>                  almost: not again, different signal handler now. First signal
>                  was handled by "JVM_handle_<os>_signal()"
>
>
>              Ah missed that - thanks - not that it makes much difference :)
>
>
>         I just like nitpicking :)
>
>                       - VMError::report_and_die executes again and sees the
>                       recursion and returns (ignoring abort due to excessive
>                       recursive errors)
>
>
>                  No..
>
>                       Is that right? So we actually return from the
>                       crash_handler?
>
>
>                  Oh, but we don't return. VMError::report_and_die() will just
>                  create a new frame and re-execute VMError::report() of the
>                  first VMError object, which then will continue with the next
>                  STEP. We never return; for each secondary error signal a new
>                  frame is created.
>
>
>              I had trouble tracing through exactly what might happen on the
>              recursive call to report_and_die. I see now that report comes
>              from:
>
>                   staticBufferStream sbs(buffer, O_BUFLEN, &log);
>                   first_error->report(&sbs);
>                   first_error->_current_step = 0;         // reset current_step
>                   first_error->_current_step_info = "";   // reset current_step string
>
>              so the second time through we will call report and _current_step
>              should indicate where to start executing from.
>
>
>         Exactly. There is also a catch, in that the stack usage goes up. Not
>         endlessly; it is limited by the number of error reporting steps.
>         The more stack VMError::report() costs, the less well this works,
>         especially in stack overflow scenarios.
>
>         Which is why we extended SafeFetch and enabled it for use in the
>         error handler, which will be one of the next patches I'd like to
>         port to the OpenJDK, once this one is through.
>
>
>                  This all happens in VMError::report_and_die:
>                  -> first error? anchor VMError object in a static variable
>                     and execute VMError::report()
>                  -> secondary error?
>                       -> different thread? just sleep forever
>                       -> same thread? new frame, re-enter VMError::report().
>                          Once done, abort.
>
>                  I always found that rather neat, but in fact that is not our
>                  invention but Sun's :) Anyway, my fix does not change this
>                  behaviour for better or worse, it only makes it usable for
>                  more cases.
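
The re-entry scheme described above can be sketched as a small simulation.
This is not HotSpot code: no real signals are raised, a "crash" inside a step
simply calls report_and_die() again to stand in for the secondary signal
handler, and the function returns at the end where a real VM would abort. The
struct and field names (ErrorReport, faulty_steps, max_depth) are invented for
the illustration, and the "different thread" branch is left out. What it is
meant to show is that the step counter is advanced before a step runs, so a
re-entered report() skips the step that crashed, and that each secondary error
costs one more stack frame.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct ErrorReport {
  std::set<int> faulty_steps;     // steps that "crash" in this simulation
  int current_step = 0;           // last step that was started
  int depth = 0;                  // live report() frames
  int max_depth = 0;              // deepest nesting seen
  bool done = false;              // set once the last step has run
  std::vector<std::string> log;   // stands in for the hs_err output
};

static const int kSteps = 5;
static ErrorReport* first_error = nullptr;  // anchored on the first error

void report_and_die(ErrorReport* err);

void report(ErrorReport* err) {
  err->depth++;
  if (err->depth > err->max_depth) err->max_depth = err->depth;
  for (int n = 1; n <= kSteps && !err->done; n++) {
    if (err->current_step >= n) continue;  // already started by an outer frame
    err->current_step = n;                 // advance before doing the work
    if (err->faulty_steps.count(n) != 0) {
      err->log.push_back("step " + std::to_string(n) + ": crashed");
      report_and_die(err);  // secondary error: new frame, resume at next step
      // A real VM never gets back here; the simulation unwinds via `done`.
    } else {
      err->log.push_back("step " + std::to_string(n) + ": ok");
    }
  }
  err->done = true;
  err->depth--;
}

void report_and_die(ErrorReport* err) {
  if (first_error == nullptr) first_error = err;  // first error: anchor it
  report(first_error);                            // always report the first
  // A real VM would abort here instead of returning.
}
```

With steps 2 and 4 marked as faulty, the log still reaches step 5, and the
nesting depth ends up at three frames: one for the original error and one per
secondary crash.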
>
>                       Because this puts us in undefined territory according to
>                       POSIX:
>
>                       "The behavior of a process is undefined after it returns
>                       normally from a signal-catching function for a SIGBUS,
>                       SIGFPE, SIGILL, or SIGSEGV signal that was not generated
>                       by kill(), sigqueue(), or raise()."
>
>                  true, but we don't return...
>
>                       On top of that you also have the issue that error
>                       reporting does a whole bunch of things that are not
>                       async-signal-safe so we can easily encounter hangs or
>                       aborts.
>
>                       But we're dying anyway so I guess none of this really
>                       matters. If re-enabling these signals allows error
>                       reporting to progress further in some cases then that is
>                       a win.
>
>
>                  Actually, this covers a lot of cases, mostly because SIGSEGV
>                  during error reporting is common, so if the original error
>                  was not SIGSEGV, but e.g. SIGILL, this would always result in
>                  broken hs-err files.
>
>                  The back story is that at SAP, we rely heavily on the hs-err
>                  files. They are our main tool for support, because working
>                  with cores is often not feasible. So, we put a lot of work
>                  into making error reporting reliable across all platforms.
>                  This is also covered by many tests which crash the VM in
>                  exciting ways and check the hs-err files for completeness.
>
>
>              OK. Modulo the cpu specific SIGILL part everything else seems
>              fine.
>
>         Great. just tell me how you want that part.
>
>         Kind regards, Thomas
>
>              Thanks,
>              David
>
>

From david.holmes at oracle.com  Thu Nov 27 00:57:34 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Nov 2014 10:57:34 +1000
Subject: RFR(XS): 8053995: Add method to WhiteBox to get vm pagesize.
In-Reply-To: <54760F90.6040100@oracle.com>
References: <53DAC336.6050302@oracle.com> <54752EAF.4020404@oracle.com>
	<547532C0.4080500@oracle.com> <54760F90.6040100@oracle.com>
Message-ID: <547676FE.3080204@oracle.com>

On 27/11/2014 3:36 AM, Yumin Qi wrote:
> Thanks for the review. Yes, the test will build testlibrary with
>
> @library /testlibrary /testlibrary/whitebox

No, that won't necessarily build the testlibrary.

 From other email:

 >> I'm having a problem running a test in 8u25 that uses the testlibrary
 >> ProcessTools API. I get a ClassNotFoundException. Looking in the
 >> classes directory I only see two testlibrary classes - which map to
 >> two specific testlibrary classes that one test has on its @build
 >> line. The test in question simply has:
 >>
 >> @library /testlibrary
 >>
 >> Does it need an explicit:
 >>
 >> @build com.oracle.java.testlibrary.*
 >
 > Yes. It turns out that JTReg might not compile the library classes on
 > demand (but it does sometimes). So it is better to specify the
 > required build manually.
 >
 > -JB-

David
-----

>
> Thanks
> Yumin
>
>
>
> On 11/25/14, 5:54 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> On 26/11/2014 11:36 AM, Yumin Qi wrote:
>>> Please review
>>>
>>> bugs: https://bugs.openjdk.java.net/browse/JDK-8053995
>>> webrev: http://cr.openjdk.java.net/~minqi/8053995/webrev01/
>>
>> The test also needs to ensure the testlibrary gets built.
>>
>> Otherwise seems okay.
>>
>> Thanks,
>> David
>>
>>> The API is now used in an internal test case; see the separate email for
>>> the webrev.
>>>
>>> It is the same as the previous version (webrev00).
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 7/31/14, 3:29 PM, Yumin Qi wrote:
>>>> Please review:
>>>>
>>>> http://cr.openjdk.java.net/~minqi/8053995/webrev00/
>>>>
>>>> Summary: Currently there is no Java API to get the underlying OS native VM
>>>> page size unless using Unsafe, which is not recommended. The newly added
>>>> WhiteBox method reads this value so that it can be used in tests.
>>>>
>>>>
>>>> Tests: JPRT and  jtreg.
>>>>
>>>> Thanks
>>>> Yumin

From david.holmes at oracle.com  Thu Nov 27 05:18:15 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Nov 2014 15:18:15 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <54767502.6010907@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
	<54767502.6010907@oracle.com>
Message-ID: <5476B417.9030008@oracle.com>

On 27/11/2014 10:49 AM, David Holmes wrote:
> On 26/11/2014 11:33 PM, Thomas Stüfe wrote:
>> Hi David,
>>
>> here you go:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/
>>
>> Reverted SIGILL-generating function back to its original form, plus the
>> folding of the 000 case.
>
> Thanks Thomas! While we are awaiting a second reviewer I will test this
> out internally. It may take a day or two sorry.

Unfortunately, on ARM (emulator), the SIGILL test generated a SEGV instead:

will jump to PC 0xb6fb1000, which should cause a SIGILL.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xb6fb2000, pid=13095, tid=3060999280

If I read the ARM architecture manual correctly, all zeroes will map to a
conditional AND instruction (Ref A8.6.12 AND (register)).
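
To make the encoding concrete, here is a rough C++ sketch that decodes just
enough of a 32-bit ARM (A32) instruction word to show why an all-zeroes word
is not illegal. The function name describe_a32 is made up for this note. The
bit layouts are from my reading of the ARMv7-A manual (AND (register), and the
permanently undefined UDF instruction, encoding A1) and should be checked
against the manual before being relied on. Shift fields are ignored for
brevity. The reported pc (0xb6fb2000) is exactly 0x1000 past the jump target
(0xb6fb1000), which would be consistent with the CPU executing a page of such
instructions and then running off its end, though that is an inference from
the log above rather than something stated in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Condition suffixes for bits 31..28 of an A32 instruction (0b1110 = always).
static const char* const kCondSuffix[15] = {
  "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
  "hi", "ls", "ge", "lt", "gt", "le", ""
};

// Returns a mnemonic for the two instruction forms of interest here,
// or "other" for everything else.
std::string describe_a32(std::uint32_t insn) {
  // UDF (encoding A1): 1110 0111 1111 imm12 1111 imm4
  if ((insn & 0xFFF000F0u) == 0xE7F000F0u) {
    std::uint32_t imm16 = ((insn >> 4) & 0xFFF0u) | (insn & 0xFu);
    return "udf #" + std::to_string(imm16);
  }

  std::uint32_t cond = insn >> 28;
  // AND (register, immediate shift): cond 0000 000 S Rn Rd imm5 type 0 Rm
  if (cond != 0xFu && (insn & 0x0FE00010u) == 0) {
    std::uint32_t s  = (insn >> 20) & 0x1u;
    std::uint32_t rn = (insn >> 16) & 0xFu;
    std::uint32_t rd = (insn >> 12) & 0xFu;
    std::uint32_t rm = insn & 0xFu;
    std::string text = "and";
    if (s != 0) text += "s";
    text += kCondSuffix[cond];
    text += " r" + std::to_string(rd) + ", r" + std::to_string(rn) +
            ", r" + std::to_string(rm);
    return text;  // shift amount and type (bits 11..5) are not printed
  }
  return "other";
}
```

Under these assumptions describe_a32(0x00000000) yields "andeq r0, r0, r0",
a valid instruction, whereas a word such as 0xE7F000F0 decodes as "udf #0".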

David


From thomas.stuefe at gmail.com  Thu Nov 27 07:36:44 2014
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Thu, 27 Nov 2014 08:36:44 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5476B417.9030008@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
	<54759DFE.7020300@oracle.com>
	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
	<5475C15E.30207@oracle.com>
	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
	<54767502.6010907@oracle.com> <5476B417.9030008@oracle.com>
Message-ID: <CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>

Unfortunately, I cannot test it, as I have no ARM environment. The best I
can come up with without testing is this:
http://stackoverflow.com/questions/16081618/programmatically-cause-undefined-instruction-exception

Kind regards, Thomas

On Thu, Nov 27, 2014 at 6:18 AM, David Holmes <david.holmes at oracle.com>
wrote:

> On 27/11/2014 10:49 AM, David Holmes wrote:
>
>> On 26/11/2014 11:33 PM, Thomas Stüfe wrote:
>>
>>> Hi David,
>>>
>>> here you go:
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/
>>>
>>> Reverted SIGILL-generating function back to its original form, plus the
>>> folding of the 000 case.
>>>
>>
>> Thanks Thomas! While we are awaiting a second reviewer I will test this
>> out internally. It may take a day or two sorry.
>>
>
> Unfortunately, on ARM (emulator), the SIGILL test generated a SEGV instead:
>
> will jump to PC 0xb6fb1000, which should cause a SIGILL.
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0xb6fb2000, pid=13095, tid=3060999280
>
> If I read the ARM architecture manual correctly all zeroes will map to a
> conditional AND instruction (Ref A8.6.12 AND(register))
>
> David
>
>
>  David
>>
>>  I only can guess what your closed platforms are, but if it is ARM, I
>>> believe opcodes 0-31 are undefined. For ia64, 0 is undefined as well.
>>>
>>> Kind regards, Thomas
>>>
>>>
>>> On Wed, Nov 26, 2014 at 1:02 PM, David Holmes <david.holmes at oracle.com
>>> <mailto:david.holmes at oracle.com>> wrote:
>>>
>>>     On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
>>>
>>>         Hi David,
>>>         ...
>>>
>>>                       - In debug.cpp for the SIGILL can you define the
>>>         all zero
>>>                  case as a
>>>                       default so we only need to add platform specific
>>>                  definitions when
>>>                       all zeroes doesn't work. I really hate seeing all
>>>         that CPU
>>>                  selection
>>>                       in shared code. :(
>>>
>>>
>>>                  Agreed and fixed, moved the CPU-specific sections into
>>>                  CPU-specific files.
>>>
>>>
>>>              I'd really like to see a way to share the all-zeroes case
>>>         so that we
>>>              don't need to add platform specific code unnecessarily.
>>>
>>>
>>>         sooo.. back to the original code then, just with the #ifdef,
>>>         just with
>>>         the zero-cases all folded in into the #else path? Or do you
>>> prefer
>>>         something else?
>>>
>>>
>>>     Elsewhere there is a pattern of defining per-platform values that
>>>     can override the shared definition. eg:
>>>
>>>     #ifndef HAS_SPECIAL_PLATFORM_VALUE___FOR_XXXX
>>>        Foo XXX = ...;  // shared/default initialization
>>>     #endif
>>>
>>>     but this assumes a platform specific header has already been
>>>     included that can do:
>>>
>>>     #define HAS_SPECIAL_PLATFORM_VALUE___FOR_XXXX
>>>     Foo XXX = ... ; // platform specific initialization
>>>
>>>     But that is not the case for debug.hpp.
>>>
>>>     So I guess folding the zero-case into the else path is the best we
>>>     can do. However I'm assuming the zero case will work for our
>>>     internal platforms ... if it doesn't then we'd have to pollute the
>>>     shared code with info for the closed platforms. :(
>>>
>>>     David
>>>     -----
>>>
>>>
>>>                       - Style nit: please use i++ rather than i ++
>>>
>>>
>>>                  Fixed.
>>>
>>>                       Aside: we should eradicate the use of
>>> sigprocmask and
>>>                  replace with
>>>                       the thread specific version.
>>>
>>>
>>>                  Agree. Though I never saw any errors stemming from the
>>>         use of
>>>                  sigprocmask(). According to POSIX, sigprocmask() is
>>>         undefined in
>>>                  multithreaded environment, and I guess most OSes just
>>>         default to
>>>                  pthread_sigmask.
>>>
>>>
>>>              Yes "probably" works okay but I hate to see us using
>>>         something with
>>>              undefined semantics. That's future clean up though.
>>>
>>>
>>>         We (SAP JVM) already use pthread_sigmask() / thr_sigsetmask()
>>>         instead of
>>>         sigprocmask. Works fine. We can port this to the OpenJDK.
>>>
>>>                       Getting back to the "thinking more about this" ...
>>>         If a
>>>                  synchronous
>>>                       signal is blocked at the time it is generated
>>> then it
>>>                  should remain
>>>                       pending on the thread (POSIX spec) but that
>>>         doesn't tell us
>>>                  what the
>>>                       thread will then do - retry the faulting
>>>         instruction? Become
>>>                       unschedulable? So I can easily imagine that a hang
>>>         or process
>>>                       termination may result.
>>>
>>>
>>>                  This is exactly what happens, but it is actually
>>> covered by
>>>                  POSIX, see
>>>                  doc on pthread_sigmask: "If any of the SIGFPE, SIGILL,
>>>         SIGSEGV, or
>>>                  SIGBUS signals are generated while they are blocked,
>>>         the result is
>>>                  undefined, unless the signal was generated by the
>>>                  /kill/()
>>>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>>>                  function, the /sigqueue/()
>>>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>>>                  function, or the /raise/()
>>>                  <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>>>                  function."
>>>
>>>
>>>              Thanks - I managed to miss that part even though I found
>>>         the other
>>>              part about the signal handling function returning. :(
>>>
>>>
>>>         It is well hidden, I found it by accident :) To me it looks like
>>>         they
>>>         kept it intentionally vague, to not block platforms where those
>>>         signals
>>>         could be somehow dealt with automatically? Hard to see though
>>>         how this
>>>         would work.
>>>
>>>
>>>
>>>                  In reality, process usually aborts abnormally with the
>>>         default
>>>                  action
>>>                  for the signal, e.g. printing out "Illegal
>>> Instruction". On
>>>                  MacOS, we
>>>                  hang (until the Watcherthread finally kills the VM).
>>> On old
>>>                  AIXes, we
>>>                  die without a trace.
>>>
>>>                  This also can be easily tried out by removing SIGILL
>>>         from the
>>>                  list of
>>>                  signals in vmError_<os>.cpp and executing:
>>>
>>>                  java -XX:ErrorHandlerTest=14
>>> -XX:TestCrashInErrorHandler=15
>>>
>>>                  which will crash first with a SIGSEGV, then in error
>>>         handling with a
>>>                  secondary SIGILL. This will interrupt error reporting
>>>         and kill
>>>                  or hang
>>>                  the process.
>>>
>>>
>>>                       In that sense unblocking those signals whilst
>>>         handling the
>>>                  initial
>>>                       signal may well allow the error reporting process
>>>         to continue
>>>                       further. But I'm unclear exactly how this plays
>>> out:
>>>
>>>                       - synchronous signal encountered
>>>                       - crash_handler invoked
>>>
>>>                       - VMError::report_and_die executes
>>>                       - secondary signal encountered
>>>
>>>                       - crash_handler invoked again
>>>
>>>
>>>                  almost: not again, different signal handler now. First
>>>         signal was
>>>                  handled by "JVM_handle_<os>_signal()"
>>>
>>>
>>>              Ah missed that - thanks - not that it makes much
>>> difference :)
>>>
>>>
>>>         I just like nitpicking :)
>>>
>>>                       - VMError::report_and_die executes again and
>>> sees the
>>>                  recursion and
>>>                       returns (ignoring abort due to excessive recursive
>>>         errors)
>>>
>>>
>>>                  No..
>>>
>>>                       Is that right? So we actually return from the
>>>         crash_handler?
>>>
>>>
>>>                  Oh, but we don't return. VMError::report_and_die()
>>> will just
>>>                  create a new
>>>                  frame and re-execute VMError::report() of the first
>>>         VMError object.
>>>                  Which then will continue with the next STEP. We never
>>>         return,
>>>                  for each
>>>                  secondary error signal a new frame is created.
>>>
>>>
>>>              I had trouble tracing through exactly what might happen
>>>              on the recursive call to report_and_die. I see now that
>>>              report comes from:
>>>
>>>                   staticBufferStream sbs(buffer, O_BUFLEN, &log);
>>>                   first_error->report(&sbs);
>>>                   first_error->_current_step = 0;         // reset current_step
>>>                   first_error->_current_step_info = "";   // reset current_step string
>>>
>>>              so the second time through we will call report and
>>>              _current_step should indicate where to start executing from.
>>>
>>>
>>>         Exactly. There is also a catch, in that the stack usage goes
>>>         up. Not endlessly, it is limited by the number of error
>>>         reporting steps. The more stack VMError::report() costs, the
>>>         less well this works, especially in stack overflow scenarios.
>>>
>>>         Which is why we extended SafeFetch and enabled it for use in
>>>         the error handler, which will be one of the next patches I'd
>>>         like to port to the OpenJDK, once this one is through.
>>>
>>>
>>>                  This all happens in VMError::report_and_die:
>>>                  -> first error ? anchor VMError object in a static
>>>         variable and
>>>                  execute
>>>                  VMError::report()
>>>                  -> secondary error?
>>>                       -> different thread? just sleep forever
>>>                       -> same thread? new frame, re-enter
>>>         VMError::report(). Once
>>>                  done, abort.
>>>
>>>                  I always found that rather neat, but in fact that is
>>>         not our
>>>                  invention
>>>                  but Sun's :) Anyway, my fix does not change this
>>>         behaviour for
>>>                  better or
>>>                  worse, it only makes it usable for more cases.
>>>
>>>                       Because this puts us in undefined territory
>>>         according to POSIX:
>>>
>>>                       "The behavior of a process is undefined after it
>>>         returns
>>>                  normally
>>>                       from a signal-catching function for a SIGBUS,
>>> SIGFPE,
>>>                  SIGILL, or
>>>                       SIGSEGV signal that was not generated by kill(),
>>>         sigqueue(), or
>>>                       raise()."
>>>
>>>                  True, but we don't return...
>>>
>>>                       On top of that you also have the issue that error
>>>         reporting
>>>                  does a
>>>                       whole bunch of things that are not
>>>         async-signal-safe so we can
>>>                       easily encounter hangs or aborts.
>>>
>>>                       But we're dying anyway so I guess none of this
>>> really
>>>                  matters. If
>>>                       re-enabling these signals allows error reporting to
>>>                  progress further
>>>                       in some cases then that is a win.
>>>
>>>
>>>                  Actually, this covers a lot of cases, mostly because
>>>         SIGSEGV during
>>>                  error reporting is common, so if the original error
>>> was not
>>>                  SIGSEGV, but
>>>                  e.g. SIGILL, this would always result in broken hs-err
>>>         files.
>>>
>>>                  The back story is that at SAP, we rely heavily on the
>>>         hs-err
>>>                  files. They
>>>                  are our main tool for support, because working with
>>>         cores is
>>>                  often not
>>>                  feasible. So, we put a lot of work in making error
>>>         reporting
>>>                  reliable
>>>                  across all platforms. This is also covered by many
>>>         tests which
>>>                  crash the
>>>                  VM in exciting ways and check the hs-err files for
>>>         completeness.
>>>
>>>
>>>              OK. Modulo the cpu specific SIGILL part everything else
>>>         seems fine.
>>>
>>>         Great. just tell me how you want that part.
>>>
>>>         Kind regards, Thomas
>>>
>>>              Thanks,
>>>              David
>>>
>>>
>>>

From david.holmes at oracle.com  Thu Nov 27 09:01:05 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Nov 2014 19:01:05 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>	<54767502.6010907@oracle.com>	<5476B417.9030008@oracle.com>
	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>
Message-ID: <5476E851.8050802@oracle.com>

On 27/11/2014 5:36 PM, Thomas Stüfe wrote:
> Unfortunately, I cannot test it, as I have no ARM environment. The best
> I can come up with without testing is this:
> http://stackoverflow.com/questions/16081618/programmatically-cause-undefined-instruction-exception

The issue is how to handle this? Put ifdefs for ARM in the open code? 
Revert to your per-platform solution? Some other variation? Or do we 
just not care if we can't trigger SIGILL on ARM? Though I'd like to hear 
from the AARCH64 folk too.

David

> Kind regards, Thomas
>
> On Thu, Nov 27, 2014 at 6:18 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     On 27/11/2014 10:49 AM, David Holmes wrote:
>
>         On 26/11/2014 11:33 PM, Thomas Stüfe wrote:
>
>             Hi David,
>
>             here you go:
>             http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/
>
>             Reverted SIGILL-generating function back to its original
>             form, plus the
>             folding of the 000 case.
>
>
>         Thanks Thomas! While we are awaiting a second reviewer I will
>         test this
>         out internally. It may take a day or two sorry.
>
>
>     Unfortunately, on ARM (emulator), the SIGILL test generated a SEGV
>     instead:
>
>     will jump to PC 0xb6fb1000, which should cause a SIGILL.
>     #
>     # A fatal error has been detected by the Java Runtime Environment:
>     #
>     #  SIGSEGV (0xb) at pc=0xb6fb2000, pid=13095, tid=3060999280
>
>     If I read the ARM architecture manual correctly all zeroes will map
>     to a conditional AND instruction (Ref A8.6.12 AND(register))
>
>     David
>
>
>         David
>
>             I only can guess what your closed platforms are, but if it
>             is ARM, I
>             believe opcodes 0-31 are undefined. For ia64, 0 is undefined
>             as well.
>
>             Kind regards, Thomas
>
>
>             On Wed, Nov 26, 2014 at 1:02 PM, David Holmes
>             <david.holmes at oracle.com> wrote:
>
>                  On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
>
>                      Hi David,
>                      ...
>
>                                    - In debug.cpp for the SIGILL can you
>             define the
>                      all zero
>                               case as a
>                                    default so we only need to add
>             platform specific
>                               definitions when
>                                    all zeroes doesn't work. I really
>             hate seeing all
>                      that CPU
>                               selection
>                                    in shared code. :(
>
>
>                               Agreed and fixed, moved the CPU-specific
>             sections into
>                               CPU-specific files.
>
>
>                           I'd really like to see a way to share the
>             all-zeroes case
>                      so that we
>                           don't need to add platform specific code
>             unnecessarily.
>
>
>                      sooo.. back to the original code then, just with
>             the #ifdef,
>                      just with
>                      the zero-cases all folded in into the #else path?
>             Or do you
>             prefer
>                      something else?
>
>
>                  Elsewhere there is a pattern of defining per-platform
>             values that
>                  can override the shared definition. eg:
>
>                  #ifndef HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
>                     Foo XXX = ...;  // shared/default initialization
>                  #endif
>
>                  but this assumes a platform specific header has already
>                  been included that can do:
>
>                  #define HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
>                  Foo XXX = ... ; // platform specific initialization
>
>                  But that is not the case for debug.hpp.
>
>                  So I guess folding the zero-case into the else path is
>             the best we
>                  can do. However I'm assuming the zero case will work
>             for our
>                  internal platforms ... if it doesn't then we'd have to
>             pollute the
>                  shared code with info for the closed platforms. :(
>
>                  David
>                  -----
>
>
>                                    - Style nit: please use i++ rather
>             than i ++
>
>
>                               Fixed.
>
>                                    Aside: we should eradicate the use of
>             sigprocmask and
>                               replace with
>                                    the thread specific version.
>
>
>                               Agree. Though I never saw any errors
>             stemming from the
>                      use of
>                               sigprocmask(). According to POSIX,
>             sigprocmask() is
>                      undefined in
>                               multithreaded environment, and I guess
>             most OSes just
>                      default to
>                               pthread_sigmask.
>
>
>                           Yes "probably" works okay but I hate to see us
>             using
>                      something with
>                           undefined semantics. That's future clean up
>             though.
>
>
>                      We (SAP JVM) already use pthread_sigmask() /
>             thr_sigsetmask()
>                      instead of
>                      sigprocmask. Works fine. We can port this to the
>             OpenJDK.
>
>                                    Getting back to the "thinking more
>             about this" ...
>                      If a
>                               synchronous
>                                    signal is blocked at the time it is
>             generated
>             then it
>                               should remain
>                                    pending on the thread (POSIX spec)
>             but that
>                      doesn't tell us
>                               what the
>                                    thread will then do - retry the faulting
>                      instruction? Become
>                                    unschedulable? So I can easily
>             imagine that a hang
>                      or process
>                                    termination may result.
>
>
>                               This is exactly what happens, but it is
>             actually
>             covered by
>                               POSIX, see
>                               doc on pthread_sigmask: "If any of the
>             SIGFPE, SIGILL,
>                      SIGSEGV, or
>                               SIGBUS signals are generated while they
>             are blocked,
>                      the result is
>                               undefined, unless the signal was generated
>                               by the /kill/()
>                               <http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html>
>                               function, the /sigqueue/()
>                               <http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html>
>                               function, or the /raise/()
>                               <http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html>
>                               function."
>
>
>                           Thanks - I managed to miss that part even
>             though I found
>                      the other
>                           part about the signal handling function
>             returning. :(
>
>
>                      It is well hidden, I found it by accident :) To me
>             it looks like
>                      they
>                      kept it intentionally vague, to not block platforms
>             where those
>                      signals
>                      could be somehow dealt with automatically? Hard to
>             see though
>                      how this
>                      would work.
>
>
>
>                               In reality, process usually aborts
>             abnormally with the
>                      default
>                               action
>                               for the signal, e.g. printing out "Illegal
>             Instruction". On
>                               MacOS, we
>                               hang (until the Watcherthread finally
>             kills the VM).
>             On old
>                               AIXes, we
>                               die without a trace.
>
>                               This also can be easily tried out by
>             removing SIGILL
>                      from the
>                               list of
>                               signals in vmError_<os>.cpp and executing:
>
>                               java -XX:ErrorHandlerTest=14
>             -XX:TestCrashInErrorHandler=15
>
>                               which will crash first with a SIGSEGV,
>             then in error
>                      handling with a
>                               secondary SIGILL. This will interrupt
>             error reporting
>                      and kill
>                               or hang
>                               the process.
>
>
>                                    In that sense unblocking those
>             signals whilst
>                      handling the
>                               initial
>                                    signal may well allow the error
>             reporting process
>                      to continue
>                                    further. But I'm unclear exactly how
>             this plays
>             out:
>
>                                    - synchronous signal encountered
>                                    - crash_handler invoked
>
>                                    - VMError::report_and_die executes
>                                    - secondary signal encountered
>
>                                    - crash_handler invoked again
>
>
>                               almost: not again, different signal
>             handler now. First
>                      signal was
>                               handled by "JVM_handle_<os>_signal()"
>
>
>                           Ah missed that - thanks - not that it makes much
>             difference :)
>
>
>                      I just like nitpicking :)
>
>                                    - VMError::report_and_die executes
>             again and
>             sees the
>                               recursion and
>                                    returns (ignoring abort due to
>             excessive recursive
>                      errors)
>
>
>                               No..
>
>                                    Is that right? So we actually return
>             from the
>                      crash_handler?
>
>
>                               Oh, but we don't return.
>                               VMError::report_and_die() will just
>                               create a new
>                               frame and re-execute VMError::report() of
>             the first
>                      VMError object.
>                               Which then will continue with the next
>             STEP. We never
>                      return,
>                               for each
>                               secondary error signal a new frame is created.
>
>
>                           I had trouble tracing through exactly what
>                           might happen on the recursive call to
>                           report_and_die. I see now that report comes from:
>
>                                staticBufferStream sbs(buffer, O_BUFLEN, &log);
>                                first_error->report(&sbs);
>                                first_error->_current_step = 0;         // reset current_step
>                                first_error->_current_step_info = "";   // reset current_step string
>
>                           so the second time through we will call report and
>                           _current_step should indicate where to start
>                           executing from.
>
>
>                      Exactly. There is also a catch, in that the stack
>                      usage goes up. Not endlessly, it is limited by the
>                      number of error reporting steps. The more stack
>                      VMError::report() costs, the less well this works,
>                      especially in stack overflow scenarios.
>
>                      Which is why we extended SafeFetch and enabled it
>                      for use in the error handler, which will be one of
>                      the next patches I'd like to port to the OpenJDK,
>                      once this one is through.
>
>
>                               This all happens in VMError::report_and_die:
>                               -> first error ? anchor VMError object in
>             a static
>                      variable and
>                               execute
>                               VMError::report()
>                               -> secondary error?
>                                    -> different thread? just sleep forever
>                                    -> same thread? new frame, re-enter
>                      VMError::report(). Once
>                               done, abort.
>
>                               I always found that rather neat, but in
>             fact that is
>                      not our
>                               invention
>                               but Sun's :) Anyway, my fix does not
>             change this
>                      behaviour for
>                               better or
>                               worse, it only makes it usable for more cases.
>
>                                    Because this puts us in undefined
>             territory
>                      according to POSIX:
>
>                                    "The behavior of a process is
>             undefined after it
>                      returns
>                               normally
>                                    from a signal-catching function for a
>             SIGBUS,
>             SIGFPE,
>                               SIGILL, or
>                                    SIGSEGV signal that was not generated
>             by kill(),
>                      sigqueue(), or
>                                    raise()."
>
>                               True, but we don't return...
>
>                                    On top of that you also have the
>             issue that error
>                      reporting
>                               does a
>                                    whole bunch of things that are not
>                      async-signal-safe so we can
>                                    easily encounter hangs or aborts.
>
>                                    But we're dying anyway so I guess
>             none of this
>             really
>                               matters. If
>                                    re-enabling these signals allows
>             error reporting to
>                               progress further
>                                    in some cases then that is a win.
>
>
>                               Actually, this covers a lot of cases,
>             mostly because
>                      SIGSEGV during
>                               error reporting is common, so if the
>             original error
>             was not
>                               SIGSEGV, but
>                               e.g. SIGILL, this would always result in
>             broken hs-err
>                      files.
>
>                               The back story is that at SAP, we rely
>             heavily on the
>                      hs-err
>                               files. They
>                               are our main tool for support, because
>             working with
>                      cores is
>                               often not
>                               feasible. So, we put a lot of work in
>             making error
>                      reporting
>                               reliable
>                               across all platforms. This is also covered
>             by many
>                      tests which
>                               crash the
>                               VM in exciting ways and check the hs-err
>             files for
>                      completeness.
>
>
>                           OK. Modulo the cpu specific SIGILL part
>             everything else
>                      seems fine.
>
>                      Great. just tell me how you want that part.
>
>                      Kind regards, Thomas
>
>                           Thanks,
>                           David
>
>
>

From thomas.stuefe at gmail.com  Thu Nov 27 09:27:02 2014
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 27 Nov 2014 10:27:02 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5476E851.8050802@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
	<54759DFE.7020300@oracle.com>
	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
	<5475C15E.30207@oracle.com>
	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
	<54767502.6010907@oracle.com> <5476B417.9030008@oracle.com>
	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>
	<5476E851.8050802@oracle.com>
Message-ID: <CAA-vtUyPA-tddcnhOTq3VLTZVZJevuKZFtqGpsU7=NRjJqQCGA@mail.gmail.com>

I am preparing a jtreg test which would fail if no SIGILL is produced. A
real SIGILL is needed to make the test meaningful, although I guess a fake
SIGILL (kill() or raise()) would make the test pass too, which could be a
workaround for the time being.
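
For illustration, a "fake" SIGILL of the kind mentioned above can be sketched
in plain C as follows. This is only a minimal sketch, not the code from the
webrev: the names record_signal and raise_fake_sigill are made up for this
example, and it assumes a POSIX system where SIGILL is not blocked in the
calling thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper state for this sketch only. */
static volatile sig_atomic_t g_last_signal = 0;

/* Handler: only records which signal arrived (async-signal-safe). */
static void record_signal(int sig) {
    g_last_signal = sig;
}

/* Installs a temporary SIGILL handler, raises a software-generated SIGILL,
 * restores the previous disposition, and returns the signal number the
 * handler saw (or -1 if the handler could not be installed). */
int raise_fake_sigill(void) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = record_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0) {
        return -1;
    }
    g_last_signal = 0;
    raise(SIGILL);                 /* not a real illegal instruction */
    sigaction(SIGILL, &old, NULL); /* put the old disposition back */
    return (int)g_last_signal;
}
```

Because the signal comes from raise(), no illegal instruction is actually
executed, so this exercises the handler path but not the CPU trap itself.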

I could live with either #ifdef in shared code - shared code already
contains lots of #ifdef ARM - or with cpu-specific files; I also could add
the debug_<cpu>.hpp files needed for your solution.
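
To make the #ifdef variant concrete, here is a rough sketch of folding the
all-zero case into the #else path. It is not the webrev code: the macro
DEMO_CPU_HAS_OWN_PATTERN, the array illegal_pattern and the function
fill_illegal_pattern are hypothetical names, and the non-zero bytes are a
placeholder rather than a claim about any real opcode.

```c
#include <stddef.h>
#include <string.h>

/* DEMO_CPU_HAS_OWN_PATTERN is a hypothetical macro standing in for a
 * CPU-specific #ifdef; the bytes under it are placeholders only. */
#if defined(DEMO_CPU_HAS_OWN_PATTERN)
static const unsigned char illegal_pattern[4] = { 0xff, 0xff, 0xff, 0xff };
#else
/* Shared default: all zeroes, used wherever no special case is defined. */
static const unsigned char illegal_pattern[4] = { 0, 0, 0, 0 };
#endif

/* Fills buf with as many whole copies of the pattern as fit into len bytes
 * and returns the number of bytes written. */
size_t fill_illegal_pattern(unsigned char *buf, size_t len) {
    size_t written = 0;
    while (written + sizeof(illegal_pattern) <= len) {
        memcpy(buf + written, illegal_pattern, sizeof(illegal_pattern));
        written += sizeof(illegal_pattern);
    }
    return written;
}
```

Only platforms where all zeroes do not trap would need their own branch,
which is the case David reported for the ARM emulator.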

Kind Regards, Thomas

On Thu, Nov 27, 2014 at 10:01 AM, David Holmes <david.holmes at oracle.com>
wrote:

> On 27/11/2014 5:36 PM, Thomas Stüfe wrote:
>
>> Unfortunately, I cannot test it, as I have no ARM environment. The best
>> I can come up with without testing is this:
>> http://stackoverflow.com/questions/16081618/programmatically-cause-undefined-instruction-exception
>>
>
> The issue is how to handle this? Put ifdefs for ARM in the open code?
> Revert to your per-platform solution? Some other variation? Or do we just
> not care if we can't trigger SIGILL on ARM? Though I'd like to hear from
> the AARCH64 folk too.
>
> David
>
>  Kind regards, Thomas
>>
>> On Thu, Nov 27, 2014 at 6:18 AM, David Holmes
>> <david.holmes at oracle.com> wrote:
>>
>>     On 27/11/2014 10:49 AM, David Holmes wrote:
>>
>>         On 26/11/2014 11:33 PM, Thomas Stüfe wrote:
>>
>>             Hi David,
>>
>>             here you go:
>>             http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.02/
>>
>>             Reverted SIGILL-generating function back to its original
>>             form, plus the
>>             folding of the 000 case.
>>
>>
>>         Thanks Thomas! While we are awaiting a second reviewer I will
>>         test this
>>         out internally. It may take a day or two sorry.
>>
>>
>>     Unfortunately, on ARM (emulator), the SIGILL test generated a SEGV
>>     instead:
>>
>>     will jump to PC 0xb6fb1000, which should cause a SIGILL.
>>     #
>>     # A fatal error has been detected by the Java Runtime Environment:
>>     #
>>     #  SIGSEGV (0xb) at pc=0xb6fb2000, pid=13095, tid=3060999280
>>
>>     If I read the ARM architecture manual correctly all zeroes will map
>>     to a conditional AND instruction (Ref A8.6.12 AND(register))
>>
>>     David
>>
>>
>>         David
>>
>>             I only can guess what your closed platforms are, but if it
>>             is ARM, I
>>             believe opcodes 0-31 are undefined. For ia64, 0 is undefined
>>             as well.
>>
>>             Kind regards, Thomas
>>
>>
>>             On Wed, Nov 26, 2014 at 1:02 PM, David Holmes
>>             <david.holmes at oracle.com> wrote:
>>
>>                  On 26/11/2014 9:37 PM, Thomas Stüfe wrote:
>>
>>                      Hi David,
>>                      ...
>>
>>                                    - In debug.cpp for the SIGILL can you
>>             define the
>>                      all zero
>>                               case as a
>>                                    default so we only need to add
>>             platform specific
>>                               definitions when
>>                                    all zeroes doesn't work. I really
>>             hate seeing all
>>                      that CPU
>>                               selection
>>                                    in shared code. :(
>>
>>
>>                               Agreed and fixed, moved the CPU-specific
>>             sections into
>>                               CPU-specific files.
>>
>>
>>                           I'd really like to see a way to share the
>>             all-zeroes case
>>                      so that we
>>                           don't need to add platform specific code
>>             unnecessarily.
>>
>>
>>                      sooo.. back to the original code then, just with
>>             the #ifdef,
>>                      just with
>>                      the zero-cases all folded in into the #else path?
>>             Or do you
>>             prefer
>>                      something else?
>>
>>
>>                  Elsewhere there is a pattern of defining per-platform
>>                  values that can override the shared definition, e.g.:
>>
>>                  #ifndef HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
>>                     Foo XXX = ...;  // shared/default initialization
>>                  #endif
>>
>>                  but this assumes a platform specific header has already
>>                  been included that can do:
>>
>>                  #define HAS_SPECIAL_PLATFORM_VALUE_FOR_XXXX
>>
>>                  Foo XXX = ... ; // platform specific initialization
>>
>>                  But that is not the case for debug.hpp.
>>
>>                  So I guess folding the zero-case into the else path is
>>             the best we
>>                  can do. However I'm assuming the zero case will work
>>             for our
>>                  internal platforms ... if it doesn't then we'd have to
>>             pollute the
>>                  shared code with info for the closed platforms. :(
>>
>>                  David
>>                  -----
>>
>>
>>                                    - Style nit: please use i++ rather
>>             than i ++
>>
>>
>>                               Fixed.
>>
>>                                    Aside: we should eradicate the use of
>>             sigprocmask and
>>                               replace with
>>                                    the thread specific version.
>>
>>
>>                               Agree. Though I never saw any errors
>>             stemming from the
>>                      use of
>>                               sigprocmask(). According to POSIX,
>>             sigprocmask() is
>>                      undefined in
>>                               multithreaded environment, and I guess
>>             most OSes just
>>                      default to
>>                               pthread_sigmask.
>>
>>
>>                           Yes "probably" works okay but I hate to see us
>>             using
>>                      something with
>>                           undefined semantics. That's future clean up
>>             though.
>>
>>
>>                      We (SAP JVM) already use pthread_sigmask() /
>>             thr_sigsetmask()
>>                      instead of
>>                      sigprocmask. Works fine. We can port this to the
>>             OpenJDK.
>>
>>                                    Getting back to the "thinking more
>>             about this" ...
>>                      If a
>>                               synchronous
>>                                    signal is blocked at the time it is
>>             generated
>>             then it
>>                               should remain
>>                                    pending on the thread (POSIX spec)
>>             but that
>>                      doesn't tell us
>>                               what the
>>                                    thread will then do - retry the
>> faulting
>>                      instruction? Become
>>                                    unschedulable? So I can easily
>>             imagine that a hang
>>                      or process
>>                                    termination may result.
>>
>>
>>                               This is exactly what happens, but it is
>>                               actually covered by POSIX, see doc on
>>                               pthread_sigmask: "If any of the SIGFPE,
>>                               SIGILL, SIGSEGV, or SIGBUS signals are
>>                               generated while they are blocked, the
>>                               result is undefined, unless the signal was
>>                               generated by the kill() function, the
>>                               sigqueue() function, or the raise()
>>                               function."
>>
>>                               kill():
>>                               http://pubs.opengroup.org/onlinepubs/009695399/functions/kill.html
>>                               sigqueue():
>>                               http://pubs.opengroup.org/onlinepubs/009695399/functions/sigqueue.html
>>                               raise():
>>                               http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html
>>
>>
>>                           Thanks - I managed to miss that part even
>>             though I found
>>                      the other
>>                           part about the signal handling function
>>             returning. :(
>>
>>
>>                      It is well hidden, I found it by accident :) To me
>>             it looks like
>>                      they
>>                      kept it intentionally vague, to not block platforms
>>             where those
>>                      signals
>>                      could be somehow dealt with automatically? Hard to
>>             see though
>>                      how this
>>                      would work.
>>
>>
>>
>>                               In reality, process usually aborts
>>             abnormally with the
>>                      default
>>                               action
>>                               for the signal, e.g. printing out "Illegal
>>             Instruction". On
>>                               MacOS, we
>>                               hang (until the Watcherthread finally
>>             kills the VM).
>>             On old
>>                               AIXes, we
>>                               die without a trace.
>>
>>                               This also can be easily tried out by
>>             removing SIGILL
>>                      from the
>>                               list of
>>                               signals in vmError_<os>.cpp and executing:
>>
>>                               java -XX:ErrorHandlerTest=14
>>             -XX:TestCrashInErrorHandler=15
>>
>>                               which will crash first with a SIGSEGV,
>>             then in error
>>                      handling with a
>>                               secondary SIGILL. This will interrupt
>>             error reporting
>>                      and kill
>>                               or hang
>>                               the process.
>>
>>
>>                                    In that sense unblocking those
>>             signals whilst
>>                      handling the
>>                               initial
>>                                    signal may well allow the error
>>             reporting process
>>                      to continue
>>                                    further. But I'm unclear exactly how
>>             this plays
>>             out:
>>
>>                                    - synchronous signal encountered
>>                                    - crash_handler invoked
>>
>>                                    - VMError::report_and_die executes
>>                                    - secondary signal encountered
>>
>>                                    - crash_handler invoked again
>>
>>
>>                               almost: not again, different signal
>>             handler now. First
>>                      signal was
>>                               handled by "JVM_handle_<os>_signal()"
>>
>>
>>                           Ah missed that - thanks - not that it makes much
>>             difference :)
>>
>>
>>                      I just like nitpicking :)
>>
>>                                    - VMError::report_and_die executes
>>             again and
>>             sees the
>>                               recursion and
>>                                    returns (ignoring abort due to
>>             excessive recursive
>>                      errors)
>>
>>
>>                               No..
>>
>>                                    Is that right? So we actually return
>>             from the
>>                      crash_handler?
>>
>>
>>                               Oh, but we dont return.
>>             VMError::report_and_die()
>>             will just
>>                               create a new
>>                               frame and re-execute VMError::report() of
>>             the first
>>                      VMError object.
>>                               Which then will continue with the next
>>             STEP. We never
>>                      return,
>>                               for each
>>                               secondary error signal a new frame is
>> created.
>>
>>
>>                           I had trouble tracing through exactly what
>>                           might happen on the recursive call to
>>                           report_and_die. I see now that report comes
>>                           from:
>>
>>                                staticBufferStream sbs(buffer, O_BUFLEN, &log);
>>                                first_error->report(&sbs);
>>                                first_error->_current_step = 0;         // reset current_step
>>                                first_error->_current_step_info = "";   // reset current_step string
>>
>>                           so the second time through we will call report
>>                           and _current_step should indicate where to
>>                           start executing from.
>>
>>
>>                      Exactly. There is also a catch, in that the stack
>>             usage goes
>>             up. Not
>>                      endlessly, it is limited by the number of error
>>             reporting steps.
>>                      The more stack VmError::report() does cost, the
>>             less well this
>>                      works,
>>                      especially in stack overflow scenarios.
>>
>>                      Which is why we extended SafeFetch and enabled it
>>             for the use
>>             in the
>>                      error handler, which will be one of the next
>>             patches I'd
>>             like to
>>                      port to the OpenJDK, once this one is thru.
>>
>>
>>                               This all happens in VMError::report_and_die:
>>                               -> first error ? anchor VMError object in
>>             a static
>>                      variable and
>>                               execute
>>                               VMError::report()
>>                               -> secondary error?
>>                                    -> different thread? just sleep forever
>>                                    -> same thread? new frame, re-enter
>>                      VMError::report(). Once
>>                               done, abort.
>>
>>                               I always found that rather neat, but in
>>             fact that is
>>                      not our
>>                               invention
>>                               but Sun's :) Anyway, my fix does not
>>             change this
>>                      behaviour for
>>                               better or
>>                               worse, it only makes it usable for more
>> cases.
>>
>>                                    Because this puts us in undefined
>>             territory
>>                      according to POSIX:
>>
>>                                    "The behavior of a process is
>>             undefined after it
>>                      returns
>>                               normally
>>                                    from a signal-catching function for a
>>             SIGBUS,
>>             SIGFPE,
>>                               SIGILL, or
>>                                    SIGSEGV signal that was not generated
>>             by kill(),
>>                      sigqueue(), or
>>                                    raise()."
>>
>>                               true, but we dont return...
>>
>>                                    On top of that you also have the
>>             issue that error
>>                      reporting
>>                               does a
>>                                    whole bunch of things that are not
>>                      async-signal-safe so we can
>>                                    easily encounter hangs or aborts.
>>
>>                                    But we're dying anyway so I guess
>>             none of this
>>             really
>>                               matters. If
>>                                    re-enabling these signals allows
>>             error reporting to
>>                               progress further
>>                                    in some cases then that is a win.
>>
>>
>>                               Actually, this covers a lot of cases,
>>             mostly because
>>                      SIGSEGV during
>>                               error reporting is common, so if the
>>             original error
>>             was not
>>                               SIGSEGV, but
>>                               e.g. SIGILL, this would always result in
>>             broken hs-err
>>                      files.
>>
>>                               The back story is that at SAP, we rely
>>             heavily on the
>>                      hs-err
>>                               files. They
>>                               are our main tool for support, because
>>             working with
>>                      cores is
>>                               often not
>>                               feasible. So, we put a lot of work in
>>             making error
>>                      reporting
>>                               reliable
>>                               across all platforms. This is also covered
>>             by many
>>                      tests which
>>                               crash the
>>                               VM in exciting ways and check the hs-err
>>             files for
>>                      completeness.
>>
>>
>>                           OK. Modulo the cpu specific SIGILL part
>>             everything else
>>                      seems fine.
>>
>>                      Great. just tell me how you want that part.
>>
>>                      Kind regards, Thomas
>>
>>                           Thanks,
>>                           David
>>
>>
>>
>>

From aph at redhat.com  Thu Nov 27 09:45:15 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 27 Nov 2014 09:45:15 +0000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5476E851.8050802@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>	<54767502.6010907@oracle.com>	<5476B417.9030008@oracle.com>	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>
	<5476E851.8050802@oracle.com>
Message-ID: <5476F2AB.4050401@redhat.com>

On 11/27/2014 09:01 AM, David Holmes wrote:
> On 27/11/2014 5:36 PM, Thomas Stüfe wrote:
>> Unfortunately, I cannot test it, as I have no ARM environment. The best
>> I can come up with without testing is this:
>> http://stackoverflow.com/questions/16081618/programmatically-cause-undefined-instruction-exception
> 
> The issue is how to handle this? Put ifdefs for ARM in the open code? 
> Revert to your per-platform solution? Some other variation? Or do we 
> just not care if we can't trigger SIGILL on ARM? Though I'd like to hear 
> from the AARCH64 folk too.

I always use DCPS1 if I want an undefined instruction trap.  0x0100A0D4.

Andrew.


From thomas.stuefe at gmail.com  Thu Nov 27 10:38:49 2014
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Thu, 27 Nov 2014 11:38:49 +0100
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5476F2AB.4050401@redhat.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>
	<54752CF8.5070408@oracle.com>
	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>
	<54759DFE.7020300@oracle.com>
	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>
	<5475C15E.30207@oracle.com>
	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>
	<54767502.6010907@oracle.com> <5476B417.9030008@oracle.com>
	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>
	<5476E851.8050802@oracle.com> <5476F2AB.4050401@redhat.com>
Message-ID: <CAA-vtUwwxpooupUN85u=8X=qwx0N1MOMeD2R1RZnGkL5EQhehw@mail.gmail.com>

Hi Andrew, thank you! Does endianness matter?

On Thu, Nov 27, 2014 at 10:45 AM, Andrew Haley <aph at redhat.com> wrote:

> On 11/27/2014 09:01 AM, David Holmes wrote:
> > On 27/11/2014 5:36 PM, Thomas Stüfe wrote:
> >> Unfortunately, I cannot test it, as I have no ARM environment. The best
> >> I can come up with without testing is this:
> >>
> http://stackoverflow.com/questions/16081618/programmatically-cause-undefined-instruction-exception
> >
> > The issue is how to handle this? Put ifdefs for ARM in the open code?
> > Revert to your per-platform solution? Some other variation? Or do we
> > just not care if we can't trigger SIGILL on ARM? Though I'd like to hear
> > from the AARCH64 folk too.
>
> I always use DCPS1 if I want an undefined instruction trap.  0x0100A0D4.
>
> Andrew.
>
>
>

From aph at redhat.com  Thu Nov 27 10:55:10 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 27 Nov 2014 10:55:10 +0000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <CAA-vtUwwxpooupUN85u=8X=qwx0N1MOMeD2R1RZnGkL5EQhehw@mail.gmail.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>	<54767502.6010907@oracle.com>	<5476B417.9030008@oracle.com>	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>	<5476E851.8050802@oracle.com>	<5476F2AB.4050401@redhat.com>
	<CAA-vtUwwxpooupUN85u=8X=qwx0N1MOMeD2R1RZnGkL5EQhehw@mail.gmail.com>
Message-ID: <5477030E.6070605@redhat.com>

On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
> Hi Andrew, thank you! Does endianess matter ?

Yes.  I'd do it symbolically rather than mess with endian defines:

#ifdef AARCH64
  unsigned insn;
  asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
#endif
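
To illustrate why byte order matters when the instruction is written as a
raw constant, here is a small host-independent sketch. It assumes that the
A64 encoding of DCPS1 is the 32-bit word 0xD4A00001, stored little-endian
in memory, which would match the byte string 01 00 A0 D4 quoted earlier in
this thread; the helper names store_le32 and store_be32 are made up for
this example.

```c
#include <stdint.h>

/* Writes a 32-bit instruction word as four bytes, lowest byte first. */
void store_le32(uint32_t insn, unsigned char out[4]) {
    out[0] = (unsigned char)(insn & 0xff);
    out[1] = (unsigned char)((insn >> 8) & 0xff);
    out[2] = (unsigned char)((insn >> 16) & 0xff);
    out[3] = (unsigned char)((insn >> 24) & 0xff);
}

/* Writes the same word highest byte first, for comparison. */
void store_be32(uint32_t insn, unsigned char out[4]) {
    out[0] = (unsigned char)((insn >> 24) & 0xff);
    out[1] = (unsigned char)((insn >> 16) & 0xff);
    out[2] = (unsigned char)((insn >> 8) & 0xff);
    out[3] = (unsigned char)(insn & 0xff);
}
```

The same constant produces two different byte sequences depending on the
order chosen, which is the ambiguity the symbolic asm form avoids.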

Andrew.


From david.holmes at oracle.com  Thu Nov 27 11:00:06 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Nov 2014 21:00:06 +1000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <5477030E.6070605@redhat.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>	<54767502.6010907@oracle.com>	<5476B417.9030008@oracle.com>	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>	<5476E851.8050802@oracle.com>	<5476F2AB.4050401@redhat.com>
	<CAA-vtUwwxpooupUN85u=8X=qwx0N1MOMeD2R1RZnGkL5EQhehw@mail.gmail.com>
	<5477030E.6070605@redhat.com>
Message-ID: <54770436.8070705@oracle.com>

On 27/11/2014 8:55 PM, Andrew Haley wrote:
> On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
>> Hi Andrew, thank you! Does endianess matter ?
>
> Yes.  I'd do it symbolically rather than mess with endian defines:
>
> #ifdef AARCH64
>    unsigned insn;
>    asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
> #endif

Does that work for ARMv7?

Thanks,
David

> Andrew.
>

From aph at redhat.com  Thu Nov 27 11:04:22 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 27 Nov 2014 11:04:22 +0000
Subject: RFR(s): 8065895: Synchronous signals during error reporting may
	terminate or hang VM process
In-Reply-To: <54770436.8070705@oracle.com>
References: <CAA-vtUxzKUa5QSvREkugnAhRppNMGMVg13CU4ZRLfuiVB5i9DQ@mail.gmail.com>	<54752CF8.5070408@oracle.com>	<CAA-vtUyXryE_cayA5=bpsyCSWxV=_s5YK_srv-ZvXc6b5CpiaA@mail.gmail.com>	<54759DFE.7020300@oracle.com>	<CAA-vtUxajk_KVOwDzSzTEQRbZnXJ4d5+ZLeFx3N7PnYZ6-aPig@mail.gmail.com>	<5475C15E.30207@oracle.com>	<CAA-vtUx6xfjLD4w05eiLWnce-exB-Ryo6SnVR52Oi-DRRvkvew@mail.gmail.com>	<54767502.6010907@oracle.com>	<5476B417.9030008@oracle.com>	<CAA-vtUzWxgnbWmKH81UABTCb9czmXvAx0u+wK5CvAKzz0Dqp8A@mail.gmail.com>	<5476E851.8050802@oracle.com>	<5476F2AB.4050401@redhat.com>
	<CAA-vtUwwxpooupUN85u=8X=qwx0N1MOMeD2R1RZnGkL5EQhehw@mail.gmail.com>
	<5477030E.6070605@redhat.com> <54770436.8070705@oracle.com>
Message-ID: <54770536.5090101@redhat.com>

On 11/27/2014 11:00 AM, David Holmes wrote:
> On 27/11/2014 8:55 PM, Andrew Haley wrote:
>> On 11/27/2014 10:38 AM, Thomas Stüfe wrote:
>>> Hi Andrew, thank you! Does endianess matter ?
>>
>> Yes.  I'd do it symbolically rather than mess with endian defines:
>>
>> #ifdef AARCH64
>>    unsigned insn;
>>    asm("b 1f; 0: dcps1; 1: ldr %0, 0b" : "=r"(insn));
>> #endif
> 
> Does that work for ARMv7?

Sorry, I don't know what a good choice there would be.  And I must
warn you: DCPS1 isn't necessarily guaranteed to do this forever, but
it works on the kernels I've tried.

Andrew.


From michail.chernov at oracle.com  Thu Nov 27 13:28:49 2014
From: michail.chernov at oracle.com (Michail Chernov)
Date: Thu, 27 Nov 2014 16:28:49 +0300
Subject: RFR: 8064909: FragmentMetaspace.java got OutOfMemoryError
In-Reply-To: <54765B70.10509@oracle.com>
References: <5475D74A.2060907@oracle.com> <54762451.3070802@oracle.com>
	<54763D54.3070704@oracle.com> <54765B70.10509@oracle.com>
Message-ID: <54772711.6000003@oracle.com>

Hi,

CC'ed hotspot-runtime-dev.

This is not a test failure: the test works as expected. The OOME occurs
in the compiler instance.

private JavaCompiler javac;
...
         javac = ToolProvider.getSystemJavaCompiler();
...
         int exitcode = javac.run(null, null, null, 
file.getCanonicalPath());
         if (exitcode != 0) {
             throw new RuntimeException("javac failure when compiling: " +
         file.getCanonicalPath());

There are two options: rewrite getGeneratedClass
(runtime/testlibrary/GeneratedClassLoader.java) to allow it to throw
more than just RuntimeException, or catch RuntimeException and compare
the exception message with "javac failure when compiling:". Neither
option seems as clean to me as this simple test deserves. Moreover,
javac does not throw anything: it just returns a non-zero exit code
and writes its messages to System.err.

I can also add a comment to the code along the lines of: an OOME with
the message "java.lang.OutOfMemoryError: Java heap space" doesn't mean
that something is wrong with metaspace; -Xmx just needs to be increased.

Thanks,
Michail

On 27.11.2014 2:00, Jon Masamitsu wrote:
> Dima,
>
> If this test fails with an OOME in the future, I would like it to be
> obvious that the failure is not that an OOME occurred.   I cannot
> tell that from looking at the test.    Can the test be changed so
> I don't have to spend time figuring out that the OOME is not
> a failure mode  of the test?
>
> Jon
>
>
> On 11/26/2014 12:51 PM, Dmitry Fazunenko wrote:
>> Hi Jon,
>>
>> The original version of test worked for 80 seconds trying to perform 
>> as many iterations as possible. The number of iterations performed 
>> depended on how fast is the machine. With each next iteration the 
>> size of generated and loaded classes is growing, so on fast enough 
>> machines 80 seconds is enough to run out of heap while generating a 
>> class.
>>
>> The fix not only sets the heap, but limits iterations.  300m heap is 
>> enough for 200 iterations.
>>
>> Your approach, with catching OOME(heap) and passing will also work, 
>> but it will reduce the test readability (and potentially could bring 
>> more problems).
>>
>> An alternative approach would be to limit metaspace and heap 
>> accordingly and load classes until we don't run out metaspace... But 
>> this might take awhile.
>>
>> So, I hope that Michael's fix is good.
>>
>> Thanks for looking and expressing comments.
>> Dima
>>
>>
>>
>>
>> On 26.11.2014 22:04, Jon Masamitsu wrote:
>>> Michail,
>>>
>>> Your change makes this test pass but it seems like at
>>> some future date 300m might not be big enough
>>> (for whatever reason).  Could the test be make to
>>> caught an OOME, print out a message saying that
>>> an OOME doesn't mean the test failed  but that
>>> the test needs a larger heap?  Then pass an
>>> exception up (maybe some type of Runtime
>>> exception - sorry if that is vague but I don't
>>> what type of exception would make sense).  That
>>> would mean we wouldn't have to spend time
>>> diagnosing what the OOME means again.
>>>
>>> Jon
>>>
>>> On 11/26/2014 5:36 AM, Michail Chernov wrote:
>>>> Hi,
>>>>
>>>> Please review this simple fix for nightly test failure:
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eistepan/~mchernov/8064909/webrev.00/
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8064909
>>>>
>>>> Problem: test fails because of OOME (not enough heap size).
>>>> Solution: heap size was increased.
>>>>
>>>> Testing:
>>>> jtreg
>>>>
>>>> Thanks,
>>>> Michail
>>>
>>
>
>
>


From yasuenag at gmail.com  Sat Nov 29 15:44:30 2014
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Sun, 30 Nov 2014 00:44:30 +0900
Subject: RFR: JDK-8059586: hs_err report should treat redirected core
	pattern.
In-Reply-To: <CAA-vtUx5iVr6=CoywjdnyH4mkP1_zdpOceW0TMeYSTsND_PUDg@mail.gmail.com>
References: <542C8274.3010809@gmail.com>	<54338B70.9080709@oracle.com>	<CAGFVN2DjxGLomf6dzS5BVqnSKf-43XdEt897T7s0cV71C78r-Q@mail.gmail.com>	<543B1FD6.3000200@oracle.com>	<543CF553.80601@gmail.com>	<543DC2BF.9050407@oracle.com>	<543E80F8.3080204@gmail.com>	<547330E5.1050708@gmail.com>	<FE1302A7-A228-43E9-BCB6-74558268E296@oracle.com>	<CAGFVN2CPxXC9bhWvdq-wfmbyBGTZCS2gz5dq9jWf6dMCXGMYvg@mail.gmail.com>
	<CAA-vtUx5iVr6=CoywjdnyH4mkP1_zdpOceW0TMeYSTsND_PUDg@mail.gmail.com>
Message-ID: <5479E9DE.7070703@gmail.com>

Hi all,


Thank you for checking my patch!
I've uploaded a new webrev:
http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.03/hotspot.patch

David:
> The change in:
>   src/os/aix/vm/os_aix.cpp
>   src/os/solaris/vm/os_solaris.cpp
>
>    jio_snprintf(buffer, bufferSize, "%s/core or core.%d", current_process_id());
>
> has no argument for the %s - presumably p was intended.

I've fixed it.


Staffan:
> src/os/linux/vm/os_linux.cpp:
> Could we not simplify this to print a helpful message instead?

In most cases on Linux, I think the core image name is "core.<pid>".
In the other cases, except for pipe redirection, I guess the user has defined it.
Thus I print the string in kernel.core_pattern directly.

> src/os/bsd/vm/os_bsd.cpp:
> On OS X cores are by default written to /cores/core.<pid>. This is configurable with the kern.corefile sysctl variable, although it is rare to do so.

Thank you!
I changed the path to "/cores/core.<pid>".


Thomas:
> - jio_snprintf() returns -1 on truncation. n+=written may walk backwards. I would probably check for (written >= 0) and also, at the start of the loop, for (n < sizeof(core_path)).
> - code is used in error reporting. I would be hesitant to create larger buffers on the stack. malloc may be better.

I've fixed them.

> - code does not detect truncation of core_path (unlikely but possible)

Do you mean the variable name?
"core_path" in my patch stores the contents of /proc/sys/kernel/core_pattern.
The Linux kernel documentation defines the maximum length of kernel.core_pattern as 128 characters:
https://www.kernel.org/doc/Documentation/sysctl/kernel.txt

Thus the length of core_path (129 chars) is enough.
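
If truncation detection were still wanted, one possible approach is sketched
below. read_first_line() is a hypothetical helper, not something in the
patch. It is deliberately conservative: a line that exactly fills the buffer
is also reported as truncated.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: reads the first line of f into buf and strips the
// trailing newline. Returns false if nothing could be read. Sets *truncated
// when the line did not fit into buf (conservatively: also when the line
// exactly fills the buffer).
bool read_first_line(FILE* f, char* buf, size_t size, bool* truncated) {
  *truncated = false;
  if (f == nullptr || size == 0 ||
      fgets(buf, static_cast<int>(size), f) == nullptr) {
    return false;
  }
  size_t len = strlen(buf);
  if (len > 0 && buf[len - 1] == '\n') {
    buf[len - 1] = '\0';   // complete line: drop the newline
  } else if (!feof(f)) {
    *truncated = true;     // stopped before newline and before end of file
  }
  return true;
}
```

With a 129 byte buffer and the documented 128 character limit, the truncated
flag should never be set for /proc/sys/kernel/core_pattern.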

> - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets may be a tiny bit simpler.

I changed it to use fgetc().
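
A minimal sketch of the fgetc() approach, for readers without the webrev at
hand. The function names here are hypothetical and this is not the patch
itself. It assumes the file holds a single digit, '0' or '1', as its first
byte.

```cpp
#include <cstdio>

// Hypothetical helper: returns 1 if the first byte of f is '1', 0 for any
// other byte, and -1 if the stream is missing or empty.
int read_flag(FILE* f) {
  if (f == nullptr) return -1;
  int c = fgetc(f);
  if (c == EOF) return -1;
  return c == '1' ? 1 : 0;
}

// Hypothetical wrapper that applies read_flag() to the Linux proc file.
int core_uses_pid() {
  FILE* f = fopen("/proc/sys/kernel/core_uses_pid", "r");
  int result = read_flag(f);
  if (f != nullptr) fclose(f);
  return result;
}
```

Only one byte is needed, so no line buffer has to be declared at all, which
is the small simplification over fgets().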


Thanks,

Yasumasa


(2014/11/26 23:12), Thomas Stüfe wrote:
> Hi Yasumasa,
>
> I am not a Reviewer. Barring the general decision of the real reviewers, here are some thoughts:
>
> os_linux.cpp
>
> - jio_snprintf() returns -1 on truncation. n+=written may walk backwards. I would probably check for (written >= 0) and also, at the start of the loop, for (n < sizeof(core_path)).
> - code is used in error reporting. I would be hesitant to create larger buffers on the stack. malloc may be better.
> - code does not detect truncation of core_path (unlikely but possible)
>
> the rest is more matter of taste:
> - I would prefer sizeof(core_path) over PATH_MAX at all places where you refer to the size of the buffer. So you could make the buffer very small and test e.g. how your code behaves with truncation.
> - when reading /proc/sys/kernel/core_uses_pid, using fgetc instead of fgets may be a tiny bit simpler.
>
> Kind Regards, Thomas
>
>
>
> On Wed, Nov 26, 2014 at 4:54 AM, Yasumasa Suenaga <yasuenag at gmail.com> wrote:
>
>     Hi Staffan,
>
>     Thank you for reviewing!
>
>     os_linux.cpp:
>     I want to print coredump location correctly to hs_err. So I want to output
>     whether coredump is processed in other process or is written to file.
>     If os::get_core_path() should be simpler, I will print the raw string in
>     core_pattern.
>
>     os_bsd.cpp:
>     I don't have OS X. So I cannot check it.
>     I am focusing on Linux in this enhancement. Could you file it as another
>     enhancement if needed?
>
>     Thanks,
>
>     Yasumasa
>
>       2014/11/25 18:15 "Staffan Larsen" <staffan.larsen at oracle.com>:
>
>      > src/os/linux/vm/os_linux.cpp:
>      > I'm inclined to think this is too complicated and hard to test and
>      > maintain (and I see no tests in the webrev). Could we not simplify this to
>      > print a helpful message instead? Something that prints the core_pattern and
>      > perhaps some of the values that could be used for substitution, but does
>      > not do the actual substitution? I think that would go a long way but be a
>      > lot more maintainable.
>      >
>      > src/os/bsd/vm/os_bsd.cpp:
>      > On OS X cores are by default written to /cores/core.<pid>. This is
>      > configurable with the kern.corefile sysctl variable, although it is rare
>      > to do so.
>      >
>      >  /Staffan
>      >
>      > > On 24 nov 2014, at 14:21, Yasumasa Suenaga <yasuenag at gmail.com> wrote:
>      > >
>      > > Hi all,
>      > >
>      > > I've uploaded webrev for this issue about a month ago.
>      > > Could you review it and sponsor it?
>      > >
>      > >
>      > > Thanks,
>      > >
>      > > Yasumasa
>      > >
>      > >
>      > > On 10/15/2014 11:13 PM, Yasumasa Suenaga wrote:
>      > >> Hi David,
>      > >>
>      > >> I've uploaded new webrev:
>      > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.02/
>      > >>
>      > >>
>      > >>> I wasn't suggesting that you make such a change though because it is
>      > large and disruptive.
>      > >>
>      > >>> Unfactoring check_or_create_dump is a step backwards in terms of code
>      > sharing.
>      > >>
>      > >> I restored check_or_create_dump() to os_posix.cpp .
>      > >> And I changed get_core_path() to create message which represents core
>      > dump path
>      > >> (including filename) in each OS.
>      > >>
>      > >>
>      > >>> Expanding the get_core_path in os_linux.cpp to handle the core_pattern
>      > may be okay (but I don't know enough about it to validate everything).
>      > >>
>      > >> I implemented all parameters in Linux kernel documentation:
>      > >> https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
>      > >>
>      > >> So I think that parameters which are processed are enough.
>      > >>
>      > >>
>      > >> Thanks,
>      > >>
>      > >> Yasumasa
>      > >>
>      > >>
>      > >>
>      > >> (2014/10/15 9:41), David Holmes wrote:
>      > >>> On 14/10/2014 8:05 PM, Yasumasa Suenaga wrote:
>      > >>>> Hi David,
>      > >>>>
>      > >>>> Thank you for comments!
>      > >>>> I've uploaded new webrev. Could you review it again?
>      > >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.01/
>      > >>>>
>      > >>>> I am an author of jdk9. So I cannot commit it.
>      > >>>> Could you be a sponsor for this enhancement?
>      > >>>>
>      > >>>>
>      > >>>>> In which case that should be handled by the linux specific
>      > >>>>> get_core_path() function.
>      > >>>>
>      > >>>> Agree.
>      > >>>> So I implemented it in os_linux.cpp .
>      > >>>> But part of format characters (%P: global pid, %s: signal, %t dump
>      > time)
>      > >>>> are not processed
>      > >>>> in this function because I think these parameters are difficult to
>      > >>>> handle in it.
>      > >>>>
>      > >>>>   %P: I could not find API for this.
>      > >>>>   %s: We have to change arguments of get_core_path() .
>      > >>>>   %t: This parameter means timestamp of coredump. It is decided in
>      > Kernel.
>      > >>>>
>      > >>>>
>      > >>>>> Fixing this means changing all the os_posix using platforms. But your
>      > >>>>> patch is not about this part. :)
>      > >>>>
>      > >>>> I moved os::check_or_create_dump() to each OS implementations (AIX,
>      > BSD,
>      > >>>> Solaris, Linux) .
>      > >>>> So I can write Linux specific code to check_or_create_dump() .
>      > >>>> As a result, I could remove "#ifdef LINUX" from os_posix.cpp :-)
>      > >>>
>      > >>> I wasn't suggesting that you make such a change though because it is
>      > large and disruptive. The simple handling of the | part of core_pattern was
>      > basically ok. Expanding the get_core_path in os_linux.cpp to handle the
>      > core_pattern may be okay (but I don't know enough about it to validate
>      > everything). Unfactoring check_or_create_dump is a step backwards in terms
>      > of code sharing.
>      > >>>
>      > >>> Sorry this has grown too large for me to deal with right now.
>      > >>>
>      > >>> David
>      > >>> -----
>      > >>>
>      > >>>>
>      > >>>>> Though I'm unclear whether it both invokes the program and creates a
>      > >>>>> core dump file; or just invokes the program?
>      > >>>>
>      > >>>> If '|' is set, Linux kernel will just redirect core image to user
>      > process.
>      > >>>> Kernel documentation says as below:
>      > >>>> ------------
>      > >>>> . If the first character of the pattern is a '|', the kernel will
>      > treat
>      > >>>>   the rest of the pattern as a command to run.  The core dump will be
>      > >>>>   written to the standard input of that program instead of to a file.
>      > >>>> ------------
>      > >>>>
>      > >>>> And implementation of coredump (do_coredump()) follows to it.
>      > >>>>
>      > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c
>      > >>>>
>      > >>>>
>      > >>>> In case of ABRT, ABRT dumps core image to default location
>      > >>>> (<CWD>/core.<PID>)
>      > >>>> if user set unlimited to resource limit of core (ulimit -c) .
>      > >>>> https://github.com/abrt/abrt/blob/master/src/hooks/abrt-hook-ccpp.c
>      > >>>>
>      > >>>>
>      > >>>>> A few style nits - you need spaces around keywords and before braces
>      > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
>      > >>>>> than "treated".
>      > >>>>> And as you don't do anything in the non-redirect case I suggest
>      > >>>>> collapsing this:
>      > >>>>
>      > >>>> I've fixed them.
>      > >>>>
>      > >>>>
>      > >>>> Thanks,
>      > >>>>
>      > >>>> Yasumasa
>      > >>>>
>      > >>>>
>      > >>>> (2014/10/13 9:41), David Holmes wrote:
>      > >>>>> Hi Yasumasa,
>      > >>>>>
>      > >>>>> On 7/10/2014 8:48 PM, Yasumasa Suenaga wrote:
>      > >>>>>> Hi David,
>      > >>>>>>
>      > >>>>>> Sorry for my English.
>      > >>>>>>
>      > >>>>>> I want to propose that JVM should create message according to core
>      > >>>>>> pattern (/proc/sys/kernel/core_pattern) .
>      > >>>>>> So I filed it to JBS and created a patch.
>      > >>>>>
>      > >>>>> So I've had a quick look at this core_pattern business and it seems
>      > to
>      > >>>>> me that there are two aspects to this.
>      > >>>>>
>      > >>>>> First, without the leading |, the entry in the core_pattern file is a
>      > >>>>> naming pattern for the core file. In which case that should be
>      > handled
>      > >>>>> by the linux specific get_core_path() function. Though that in itself
>      > >>>>> can't fully report the expected name, as part of it is provided in
>      > the
>      > >>>>> shared code in os::check_or_create_dump. Fixing this means changing
>      > >>>>> all the os_posix using platforms. But your patch is not about this
>      > >>>>> part. :)
>      > >>>>>
>      > >>>>> Second, with a leading | the core_pattern is actually the name of a
>      > >>>>> program to execute when the program is about to core dump, and that
>      > is
>      > >>>>> what you report with your patch. Though I'm unclear whether it both
>      > >>>>> invokes the program and creates a core dump file; or just invokes the
>      > >>>>> program?
>      > >>>>>
>      > >>>>> So with regards to this second part your patch seems functionally ok.
>      > >>>>> I do dislike having a big chunk of linux specific code in this
>      > "posix"
>      > >>>>> support file but ...
>      > >>>>>
>      > >>>>> A few style nits - you need spaces around keywords and before braces
>      > eg:
>      > >>>>>
>      > >>>>>   if(x){
>      > >>>>>
>      > >>>>> should be
>      > >>>>>
>      > >>>>>   if (x) {
>      > >>>>>
>      > >>>>> I also suggest saying "Core dumps may be processed with ..." rather
>      > >>>>> than "treated".
>      > >>>>>
>      > >>>>> And as you don't do anything in the non-redirect case I suggest
>      > >>>>> collapsing this:
>      > >>>>>
>      > >>>>>   83           is_redirect = core_pattern[0] == '|';
>      > >>>>>   84         }
>      > >>>>>   85
>      > >>>>>   86         if(is_redirect){
>      > >>>>>   87           jio_snprintf(buffer, bufferSize,
>      > >>>>>   88                    "Core dumps may be treated with \"%s\"",
>      > >>>>> &core_pattern[1]);
>      > >>>>>   89         }
>      > >>>>>
>      > >>>>> to just
>      > >>>>>
>      > >>>>>   83           if (core_pattern[0] == '|') {  // redirect
>      > >>>>>   84             jio_snprintf(buffer, bufferSize, "Core dumps may be
>      > >>>>> processed with \"%s\"", &core_pattern[1]);
>      > >>>>>   85            }
>      > >>>>>   86         }
>      > >>>>>
>      > >>>>> Comments from other runtime folk appreciated.
>      > >>>>>
>      > >>>>> Thanks,
>      > >>>>> David
>      > >>>>>
>      > >>>>>> Thanks,
>      > >>>>>>
>      > >>>>>> Yasumasa
>      > >>>>>>
>      > >>>>>> 2014/10/07 15:43 "David Holmes" <david.holmes at oracle.com>:
>      > >>>>>>
>      > >>>>>>    Hi Yasumasa,
>      > >>>>>>
>      > >>>>>>    I'm sorry but I don't understand what you are proposing. When you
>      > >>>>>> say
>      > >>>>>>    "treat" do you mean "create"? Otherwise what do you mean by
>      > >>>>>> "treated"?
>      > >>>>>>
>      > >>>>>>    Thanks,
>      > >>>>>>    David
>      > >>>>>>
>      > >>>>>>    On 2/10/2014 8:38 AM, Yasumasa Suenaga wrote:
>      > >>>>>>     > I'm in Hackergarten @ JavaOne :-)
>      > >>>>>>     >
>      > >>>>>>     >
>      > >>>>>>     > Hi all,
>      > >>>>>>     >
>      > >>>>>>     > I would like to enhance the messages in hs_err report.
>      > >>>>>>     > Modern Linux kernel can treat core dump with user process
>      > >>>>>> (e.g. ABRT)
>      > >>>>>>     > However, hs_err report cannot detect it.
>      > >>>>>>     >
>      > >>>>>>     > I think that hs_err report should output messages as below:
>      > >>>>>>     > -------------
>      > >>>>>>     >     Failed to write core dump. Core dumps may be treated with
>      > >>>>>>    "/usr/sbin/chroot /proc/%P/root /usr/libexec/abrt-hook-ccpp %s
>      > %c %p
>      > >>>>>>    %u %g %t e"
>      > >>>>>>     > -------------
>      > >>>>>>     >
>      > >>>>>>     > I've uploaded webrev of this enhancement.
>      > >>>>>>     > Could you review it?
>      > >>>>>>     >
>      > >>>>>>     > http://cr.openjdk.java.net/~ysuenaga/JDK-8059586/webrev.00/
>      > >>>>>>     >
>      > >>>>>>     > This patch works fine on Fedora20 x86_64.
>      > >>>>>>     >
>      > >>>>>>     >
>      > >>>>>>     >
>      > >>>>>>     > Thanks,
>      > >>>>>>     >
>      > >>>>>>     > Yasumasa
>      > >>>>>>     >
>      > >>>>>>
>      >
>      >
>
>