RFR 8242263: Diagnose synchronization on primitive wrappers

Patricio Chilano patricio.chilano.mateo at oracle.com
Mon Aug 10 20:05:33 UTC 2020


Hi Coleen,

On 8/10/20 2:52 PM, Coleen Phillimore wrote:
>
>
> On 8/10/20 9:44 AM, Patricio Chilano wrote:
>> Hi Coleen,
>>
>> On 8/7/20 12:58 PM, Coleen Phillimore wrote:
>>>
>>>
>>> On 8/7/20 11:49 AM, Patricio Chilano wrote:
>>>> Hi Coleen,
>>>>
>>>> On 8/7/20 10:53 AM, Coleen Phillimore wrote:
>>>>>
>>>>>
>>>>> On 8/6/20 9:45 PM, Patricio Chilano wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> On 8/6/20 7:17 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Patricio,  One question:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>>>>>
>>>>>>>
>>>>>>> + // adjust bcp to point back to monitorenter so that we print 
>>>>>>> the correct line numbers
>>>>>>>
>>>>>>>
>>>>>>> How do the fatal and event print the correct line numbers here? 
>>>>>>> I see the logging gets it from the stack trace. Should the abort 
>>>>>>> message have more information in it?  You can get the source and 
>>>>>>> line number information in the same way that print_stack_on() 
>>>>>>> gets it.
>>>>>> For the fatal error case I'm not printing line numbers as I do 
>>>>>> with the warning case or as you get with JFR events. But maybe I 
>>>>>> should print the stack too and then exit the VM. You can get the 
>>>>>> stack info and the line number of the monitorenter bytecode that 
>>>>>> caused the crash from the generated hs_err file though.
>>>>>
>>>>> Can you send a sample, please? 
>>>> At the top of the hs_err you will get:
>>>>
>>>>  #
>>>>  # A fatal error has been detected by the Java Runtime Environment:
>>>>  #
>>>>  #  Internal Error 
>>>> (/xx/xxx/xxxx/open/src/hotspot/share/runtime/synchronizer.cpp:574), 
>>>> pid=48810, tid=48811
>>>>  #  fatal error: Synchronizing on object 0x00000007ff0758a8 of 
>>>> klass java.lang.Integer
>>>>  #
>>>>
>>>> And then on the stacktrace you can see:
>>>>
>>>> Stack: [0x00007f982b6ce000,0x00007f982b7cf000], 
>>>> sp=0x00007f982b7cd5c0,  free space=1021k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>>>> j=interpreted, Vv=VM code, C=native code)
>>>> V  [libjvm.so+0x1662385] 
>>>> ObjectSynchronizer::handle_sync_on_primitive_wrapper(Handle, 
>>>> Thread*)+0x185
>>>> V  [libjvm.so+0x16683ac]  ObjectSynchronizer::enter(Handle, 
>>>> BasicLock*, Thread*)+0x21c
>>>> V  [libjvm.so+0xce88fa] 
>>>> InterpreterRuntime::monitorenter(JavaThread*, BasicObjectLock*)+0x13a
>>>> j  SimpleTest.main([Ljava/lang/String;)V+24
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0xd0263a]  JavaCalls::call_helper(JavaValue*, 
>>>> methodHandle const&, JavaCallArguments*, Thread*)+0x62a
>>>> V  [libjvm.so+0xe22c8e]  jni_invoke_static(JNIEnv_*, JavaValue*, 
>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thr
>>>>  .constprop.1]+0x35e
>>>> V  [libjvm.so+0xe28e6b]  jni_CallStaticVoidMethod+0x21b
>>>> C  [libjli.so+0x496e]  JavaMain+0xc1e
>>>> C  [libjli.so+0x7759]  ThreadJavaMain+0x9
>>>>
>>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>>> j  SimpleTest.main([Ljava/lang/String;)V+24
>>>> v  ~StubRoutines::call_stub
>>>>
>>>> Where +24 is the bci. But that would be with the interpreter, with 
>>>> c1 that line will show as:
>>>>
>>>> J 576 c1 SimpleTest.main([Ljava/lang/String;)V (205 bytes) @ 
>>>> 0x00007f97b5ae5dc7
>>>>
>>>> You have the method where it failed but not the exact bci. So maybe 
>>>> I should print the stack and then exit the VM.
>>>>
>>>>> I guess there isn't a handy function to show the source file name 
>>>>> and line number in the fatal error message.
>>>> I'll see if there is an easy way to have it in the fatal error 
>>>> message otherwise I can just print the stack as with the warning 
>>>> case and exit.
>>>
>>> I don't know if we want the fatal message to print the whole stack. 
>>> But an improvement would be:
>>>
>>>  #  fatal error: Synchronizing on object 0x00000007ff0758a8 of klass 
>>> java.lang.Integer in method->external_name()+bci
>>>
>>> which you have in the function and then people won't have to go to 
>>> the hs_err file.
>> I tested two different changes. The first one gets the method and bci 
>> and adds it to the fatal error message as you said:
>>
>> diff --git a/src/hotspot/share/runtime/synchronizer.cpp 
>> b/src/hotspot/share/runtime/synchronizer.cpp
>> --- a/src/hotspot/share/runtime/synchronizer.cpp
>> +++ b/src/hotspot/share/runtime/synchronizer.cpp
>> @@ -572,5 +572,9 @@
>>    if (DiagnoseSyncOnPrimitiveWrappers == FATAL_EXIT) {
>>      ResourceMark rm(self);
>> -    fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s", 
>> p2i(obj()), obj->klass()->external_name());
>> +    vframeStream vfst(self);
>> +    Method* method = vfst.method();
>> +    int bci = vfst.bci();
>> +    fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s in 
>> \"%s\" at bci:%d",
>> +          p2i(obj()), obj->klass()->external_name(), method ? 
>> method->external_name() : "", bci);
>>    }
>>
>> Which produces the following output:
>>
>> #  Internal Error 
>> (/scratch/pchilano/random7/open/src/hotspot/share/runtime/synchronizer.cpp:577), 
>> pid=13759, tid=13785
>> #  fatal error: Synchronizing on object 0x00000001013d72d8 of klass 
>> java.lang.Float in "void SimpleTest.run()" at bci:24
>> #
>>
>> In the second one I use the same print_stack_on() method I use for 
>> the warning case but print on a stringStream object instead, and then 
>> I print the first line of the buffer only.
>>
>> diff --git a/src/hotspot/share/runtime/synchronizer.cpp 
>> b/src/hotspot/share/runtime/synchronizer.cpp
>> --- a/src/hotspot/share/runtime/synchronizer.cpp
>> +++ b/src/hotspot/share/runtime/synchronizer.cpp
>> @@ -572,5 +572,12 @@
>>    if (DiagnoseSyncOnPrimitiveWrappers == FATAL_EXIT) {
>>      ResourceMark rm(self);
>> -    fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s", 
>> p2i(obj()), obj->klass()->external_name());
>> +    stringStream ss;
>> +    self->print_stack_on(&ss);
>> +    char* base = (char*)strstr(ss.base(), "at");
>> +    char* newline = (char*)strchr(ss.base(), '\n');
>> +    if (newline != NULL) {
>> +      *newline = '\0';
>> +    }
>> +    fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s 
>> %s", p2i(obj()), obj->klass()->external_name(), base);
>>    }
>>
>> Which produces the following output:
>>
>> #  Internal Error 
>> (/scratch/pchilano/random7/open/src/hotspot/share/runtime/synchronizer.cpp:581), 
>> pid=9288, tid=9306
>> #  fatal error: Synchronizing on object 0x00000001013d72d8 of klass 
>> java.lang.Float at SimpleTest.run(SimpleTest.java:70)
>> #
>>
>> I think the second method is better since it provides the actual file 
>> and line number, what do you think?
>
> Yes, this second output looks really nice!
Great, please check v5:

Inc: http://cr.openjdk.java.net/~pchilanomate/8242263/v5/inc/webrev/
Full: http://cr.openjdk.java.net/~pchilanomate/8242263/v5/webrev/

Patricio
> Coleen
>
>>
>> Thanks,
>> Patricio
>>> Thanks,
>>> Coleen
>>>
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Patricio
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> I don't know which exact technique JFR uses to print line numbers 
>>>>>> but it has to include reading the current bcp. Since when coming 
>>>>>> from the interpreter for monitorenter the bcp is always already 
>>>>>> pointing to the next instruction we need to decrement it to print 
>>>>>> the correct line numbers. I tested it and if I don't fix the bcp, 
>>>>>> as expected JFR too will print the next line number for the top 
>>>>>> frame.
>>>>>>
>>>>>>
>>>>>> Patricio
>>>>>>> Otherwise, this looks good to me.
>>>>>>> Coleen
>>>>>>>
>>>>>>>
>>>>>>> On 8/6/20 2:48 PM, Patricio Chilano wrote:
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> On 8/5/20 6:47 PM, Daniel D. Daugherty wrote:
>>>>>>>>> I'm peeking ahead to the next webrev... :-)
>>>>>>>>>
>>>>>>>>> > http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev/
>>>>>>>> : )
>>>>>>>>
>>>>>>>>> General comments:
>>>>>>>>>   - check files for copyright year updates.
>>>>>>>> Updated accessFlags.hpp, c1_MacroAssembler_aarch64.cpp, 
>>>>>>>> c1_MacroAssembler_arm.cpp and c1_MacroAssembler_x86.cpp.
>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/synchronizer.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/synchronizer.cpp
>>>>>>>>>     L507:   const markWord mark = obj->mark();
>>>>>>>>>     L508:
>>>>>>>>>     L509:   if (DiagnoseSyncOnPrimitiveWrappers != 0 && 
>>>>>>>>> obj->klass()->is_box()) {
>>>>>>>>>     L510:     return false;
>>>>>>>>>     L511:   }
>>>>>>>>>         The new bailout on L509-511 can move above L507.
>>>>>>>> Moved.
>>>>>>>>
>>>>>>>>> L573: fatal("Synchronizing on object " INTPTR_FORMAT " of 
>>>>>>>>> klass %s", p2i(obj()), obj->klass()->external_name());
>>>>>>>>>         This external_name() call does not have a ResourceMark.
>>>>>>>> Good catch! I had one in a previous version but then I changed 
>>>>>>>> the conditionals and lost it for the fatal error case. The test 
>>>>>>>> worked okay because for the main JavaThread there is a 
>>>>>>>> ResourceMark in jni_invoke_static() (jni.cpp).
>>>>>>>>
>>>>>>>>> src/hotspot/share/logging/logTag.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/oops/klass.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/utilities/accessFlags.hpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/globals.hpp
>>>>>>>>>     L814:              "0: off "
>>>>>>>>>         Missing a ';' after "off".
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>> L816:              "2: log message to stdout.
>>>>>>>>>         Perhaps add "(by default)" after "stdout" or
>>>>>>>>>         don't say where log output is at all.
>>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/arguments.cpp
>>>>>>>>>     L4197: LogConfiguration::configure_stdout(LogLevel::Info, 
>>>>>>>>> true, LOG_TAGS(primitivewrappers));
>>>>>>>>>         Hmmm... maybe ignore my comments about L816 in 
>>>>>>>>> globals.hpp
>>>>>>>>>         since it looks like logging output is configured to 
>>>>>>>>> 'stdout'.
>>>>>>>>>         I'm assuming that other log options to put output 
>>>>>>>>> elsewhere
>>>>>>>>>         are overridden by this code?
>>>>>>>> Right. So the logging is done under UL with the tag 
>>>>>>>> primitivewrappers. If that tag is specified in Xlog then this 
>>>>>>>> conditional will be skipped because !log_is_enabled(Info, 
>>>>>>>> primitivewrappers) will be false.
>>>>>>>>
>>>>>>>>> src/hotspot/share/classfile/systemDictionary.cpp
>>>>>>>>>     L2188:     for (int i = T_BOOLEAN; i < T_LONG+1; i++) {
>>>>>>>>>         nit - s/T_LONG+1/T_LONG + 1/
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>> L2191: 
>>>>>>>>> _box_klasses[i]->set_prototype_header(markWord::prototype());
>>>>>>>>>         I assume we're keeping the prototype_header field when 
>>>>>>>>> Biased Locking
>>>>>>>>>         goes away? The reason I ask:
>>>>>>>>>
>>>>>>>>>         static markWord prototype() {
>>>>>>>>>           return markWord( no_hash_in_place | no_lock_in_place );
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>         is because without Biased Locking, do we really need 
>>>>>>>>> the prototype
>>>>>>>>>         anymore? The initial markWord won't need possible 
>>>>>>>>> variants...
>>>>>>>> Yes, I think it can go away unless somebody finds another use 
>>>>>>>> for it. Maybe Valhalla.
>>>>>>>>
>>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>>>     L69:   <Event name="SyncOnPrimitiveWrapper" category="Java 
>>>>>>>>> Application"
>>>>>>>>>         Is the category "Java Application" because it's the 
>>>>>>>>> application
>>>>>>>>>         code that did something "wrong" here? Where 
>>>>>>>>> "application" is loosely
>>>>>>>>>         defined to include the possibility of auto generated 
>>>>>>>>> code, library
>>>>>>>>>         code and the like? Or perhaps "application" because 
>>>>>>>>> something "above"
>>>>>>>>>         the "Java Virtual Machine, Runtime" did the "wrong" 
>>>>>>>>> thing here?
>>>>>>>> I don't know what the right category should be really. I saw 
>>>>>>>> the events JavaMonitorEnter, JavaMonitorWait and 
>>>>>>>> JavaMonitorInflate and thought this event should fall in the 
>>>>>>>> same category they do.
>>>>>>>>
>>>>>>>>> src/jdk.jfr/share/conf/jfr/default.jfc
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/jdk.jfr/share/conf/jfr/profile.jfc
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> test/lib/jdk/test/lib/jfr/EventNames.java
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/aarch64/aarch64.ad
>>>>>>>>>     L3517:       __ tbnz(tmp, 
>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS), cont);
>>>>>>>>> <snip>
>>>>>>>>>     L3578:     __ bind(cont);
>>>>>>>>>     L3579:     // flag == EQ indicates success
>>>>>>>>>     L3580:     // flag == NE indicates failure
>>>>>>>>>         If tbnz() branches to "cont" when we have a box class, 
>>>>>>>>> what's
>>>>>>>>>         the flag value set to (EQ or NE)? And what set that 
>>>>>>>>> flag value?
>>>>>>>>>         The reason I ask is I don't think tbnz() sets 
>>>>>>>>> condition codes...
>>>>>>>> Right, it doesn't set condition codes, so I kept the version I 
>>>>>>>> had (more on that below).
>>>>>>>>
>>>>>>>>> src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
>>>>>>>>>     int MacroAssembler::biased_locking_enter(Register lock_reg,
>>>>>>>>> Register obj_reg,
>>>>>>>>> Register swap_reg,
>>>>>>>>> Register tmp_reg,
>>>>>>>>>                                              bool 
>>>>>>>>> swap_reg_contains_mark,
>>>>>>>>> Label& done,
>>>>>>>>> Label* slow_case,
>>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>>>         I think you've changed the only callers of 
>>>>>>>>> biased_locking_enter()
>>>>>>>>>         that cared about the return value with this changeset 
>>>>>>>>> so it can
>>>>>>>>>         be changed to a void function.
>>>>>>>> Ok, this is what I mentioned to David in a previous email. Done.
>>>>>>>>
>>>>>>>>> src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/arm/c2_MacroAssembler_arm.cpp
>>>>>>>>>     L96:      tbnz(Rscratch, exact_log2(JVM_ACC_IS_BOX_CLASS), 
>>>>>>>>> done);
>>>>>>>>> <snip>
>>>>>>>>>     L131:   bind(done);
>>>>>>>>>     L132:
>>>>>>>>>     L133:   // At this point flags are set as follows:
>>>>>>>>>     L134:   //  EQ -> Success
>>>>>>>>>     L135:   //  NE -> Failure, branch to slow path
>>>>>>>>>         If tbnz() branches to "done" when we have a box class, 
>>>>>>>>> what's
>>>>>>>>>         the flag value set to (EQ or NE)? And what set that 
>>>>>>>>> flag value?
>>>>>>>>>         The reason I ask is I don't think tbnz() sets 
>>>>>>>>> condition codes...
>>>>>>>> Right. Same as above, I kept the version I had.
>>>>>>>>
>>>>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/arm/macroAssembler_arm.cpp
>>>>>>>>>     int MacroAssembler::biased_locking_enter(Register obj_reg, 
>>>>>>>>> Register swap_reg, Register tmp_reg,
>>>>>>>>>                                              bool 
>>>>>>>>> swap_reg_contains_mark,
>>>>>>>>> Register tmp2,
>>>>>>>>> Label& done, Label& slow_case,
>>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>>>         I think you've changed the only callers of 
>>>>>>>>> biased_locking_enter()
>>>>>>>>>         that cared about the return value with this changeset 
>>>>>>>>> so it can
>>>>>>>>>         be changed to a void function.
>>>>>>>> Done.
>>>>>>>>
>>>>>>>>> src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
>>>>>>>>> src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
>>>>>>>>> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
>>>>>>>>>     No comments on the PPC code.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
>>>>>>>>> src/hotspot/cpu/s390/interp_masm_s390.cpp
>>>>>>>>> src/hotspot/cpu/s390/macroAssembler_s390.cpp
>>>>>>>>>     No comments on the S390 code.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
>>>>>>>>>     L58:     load_klass(hdr, obj, rklass_decode_tmp);
>>>>>>>>>         What will this do with a 'noreg' value? (I need a 
>>>>>>>>> refresher.)
>>>>>>>> When not in 64 bit mode that register just won't be used.
>>>>>>>>
>>>>>>>>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/x86/interp_masm_x86.cpp
>>>>>>>>>     No comments.
>>>>>>>>>
>>>>>>>>> src/hotspot/cpu/x86/macroAssembler_x86.cpp
>>>>>>>>>     int MacroAssembler::biased_locking_enter(Register lock_reg,
>>>>>>>>> Register obj_reg,
>>>>>>>>> Register swap_reg,
>>>>>>>>> Register tmp_reg,
>>>>>>>>> Register tmp_reg2,
>>>>>>>>>                                              bool 
>>>>>>>>> swap_reg_contains_mark,
>>>>>>>>> Label& done,
>>>>>>>>> Label* slow_case,
>>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>>>         I think you've changed the only caller of 
>>>>>>>>> biased_locking_enter()
>>>>>>>>>         that cared about the return value with this changeset 
>>>>>>>>> so it can
>>>>>>>>>         be changed to a void function.
>>>>>>>> Done.
>>>>>>>>
>>>>>>>>> test/hotspot/jtreg/runtime/Monitor/SyncOnPrimitiveWrapperTest.java 
>>>>>>>>>
>>>>>>>>>     L30:  * @test SyncOnPrimitiveWrapperTest
>>>>>>>>>         No parameter to @test directive.
>>>>>>>> Removed.
>>>>>>>>
>>>>>>>>> L136:         private static long sharedCounter = 0L;
>>>>>>>>>         Since you don't do anything with sharedCounter other 
>>>>>>>>> than increment it,
>>>>>>>>>         can the compilers optimize it away? If it can be 
>>>>>>>>> optimized away, does
>>>>>>>>>         that mean that:
>>>>>>>>>
>>>>>>>>>             L142:                 synchronized (obj) {
>>>>>>>>>
>>>>>>>>>         can also be optimized away?
>>>>>>>>>
>>>>>>>>>         I don't think that:
>>>>>>>>>
>>>>>>>>>             L161:                 synchronized (sharedLock1) {
>>>>>>>>>
>>>>>>>>>         can be optimized away because it is shared by multiple 
>>>>>>>>> threads.
>>>>>>>>>
>>>>>>>>> test/jdk/jdk/jfr/event/runtime/TestSyncOnPrimitiveWrapperEvent.java 
>>>>>>>>>
>>>>>>>>>     Similar questions about 'counter' being optimized away.
>>>>>>>>>     Similar question about "synchronized (obj) {" being 
>>>>>>>>> optimized away.
>>>>>>>> I'm not sure if the compiler will optimize it away. Seems 
>>>>>>>> unlikely given the counter we are incrementing is not just 
>>>>>>>> local to some thread.
>>>>>>>>
>>>>>>>>
>>>>>>>> Ok, below is v3 which has the following changes:
>>>>>>>> - Use a 32 bit load for the _access_flags field, instead of 64
>>>>>>>> - Martin's implementation for s390 and fix for PPC
>>>>>>>> - Faster LogTest version
>>>>>>>> - Above changes based on Dan review
>>>>>>>>
>>>>>>>> I'm retesting in mach5 tiers1-6 (which tests x64 and aarch64 
>>>>>>>> only) again with -XX:DiagnoseSyncOnPrimitiveWrappers=2. I 
>>>>>>>> checked it builds in arm32, ppc and s390.
>>>>>>>>
>>>>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev
>>>>>>>> Inc: 
>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8242263/v3/inc/webrev
>>>>>>>>
>>>>>>>>
>>>>>>>> @Martin:
>>>>>>>> Please check if the test doesn't timeout for you now with the 
>>>>>>>> changes I made to LogTest.
>>>>>>>> Also, I tried to use tbnz in aarch64 and arm32 instead of tst + 
>>>>>>>> br (except for c2 since we actually need to set the condition 
>>>>>>>> flags), but for c1 I was getting an assertion in the compiler 
>>>>>>>> thread:
>>>>>>>>
>>>>>>>> guarantee(chk == -1 || chk == 0) failed: Field too big for insn
>>>>>>>>
>>>>>>>> This is the stack when that happens:
>>>>>>>>
>>>>>>>> MacroAssembler::pd_patch_instruction_size(unsigned char*, 
>>>>>>>> unsigned char*)+0x398
>>>>>>>> AbstractAssembler::bind(Label&)+0x118
>>>>>>>> MonitorEnterStub::emit_code(LIR_Assembler*)+0x28
>>>>>>>> LIR_Assembler::emit_slow_case_stubs()+0x54
>>>>>>>> Compilation::emit_code_body()+0x17c
>>>>>>>>
>>>>>>>> The change was simply:
>>>>>>>>
>>>>>>>> diff --git 
>>>>>>>> a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp 
>>>>>>>> b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>>> --- a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>>> +++ b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>>> @@ -78,7 +78,6 @@
>>>>>>>>    if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>      load_klass(hdr, obj);
>>>>>>>> -    ldr(hdr, Address(hdr, Klass::access_flags_offset()));
>>>>>>>> -    tst(hdr, JVM_ACC_IS_BOX_CLASS);
>>>>>>>> -    br(Assembler::NE, slow_case);
>>>>>>>> +    ldrw(hdr, Address(hdr, Klass::access_flags_offset()));
>>>>>>>> +    tbnz(hdr, exact_log2(JVM_ACC_IS_BOX_CLASS), slow_case);
>>>>>>>>    }
>>>>>>>>
>>>>>>>> So the issue must be related to where we want to jump.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Patricio
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/5/20 9:01 AM, Patricio Chilano wrote:
>>>>>>>>>> Hi Martin,
>>>>>>>>>>
>>>>>>>>>> On 8/5/20 4:47 AM, Doerr, Martin wrote:
>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>
>>>>>>>>>>> using 8 Byte load instructions for jint fields is a terrible 
>>>>>>>>>>> coding style!
>>>>>>>>>>> Someone else may see it and use an 8 Byte store. Will result 
>>>>>>>>>>> in great fun for debugging!
>>>>>>>>>>>
>>>>>>>>>>> And on Big Endian you will simply access the wrong bits.
>>>>>>>>>> Ah, of course! Those 32 bits will be in the wrong place when 
>>>>>>>>>> doing the test.
>>>>>>>>>>
>>>>>>>>>>> Note that Big Endian Platforms are AIX, old linux ppc, s390, 
>>>>>>>>>>> SPARC. I don't think you have tested on them.
>>>>>>>>>>>
>>>>>>>>>>>> We could remove the nested synchronized statements in the 
>>>>>>>>>>>> run() method
>>>>>>>>>>>> so that we don't generate that much logging. We could also 
>>>>>>>>>>>> lower
>>>>>>>>>>>> LOOP_COUNT. I think the issue is also because we are 
>>>>>>>>>>>> running LogTest
>>>>>>>>>>>> with multiple flag combinations. Not sure what we should 
>>>>>>>>>>>> touch first.
>>>>>>>>>>>> Maybe the synchronized statements, have only one or two and 
>>>>>>>>>>>> test that
>>>>>>>>>>>> first?
>>>>>>>>>>> Sounds like helpful ideas. Please go ahead and strip things 
>>>>>>>>>>> down.
>>>>>>>>>> Great, I will send v3 later with those changes.
>>>>>>>>>>
>>>>>>>>>> Thanks Martin!
>>>>>>>>>>
>>>>>>>>>> Patricio
>>>>>>>>>>> Thanks for taking care of it.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Martin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>>>>>>>>> Sent: Dienstag, 4. August 2020 20:26
>>>>>>>>>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>>>>>>>>>>> <david.holmes at oracle.com>; 
>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net;
>>>>>>>>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>> Subject: Re: RFR 8242263: Diagnose synchronization on 
>>>>>>>>>>>> primitive wrappers
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/4/20 1:35 PM, Doerr, Martin wrote:
>>>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok, I'll change it to movl + testl and test it out before 
>>>>>>>>>>>>>> sending v3.
>>>>>>>>>>>>> Thanks. I forgot to mention arm + aarch64.
>>>>>>>>>>>>> Aarch64 uses ldrw + tbnz.
>>>>>>>>>>>>> Arm 32 bit uses ldr_u32 + tbnz.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I remember having seen the same mistake ��
>>>>>>>>>>>>> And nobody noticed it on little Endian platforms.
>>>>>>>>>>>> Ok, I can use a tbnz instead of test and then a branch on 
>>>>>>>>>>>> arm32 and aarch64.
>>>>>>>>>>>> Shouldn't a normal ldr in arm32 work fine?
>>>>>>>>>>>> Also in 64 bits (either x64 or aarch64) I don't see the 
>>>>>>>>>>>> issue of using a
>>>>>>>>>>>> 64 bit load, besides the fact that we only care about the 
>>>>>>>>>>>> first 32 bits.
>>>>>>>>>>>> Regardless of the endianness, aren't we masking out the 
>>>>>>>>>>>> upper part when
>>>>>>>>>>>> we do AND/TEST or if we test a bit in the 0-31 bit range? 
>>>>>>>>>>>> Otherwise it
>>>>>>>>>>>> seems it should have failed for one of those platforms in 
>>>>>>>>>>>> my tests.
>>>>>>>>>>>>
>>>>>>>>>>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe 
>>>>>>>>>>>>>> you can
>>>>>>>>>>>>>> specify that and see if that fixes it? Let me know if 
>>>>>>>>>>>>>> that works and I
>>>>>>>>>>>>>> can add it to the test.
>>>>>>>>>>>>> That seems to have an effect. But now I'm not patient 
>>>>>>>>>>>>> enough to wait for
>>>>>>>>>>>> the test to finish.
>>>>>>>>>>>>> Maybe the problem is that I'm using slow debug builds.
>>>>>>>>>>>>> But is there a chance to make the test quicker without 
>>>>>>>>>>>>> losing coverage
>>>>>>>>>>>> significantly?
>>>>>>>>>>>>> I think the test is too slow this way to run it on a 
>>>>>>>>>>>>> regular basis. We'd need to
>>>>>>>>>>>> spend dedicated machines for it.
>>>>>>>>>>>> We could remove the nested synchronized statements in the 
>>>>>>>>>>>> run() method
>>>>>>>>>>>> so that we don't generate that much logging. We could also 
>>>>>>>>>>>> lower
>>>>>>>>>>>> LOOP_COUNT. I think the issue is also because we are 
>>>>>>>>>>>> running LogTest
>>>>>>>>>>>> with multiple flag combinations. Not sure what we should 
>>>>>>>>>>>> touch first.
>>>>>>>>>>>> Maybe the synchronized statements, have only one or two and 
>>>>>>>>>>>> test that
>>>>>>>>>>>> first?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Patricio
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>>>>>>>>>>> Sent: Dienstag, 4. August 2020 17:47
>>>>>>>>>>>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>>>>>>>>>>>>> <david.holmes at oracle.com>; 
>>>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net;
>>>>>>>>>>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>> Subject: Re: RFR 8242263: Diagnose synchronization on 
>>>>>>>>>>>>>> primitive wrappers
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for fixing PPC and taking care of s390!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 8/4/20 11:18 AM, Doerr, Martin wrote:
>>>>>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I suggest to use movl + testl for checking the access 
>>>>>>>>>>>>>>> flag as for other
>>>>>>>>>>>> access
>>>>>>>>>>>>>> flags on x64.
>>>>>>>>>>>>>> Ok, I'll change it to movl + testl and test it out before 
>>>>>>>>>>>>>> sending v3.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> New version for PPC64 and s390 see below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The test SyncOnPrimitiveWrapperTest produces hs_err 
>>>>>>>>>>>>>>> files as
>>>>>>>>>>>> expected.
>>>>>>>>>>>>>> However, I'm getting timeout issues:
>>>>>>>>>>>>>>> timed out (timeout set to 120000ms, elapsed time 
>>>>>>>>>>>>>>> including timeout
>>>>>>>>>>>>>> handling was 122372ms)
>>>>>>>>>>>>>>> Can we provide more time?
>>>>>>>>>>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe 
>>>>>>>>>>>>>> you can
>>>>>>>>>>>>>> specify that and see if that fixes it? Let me know if 
>>>>>>>>>>>>>> that works and I
>>>>>>>>>>>>>> can add it to the test.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Patricio
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff -r 77852e129401 
>>>>>>>>>>>>>>> src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue 
>>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue 
>>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>>> @@ -107,8 +107,8 @@
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>>         load_klass(Rscratch, Roop);
>>>>>>>>>>>>>>> -    ld(Rscratch, 
>>>>>>>>>>>>>>> in_bytes(Klass::access_flags_offset()), Rscratch);
>>>>>>>>>>>>>>> -    andi(Rscratch, Rscratch, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>>> +    lwz(Rscratch, 
>>>>>>>>>>>>>>> in_bytes(Klass::access_flags_offset()), Rscratch);
>>>>>>>>>>>>>>> +    testbitdi(CCR0, R0, Rscratch, 
>>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>>         bne(CCR0, slow_case);
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff -r 77852e129401 
>>>>>>>>>>>>>>> src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>>> @@ -912,8 +912,8 @@
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>>           load_klass(tmp, object);
>>>>>>>>>>>>>>> -      ld(tmp, in_bytes(Klass::access_flags_offset()), 
>>>>>>>>>>>>>>> tmp);
>>>>>>>>>>>>>>> -      andi(tmp, tmp, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>>> +      lwz(tmp, in_bytes(Klass::access_flags_offset()), 
>>>>>>>>>>>>>>> tmp);
>>>>>>>>>>>>>>> +      testbitdi(CCR0, R0, tmp, 
>>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>>           bne(CCR0, slow_case);
>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff -r 77852e129401 
>>>>>>>>>>>>>>> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>>> @@ -2838,8 +2838,8 @@
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>>         load_klass(temp, oop);
>>>>>>>>>>>>>>> -    ld(temp, in_bytes(Klass::access_flags_offset()), 
>>>>>>>>>>>>>>> temp);
>>>>>>>>>>>>>>> -    andi(temp, temp, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>>> +    lwz(temp, in_bytes(Klass::access_flags_offset()), 
>>>>>>>>>>>>>>> temp);
>>>>>>>>>>>>>>> +    testbitdi(CCR0, R0, temp, 
>>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>>         bne(CCR0, cont);
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>> src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp 
>>>>>>>>>>>>>>> Tue Aug 04
>>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp 
>>>>>>>>>>>>>>> Tue Aug 04
>>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>>> @@ -91,6 +91,12 @@
>>>>>>>>>>>>>>>       // Save object being locked into the 
>>>>>>>>>>>>>>> BasicObjectLock...
>>>>>>>>>>>>>>>       z_stg(obj, Address(disp_hdr,
>>>>>>>>>>>> BasicObjectLock::obj_offset_in_bytes()));
>>>>>>>>>>>>>>> +  if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>> +    load_klass(Z_R1_scratch, obj);
>>>>>>>>>>>>>>> +    testbit(Address(Z_R1_scratch, 
>>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>> + z_btrue(slow_case);
>>>>>>>>>>>>>>> +  }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>       if (UseBiasedLocking) {
>>>>>>>>>>>>>>>         biased_locking_enter(obj, hdr, Z_R1_scratch, 
>>>>>>>>>>>>>>> Z_R0_scratch, done,
>>>>>>>>>>>>>> &slow_case);
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>> diff -r 77852e129401 
>>>>>>>>>>>>>>> src/hotspot/cpu/s390/interp_masm_s390.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug 
>>>>>>>>>>>>>>> 04 10:03:57
>>>>>>>>>>>>>> 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug 04
>>>>>>>>>>>> 16:04:31
>>>>>>>>>>>>>> 2020 +0200
>>>>>>>>>>>>>>> @@ -1000,6 +1000,12 @@
>>>>>>>>>>>>>>>       // Load markWord from object into displaced_header.
>>>>>>>>>>>>>>>       z_lg(displaced_header, 
>>>>>>>>>>>>>>> oopDesc::mark_offset_in_bytes(), object);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +  if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>> +    load_klass(Z_R1_scratch, object);
>>>>>>>>>>>>>>> +    testbit(Address(Z_R1_scratch, 
>>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>> + z_btrue(slow_case);
>>>>>>>>>>>>>>> +  }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>       if (UseBiasedLocking) {
>>>>>>>>>>>>>>>         biased_locking_enter(object, displaced_header, 
>>>>>>>>>>>>>>> Z_R1, Z_R0, done,
>>>>>>>>>>>>>> &slow_case);
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>> diff -r 77852e129401 
>>>>>>>>>>>>>>> src/hotspot/cpu/s390/macroAssembler_s390.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue 
>>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue 
>>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>>> @@ -3358,6 +3358,12 @@
>>>>>>>>>>>>>>>       // Load markWord from oop into mark.
>>>>>>>>>>>>>>>       z_lg(displacedHeader, 0, oop);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +  if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>>> +    load_klass(Z_R1_scratch, oop);
>>>>>>>>>>>>>>> +    testbit(Address(Z_R1_scratch, 
>>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>>> + z_btrue(done);
>>>>>>>>>>>>>>> +  }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>       if (try_bias) {
>>>>>>>>>>>>>>>         biased_locking_enter(oop, displacedHeader, temp, 
>>>>>>>>>>>>>>> Z_R0, done);
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list