RFR 8242263: Diagnose synchronization on primitive wrappers
Coleen Phillimore
coleen.phillimore at oracle.com
Mon Aug 10 17:52:01 UTC 2020
On 8/10/20 9:44 AM, Patricio Chilano wrote:
> Hi Coleen,
>
> On 8/7/20 12:58 PM, Coleen Phillimore wrote:
>>
>>
>> On 8/7/20 11:49 AM, Patricio Chilano wrote:
>>> Hi Coleen,
>>>
>>> On 8/7/20 10:53 AM, Coleen Phillimore wrote:
>>>>
>>>>
>>>> On 8/6/20 9:45 PM, Patricio Chilano wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>> On 8/6/20 7:17 PM, Coleen Phillimore wrote:
>>>>>>
>>>>>> Patricio, One question:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev/src/hotspot/share/runtime/synchronizer.cpp.udiff.html
>>>>>>
>>>>>>
>>>>>> + // adjust bcp to point back to monitorenter so that we print
>>>>>> the correct line numbers
>>>>>>
>>>>>>
>>>>>> How do the fatal and event print the correct line numbers here? I
>>>>>> see the logging gets it from the stack trace. Should the abort
>>>>>> message have more information in it? You can get the source and
>>>>>> line number information in the same way that print_stack_on()
>>>>>> gets it.
>>>>> For the fatal error case I'm not printing line numbers as I do
>>>>> with the warning case or as you get with JFR events. But maybe I
>>>>> should print the stack too and then exit the VM. You can get the
>>>>> stack info and the line number of the monitorenter bytecode that
>>>>> caused the crash from the generated hs_err file though.
>>>>
>>>> Can you send a sample, please?
>>> At the top of the hs_err you will get:
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> # Internal Error
>>> (/xx/xxx/xxxx/open/src/hotspot/share/runtime/synchronizer.cpp:574),
>>> pid=48810, tid=48811
>>> # fatal error: Synchronizing on object 0x00000007ff0758a8 of klass
>>> java.lang.Integer
>>> #
>>>
>>> And then on the stacktrace you can see:
>>>
>>> Stack: [0x00007f982b6ce000,0x00007f982b7cf000],
>>> sp=0x00007f982b7cd5c0, free space=1021k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>> j=interpreted, Vv=VM code, C=native code)
>>> V [libjvm.so+0x1662385]
>>> ObjectSynchronizer::handle_sync_on_primitive_wrapper(Handle,
>>> Thread*)+0x185
>>> V [libjvm.so+0x16683ac] ObjectSynchronizer::enter(Handle,
>>> BasicLock*, Thread*)+0x21c
>>> V [libjvm.so+0xce88fa]
>>> InterpreterRuntime::monitorenter(JavaThread*, BasicObjectLock*)+0x13a
>>> j SimpleTest.main([Ljava/lang/String;)V+24
>>> v ~StubRoutines::call_stub
>>> V [libjvm.so+0xd0263a] JavaCalls::call_helper(JavaValue*,
>>> methodHandle const&, JavaCallArguments*, Thread*)+0x62a
>>> V [libjvm.so+0xe22c8e] jni_invoke_static(JNIEnv_*, JavaValue*,
>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thr
>>> .constprop.1]+0x35e
>>> V [libjvm.so+0xe28e6b] jni_CallStaticVoidMethod+0x21b
>>> C [libjli.so+0x496e] JavaMain+0xc1e
>>> C [libjli.so+0x7759] ThreadJavaMain+0x9
>>>
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> j SimpleTest.main([Ljava/lang/String;)V+24
>>> v ~StubRoutines::call_stub
>>>
>>> Where +24 is the bci. But that would be with the interpreter, with
>>> c1 that line will show as:
>>>
>>> J 576 c1 SimpleTest.main([Ljava/lang/String;)V (205 bytes) @
>>> 0x00007f97b5ae5dc7
>>>
>>> You have the method where it failed but not the exact bci. So maybe
>>> I should print the stack and then exit the VM.
>>>
>>>> I guess there isn't a handy function to show the source file name
>>>> and line number in the fatal error message.
>>> I'll see if there is an easy way to have it in the fatal error
>>> message otherwise I can just print the stack as with the warning
>>> case and exit.
>>
>> I don't know if we want the fatal message to print the whole stack.
>> But an improvement would be:
>>
>> # fatal error: Synchronizing on object 0x00000007ff0758a8 of klass
>> java.lang.Integer in method->external_name()+bci
>>
>> which you have in the function and then people won't have to go to
>> the hs_err file.
> I tested two different changes. The first one gets the method and bci
> and adds it to the fatal error message as you said:
>
> diff --git a/src/hotspot/share/runtime/synchronizer.cpp
> b/src/hotspot/share/runtime/synchronizer.cpp
> --- a/src/hotspot/share/runtime/synchronizer.cpp
> +++ b/src/hotspot/share/runtime/synchronizer.cpp
> @@ -572,5 +572,9 @@
> if (DiagnoseSyncOnPrimitiveWrappers == FATAL_EXIT) {
> ResourceMark rm(self);
> - fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s",
> p2i(obj()), obj->klass()->external_name());
> + vframeStream vfst(self);
> + Method* method = vfst.method();
> + int bci = vfst.bci();
> + fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s in
> \"%s\" at bci:%d",
> + p2i(obj()), obj->klass()->external_name(), method ?
> method->external_name() : "", bci);
> }
>
> Which produces the following output:
>
> # Internal Error
> (/scratch/pchilano/random7/open/src/hotspot/share/runtime/synchronizer.cpp:577),
> pid=13759, tid=13785
> # fatal error: Synchronizing on object 0x00000001013d72d8 of klass
> java.lang.Float in "void SimpleTest.run()" at bci:24
> #
>
> In the second one I use the same print_stack_on() method I use for the
> warning case but print on a stringStream object instead, and then I
> print the first line of the buffer only.
>
> diff --git a/src/hotspot/share/runtime/synchronizer.cpp
> b/src/hotspot/share/runtime/synchronizer.cpp
> --- a/src/hotspot/share/runtime/synchronizer.cpp
> +++ b/src/hotspot/share/runtime/synchronizer.cpp
> @@ -572,5 +572,12 @@
> if (DiagnoseSyncOnPrimitiveWrappers == FATAL_EXIT) {
> ResourceMark rm(self);
> - fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s",
> p2i(obj()), obj->klass()->external_name());
> + stringStream ss;
> + self->print_stack_on(&ss);
> + char* base = (char*)strstr(ss.base(), "at");
> + char* newline = (char*)strchr(ss.base(), '\n');
> + if (newline != NULL) {
> + *newline = '\0';
> + }
> + fatal("Synchronizing on object " INTPTR_FORMAT " of klass %s %s",
> p2i(obj()), obj->klass()->external_name(), base);
> }
>
> Which produces the following output:
>
> # Internal Error
> (/scratch/pchilano/random7/open/src/hotspot/share/runtime/synchronizer.cpp:581),
> pid=9288, tid=9306
> # fatal error: Synchronizing on object 0x00000001013d72d8 of klass
> java.lang.Float at SimpleTest.run(SimpleTest.java:70)
> #
>
> I think the second method is better since it provides the actual file
> and line number, what do you think?
Yes, this second output looks really nice!
Coleen
>
> Thanks,
> Patricio
>> Thanks,
>> Coleen
>>
>>
>>>
>>>
>>> Thanks,
>>> Patricio
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> I don't know which exact technique JFR uses to print line numbers
>>>>> but it has to include reading the current bcp. Since when coming
>>>>> from the interpreter for monitorenter the bcp is always already
>>>>> pointing to the next instruction we need to decrement it to print
>>>>> the correct line numbers. I tested it and if I don't fix the bcp,
>>>>> as expected JFR too will print the next line number for the top
>>>>> frame.
>>>>>
>>>>>
>>>>> Patricio
>>>>>> Otherwise, this looks good to me.
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>> On 8/6/20 2:48 PM, Patricio Chilano wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> On 8/5/20 6:47 PM, Daniel D. Daugherty wrote:
>>>>>>>> I'm peeking ahead to the next webrev... :-)
>>>>>>>>
>>>>>>>> > http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev/
>>>>>>> : )
>>>>>>>
>>>>>>>> General comments:
>>>>>>>> - check files for copyright year updates.
>>>>>>> Updated accessFlags.hpp, c1_MacroAssembler_aarch64.cpp,
>>>>>>> c1_MacroAssembler_arm.cpp and c1_MacroAssembler_x86.cpp.
>>>>>>>
>>>>>>>> src/hotspot/share/runtime/synchronizer.hpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/synchronizer.cpp
>>>>>>>> L507: const markWord mark = obj->mark();
>>>>>>>> L508:
>>>>>>>> L509: if (DiagnoseSyncOnPrimitiveWrappers != 0 &&
>>>>>>>> obj->klass()->is_box()) {
>>>>>>>> L510: return false;
>>>>>>>> L511: }
>>>>>>>> The new bailout on L509-511 can move above L507.
>>>>>>> Moved.
>>>>>>>
>>>>>>>> L573: fatal("Synchronizing on object " INTPTR_FORMAT " of
>>>>>>>> klass %s", p2i(obj()), obj->klass()->external_name());
>>>>>>>> This external_name() call does not have a ResourceMark.
>>>>>>> Good catch! I had one in a previous version but then I changed
>>>>>>> the conditionals and lost it for the fatal error case. The test
>>>>>>> worked okay because for the main JavaThread there is a
>>>>>>> ResourceMark in jni_invoke_static() (jni.cpp).
>>>>>>>
>>>>>>>> src/hotspot/share/logging/logTag.hpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/share/oops/klass.hpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/share/utilities/accessFlags.hpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/globals.hpp
>>>>>>>> L814: "0: off "
>>>>>>>> Missing a ';' after "off".
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L816: "2: log message to stdout.
>>>>>>>> Perhaps add "(by default)" after "stdout" or
>>>>>>>> don't say where log output is at all.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/arguments.cpp
>>>>>>>> L4197: LogConfiguration::configure_stdout(LogLevel::Info,
>>>>>>>> true, LOG_TAGS(primitivewrappers));
>>>>>>>> Hmmm... maybe ignore my comments about L816 in globals.hpp
>>>>>>>> since it looks like logging output is configured to
>>>>>>>> 'stdout'.
>>>>>>>> I'm assuming that other log options to put output
>>>>>>>> elsewhere
>>>>>>>> are overridden by this code?
>>>>>>> Right. So the logging is done under UL with the tag
>>>>>>> primitivewrappers. If that tag is specified in Xlog then this
>>>>>>> conditional will be skipped because !log_is_enabled(Info,
>>>>>>> primitivewrappers) will be false.
>>>>>>>
>>>>>>>> src/hotspot/share/classfile/systemDictionary.cpp
>>>>>>>> L2188: for (int i = T_BOOLEAN; i < T_LONG+1; i++) {
>>>>>>>> nit - s/T_LONG+1/T_LONG + 1/
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L2191:
>>>>>>>> _box_klasses[i]->set_prototype_header(markWord::prototype());
>>>>>>>> I assume we're keeping the prototype_header field when
>>>>>>>> Biased Locking
>>>>>>>> goes away? The reason I ask:
>>>>>>>>
>>>>>>>> static markWord prototype() {
>>>>>>>> return markWord( no_hash_in_place | no_lock_in_place );
>>>>>>>> }
>>>>>>>>
>>>>>>>> is because without Biased Locking, do we really need
>>>>>>>> the prototype
>>>>>>>> anymore? The initial markWord won't need possible
>>>>>>>> variants...
>>>>>>> Yes, I think it can go away unless somebody finds another use
>>>>>>> for it. Maybe Valhalla.
>>>>>>>
>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>> L69: <Event name="SyncOnPrimitiveWrapper" category="Java
>>>>>>>> Application"
>>>>>>>> Is the category "Java Application" because it's the
>>>>>>>> application
>>>>>>>> code that did something "wrong" here? Where
>>>>>>>> "application" is loosely
>>>>>>>> defined to include the possibility of auto generated
>>>>>>>> code, library
>>>>>>>> code and the like? Or perhaps "application" because
>>>>>>>> something "above"
>>>>>>>> the "Java Virtual Machine, Runtime" did the "wrong"
>>>>>>>> thing here?
>>>>>>> I don't know what the right category should be really. I saw the
>>>>>>> events JavaMonitorEnter, JavaMonitorWait and JavaMonitorInflate
>>>>>>> and thought this event should fall in the same category they do.
>>>>>>>
>>>>>>>> src/jdk.jfr/share/conf/jfr/default.jfc
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/jdk.jfr/share/conf/jfr/profile.jfc
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> test/lib/jdk/test/lib/jfr/EventNames.java
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/aarch64/aarch64.ad
>>>>>>>> L3517: __ tbnz(tmp, exact_log2(JVM_ACC_IS_BOX_CLASS),
>>>>>>>> cont);
>>>>>>>> <snip>
>>>>>>>> L3578: __ bind(cont);
>>>>>>>> L3579: // flag == EQ indicates success
>>>>>>>> L3580: // flag == NE indicates failure
>>>>>>>> If tbnz() branches to "cont" when we have a box class,
>>>>>>>> what's
>>>>>>>> the flag value set to (EQ or NE)? And what set that
>>>>>>>> flag value?
>>>>>>>> The reason I ask is I don't think tbnz() sets condition
>>>>>>>> codes...
>>>>>>> Right, it doesn't set condition codes, so I kept the version I
>>>>>>> had (more on that below).
>>>>>>>
>>>>>>>> src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
>>>>>>>> int MacroAssembler::biased_locking_enter(Register lock_reg,
>>>>>>>> Register obj_reg,
>>>>>>>> Register swap_reg,
>>>>>>>> Register tmp_reg,
>>>>>>>> bool
>>>>>>>> swap_reg_contains_mark,
>>>>>>>> Label& done,
>>>>>>>> Label* slow_case,
>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>> I think you've changed the only callers of
>>>>>>>> biased_locking_enter()
>>>>>>>> that cared about the return value with this changeset
>>>>>>>> so it can
>>>>>>>> be changed to a void function.
>>>>>>> Ok, this is what I mentioned to David in a previous email. Done.
>>>>>>>
>>>>>>>> src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/arm/c2_MacroAssembler_arm.cpp
>>>>>>>> L96: tbnz(Rscratch, exact_log2(JVM_ACC_IS_BOX_CLASS),
>>>>>>>> done);
>>>>>>>> <snip>
>>>>>>>> L131: bind(done);
>>>>>>>> L132:
>>>>>>>> L133: // At this point flags are set as follows:
>>>>>>>> L134: // EQ -> Success
>>>>>>>> L135: // NE -> Failure, branch to slow path
>>>>>>>> If tbnz() branches to "done" when we have a box class,
>>>>>>>> what's
>>>>>>>> the flag value set to (EQ or NE)? And what set that
>>>>>>>> flag value?
>>>>>>>> The reason I ask is I don't think tbnz() sets condition
>>>>>>>> codes...
>>>>>>> Right. Same as above, I kept the version I had.
>>>>>>>
>>>>>>>> src/hotspot/cpu/arm/interp_masm_arm.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/arm/macroAssembler_arm.cpp
>>>>>>>> int MacroAssembler::biased_locking_enter(Register obj_reg,
>>>>>>>> Register swap_reg, Register tmp_reg,
>>>>>>>> bool
>>>>>>>> swap_reg_contains_mark,
>>>>>>>> Register tmp2,
>>>>>>>> Label& done, Label& slow_case,
>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>> I think you've changed the only callers of
>>>>>>>> biased_locking_enter()
>>>>>>>> that cared about the return value with this changeset
>>>>>>>> so it can
>>>>>>>> be changed to a void function.
>>>>>>> Done.
>>>>>>>
>>>>>>>> src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
>>>>>>>> src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
>>>>>>>> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
>>>>>>>> No comments on the PPC code.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
>>>>>>>> src/hotspot/cpu/s390/interp_masm_s390.cpp
>>>>>>>> src/hotspot/cpu/s390/macroAssembler_s390.cpp
>>>>>>>> No comments on the S390 code.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
>>>>>>>> L58: load_klass(hdr, obj, rklass_decode_tmp);
>>>>>>>> What will this do with a 'noreg' value? (I need a
>>>>>>>> refresher.)
>>>>>>> When not in 64 bit mode that register just won't be used.
>>>>>>>
>>>>>>>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/x86/interp_masm_x86.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/hotspot/cpu/x86/macroAssembler_x86.cpp
>>>>>>>> int MacroAssembler::biased_locking_enter(Register lock_reg,
>>>>>>>> Register obj_reg,
>>>>>>>> Register swap_reg,
>>>>>>>> Register tmp_reg,
>>>>>>>> Register tmp_reg2,
>>>>>>>> bool
>>>>>>>> swap_reg_contains_mark,
>>>>>>>> Label& done,
>>>>>>>> Label* slow_case,
>>>>>>>> BiasedLockingCounters* counters) {
>>>>>>>> I think you've changed the only caller of
>>>>>>>> biased_locking_enter()
>>>>>>>> that cared about the return value with this changeset
>>>>>>>> so it can
>>>>>>>> be changed to a void function.
>>>>>>> Done.
>>>>>>>
>>>>>>>> test/hotspot/jtreg/runtime/Monitor/SyncOnPrimitiveWrapperTest.java
>>>>>>>> L30: * @test SyncOnPrimitiveWrapperTest
>>>>>>>> No parameter to @test directive.
>>>>>>> Removed.
>>>>>>>
>>>>>>>> L136: private static long sharedCounter = 0L;
>>>>>>>> Since you don't do anything with sharedCounter other
>>>>>>>> than increment it,
>>>>>>>> can the compilers optimize it away? If it can be
>>>>>>>> optimized away, does
>>>>>>>> that mean that:
>>>>>>>>
>>>>>>>> L142: synchronized (obj) {
>>>>>>>>
>>>>>>>> can also be optimized away?
>>>>>>>>
>>>>>>>> I don't think that:
>>>>>>>>
>>>>>>>> L161: synchronized (sharedLock1) {
>>>>>>>>
>>>>>>>> can be optimized away because it is shared by multiple
>>>>>>>> threads.
>>>>>>>>
>>>>>>>> test/jdk/jdk/jfr/event/runtime/TestSyncOnPrimitiveWrapperEvent.java
>>>>>>>>
>>>>>>>> Similar questions about 'counter' being optimized away.
>>>>>>>> Similar question about "synchronized (obj) {" being
>>>>>>>> optimized away.
>>>>>>> I'm not sure if the compiler will optimize it away. Seems
>>>>>>> unlikely given the counter we are incrementing is not just local
>>>>>>> to some thread.
>>>>>>>
>>>>>>>
>>>>>>> Ok, below is v3 which has the following changes:
>>>>>>> - Use a 32 bit load for the _access_flags field, instead of 64
>>>>>>> - Martin's implementation for s390 and fix for PPC
>>>>>>> - Faster LogTest version
>>>>>>> - Above changes based on Dan review
>>>>>>>
>>>>>>> I'm retesting in mach5 tiers1-6 (which tests x64 and aarch64
>>>>>>> only) again with -XX:DiagnoseSyncOnPrimitiveWrappers=2. I
>>>>>>> checked it builds in arm32, ppc and s390.
>>>>>>>
>>>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev
>>>>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8242263/v3/inc/webrev
>>>>>>>
>>>>>>>
>>>>>>> @Martin:
>>>>>>> Please check if the test doesn't timeout for you now with the
>>>>>>> changes I made to LogTest.
>>>>>>> Also, I tried to use tbnz in aarch64 and arm32 instead of tst +
>>>>>>> br (except for c2 since we actually need to set the condition
>>>>>>> flags), but for c1 I was getting an assertion in the compiler
>>>>>>> thread:
>>>>>>>
>>>>>>> guarantee(chk == -1 || chk == 0) failed: Field too big for insn
>>>>>>>
>>>>>>> This is the stack when that happens:
>>>>>>>
>>>>>>> MacroAssembler::pd_patch_instruction_size(unsigned char*,
>>>>>>> unsigned char*)+0x398
>>>>>>> AbstractAssembler::bind(Label&)+0x118
>>>>>>> MonitorEnterStub::emit_code(LIR_Assembler*)+0x28
>>>>>>> LIR_Assembler::emit_slow_case_stubs()+0x54
>>>>>>> Compilation::emit_code_body()+0x17c
>>>>>>>
>>>>>>> The change was simply:
>>>>>>>
>>>>>>> diff --git
>>>>>>> a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>> b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>> --- a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>> +++ b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
>>>>>>> @@ -78,7 +78,6 @@
>>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>> load_klass(hdr, obj);
>>>>>>> - ldr(hdr, Address(hdr, Klass::access_flags_offset()));
>>>>>>> - tst(hdr, JVM_ACC_IS_BOX_CLASS);
>>>>>>> - br(Assembler::NE, slow_case);
>>>>>>> + ldrw(hdr, Address(hdr, Klass::access_flags_offset()));
>>>>>>> + tbnz(hdr, exact_log2(JVM_ACC_IS_BOX_CLASS), slow_case);
>>>>>>> }
>>>>>>>
>>>>>>> So the issue must be related to where we want to jump.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Patricio
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/5/20 9:01 AM, Patricio Chilano wrote:
>>>>>>>>> Hi Martin,
>>>>>>>>>
>>>>>>>>> On 8/5/20 4:47 AM, Doerr, Martin wrote:
>>>>>>>>>> Hi Patricio,
>>>>>>>>>>
>>>>>>>>>> using 8 Byte load instructions for jint fields is a terrible
>>>>>>>>>> coding style!
>>>>>>>>>> Someone else may see it and use an 8 Byte store. Will result
>>>>>>>>>> in great fun for debugging!
>>>>>>>>>>
>>>>>>>>>> And on Big Endian you will simply access the wrong bits.
>>>>>>>>> Ah, of course! Those 32 bits will be in the wrong place when
>>>>>>>>> doing the test.
>>>>>>>>>
>>>>>>>>>> Note that Big Endian Platforms are AIX, old linux ppc, s390,
>>>>>>>>>> SPARC. I don't think you have tested on them.
>>>>>>>>>>
>>>>>>>>>>> We could remove the nested synchronized statements in the
>>>>>>>>>>> run() method
>>>>>>>>>>> so that we don't generate that much logging. We could also
>>>>>>>>>>> lower
>>>>>>>>>>> LOOP_COUNT. I think the issue is also because we are running
>>>>>>>>>>> LogTest
>>>>>>>>>>> with multiple flag combinations. Not sure what we should
>>>>>>>>>>> touch first.
>>>>>>>>>>> Maybe the synchronized statements, have only one or two and
>>>>>>>>>>> test that
>>>>>>>>>>> first?
>>>>>>>>>> Sounds like helpful ideas. Please go ahead and strip things
>>>>>>>>>> down.
>>>>>>>>> Great, I will send v3 later with those changes.
>>>>>>>>>
>>>>>>>>> Thanks Martin!
>>>>>>>>>
>>>>>>>>> Patricio
>>>>>>>>>> Thanks for taking care of it.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>>>>>>>> Sent: Dienstag, 4. August 2020 20:26
>>>>>>>>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>>>>>>>>>> <david.holmes at oracle.com>;
>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net;
>>>>>>>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>> Subject: Re: RFR 8242263: Diagnose synchronization on
>>>>>>>>>>> primitive wrappers
>>>>>>>>>>>
>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>
>>>>>>>>>>> On 8/4/20 1:35 PM, Doerr, Martin wrote:
>>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>>
>>>>>>>>>>>>> Ok, I'll change it to movl + testl and test it out before
>>>>>>>>>>>>> sending v3.
>>>>>>>>>>>> Thanks. I forgot to mention arm + aarch64.
>>>>>>>>>>>> Aarch64 uses ldrw + tbnz.
>>>>>>>>>>>> Arm 32 bit uses ldr_u32 + tbnz.
>>>>>>>>>>>>
>>>>>>>>>>>> I remember having seen the same mistake
>>>>>>>>>>>> And nobody noticed it on little Endian platforms.
>>>>>>>>>>> Ok, I can use a tbnz instead of test and then a branch on
>>>>>>>>>>> arm32 and aarch64.
>>>>>>>>>>> Shouldn't a normal ldr in arm32 work fine?
>>>>>>>>>>> Also in 64 bits (either x64 or aarch64) I don't see the
>>>>>>>>>>> issue of using a
>>>>>>>>>>> 64 bit load, besides the fact that we only care about the
>>>>>>>>>>> first 32 bits.
>>>>>>>>>>> Regardless of the endianness, aren't we masking out the
>>>>>>>>>>> upper part when
>>>>>>>>>>> we do AND/TEST or if we test a bit in the 0-31 bit range?
>>>>>>>>>>> Otherwise it
>>>>>>>>>>> seems it should have failed for one of those platforms in my
>>>>>>>>>>> tests.
>>>>>>>>>>>
>>>>>>>>>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe
>>>>>>>>>>>>> you can
>>>>>>>>>>>>> specify that and see if that fixes it? Let me know if that
>>>>>>>>>>>>> works and I
>>>>>>>>>>>>> can add it to the test.
>>>>>>>>>>>> That seems to have an effect. But now I'm not patient
>>>>>>>>>>>> enough to wait for
>>>>>>>>>>> the test to finish.
>>>>>>>>>>>> Maybe the problem is that I'm using slow debug builds.
>>>>>>>>>>>> But is there a chance to make the test quicker without
>>>>>>>>>>>> losing coverage
>>>>>>>>>>> significantly?
>>>>>>>>>>>> I think the test is too slow this way to run it on a
>>>>>>>>>>>> regular basis. We'd need to
>>>>>>>>>>> spend dedicated machines for it.
>>>>>>>>>>> We could remove the nested synchronized statements in the
>>>>>>>>>>> run() method
>>>>>>>>>>> so that we don't generate that much logging. We could also
>>>>>>>>>>> lower
>>>>>>>>>>> LOOP_COUNT. I think the issue is also because we are running
>>>>>>>>>>> LogTest
>>>>>>>>>>> with multiple flag combinations. Not sure what we should
>>>>>>>>>>> touch first.
>>>>>>>>>>> Maybe the synchronized statements, have only one or two and
>>>>>>>>>>> test that
>>>>>>>>>>> first?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Patricio
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Martin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>>>>>>>>>> Sent: Dienstag, 4. August 2020 17:47
>>>>>>>>>>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>>>>>>>>>>>> <david.holmes at oracle.com>;
>>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net;
>>>>>>>>>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>> Subject: Re: RFR 8242263: Diagnose synchronization on
>>>>>>>>>>>>> primitive wrappers
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for fixing PPC and taking care of s390!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/4/20 11:18 AM, Doerr, Martin wrote:
>>>>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suggest to use movl + testl for checking the access
>>>>>>>>>>>>>> flag as for other
>>>>>>>>>>> access
>>>>>>>>>>>>> flags on x64.
>>>>>>>>>>>>> Ok, I'll change it to movl + testl and test it out before
>>>>>>>>>>>>> sending v3.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> New version for PPC64 and s390 see below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test SyncOnPrimitiveWrapperTest produces hs_err files as
>>>>>>>>>>> expected.
>>>>>>>>>>>>> However, I'm getting timeout issues:
>>>>>>>>>>>>>> timed out (timeout set to 120000ms, elapsed time
>>>>>>>>>>>>>> including timeout
>>>>>>>>>>>>> handling was 122372ms)
>>>>>>>>>>>>>> Can we provide more time?
>>>>>>>>>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe
>>>>>>>>>>>>> you can
>>>>>>>>>>>>> specify that and see if that fixes it? Let me know if that
>>>>>>>>>>>>> works and I
>>>>>>>>>>>>> can add it to the test.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Patricio
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>>>> src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>> @@ -107,8 +107,8 @@
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> load_klass(Rscratch, Roop);
>>>>>>>>>>>>>> - ld(Rscratch, in_bytes(Klass::access_flags_offset()),
>>>>>>>>>>>>>> Rscratch);
>>>>>>>>>>>>>> - andi(Rscratch, Rscratch, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>> + lwz(Rscratch,
>>>>>>>>>>>>>> in_bytes(Klass::access_flags_offset()), Rscratch);
>>>>>>>>>>>>>> + testbitdi(CCR0, R0, Rscratch,
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> bne(CCR0, slow_case);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>>>> src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>> @@ -912,8 +912,8 @@
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> load_klass(tmp, object);
>>>>>>>>>>>>>> - ld(tmp, in_bytes(Klass::access_flags_offset()), tmp);
>>>>>>>>>>>>>> - andi(tmp, tmp, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>> + lwz(tmp, in_bytes(Klass::access_flags_offset()),
>>>>>>>>>>>>>> tmp);
>>>>>>>>>>>>>> + testbitdi(CCR0, R0, tmp,
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> bne(CCR0, slow_case);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>>>> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>> @@ -2838,8 +2838,8 @@
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> load_klass(temp, oop);
>>>>>>>>>>>>>> - ld(temp, in_bytes(Klass::access_flags_offset()), temp);
>>>>>>>>>>>>>> - andi(temp, temp, JVM_ACC_IS_BOX_CLASS);
>>>>>>>>>>>>>> + lwz(temp, in_bytes(Klass::access_flags_offset()),
>>>>>>>>>>>>>> temp);
>>>>>>>>>>>>>> + testbitdi(CCR0, R0, temp,
>>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> bne(CCR0, cont);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>> src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>> @@ -91,6 +91,12 @@
>>>>>>>>>>>>>> // Save object being locked into the
>>>>>>>>>>>>>> BasicObjectLock...
>>>>>>>>>>>>>> z_stg(obj, Address(disp_hdr,
>>>>>>>>>>> BasicObjectLock::obj_offset_in_bytes()));
>>>>>>>>>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> + load_klass(Z_R1_scratch, obj);
>>>>>>>>>>>>>> + testbit(Address(Z_R1_scratch,
>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> + z_btrue(slow_case);
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> if (UseBiasedLocking) {
>>>>>>>>>>>>>> biased_locking_enter(obj, hdr, Z_R1_scratch,
>>>>>>>>>>>>>> Z_R0_scratch, done,
>>>>>>>>>>>>> &slow_case);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>>>> src/hotspot/cpu/s390/interp_masm_s390.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug
>>>>>>>>>>>>>> 04 10:03:57
>>>>>>>>>>>>> 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug 04
>>>>>>>>>>> 16:04:31
>>>>>>>>>>>>> 2020 +0200
>>>>>>>>>>>>>> @@ -1000,6 +1000,12 @@
>>>>>>>>>>>>>> // Load markWord from object into displaced_header.
>>>>>>>>>>>>>> z_lg(displaced_header,
>>>>>>>>>>>>>> oopDesc::mark_offset_in_bytes(), object);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> + load_klass(Z_R1_scratch, object);
>>>>>>>>>>>>>> + testbit(Address(Z_R1_scratch,
>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> + z_btrue(slow_case);
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> if (UseBiasedLocking) {
>>>>>>>>>>>>>> biased_locking_enter(object, displaced_header,
>>>>>>>>>>>>>> Z_R1, Z_R0, done,
>>>>>>>>>>>>> &slow_case);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> diff -r 77852e129401
>>>>>>>>>>>>>> src/hotspot/cpu/s390/macroAssembler_s390.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 10:03:57 2020 +0200
>>>>>>>>>>>>>> +++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue
>>>>>>>>>>>>>> Aug 04
>>>>>>>>>>>>> 16:04:31 2020 +0200
>>>>>>>>>>>>>> @@ -3358,6 +3358,12 @@
>>>>>>>>>>>>>> // Load markWord from oop into mark.
>>>>>>>>>>>>>> z_lg(displacedHeader, 0, oop);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>>>>>>>>>> + load_klass(Z_R1_scratch, oop);
>>>>>>>>>>>>>> + testbit(Address(Z_R1_scratch,
>>>>>>>>>>>>>> Klass::access_flags_offset()),
>>>>>>>>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>>>>>>>>>> + z_btrue(done);
>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> if (try_bias) {
>>>>>>>>>>>>>> biased_locking_enter(oop, displacedHeader, temp,
>>>>>>>>>>>>>> Z_R0, done);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list