RFR 8242263: Diagnose synchronization on primitive wrappers
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Aug 5 21:47:35 UTC 2020
I'm peeking ahead to the next webrev... :-)
> http://cr.openjdk.java.net/~pchilanomate/8242263/v3/webrev/
General comments:
- check files for copyright year updates.
src/hotspot/share/runtime/synchronizer.hpp
No comments.
src/hotspot/share/runtime/synchronizer.cpp
L507: const markWord mark = obj->mark();
L508:
L509: if (DiagnoseSyncOnPrimitiveWrappers != 0 &&
obj->klass()->is_box()) {
L510: return false;
L511: }
The new bailout on L509-511 can move above L507.
L573: fatal("Synchronizing on object " INTPTR_FORMAT " of klass
%s", p2i(obj()), obj->klass()->external_name());
This external_name() call does not have a ResourceMark.
src/hotspot/share/logging/logTag.hpp
No comments.
src/hotspot/share/oops/klass.hpp
No comments.
src/hotspot/share/utilities/accessFlags.hpp
No comments.
src/hotspot/share/runtime/globals.hpp
L814: "0: off "
Missing a ';' after "off".
L816: "2: log message to stdout.
Perhaps add "(by default)" after "stdout" or
don't say where log output is at all.
src/hotspot/share/runtime/arguments.cpp
L4197: LogConfiguration::configure_stdout(LogLevel::Info, true,
LOG_TAGS(primitivewrappers));
Hmmm... maybe ignore my comments about L816 in globals.hpp
since it looks like logging output is configured to 'stdout'.
I'm assuming that other log options to put output elsewhere
are overridden by this code?
src/hotspot/share/classfile/systemDictionary.cpp
L2188: for (int i = T_BOOLEAN; i < T_LONG+1; i++) {
nit - s/T_LONG+1/T_LONG + 1/
L2191: _box_klasses[i]->set_prototype_header(markWord::prototype());
I assume we're keeping the prototype_header field when Biased
Locking
goes away? The reason I ask:
static markWord prototype() {
return markWord( no_hash_in_place | no_lock_in_place );
}
is because without Biased Locking, do we really need the prototype
anymore? The initial markWord won't need possible variants...
src/hotspot/share/jfr/metadata/metadata.xml
L69: <Event name="SyncOnPrimitiveWrapper" category="Java Application"
Is the category "Java Application" because it's the application
code that did something "wrong" here? Where "application" is
loosely
defined to include the possibility of auto generated code, library
code and the like? Or perhaps "application" because something
"above"
the "Java Virtual Machine, Runtime" did the "wrong" thing here?
src/jdk.jfr/share/conf/jfr/default.jfc
No comments.
src/jdk.jfr/share/conf/jfr/profile.jfc
No comments.
test/lib/jdk/test/lib/jfr/EventNames.java
No comments.
src/hotspot/cpu/aarch64/aarch64.ad
L3517: __ tbnz(tmp, exact_log2(JVM_ACC_IS_BOX_CLASS), cont);
<snip>
L3578: __ bind(cont);
L3579: // flag == EQ indicates success
L3580: // flag == NE indicates failure
If tbnz() branches to "cont" when we have a box class, what's
the flag value set to (EQ or NE)? And what set that flag value?
The reason I ask is I don't think tbnz() sets condition codes...
src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
No comments.
src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
No comments.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
int MacroAssembler::biased_locking_enter(Register lock_reg,
Register obj_reg,
Register swap_reg,
Register tmp_reg,
bool swap_reg_contains_mark,
Label& done,
Label* slow_case,
BiasedLockingCounters* counters) {
I think you've changed the only callers of biased_locking_enter()
that cared about the return value with this changeset so it can
be changed to a void function.
src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
No comments.
src/hotspot/cpu/arm/c2_MacroAssembler_arm.cpp
L96: tbnz(Rscratch, exact_log2(JVM_ACC_IS_BOX_CLASS), done);
<snip>
L131: bind(done);
L132:
L133: // At this point flags are set as follows:
L134: // EQ -> Success
L135: // NE -> Failure, branch to slow path
If tbnz() branches to "done" when we have a box class, what's
the flag value set to (EQ or NE)? And what set that flag value?
The reason I ask is I don't think tbnz() sets condition codes...
src/hotspot/cpu/arm/interp_masm_arm.cpp
No comments.
src/hotspot/cpu/arm/macroAssembler_arm.cpp
int MacroAssembler::biased_locking_enter(Register obj_reg, Register
swap_reg, Register tmp_reg,
bool swap_reg_contains_mark,
Register tmp2,
Label& done, Label& slow_case,
BiasedLockingCounters* counters) {
I think you've changed the only callers of biased_locking_enter()
that cared about the return value with this changeset so it can
be changed to a void function.
src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
No comments on the PPC code.
src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
src/hotspot/cpu/s390/interp_masm_s390.cpp
src/hotspot/cpu/s390/macroAssembler_s390.cpp
No comments on the S390 code.
src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
L58: load_klass(hdr, obj, rklass_decode_tmp);
What will this do with a 'noreg' value? (I need a refresher.)
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
No comments.
src/hotspot/cpu/x86/interp_masm_x86.cpp
No comments.
src/hotspot/cpu/x86/macroAssembler_x86.cpp
int MacroAssembler::biased_locking_enter(Register lock_reg,
Register obj_reg,
Register swap_reg,
Register tmp_reg,
Register tmp_reg2,
bool swap_reg_contains_mark,
Label& done,
Label* slow_case,
BiasedLockingCounters* counters) {
I think you've changed the only caller of biased_locking_enter()
that cared about the return value with this changeset so it can
be changed to a void function.
test/hotspot/jtreg/runtime/Monitor/SyncOnPrimitiveWrapperTest.java
L30: * @test SyncOnPrimitiveWrapperTest
No parameter to @test directive.
L136: private static long sharedCounter = 0L;
Since you don't do anything with sharedCounter other than
increment it,
can the compilers optimize it away? If it can be optimized
away, does
that mean that:
L142: synchronized (obj) {
can also be optimized away?
I don't think that:
L161: synchronized (sharedLock1) {
can be optimized away because it is shared by multiple threads.
test/jdk/jdk/jfr/event/runtime/TestSyncOnPrimitiveWrapperEvent.java
Similar questions about 'counter' being optimized away.
Similar question about "synchronized (obj) {" being optimized away.
Dan
On 8/5/20 9:01 AM, Patricio Chilano wrote:
> Hi Martin,
>
> On 8/5/20 4:47 AM, Doerr, Martin wrote:
>> Hi Patricio,
>>
>> using 8 Byte load instructions for jint fields is a terrible coding
>> style!
>> Someone else may see it and use an 8 Byte store. Will result in great
>> fun for debugging!
>>
>> And on Big Endian you will simply access the wrong bits.
> Ah, of course! Those 32 bits will be in the wrong place when doing the
> test.
>
>> Note that Big Endian Platforms are AIX, old linux ppc, s390, SPARC. I
>> don't think you have tested on them.
>>
>>> We could remove the nested synchronized statements in the run() method
>>> so that we don't generate that much logging. We could also lower
>>> LOOP_COUNT. I think the issue is also because we are running LogTest
>>> with multiple flag combinations. Not sure what we should touch first.
>>> Maybe the synchronized statements, have only one or two and test that
>>> first?
>> Sounds like helpful ideas. Please go ahead and strip things down.
> Great, I will send v3 later with those changes.
>
> Thanks Martin!
>
> Patricio
>> Thanks for taking care of it.
>>
>> Best regards,
>> Martin
>>
>>
>>> -----Original Message-----
>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>> Sent: Dienstag, 4. August 2020 20:26
>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>> <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net;
>>> hotspot-jfr-dev at openjdk.java.net
>>> Subject: Re: RFR 8242263: Diagnose synchronization on primitive
>>> wrappers
>>>
>>> Hi Martin,
>>>
>>> On 8/4/20 1:35 PM, Doerr, Martin wrote:
>>>> Hi Patricio,
>>>>
>>>>> Ok, I'll change it to movl + testl and test it out before sending v3.
>>>> Thanks. I forgot to mention arm + aarch64.
>>>> Aarch64 uses ldrw + tbnz.
>>>> Arm 32 bit uses ldr_u32 + tbnz.
>>>>
>>>> I remember having seen the same mistake
>>>> And nobody noticed it on little Endian platforms.
>>> Ok, I can use a tbnz instead of test and then a branch on arm32 and
>>> aarch64.
>>> Shouldn't a normal ldr in arm32 work fine?
>>> Also in 64 bits (either x64 or aarch64) I don't see the issue of
>>> using a
>>> 64 bit load, besides the fact that we only care about the first 32
>>> bits.
>>> Regardless of the endianness, aren't we masking out the upper part when
>>> we do AND/TEST or if we test a bit in the 0-31 bit range? Otherwise it
>>> seems it should have failed for one of those platforms in my tests.
>>>
>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe you can
>>>>> specify that and see if that fixes it? Let me know if that works
>>>>> and I
>>>>> can add it to the test.
>>>> That seems to have an effect. But now I'm not patient enough to
>>>> wait for
>>> the test to finish.
>>>> Maybe the problem is that I'm using slow debug builds.
>>>> But is there a chance to make the test quicker without losing coverage
>>> significantly?
>>>> I think the test is too slow this way to run it on a regular basis.
>>>> We'd need to
>>> spend dedicated machines for it.
>>> We could remove the nested synchronized statements in the run() method
>>> so that we don't generate that much logging. We could also lower
>>> LOOP_COUNT. I think the issue is also because we are running LogTest
>>> with multiple flag combinations. Not sure what we should touch first.
>>> Maybe the synchronized statements, have only one or two and test that
>>> first?
>>>
>>>
>>> Thanks,
>>> Patricio
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>> Sent: Dienstag, 4. August 2020 17:47
>>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>>>>> <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net;
>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>> Subject: Re: RFR 8242263: Diagnose synchronization on primitive
>>>>> wrappers
>>>>>
>>>>> Hi Martin,
>>>>>
>>>>> Thanks for fixing PPC and taking care of s390!
>>>>>
>>>>> On 8/4/20 11:18 AM, Doerr, Martin wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> I suggest to use movl + testl for checking the access flag as for
>>>>>> other
>>> access
>>>>> flags on x64.
>>>>> Ok, I'll change it to movl + testl and test it out before sending v3.
>>>>>
>>>>>> New version for PPC64 and s390 see below.
>>>>>>
>>>>>> The test SyncOnPrimitiveWrapperTest produces hs_err files as
>>> expected.
>>>>> However, I'm getting timeout issues:
>>>>>> timed out (timeout set to 120000ms, elapsed time including timeout
>>>>> handling was 122372ms)
>>>>>> Can we provide more time?
>>>>> I see that some tests use @run driver/timeout=xxxx. Maybe you can
>>>>> specify that and see if that fixes it? Let me know if that works
>>>>> and I
>>>>> can add it to the test.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>> Best regards,
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> diff -r 77852e129401 src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
>>>>>> --- a/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue Aug 04
>>>>> 10:03:57 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp Tue Aug 04
>>>>> 16:04:31 2020 +0200
>>>>>> @@ -107,8 +107,8 @@
>>>>>>
>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> load_klass(Rscratch, Roop);
>>>>>> - ld(Rscratch, in_bytes(Klass::access_flags_offset()), Rscratch);
>>>>>> - andi(Rscratch, Rscratch, JVM_ACC_IS_BOX_CLASS);
>>>>>> + lwz(Rscratch, in_bytes(Klass::access_flags_offset()),
>>>>>> Rscratch);
>>>>>> + testbitdi(CCR0, R0, Rscratch,
>>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> bne(CCR0, slow_case);
>>>>>> }
>>>>>>
>>>>>> diff -r 77852e129401 src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
>>>>>> --- a/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>> 10:03:57 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp Tue Aug 04
>>>>> 16:04:31 2020 +0200
>>>>>> @@ -912,8 +912,8 @@
>>>>>>
>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> load_klass(tmp, object);
>>>>>> - ld(tmp, in_bytes(Klass::access_flags_offset()), tmp);
>>>>>> - andi(tmp, tmp, JVM_ACC_IS_BOX_CLASS);
>>>>>> + lwz(tmp, in_bytes(Klass::access_flags_offset()), tmp);
>>>>>> + testbitdi(CCR0, R0, tmp, exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> bne(CCR0, slow_case);
>>>>>> }
>>>>>>
>>>>>> diff -r 77852e129401 src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
>>>>>> --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>> 10:03:57 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp Tue Aug 04
>>>>> 16:04:31 2020 +0200
>>>>>> @@ -2838,8 +2838,8 @@
>>>>>>
>>>>>> if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> load_klass(temp, oop);
>>>>>> - ld(temp, in_bytes(Klass::access_flags_offset()), temp);
>>>>>> - andi(temp, temp, JVM_ACC_IS_BOX_CLASS);
>>>>>> + lwz(temp, in_bytes(Klass::access_flags_offset()), temp);
>>>>>> + testbitdi(CCR0, R0, temp, exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> bne(CCR0, cont);
>>>>>> }
>>>>>>
>>>>>> diff -r 77852e129401
>>> src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
>>>>>> --- a/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp Tue Aug 04
>>>>> 10:03:57 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp Tue Aug 04
>>>>> 16:04:31 2020 +0200
>>>>>> @@ -91,6 +91,12 @@
>>>>>> // Save object being locked into the BasicObjectLock...
>>>>>> z_stg(obj, Address(disp_hdr,
>>> BasicObjectLock::obj_offset_in_bytes()));
>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> + load_klass(Z_R1_scratch, obj);
>>>>>> + testbit(Address(Z_R1_scratch, Klass::access_flags_offset()),
>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> + z_btrue(slow_case);
>>>>>> + }
>>>>>> +
>>>>>> if (UseBiasedLocking) {
>>>>>> biased_locking_enter(obj, hdr, Z_R1_scratch,
>>>>>> Z_R0_scratch, done,
>>>>> &slow_case);
>>>>>> }
>>>>>> diff -r 77852e129401 src/hotspot/cpu/s390/interp_masm_s390.cpp
>>>>>> --- a/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug 04 10:03:57
>>>>> 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/s390/interp_masm_s390.cpp Tue Aug 04
>>> 16:04:31
>>>>> 2020 +0200
>>>>>> @@ -1000,6 +1000,12 @@
>>>>>> // Load markWord from object into displaced_header.
>>>>>> z_lg(displaced_header, oopDesc::mark_offset_in_bytes(),
>>>>>> object);
>>>>>>
>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> + load_klass(Z_R1_scratch, object);
>>>>>> + testbit(Address(Z_R1_scratch, Klass::access_flags_offset()),
>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> + z_btrue(slow_case);
>>>>>> + }
>>>>>> +
>>>>>> if (UseBiasedLocking) {
>>>>>> biased_locking_enter(object, displaced_header, Z_R1,
>>>>>> Z_R0, done,
>>>>> &slow_case);
>>>>>> }
>>>>>> diff -r 77852e129401 src/hotspot/cpu/s390/macroAssembler_s390.cpp
>>>>>> --- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue Aug 04
>>>>> 10:03:57 2020 +0200
>>>>>> +++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp Tue Aug 04
>>>>> 16:04:31 2020 +0200
>>>>>> @@ -3358,6 +3358,12 @@
>>>>>> // Load markWord from oop into mark.
>>>>>> z_lg(displacedHeader, 0, oop);
>>>>>>
>>>>>> + if (DiagnoseSyncOnPrimitiveWrappers != 0) {
>>>>>> + load_klass(Z_R1_scratch, oop);
>>>>>> + testbit(Address(Z_R1_scratch, Klass::access_flags_offset()),
>>>>> exact_log2(JVM_ACC_IS_BOX_CLASS));
>>>>>> + z_btrue(done);
>>>>>> + }
>>>>>> +
>>>>>> if (try_bias) {
>>>>>> biased_locking_enter(oop, displacedHeader, temp, Z_R0,
>>>>>> done);
>>>>>> }
>>>>>>
>
More information about the hotspot-runtime-dev
mailing list