8267042: bug in monitor locking/unlocking on ARM32 C1 due to uninitialized BasicObjectLock::_displaced_header

Boris Ulasevich boris.ulasevich at bell-sw.com
Wed May 19 09:20:23 UTC 2021


Hi Chris,

I agree that the problem is in C1_MacroAssembler::lock_object.

What I do not like in your fix is
- an arbitrary non-zero value is put into disp_hdr address for 'ne' case.
- there is a similar code pattern in 
SharedRuntime::generate_native_wrapper - should not it be fixed too?
- the second comment in hdr bits manipulation code is wrong: "// -2- 
test (hdr - SP) if the low two bits are 0"

regards,
Boris

On 19.05.2021 08:31, Chris Cole wrote:
> Hi All,
>
> I have continued to further investigate this bug that I reported and
> have some additional information.
>
> I believe that this issue is similar to JDK-8153107  "enabling
> ObjectSynchronizer::quick_enter() on ARM64 causes hangs" (see [1]),
> this impacted C2 on both ARM32 and ARM64. The fix for that issue (see
> [2]) fixed MacroAssembler::fast_lock() that is used by C2. When this
> fix was made, I believe only C2 was using
> ObjectSynchronizer::quick_enter(). Now (version 11.0.10+ and 15+) that
> C1 is using ObjectSynchronizer::quick_enter() due to JDK-8241234, a
> similar change to C1_MacroAssembler::lock_object() used by C1 is
> required. The following change to unconditionally store to the
> displaced header field should fix things for C1.
>
> diff --git a/src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
> b/src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
> index 9d5c82dceb9..9a628eb7de5 100644
> --- a/src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
> +++ b/src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp
> @@ -235,7 +235,8 @@ int C1_MacroAssembler::lock_object(Register hdr,
> Register obj,
>     sub(tmp2, hdr, SP, eq);
>     movs(tmp2, AsmOperand(tmp2, lsr, exact_log2(os::vm_page_size())), eq);
>     // If 'eq' then OK for recursive fast locking: store 0 into a lock record.
> -  str(tmp2, Address(disp_hdr, mark_offset), eq);
> +  // set to non zero otherwise (see discussion in JDK-8267042)
> +  str(tmp2, Address(disp_hdr, mark_offset));
>     b(fast_lock_done, eq);
>     // else need slow case
>     b(slow_case);
>
> This should fix the specific issue that I was seeing and that is
> reproduce with test case I provided.
>
> I also reviewed the code to see if there were any other possible issue
> related to Runtime1::monitorenter() not always initializing displaced
> header field due to ObjectSynchronizer::quick_enter(). I found that if
> the -XX:-UseFastLocking develop flag is used (C1 doesn't use any
> inline monitor lock/unlock code and just calls the
> Runtime1::monitorenter() and Runtime1::monitorexit()), then the test
> case will reproduce deadlock on any of the platforms (arm32, aarch64,
> x86_64, etc) due to uninitialized displaced header field. Here is a
> fix for that. (Note that this change below, makes the change above to
> c1_MacroAssembler_arm.cpp unnecessary).
>
> diff --git a/src/hotspot/share/runtime/synchronizer.cpp
> b/src/hotspot/share/runtime/synchronizer.cpp
> index 98afe838c49..db3eb255778 100644
> --- a/src/hotspot/share/runtime/synchronizer.cpp
> +++ b/src/hotspot/share/runtime/synchronizer.cpp
> @@ -355,11 +355,6 @@ bool ObjectSynchronizer::quick_enter(oop obj, Thread* self,
>       // Case: light contention possibly amenable to TLE
>       // Case: TLE inimical operations such as nested/recursive synchronization
>
> -    if (owner == self) {
> -      m->_recursions++;
> -      return true;
> -    }
> -
>       // This Java Monitor is inflated so obj's header will never be
>       // displaced to this thread's BasicLock. Make the displaced header
>       // non-NULL so this BasicLock is not seen as recursive nor as
> @@ -372,6 +367,11 @@ bool ObjectSynchronizer::quick_enter(oop obj, Thread* self,
>       // and last are the inflated Java Monitor (ObjectMonitor) checks.
>       lock->set_displaced_header(markWord::unused_mark());
>
> +    if (owner == self) {
> +      m->_recursions++;
> +      return true;
> +    }
> +
>       if (owner == NULL && m->try_set_owner_from(NULL, self) == NULL) {
>         assert(m->_recursions == 0, "invariant");
>         return true;
>
> I am also reviewing the biased_locking_enter() call from
> C1_MacroAssembler::lock_object(), this code path can call
> Runtime1::monitorenter() without initializing the displaced header
> field. But in this code path it might not be possible for
> ObjectSynchronizer::quick_enter() to return "true" without calling
> lock->set_displaced_header(markWord::unused_mark()), this would
> require current thread to own inflated monitor which seems to not be
> possible in this case when biased_locking_enter() is calling
> Runtime1::monitorenter().
>
> Anyway, I will continue to review the code to make sure this aren't
> any other issues related to this issue.
>
> Thanks,
> Chris
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8153107
> [2] http://hg.openjdk.java.net/jdk/jdk/rev/3e66d204af9b
>
> On Wed, May 12, 2021 at 8:58 PM Chris Cole <chris at sageembedded.com> wrote:
>> Hi Dan,
>>
>> Great, thanks! Feel free to contact me if there is anything else I can
>> do that is helpful.
>>
>> Chris
>>
>> On Wed, May 12, 2021 at 10:21 AM <daniel.daugherty at oracle.com> wrote:
>>> Hi Chris,
>>>
>>> I filed the following new bug on your behalf:
>>>
>>> JDK-8267042 bug in monitor locking/unlocking on ARM32 C1 due to
>>> uninitialized BasicObjectLock::_displaced_header
>>> https://bugs.openjdk.java.net/browse/JDK-8267042
>>>
>>> Dan
>>>
>>>
>>> On 5/12/21 1:02 PM, Chris Cole wrote:
>>>> Hi,
>>>>
>>>> Not sure if this is the appropriate place or method to report an OpenJDK
>>>> bug, if not please advise.
>>>>
>>>> I have discovered a bug with ARM32 C1 monitor locking/unlocking that can
>>>> result in deadlock. The bug was introduced with JCK-8241234 "Unify monitor
>>>> enter/exit runtime entries" [1]. This change introduced a call to
>>>> ObjectSynchronizer::quick_entry() within the logic for
>>>> Runtime1::monitorenter().
>>>> If the monitor is inflated and already owned by the current thread, then
>>>> ObjectSynchronizer::quick_entry() simply increments
>>>> ObjectMonitor::_recursions and returns. In this case
>>>> Runtime1::monitorenter() returns to the JIT compiled code without calling
>>>> "lock->set_displaced_header(markWord::unused_mark())" (see [2]). For ARM32
>>>> the _displaced_header field is not always initialized in JIT code before
>>>> calling Runtime1::monitorenter() helper. If the uninitialized value of
>>>> _displaced_header field on stack happens to be NULL, this causes an issue
>>>> because the JIT code to exit the monitor first checks for a NULL
>>>> _displaced_header as an indication for non-inflated recursive locking which
>>>> is a noop for exiting the monitor (see [3]). This means that the
>>>> Runtime1::monitorexit() helper is not called as required to exit this
>>>> inflated monitor, and the ObjectMonitor::_recursions is not decremented as
>>>> required. This leads to thread not unlocking the monitor when required and
>>>> deadlock when another thread tries to lock the monitor.
>>>>
>>>> This bug is not present on AArch64 and x86, because the displaced header is
>>>> initialized in JIT code with the "unlocked object header" value (which is
>>>> non-zero) before calling Runtime1::monitorenter() helper (see [4] and [5]).
>>>> Note sure about other CPU architectures.
>>>>
>>>> I see two ways to fix this.
>>>> 1) In ObjectSynchronizer::quick_entry() move the
>>>> "lock->set_displaced_header(markWord::unused_mark())" statement to before
>>>> the "if (owner == current)" at line 340 in share/runtime/synchronizer.cpp
>>>> (see [6]), so that Runtime1::monitorenter() helper logic always initializes
>>>> the displaced header field as was the case before JCK-8241234.
>>>> 2) For ARM32 add JIT code to initialize the displaced header field before
>>>> calling Runtime1::monitorenter() helper as done for AArch64 and x86.
>>>>
>>>> Not sure which is better (or maybe both are required for some reason I am
>>>> not aware of). I believe this "displacted header" on the stack can be
>>>> looked at by stack walkers but I am not familiar with the exact details and
>>>> if there are implications on this fix.
>>>>
>>>> The bug is also present in OpenJDK 11.0.10 and later (introduced by the
>>>> backport of JDK-8241234 [1]).
>>>>
>>>> I/my company (Sage Embedded Software) has signed the Oracle Contributor
>>>> Agreement (OCA) and have been granted access to JCK.
>>>>
>>>> The bug can be reproduced in my environment with the OpenJDK Community TCK
>>>> testing of java.io.PipedReader that deadlocks, but because reproduction of
>>>> the issue requires uninitialized stack field to be zero, it might not
>>>> happen in some environments. I have a Java test case that can reproduce
>>>> this issue on ARM in 32 bit mode. It is pasted inline below at the end of
>>>> the email. There is a "getZeroOnStack()" method that I think helps get a
>>>> zero into the uninitialized _displaced_header field. The test case source
>>>> code is copied from OpenJDK java.io.PipedReader source code and then
>>>> modified. It needs to be run with only C1 enabled (I am using minimal
>>>> variant to enforce this) and the following command line options
>>>> (-XX:-BackgroundCompilation -XX:CompileThreshold=500
>>>> -XX:CompileOnly="com.sageembedded.test.MonitorBugTest::receive"). The test
>>>> case should run and then end with "Source thread done" and "Reading
>>>> complete" output if the bug is not reproduced. If the monitor bug is
>>>> reproduced the test case will not exit and the main thread will be
>>>> deadlocked, with the main thread last printing "read() before wait" and
>>>> missing "read() after wait" and "Reading complete". If useful I can provide
>>>> the output of this test cause including -XX:PrintAssebly and logging that I
>>>> added to ObjectSynchronizer::quick_entry() that shows uninitialized
>>>> lock->_displaced_header and ObjectMonitor->_recursions continue to get
>>>> incremented (into the 1000s) as the MonitorBugTest.receive() method is
>>>> called in a loop.
>>>>
>>>> Please let me know if there is anything else that would be helpful. I hope
>>>> to become active in the OpenJDK Community. My time is a little limited at
>>>> the moment, so sometimes it might take a day to respond (I have 3 and 6
>>>> year old kids). In the coming years I expect to have additional time to be
>>>> more involved in the OpenJDK Community.
>>>>
>>>> Best regards,
>>>> Chris Cole
>>>> Sage Embedded Software LLC
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8241234
>>>> [2]
>>>> https://github.com/openjdk/jdk/blob/dfe8833f5d9a9ac59857143a86d07f85769b8eae/src/hotspot/share/runtime/synchronizer.cpp#L343
>>>> [3]
>>>> https://github.com/openjdk/jdk/blob/dfe8833f5d9a9ac59857143a86d07f85769b8eae/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp#L130
>>>> [4]
>>>> https://github.com/openjdk/jdk/blob/71b8ad45b4de6836e3bb2716ebf136f3f8ea2198/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp#L95
>>>> [5]
>>>> https://github.com/openjdk/jdk/blob/dfe8833f5d9a9ac59857143a86d07f85769b8eae/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp#L74
>>>> [6]
>>>> https://github.com/openjdk/jdk/blob/dfe8833f5d9a9ac59857143a86d07f85769b8eae/src/hotspot/share/runtime/synchronizer.cpp#L340
>>>>
>>>> /*
>>>>    * Copyright (c) 1996, 2020, Oracle and/or its affiliates. All rights
>>>> reserved.
>>>>    * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>    *
>>>>    * This code is free software; you can redistribute it and/or modify it
>>>>    * under the terms of the GNU General Public License version 2 only, as
>>>>    * published by the Free Software Foundation.  Oracle designates this
>>>>    * particular file as subject to the "Classpath" exception as provided
>>>>    * by Oracle in the LICENSE file that accompanied this code.
>>>>    *
>>>>    * This code is distributed in the hope that it will be useful, but WITHOUT
>>>>    * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>>    * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>>    * version 2 for more details (a copy is included in the LICENSE file that
>>>>    * accompanied this code).
>>>>    *
>>>>    * You should have received a copy of the GNU General Public License version
>>>>    * 2 along with this work; if not, write to the Free Software Foundation,
>>>>    * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
>>>>    *
>>>>    * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
>>>>    * or visit www.oracle.com if you need additional information or have any
>>>>    * questions.
>>>>    */
>>>>
>>>> package com.sageembedded.test;
>>>>
>>>> import java.io.IOException;
>>>> import java.io.InterruptedIOException;
>>>>
>>>> /*
>>>>    * java -cp bin -minimal -XX:-BackgroundCompilation -XX:CompileThreshold=500
>>>>    *  -XX:CompileOnly="com.sageembedded.test.MonitorBugTest::receive"
>>>>    *  -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly
>>>>    *  com.sageembedded.test.MonitorBugTest
>>>>    */
>>>> public class MonitorBugTest {
>>>>
>>>>           private static int DATA_SIZE = 1000;
>>>>           private static int BUFFER_SIZE = 256;
>>>>
>>>>           private char buffer[] = new char[BUFFER_SIZE];
>>>>           private int writeIndex = -1;
>>>>           private int readIndex = 0;
>>>>
>>>>           public Object lock = new Object();
>>>>
>>>>           public static void main(String[] args) {
>>>>                   MonitorBugTest test = new MonitorBugTest();
>>>>                   test.run();
>>>>           }
>>>>           private void run() {
>>>>                   System.out.println("Starting test");
>>>>
>>>>                   SourceThread source = new SourceThread();
>>>>                   source.start();
>>>>
>>>>                   try {
>>>>                           for (int i = 0; i < DATA_SIZE; i++) {
>>>>                                   read();
>>>>                           }
>>>>                   } catch (IOException e) {
>>>>                           e.printStackTrace();
>>>>                   }
>>>>                   System.out.println("Reading complete");
>>>>
>>>>           }
>>>>
>>>>           synchronized void receive(char data[], int offset, int length)
>>>> throws IOException {
>>>>                   while (--length >= 0) {
>>>>                           getZeroOnStack(offset);
>>>>                           receive(data[offset++]);
>>>>                   }
>>>>           }
>>>>
>>>>           private void getZeroOnStack(int offset) {
>>>>                   int l1;
>>>>                   int l2;
>>>>                   int l3;
>>>>                   int l4;
>>>>                   int l5;
>>>>                   int l6;
>>>>                   int l7;
>>>>                   int l8;
>>>>                   int l9;
>>>>                   int l10;
>>>>                   int l11;
>>>>                   int l12;
>>>>                   int l13;
>>>>                   int l14;
>>>>                   int l15;
>>>>                   int l16;
>>>>
>>>>                   l1 = 0;
>>>>                   l2 = 0;
>>>>                   l3 = 0;
>>>>                   l4 = 0;
>>>>                   l5 = 0;
>>>>                   l6 = 0;
>>>>                   l7 = 0;
>>>>                   l8 = 0;
>>>>                   l9 = 0;
>>>>                   l10 = 0;
>>>>                   l11 = 0;
>>>>                   l12 = 0;
>>>>                   l13 = 0;
>>>>                   l14 = 0;
>>>>                   l15 = 0;
>>>>                   l16 = 0;
>>>>           }
>>>>
>>>>           synchronized void receive(int c) throws IOException {
>>>>                   while (writeIndex == readIndex) {
>>>>                           notifyAll();
>>>>                           try {
>>>>                                   wait(1000);
>>>>                           } catch (InterruptedException e) {
>>>>                                   throw new InterruptedIOException();
>>>>                           }
>>>>                   }
>>>>                   if (writeIndex < 0) {
>>>>                           writeIndex = 0;
>>>>                           readIndex = 0;
>>>>                   }
>>>>                   buffer[writeIndex++] = (char) c;
>>>>                   if (writeIndex >= buffer.length) {
>>>>                           writeIndex = 0;
>>>>                   }
>>>>           }
>>>>
>>>>           synchronized void last() {
>>>>                   notifyAll();
>>>>           }
>>>>           public synchronized int read() throws IOException {
>>>>                   while (writeIndex < 0) {
>>>>                           notifyAll();
>>>>                           try {
>>>>                                   System.out.println("read() before wait");
>>>>                                   wait(1000);
>>>>                                   System.out.println("read() after wait");
>>>>                           } catch (InterruptedException e) {
>>>>                                   throw new InterruptedIOException();
>>>>                           }
>>>>                   }
>>>>                   int value = buffer[readIndex++];
>>>>                   if (readIndex >= buffer.length) {
>>>>                           readIndex = 0;
>>>>                   }
>>>>                   if (writeIndex == readIndex) {
>>>>                           writeIndex = -1;
>>>>                   }
>>>>                   return value;
>>>>           }
>>>>
>>>>           private class SourceThread extends Thread {
>>>>                   @Override
>>>>                   public void run() {
>>>>                           System.out.println("Source thread start");
>>>>                           char data[] = new char[DATA_SIZE];
>>>>                           try {
>>>>                                   receive(data, 0, data.length);
>>>>                                   last();
>>>>                           } catch (IOException e) {
>>>>                                   e.printStackTrace();
>>>>                           }
>>>>                           System.out.println("Source thread done");
>>>>                   }
>>>>           }
>>>> }



More information about the hotspot-dev mailing list