RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Oct 21 15:05:04 UTC 2015


On 10/21/15, 7:39 AM, Coleen Phillimore wrote:
> On 10/21/15 9:10 AM, Daniel D. Daugherty wrote:
>> Adding serviceability-dev at ... back in... you gotta stop using
>> reply-to-alias when there's more than one alias...
>>
>> The changes to vmStructs are to get it to understand the new
>> type definition. There's a "generate" macro and a "check"
>> macro that are used to generate the corresponding Java side
>> type definition and the "check" macro exists to verify that
>> the types match (as much as they can match between C++ and
>> Java).
>
> Yes.
>>
>> So... yes there is code in the Serviceability Agent that uses
>> ObjectMonitor::gBlockList so I couldn't just delete that entry
>> from vmStructs... I did think about deleting all of the code
>> that used ObjectMonitor::gBlockList from both vmStructs and
>> SA, but that wouldn't have been nice... :-)
>
> If basic SA tests still work (most are quarantined), you might as well 
> leave it

I did a special run of the SA tests where I ignored quarantine, but
honored the exclude list. Out of the 100 SA tests, I saw 6-9 known
failures and 91-94 passes depending on the platform.


> but generally there's some duplicated code that you have to "fix" in 
> the SA, which I didn't see in your webrev. Sometimes the best approach 
> is to delete the SA code since there's nothing that uses/tests/needs 
> most of the code.

The vmStructs.cpp change only exists to teach the SA about a new
data type that it did not previously support. SA knows about
volatile non-static fields, but did not know about static ptr
volatile fields.

This is a data type mapping/teaching changing only. There is
no algorithmic code that needs to change in SA for this change.
SA was able to walk the ObjectMonitor block list before and is
still able to do so.


>>
>> Dan
>>
>> P.S.
>> Is this a complete review?
>
> No, sorry, the vmStructs just caught my eye.  I will try to look at it 
> later though.

No worries. I currently have three reviewers.

Dan


>
> Coleen
>>
>>
>>
>> On 10/21/15, 6:51 AM, Coleen Phillimore wrote:
>>>
>>> Dan,
>>>
>>> Because you changed vmStructs, there is potentially a Java change in 
>>> the serviceability agent to match.
>>>
>>> If there isn't, you should remove this from vmStructs completely.
>>>
>>> Coleen
>>>
>>> On 10/20/15 10:44 PM, Daniel D. Daugherty wrote:
>>>>
>>>> On 10/20/15, 8:15 PM, David Holmes wrote:
>>>>> Quick follow up here then I'll respond to updated webrev ...
>>>>
>>>> Cool. Quick reply to your follow up...
>>>>
>>>>
>>>>>
>>>>> On 21/10/2015 12:51 AM, Daniel D. Daugherty wrote:
>>>>>> On 10/20/15, 1:53 AM, David Holmes wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> Great find in getting to the bottom of this one!
>>>>>>
>>>>>> Thanks! And thanks for the fast review!
>>>>>>
>>>>>> Also, welcome back from vacation... hope you had a blast.
>>>>>
>>>>> Thanks - I did! :)
>>>>>
>>>>>>> On 20/10/2015 1:02 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a fix for a long standing race between the lock-free
>>>>>>>> ObjectMonitor
>>>>>>>> verification code and the normal (locked) ObjectMonitor block 
>>>>>>>> allocation
>>>>>>>> code path. For this fix, I would like at least a Runtime team 
>>>>>>>> reviewer
>>>>>>>> and a Serviceability team reviewer. Thanks!
>>>>>>>>
>>>>>>>> JDK-8047212 
>>>>>>>> runtime/ParallelClassLoading/bootstrap/random/inner-complex
>>>>>>>> assert(ObjectSynchronizer::verify_objmon_isinpool(inf))
>>>>>>>> failed:
>>>>>>>>              monitor is invalid
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8047212
>>>>>>>>
>>>>>>>> Webrev URL:
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/
>>>>>>>
>>>>>>>  src/share/vm/runtime/synchronizer.cpp
>>>>>>>
>>>>>>> I would have just used OrderAccess::storeStore() as per my 
>>>>>>> comment in
>>>>>>> the CR.
>>>>>>
>>>>>> Saw your comment, but I already had the current fix in my stress
>>>>>> testing cycle so I went with it...
>>>>>>
>>>>>> I could use OrderAccess::storestore(), but I think I'm more fond
>>>>>> of "OrderAccess::release_store_ptr(&gBlockList, temp)". To me the
>>>>>> semantics are more clear:
>>>>>>
>>>>>>    release previous stores before storing a new value in this 
>>>>>> variable
>>>>>>
>>>>>> I _think_ OrderAccess::release_store_ptr() is also a little less
>>>>>> heavyweight than OrderAccess::storestore(), but I'd have to re-read
>>>>>> everything in orderAccess.?pp to nail that down for sure.
>>>>>>
>>>>>>
>>>>>>> But as Carsten says a "release" should be paired with an "acquire".
>>>>>>> Which suggests that in the other code that reads these variables we
>>>>>>> also need either the load_acquire() or a loadLoad() (if using
>>>>>>> storeStore()).**
>>>>>>
>>>>>> Yes, Carsten is right and I was modeling after other ObjectMonitor
>>>>>> code that doesn't do this quite right. I'll fix this to use
>>>>>> load_acquire().
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> ** This symmetry is largely missing in our lock-free code, and I 
>>>>>>> think
>>>>>>> we've been relying on "volatile" to act as a compiler barrier. :(
>>>>>>
>>>>>> Hey it seems to work! (famous last words)
>>>>>
>>>>> Well we're gradually getting around to cleaning it up. :)
>>>>
>>>> One bug at a time... :-)
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>> src/share/vm/runtime/vmStructs.cpp
>>>>>>>
>>>>>>> Can you not just define volatile_static_field?
>>>>>>
>>>>>> Yes, I went that way originally and then I changed my mind to
>>>>>> avoid colliding with the other format. See below.
>>>>>>
>>>>>>
>>>>>>> Why does the ptr aspect need to come into it? Also "static pointer
>>>>>>> volatile field" sounds really odd, it is a static, volatile 
>>>>>>> field that
>>>>>>> happens to be a pointer-type.
>>>>>>
>>>>>> It's meant to be odd because it follows the actual decl:
>>>>>>
>>>>>>      static ObjectMonitor * volatile gBlockList;
>>>>>>
>>>>>> So "static pointer volatile field" is exactly what I have:
>>>>>>
>>>>>>      static ObjectMonitor * volatile gBlockList;
>>>>>>
>>>>>>   => (static ObjectMonitor *) volatile gBlockList;
>>>>>>
>>>>>> i.e. I have a static ObjectMonitor pointer that is volatile
>>>>>>
>>>>>>
>>>>>> Compared to these two forms:
>>>>>>
>>>>>>      static volatile ObjectMonitor * gBlockList;
>>>>>>      static ObjectMonitor volatile * gBlockList;
>>>>>>
>>>>>>   => static (volatile ObjectMonitor) * gBlockList;
>>>>>>   => static (ObjectMonitor volatile) * gBlockList;
>>>>>>
>>>>>> i.e. I have a static pointer to a volatile ObjectMonitor.
>>>>>>
>>>>>> Hopefully, this makes my reasons a bit more clear...
>>>>>
>>>>> Not really :) Yes there is a difference between a "volatile 
>>>>> pointer to Foo" and "pointer to a volatile Foo", but for the sake 
>>>>> of vmstructs we don't really seem to need to care about that. The 
>>>>> two questions are:
>>>>> - is the field/variable static
>>>>> - is the field/variable volatile
>>>>
>>>> I'll have to politely disagree:
>>>>
>>>> Here's the existing volatile non-static macro:
>>>>
>>>> 2743 // This macro checks the type of a volatile VMStructEntry by 
>>>> comparing pointer types
>>>> 2744 #define CHECK_VOLATILE_NONSTATIC_VM_STRUCT_ENTRY(typeName, 
>>>> fieldName, type)        \
>>>> 2745  {typedef type dummyvtype; typeName *dummyObj = NULL; volatile 
>>>> dummyvtype* dummy = &dummyObj->fieldName; }
>>>>
>>>> And here's the new static pointer volatile macro:
>>>>
>>>> 2751 // This macro checks the type of a static pointer volatile 
>>>> VMStructEntry by comparing pointer types,
>>>> 2752 // e.g.: "static ObjectMonitor * volatile gBlockList;"
>>>> 2753 #define CHECK_STATIC_PTR_VOLATILE_VM_STRUCT_ENTRY(typeName, 
>>>> fieldName, type)       \
>>>> 2754  {type volatile * dummy = &typeName::fieldName; }
>>>>
>>>> Yes, the variable assignments are different because we have static
>>>> versus a non-static situation, but what's more important is where
>>>> the "volatile" is positioned.
>>>>
>>>> In the existing volatile non-static macro, the volatile piece is:
>>>>
>>>>     volatile dummyvtype* dummy = &dummyObj->fieldName;
>>>>
>>>> and in the new static pointer volatile macro, the volatile piece is:
>>>>
>>>>     type volatile * dummy = &typeName::fieldName;
>>>>
>>>> So the CHECK_VOLATILE_NONSTATIC_XXX macro has the "volatile" before
>>>> the data type... and the CHECK_STATIC_PTR_VOLATILE_XXX macro
>>>> has the "volatile" after the data type...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>> Dan
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Testing: Aurora Adhoc RT-SVC nightly batch
>>>>>>>>           4 inner-complex fastdebug parallel runs for 4+ days and
>>>>>>>>             600K iterations without seeing this failure; the 
>>>>>>>> experiment
>>>>>>>>             is still running; final results to be reported at 
>>>>>>>> the end
>>>>>>>>             of the review cycle
>>>>>>>>           JPRT -testset hotspot
>>>>>>>>
>>>>>>>> This fix:
>>>>>>>>
>>>>>>>> - makes ObjectMonitor::gBlockList volatile
>>>>>>>> - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
>>>>>>>>    make sure the new block updates _happen before_ gBlockList is
>>>>>>>>    changed to refer to the new block
>>>>>>>> - add SA support for a "static pointer volatile" field like:
>>>>>>>>
>>>>>>>>      static ObjectMonitor * volatile gBlockList;
>>>>>>>>
>>>>>>>> See the following link for a nice description of what "volatile"
>>>>>>>> means in the different positions on a variable/parameter decl 
>>>>>>>> line:
>>>>>>>>
>>>>>>>> http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>
>


More information about the serviceability-dev mailing list