RFR (S): 8008314 - Unimplemented() Atomic::load breaks the applications

Jesper Wilhelmsson jesper.wilhelmsson at oracle.com
Thu Feb 21 07:45:04 PST 2013


This is now pushed to hsx/hotspot-gc. I will push to hsx24 once it has survived 
a couple of rounds of nightly testing.

I created JDK-8008650 to track the problems with rotatingFileStream.
/Jesper


On 20/2/13 5:06 PM, Dean Long wrote:
> On 2/20/2013 7:09 AM, Jesper Wilhelmsson wrote:
>> Hi Dean,
>>
>> Using count() is a good suggestion. It would require a larger change though
>> since the rotatingFileStream doesn't use the _position and _precount variables
>> right now and they aren't properly updated as you noted. There are places in
>> the code that uses count() to find information about the logfiles. I suspect
>> there could be a really unlikely bug hiding here: If a long running program
>> happens to wrap the _precount variable at the end of a line, calls to count()
>> would return zero and believe that there is no logging in progress.
>>
>> Regarding thread saftey there is no difference in using count() to using the
>> current _bytes_written (none of them are thread safe), and since this bug is
>> about removing the use of atomic add, I'd prefer to keep the change small.
>>
>> There are other things in rotatingFileStream that would benefit from an
>> overview. For instance there are two constructors that are identical apart
>> from a minor detail. I'd like to clean this up and remove the duplicated code
>> - especially as only one of the constructors are ever used.
>>
>> Are you OK with my current change if I'll create a separate CR to look in to
>> the potential problem with count() and to do this cleanup?
>> /Jesper
>>
> Yes, sounds good.
>
> dl
>
>>
>> On 20/2/13 4:54 AM, Dean Long wrote:
>>> How about using "if (count() < (jlong)GCLogFileSize)" and
>>> "set_position(0);set_count(0);" in
>>> rotate_log(), then removing_bytes_written?  However be aware that count() grows
>>> even when
>>> _file == NULL, because write() calls update_position() unconditionally.
>>>
>>> dl
>>>
>>> On 2/19/2013 8:58 AM, Jesper Wilhelmsson wrote:
>>>> Hi,
>>>>
>>>> The counter that is protected by the atomic add is only used to decide when to
>>>> switch to the next file in the rotating log file scheme. Even with the atomic
>>>> add it is not an exact limit since we only check the limt at a sefepoint and
>>>> we will always write a whole log message at a time.
>>>>
>>>> I ran several tests with most GC logging turned on with and without the atomic
>>>> add and I see no significant difference in the sizes of the log files.
>>>>
>>>> I have a new patch where I remove the atomic add:
>>>>
>>>> HS24: http://cr.openjdk.java.net/~jwilhelm/8008314/webrev.2.hs24/
>>>> HS25: http://cr.openjdk.java.net/~jwilhelm/8008314/webrev.2.hs25/
>>>>
>>>> Let me know if you feel confident in pushing this.
>>>> /Jesper
>>>>
>>>>
>>>> On 19/2/13 1:52 PM, Bengt Rutisson wrote:
>>>>> On 2/19/13 12:39 PM, David Holmes wrote:
>>>>>> I'm also indebted to Bengt for spotting that - hadn't realized the 'x' types
>>>>>> were intptr_t typedefs.
>>>>>>
>>>>>> But backing up a step, this code is not obviously thread-safes, or if it
>>>>>> is it
>>>>>> must be by a lock at a higher level - so why do we need an atomic update of
>>>>>> the _bytes_written value?
>>>>>
>>>>> Yes, this is something we discussed at lunch today. It looks like calling
>>>>> update_position() without some kind of locking is unsafe. This is not
>>>>> introduced
>>>>> only by the rotatingFileStream. All the outputStream implmentations does this.
>>>>> Looks scary to me. I don't really understand how this works in a safe way.
>>>>> Maybe
>>>>> it doesn't.
>>>>>
>>>>> We are not doing any locking higher up in the call chain as far as I can see.
>>>>>
>>>>> Bengt
>>>>>
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 19/02/2013 8:18 PM, Jesper Wilhelmsson wrote:
>>>>>>> Bengt,
>>>>>>>
>>>>>>> Thanks for catching this! I should listen to Ted's mother, nothing good
>>>>>>> happens after 2am[1].
>>>>>>>
>>>>>>> I wasn't aware that jint was 32-bit on all platforms. Using 32-bit
>>>>>>> atomic add like this won't work, back to the drawing board.
>>>>>>>
>>>>>>> The one quick solution I can think of right now that won't be awful
>>>>>>> would be to limit GcLogFileSize to 32 bit. I'm not sure how quick a fix
>>>>>>> it will be though since it may require a CCC approval...
>>>>>>>
>>>>>>> /Jesper
>>>>>>>
>>>>>>>
>>>>>>> [1] http://en.wikipedia.org/wiki/Nothing_Good_Happens_After_2_A.M.
>>>>>>>
>>>>>>>
>>>>>>> On 19/2/13 9:08 AM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi Jesper,
>>>>>>>>
>>>>>>>> Thanks for fixing this! I don't feel very comfortable with this code,
>>>>>>>> so don't
>>>>>>>> consider this a full review. Just a couple of comments:
>>>>>>>>
>>>>>>>> It seems to make sense to use uintx for _bytes_written since the flag
>>>>>>>> to control
>>>>>>>> the file size, GCLogFileSize, is uintx. But since the atomics you use are
>>>>>>>> handling jint I think it looks a bit suspicious. jint is always 32
>>>>>>>> bits and
>>>>>>>> uintx can be 32 or 64 bit. So, what happens here if _bytes_written is
>>>>>>>> 64 bit?
>>>>>>>>
>>>>>>>>   457     Atomic::add((jint)count, (jint*)&_bytes_written);
>>>>>>>>
>>>>>>>> If _bytes_written is 0x00000000FFFFFFFF and count is 1, I would guess
>>>>>>>> that we
>>>>>>>> get 0x0 instead of 0x0000000100000000.
>>>>>>>>
>>>>>>>> Also, in rotatingFileStream::rotate_log() you changed to:
>>>>>>>>
>>>>>>>> if (_bytes_written <= GCLogFileSize) return;
>>>>>>>>
>>>>>>>> instead of :
>>>>>>>>
>>>>>>>> if (_bytes_writen < (jlong)GCLogFileSize) return;
>>>>>>>>
>>>>>>>> isn't this a problem since you set _bytes_written to UINT_MAX if we
>>>>>>>> wrap around?
>>>>>>>> If someone sets GCLogFileSize on the command line to be UINT_MAX we
>>>>>>>> will never
>>>>>>>> rotate the file, right?
>>>>>>>>
>>>>>>>> Also, we only call rotate_log() when we start a safepoint. If we wrap
>>>>>>>> around
>>>>>>>> once in write() but then call it again before a safepoint happens we can
>>>>>>>> probably set _bytes_written to a value that looks small enough to not
>>>>>>>> have to
>>>>>>>> rotate the log. It seems to me that we would need a separate flag to
>>>>>>>> say that we
>>>>>>>> written enough to rotate the log.
>>>>>>>>
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/19/13 7:25 AM, Jesper Wilhelmsson wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review the following change:
>>>>>>>>>
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8008314/
>>>>>>>>>
>>>>>>>>> Bug: https://jbs.oracle.com/bugs/browse/JDK-8008314
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The use of atomic::add(jlong,jlong*) breaks everything on ARM since
>>>>>>>>> it uses
>>>>>>>>> atomic::load(jlong) which isn't implemented.
>>>>>>>>>
>>>>>>>>> The add was used to update a counter in rotating log file handling,
>>>>>>>>> but it
>>>>>>>>> turns out that the counter that was updated doesn't need to be a jlong.
>>>>>>>>>
>>>>>>>>> The counter is used to indicate that it is time to do a log file
>>>>>>>>> rotation and
>>>>>>>>> by adding a saftey catch for overflow, we can use an uintx for the
>>>>>>>>> counter.
>>>>>>>>>
>>>>>>>>> In practice this means that the upper "limit" for the log file size
>>>>>>>>> becomes
>>>>>>>>> UINX_MAX-1 (used to be UINT_MAX) since I change from < to <= in the
>>>>>>>>> comparison
>>>>>>>>> to the limit, but since this is not a hard limit the difference will
>>>>>>>>> only be
>>>>>>>>> noticeable when someone has written UINT_MAX characters to the log
>>>>>>>>> file and
>>>>>>>>> the next write will end up in the next log file part instead of in
>>>>>>>>> the same
>>>>>>>>> (which one could argue is a more correct behavior).
>>>>>>>>>
>>>>>>>>> Sine I changed all places where the counter was used I also fixed a
>>>>>>>>> typo in
>>>>>>>>> its name.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> /Jesper
>>>>>>>>
>>>>>
>>>
>


More information about the hotspot-dev mailing list