RFR (S): 8008314 - Unimplemented() Atomic::load breaks the applications
Dean Long
dean.long at oracle.com
Wed Feb 20 08:06:18 PST 2013
On 2/20/2013 7:09 AM, Jesper Wilhelmsson wrote:
> Hi Dean,
>
> Using count() is a good suggestion. It would require a larger change
> though since the rotatingFileStream doesn't use the _position and
> _precount variables right now and they aren't properly updated as you
> noted. There are places in the code that uses count() to find
> information about the logfiles. I suspect there could be a really
> unlikely bug hiding here: If a long running program happens to wrap
> the _precount variable at the end of a line, calls to count() would
> return zero and believe that there is no logging in progress.
>
> Regarding thread saftey there is no difference in using count() to
> using the current _bytes_written (none of them are thread safe), and
> since this bug is about removing the use of atomic add, I'd prefer to
> keep the change small.
>
> There are other things in rotatingFileStream that would benefit from
> an overview. For instance there are two constructors that are
> identical apart from a minor detail. I'd like to clean this up and
> remove the duplicated code - especially as only one of the
> constructors are ever used.
>
> Are you OK with my current change if I'll create a separate CR to look
> in to the potential problem with count() and to do this cleanup?
> /Jesper
>
Yes, sounds good.
dl
>
> On 20/2/13 4:54 AM, Dean Long wrote:
>> How about using "if (count() < (jlong)GCLogFileSize)" and
>> "set_position(0);set_count(0);" in
>> rotate_log(), then removing_bytes_written? However be aware that
>> count() grows
>> even when
>> _file == NULL, because write() calls update_position() unconditionally.
>>
>> dl
>>
>> On 2/19/2013 8:58 AM, Jesper Wilhelmsson wrote:
>>> Hi,
>>>
>>> The counter that is protected by the atomic add is only used to
>>> decide when to
>>> switch to the next file in the rotating log file scheme. Even with
>>> the atomic
>>> add it is not an exact limit since we only check the limt at a
>>> sefepoint and
>>> we will always write a whole log message at a time.
>>>
>>> I ran several tests with most GC logging turned on with and without
>>> the atomic
>>> add and I see no significant difference in the sizes of the log files.
>>>
>>> I have a new patch where I remove the atomic add:
>>>
>>> HS24: http://cr.openjdk.java.net/~jwilhelm/8008314/webrev.2.hs24/
>>> HS25: http://cr.openjdk.java.net/~jwilhelm/8008314/webrev.2.hs25/
>>>
>>> Let me know if you feel confident in pushing this.
>>> /Jesper
>>>
>>>
>>> On 19/2/13 1:52 PM, Bengt Rutisson wrote:
>>>> On 2/19/13 12:39 PM, David Holmes wrote:
>>>>> I'm also indebted to Bengt for spotting that - hadn't realized the
>>>>> 'x' types
>>>>> were intptr_t typedefs.
>>>>>
>>>>> But backing up a step, this code is not obviously thread-safes, or
>>>>> if it is it
>>>>> must be by a lock at a higher level - so why do we need an atomic
>>>>> update of
>>>>> the _bytes_written value?
>>>>
>>>> Yes, this is something we discussed at lunch today. It looks like
>>>> calling
>>>> update_position() without some kind of locking is unsafe. This is
>>>> not introduced
>>>> only by the rotatingFileStream. All the outputStream implmentations
>>>> does this.
>>>> Looks scary to me. I don't really understand how this works in a
>>>> safe way. Maybe
>>>> it doesn't.
>>>>
>>>> We are not doing any locking higher up in the call chain as far as
>>>> I can see.
>>>>
>>>> Bengt
>>>>
>>>>>
>>>>> David
>>>>>
>>>>> On 19/02/2013 8:18 PM, Jesper Wilhelmsson wrote:
>>>>>> Bengt,
>>>>>>
>>>>>> Thanks for catching this! I should listen to Ted's mother,
>>>>>> nothing good
>>>>>> happens after 2am[1].
>>>>>>
>>>>>> I wasn't aware that jint was 32-bit on all platforms. Using 32-bit
>>>>>> atomic add like this won't work, back to the drawing board.
>>>>>>
>>>>>> The one quick solution I can think of right now that won't be awful
>>>>>> would be to limit GcLogFileSize to 32 bit. I'm not sure how quick
>>>>>> a fix
>>>>>> it will be though since it may require a CCC approval...
>>>>>>
>>>>>> /Jesper
>>>>>>
>>>>>>
>>>>>> [1] http://en.wikipedia.org/wiki/Nothing_Good_Happens_After_2_A.M.
>>>>>>
>>>>>>
>>>>>> On 19/2/13 9:08 AM, Bengt Rutisson wrote:
>>>>>>>
>>>>>>> Hi Jesper,
>>>>>>>
>>>>>>> Thanks for fixing this! I don't feel very comfortable with this
>>>>>>> code,
>>>>>>> so don't
>>>>>>> consider this a full review. Just a couple of comments:
>>>>>>>
>>>>>>> It seems to make sense to use uintx for _bytes_written since the
>>>>>>> flag
>>>>>>> to control
>>>>>>> the file size, GCLogFileSize, is uintx. But since the atomics
>>>>>>> you use are
>>>>>>> handling jint I think it looks a bit suspicious. jint is always 32
>>>>>>> bits and
>>>>>>> uintx can be 32 or 64 bit. So, what happens here if
>>>>>>> _bytes_written is
>>>>>>> 64 bit?
>>>>>>>
>>>>>>> 457 Atomic::add((jint)count, (jint*)&_bytes_written);
>>>>>>>
>>>>>>> If _bytes_written is 0x00000000FFFFFFFF and count is 1, I would
>>>>>>> guess
>>>>>>> that we
>>>>>>> get 0x0 instead of 0x0000000100000000.
>>>>>>>
>>>>>>> Also, in rotatingFileStream::rotate_log() you changed to:
>>>>>>>
>>>>>>> if (_bytes_written <= GCLogFileSize) return;
>>>>>>>
>>>>>>> instead of :
>>>>>>>
>>>>>>> if (_bytes_writen < (jlong)GCLogFileSize) return;
>>>>>>>
>>>>>>> isn't this a problem since you set _bytes_written to UINT_MAX if we
>>>>>>> wrap around?
>>>>>>> If someone sets GCLogFileSize on the command line to be UINT_MAX we
>>>>>>> will never
>>>>>>> rotate the file, right?
>>>>>>>
>>>>>>> Also, we only call rotate_log() when we start a safepoint. If we
>>>>>>> wrap
>>>>>>> around
>>>>>>> once in write() but then call it again before a safepoint
>>>>>>> happens we can
>>>>>>> probably set _bytes_written to a value that looks small enough
>>>>>>> to not
>>>>>>> have to
>>>>>>> rotate the log. It seems to me that we would need a separate
>>>>>>> flag to
>>>>>>> say that we
>>>>>>> written enough to rotate the log.
>>>>>>>
>>>>>>> Bengt
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2/19/13 7:25 AM, Jesper Wilhelmsson wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review the following change:
>>>>>>>>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8008314/
>>>>>>>>
>>>>>>>> Bug: https://jbs.oracle.com/bugs/browse/JDK-8008314
>>>>>>>>
>>>>>>>>
>>>>>>>> The use of atomic::add(jlong,jlong*) breaks everything on ARM
>>>>>>>> since
>>>>>>>> it uses
>>>>>>>> atomic::load(jlong) which isn't implemented.
>>>>>>>>
>>>>>>>> The add was used to update a counter in rotating log file
>>>>>>>> handling,
>>>>>>>> but it
>>>>>>>> turns out that the counter that was updated doesn't need to be
>>>>>>>> a jlong.
>>>>>>>>
>>>>>>>> The counter is used to indicate that it is time to do a log file
>>>>>>>> rotation and
>>>>>>>> by adding a saftey catch for overflow, we can use an uintx for the
>>>>>>>> counter.
>>>>>>>>
>>>>>>>> In practice this means that the upper "limit" for the log file
>>>>>>>> size
>>>>>>>> becomes
>>>>>>>> UINX_MAX-1 (used to be UINT_MAX) since I change from < to <= in
>>>>>>>> the
>>>>>>>> comparison
>>>>>>>> to the limit, but since this is not a hard limit the difference
>>>>>>>> will
>>>>>>>> only be
>>>>>>>> noticeable when someone has written UINT_MAX characters to the log
>>>>>>>> file and
>>>>>>>> the next write will end up in the next log file part instead of in
>>>>>>>> the same
>>>>>>>> (which one could argue is a more correct behavior).
>>>>>>>>
>>>>>>>> Sine I changed all places where the counter was used I also
>>>>>>>> fixed a
>>>>>>>> typo in
>>>>>>>> its name.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> /Jesper
>>>>>>>
>>>>
>>
More information about the hotspot-dev
mailing list