RFR: 8146879: Add option for handling existing log files in UL

Marcus Larsson marcus.larsson at oracle.com
Tue Mar 15 09:17:33 UTC 2016


On 03/14/2016 04:06 PM, Dmitry Samersoff wrote:
> Marcus,
>
> 1.
> Most of log analyzing tools utilize the fact that filesystem returns
> files in alphabetical order or uses scandir()/alphasort()
>
> So good log rotation strategy is log, log.001, log.002, log.003 etc
> where "log" is current one and log.MAX_COUNT is the oldest one.

The current log rotation scheme keeps the current log file as just 
"log", and when rotating it will move this log to "log.X", where X is 
initially 1 and increases at each rotation. It will simply overwrite the 
file if it already exists. When X reaches the file-count, it wraps 
around and starts over from 1 again. There's no guarantee that the 
highest number is the oldest/youngest file.

This patch doesn't change the rotation scheme, it only makes rotation 
the default, with the addition of a smarter choice of initial X value, 
namely either the next unused X or the oldest X based on mtime if all 
X:es are in use. It's a best-effort to prevent accidental overwriting of 
your logs, but it doesn't guarantee anything.

>
> To save startup time we can declare that log.001 is the oldest one, and
> log.MAX_COUNT is the newest one but log.002 that is more recent than
> log.001 and log.003 seems not obvious to me.

This is a different rotation scheme and would require rotating every log 
file at each rotation in order to keep the invariant. I'm not against 
changing the current scheme, but I think it's outside the scope of this 
change and would prefer to do it separately in that case.

>
> Also it's very common that log producer and log analyzer resides on a
> different machines (e.g. with NFS share) and log file mtime could be
> unrelated to the age of VM events in this log. So we shouldn't use log
> mtime for any decision.

The mtime is just our best guess in the case we don't know which file to 
delete.

>
> 2.
> rename() can cause a burden in complicated environment, so it's better
> to check most common cases (permissions, PATH_MAX, etc) explicitly, as
> early as possible, to give meaningful error message to a user.

I see. I wanted to avoid the complexity, but if you feel it is important 
I'll add some checks for these things.

>
> See also below.
>
> On 2016-03-14 13:29, Marcus Larsson wrote:
>> Hi,
>>
>> Thanks for looking at this!
>>
>> On 03/13/2016 12:03 PM, Dmitry Samersoff wrote:
>>> Marcus,
>>>
>>> 1. os_posix/os_windows.cpp
>>>
>>> Windows version is expensive and change access time of a log file.
>>>
>>> So it's better to create os::get_mtime() that uses stat() for both,
>>> windows and posix and LogFileOutput::calc_mtime_diff().
>>>
>>> Also it saves a bit of compiler power unrolling loop.
>> This is similar to my first approach but I ran into problems with it
>> because st_mtime is a struct timespec on posix and a time_t on windows.
>>  From the start I wanted better precision than seconds, which is why I
>> wanted to use CompareFileTime(). Given that we can not depend on any
>> particular precision, perhaps it's better to just use stat() on windows
>> as well, like you say.
> OK.
>
>>> 2. log.cpp
>>>
>>> size_t number_of_lines_with_substring_in_file()
>>>
>>> When you plan to use it ? Current implementation is rather expensive,
>>> but if you plan to run it manually only, I'm OK with it.
>> It's used for testing only.
> If you plan to use it in jprt or nightly - please change it.

It's part of the internal vm tests, so it should be in both. I don't see 
the problem though. It's used once to verify a very small log file. The 
test completes in a matter of milliseconds, and I don't think we should 
worry about it.

>
>>> 3. logFileOutput.cpp
>>>
>>> 82.
>>> Please, change file_exists to can_be_rotated() and check that log output
>>> is a *regular file* and you have a write access to it.
>> I guess it would be useful to give an error in the case someone
>> specifies logging to something that's not a regular file. Checking for
>> write permissions shouldn't be necessary here though. If any of the
>> required operations fail because of lacking permissions we'll notice
>> that at that time.
> Logging to /dev/tape or /dev/console is perfectly valid usecase but we
> should detect it and don't attempt to rotate or truncate device ever if
> sysadmin enable rotation by mistake.
>
> Also common sysadmin mistake is a log pointing to directory or presence
> of log.NNN directory - we should detect it as well.
>
> One more case when we shouldn't rotate - log is symlink
>
>
>>> 87.
>>> Please, don't use floating point math. Just allocate 10 extra bytes, it
>>> fit all possible file length value.
>> I can use a fixed size for the allocations, but we will still want to
>> figure out how many numbers should be used so that we can use the
>> appropriate amount of padding. Logging with 10 files in rotation should
>> use log.0 to log.9, not log.0000000 to log.0000009.
> It's OK to always write log.001 etc or use something like:
>
> return (num < 10) ? 1 : (num < 100) ? 2 : 5;

Ok.

>
>>> 93.
>>> if we have log.01, log.03 etc this function creates log.02 instead of
>>> log.04.
>> Yes, that's intentional.
> log.01 and log.03 both containing older data than log.02 seems not
> obvious for me.
>
> Could we always use largest possible number?

Like I said, this is the current rotation scheme, and I would prefer to 
not change it with this patch.

Thanks,
Marcus

>
>>> Sorry! I'm not follow logic of ll.118
>>> What happens if some log parsing tool change log file mtime?
>> We'll always prefer to overwrite the file that hasn't been modified for
>> the longest time.
> Log file mtime can be unrelated to the age of VM events in this file
> (think sysadmin do: vi log.003, :wq) so we shouldn't rely on mtime.
>
>>> Please, limit the number of old logs (FileCountOption) to some
>>> reasonable value (e.g. 999) and use binary search to find the name of a
>>> next file.
>> I can limit it, but I'm not sure we should make the effort of binary
>> searching for the file name. It's a one time search during startup and I
>> would expect most people to use around 10 files.
> OK,  as soon as DefaultRotationFileCount is less than 10
>
>>> Also please check full log file name against MAX_PATH constant.
>> Is it really necessary? Won't fopen() just fail and return the
>> appropriate error for this?
> rename errors is not always obvious. see comments above.
>
> -Dmitry
>
>> Thanks,
>> Marcus
>>
>>> -Dmitry
>>>
>>>
>>>
>>>
>>> On 2016-03-11 17:39, Marcus Larsson wrote:
>>>> Hi,
>>>>
>>>> On 2016-03-11 15:35, Bengt Rutisson wrote:
>>>>> Hi Marcus,
>>>>>
>>>>> On 2016-03-11 15:21, Marcus Larsson wrote:
>>>>>> Third time's the charm.
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~mlarsson/8146879/webrev.03/
>>>>> I had a quick look at the code changes. It is not really my area of
>>>>> the code, so I'll leave to someone else to formally review it.
>>>>>
>>>>> However, I downloaded the patch a played a bit with the logging. This
>>>>> is much more like the way I would like it! Thanks!
>>>>>
>>>>> So, from a functional perspective this looks good to me.
>>>>>
>>>> Thanks for the feedback!
>>>>
>>>> Marcus
>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>> This patch makes log file rotation the default. Default thresholds
>>>>>> are 5 rotated files with a target size of 20MiB. Truncating behavior
>>>>>> can be achieved by setting filecount to 0
>>>>>> (-Xlog::myfile.log::filecount=0).
>>>>>>
>>>>>> If a log file already exists during log file initialization it will
>>>>>> be rotated. If any of the target file names (file.0 to file.4 in the
>>>>>> default case) are available, that filename will be used for the
>>>>>> existing log. If all names are taken the VM will attempt to overwrite
>>>>>> the oldest file.
>>>>>>
>>>>>> This should prevent unlimited log file creations and avoid accidental
>>>>>> loss of log files from previous runs. The default thresholds (5
>>>>>> files, 20MiB each) is just a suggestion. If you think it should be
>>>>>> higher/lower let me know.
>>>>>>
>>>>>> Tested with included internal VM tests through RBT.
>>>>>>
>>>>>> Thanks,
>>>>>> Marcus
>>>>>>
>>>>>> On 2016-03-01 15:05, Marcus Larsson wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> After some offline discussions I'm withdrawing this patch. I will
>>>>>>> instead investigate if I can achieve similar behavior using log
>>>>>>> rotation as the default.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marcus
>>>>>>>
>>>>>>> On 03/01/2016 12:11 PM, Marcus Larsson wrote:
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> Taking a different approach to this.
>>>>>>>>
>>>>>>>> New webrev:
>>>>>>>> http://cr.openjdk.java.net/~mlarsson/8146879/webrev.01/
>>>>>>>>
>>>>>>>> Existing files will now by default be renamed/archived with a .X
>>>>>>>> suffix where X is the lowest number such that the resulting file
>>>>>>>> name is available (jvm.log becomes jvm.log.0). A mode option for
>>>>>>>> controlling this behavior has been added as well. It can be set to
>>>>>>>> archive, append, or truncate (i.e. -Xlog::jvm.log::mode=truncate).
>>>>>>>>
>>>>>>>> Tested with included jtreg test through JPRT.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Marcus
>>>>>>>>
>>>>>>>> On 01/14/2016 04:00 PM, Marcus Larsson wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review the following patch to make sure UL truncates
>>>>>>>>> existing log files before writing to them. Since files are opened
>>>>>>>>> in append mode, truncation isn't done automatically, so instead
>>>>>>>>> the patch adds an attempt to remove the log file before opening it.
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~mlarsson/8146879/webrev.00/
>>>>>>>>>
>>>>>>>>> Issue:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8146879
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>> Included test through JPRT
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Marcus
>



More information about the serviceability-dev mailing list