RFR: 8256008: UL does not report anything if disk writing fails
Yasumasa Suenaga
ysuenaga at openjdk.java.net
Sun Nov 8 12:31:55 UTC 2020
On Sun, 8 Nov 2020 08:12:56 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> If there are no disk space enough to write UL, it does not report. So the user cannot know lack of logs timely.
>>
>> `LogFileStreamOutput::write()` uses `jio_fprintf()` and `fflush()` to write log. However return values from them would not be checked.
>> We should check them, and should report what happened if error occurred.
>>
>>
>> How to reproduce on Linux:
>>
>> 1. Mount tmpfs with 512K
>>
>> $ mkdir /tmp/tmp
>> $ mount -t tmpfs -o size=512k tmpfs /tmp/tmp
>>
>> 2. Run `java` with `-Xlog`
>>
>> $ java -Xlog:all=trace:file=/tmp/tmp/all.log --version
>
> Hi Yasumasa,
>
> This fix makes sense.
>
> Not a full review, just some initial remarks:
>
> - A simpler way to reproduce may be to set an IO quota with ulimit. We ignore SIGXFSZ, so this should make the io calls fail. This could also maybe used for a jtreg test, though I see that this could be a bit more work and I won't insist on it.
>
> - I would like it if we could attempt to write some final words into the log stream. Because that may still work, even if the larger output does not (I really hope we do not print characters individually, don't we?). Something like `("write failed (%d)", errno)`. That may or may not work but if it does its helpful when looking at the logfile.
>
> - If you handle errors from writing, make sure you handle intermittend write errors correctly. You probably want to handle EINTR by re-trying to write.
>
> - I see you make an RAII object to handle flock. Could you leave this out please to make this change easier to review? I am not against it, but would prefer this in a separate cleanup change.
>
> More remarks later.
>
> Thanks, Thomas
Thanks @tstuefe for reviewing!
> * A simpler way to reproduce may be to set an IO quota with ulimit. We ignore SIGXFSZ, so this should make the io calls fail. This could also maybe used for a jtreg test, though I see that this could be a bit more work and I won't insist on it.
In case of my test (`java -Xlog:all=trace:file=/tmp/tmp/all.log --version`) would happen crash with SIGXFSZ before register signal handlers, so we can see SIGXFSZ at `fflush()` call. I saw coredump caused by SIGXFSZ with `ulimit -f 1`.
Whatever the case JVM would receive SIGXFSZ, so we may test this change with `ulimit -f`.
> * I would like it if we could attempt to write some final words into the log stream. Because that may still work, even if the larger output does not (I really hope we do not print characters individually, don't we?). Something like `("write failed (%d)", errno)`. That may or may not work but if it does its helpful when looking at the logfile.
Agree, I will add log message.
> * If you handle errors from writing, make sure you handle intermittend write errors correctly. You probably want to handle EINTR by re-trying to write.
`LogFileStreamOutput::write()` uses `jio_fprintf()` (wrapper of `vfprintf()` ) and `fflush()`. In case of `fflush()`, we can retry, however `jio_fprintf()` can be done in the same way? `vfprintf()` might not set `errno` if error happens. (I'm not sure)
> * I see you make an RAII object to handle flock. Could you leave this out please to make this change easier to review? I am not against it, but would prefer this in a separate cleanup change.
If `FileLocker` is removed from this change, I want to add it on another RFE before this because this change makes to return from several locations in `LogFileStreamOutput::write()`.
Do you agree create subtask (RFE) to add `FileLocker`?
-------------
PR: https://git.openjdk.java.net/jdk/pull/1106
More information about the hotspot-runtime-dev
mailing list