RFR: 8303942: FileMapInfo::write_bytes aborts on a short os::write

Vladimir Kozlov kvn at openjdk.org
Tue Apr 4 03:33:05 UTC 2023


On Mon, 3 Apr 2023 12:17:54 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> We have other cases that assumes that `os::write` writes all the requested bytes:
>> 
>> https://github.com/openjdk/jdk/blob/aa762102e9328ca76663b56b3be6f6141b044744/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#L379-L389
>> 
>> Writing a loop for each case will be messy and buggy. It's better to get rid of `os::write()` and replace it with `os::write_fully()`.
>> 
>> 
>> size_t bytes_written = os::write_fully(fd, buffer, bytes_requested);
>> if (bytes_written != bytes_requested) {
>>     // report error ....
>> }
>> 
>> 
>> and I can't understand why `os::write` takes an `unsigned int`!
>> 
>> https://github.com/openjdk/jdk/blob/aa762102e9328ca76663b56b3be6f6141b044744/src/hotspot/os/posix/os_posix.cpp#L774-L778
>
>> @iklam the JFR code you linked already handles the short write case. I only found two cases where it was not handled and bugs were filed for each - this is one of them. See JDK-8303937 for links.
> 
> Ths JFR code has multiple problems. 
> 
> 
> // os.hpp
> static ssize_t os::write(int fd, const void *buf, unsigned int nBytes);
> 
> // jfrEmergencyDump.cpp
> int64_t bytes_read = 0;
> int64_t bytes_written = 0;
> ...
> bytes_written += (int64_t)os::write(emergency_fd, copy_block, bytes_read - bytes_written);
> assert(bytes_read == bytes_written, "invariant");
> 
> 
> (1) the casting of `int64_t` to `unsigned int nBytes` is dubious.
> (2) it doesn't handle a `-1` result from `os::write`
> (3) the `assert` is wrong.
> 
> I think this shows that it's very tricky to make a loop to write all the bytes requested. In particular, Most callers would have a `size_t` for the number of bytes to write. Asking everyone to consider between `ssize_t` vs `size_t` would be a maintenance nightmare:
> 
> 
> ssize_t write(int fd, const void *buf, size_t count);
> 
> // The number of bytes written may be less than count if, for 
> // example, there is insufficient space on the  underlying physical
> // medium, or the RLIMIT_FSIZE resource limit is encountered 
> // (see setrlimit(2)), or the call was interrupted by a signal handler
> // after having written less than count bytes.  (See also pipe(7).)
> 
> 
> Therefore, we should consolidate all the loops in a single place (preferably in platform-independent os.cpp and not once per platform).
> 
> 
> // writes count bytes from the buffer starting at buf to the file referred to by the
> // file descriptor fd.
> // On  success,  0 is returned.  On error, -1 is returned, and errno is set to
> // indicate the cause of the error.
> int os::write_fully(int fd, const void *buf, size_t count);
> 
> 
> For reference, see the Java `OutputStream.write()` call, which either writes all the bytes requested, or throws an IOException. 25+ years of Java shows that no one wants or needs to handle the partial writes:
> 
> - https://docs.oracle.com/javase/8/docs/api/java/io/OutputStream.html

Our shared `os` class has many methods which are not wrappers. I think it is fine place for our `write_fully()` as @iklam suggested.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13188#issuecomment-1495294905


More information about the hotspot-runtime-dev mailing list