RFR: 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device. [v3]
Markus Grönlund
mgronlun at openjdk.java.net
Fri Feb 18 11:23:51 UTC 2022
On Fri, 18 Feb 2022 05:44:34 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:
>> I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee.
>>
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops the JVM
>> by calling JfrJavaSupport::abort(), as shown below.
>>
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>>
>> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing the guarantee.
>> I added an argument to JfrJavaSupport::abort() that tells os::abort() not to produce a core dump,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
>
> 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.
diff --git a/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp b/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
index 95b96e02c06..015d4ebe065 100644
--- a/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
+++ b/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
@@ -563,14 +563,16 @@ void JfrJavaSupport::throw_runtime_exception(const char* message, TRAPS) {
 void JfrJavaSupport::abort(jstring errorMsg, JavaThread* t) {
   DEBUG_ONLY(check_java_thread_in_vm(t));
-
   ResourceMark rm(t);
-  const char* const error_msg = c_str(errorMsg, t);
-  if (error_msg != NULL) {
-    log_error(jfr, system)("%s",error_msg);
+  abort(c_str(errorMsg, t));
+}
+
+void JfrJavaSupport::abort(const char* error_msg, bool dump_core /* true */) {
+  if (error_msg != nullptr) {
+    log_error(jfr, system)("%s", error_msg);
   }
   log_error(jfr, system)("%s", "An irrecoverable error in Jfr. Shutting down VM...");
-  vm_abort();
+  vm_abort(dump_core);
 }
 JfrJavaSupport::CAUSE JfrJavaSupport::_cause = JfrJavaSupport::VM_ERROR;
diff --git a/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp b/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
index 53d6eed68a8..1ec5a884b4b 100644
--- a/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
+++ b/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
@@ -112,6 +112,7 @@ class JfrJavaSupport : public AllStatic {
   // critical
   static void abort(jstring errorMsg, TRAPS);
+  static void abort(const char* error_msg, bool dump_core = true);
   static void uncaught_exception(jthrowable throwable, JavaThread* t);
   // asserts
diff --git a/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp b/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
index 3a7ec286381..73404a1aede 100644
--- a/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
+++ b/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
@@ -25,8 +25,8 @@
 #ifndef SHARE_JFR_WRITERS_JFRSTREAMWRITERHOST_INLINE_HPP
 #define SHARE_JFR_WRITERS_JFRSTREAMWRITERHOST_INLINE_HPP
+#include "jfr/jni/jfrJavaSupport.hpp"
 #include "jfr/writers/jfrStreamWriterHost.hpp"
-
 #include "runtime/os.hpp"
 template <typename Adapter, typename AP>
@@ -77,6 +77,9 @@ inline void StreamWriterHost<Adapter, AP>::write_bytes(const u1* buf, intptr_t l
   while (len > 0) {
     const unsigned int nBytes = len > INT_MAX ? INT_MAX : (unsigned int)len;
     const ssize_t num_written = os::write(_fd, buf, nBytes);
+    if (errno == ENOSPC) {
+      JfrJavaSupport::abort("Failed to write to jfr stream because no space left on device", false);
+    }
     guarantee(num_written > 0, "Nothing got written, or os::write() failed");
     _stream_pos += num_written;
     len -= num_written;
src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp line 88:
> 86: JavaThread* jt = JavaThread::current();
> 87: ThreadInVMfromNative transition(jt);
> 88: JfrJavaSupport::abort(JfrJavaSupport::new_string(msg, jt), jt, false);
Hi again Takuya, I'm sorry, but I should have noticed this earlier:
I now see that the code needs to allocate a Java string oop to conform to the existing abort function signature, which caters to invocations from Java. Then abort() immediately strips out the C string from the oop. To be correct, the headers logging/log.hpp and runtime/thread.inline.hpp would also need to be included.
I believe we can simplify this by updating the abort() signature so that we don't need to drag in those extra dependencies. Please see my following comment where I suggest a way to do this.
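The shape of that simplification can be sketched in isolation, as two overloads where the Java-facing one delegates to a native-facing one that takes a C string and a defaulted dump_core flag. This is only an illustration under stated assumptions: FakeJavaString, abort_with_message, fake_vm_abort and the two globals are hypothetical stand-ins, not HotSpot code, and the stand-in records its arguments instead of terminating the process.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a Java string handle (jstring in HotSpot).
struct FakeJavaString {
  std::string value;
};

// Recorded by the stand-in below so the behavior can be observed.
static bool g_last_dump_core = true;
static std::string g_last_message;

// Hypothetical stand-in for vm_abort(dump_core): records instead of exiting.
static void fake_vm_abort(bool dump_core) {
  g_last_dump_core = dump_core;
}

// Native-facing overload: takes a C string directly, so native callers do
// not have to allocate a Java string first. dump_core defaults to true.
static void abort_with_message(const char* error_msg, bool dump_core = true) {
  if (error_msg != nullptr) {
    g_last_message = error_msg;
    std::fprintf(stderr, "%s\n", error_msg);
  }
  fake_vm_abort(dump_core);
}

// Java-facing overload: extracts the C string and delegates, which keeps
// the existing call sites unchanged.
static void abort_with_message(const FakeJavaString& error_msg) {
  abort_with_message(error_msg.value.c_str());
}
```

With this shape, a native caller can write abort_with_message("no space left on device", false) to suppress the core dump, while callers that pass the string object keep the default of dumping core.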
Thanks for your patience
Markus
-------------
PR: https://git.openjdk.java.net/jdk/pull/7227