RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails [v2]

David Holmes dholmes at openjdk.org
Thu May 2 01:42:02 UTC 2024


On Wed, 1 May 2024 13:22:19 GMT, Doug Simon <dnsimon at openjdk.org> wrote:

>> This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries.
>> 
>> The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. This can fail as shown here:
>> 
>> V  [jvm.dll+0x62c23a]  Exceptions::_throw+0x11a  (exceptions.cpp:168)
>> V  [jvm.dll+0x62d85b]  Exceptions::_throw_oop+0xab  (exceptions.cpp:140)
>> V  [jvm.dll+0xbbce78]  MemAllocator::Allocation::check_out_of_memory+0x208  (memAllocator.cpp:138)
>> V  [jvm.dll+0xbbcac8]  MemAllocator::allocate+0x158  (memAllocator.cpp:377)
>> V  [jvm.dll+0x79bd05]  InstanceKlass::allocate_instance+0x95  (instanceKlass.cpp:1509)
>> V  [jvm.dll+0x7ddeed]  java_lang_String::basic_create+0x9d  (javaClasses.cpp:273)
>> V  [jvm.dll+0x7e43c0]  java_lang_String::create_from_unicode+0x60  (javaClasses.cpp:291)
>> V  [jvm.dll+0xdb91a5]  StringTable::do_intern+0xb5  (stringTable.cpp:379)
>> V  [jvm.dll+0xdba9f2]  StringTable::intern+0x1b2  (stringTable.cpp:368)
>> V  [jvm.dll+0xdbaaa6]  StringTable::intern+0x86  (stringTable.cpp:328)
>> V  [jvm.dll+0x51c8b1]  ConstantPool::string_at_impl+0x1d1  (constantPool.cpp:1251)
>> V  [jvm.dll+0x51b95b]  ConstantPool::resolve_string_constants_impl+0xeb  (constantPool.cpp:800)
>> V  [jvm.dll+0x4f2f8d]  CompileBroker::compile_method+0x31d  (compileBroker.cpp:1395)
>> V  [jvm.dll+0x4f3474]  CompileBroker::compile_method+0xc4  (compileBroker.cpp:1348)
>> 
>> These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ec...
>
> Doug Simon has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - don't perform XX actions for OOME when in scope of an InternalOOMEMark
>  - rename SandboxedOOMEMark to InternalOOMEMark

src/hotspot/share/gc/shared/memAllocator.hpp line 131:

> 129: 
> 130:  public:
> 131:   InternalOOMEMark(JavaThread* thread) {

Suggestion: add a comment:

// Passing a null thread allows for a no-op implementation for contexts that will suppress
// throwing of the OOME - see RetryableAllocationMark.

I was wondering if we really need this. AFAICS it would be harmless to always pass in the current thread and set the thread's field because when we would have passed null then no exception would be thrown anyway. It seems the null thread is only used as a means for RAM to track whether activate was false. But I guess a no-op IOM achieves the same goal.

src/hotspot/share/gc/shared/memAllocator.hpp line 151:

> 149:   }
> 150: 
> 151:   // Returns nullptr iff `activate` was false in the constructor.

This comment is out of place - `activate` is in the RAM constructor

src/hotspot/share/jvmci/jvmciRuntime.cpp line 107:

> 105:   RetryableAllocationMark(JavaThread* thread, bool activate) : _iom(activate ? thread : nullptr) {}
> 106:   ~RetryableAllocationMark() {
> 107:     JavaThread* THREAD = _iom.thread();

Please restore comment: `// For exception macros.`

src/hotspot/share/jvmci/jvmciRuntime.cpp line 114:

> 112:         if (ex->is_a(vmClasses::OutOfMemoryError_klass())) {
> 113:           CLEAR_PENDING_EXCEPTION;
> 114:         }

Just an observation but the original code will clear all exceptions except for an "async" exception, which these days is only the InternalError thrown by unsafe-access-errors. But the new code will only clear OOME thus allowing the (as expected) InternalError to remain, but also any other VirtualMachineErrors that may have arisen e.g. StackOverflowError. I actually think this is more correct, but it does seem a change in behaviour that we may need to be wary of.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1586968398
PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1586968924
PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1586969275
PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1586974448


More information about the hotspot-gc-dev mailing list