RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free
Emanuel Peter
epeter at openjdk.org
Thu Feb 15 23:05:54 UTC 2024
On Thu, 15 Feb 2024 21:54:57 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> **Problem**
>> Bailout `failure_reason` strings used to be either `nullptr` or a static string. But with [JDK-8303951](https://bugs.openjdk.org/browse/JDK-8303951) I had introduced dynamic strings (e.g. formatted using `stringStream::as_string()`). These dynamic strings were ResourceArea allocated, and have a very limited life-span, i.e. withing the most recent ResourceMark scope. The pointer is passed to `Compile::_failure_reason`, which already might leave that ResourceMark scope, and the pointer is invalid. And then the pointer is further passed to `ciEnv::_failure_reason`. And finally, it even gets passed up to `CompileBroker::invoke_compiler_on_method`. When the string is finally printed for the bailout message, we already left original `ResourceMark` scope, and also the `Compile` and `ciEnv` scopes. The pointer now points to basically random data.
>>
>> **Solution**
>> Whenever a string is passed to an outer scope, I make a CHeap copy, and the outer scope is the owner of that copy.
>> This way every scope is the owner of its own copy, and can allocate and deallocate according to its own strategy safely.
>>
>> I introduced a utility class `CHeapStringHolder`:
>> - `set`: make local copy on CHeap.
>> - `clear`: free local copy. We before `set`, and in the destructor. Thus, when the holder goes out of scope, the memory is automatically freed.
>>
>> We have these 4 scopes:
>> - `ResourceMark`: It allocates from `ResourcArea`, and deallocation is taken care at the end of the ResourceMark scope.
>> - `Compile`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while the `ResourceMark` is live.
>> - `ciEnv`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while `Compile` is live.
>> - `CompileTask`: We used to just set `failure_reason`, which assumes that the string is static or elsewhere managed. Now I use pattern available in other places of `CompileBroker::invoke_compiler_on_method`: duplicate the string with `os::strdup` from some other scope, and set `reason_on_C_heap = true`, which means that we are the owner of that copy, and are responsible for freeing it later. Not sure if that is a nice pattern, but I don't want to refactor that code and it does what I need it to do.
>
> I thought after our offline discussion that you will keep failure reasons in CompileTask.
> What happened to that approach?
@vnkozlov yes, that was the plan. But then @jdksjolen objected, see his comments above. The new approach would be a bit more "clean", as the ownership is now clear.
The problem with the arena: I can't use a simple Arena at CompileTask, since an Arena only grows. A CompileTask is reused for many compilations, and so potentially we would have many failures on that Arena, leading to uncontrolled memory usage. A ResourceMark would require a ResourceMark - that could be done I guess. But creating such ad-hoc ResourceAreas with their own ResourceMarks seems a bit ... well ad-hoc. So that is why @jdksjolen proposed to just have a string-holder that simply allocates itself. Allocation and ownership are straight forward this way.
Any yes, there should probably be some refactoring. I'm not sure why there are `failure_reason` reportings at 3 layers (maybe even more?): `Compile, ciEnv, CompileTask`. Maybe we can log the failure reason only at `CompileTask` level? I'd have to see if that is a feasible refactoring.
Would it be ok to push this BugFix, and then come up with a refactoring RFE? We probably also want to refactor the strange `reason_on_C_heap` mechanism on `CompileTask` level. If someone sets that flag wrong, we have a bad free, or a memory leak.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17710#issuecomment-1947481272
More information about the hotspot-dev
mailing list