RFR: 8338379: Accesses to class init state should be properly synchronized [v2]

Coleen Phillimore coleenp at openjdk.org
Wed Sep 25 13:12:38 UTC 2024


On Mon, 23 Sep 2024 07:17:50 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> See the bug for the discussion. We have not seen clear evidence that this is _the_ problem in the field, nor were we able to come up with a reproducer. We found this gap by inspecting the code while chasing a production bug.
>> 
>> In short, `InstanceKlass::_init_state` is used as the "witness" for initialized class state. When class initialization completes, it needs to publish the class state by writing `_init_state = _fully_initialized` with release semantics. The current patch makes a seqcst write, which is stronger than strictly necessary. I think it is okay to be extra paranoid on the rarely executed class initialization path.
>> 
>> Various accessors that poll `IK::_init_state`, looking for class initialization to complete, need to read the field with acquire semantics. This is where the change fans out, touching VM, interpreter and compiler paths that e.g. implement clinit barriers. In some cases in assembler code, we can rely on the hardware memory model to do what we need (i.e. acquire barriers/fences are nops).
>> 
>> I made a best _guess_ at what the ARM32, S390X, PPC64, RISC-V code should look like, based on what related code does for volatile loads. It would be good if port maintainers could sanity-check those.
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>>  - [x] GHA to test platform buildability + adhoc platform cross-compilation
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Relax to just a release

I was looking through the code: we set the "loaded" state under the Compile_lock (because of dependencies in add_to_hierarchy), and we set the "linked", "being_initialized", "fully_initialized" and "initialization_error" states under the init_lock object (which I want to change again), with a notify for the latter two.  Using a load_acquire to examine the state (and a release_store to write it) seems like the right thing to do: because there isn't just one lock, we should assume that reading this state is lock-free.
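
For illustration only, here is a minimal sketch of that release/acquire pairing in standard C++. It is not HotSpot code: `ToyKlass`, `InitState`, `static_field` and `run_demo` are hypothetical names, and `std::atomic` stands in for the VM's own load_acquire/release_store helpers.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the class init states; not the HotSpot enum.
enum class InitState : unsigned char {
  loaded, linked, being_initialized, fully_initialized
};

// Hypothetical toy class; it only models the publication pattern.
struct ToyKlass {
  int static_field = 0;  // plain data written by the "initializer"
  std::atomic<InitState> init_state{InitState::loaded};

  void initialize() {
    static_field = 42;  // ordinary, non-atomic write
    // Release store: the write to static_field cannot be reordered after
    // this store, so the state acts as the "witness" for the data.
    init_state.store(InitState::fully_initialized, std::memory_order_release);
  }

  bool is_initialized() const {
    // Acquire load: pairs with the release store above. No lock is taken.
    return init_state.load(std::memory_order_acquire) ==
           InitState::fully_initialized;
  }
};

// One thread initializes, the caller polls lock-free and then reads the data.
int run_demo() {
  ToyKlass k;
  std::thread writer([&k] { k.initialize(); });
  while (!k.is_initialized()) {
    std::this_thread::yield();
  }
  int seen = k.static_field;  // ordered after the acquire load that saw the state
  writer.join();
  return seen;
}
```

Calling `run_demo()` returns the value written by the initializer thread. With a relaxed or plain load in `is_initialized()`, the reader would have no ordering guarantee for `static_field` on weakly ordered hardware.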

It looks like the C2 code optimizes away the clinit_barrier when possible, so we can watch for any performance difference, but I'd still rather have safety.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21110#issuecomment-2374046227

