RFR: 8262377: Parallel class resolution loses constant pool error [v4]
David Holmes
dholmes at openjdk.java.net
Wed Mar 10 06:26:08 UTC 2021
On Wed, 10 Mar 2021 02:03:32 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:
>> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index.
>>
>> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions.
>>
>> I didn't squash the commits so it would be easier to see the different changes, but they all go together.
>>
>> The test description:
>>
>> Two Threads T1, T2
>>
>> Three definitions of class A, defined by user defined class loader
>> Class A extends B extends A (CCE)
>> Class A extends B
>> Class A extends C
>>
>> Five modes:
>> Sequential
>> Concurrent loading with user defined class loader
>> Concurrent loading parallelCapable class loader
>> Wait when loading the superclass with parallelCapable class loader
>> Wait when loading the superclass with user defined class loader
>>
>> In all cases, after A is parsed and calls resolve_super_or_fail to load B
>> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide
>> constant pool references to A.
>>
>> In all cases, when B waits, A is replaced with bytes so A extends C.
>>
>> Two tests x 3 modes (both threads do the same):
>> (CCE) First test A extends B, which throws CCE.
>> -- All three modes: first constant pool reference throws CCE, second reference A extends C
>> (B) Second test A extends B which doesn't throw CCE.
>> -- All three modes: both references A extends B.
>>
>> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1
>> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is
>> currently loading.
>>
>> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug.
>>
>> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove complicated test in favor of Ioi's test, fix cause.
Hi Coleen,
That took some going through - sorry it took me a while to get back to it. The addition of the "cause" processing would have been better separated into its own RFE to keep things simple.
I'll defer to Vladimir on the actual compiler changes. The rest I think I understand okay. I have a couple of additional comments below.
Thanks,
David
src/hotspot/share/oops/constantPool.cpp line 561:
> 559: // We also need to CAS to not overwrite an error from a racing thread.
> 560:
> 561: jbyte old_tag = Atomic::cmpxchg((jbyte*)this_cp->tag_addr_at(which),
So the theory of operation is: if all racing threads resolve the klass successfully then they will have obtained the same klass instance and so it doesn't matter which thread actually called Atomic::release_store(adr, k);
above most recently. One thread will manage to update the tag and all threads will return 'k'.
If any thread encounters a resolution error there is a another CAS of the tag to ensure only one of them is first. If the error thread is first the others will see the UnresolvedClassInError and clear the klass from resolved_klasses() again.
src/hotspot/share/oops/constantPool.cpp line 568:
> 566: if (old_tag == JVM_CONSTANT_UnresolvedClassInError) {
> 567: // Remove klass.
> 568: Atomic::release_store(adr, (Klass*)NULL);
You don't need a release_store in this case as you just did a CAS which has full memory synchronization. But in addition as you are setting it to NULL there are no prior stores that you need to ensure are visible.
src/hotspot/share/oops/constantPool.cpp line 587:
> 585:
> 586: if (this_cp->tag_at(which).is_klass()) {
> 587: Klass* k = this_cp->resolved_klasses()->at(resolved_klass_index);
If this can't be encapsulated in a helper function please add a comment about checking the tag first.
test/hotspot/jtreg/runtime/ParallelLoad/SaveResolutionErrorTest.java line 158:
> 156: public Class loadClass(String name) throws ClassNotFoundException {
> 157: if (name.equals("SaveResolutionErrorTest$Loadee")) {
> 158: if (hack()) {
You could just use a boolean field 'first'.
src/hotspot/share/oops/constantPool.cpp line 877:
> 875: // Needs clarification to section 5.4.3 of the VM spec (see 6308271)
> 876: } else if (this_cp->tag_at(which).value() != error_tag) {
> 877: add_resolution_error(this_cp, which, tag, PENDING_EXCEPTION);
If the CAS below fails, which is this addition of the resolution error not undone?
-------------
Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2718
More information about the hotspot-dev
mailing list