RFR: 8314850: SharedRuntime::handle_wrong_method() gets called too often when resolving Continuation.enter

Oli Gillespie ogillespie at openjdk.org
Wed Aug 23 15:05:23 UTC 2023


On Wed, 23 Aug 2023 14:25:12 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

> Please review the following fix. The check in SharedRuntime::reresolve_call_site() "caller.is_compiled_frame() && !caller.is_deoptimized_frame()" fails when caller is Continuation.enterSpecial because it is a native method. This means that if the static callsite to Continuation.enter was patched to point to an nmethod and then the nmethod was marked non-entrant we will always call SharedRuntime::handle_wrong_method() thereafter when resolving Continuation.enter because the callsite will never be reset back to the clean state. Thanks to @kuksenko for all the performance analysis while debugging this.
> I tested the patch by running the test included in the original report (loom-dev mailing list) and verifying the issue is fixed. I also ran mach5 tiers1-3 for sanity testing and will run all the upper tiers too.
> 
> Thanks,
> Patricio
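
To make the failure mode described above easier to follow, here is a toy model of it in Java. This is only a sketch and not HotSpot code: the class `CallSiteModel`, its nested types, and the `callerPassesCheck` flag are all hypothetical names. The flag stands in for the "caller.is_compiled_frame() && !caller.is_deoptimized_frame()" check; the sketch does not model how the actual patch changes that check.

```java
public class CallSiteModel {
    // Hypothetical stand-in for a compiled method; not a HotSpot type.
    static final class Nmethod {
        boolean nonEntrant;
    }

    // Hypothetical stand-in for a static call site.
    static final class CallSite {
        Nmethod target;       // null means "clean": the next call resolves again
        int wrongMethodCalls; // how often the slow path (handle_wrong_method) ran
    }

    // One call through the site. 'current' is the nmethod a fresh resolution would pick.
    static void call(CallSite site, Nmethod current, boolean callerPassesCheck) {
        if (site.target == null) {
            site.target = current;        // resolve and patch the call site
            return;
        }
        if (site.target.nonEntrant) {
            site.wrongMethodCalls++;      // models SharedRuntime::handle_wrong_method()
            if (callerPassesCheck) {
                site.target = null;       // call site is reset back to the clean state
            }
        }
    }

    // Patch the site, mark the target non-entrant, then call 'calls' more times.
    public static int simulate(boolean callerPassesCheck, int calls) {
        Nmethod old = new Nmethod();
        CallSite site = new CallSite();
        call(site, old, callerPassesCheck);   // first call patches the site to 'old'
        old.nonEntrant = true;                // 'old' is marked non-entrant
        Nmethod replacement = new Nmethod();
        for (int i = 0; i < calls; i++) {
            call(site, replacement, callerPassesCheck);
        }
        return site.wrongMethodCalls;
    }

    public static void main(String[] args) {
        System.out.println("check passes: " + simulate(true, 1000) + " slow-path calls");
        System.out.println("check fails:  " + simulate(false, 1000) + " slow-path calls");
    }
}
```

In this model, a caller that passes the check takes the slow path once and then gets re-resolved, while a caller that fails it takes the slow path on every call, which is the behaviour reported for Continuation.enterSpecial.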

Thanks for working on this.

I'm not sure if this is useful, but I just did a quick test with the patch and I see some interesting performance changes. Context switches in the test go from ~160k to ~1.1m, and elapsed time increases from 4 seconds to 4.8 seconds. However, I do see lower CPU utilization (11.2 CPUs utilized before, 8.2 after).


import java.util.concurrent.*;

public class Main {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100; i++) {
            ExecutorService e = Executors.newVirtualThreadPerTaskExecutor();
            long start = System.currentTimeMillis();
            for (int j = 0; j < 100_000; j++) e.submit(Main::task);
            e.shutdown(); e.awaitTermination(100, TimeUnit.MINUTES);
            long dur = System.currentTimeMillis() - start;
            System.out.println(dur + "ms");
        }
    }

    static void task() { }
}



perf stat java -XX:+UnlockExperimentalVMOptions -XX:-DoJVMTIVirtualThreadTransitions Main.java


Before patch

         45,069.79 msec task-clock                #   11.214 CPUs utilized
           159,503      context-switches          #    0.004 M/sec
            16,250      cpu-migrations            #    0.361 K/sec
           182,820      page-faults               #    0.004 M/sec
           
       4.019077281 seconds time elapsed

      44.004389000 seconds user
       1.844216000 seconds sys



After patch

         39,830.77 msec task-clock                #    8.217 CPUs utilized <--- reduced
         1,094,192      context-switches          #    0.027 M/sec <--- increased
            91,803      cpu-migrations            #    0.002 M/sec
           185,841      page-faults               #    0.005 M/sec

       4.847430580 seconds time elapsed <--- increased

      34.643548000 seconds user
      10.396489000 seconds sys
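
For anyone who wants the relative changes rather than the raw counters, a small sketch like the following computes them. The class and method names (`PerfDelta`, `percentChange`) are made up for illustration; the input values are copied from the two perf stat outputs above.

```java
public class PerfDelta {
    // Relative change from 'before' to 'after', as a percentage.
    public static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Values copied from the perf stat output (before patch, after patch).
        report("task-clock (msec)", 45_069.79, 39_830.77);
        report("context-switches", 159_503, 1_094_192);
        report("cpu-migrations", 16_250, 91_803);
        report("seconds elapsed", 4.019077281, 4.847430580);
        report("seconds user", 44.004389, 34.643548);
        report("seconds sys", 1.844216, 10.396489);
    }

    // Prints one counter with its signed percentage change.
    static void report(String name, double before, double after) {
        System.out.printf("%-20s %+.1f%%%n", name, percentChange(before, after));
    }
}
```

The counters move in different directions (user time down, sys time and context switches up), so no single number summarizes the effect of the patch on this test.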


I appreciate this test may be totally unrealistic and flawed, and I haven't had time to check where the context switches are coming from, but I figured I'd share the early results just in case.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15403#issuecomment-1690130076


More information about the hotspot-dev mailing list