Broken thread scheduling in indexed loop (missing safepoint?)

Fri Nov 30 15:52:59 PST 2012

On 11/30/12 15:37, Vitaly Davidovich wrote:
> What's the reason safepoints were yanked out in the first place? I'd
> assume safepoints aren't that frequent, at least while a typical loop is
> running.  Once the polling page is in cache (first iteration will bring
> it in) and assuming the test instruction on it will always go the same
> way (i.e. safepoint not requested) won't the cache hitting load and
> perfectly predicted (common case) branch not have noticeable impact on
> the loop? Or am I underestimating safepoint frequency? Or is the polling
> test in the loop inhibiting some optimizations due to control flow? Or
> something else? :)

It is not simply a touch of the polling page. It is a point where a method
could be deoptimized. For that, a SafePoint node has edges to the live values
in the graph (C2 IR), and that affects our current implementation of loop
optimizations (unrolling, iteration splitting, vectorization). So C2 will
generate less optimal loop code.
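
A minimal source-level sketch of the loop-splitting idea quoted further down
(split a loop into inner and outer loops), applied to the worker loop from the
original report. This is only an illustration: in the thread the split is
something the compiler would do, not the programmer. STRIP is a hypothetical
strip length, not a HotSpot flag or default, and whether a given JVM actually
polls in the long-indexed outer loop is an assumption based on the "long"
workaround quoted below.

```java
public class StripMinedLoop {
    // Hypothetical strip length chosen for illustration only.
    static final int STRIP = 1000;

    // Original shape: one int-indexed loop over the whole range.
    static double plain(int limit) {
        double d = 0;
        for (int j = 1; j < limit; j++) {
            d += Math.log(Math.E * j);
        }
        return d;
    }

    // Hand-split shape: a long-indexed outer loop walks the range in strips,
    // and a short int-indexed inner loop does the work for each strip.
    // The additions happen in the same order as in plain().
    static double stripMined(int limit) {
        double d = 0;
        for (long base = 1; base < limit; base += STRIP) {
            int end = (int) Math.min(base + STRIP, (long) limit);
            for (int j = (int) base; j < end; j++) {
                d += Math.log(Math.E * j);
            }
        }
        return d;
    }

    public static void main(String[] args) {
        int limit = 100_000;
        System.out.println(plain(limit));
        System.out.println(stripMined(limit));
    }
}
```

The inner loop keeps the int-indexed shape that the loop optimizations work
on, while any polling would be confined to the outer loop, once per strip.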

>
> Also, counted loops are only those with a compile time constant limit,
> right? Or does a read of a final field (or otherwise deduced by JIT to
> be a runtime constant) whose value is the loop limit also count?

No. A counted loop in C2 is a loop with a constant integer increment, one
entry path and one exit path, with the iteration check at the end of the
loop body. The lower and upper iteration limits can be variables.
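
To make the description above concrete, here is a small sketch of two loop
shapes in Java source. The class and method names are made up for this
example. Whether C2 really treats a particular loop as counted depends on the
JVM version and on the shape of the loop after parsing, so take the comments
as a reading of the description, not as a guarantee.

```java
public class LoopShapes {
    // int index, constant increment (+1), one exit test. The limits lo and hi
    // are ordinary variables, which the description above allows.
    static long sumIntIndex(int[] a, int lo, int hi) {
        long s = 0;
        for (int i = lo; i < hi; i++) {
            s += a[i];
        }
        return s;
    }

    // Same work with a long index. According to the workaround quoted below
    // in this thread, HotSpot is more cautious with such a loop and places a
    // safepoint in it.
    static long sumLongIndex(int[] a, long lo, long hi) {
        long s = 0;
        for (long i = lo; i < hi; i++) {
            s += a[(int) i];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = new int[10];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        System.out.println(sumIntIndex(a, 2, 7));
        System.out.println(sumLongIndex(a, 2, 7));
    }
}
```

Both methods compute the same sum; only the type of the loop variable differs.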

Vladimir

>
> Thanks
>
> Sent from my phone
>
> On Nov 30, 2012 6:19 PM, "Vladimir Kozlov" <Vladimir.Kozlov at oracle.com
> <mailto:Vladimir.Kozlov at oracle.com>> wrote:
>
>     On 11/30/12 12:51, Vitaly Davidovich wrote:
>
>         By the way, what would a fix involve? Polling for safepoint every X
>         trips through the loop where X is some reasonable value and then
>         overridable via a JVM arg? I guess otherwise the JVM would have to
>         know/estimate the cost of the loop body, which seems intractable
>         in general.
>
>
>     Another approach is "strip mining": split a loop into inner and
>     outer loops.
>     Polling less often than every iteration requires a test and branch,
>     which could be more expensive than a load with an unused result.
>     The current plan (6869327) is simply to keep the safepoint in a
>     counted loop when the flag is specified.
>
>     Vladimir
>
>
>         What does J9 do? Does it simply not strip out polling?
>
>         Thanks
>
>         Sent from my phone
>
>         On Nov 30, 2012 3:28 PM, "Volker Simonis"
>         <volker.simonis at gmail.com <mailto:volker.simonis at gmail.com>> wrote:
>
>              Hi Vitaly,
>
>              you're right, even a STW GC will have to wait until this loop
>              without a safepoint finishes. On the other hand, if there is an
>              allocation inside the loop, the allocation will be done on the
>              "fast path" (without a safepoint check) only until the thread
>              local buffer (TLAB) is full. After that it will have to take the
>              "slow path", which is a VM call and which has a safepoint check.
>              So in practice I think the problem is only with very tight loops
>              which do some small but expensive computations (or with nested
>              int loops, which potentially have the same complexity as long
>              loops but no safepoint either).
>
>              Regards,
>              Volker
>
>              On Friday, November 30, 2012, Vitaly Davidovich wrote:
>
>                  Hi Volker,
>
>                  Just curious - what happens if a STW GC needs to occur right
>                  as this type of loop is entered? Does the VM just stall until
>                  the loop exits? What if this loop does a fast path allocation
>                  on some iteration? Do all allocations check for safepoints
>                  internally?
>
>                  Thanks
>
>                  Sent from my phone
>
>                  On Nov 30, 2012 1:41 PM, "Volker Simonis"
>         <volker.simonis at gmail.com <mailto:volker.simonis at gmail.com>> wrote:
>
>                      Hi,
>
>                      This is a long-standing problem of HotSpot (compared,
>                      for example, to J9). It doesn't put safepoints into
>                      counted int loops (because it assumes they will terminate
>                      just "fast enough", which is not the case in your
>                      example). You can see another example of this behavior in
>                      these slides,
>                      http://www.progdoc.de/papers/Jax2012/jax2012.html#%288%29
>                      together with the generated assembler code.
>
>                      You can easily solve the problem by making your loop
>                      variable a "long" instead of an "int". In that case,
>                      HotSpot will be more cautious and place a safepoint into
>                      the loop.
>
>                      Regards,
>                      Volker
>
>                      On Fri, Nov 30, 2012 at 2:05 PM, Alexey Goncharuk
>         <agoncharuk at gridgain.com <mailto:agoncharuk at gridgain.com>> wrote:
>          > Hi,
>          >
>          > We faced a weird issue with thread scheduling. At first glance
>          > it looked like it relates to
>          > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7160161
>          > but not exactly.
>          >
>          > This is the code we ran:
>          >
>          > public static void main(String[] args) throws Exception {
>          >     Thread worker = new Thread() {
>          >         @Override public void run() {
>          >             double d = 0;
>          >             for (int j = 1; j < 2000000000; j++)
>          >                 d += Math.log(Math.E * j);
>          >             System.out.println(d);
>          >         }
>          >     };
>          >     Thread reporter = new Thread() {
>          >         @Override public void run() {
>          >             try {
>          >                 while (true) {
>          >                     Thread.sleep(1000);
>          >                     System.out.println("Running: " + System.currentTimeMillis());
>          >                 }
>          >             } catch (InterruptedException ignored) {
>          >                 Thread.currentThread().interrupt();
>          >             }
>          >         }
>          >     };
>          >     reporter.start();
>          >     worker.start();
>          >     worker.join();
>          >     reporter.interrupt();
>          > }
>          >
>          > One would expect the printing thread to output messages during the
>          > whole calculation, however it hangs after 3-4 iterations. Setting
>          > -XX:FreqInlineSize=0 as described in the original bug report does
>          > not help in this case, but if I extract the loop body into a
>          > separate method, setting this option works. The example passes with
>          > the -Xint option as well. (Tested with 1.6.0_33, 1.6.0_37, 1.7.0_07
>          > on Windows and 1.6.0_33 on Linux)
>          >
>          > I saw #7160161 marked as resolved, so I just wanted to confirm
>          > whether the behavior we see really relates to this issue and whether
>          > it was fixed (the bug report covers non-counted loops only).
>          >
>          > Also, are there any other workarounds other than extracting the
>          > method and specifying FreqInlineSize=0?
>          >
>          > Thanks,
>          > Alexey Goncharuk
>          >
>