From john.r.rose at oracle.com Fri May 1 01:36:53 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 30 Apr 2020 18:36:53 -0700
Subject: RFR(M): 8223051: support loops with long (64b) trip counts
In-Reply-To: <87lfmd8lip.fsf@redhat.com>
References: <87lfmd8lip.fsf@redhat.com>
Message-ID:

A thousand thanks for taking this on!

On Apr 30, 2020, at 12:45 AM, Roland Westrelin wrote:
>
> https://bugs.openjdk.java.net/browse/JDK-8223051
> http://cr.openjdk.java.net/~roland/8223051/webrev.00/
>
> This transforms a long counted loop into a strip mined loop nest. That
> is roughly:
>
> for (long l = long_start; l < long_stop; l += long_stride) {
> }
>
> into
>
> for (long l = long_start; l < long_stop; ) {
>   int int_stride = (int)long_stride;
>   int int_stop = MIN(long_stop - l, max_jint - int_stride);
>   l += int_stop;
>   for (int i = 0; i < int_stop; i += int_stride) {
>   }
> }

It would be good to put this pseudo-code above the definition of
PhaseIdealLoop::is_long_counted_loop, but with names adjusted to
match the code. Let me sketch it in terms of the names in the code:

L: for (long phi = init; phi < limit; phi += stride) {
    // phi := Phi(L, init, phi + stride)
    ... use phi and (phi + stride) ...
}

==transform=>

const long inner_iters_limit = INT_MAX - stride;
assert(stride <= inner_iters_limit);  // else deopt
assert(limit + stride <= LONG_MAX);   // else deopt
L1: for (long phi1 = init; phi1 < limit; phi1 += stride) {
    // phi1 := Phi(L1, init, phi1 + stride)
    long inner_iters_max = MAX(0, limit + stride - phi1);
    long inner_iters_actual = MIN(inner_iters_max, inner_iters_limit);
    L2: for (int phi2 = 0; phi2 < inner_iters_actual; phi2 += stride) {
        ... use (phi1 + phi2) and (phi1 + phi2 + stride) ...
    }
}

> This is implemented as a separate transformation from loop strip mining
> of JDK-8186027. I used the logic from JDK-8186027 as inspiration but
> it's really quite different.

Thank you for pursuing it!
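
The strip-mined shape sketched above can be checked against the plain loop in ordinary C++. What follows is only a minimal sketch under stated assumptions, not the C2 transformation itself: it handles positive strides only, assumes `limit - phi1` and `limit + stride` do not overflow, advances the outer induction variable by the inner variable's exit value (a simplification chosen here for clarity), and takes `inner_iters_limit` as a parameter so that a small value forces several outer iterations. The function names `plain_loop` and `strip_mined_loop` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Reference: the original loop with a 64-bit induction variable.
std::vector<int64_t> plain_loop(int64_t init, int64_t limit, int64_t stride) {
  assert(stride > 0);
  std::vector<int64_t> seen;
  for (int64_t phi = init; phi < limit; phi += stride) {
    seen.push_back(phi);
  }
  return seen;
}

// Strip-mined form: a 64-bit outer loop around a 32-bit inner loop.
// inner_iters_limit is a parameter here (an assumption of this sketch)
// so that small values exercise both loop levels.
std::vector<int64_t> strip_mined_loop(int64_t init, int64_t limit,
                                      int64_t stride,
                                      int64_t inner_iters_limit) {
  const int64_t max_jint = std::numeric_limits<int32_t>::max();
  assert(stride > 0 && stride < max_jint);
  assert(inner_iters_limit >= 1 && inner_iters_limit <= max_jint - stride);
  std::vector<int64_t> seen;
  int64_t phi1 = init;
  while (phi1 < limit) {
    const int64_t remaining = limit - phi1;  // > 0; assumed not to overflow
    const int32_t inner_stop =
        static_cast<int32_t>(std::min(remaining, inner_iters_limit));
    int32_t phi2 = 0;
    // phi2 + stride cannot overflow: phi2 < inner_stop <= max_jint - stride.
    for (; phi2 < inner_stop; phi2 += static_cast<int32_t>(stride)) {
      seen.push_back(phi1 + phi2);
    }
    phi1 += phi2;  // advance by the inner IV's exit value
  }
  return seen;
}
```

Calling both functions with the same `init`, `limit` and `stride` should yield the same sequence of values for any valid `inner_iters_limit`; a small limit such as 4 makes the outer loop run more than once.
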
> If JDK-8186027's loop strip mining is enabled the loop nest above can be
> further transformed into:
>
> for (long l = long_start; l < long_stop; ) {
>   for (int i = 0; i < int_stop; i += int_stride) {
>     for (int j = i; j < LoopStripMiningIter; j += int_stride) {
>     }
>   }
> }

Nice. This is worth a stress test; I think my comments below
set up the conditions for allowing all three loop layers to be
non-trivial, in a stress test.

> I refactored the code of PhaseIdealLoop::is_counted_loop() so it was
> straightforward to add a PhaseIdealLoop::is_long_counted_loop() that
> shares some logic with PhaseIdealLoop::is_counted_loop().

Thanks. I have suggested a few more points of refactoring below.

> is_long_counted_loop() starts by looking at the shape of the loop and if
> its shape is that of a counted loop with a long induction variable, then
> an outer loop is added with a long induction variable. An int induction
> variable is constructed for the inner loop. At this point the loop nest
> is only partially constructed.
>
> is_long_counted_loop() then attempts the conversion of the inner loop
> into an int counted loop with a call to is_counted_loop(). If that fails
> for some rare corner case, is_long_counted_loop() backs off and
> transforms the loop nest back so it's a single long loop again.

(See below about an extra diagnostic counter for this step.)

> If the inner loop is successfully converted into a counted loop,
> is_long_counted_loop() finishes building the loop nest. This is
> different from JDK-8186027's loop strip mining which builds the loop
> nest in 2 phases: first a skeleton outer loop and after loop opts, the
> fully built loop nest.
>
> I also added stressing code that turns:
>
> for (int i = int_start; i < int_stop; i += int_stride) {
> }
>
> into:
>
> for (long l = (long)int_start; l < (long)int_stop; l += (long)int_stride) {
> }
>
> that can then be converted into the loop nest above.
> The reason for this is that I was concerned long loops were too
> uncommon in the wild for this change to be properly tested.

Very good move!

> I had to change the asserts in loopopts.cpp, because all nodes that are
> added when the loop nest is constructed have the same dom depth.

See request below for a small comment.

>
> This change doesn't handle RCE. I'll work on that next.

Thank you. The sort of RCE I hope to get to, eventually, is
something which performs range checks on long values (not
just int values), and still allows iteration range splitting.
So:

L: for (long phi = init; phi < limit; phi += stride) {
    // phi := Phi(L, init, phi + stride)
    if (phi >= 0 && phi < max) {
        ... use phi ...
    }
}

==transform=>

...
L1: for (long phi1 = init; phi1 < limit; phi1 += stride) {
    // phi1 := Phi(L1, init, phi1 + stride)
    ...
    int phi2 = 0;
    if (phi1 + phi2 < 0) {
        L2_PRE: ...
    }
    assert((phi1 + phi2) >= 0);
    L2: for (; phi2 < MIN(inner_iters_actual, phi1 - max); phi2 += stride) {
        ... use phi, without range check ...
    }
    if (phi1 + phi2 >= max) {
        L2_POST: ...
    }
}

Now for specific comments:

This code, which is the use of clone_loop_predicates with
the new boolean, is adequately documented, but I'd prefer
to see the boolean even more clearly called out:

+  // Keep the outer loop below the existing predicates (some
+  // predicates may have been added already) and clone non-concrete
+  // predicates between the int loop head and the long loop head.
++ const bool not_concrete = false;
++ Node* predicate_proj = clone_loop_predicates(entry_control, outer_head, true, false, 0, NULL, not_concrete);

In this webrev I didn't see a discussion of "what does concrete
mean"; I assume it is elsewhere, and that I would know it if I
were familiar with the loop optimizations (which very few people
are!). Maybe add:

+  // See 'GraphKit::add_empty_predicates'.

(Is a concrete predicate one which was inserted above an empty
place-holder by a predication transformation?
In that case, I might wish to call those "predication tests", or some
other term tied to the name of a specific optimization transform,
rather than "concrete predicates".)

I have a similar comment about the new "update_body" flag.
For that, I suggest writing something with an explanatory
comment near the top of blocks of code which need it:

++ // Although this code creates new loop structure, it does
++ // not change the loop nest, because the nest itself is only
++ // partially constructed at this point.
++ const bool same_loop = false;  // don't update loop bodies
   ...
++ register_control(outer_le, outer_ilt, iffalse, same_loop);
++ register_control(outer_ift, outer_ilt, outer_le, same_loop);

The point of this trick of using a symbolic name instead of
a boolean constant is (a) to make the use of it a little easier
to follow in the function call, and (b) to allow the declaration
of the constant to carry its own little load of documentation.

(BTW I find it slightly surprising when an optional flag
defaults to true rather than false; looks like you are doing
the opposite which I suppose is a reasonable convention also.)

In loopopts.cpp I suggest adding a very brief comment saying
when dom equality can occur. (In both places, I guess.)
It might help an integrator at some point, since the change
is so isolated and textually simple (without a comment).

That's a nice job refactoring. C2 has notoriously long methods,
and this makes some of them incrementally more readable.
If you went to some trouble to make the webrev diff smaller,
thank you. It's hard to refactor blocks out of mega-functions
without extra diff noise.

That said, "is_long_counted_loop" is a new mega-function.
I hope it doesn't have to be refactored any time soon.
I think you did the right thing here, even though it's
possible to imagine more refactoring.

I think you inserted the check to LoopStripMiningIter twice.
You should probably pick one copy to remove.
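
The named-flag idiom recommended above for `not_concrete` and `same_loop` can be shown in isolation. This is a minimal sketch under stated assumptions: `LoopBody`, `register_node` and `build_partial_nest` are toy stand-ins invented for illustration, and they do not reproduce HotSpot's types or the real signature of `register_control`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a loop's bookkeeping; not a HotSpot type.
struct LoopBody {
  std::vector<std::string> registered;  // every node handed to register_node
  std::vector<std::string> body;        // nodes also recorded in the loop body
};

// Hypothetical helper with a trailing boolean flag.
void register_node(LoopBody& loop, const std::string& node, bool update_body) {
  loop.registered.push_back(node);
  if (update_body) {
    loop.body.push_back(node);
  }
}

LoopBody build_partial_nest() {
  LoopBody outer;
  // The nest is only partially constructed here, so bodies are left alone.
  const bool same_loop = false;  // don't update loop bodies
  register_node(outer, "outer_le", same_loop);
  register_node(outer, "outer_ift", same_loop);
  return outer;
}
```

At the call sites the reader sees `same_loop` rather than a bare `false`, and the declaration is a natural place to hang the explanatory comment.
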
Suggest adding this comment:

+  if (cmp == NULL || cmp->Opcode() != Op_CmpI) {
+    return false;
++   // Avoid pointer & float & 64-bit compares

And a similar one that says "32-bit compares".

Suggest adding an assert to loop_iv_stride:

++ assert(incr->Opcode() == Op_AddI || incr->Opcode() == Op_AddL, "caller resp.");

When refactoring mega-functions, I think it helps sometimes to
add such asserts to make it clear who is responsible for checking
what.

It would be nice to add an extra filter step to loop_iv_incr, so
it always returns an add, but that would require the refactoring
to touch the truncation detector more than you did, so I think
that's OK. You could just add a comment to loop_iv_incr saying,
"caller must ensure that this is an increment of the expected kind".
Or not; that's pretty obviously true. I'm more sure that the
previously suggested assert is helpful.

Now for some arithmetic. I think this test is not obviously correct:

+  if (ABS(stride_con) >= max_jint) {

You need an unsigned comparison here, or some other dodge,
to avoid the negative vibes from ABS(min_jlong). I think the
best thing to do is add an extra overflow check which is easy
to read, rather than checking for that one extra case:

++ if (stride_con != (jint)stride_con || ABS(stride_con) >= max_jint) {

I think you should also check for "stride_con == 0" here.
The 32-bit code checks for zero strides but I don't see it
here. Perhaps that would fold up later after the 32-bit loop
is created? But it seems tidier to keep the 32-bit and 64-bit
code as parallel as possible.

Also, I think this test on stride belongs after you increment
_long_loops, not before. A long loop with a really big stride
is, after all, still a long loop.
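
The hazard with `ABS(min_jlong)` is that negating the most negative 64-bit value overflows, so the result is still negative and slips past a `>= max_jint` test. One way to sidestep it, sketched here as an assumption rather than as the code in the webrev, is to compare against both ends of the allowed range before any negation happens. The helper name `stride_fits_inner_loop` is hypothetical; it accepts the same strides as the combined check suggested above and additionally rejects zero.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true when a 64-bit stride constant is usable as the
// stride of a 32-bit inner loop. Accepts 0 < |stride_con| < max_jint.
bool stride_fits_inner_loop(int64_t stride_con) {
  const int64_t max_jint = std::numeric_limits<int32_t>::max();
  if (stride_con == 0) {
    return false;  // zero strides are rejected, as in the 32-bit code
  }
  // Two-sided range test: no ABS, so min_jlong is handled without overflow.
  if (stride_con >= max_jint || stride_con <= -max_jint) {
    return false;
  }
  return true;
}
```

Because the comparison never negates its argument, the case that trips up `ABS` needs no special handling.
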
BTW, this should be a jlong, not a plain C long:

++ jlong stride_con = stride->get_long();

The following logic is hard for me to prove correct, and I would
rather it miss a few valid cases in exchange for a simpler form:

+  if (phi_incr != NULL && iters_limit <= ABS(stride_con)) {

I suggest either detuning the check in its current place:

++ if (iters_limit <= ABS(stride_con)) {

Or else move the check into the following conditional:

+  // if the loop exit test is on the IV before it is incremented:
+  // i < limit, we transform the exit test so it is performed on the
+  // exit test after it is incremented: i + stride < limit + stride.
+  // We need limit + stride to not overflow.
+  if (phi_incr != NULL) {
++   if (iters_limit <= ABS(stride_con)) {
++     return false;
++   }

It helped me to associate this block with the later introduction
of "adjusted_limit"; suggest:

++ // We need limit + stride to not overflow. See adjusted_limit below.

Also, the immediately following range check logic is very odd,
and doesn't correspond to anything in the 32-bit loop code.
What you appear to be doing here is checking how limit_t is
going to behave with respect to the stride and there are three
possibilities: It can sometimes cause 64-bit overflow, or it
cannot cause 64-bit overflow, or it *must* cause 64-bit overflow.
The two parts of this check are widely separated and not
obviously connected, although they are linked by a common
task of predicate insertion. I suggest making it more readable
by writing a range-check helper function, actually two of them:

// Return 0 if it won't overflow, -1 if it must overflow, and 1 otherwise.
static int check_stride_overflow(jint stride_con, const TypeInt* limit_t) {
  if (stride_con > 0) {
    if (limit_t->_lo > max_jint - stride_con) {
      return -1;
    }
    if (limit_t->_hi > max_jint - stride_con) {
      return 1;
    }
  } else {
    if (limit_t->_hi < min_jint - stride_con) {
      return -1;
    }
    if (limit_t->_lo < min_jint - stride_con) {
      return 1;
    }
  }
  return 0;
}

static int check_stride_overflow(jlong stride_con, const TypeLong* limit_t) {
  if (stride_con > 0) {
    if (limit_t->_lo > max_jlong - stride_con) {
      return -1;
    }
    if (limit_t->_hi > max_jlong - stride_con) {
      return 1;
    }
  } else {
    if (limit_t->_hi < min_jlong - stride_con) {
      return -1;
    }
    if (limit_t->_lo < min_jlong - stride_con) {
      return 1;
    }
  }
  return 0;
}

Then use the functions appropriately, which will tend to
make the case analysis a little easier to understand.
In the case of the 32-bit function, it will simplify
(and clarify) the tri-state logic that begins like this:

if (limit->is_Con()) {
  int limit_con = limit->get_int();
  if ((stride_con > 0 && limit_con > (max_jint - stride_m)) ||
      (stride_con < 0 && limit_con < (min_jint - stride_m))) {
    // Bailout: it could be integer overflow.

The simplified version could look like:

int sov = check_stride_overflow(stride_m, limit_t);
// If sov==0, limit's type always satisfies the condition, for example,
// when it is an array length.
if (sov != 0) {
  if (sov < 0) {
    return false;  // Bailout: integer overflow is certain.
  }
  ... prepare dynamic limit check ...
}

I think I'm on the right track here because in both 32-bit
and 64-bit code these checks adjoin predicate insertion logic,
specifically the call to find_predicate_insertion_point.

The 64-bit version of the code should look similar, I think.
Since some of the predicate insertion work is delayed, a
"todo" flag might be set. (You may notice I like to use names
for reifying logical relations between different parts of the code.)
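
The tri-state result of `check_stride_overflow` is easy to exercise outside HotSpot. The sketch below is an illustration under stated assumptions: `Range<T>` is a hypothetical stand-in for the `_lo`/`_hi` bounds of `TypeInt` and `TypeLong`, and folding the 32-bit and 64-bit helpers into one template is a choice made for this example, not something proposed in the review. The body of the function follows the two helpers above line for line.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the [_lo, _hi] bounds carried by TypeInt/TypeLong.
template <typename T>
struct Range {
  T lo;
  T hi;
};

// Return 0 if limit + stride won't overflow, -1 if it must overflow,
// and 1 if it may overflow for some values in the range.
template <typename T>
int check_stride_overflow(T stride_con, Range<T> limit_t) {
  const T max_v = std::numeric_limits<T>::max();
  const T min_v = std::numeric_limits<T>::min();
  if (stride_con > 0) {
    if (limit_t.lo > max_v - stride_con) {
      return -1;  // even the smallest limit overflows
    }
    if (limit_t.hi > max_v - stride_con) {
      return 1;   // only the larger limits overflow
    }
  } else {
    if (limit_t.hi < min_v - stride_con) {
      return -1;  // even the largest limit underflows
    }
    if (limit_t.lo < min_v - stride_con) {
      return 1;   // only the smaller limits underflow
    }
  }
  return 0;
}
```

A limit typed like an array length combined with a small stride gives 0 (no check needed), a limit whose range reaches the maximum value gives 1 (a dynamic limit check is needed), and a limit pinned at the maximum value gives -1 (bail out).
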
The new node inner_iters_max would read a little better with a comment:

+  _igvn.register_new_node_with_optimizer(inner_iters_max);
++ // inner_iters_max is MAX(0, adjusted_limit - iv), when stride > 0

The definition of adjusted_limit also deserves comment, for that
matter. Perhaps this one, lifted from the 32-bit code:

// If compare points directly to the phi we need to adjust
// the compare so that it points to the incr.

In fact, the 32-bit code near that comment should, in my opinion,
also be edited to use a different variable adjusted_limit, even though
that variable will have a short span. There is value to having the
32-bit and 64-bit code look as similar as possible. The fact that
this comment already occurs twice in the 32-bit code is additional
evidence that there should be an adjusted_limit variable, defined
near the *first* occurrence of that comment, in the 32-bit code.
I think such a cleanup is a desirable way to reduce the cost of
making a new almost-copy of the 32-bit code, in its 64-bit form.

Another milestone in the arithmetic that deserves a comment
is this definition:

+  _igvn.register_new_node_with_optimizer(inner_iters_actual);
++ // inner_iters_actual is unsigned MIN(inner_iters_max, max_jint - ABS(stride))
++ // this is the 32-bit number of iterations to execute in the inner loop

(Why is it unsigned? I think its operands are never negative.)

You know, after decoding the above MIN and MAX expressions,
it seems to me that it might be time to introduce helper functions
in GraphKit to create such MIN and MAX instructions. They are
really hard to read as-is. Something like:

GraphKit::build_min_max(BasicType bt, bool is_max, bool is_unsigned);
GraphKit::signed_max(BasicType bt) { return build_min_max(bt, true, false); }
GraphKit::unsigned_min(BasicType bt) { return build_min_max(bt, false, true); }
etc.

There's a backtrack at this point:

+  // That fails. Undo graph changes we've done so far.
I think that should collect a count somewhere, to be reported
as part of statistics. That way, when you run stress tests,
you can ensure that they include this backtrack path.
I see that's counted, in part, by the difference of
_long_loops_success and _long_loops, but maybe another
counter here wouldn't be a bad idea, since there is a lot of
work to undo at this very late point.

BTW, I like StressLongCountedLoop a lot; it's a nice test.

I suggest a second stress mode, in which the pinning against the
jint range (in places like max_jint - ABS(stride)) is replaced
by pinning against a value like max_jint/100. The point would
be to ensure that both outer and inner layers of the decomposed
loops get a chance to run. As StressLongCountedLoop is currently
formulated, it will only run the outer loop once, right?

The stress mode logic also deserves a counter, so we can tell how
many loops were actually promoted.

In the finale of the code, the IV and its increment IV+S are
replaced by (OUTERIV + INNERIV) and (OUTERIV + INNERIV + S).
The two loops that handle these chores are very similar.
Would you mind putting them into a common helper function?
I think that would make the code easier to maintain and
understand.

I think you need another reviewer, one currently working on
the JIT team. But I like this code, and you can cite me as
a reviewer.

-- John

From jatin.bhateja at intel.com Fri May 1 11:33:44 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Fri, 1 May 2020 11:33:44 +0000
Subject: RFR[XS] : 8244186 : assertion failure test/jdk/javax/net/ssl/DTLS/RespondToRetransmit.java
In-Reply-To: <457fee15-0e58-fb22-ad68-33e4f8c7e44e@oracle.com>
References: <457fee15-0e58-fb22-ad68-33e4f8c7e44e@oracle.com>
Message-ID:

Hi Vladimir,

Thanks for your comments.

I will take care of styling comments while making check-in.
Backport has been replaced with Issue linked to JDK-8241040.
Regards, Jatin > -----Original Message----- > From: hotspot-compiler-dev > On Behalf Of Vladimir Kozlov > Sent: Friday, May 1, 2020 4:05 AM > To: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR[XS] : 8244186 : assertion failure > test/jdk/javax/net/ssl/DTLS/RespondToRetransmit.java > > Hi Jatin, > > Fix looks fine but please always use {} for if() body. > > Also why JDK-8241040 is listed as backport for this bug? Did you mean to > link it as related? > > Thanks, > Vladimir > > On 4/30/20 4:45 AM, Bhateja, Jatin wrote: > > Hi All, > > > > Kindly review the patch which fixes assertion failures seen in some jtreg > regression. > > > > JBS: http://bugs.openjdk.java.net/browse/JDK-8244186 > > Webrev: http://cr.openjdk.java.net/~jbhateja/8244186/webrev.01/ > > > > Removing an assertion which prevents logic folding over cones already > > having a MacroLogic node as depicted by following graph[1]. > > > > Regards, > > Jatin > > > > [1] Original ideal Graph: > > | | > > [N1](XorV) [N2](XorV) > > / \ / \ > > / \ / \ > > / \ / \ > > / \ / \ > > / \/ \ > > / [N3](AndV) \ > > / / \ \ > > [A] / \ [D] > > / \ > > [N4](AndV) \ > > / \ \ > > [B] [C] [N5](AndV) > > / \ > > [C] [D] > > > > Above DAG has two logic cone roots N1 & N2. 
> > > > Folding logic on cone rooted at N1, MacroLogic node can have at most 3 > distinct inputs: > > | | > > [N1](XorV) [N2](XorV) > > / \ / \ > > / \ / \ > > / \ / \ > > / \ / \ > > / \/ \ > > / [N6](MacroL) \ > > / / | \ \ > > / / | \ \ > > [A] [B] [C] [D] [D] > > > > > > Folding logic on Cone rooted at N2: > > | | > > [N1](XorV) [N2](XorV) > > / \ \ > > / \ \ > > / \ \ > > / \ [N7] (MacroL) > > / \ / | \ > > / [N6](MacroL) / | \ > > / / | \ [B][C] [D] > > / / | \ > > A] [B] [C] [D] > > > > > > > > > > > > > > > > > > > > From vladimir.kozlov at oracle.com Fri May 1 17:17:37 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 1 May 2020 10:17:37 -0700 (PDT) Subject: RFR[XS] : 8244186 : assertion failure test/jdk/javax/net/ssl/DTLS/RespondToRetransmit.java In-Reply-To: References: <457fee15-0e58-fb22-ad68-33e4f8c7e44e@oracle.com> Message-ID: <9f19d292-9555-0e28-35bb-d6a59fe7f45a@oracle.com> Good. thanks, Vladimir On 5/1/20 4:33 AM, Bhateja, Jatin wrote: > Hi Vladimir, > > Thanks for your comments. > > I will take care of styling comments while making check-in. > Backport has been replaced with Issue linked to JDK-8241040. > > Regards, > Jatin > >> -----Original Message----- >> From: hotspot-compiler-dev >> On Behalf Of Vladimir Kozlov >> Sent: Friday, May 1, 2020 4:05 AM >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR[XS] : 8244186 : assertion failure >> test/jdk/javax/net/ssl/DTLS/RespondToRetransmit.java >> >> Hi Jatin, >> >> Fix looks fine but please always use {} for if() body. >> >> Also why JDK-8241040 is listed as backport for this bug? Did you mean to >> link it as related? >> >> Thanks, >> Vladimir >> >> On 4/30/20 4:45 AM, Bhateja, Jatin wrote: >>> Hi All, >>> >>> Kindly review the patch which fixes assertion failures seen in some jtreg >> regression. 
>>> >>> JBS: http://bugs.openjdk.java.net/browse/JDK-8244186 >>> Webrev: http://cr.openjdk.java.net/~jbhateja/8244186/webrev.01/ >>> >>> Removing an assertion which prevents logic folding over cones already >>> having a MacroLogic node as depicted by following graph[1]. >>> >>> Regards, >>> Jatin >>> >>> [1] Original ideal Graph: >>> | | >>> [N1](XorV) [N2](XorV) >>> / \ / \ >>> / \ / \ >>> / \ / \ >>> / \ / \ >>> / \/ \ >>> / [N3](AndV) \ >>> / / \ \ >>> [A] / \ [D] >>> / \ >>> [N4](AndV) \ >>> / \ \ >>> [B] [C] [N5](AndV) >>> / \ >>> [C] [D] >>> >>> Above DAG has two logic cone roots N1 & N2. >>> >>> Folding logic on cone rooted at N1, MacroLogic node can have at most 3 >> distinct inputs: >>> | | >>> [N1](XorV) [N2](XorV) >>> / \ / \ >>> / \ / \ >>> / \ / \ >>> / \ / \ >>> / \/ \ >>> / [N6](MacroL) \ >>> / / | \ \ >>> / / | \ \ >>> [A] [B] [C] [D] [D] >>> >>> >>> Folding logic on Cone rooted at N2: >>> | | >>> [N1](XorV) [N2](XorV) >>> / \ \ >>> / \ \ >>> / \ \ >>> / \ [N7] (MacroL) >>> / \ / | \ >>> / [N6](MacroL) / | \ >>> / / | \ [B][C] [D] >>> / / | \ >>> A] [B] [C] [D] >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> From vladimir.kozlov at oracle.com Sat May 2 00:23:29 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 1 May 2020 17:23:29 -0700 Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator): General HotSpot changes In-Reply-To: <25a564a1-7f40-6988-060f-86b06e02ad21@oracle.com> References: <25a564a1-7f40-6988-060f-86b06e02ad21@oracle.com> Message-ID: <08421e2e-985d-5901-eb20-0ae96a48d8a0@oracle.com> I looked on these changes and compiler changes seems fine. Thanks, Vladimir On 4/16/20 5:32 AM, Vladimir Ivanov wrote: > Hi, > > Any more reviews, please? Especially, compiler and runtime-related changes. > > Thanks in advance! 
>
> Best regards,
> Vladimir Ivanov
>
> On 04.04.2020 02:12, Vladimir Ivanov wrote:
>> Hi,
>>
>> Following up on review requests of API [0] and Java implementation [1] for Vector API (JEP 338 [2]), here's a request
>> for review of general HotSpot changes (in shared code) required for supporting the API:
>>
>>
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/all.00-03/
>>
>> (First of all, to set proper expectations: since the JEP is still in Candidate state, the intention is to initiate
>> preliminary round(s) of review to inform the community and gather feedback before sending out final/official RFRs once
>> the JEP is Targeted to a release.)
>>
>> Vector API (being developed in Project Panama [3]) relies on JVM support to utilize optimal vector hardware
>> instructions at runtime. It interacts with JVM through intrinsics (declared in jdk.internal.vm.vector.VectorSupport
>> [4]) which expose vector operations support in C2 JIT-compiler.
>>
>> As Paul wrote earlier: "A vector intrinsic is an internal low-level vector operation. The last argument to the
>> intrinsic is fall back behavior in Java, implementing the scalar operation over the number of elements held by the
>> vector. Thus, if the intrinsic is not supported in C2 for the other arguments then the Java implementation is
>> executed (the Java implementation is always executed when running in the interpreter or for C1)."
>>
>> The rest of JVM support is about aggressively optimizing vector boxes to minimize (ideally eliminate) the overhead of
>> boxing for vector values.
>> It's a stop-the-gap solution for vector box elimination problem until inline classes arrive. Vector classes are
>> value-based and in the longer term will be migrated to inline classes once the support becomes available.
>>
>> Vector API talk from JVMLS'18 [5] contains brief overview of JVM implementation and some details.
>> >> Complete implementation resides in vector-unstable branch of panama/dev repository [6]. >> >> Now to gory details (the patch is split in multiple "sub-webrevs"): >> >> =========================================================== >> >> (1) http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/00.backend.shared/ >> >> Ideal vector nodes for new operations introduced by Vector API. >> >> (Platform-specific back end support will be posted for review separately). >> >> =========================================================== >> >> (2) http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/ >> >> JVM Java interface (VectorSupport) and intrinsic support in C2. >> >> Vector instances are initially represented as VectorBox macro nodes and "unboxing" is represented by VectorUnbox node. >> It simplifies vector box elimination analysis and the nodes are expanded later right before EA pass. >> >> Vectors have 2-level on-heap representation: for the vector value primitive array is used as a backing storage and it >> is encapsulated in a typed wrapper (e.g., Int256Vector - vector of 8 ints - contains a int[8] instance which is used >> to store vector value). >> >> Unless VectorBox node goes away, it needs to be expanded into an allocation eventually, but it is a pure node and >> doesn't have any JVM state associated with it. The problem is solved by keeping JVM state separately in a >> VectorBoxAllocate node associated with VectorBox node and use it during expansion. >> >> Also, to simplify vector box elimination, inlining of vector reboxing calls (VectorSupport::maybeRebox) is delayed >> until the analysis is over. >> >> =========================================================== >> >> (3) http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/02.vbox_elimination/ >> >> Vector box elimination analysis implementation. (Brief overview: slides #36-42 [5].) 
>>
>> The main part is devoted to scalarization across safepoints and rematerialization support during deoptimization. In
>> C2-generated code vector operations work with raw vector values which live in registers or spilled on the stack and it
>> allows to avoid boxing/unboxing when a vector value is alive across a safepoint. As with other values, there's just a
>> location of the vector value at the safepoint and vector type information recorded in the relevant nmethod metadata
>> and all the heavy-lifting happens only when rematerialization takes place.
>>
>> The analysis preserves object identity invariants except during aggressive reboxing (guarded by
>> -XX:+EnableAggressiveReboxing).
>>
>> (Aggressive reboxing is crucial for cases when vectors "escape": it allocates a fresh instance at every escape point
>> thus enabling original instance to go away.)
>>
>> ===========================================================
>>
>> (4) http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/03.module.hotspot/
>>
>> HotSpot changes for jdk.incubator.vector module. Vector support is marked experimental and turned off by default. JEP
>> 338 proposes the API to be released as an incubator module, so a user has to specify "--add-modules
>> jdk.incubator.vector" on the command line to be able to use it.
>> When user does that, JVM automatically enables Vector API support.
>> It improves usability (user doesn't need to separately "open" the API and enable JVM support) while minimizing risks
>> of destabilization from new code when the API is not used.
>>
>>
>> That's it! Will be happy to answer any questions.
>>
>> And thanks in advance for any feedback!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [0] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html
>>
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041228.html
>>
>> [2] https://openjdk.java.net/jeps/338
>>
>> [3] https://openjdk.java.net/projects/panama/
>>
>> [4]
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/src/java.base/share/classes/jdk/internal/vm/vector/VectorSupport.java.html
>>
>>
>> [5] http://cr.openjdk.java.net/~vlivanov/talks/2018_JVMLS_VectorAPI.pdf
>>
>> [6] http://hg.openjdk.java.net/panama/dev/shortlog/92bbd44386e9
>>
>>      $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable

From vladimir.kozlov at oracle.com Sat May 2 00:31:37 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 1 May 2020 17:31:37 -0700
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes
In-Reply-To:
References:
Message-ID:

Changes seems fine. Nice work.

Why it is called "vector-unstable branch"?

Thanks,
Vladimir K

On 4/3/20 5:16 PM, Viswanathan, Sandhya wrote:
> Hi,
>
> Following up on review requests of API [0], Java implementation [1] and
> General Hotspot changes [2] for Vector API, here's a request for review
> of x86 backend changes required for supporting the API:
>
> JEP: https://openjdk.java.net/jeps/338
> JBS: https://bugs.openjdk.java.net/browse/JDK-8223347
> Webrev: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00/
>
> Complete implementation resides in vector-unstable branch of
> panama/dev repository [3].
>
> Looking forward to your feedback.
>
> Best Regards,
> Sandhya
>
> [0] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html
> [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/065587.html
> [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037798.html
> [3] https://openjdk.java.net/projects/panama/
>     $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable

From sandhya.viswanathan at intel.com Sat May 2 00:55:57 2020
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Sat, 2 May 2020 00:55:57 +0000
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes
In-Reply-To:
References:
Message-ID:

Hi Vladimir,

Thanks a lot for the feedback. We used an old existing separate branch to share the code for review and to track changes.
We didn't know how to change the name of the branch from vector-unstable to vector-stable.

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Kozlov
Sent: Friday, May 01, 2020 5:32 PM
To: Viswanathan, Sandhya; hotspot-compiler-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; hotspot-dev
Subject: Re: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes

Changes seems fine. Nice work.

Why it is called "vector-unstable branch"?

Thanks,
Vladimir K

On 4/3/20 5:16 PM, Viswanathan, Sandhya wrote:
> Hi,
>
> Following up on review requests of API [0], Java implementation [1] and
> General Hotspot changes [2] for Vector API, here's a request for review
> of x86 backend changes required for supporting the API:
>
> JEP: https://openjdk.java.net/jeps/338
> JBS: https://bugs.openjdk.java.net/browse/JDK-8223347
> Webrev: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00/
>
> Complete implementation resides in vector-unstable branch of
> panama/dev repository [3].
>
> Looking forward to your feedback.
> > Best Regards, > Sandhya > > > [0] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html > > > > [1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/065587.html > > > > [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037798.html > > > > [3] https://openjdk.java.net/projects/panama/ > > $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable > > > > > From vladimir.kozlov at oracle.com Sat May 2 01:00:37 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 1 May 2020 18:00:37 -0700 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> Message-ID: <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> Hi I am CCing to runtime group too. I would like to see comments about these changes. No need to look on compiler's changes. The latest https://cr.openjdk.java.net/~xliu/8151779/02/webrev/ Good work. On 4/24/20 1:33 AM, Liu, Xin wrote: > Hi, > > May I get reviewed for this new revision? > JBS: https://bugs.openjdk.java.net/browse/JDK-8151779 > webrev: https://cr.openjdk.java.net/~xliu/8151779/01/webrev/ > > I introduce a new option -XX:ControlIntrinsic=+_id1,-id2... > The id is vmIntrinsics::ID. As prior discussion, ControlIntrinsic is expected to replace DisableIntrinsic. > I keep DisableIntrinsic in this revision. DisableIntrinsic prevails when an intrinsic appears on both lists. Yes, you have to keep DisableIntrinsic for now. We will deprecate it later. > > I use an array of tribool to mark each intrinsic is enabled or not. In this way, hotspot can avoid expensive string querying among intrinsics. > A Tribool value has 3 states: Default, true, or false. 
> If developers don't explicitly set an intrinsic, it will be available unless it is disabled by the corresponding UseXXXIntrinsics.
> A traditional Boolean value can't express fine/coarse-grained control, i.e. we only go through the auxiliary UseXXXIntrinsics options if developers don't control a specific intrinsic.
>
> I also add support for ControlIntrinsic to CompilerDirectives.
>
> Test:
> I reuse the jtreg tests of DisableIntrinsic and add more @run annotations to verify ControlIntrinsic.
> I passed hotspot:Tier1 test and all tests on x86_64/linux.

Good. I submitted hotspot tier1-3 testing.

Thanks,
Vladimir

>
> Thanks,
> --lx
>
> On 4/17/20, 7:22 PM, "hotspot-compiler-dev on behalf of Liu, Xin" wrote:
>
>     Hi, Vladimir,
>
>     Thanks for the clarification.
>     Oh, yes, it's theoretically possible, but it's tedious. I was wrong on that point.
>     I think I got your point. ControlIntrinsic will make developers' lives easier. I will implement it.
>
>     Thanks,
>     --lx
>
>
>     On 4/17/20, 6:46 PM, "Vladimir Kozlov" wrote:
>
>         I withdraw my suggestion about EnableIntrinsic from JDK-8151779 because ControlIntrinsic will provide such
>         functionality and will replace the existing DisableIntrinsic.
>
>         Note, we can start deprecating Use*Intrinsic flags (and DisableIntrinsic) later in other changes. You don't need to do
>         everything at once. What we need now is a mechanism to replace them.
>
>         On 4/16/20 11:58 PM, Liu, Xin wrote:
>         > Hi, Corey and Vladimir,
>         >
>         > I recently went through vmSymbols.hpp/cpp. I think I understand your comments.
>         > Each UseXXXIntrinsics does control a bunch of intrinsics (plural). Thanks for the hint.
>         >
>         > Even though I feel I know the intrinsics mechanism of hotspot better, I still need a clarification of JDK-8151779.
>         >
>         > There are 321 intrinsics (https://chriswhocodes.com/hotspot_intrinsics_jdk15.html).
>         > If no option is given, they are all available to the compilers. That makes sense because intrinsics are always beneficial.
>         > But there are reasons we need to disable a subset of them. A specific architecture may lack efficient instructions or fixed functions, or an intrinsic may simply be buggy.
>         >
>         > Currently, JDK provides developers 2 ways to control intrinsics.
>         > 1. Some diagnostic options, e.g. InlineMathNatives, UseBase64Intrinsics.
>         > Developers can use one option to disable a group of intrinsics. That is to say, it's a coarse-grained approach.
>         >
>         > 2. DisableIntrinsic="a,b,c"
>         > By passing a string list of vmIntrinsics::IDs, it's capable of disabling any specified intrinsic.
>         >
>         > But even putting the above 2 approaches together, we still can't precisely control any intrinsic.
>
>         Yes, you are right. We seem to be trying to put these 2 different ways into one flag, which may be a mistake.
>
>         -XX:ControlIntrinsic=-_updateBytesCRC32C,-_updateDirectByteBufferCRC32C is similar to -XX:-UseCRC32CIntrinsics but it
>         requires more detailed knowledge about intrinsic ids.
>
>         Maybe we can have a 2nd flag, as you suggested, -XX:UseIntrinsics=-AESCTR,+CRC32C, for such cases.
>
>         > If we want to enable an intrinsic which is under control of InlineMathNatives but keep the others disabled, it's impossible now. [please correct me if I am wrong here].
>
>         You can disable all the others of the 321 intrinsics with the DisableIntrinsic flag, which is very tedious, I agree.
>
>         > I think that is the motivation of JDK-8151779.
>
>         The idea is that instead of the flags we use to control particular intrinsics depending on CPU, we will use vmIntrinsics::IDs
>         or other tables as you showed in your changes. It will require changes in the vm_version_ code.
>
>         >
>         > If we provide a new option EnableIntrinsic and give it the lowest priority, then we can precisely control any intrinsic.
>         > Quoting Vladimir Kozlov: "DisableIntrinsic list prevails if an intrinsic is specified on both EnableIntrinsic and DisableIntrinsic."
>         >
>         > "-XX:ControlIntrinsic=+_dabs,-_fabs,-_getClass" looks more elegant, but it will confuse developers with DisableIntrinsic.
>         > If we decide to deprecate DisableIntrinsic, I think ControlIntrinsic may be a better option. For now I prefer to provide EnableIntrinsic for simplicity and symmetry.
>
>         I prefer to have one ControlIntrinsic flag and deprecate DisableIntrinsic. I don't think it is confusing.
>
>         Thanks,
>         Vladimir
>
>         > What do you think?
>         >
>         > Thanks,
>         > --lx
>         >
>         >
>         > On 4/13/20, 1:47 PM, "hotspot-compiler-dev on behalf of Corey Ashford" wrote:
>         >
>         >     On 4/13/20 10:33 AM, Liu, Xin wrote:
>         >     > Hi, compiler developers,
>         >     > I am attempting to refactor UseXXXIntrinsics for JDK-8151779. I think we still need to keep the UseXXXIntrinsics options because many applications may be using them.
>         >     >
>         >     > My change provides 2 new features:
>         >     > 1) a shorthand to enable/disable intrinsics.
>         >     > A comma-separated string. Each one is an intrinsic. An optional trailing symbol '+' or '-' denotes enabling or disabling.
>         >     > If the trailing symbol is missing, it means enable.
>         >     > E.g. -XX:UseIntrinsics="AESCTR-,CRC32C+,CRC32-,MathExact"
>         >     > This jvm option will expand to multiple options -XX:-UseAESCTRIntrinsics, -XX:+UseCRC32CIntrinsics, -XX:-UseCRC32Intrinsics, -XX:+UseMathExactIntrinsics
>         >     >
>         >     > 2) provide a set of macros to declare intrinsic options
>         >     > Developers declare once in intrinsics.hpp and the macros will take care of all other places.
>         >     > Here is an example: https://cr.openjdk.java.net/~xliu/8151779/00/webrev/src/hotspot/share/compiler/intrinsics.hpp.html
>         >     > Ioi Lam is overhauling jvm options.
>         >     > I am thinking about how to be consistent with his proposal.
>         >     >
>         >
>         >     Great idea, though to be consistent with the original syntax, I think
>         >     the +/- should be in front of the name:
>         >
>         >     -XX:UseIntrinsics=-AESCTR,+CRC32C,...
>         >
>         >
>         >     > I handle UseIntrinsics before VM_Version::initialize. It means that platform-specific initialization still has a chance to correct those options.
>         >     > If we do that after VM_Version::initialize, some intrinsics may cause a JVM crash, e.g. +UseBase64Intrinsics on x86_64 Linux.
>         >     > Even though this behavior is the same as -XX:+UseXXXIntrinsics, from the user's perspective it's not straightforward when the JVM implicitly overrides what users specify. It's a dilemma here: stable jvm or fidelity of cmdline. What do you think?
>         >     >
>         >     > Another problem is the naming convention. Almost all intrinsics options use UseXXXIntrinsics. One exception is UseVectorizedMismatchIntrinsic.
>         >     > Personally, I think it should be "UseXXXIntrinsic" because one option is for one intrinsic, right? Is it possible to change this naming convention?
>         >
>         >     Some (many?) intrinsic options turn on more than one .ad instruct
>         >     intrinsic, or library intrinsics at the same time. I think that's why
>         >     the plural is there. Also, consistently adding the plural allows you to
>         >     add more capabilities to a flag that initially only had one intrinsic
>         >     without changing the plurality (and thus backward compatibility).
>         >
>         >     Regards,
>         >
>         >     - Corey
>         >
>         >

From vladimir.kozlov at oracle.com Sat May 2 01:05:19 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 1 May 2020 18:05:19 -0700
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes
In-Reply-To:
References:
Message-ID: <5b4f8d0f-d355-8914-193a-3c87eccf2d34@oracle.com>

On 5/1/20 5:55 PM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
>
> Thanks a lot for the feedback.
>
> We used an old existing separate branch to share the code for review and to track changes.
> We didn't know how to change the name of the branch from vector-unstable to vector-stable.

Good to know that it does not mean that code is "unstable" ;)

Katya filed today new bug [1]. Please look.

Regards,
Vladimir

[1] https://bugs.openjdk.java.net/browse/JDK-8244269

>
> Best Regards,
> Sandhya

From vladimir.kozlov at oracle.com Sat May 2 03:29:49 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 1 May 2020 20:29:49 -0700
Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag
In-Reply-To: <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com>
References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com>
Message-ID: <747eaea2-36bf-dfbd-00e7-dcd9ff016dde@oracle.com>

Hi Xin

compiler/intrinsics/IntrinsicAvailableTest.java test failed on old x86 machine which does not have CLMUL and as result can't use CRC32 intrinsic (UseCRC32Intrinsics flag is false).

With -XX:ControlIntrinsic=+_updateCRC32 test failed with:

java.lang.RuntimeException: Unexpected result: intrinsic for java.util.zip.CRC32.update() is enabled but intrinsic is not available at compilation level 1
    at compiler.intrinsics.IntrinsicAvailableTest.checkIntrinsicForCompilationLevel(IntrinsicAvailableTest.java:128)
    at compiler.intrinsics.IntrinsicAvailableTest.test(IntrinsicAvailableTest.java:138)
    at compiler.intrinsics.IntrinsicAvailableTest.main(IntrinsicAvailableTest.java:150)

Regards,
Vladimir

From manc at google.com Sat May 2 05:31:36 2020
From: manc at google.com (Man Cao)
Date: Fri, 1 May 2020 22:31:36 -0700
Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps
Message-ID:

Hi all,

Can I have reviews for this one-line change that fixes a bug and could significantly improve performance?
Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8244278

It passes tier-1 tests locally, as well as "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809).

-Man

From xxinliu at amazon.com Sun May 3 22:48:01 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Sun, 3 May 2020 22:48:01 +0000
Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag
In-Reply-To: <747eaea2-36bf-dfbd-00e7-dcd9ff016dde@oracle.com>
References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> <747eaea2-36bf-dfbd-00e7-dcd9ff016dde@oracle.com>
Message-ID: <66EF8963-4CB3-414D-B620-B6E56C4454CF@amazon.com>

Hi, Vladimir,

Thank you for reviewing the patch!
As for the failure, it's actually a bug. _updateCRC32 should be enabled regardless on 32-bit x86.
Here is the new revision with the bugfix:
http://cr.openjdk.java.net/~xliu/8151779/03/webrev/

I made an incremental diff between rev02 and rev03:
http://cr.openjdk.java.net/~xliu/8151779/r2_to_r3.diff

1. The bug was that abstractCompiler failed to check vm_intrinsic_control_words[].
Previously, class vmIntrinsics provided multiple interfaces for intrinsic availability.
(https://hg.openjdk.java.net/jdk/jdk/file/4198213fc371/src/hotspot/share/classfile/vmSymbols.hpp#l1696)
abstractCompiler.cpp and library_call.cpp use is_disabled_by_flags(), but stubGenerator_x86_64.cpp uses is_intrinsic_available().
I promote "is_disabled_by_flags()" to the core interface, leave more comments on it, and keep is_intrinsic_available() for compatibility.

2. Add +/- UseCRC32Intrinsics to IntrinsicAvailableTest.java.
The purpose of that test is not to generate a CRC32 intrinsic. Its purpose is to check whether the compilers decide to intrinsify _updateCRC32 or not.
Mathematically, "UseCRC32Intrinsics" is a set = [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32].
"-XX:-UseCRC32Intrinsics" disables all 3 of them. If users use -XX:ControlIntrinsic=+_updateCRC32 and -XX:-UseCRC32Intrinsics, _updateCRC32 should be enabled individually.

Yes, it will crash if the compilers do generate _updateCRC32 without UseCRC32Intrinsics. That is because hotspot doesn't generate the corresponding stubs, which are controlled by UseCRC32Intrinsics. That's by design. hotspot has made huge efforts to enable as many intrinsics as it can. If a user explicitly enables an unsupported intrinsic, they must have a reason for it. One possible scenario is that they are developing a new intrinsic.
Actually, c1/c2 crash because both of them choose a hard landing, e.g. LibraryCallKit::inline_updateCRC32() assumes it has the stub. I don't know why, but templateInterpreter seems to choose to tolerate it.

On legacy hosts without CLMUL, -XX:+UseCRC32Intrinsics will be dropped. It's still safe to run because IntrinsicAvailableTest.java doesn't attempt to compile the intrinsic method.

3. I found an interesting optimization.
We can use vm_intrinsic_control_words[] as a cache. I assume that no one changes those UseXXXIntrinsics options at runtime.
It can skip the mega-switch of vmIntrinsics::disabled_by_jvm_flags(), which might not be a big deal for optimizing compilers, but it can guarantee O(1) complexity for all toolchains.

Thanks,
--lx
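
The lookup order described in this thread (DisableIntrinsic prevails, then the explicit per-intrinsic tri-state, then the coarse UseXXXIntrinsics group flag) can be sketched roughly as below. This is a minimal, standalone illustration and not the code from the webrev: `IntrinsicId`, `kNames`, `TriBool`, `IntrinsicControl`, `control_words`, `disabled_list` and `use_crc32_intrinsics` are hypothetical names, only four intrinsics are modelled, and the `is_disabled_by_flags` here is a simplified stand-in that merely borrows the name mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical subset of intrinsic IDs; the real list is vmIntrinsics::ID.
enum IntrinsicId : std::size_t {
  kUpdateCRC32, kUpdateBytesCRC32, kUpdateByteBufferCRC32, kDabs, kIdCount
};

// Names as they would appear in a ControlIntrinsic-style list (assumed spelling).
static const char* const kNames[kIdCount] = {
  "_updateCRC32", "_updateBytesCRC32", "_updateByteBufferCRC32", "_dabs"
};

// Three states: not mentioned by the user, explicitly enabled, explicitly disabled.
enum class TriBool { Default, True, False };

struct IntrinsicControl {
  std::array<TriBool, kIdCount> control_words{};  // all Default initially
  std::array<bool, kIdCount> disabled_list{};     // models DisableIntrinsic
  bool use_crc32_intrinsics = true;               // models the UseCRC32Intrinsics group flag

  // Parse a list such as "+_updateCRC32,-_dabs". Returns false on the first
  // malformed or unknown token; tokens before it have already been applied.
  bool parse(const std::string& list) {
    std::size_t pos = 0;
    while (pos <= list.size()) {
      std::size_t comma = list.find(',', pos);
      if (comma == std::string::npos) {
        comma = list.size();
      }
      const std::string token = list.substr(pos, comma - pos);
      if (token.size() < 2 || (token[0] != '+' && token[0] != '-')) {
        return false;
      }
      std::size_t id = 0;
      while (id < kIdCount && token.compare(1, std::string::npos, kNames[id]) != 0) {
        ++id;
      }
      if (id == kIdCount) {
        return false;
      }
      control_words[id] = (token[0] == '+') ? TriBool::True : TriBool::False;
      pos = comma + 1;
    }
    return true;
  }

  // O(1) table lookups first; the group flag is consulted only for Default entries.
  bool is_disabled_by_flags(IntrinsicId id) const {
    assert(id < kIdCount);
    if (disabled_list[id]) {
      return true;  // DisableIntrinsic prevails
    }
    if (control_words[id] != TriBool::Default) {
      return control_words[id] == TriBool::False;
    }
    switch (id) {
      case kUpdateCRC32:
      case kUpdateBytesCRC32:
      case kUpdateByteBufferCRC32:
        return !use_crc32_intrinsics;
      default:
        return false;
    }
  }
};
```

With `use_crc32_intrinsics` set to false and the list `+_updateCRC32` parsed, only `kUpdateCRC32` is reported as enabled while the other two CRC32 entries stay disabled, which matches the "enabled individually" behaviour described above. Whether the stub actually exists on the host is a separate question that this sketch does not model.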
With -XX:ControlIntrinsic=+_updateCRC32 test failed with: java.lang.RuntimeException: Unexpected result: intrinsic for java.util.zip.CRC32.update() is enabled but intrinsic is not available at compilation level 1 at compiler.intrinsics.IntrinsicAvailableTest.checkIntrinsicForCompilationLevel(IntrinsicAvailableTest.java:128) at compiler.intrinsics.IntrinsicAvailableTest.test(IntrinsicAvailableTest.java:138) at compiler.intrinsics.IntrinsicAvailableTest.main(IntrinsicAvailableTest.java:150) Regards, Vladimir On 5/1/20 6:00 PM, Vladimir Kozlov wrote: > Hi > > I am CCing to runtime group too. I would like to see comments about these changes. No need to look on compiler's changes. > > The latest https://cr.openjdk.java.net/~xliu/8151779/02/webrev/ > > Good work. > > On 4/24/20 1:33 AM, Liu, Xin wrote: >> Hi, >> >> May I get reviewed for this new revision? >> JBS: https://bugs.openjdk.java.net/browse/JDK-8151779 >> webrev: https://cr.openjdk.java.net/~xliu/8151779/01/webrev/ >> >> I introduce a new option -XX:ControlIntrinsic=+_id1,-id2... >> The id is vmIntrinsics::ID. As prior discussion, ControlIntrinsic is expected to replace DisableIntrinsic. >> I keep DisableIntrinsic in this revision. DisableIntrinsic prevails when an intrinsic appears on both lists. > > Yes, you have to keep DisableIntrinsic for now. We will deprecate it later. > >> >> I use an array of tribool to mark each intrinsic is enabled or not. In this way, hotspot can avoid expensive string >> querying among intrinsics. >> A Tribool value has 3 states: Default, true, or false. >> If developers don't explicitly set an intrinsic, it will be available unless is disabled by the corresponding >> UseXXXIntrinsics. >> Traditional Boolean value can't express fine/coarse-grained control. Ie. We only go through those auxiliary options >> UseXXXIntrinsics if developers don't control a specific intrinsic. >> >> I also add the support of ControlIntrinsic to CompilerDirectives. 
>> >> Test: >> I reuse jtreg tests of DisableIntrinsic. Add add more @run annotations to verify ControlIntrinsics. >> I passed hotspot:Tier1 test and all tests on x86_64/linux. > > Good. I submitted hotspot tier1-3 testing. > > Thanks, > Vladimir > >> >> Thanks, >> --lx >> >> On 4/17/20, 7:22 PM, "hotspot-compiler-dev on behalf of Liu, Xin" > behalf of xxinliu at amazon.com> wrote: >> >> Hi, Vladimir, >> >> Thanks for the clarification. >> Oh, yes, it's theoretically possible, but it's tedious. I am wrong at that point. >> I think I got your point. ControlIntrinsics will make developer's life easier. I will implement it. >> >> Thanks, >> --lx >> >> >> On 4/17/20, 6:46 PM, "Vladimir Kozlov" wrote: >> >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments >> unless you can confirm the sender and know the content is safe. >> >> >> >> I withdraw my suggestion about EnableIntrinsic from JDK-8151779 because ControlIntrinsics will provide such >> functionality and will replace existing DisableIntrinsic. >> >> Note, we can start deprecating Use*Intrinsic flags (and DisableIntrinsic) later in other changes. You don't >> need to do >> everything at once. What we need now a mechanism to replace them. >> >> On 4/16/20 11:58 PM, Liu, Xin wrote: >> > Hi, Corey and Vladimir, >> > >> > I recently go through vmSymbols.hpp/cpp. I think I understand your comments. >> > Each UseXXXIntrinsics does control a bunch of intrinsics (plural). Thanks for the hint. >> > >> > Even though I feel I know intrinsics mechanism of hotspot better, I still need a clarification of JDK- >> 8151779. >> > >> > There're 321 intrinsics (https://chriswhocodes.com/hotspot_intrinsics_jdk15.html). >> > If there's no any option, they are all available for compilers. That makes sense because intrinsics are >> always beneficial. >> > But there're reasons we need to disable a subset of them. 
A specific architecture may miss efficient >> instructions or fixed functions. Or simply because an intrinsic is buggy. >> > >> > Currently, JDK provides developers 2 ways to control intrinsics. > 1. Some diagnostic options. Eg. >> InlineMathNatives, UseBase64Intrinsics. >> > Developers can use one option to disable a group of intrinsics. That is to say, it's a coarse-grained >> approach. >> > >> > 2. DisableIntrinsic="a,b,c" >> > By passing a string list of vmIntrinsics::IDs, it's capable of disabling any specified intrinsic. >> > >> > But even putting above 2 approaches together, we still can't precisely control any intrinsic. >> >> Yes, you are right. We seems are trying to put these 2 different ways into one flag which may be mistake. >> >> -XX:ControlIntrinsic=-_updateBytesCRC32C,-_updateDirectByteBufferCRC32C is a similar to >> -XX:-UseCRC32CIntrinsics but it >> requires more detailed knowledge about intrinsics ids. >> >> May be we can have 2nd flag, as you suggested -XX:UseIntrinsics=-AESCTR,+CRC32C, for such cases. >> >> > If we want to enable an intrinsic which is under control of InlineMathNatives but keep others disable, it's >> impossible now. [please correct if I am wrong here]. >> >> You can disable all other from 321 intrinsics with DisableIntrinsic flag which is very tedious I agree. >> >> > I think that the motivation JDK-8151779 tried to solve. >> >> The idea is that instead of flags we use to control particular intrinsics depending on CPU we will use >> vmIntrinsics::IDs >> or other tables as you showed in your changes. It will require changes in vm_version_ codes. >> >> > >> > If we provide a new option EnableIntrinsic and put it least priority, then we can precisely control any >> intrinsic. >> > Quote Vladimir Kozlov "DisableIntrinsic list prevails if an intrinsic is specified on both EnableIntrinsic >> and DisableIntrinsic." 
>> > >> > "-XX:ControlIntrinsic=+_dabs,-_fabs,-_getClass" looks more elegant, but it will confuse developers with >> DisableIntrinsic. >> > If we decide to deprecate DisableIntrinsic, I think ControlIntrinsic may be a better option. Now I prefer >> to provide EnableIntrinsic for simplicity and symmetry. >> >> I prefer to have one ControlIntrinsic flag and deprecate DisableIntrinsic. I don't think it is confusing. >> >> Thanks, >> Vladimir >> >> > What do you think? >> > >> > Thanks, >> > --lx >> > >> > >> > On 4/13/20, 1:47 PM, "hotspot-compiler-dev on behalf of Corey Ashford" >> wrote: >> > >> > CAUTION: This email originated from outside of the organization. Do not click links or open >> attachments unless you can confirm the sender and know the content is safe. >> > >> > >> > >> > On 4/13/20 10:33 AM, Liu, Xin wrote: >> > > Hi, compiler developers, >> > > I attempt to refactor UseXXXIntrinsics for JDK-8151779. I think we still need to keep >> UseXXXIntrinsics options because many applications may be using them. >> > > >> > > My change provide 2 new features: >> > > 1) a shorthand to enable/disable intrinsics. >> > > A comma-separated string. Each one is an intrinsic. An optional tailing symbol + or '-' denotes >> enabling or disabling. >> > > If the tailing symbol is missing, it means enable. >> > > Eg. -XX:UseIntrinsics="AESCTR-,CRC32C+,CRC32-,MathExact" >> > > This jvm option will expand to multiple options -XX:-UseAESCTRIntrinsics, -XX:+UseCRC32CIntrinsics, >> -XX:-UseCRC32Intrinsics, -XX:UseMathExactIntrinsics >> > > >> > > 2) provide a set of macro to declare intrinsic options >> > > Developers declare once in intrinsics.hpp and macros will take care all other places. >> > > Here are example: >> https://cr.openjdk.java.net/~xliu/8151779/00/webrev/src/hotspot/share/compiler/intrinsics.hpp.html >> > > Ion Lam is overhauling jvm options. I am thinking how to be consistent with his proposal. 
>> > > >> > >> > Great idea, though to be consistent with the original syntax, I think >> > the +/- should be in front of the name: >> > >> > -XX:UseIntrinsics=-AESCTR,+CRC32C,... >> > >> > >> > > I handle UseIntrinsics before VM_Version::initialize. It means that platform-specific initialization >> still has a chance to correct those options. >> > > If we do that after VM_Version::initialize, some intrinsics may cause JVM crash. Eg. >> +UseBase64Intrinsics on x86_64 Linux. >> > > Even though this behavior is same as -XX:+UseXXXIntrinsics, from user's perspective, it's not >> straightforward when JVM overrides what users specify implicitly. It's dilemma here, stable jvm or fidelity of >> cmdline. What do you think? >> > > >> > > Another problem is naming convention. Almost all intrinsics options use UseXXXIntrinsics. One >> exception is UseVectorizedMismatchIntrinsic. >> > > Personally, I think it should be "UseXXXIntrinsic" because one option is for one intrinsic, right? >> Is it possible to change this name convention? >> > >> > Some (many?) intrinsic options turn on more than one .ad instruct >> > instrinsic, or library instrinsics at the same time. I think that's why >> > the plural is there. Also, consistently adding the plural allows you to >> > add more capabilities to a flag that initially only had one intrinsic >> > without changing the plurality (and thus backward compatibility). 
>> > >> > Regards, >> > >> > - Corey >> > >> > >> >> From david.holmes at oracle.com Mon May 4 02:08:13 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 4 May 2020 12:08:13 +1000 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> Message-ID: Hi Vladimir, Xin, Overall this seems fine to me. A few style nits below. On 2/05/2020 11:00 am, Vladimir Kozlov wrote: > Hi > > I am CCing to runtime group too. I would like to see comments about > these changes. No need to look on compiler's changes. > > The latest https://cr.openjdk.java.net/~xliu/8151779/02/webrev/ src/hotspot/share/classfile/vmSymbols.cpp ! assert(!strcmp(nt[vmIntrinsics::_hashCode], "_hashCode"), "lined up"); Avoid implicit bools - compare against 0 ! init_vm_intrnsic_name_table(); Typo: intrnsic -> intrinsic (multiple places) + if (!strcmp(name, nt[index])) return ID_from(index); Avoid implicit bools - compare against 0 "return" on newline Use { } + * There're 2 approaches to control intrinsics. "there're" -> "there are" s/control/controlling/ + * 1.Disable/ControlIntrinsic space after . + * ControlIntrinsic is recommented. Typo: recommented -> recommended + * Currently, DisableIntrinsic list insert: ^the^ DisableIntrinsic list + * 2.some UseXXXIntrinsics options. eg. UseAESIntrinsics space after . Suggestion: s/some/Explicit/ + * Each option can control a set of intrinsics. User can specify them but s/User/The user/ + * they are subjected to hardward inspection(VM_Version::initialize). 
s/subjected/subject/ s/hardward/hardware/ space before ( + for (ControlIntrinsicIter iter(ControlIntrinsic); *iter; ++iter) { + for (ControlIntrinsicIter iter(DisableIntrinsic, true/*disable_all*/); *iter; ++iter) { Avoid implicit bools Space before /* + if (b.is_default()) + return !vmIntrinsics::is_disabled_by_flags(id); + else + return b; Use { } -- src/hotspot/share/classfile/vmSymbols.hpp tribool -> TriBool Per hotspot style guide (yes there are existing types that don't follow this). --- test/hotspot/gtest/classfile/test_vmSymbols.cpp Apparently Amazon don't use years in their copyright notices and this is being changed elsewhere. Thanks, David > Good work. > > On 4/24/20 1:33 AM, Liu, Xin wrote: >> Hi, >> >> May I get reviewed for this new revision? >> JBS: https://bugs.openjdk.java.net/browse/JDK-8151779 >> webrev: https://cr.openjdk.java.net/~xliu/8151779/01/webrev/ >> >> I introduce a new option -XX:ControlIntrinsic=+_id1,-id2... >> The id is vmIntrinsics::ID. As prior discussion, ControlIntrinsic is >> expected to replace DisableIntrinsic. >> I keep DisableIntrinsic in this revision. DisableIntrinsic prevails >> when an intrinsic appears on both lists. > > Yes, you have to keep DisableIntrinsic for now. We will deprecate it later. > >> >> I use an array of tribool to mark each intrinsic is enabled or not. In >> this way, hotspot can avoid expensive string querying among intrinsics. >> A Tribool value has 3 states: Default, true, or false. >> If developers don't explicitly set an intrinsic, it will be available >> unless is disabled by the corresponding UseXXXIntrinsics. >> Traditional Boolean value can't express fine/coarse-grained control. >> Ie. We only go through those auxiliary options UseXXXIntrinsics if >> developers don't control a specific intrinsic. >> >> I also add the support of ControlIntrinsic to CompilerDirectives. >> >> Test: >> I reuse jtreg tests of DisableIntrinsic. Add add more @run annotations >> to verify ControlIntrinsics.
>> I passed hotspot:Tier1 test and all tests on x86_64/linux. > > Good. I submitted hotspot tier1-3 testing. > > Thanks, > Vladimir > >> >> Thanks, >> --lx >> >> On 4/17/20, 7:22 PM, "hotspot-compiler-dev on behalf of Liu, Xin" >> > xxinliu at amazon.com> wrote: >> >> Hi, Vladimir, >> >> Thanks for the clarification. >> Oh, yes, it's theoretically possible, but it's tedious. I am >> wrong at that point. >> I think I got your point. ControlIntrinsics will make developer's >> life easier. I will implement it. >> >> Thanks, >> --lx >> >> >> On 4/17/20, 6:46 PM, "Vladimir Kozlov" >> wrote: >> >> CAUTION: This email originated from outside of the >> organization. Do not click links or open attachments unless you can >> confirm the sender and know the content is safe. >> >> >> >> I withdraw my suggestion about EnableIntrinsic from >> JDK-8151779 because ControlIntrinsics will provide such >> functionality and will replace existing DisableIntrinsic. >> >> Note, we can start deprecating Use*Intrinsic flags (and >> DisableIntrinsic) later in other changes. You don't need to do >> everything at once. What we need now a mechanism to replace >> them. >> >> On 4/16/20 11:58 PM, Liu, Xin wrote: >> > Hi, Corey and Vladimir, >> > >> > I recently go through vmSymbols.hpp/cpp. I think I >> understand your comments. >> > Each UseXXXIntrinsics does control a bunch of intrinsics >> (plural). Thanks for the hint. >> > >> > Even though I feel I know intrinsics mechanism of hotspot >> better, I still need a clarification of JDK- 8151779. >> > >> > There're 321 intrinsics >> (https://chriswhocodes.com/hotspot_intrinsics_jdk15.html). >> > If there's no any option, they are all available for >> compilers. That makes sense because intrinsics are always beneficial. >>
> But there're reasons we need to disable a subset of them. A >> specific architecture may miss efficient instructions or fixed >> functions. Or simply because an intrinsic is buggy. >> > >> > Currently, JDK provides developers 2 ways to control >> intrinsics. > 1. Some diagnostic options. Eg. InlineMathNatives, >> UseBase64Intrinsics. >> > Developers can use one option to disable a group of >> intrinsics. That is to say, it's a coarse-grained approach. >> > >> > 2. DisableIntrinsic="a,b,c" >> > By passing a string list of vmIntrinsics::IDs, it's capable >> of disabling any specified intrinsic. >> > >> > But even putting above 2 approaches together, we still >> can't precisely control any intrinsic. >> >> Yes, you are right. We seems are trying to put these 2 >> different ways into one flag which may be mistake. >> >> >> -XX:ControlIntrinsic=-_updateBytesCRC32C,-_updateDirectByteBufferCRC32C is >> a similar to -XX:-UseCRC32CIntrinsics but it >> requires more detailed knowledge about intrinsics ids. >> >> May be we can have 2nd flag, as you suggested >> -XX:UseIntrinsics=-AESCTR,+CRC32C, for such cases. >> >> > If we want to enable an intrinsic which is under control of >> InlineMathNatives but keep others disable, it's impossible now. >> [please correct if I am wrong here]. >> >> You can disable all other from 321 intrinsics with >> DisableIntrinsic flag which is very tedious I agree. >> >> > I think that the motivation JDK-8151779 tried to solve. >> >> The idea is that instead of flags we use to control >> particular intrinsics depending on CPU we will use vmIntrinsics::IDs >> or other tables as you showed in your changes. It will >> require changes in vm_version_ codes. >> >> > >> > If we provide a new option EnableIntrinsic and put it least >> priority, then we can precisely control any intrinsic.
>> > Quote Vladimir Kozlov "DisableIntrinsic list prevails if an >> intrinsic is specified on both EnableIntrinsic and DisableIntrinsic." >> > >> > "-XX:ControlIntrinsic=+_dabs,-_fabs,-_getClass" looks >> more elegant, but it will confuse developers with DisableIntrinsic. >> > If we decide to deprecate DisableIntrinsic, I think >> ControlIntrinsic may be a better option. Now I prefer to provide >> EnableIntrinsic for simplicity and symmetry. >> >> I prefer to have one ControlIntrinsic flag and deprecate >> DisableIntrinsic. I don't think it is confusing. >> >> Thanks, >> Vladimir >> >> > What do you think? >> > >> > Thanks, >> > --lx >> > >> > >> > On 4/13/20, 1:47 PM, "hotspot-compiler-dev on behalf of >> Corey Ashford" > behalf of cjashfor at linux.ibm.com> wrote: >> > >> > CAUTION: This email originated from outside of the >> organization. Do not click links or open attachments unless you can >> confirm the sender and know the content is safe. >> > >> > >> > >> > On 4/13/20 10:33 AM, Liu, Xin wrote: >> > > Hi, compiler developers, >> > > I attempt to refactor UseXXXIntrinsics for >> JDK-8151779. I think we still need to keep UseXXXIntrinsics options >> because many applications may be using them. >> > > >> > > My change provide 2 new features: >> > > 1) a shorthand to enable/disable intrinsics. >> > > A comma-separated string. Each one is an intrinsic. >> An optional tailing symbol + or '-' denotes enabling or disabling. >> > > If the tailing symbol is missing, it means enable. >> > > Eg. >> -XX:UseIntrinsics="AESCTR-,CRC32C+,CRC32-,MathExact" >> >
> This jvm option will expand to multiple options >> -XX:-UseAESCTRIntrinsics, -XX:+UseCRC32CIntrinsics, >> -XX:-UseCRC32Intrinsics, -XX:UseMathExactIntrinsics >> > > >> > > 2) provide a set of macro to declare intrinsic options >> > > Developers declare once in intrinsics.hpp and macros >> will take care all other places. >> > > Here are example: >> https://cr.openjdk.java.net/~xliu/8151779/00/webrev/src/hotspot/share/compiler/intrinsics.hpp.html >> >> > > Ion Lam is overhauling jvm options. I am thinking >> how to be consistent with his proposal. >> > > >> > >> > Great idea, though to be consistent with the original >> syntax, I think >> > the +/- should be in front of the name: >> > >> > -XX:UseIntrinsics=-AESCTR,+CRC32C,... >> > >> > >> > > I handle UseIntrinsics before >> VM_Version::initialize. It means that platform-specific initialization >> still has a chance to correct those options. >> > > If we do that after VM_Version::initialize, some >> intrinsics may cause JVM crash. Eg. +UseBase64Intrinsics on x86_64 >> Linux. >> > > Even though this behavior is same as >> -XX:+UseXXXIntrinsics, from user's perspective, it's not >> straightforward when JVM overrides what users specify implicitly. It's >> dilemma here, stable jvm or fidelity of cmdline. What do you think? >> > > >> > > Another problem is naming convention. Almost all >> intrinsics options use UseXXXIntrinsics. One exception is >> UseVectorizedMismatchIntrinsic. >> > > Personally, I think it should be "UseXXXIntrinsic" >> because one option is for one intrinsic, right? Is it possible to >> change this name convention? >> > >> > Some (many?) intrinsic options turn on more than one >> .ad instruct >> >
instrinsic, or library instrinsics at the same time. >> I think that's why >> > the plural is there. Also, consistently adding the >> plural allows you to >> > add more capabilities to a flag that initially only >> had one intrinsic >> > without changing the plurality (and thus backward >> compatibility). >> > >> > Regards, >> > >> > - Corey >> > >> > >> >> From mikael.vidstedt at oracle.com Mon May 4 05:12:20 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Sun, 3 May 2020 22:12:20 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) Message-ID: Please review this change which implements part of JEP 381: JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! Background: Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas they touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas.
For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself. A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. Testing: A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. Cheers, Mikael [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ From rwestrel at redhat.com Mon May 4 07:28:55 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 04 May 2020 09:28:55 +0200 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> Message-ID: <87wo5s6tvs.fsf@redhat.com> Hi Martin, > my idea was rather to check if the trip counter is already checked before the loop. > Check before loop should look like this (stride > 0 example): > CmpINode c = CountedLoop->in(1) -> IfTrue->in(0) -> If->in(1) -> Bool->in(1) -> CompI > > (Maybe there's an easier way to find it where it gets generated.) > > Comparison of start value: > c->in(1) == Phi(trip counter)->in(1) > with limit: > c->in(2) == CmpI(trip counter)->in(2) > > If this matches we should be safe.
> I haven't checked if such patterns match often enough. Just as an idea. The code snippet I included does that but in a slightly different way (it looks for the CmpI/Bool with the right inputs and checks that it dominates the loops). That works for simple loops but I found it doesn't for other common loop shapes. So I doubt it's as simple as it seems to follow your suggestion. Roland. From stefan.karlsson at oracle.com Mon May 4 08:28:27 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 4 May 2020 10:28:27 +0200 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: Hi Mikael, On 2020-05-04 07:12, Mikael Vidstedt wrote: > > Please review this change which implements part of JEP 381: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ I went over this patch and collected some comments: src/hotspot/share/adlc/output_c.cpp src/hotspot/share/adlc/output_h.cpp Awkward code layout after change to. src/hotspot/share/c1/c1_Runtime1.cpp src/hotspot/share/classfile/classListParser.cpp src/hotspot/share/memory/arena.hpp src/hotspot/share/opto/chaitin.cpp test/hotspot/jtreg/gc/TestCardTablePageCommits.java Surrounding comments still refers to Sparc and/or Solaris. There are even more places if you search in the rest of the HotSpot source. Are we leaving those for a separate cleanup pass? src/hotspot/share/gc/g1/g1HeapRegionAttr.hpp Remove comment: // We use different types to represent the state value depending on platform as // some have issues loading parts of words. src/hotspot/share/gc/shared/memset_with_concurrent_readers.hpp Fuse the declaration and definition, now that we only have one implementation. Maybe even remove function/file at some point. src/hotspot/share/utilities/globalDefinitions.hpp Now that STACK_BIAS is always 0, should we remove its usages? Follow-up RFE?
src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/HotSpotGraphBuilderPlugins.java Maybe remove decryptSuffix? src/utils/hsdis/Makefile Is this really correct? Shouldn't: ARCH1=$(CPU:x86_64=amd64) ARCH2=$(ARCH1:i686=i386) ARCH=$(ARCH2:sparc64=sparcv9) be changed to: ARCH1=$(CPU:x86_64=amd64) ARCH=$(ARCH1:i686=i386) so that we have ARCH defined? Other than that this looks good to me. StefanK > JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 > > > Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! > > > Background: > > Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas they touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. > > For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. > > In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself.
> > A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! > > Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. > > Testing: > > A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. > > Cheers, > Mikael > > [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ > From nils.eliasson at oracle.com Mon May 4 08:43:37 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 4 May 2020 10:43:37 +0200 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> Message-ID: <0b9b46b7-1db2-4147-3064-422c6e8a0ffe@oracle.com> Hi, In general I like the new flag and its format. Thank you for fixing! I do have some comments: The _intrinsic_control_words array is very large. On x86 there are 328 intrinsics and every tribool is 4 bytes. This increases the DirectiveSet from 128 bytes to 1440. Can you make the _intrinsic_control_words an array of 2-bit tribool structs instead? Also - I get this compilation error in a number of places: .../jdk/open/src/hotspot/share/compiler/compilerDirectives.cpp: In constructor 'DirectiveSet::DirectiveSet(CompilerDirectives*)': .../jdk/open/src/hotspot/share/compiler/compilerDirectives.cpp:274:75: error: 'void* memset(void*, int, size_t)' clearing an object of type 'class tribool' with no trivial copy-assignment; use assignment or value-initialization instead [-Werror=class-memaccess] Best regards, Nils Eliasson On 2020-05-01 00:39, Liu, Xin wrote: > Hi, > > Ping for this code review.
> > I've updated the rev02 a little bit. Here is new revision. > https://cr.openjdk.java.net/~xliu/8151779/02/webrev/ > > 1. resolve merging conflict with TIP. > 2. add fill_in functions to pass sanity test of submit repo. > NOTHING_TO_RUN: 0 > UNABLE_TO_RUN: 0 > KILLED: 0 > NA: 0 > HARNESS_ERROR: 0 > FAILED: 0 > EXECUTED_WITH_FAILURE: 0 > PASSED: 84 > > 3. I also changed the description of ControlIntrinsic. > java -XX:+PrintFlagsWithComments | grep ControlIntrinsic > ccstrlist ControlIntrinsic = {diagnostic} {default} Control intrinsics using a list of +/- (internal) names, separated by commas > > thanks, > --lx > > > On 4/24/20, 1:40 AM, "hotspot-compiler-dev on behalf of Liu, Xin" wrote: > > Hi, > > May I get reviewed for this new revision? > JBS: https://bugs.openjdk.java.net/browse/JDK-8151779 > webrev: https://cr.openjdk.java.net/~xliu/8151779/01/webrev/ > > I introduce a new option -XX:ControlIntrinsic=+_id1,-id2... > The id is vmIntrinsics::ID. As prior discussion, ControlIntrinsic is expected to replace DisableIntrinsic. > I keep DisableIntrinsic in this revision. DisableIntrinsic prevails when an intrinsic appears on both lists. > > I use an array of tribool to mark each intrinsic is enabled or not. In this way, hotspot can avoid expensive string querying among intrinsics. > A Tribool value has 3 states: Default, true, or false. > If developers don't explicitly set an intrinsic, it will be available unless is disabled by the corresponding UseXXXIntrinsics. > Traditional Boolean value can't express fine/coarse-grained control. Ie. We only go through those auxiliary options UseXXXIntrinsics if developers don't control a specific intrinsic. > > I also add the support of ControlIntrinsic to CompilerDirectives. > > Test: > I reuse jtreg tests of DisableIntrinsic. Add add more @run annotations to verify ControlIntrinsics. > I passed hotspot:Tier1 test and all tests on x86_64/linux.
> > Thanks, > --lx > > On 4/17/20, 7:22 PM, "hotspot-compiler-dev on behalf of Liu, Xin" wrote: > > Hi, Vladimir, > > Thanks for the clarification. > Oh, yes, it's theoretically possible, but it's tedious. I am wrong at that point. > I think I got your point. ControlIntrinsics will make developer's life easier. I will implement it. > > Thanks, > --lx > > > On 4/17/20, 6:46 PM, "Vladimir Kozlov" wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > I withdraw my suggestion about EnableIntrinsic from JDK-8151779 because ControlIntrinsics will provide such > functionality and will replace existing DisableIntrinsic. > > Note, we can start deprecating Use*Intrinsic flags (and DisableIntrinsic) later in other changes. You don't need to do > everything at once. What we need now a mechanism to replace them. > > On 4/16/20 11:58 PM, Liu, Xin wrote: > > Hi, Corey and Vladimir, > > > > I recently go through vmSymbols.hpp/cpp. I think I understand your comments. > > Each UseXXXIntrinsics does control a bunch of intrinsics (plural). Thanks for the hint. > > > > Even though I feel I know intrinsics mechanism of hotspot better, I still need a clarification of JDK- 8151779. > > > > There're 321 intrinsics (https://chriswhocodes.com/hotspot_intrinsics_jdk15.html). > > If there's no any option, they are all available for compilers. That makes sense because intrinsics are always beneficial. > > But there're reasons we need to disable a subset of them. A specific architecture may miss efficient instructions or fixed functions. Or simply because an intrinsic is buggy. > > > > Currently, JDK provides developers 2 ways to control intrinsics. > 1. Some diagnostic options. Eg. InlineMathNatives, UseBase64Intrinsics. > > Developers can use one option to disable a group of intrinsics. That is to say, it's a coarse-grained approach. > > > > 2. 
DisableIntrinsic="a,b,c" > > By passing a string list of vmIntrinsics::IDs, it's capable of disabling any specified intrinsic. > > > > But even putting above 2 approaches together, we still can't precisely control any intrinsic. > > Yes, you are right. We seems are trying to put these 2 different ways into one flag which may be mistake. > > -XX:ControlIntrinsic=-_updateBytesCRC32C,-_updateDirectByteBufferCRC32C is a similar to -XX:-UseCRC32CIntrinsics but it > requires more detailed knowledge about intrinsics ids. > > May be we can have 2nd flag, as you suggested -XX:UseIntrinsics=-AESCTR,+CRC32C, for such cases. > > > If we want to enable an intrinsic which is under control of InlineMathNatives but keep others disable, it's impossible now. [please correct if I am wrong here]. > > You can disable all other from 321 intrinsics with DisableIntrinsic flag which is very tedious I agree. > > > I think that the motivation JDK-8151779 tried to solve. > > The idea is that instead of flags we use to control particular intrinsics depending on CPU we will use vmIntrinsics::IDs > or other tables as you showed in your changes. It will require changes in vm_version_ codes. > > > > > If we provide a new option EnableIntrinsic and put it least priority, then we can precisely control any intrinsic. > > Quote Vladimir Kozlov "DisableIntrinsic list prevails if an intrinsic is specified on both EnableIntrinsic and DisableIntrinsic." > > > > "-XX:ControlIntrinsic=+_dabs,-_fabs,-_getClass" looks more elegant, but it will confuse developers with DisableIntrinsic. > > If we decide to deprecate DisableIntrinsic, I think ControlIntrinsic may be a better option. Now I prefer to provide EnableIntrinsic for simplicity and symmetry. > > I prefer to have one ControlIntrinsic flag and deprecate DisableIntrinsic. I don't think it is confusing. > > Thanks, > Vladimir > > > What do you think? 
> > > > Thanks, > > --lx > > > > > > On 4/13/20, 1:47 PM, "hotspot-compiler-dev on behalf of Corey Ashford" wrote: > > > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > On 4/13/20 10:33 AM, Liu, Xin wrote: > > > Hi, compiler developers, > > > I attempt to refactor UseXXXIntrinsics for JDK-8151779. I think we still need to keep UseXXXIntrinsics options because many applications may be using them. > > > > > > My change provide 2 new features: > > > 1) a shorthand to enable/disable intrinsics. > > > A comma-separated string. Each one is an intrinsic. An optional tailing symbol + or '-' denotes enabling or disabling. > > > If the tailing symbol is missing, it means enable. > > > Eg. -XX:UseIntrinsics="AESCTR-,CRC32C+,CRC32-,MathExact" > > > This jvm option will expand to multiple options -XX:-UseAESCTRIntrinsics, -XX:+UseCRC32CIntrinsics, -XX:-UseCRC32Intrinsics, -XX:UseMathExactIntrinsics > > > > > > 2) provide a set of macro to declare intrinsic options > > > Developers declare once in intrinsics.hpp and macros will take care all other places. > > > Here are example: https://cr.openjdk.java.net/~xliu/8151779/00/webrev/src/hotspot/share/compiler/intrinsics.hpp.html > > > Ion Lam is overhauling jvm options. I am thinking how to be consistent with his proposal. > > > > > > > Great idea, though to be consistent with the original syntax, I think > > the +/- should be in front of the name: > > > > -XX:UseIntrinsics=-AESCTR,+CRC32C,... > > > > > > > I handle UseIntrinsics before VM_Version::initialize. It means that platform-specific initialization still has a chance to correct those options. > > > If we do that after VM_Version::initialize, some intrinsics may cause JVM crash. Eg. +UseBase64Intrinsics on x86_64 Linux. 
> > > Even though this behavior is same as -XX:+UseXXXIntrinsics, from user's perspective, it's not straightforward when JVM overrides what users specify implicitly. It's dilemma here, stable jvm or fidelity of cmdline. What do you think? > > > > > > Another problem is naming convention. Almost all intrinsics options use UseXXXIntrinsics. One exception is UseVectorizedMismatchIntrinsic. > > > Personally, I think it should be "UseXXXIntrinsic" because one option is for one intrinsic, right? Is it possible to change this name convention? > > > > Some (many?) intrinsic options turn on more than one .ad instruct > > instrinsic, or library instrinsics at the same time. I think that's why > > the plural is there. Also, consistently adding the plural allows you to > > add more capabilities to a flag that initially only had one intrinsic > > without changing the plurality (and thus backward compatibility). > > > > Regards, > > > > - Corey > > > > > > > From kim.barrett at oracle.com Mon May 4 08:54:25 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 May 2020 04:54:25 -0400 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: > On May 4, 2020, at 4:28 AM, Stefan Karlsson wrote: > > Hi Mikael, > > On 2020-05-04 07:12, Mikael Vidstedt wrote: >> Please review this change which implements part of JEP 381: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > > [...] > > src/hotspot/share/gc/shared/memset_with_concurrent_readers.hpp > > Fuse the declaration and definition, now that we only have one implementation. Maybe even remove function/file at some point. I think that cleanup is covered by JDK-8142349.
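On the size concern raised in Nils Eliasson's review of 8151779 earlier in this thread (a 4-byte tribool per intrinsic growing the DirectiveSet from 128 to 1440 bytes), the following is a minimal sketch of what a 2-bit packed three-state array could look like. It is not the code from the webrev: the names TriState, PackedTriStateArray and is_enabled are hypothetical, and disabled_by_flags merely stands in for the result of a check such as vmIntrinsics::is_disabled_by_flags(id). For the 328 entries mentioned in the review, this layout needs 82 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical three-state value; Default is 0 so that zeroed storage
// means "not explicitly set". Names are illustrative, not HotSpot's.
enum class TriState : std::uint8_t { Default = 0, False = 1, True = 2 };

// Stores N three-state values at 2 bits each (four values per byte).
template <std::size_t N>
class PackedTriStateArray {
 public:
  TriState get(std::size_t i) const {
    assert(i < N);
    unsigned shift = static_cast<unsigned>(i % 4) * 2;
    return static_cast<TriState>((_bits[i / 4] >> shift) & 0x3u);
  }

  void set(std::size_t i, TriState v) {
    assert(i < N);
    unsigned shift = static_cast<unsigned>(i % 4) * 2;
    // Clear the 2-bit slot, then write the new value into it.
    unsigned cleared = _bits[i / 4] & ~(0x3u << shift);
    _bits[i / 4] =
        static_cast<std::uint8_t>(cleared | (static_cast<unsigned>(v) << shift));
  }

  static constexpr std::size_t size_in_bytes() { return (N + 3) / 4; }

 private:
  // Value-initialized to zero, so every slot starts as Default
  // without needing memset on a non-trivial type.
  std::uint8_t _bits[(N + 3) / 4] = {};
};

// Mirrors the fallback described in the thread: an explicit per-intrinsic
// setting wins; otherwise the coarse-grained flag check decides.
template <std::size_t N>
bool is_enabled(const PackedTriStateArray<N>& words, std::size_t id,
                bool disabled_by_flags) {
  TriState s = words.get(id);
  if (s == TriState::Default) {
    return !disabled_by_flags;
  }
  return s == TriState::True;
}
```

Because the storage is a plain byte array with a default member initializer, this layout would also sidestep the -Werror=class-memaccess complaint quoted in the review, though whether that fits the actual DirectiveSet code is an assumption.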
From thomas.schatzl at oracle.com Mon May 4 09:11:56 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 4 May 2020 11:11:56 +0200 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: Hi, On 04.05.20 10:28, Stefan Karlsson wrote: > Hi Mikael, > > On 2020-05-04 07:12, Mikael Vidstedt wrote: >> >> Please review this change which implements part of JEP 381: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: >> http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> > > I went over this patch and collected some comments: > > src/hotspot/share/adlc/output_c.cpp > src/hotspot/share/adlc/output_h.cpp > > Awkward code layout after change to. > > > src/hotspot/share/c1/c1_Runtime1.cpp > src/hotspot/share/classfile/classListParser.cpp > src/hotspot/share/memory/arena.hpp > src/hotspot/share/opto/chaitin.cpp > test/hotspot/jtreg/gc/TestCardTablePageCommits.jav > > Surrounding comments still refers to Sparc and/or Solaris. > > There are even more places if you search in the rest of the HotSpot > source. Are we leaving those for a separate cleanup pass? In addition to "sparc", "solaris", also "solstudio"/"Sun Studio"/"SS compiler bug"/"niagara" yield some search (=grep) results. Some of these locations look like additional RFEs. Thanks, Thomas From nils.eliasson at oracle.com Mon May 4 09:49:23 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 4 May 2020 11:49:23 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: Message-ID: <9f758014-551a-3b22-1592-e368a815484b@oracle.com> Hi Man, Nice catch! Your change looks good. You can consider it trivial. Best regards, Nils On 2020-05-02 07:31, Man Cao wrote: > Hi all, > > Can I have reviews for this one-line change that fixes a bug and could > significantly improve performance? 
> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 > > It passes tier-1 tests locally, as well as > "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809). > > -Man From richard.reingruber at sap.com Mon May 4 10:33:08 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Mon, 4 May 2020 10:33:08 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> Message-ID: // Trimmed the list of recipients. If the list gets too long then the message needs to be approved // by a moderator. Hi David, > On 28/04/2020 12:09 am, Reingruber, Richard wrote: > > Hi David, > > > >> Not a review but some general commentary ... > > > > That's welcome. > Having had to take an even closer look now I have a review comment too :) > src/hotspot/share/prims/jvmtiThreadState.cpp > void JvmtiThreadState::invalidate_cur_stack_depth() { > ! assert(SafepointSynchronize::is_at_safepoint() || > ! (Thread::current()->is_VM_thread() && > get_thread()->is_vmthread_processing_handshake()) || > (JavaThread *)Thread::current() == get_thread(), > "must be current thread or at safepoint"); You're looking at an outdated webrev, I'm afraid. This would be the post with the current webrev.1 http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html Thanks, Richard. -----Original Message----- From: David Holmes Sent: Montag, 4. 
Mai 2020 08:51 To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Hi Richard, On 28/04/2020 12:09 am, Reingruber, Richard wrote: > Hi David, > >> Not a review but some general commentary ... > > That's welcome. Having had to take an even closer look now I have a review comment too :) src/hotspot/share/prims/jvmtiThreadState.cpp void JvmtiThreadState::invalidate_cur_stack_depth() { ! assert(SafepointSynchronize::is_at_safepoint() || ! (Thread::current()->is_VM_thread() && get_thread()->is_vmthread_processing_handshake()) || (JavaThread *)Thread::current() == get_thread(), "must be current thread or at safepoint"); The message needs updating to include handshakes. More below ... >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>> Hi Yasumasa, Patricio, >>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. 
>>> >>> Thanks :) >>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>> also I'm unsure if a thread should do safepoint checks while executing a handshake. > >> I'm growing increasingly concerned that use of direct handshakes to >> replace VM operations needs a much greater examination for correctness >> than might initially be thought. I see a number of issues: > > I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. > > In addition I would suggest to take the general part of the discussion to a dedicated thread or to > the review thread for JDK-8242427. I would like to keep this thread closer to its subject. I will focus on the issues in the context of this particular change then, though the issues themselves are applicable to all handshake situations (and more so with direct handshakes). This is mostly just discussion. >> First, the VMThread executes (most) VM operations with a clean stack in >> a clean state, so it has lots of room to work. If we now execute the >> same logic in a JavaThread then we risk hitting stackoverflows if >> nothing else. But we are also now executing code in a JavaThread and so >> we have to be sure that code is not going to act differently (in a bad >> way) if executed by a JavaThread rather than the VMThread. 
For example, >> may it be possible that if executing in the VMThread we defer some >> activity that might require execution of Java code, or else hand it off >> to one of the service threads? If we execute that code directly in the >> current JavaThread instead we may not be in a valid state (e.g. consider >> re-entrancy to various subsystems that is not allowed). > > It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a > paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a > synchronization point of view. Just to be clear, your proposed change is not using a direct handshake. > Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of > the deopt handler. > > I can't see why this cannot be done with a direct handshake. Something very similar is already done > in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. Note that existing non-direct handshakes may also have issues that have not been fully investigated. > The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. For the target thread, if you use more stack than would be used stopping at a safepoint then you are at risk. For the thread initiating the direct handshake, if you use more stack than would be used enqueuing a VM operation, then you are at risk. As we have not quantified these numbers, nor have any easy way to establish the stack use of the actual code to be executed, we're really just hoping for the best. This is a general problem with handshakes that needs to be investigated more deeply. As a simple, general example, just imagine if the code involves logging that might utilise an on-stack buffer. >> Second, we have this question mark over what happens if the operation >> hits further safepoint or handshake polls/checks? Are there constraints >> on what is allowed here?
How can we recognise this problem may exist and >> so deal with it? > > The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I > tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. That's good to hear but such tests are not exhaustive, they will detect if you do reach a safepoint/handshake but they can't prove that you cannot reach one. What you have done is necessary but may not be sufficient. Plus you didn't actually add the NSV to the code - is there a reason we can't actually keep it in do_thread? (I'm not sure if the NSV also acts as a NoHandshakeVerifier?) >> Third, while we are generally considering what appear to be >> single-thread operations, which should be amenable to a direct >> handshake, we also have to be careful that some of the code involved >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >> not need to take a lock where a direct handshake might! > > See again my arguments in the JBS item [1]. Yes I see the reasoning and that is good. My point is a general one as it may not be obvious when such assumptions exist in the current code. Thanks, David > Thanks, > Richard. > > [1] https://bugs.openjdk.java.net/browse/JDK-8238585 > > -----Original Message----- > From: David Holmes > Sent: Montag, 27. April 2020 07:16 > To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi all, > > Not a review but some general commentary ... 
> > On 25/04/2020 2:08 am, Reingruber, Richard wrote: >> Hi Yasumasa, Patricio, >> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >> >>> Thanks for your information. >>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>> I will modify and will test it after yours. >> >> Thanks :) >> >>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>> to me, how this has to be handled. >> >>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >> >> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >> also I'm unsure if a thread should do safepoint checks while executing a handshake. > > I'm growing increasingly concerned that use of direct handshakes to > replace VM operations needs a much greater examination for correctness > than might initially be thought. I see a number of issues: > > First, the VMThread executes (most) VM operations with a clean stack in > a clean state, so it has lots of room to work. If we now execute the > same logic in a JavaThread then we risk hitting stackoverflows if > nothing else. 
But we are also now executing code in a JavaThread and so > we have to be sure that code is not going to act differently (in a bad > way) if executed by a JavaThread rather than the VMThread. For example, > may it be possible that if executing in the VMThread we defer some > activity that might require execution of Java code, or else hand it off > to one of the service threads? If we execute that code directly in the > current JavaThread instead we may not be in a valid state (e.g. consider > re-entrancy to various subsystems that is not allowed). > > Second, we have this question mark over what happens if the operation > hits further safepoint or handshake polls/checks? Are there constraints > on what is allowed here? How can we recognise this problem may exist and > so deal with it? > > Third, while we are generally considering what appear to be > single-thread operations, which should be amenable to a direct > handshake, we also have to be careful that some of the code involved > doesn't already expect/assume we are at a safepoint - e.g. a VM op may > not need to take a lock where a direct handshake might! > > Cheers, > David > ----- > >> @Patricio, coming back to my question [1]: >> >> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >> handshakee would be safepoint safe, wouldn't it? >> >> Thanks, Richard. 
>> >> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >> >> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Freitag, 24. April 2020 17:23 >> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> On 2020/04/24 23:44, Reingruber, Richard wrote: >>> Hi Yasumasa, >>> >>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>> Does it help you? I think it gives you to remove workaround. >>> >>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >> >> Thanks for your information. >> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >> I will modify and will test it after yours. >> >> >>> Also my first impression was that it won't be that easy from a synchronization point of view to >>> replace VM_SetFramePop with a direct handshake. E.g. 
VM_SetFramePop::doit() indirectly calls >>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>> to me, how this has to be handled. >> >> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >> >> >> Thanks, >> >> Yasumasa >> >> >>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>> >>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>> >>> Would be interesting to see how you handled the issues above :) >>> >>> Thanks, Richard. >>> >>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>> >>> -----Original Message----- >>> From: Yasumasa Suenaga >>> Sent: Freitag, 24. April 2020 13:34 >>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi Richard, >>> >>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>> Does it help you? I think it gives you to remove workaround. >>> >>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) 
>>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>> Hi Patricio, Vladimir, and Serguei, >>>> >>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>> >>>> In addition I have done some clean-up changes I missed in the first webrev. >>>> >>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>> into the vm operation VM_SetFramePop [1] >>>> >>>> Kindly review again: >>>> >>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>> >>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>> direct handshake: >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>> >>>> Testing: >>>> >>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>> >>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>> >>>> Thanks, >>>> Richard. >>>> >>>> [1] An assertion in Handshake::execute_direct() fails if called by the VMThread, because it is not a JavaThread. >>>> >>>> -----Original Message----- >>>> From: hotspot-dev On Behalf Of Reingruber, Richard >>>> Sent: Freitag, 14. Februar 2020 19:47 >>>> To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Patricio, >>>> >>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>> > > handshake cannot be nested in a vm operation.
Maybe it should be asserted in the >>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>> > > >>>> > > > Alternatively I think you could do something similar to what we do in >>>> > > > Deoptimization::deoptimize_all_marked(): >>>> > > > >>>> > > > EnterInterpOnlyModeClosure hs; >>>> > > > if (SafepointSynchronize::is_at_safepoint()) { >>>> > > > hs.do_thread(state->get_thread()); >>>> > > > } else { >>>> > > > Handshake::execute(&hs, state->get_thread()); >>>> > > > } >>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>> > > > HandshakeClosure() constructor) >>>> > > >>>> > > Maybe this could be used also in the Handshake::execute() methods as general solution? >>>> > Right, we could also do that. Avoiding to clear the polling page in >>>> > HandshakeState::clear_handshake() should be enough to fix this issue and >>>> > execute a handshake inside a safepoint, but adding that "if" statement >>>> > in Handshake::execute() sounds good to avoid all the extra code that we >>>> > go through when executing a handshake. I filed 8239084 to make that change. >>>> >>>> Thanks for taking care of this and creating the RFE. >>>> >>>> > >>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>> > > > always called in a nested operation or just sometimes.
>>>> > > >>>> > > At least one execution path without vm operation exists: >>>> > > >>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void >>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>> > > >>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>> > > encouraged to do it with a handshake :) >>>> > Ah! I think you can still do it with a handshake with the >>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the >>>> > if-else statement with just the Handshake::execute() call in 8239084. >>>> > But up to you. : ) >>>> >>>> Well, I think that's enough encouragement :) >>>> I'll wait for 8239084 and try then again. >>>> (no urgency and all) >>>> >>>> Thanks, >>>> Richard. >>>> >>>> -----Original Message----- >>>> From: Patricio Chilano >>>> Sent: Freitag, 14. Februar 2020 15:54 >>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: >>>>> Hi Patricio, >>>>> >>>>> thanks for having a look. 
>>>>> >>>>> > I'm only commenting on the handshake changes. >>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside >>>>> > operation VM_SetFramePop which also allows nested operations. Here is a >>>>> > comment in VM_SetFramePop definition: >>>>> > >>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>> > >>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>> > could have a handshake inside a safepoint operation. The issue I see >>>>> > there is that at the end of the handshake the polling page of the target >>>>> > thread could be disarmed. So if the target thread happens to be in a >>>>> > blocked state just transiently and wakes up then it will not stop for >>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>> > polling page is armed at the beginning of disarm_safepoint(). >>>>> >>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>>> >>>>> > Alternatively I think you could do something similar to what we do in >>>>> > Deoptimization::deoptimize_all_marked(): >>>>> > >>>>> > EnterInterpOnlyModeClosure hs; >>>>> > if (SafepointSynchronize::is_at_safepoint()) { >>>>> > hs.do_thread(state->get_thread()); >>>>> > } else { >>>>> > Handshake::execute(&hs, state->get_thread()); >>>>> > } >>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>> > HandshakeClosure() constructor) >>>>> >>>>> Maybe this could be used also in the Handshake::execute() methods as general solution? >>>> Right, we could also do that.
Avoiding to clear the polling page in >>>> HandshakeState::clear_handshake() should be enough to fix this issue and >>>> execute a handshake inside a safepoint, but adding that "if" statement >>>> in Handshake::execute() sounds good to avoid all the extra code that we >>>> go through when executing a handshake. I filed 8239084 to make that change. >>>> >>>>> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>> > always called in a nested operation or just sometimes. >>>>> >>>>> At least one execution path without vm operation exists: >>>>> >>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>> JvmtiEventControllerPrivate::recompute_enabled() : void >>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>> >>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>> handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>>> encouraged to do it with a handshake :) >>>> Ah! I think you can still do it with a handshake with the >>>> Deoptimization::deoptimize_all_marked() like solution. I can change the >>>> if-else statement with just the Handshake::execute() call in 8239084. >>>> But up to you. : ) >>>> >>>> Thanks, >>>> Patricio >>>>> Thanks again, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: Patricio Chilano >>>>> Sent: Donnerstag, 13.
Februar 2020 18:47 >>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Richard, >>>>> >>>>> I'm only commenting on the handshake changes. >>>>> I see that operation VM_EnterInterpOnlyMode can be called inside >>>>> operation VM_SetFramePop which also allows nested operations. Here is a >>>>> comment in VM_SetFramePop definition: >>>>> >>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>> >>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>> could have a handshake inside a safepoint operation. The issue I see >>>>> there is that at the end of the handshake the polling page of the target >>>>> thread could be disarmed. So if the target thread happens to be in a >>>>> blocked state just transiently and wakes up then it will not stop for >>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>> polling page is armed at the beginning of disarm_safepoint(). >>>>> >>>>> I think one option could be to remove >>>>> SafepointMechanism::disarm_if_needed() in >>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself >>>>> for the handshake case. >>>>> >>>>> Alternatively I think you could do something similar to what we do in >>>>> Deoptimization::deoptimize_all_marked(): >>>>> >>>>> EnterInterpOnlyModeClosure hs; >>>>> if (SafepointSynchronize::is_at_safepoint()) { >>>>> hs.do_thread(state->get_thread()); >>>>> } else { >>>>> Handshake::execute(&hs, state->get_thread()); >>>>> } >>>>> (you could pass "EnterInterpOnlyModeClosure"
directly to the >>>>> HandshakeClosure() constructor) >>>>> >>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>> always called in a nested operation or just sometimes. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: >>>>>> // Repost including hotspot runtime and gc lists. >>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation >>>>>> // with a handshake. >>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>>>>> >>>>>> Hi, >>>>>> >>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>> >>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>>>>> >>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>>>>> >>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>>> >>>>>> Thanks, Richard.
>>>>>> >>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >>>> From kim.barrett at oracle.com Mon May 4 10:47:11 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 4 May 2020 06:47:11 -0400 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <553D0344-188E-455B-A03E-D080C1484B41@oracle.com> > On May 4, 2020, at 1:12 AM, Mikael Vidstedt wrote: > > > Please review this change which implements part of JEP 381: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 I've only looked at the src/hotspot changes so far. I've not duplicated comments already made by Stefan. Looks good, other than a few very minor issues, some of which might already be covered by planned followup RFEs. ------------------------------------------------------------------------------ I think with sparc removal, c1's pack64/unpack64 stuff is no longer used. So I think that can be removed from c1_LIR.[ch]pp too. ------------------------------------------------------------------------------ src/hotspot/share/opto/generateOptoStub.cpp 225 // Clear last_Java_pc and (optionally)_flags The sparc-specific clearing of "flags" is gone. ------------------------------------------------------------------------------ src/hotspot/share/runtime/deoptimization.cpp 1086 *((jlong *) check_alignment_get_addr(obj, index, 8)) = (jlong) *((jlong *) &val); [pre-existing] The rhs cast to jlong is unnecessary, since it's dereferencing a jlong*. 
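The point about the redundant cast can be sketched in isolation. This is a minimal illustration, not HotSpot code: `jlong` is assumed here to be a 64-bit signed integer (as in JNI), and `store_with_cast` / `store_without_cast` are hypothetical helpers standing in for the assignment quoted above.

```cpp
#include <cstdint>
#include <type_traits>

// Assumption for this sketch: jlong is a 64-bit signed integer, as in JNI.
typedef int64_t jlong;

// Dereferencing a jlong* already yields an lvalue of type jlong, so casting
// the result to jlong again does not change the type or the value.
static_assert(std::is_same<decltype(*static_cast<jlong*>(nullptr)), jlong&>::value,
              "dereferencing a jlong* yields a jlong");

// Hypothetical helper mirroring the shape of the reviewed statement,
// including the redundant outer cast.
inline void store_with_cast(jlong* dst, jlong val) {
  *dst = (jlong) *((jlong*) &val);
}

// Same store without the outer cast; the value written is identical.
inline void store_without_cast(jlong* dst, jlong val) {
  *dst = *((jlong*) &val);
}
```

Both helpers write the same value for any input, which is why dropping the right-hand-side cast is a pure cleanup.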
------------------------------------------------------------------------------ src/hotspot/share/runtime/flags/jvmFlagConstraintsCompiler.cpp 236 JVMFlag::Error CompilerThreadPriorityConstraintFunc(intx value, bool verbose) { 237 return JVMFlag::SUCCESS; 238 } After SOLARIS code removal we no longer need this constraint function. ------------------------------------------------------------------------------ src/hotspot/share/runtime/globals.hpp 2392 experimental(size_t, ArrayAllocatorMallocLimit, \ 2393 (size_t)-1, \ Combine these lines. ------------------------------------------------------------------------------ src/hotspot/share/utilities/dtrace.hpp Should just eliminate all traces of HS_DTRACE_WORKAROUND_TAIL_CALL_BUG. ------------------------------------------------------------------------------ From martin.doerr at sap.com Mon May 4 10:59:40 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 4 May 2020 10:59:40 +0000 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: <87wo5s6tvs.fsf@redhat.com> References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> <87wo5s6tvs.fsf@redhat.com> Message-ID: Hi Roland, I was hoping this was easier. I'd really appreciate having simpler and more comprehensive graph patterns. I wonder how many people will be able to debug them. Given the huge number of problems related to LSM I wonder if it's worth maintaining it. It comes with a high price. Nevertheless, your version looks at least correct to me. loopnode.cpp: Please add your formula to the comment. TestStripMinedLimitBelowInit.java: I suggest using -XX:-TieredCompilation Best regards, Martin > -----Original Message----- > From: Roland Westrelin > Sent: Montag, 4. 
Mai 2020 09:29 > To: Doerr, Martin ; Pengfei Li > ; hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: RE: RFR(S): 8244086: Following 8241492, strip mined loop may run > extra iterations > > > Hi Martin, > > > my idea was rather to check if the trip counter is already checked before > the loop. > > Check before loop should look like this (stride > 0 example): > > CmpINode c = CountedLoop->in(1) -> IfTrue->in(0) -> If->in(1) -> Bool- > >in(1) -> CompI > > > > (Maybe there's an easier way to find it where it gets generated.) > > > > Comparison of start value: > > c->in(1) == Phi(trip counter)->in(1) > > with limit: > > c->in(2) == CmpI(trip counter)->in(2) > > > > If this matches we should be safe. > > I haven't checked if such patterns match often enough. Just as an idea. > > The code snippet I included does that but in a slightly different way > (it looks for the CmpI/Bool with the right inputs and checks that it > dominates the loops). That works for simple loops but I found it doesn't > for other common loop shapes. So I doubt it's as simple as it seems to > follow your suggestion. > > Roland. From bourges.laurent at gmail.com Mon May 4 11:11:01 2020 From: bourges.laurent at gmail.com (=?UTF-8?Q?Laurent_Bourg=C3=A8s?=) Date: Mon, 4 May 2020 13:11:01 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: Message-ID: Hi, Do you have performance results to justify your assumption "could significantly improve performance" ? Please share numbers in the jbs bug Laurent Le sam. 2 mai 2020 à 07:35, Man Cao a écrit : > Hi all, > > Can I have reviews for this one-line change that fixes a bug and could > significantly improve performance? > Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 > > It passes tier-1 tests locally, as well as > "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809). 
> > -Man > From david.holmes at oracle.com Mon May 4 06:50:49 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 4 May 2020 16:50:49 +1000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> Message-ID: <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> Hi Richard, On 28/04/2020 12:09 am, Reingruber, Richard wrote: > Hi David, > >> Not a review but some general commentary ... > > That's welcome. Having had to take an even closer look now I have a review comment too :) src/hotspot/share/prims/jvmtiThreadState.cpp void JvmtiThreadState::invalidate_cur_stack_depth() { ! assert(SafepointSynchronize::is_at_safepoint() || ! (Thread::current()->is_VM_thread() && get_thread()->is_vmthread_processing_handshake()) || (JavaThread *)Thread::current() == get_thread(), "must be current thread or at safepoint"); The message needs updating to include handshakes. More below ... >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>> Hi Yasumasa, Patricio, >>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. 
>>> >>> Thanks :) >>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>> also I'm unsure if a thread should do safepoint checks while executing a handshake. > >> I'm growing increasingly concerned that use of direct handshakes to >> replace VM operations needs a much greater examination for correctness >> than might initially be thought. I see a number of issues: > > I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. > > In addition I would suggest to take the general part of the discussion to a dedicated thread or to > the review thread for JDK-8242427. I would like to keep this thread closer to its subject. I will focus on the issues in the context of this particular change then, though the issues themselves are applicable to all handshake situations (and more so with direct handshakes). This is mostly just discussion. >> First, the VMThread executes (most) VM operations with a clean stack in >> a clean state, so it has lots of room to work. If we now execute the >> same logic in a JavaThread then we risk hitting stackoverflows if >> nothing else. But we are also now executing code in a JavaThread and so >> we have to be sure that code is not going to act differently (in a bad >> way) if executed by a JavaThread rather than the VMThread. 
For example, >> may it be possible that if executing in the VMThread we defer some >> activity that might require execution of Java code, or else hand it off >> to one of the service threads? If we execute that code directly in the >> current JavaThread instead we may not be in a valid state (e.g. consider >> re-entrancy to various subsystems that is not allowed). > > It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a > paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a > synchronization point of view. Just to be clear, your proposed change is not using a direct handshake. > Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of > the deopt handler. > > I can't see why this cannot be done with a direct handshake. Something very similar is already done > in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. Note that existing non-direct handshakes may also have issues that have not been fully investigated. > The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. For the target thread if you use more stack than would be used stopping at a safepoint then you are at risk. For the thread initiating the direct handshake if you use more stack than would be used enqueuing a VM operation, then you are at risk. As we have not quantified these numbers, nor have any easy way to establish the stack use of the actual code to be executed, we're really just hoping for the best. This is a general problem with handshakes that needs to be investigated more deeply. As a simple, general, example just imagine if the code involves logging that might utilise an on-stack buffer. >> Second, we have this question mark over what happens if the operation >> hits further safepoint or handshake polls/checks? Are there constraints >> on what is allowed here? 
How can we recognise this problem may exist and >> so deal with it? > > The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I > tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. That's good to hear but such tests are not exhaustive, they will detect if you do reach a safepoint/handshake but they can't prove that you cannot reach one. What you have done is necessary but may not be sufficient. Plus you didn't actually add the NSV to the code - is there a reason we can't actually keep it in do_thread? (I'm not sure if the NSV also acts as a NoHandshakeVerifier?) >> Third, while we are generally considering what appear to be >> single-thread operations, which should be amenable to a direct >> handshake, we also have to be careful that some of the code involved >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >> not need to take a lock where a direct handshake might! > > See again my arguments in the JBS item [1]. Yes I see the reasoning and that is good. My point is a general one as it may not be obvious when such assumptions exist in the current code. Thanks, David > Thanks, > Richard. > > [1] https://bugs.openjdk.java.net/browse/JDK-8238585 > > -----Original Message----- > From: David Holmes > Sent: Montag, 27. April 2020 07:16 > To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi all, > > Not a review but some general commentary ... 
> > On 25/04/2020 2:08 am, Reingruber, Richard wrote: >> Hi Yasumasa, Patricio, >> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >> >>> Thanks for your information. >>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>> I will modify and will test it after yours. >> >> Thanks :) >> >>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>> to me, how this has to be handled. >> >>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >> >> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >> also I'm unsure if a thread should do safepoint checks while executing a handshake. > > I'm growing increasingly concerned that use of direct handshakes to > replace VM operations needs a much greater examination for correctness > than might initially be thought. I see a number of issues: > > First, the VMThread executes (most) VM operations with a clean stack in > a clean state, so it has lots of room to work. If we now execute the > same logic in a JavaThread then we risk hitting stackoverflows if > nothing else. 
But we are also now executing code in a JavaThread and so > we have to be sure that code is not going to act differently (in a bad > way) if executed by a JavaThread rather than the VMThread. For example, > may it be possible that if executing in the VMThread we defer some > activity that might require execution of Java code, or else hand it off > to one of the service threads? If we execute that code directly in the > current JavaThread instead we may not be in a valid state (e.g. consider > re-entrancy to various subsystems that is not allowed). > > Second, we have this question mark over what happens if the operation > hits further safepoint or handshake polls/checks? Are there constraints > on what is allowed here? How can we recognise this problem may exist and > so deal with it? > > Third, while we are generally considering what appear to be > single-thread operations, which should be amenable to a direct > handshake, we also have to be careful that some of the code involved > doesn't already expect/assume we are at a safepoint - e.g. a VM op may > not need to take a lock where a direct handshake might! > > Cheers, > David > ----- > >> @Patricio, coming back to my question [1]: >> >> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >> handshakee would be safepoint safe, wouldn't it? >> >> Thanks, Richard. 
>> >> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >> >> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Freitag, 24. April 2020 17:23 >> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> On 2020/04/24 23:44, Reingruber, Richard wrote: >>> Hi Yasumasa, >>> >>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>> Does it help you? I think it gives you to remove workaround. >>> >>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >> >> Thanks for your information. >> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >> I will modify and will test it after yours. >> >> >>> Also my first impression was that it won't be that easy from a synchronization point of view to >>> replace VM_SetFramePop with a direct handshake. E.g. 
VM_SetFramePop::doit() indirectly calls >>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>> to me, how this has to be handled. >> >> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >> >> >> Thanks, >> >> Yasumasa >> >> >>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>> >>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>> >>> Would be interesting to see how you handled the issues above :) >>> >>> Thanks, Richard. >>> >>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>> >>> -----Original Message----- >>> From: Yasumasa Suenaga >>> Sent: Freitag, 24. April 2020 13:34 >>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi Richard, >>> >>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>> Does it help you? I think it gives you to remove workaround. >>> >>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) 
>>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>> Hi Patricio, Vladimir, and Serguei, >>>> >>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>> >>>> In addition I have done some clean-up changes I missed in the first webrev. >>>> >>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>> into the vm operation VM_SetFramePop [1] >>>> >>>> Kindly review again: >>>> >>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>> >>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>> direct handshake: >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>> >>>> Testing: >>>> >>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>> >>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>> >>>> Thanks, >>>> Richard. >>>> >>>> [1] An assertion in Handshake::execute_direct() fails, if called by the VMThread, because it is not a JavaThread. >>>> >>>> -----Original Message----- >>>> From: hotspot-dev On Behalf Of Reingruber, Richard >>>> Sent: Freitag, 14. Februar 2020 19:47 >>>> To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Patricio, >>>> >>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>> > > handshake cannot be nested in a vm operation. 
Maybe it should be asserted in the >>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>> > > >>>> > > > Alternatively I think you could do something similar to what we do in >>>> > > > Deoptimization::deoptimize_all_marked(): >>>> > > > >>>> > > > EnterInterpOnlyModeClosure hs; >>>> > > > if (SafepointSynchronize::is_at_safepoint()) { >>>> > > > hs.do_thread(state->get_thread()); >>>> > > > } else { >>>> > > > Handshake::execute(&hs, state->get_thread()); >>>> > > > } >>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>> > > > HandshakeClosure() constructor) >>>> > > >>>> > > Maybe this could be used also in the Handshake::execute() methods as general solution? >>>> > Right, we could also do that. Avoiding to clear the polling page in >>>> > HandshakeState::clear_handshake() should be enough to fix this issue and >>>> > execute a handshake inside a safepoint, but adding that "if" statement >>>> > in Handshake::execute() sounds good to avoid all the extra code that we >>>> > go through when executing a handshake. I filed 8239084 to make that change. >>>> >>>> Thanks for taking care of this and creating the RFE. >>>> >>>> > >>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>> > > > always called in a nested operation or just sometimes. 
>>>> > > >>>> > > At least one execution path without vm operation exists: >>>> > > >>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void >>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>> > > >>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>> > > encouraged to do it with a handshake :) >>>> > Ah! I think you can still do it with a handshake with the >>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the >>>> > if-else statement with just the Handshake::execute() call in 8239084. >>>> > But up to you. : ) >>>> >>>> Well, I think that's enough encouragement :) >>>> I'll wait for 8239084 and try then again. >>>> (no urgency and all) >>>> >>>> Thanks, >>>> Richard. >>>> >>>> -----Original Message----- >>>> From: Patricio Chilano >>>> Sent: Freitag, 14. Februar 2020 15:54 >>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: >>>>> Hi Patricio, >>>>> >>>>> thanks for having a look. 
>>>>> >>>>> > I'm only commenting on the handshake changes. >>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside >>>>> > operation VM_SetFramePop which also allows nested operations. Here is a >>>>> > comment in VM_SetFramePop definition: >>>>> > >>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>> > >>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>> > could have a handshake inside a safepoint operation. The issue I see >>>>> > there is that at the end of the handshake the polling page of the target >>>>> > thread could be disarmed. So if the target thread happens to be in a >>>>> > blocked state just transiently and wakes up then it will not stop for >>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>> > polling page is armed at the beginning of disarm_safepoint(). >>>>> >>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>>> >>>>> > Alternatively I think you could do something similar to what we do in >>>>> > Deoptimization::deoptimize_all_marked(): >>>>> > >>>>> > EnterInterpOnlyModeClosure hs; >>>>> > if (SafepointSynchronize::is_at_safepoint()) { >>>>> > hs.do_thread(state->get_thread()); >>>>> > } else { >>>>> > Handshake::execute(&hs, state->get_thread()); >>>>> > } >>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>> > HandshakeClosure() constructor) >>>>> >>>>> Maybe this could be used also in the Handshake::execute() methods as general solution? >>>> Right, we could also do that. 
Avoiding to clear the polling page in >>>> HandshakeState::clear_handshake() should be enough to fix this issue and >>>> execute a handshake inside a safepoint, but adding that "if" statement >>>> in Handshake::execute() sounds good to avoid all the extra code that we >>>> go through when executing a handshake. I filed 8239084 to make that change. >>>> >>>>> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>> > always called in a nested operation or just sometimes. >>>>> >>>>> At least one execution path without vm operation exists: >>>>> >>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>> JvmtiEventControllerPrivate::recompute_enabled() : void >>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>> >>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>> handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>>> encouraged to do it with a handshake :) >>>> Ah! I think you can still do it with a handshake with the >>>> Deoptimization::deoptimize_all_marked() like solution. I can change the >>>> if-else statement with just the Handshake::execute() call in 8239084. >>>> But up to you. : ) >>>> >>>> Thanks, >>>> Patricio >>>>> Thanks again, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: Patricio Chilano >>>>> Sent: Donnerstag, 13. 
Februar 2020 18:47 >>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Richard, >>>>> >>>>> I'm only commenting on the handshake changes. >>>>> I see that operation VM_EnterInterpOnlyMode can be called inside >>>>> operation VM_SetFramePop which also allows nested operations. Here is a >>>>> comment in VM_SetFramePop definition: >>>>> >>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>> >>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>> could have a handshake inside a safepoint operation. The issue I see >>>>> there is that at the end of the handshake the polling page of the target >>>>> thread could be disarmed. So if the target thread happens to be in a >>>>> blocked state just transiently and wakes up then it will not stop for >>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>> polling page is armed at the beginning of disarm_safepoint(). >>>>> >>>>> I think one option could be to remove >>>>> SafepointMechanism::disarm_if_needed() in >>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself >>>>> for the handshake case. >>>>> >>>>> Alternatively I think you could do something similar to what we do in >>>>> Deoptimization::deoptimize_all_marked(): >>>>> >>>>> EnterInterpOnlyModeClosure hs; >>>>> if (SafepointSynchronize::is_at_safepoint()) { >>>>> hs.do_thread(state->get_thread()); >>>>> } else { >>>>> Handshake::execute(&hs, state->get_thread()); >>>>> } >>>>> (you could pass "EnterInterpOnlyModeClosure" 
directly to the >>>>> HandshakeClosure() constructor) >>>>> >>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>> always called in a nested operation or just sometimes. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: >>>>>> // Repost including hotspot runtime and gc lists. >>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation >>>>>> // with a handshake. >>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>>>>> >>>>>> Hi, >>>>>> >>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>> >>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>>>>> >>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>>>>> >>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>>> >>>>>> Thanks, Richard. >>>>>> >>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >>>> From martin.doerr at sap.com Mon May 4 16:04:12 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 4 May 2020 16:04:12 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> Message-ID: Hi Nils, thank you for looking at this and sorry for the late reply. 
I've added MaxTrivialSize and also updated the issue accordingly. Makes sense. Do you have more flags in mind? Moving the flags which are only used by C2 into c2_globals definitely makes sense. Done in webrev.01: http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ Please take a look and let me know when my proposal is ready for a CSR. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Nils Eliasson > Sent: Dienstag, 28. April 2020 18:29 > To: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi, > > Thanks for addressing this! This has been an annoyance for a long time. > > Have you thought about including other flags - like MaxTrivialSize? > MaxInlineSize is tested against it. > > Also - you should move the flags that are now c2-only to c2_globals.hpp. > > Best regards, > Nils Eliasson > > On 2020-04-27 15:06, Doerr, Martin wrote: > > Hi, > > > > while tuning inlining parameters for C2 compiler with JDK-8234863 we had > discussed impact on C1. > > I still think it's bad to share them between both compilers. We may want to > do further C2 tuning without negative impact on C1 in the future. > > > > C1 has issues with substantial inlining because of the lack of uncommon > traps. When C1 inlines a lot, stack frames may get large and code cache space > may get wasted for cold or even never executed code. The situation gets > worse when many patching stubs get used for such code. > > > > I had opened the following issue: > > https://bugs.openjdk.java.net/browse/JDK-8235673 > > > > And my initial proposal is here: > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > > > > > > Part of my proposal is to add an additional flag which I called > C1InlineStackLimit to reduce stack utilization for C1 methods. > > I have a simple example which shows wasted stack space (java example > TestStack at the end). 
> > > > It simply counts stack frames until a stack overflow occurs. With the current > implementation, only 1283 frames fit on the stack because the never > executed method bogus_test with local variables gets inlined. > > Reduced C1InlineStackLimit avoids inlining of bogus_test and we get 2310 > frames until stack overflow. (I only used C1 for this example. Can be > reproduced as shown below.) > > > > I didn't notice any performance regression even with the aggressive setting > of C1InlineStackLimit=5 with TieredCompilation. > > > > I know that I'll need a CSR for this change, but I'd like to get feedback in > general and feedback about the flag names before creating a CSR. > > I'd also be glad about feedback regarding the performance impact. > > > > Best regards, > > Martin > > > > > > > > Command line: > > jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > TestStack > > CompileCommand: compileonly TestStack.triggerStackOverflow > > @ 8 TestStack::triggerStackOverflow (15 bytes) recursive > inlining too deep > > @ 11 TestStack::bogus_test (33 bytes) inline > > caught java.lang.StackOverflowError > > 1283 activations were on stack, sum = 0 > > > > jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > TestStack > > CompileCommand: compileonly TestStack.triggerStackOverflow > > @ 8 TestStack::triggerStackOverflow (15 bytes) recursive > inlining too deep > > @ 11 TestStack::bogus_test (33 bytes) callee uses too > much stack > > caught java.lang.StackOverflowError > > 2310 activations were on stack, sum = 0 > > > > > > TestStack.java: > > public class TestStack { > > > > static long cnt = 0, > > sum = 0; > > > > public static void bogus_test() { > > long c1 = 1, c2 = 2, c3 
= 3, c4 = 4; > > sum += c1 + c2 + c3 + c4; > > } > > > > public static void triggerStackOverflow() { > > cnt++; > > triggerStackOverflow(); > > bogus_test(); > > } > > > > > > public static void main(String args[]) { > > try { > > triggerStackOverflow(); > > } catch (StackOverflowError e) { > > System.out.println("caught " + e); > > } > > System.out.println(cnt + " activations were on stack, sum = " + sum); > > } > > } > > From rwestrel at redhat.com Mon May 4 16:24:17 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 04 May 2020 18:24:17 +0200 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: References: <87lfmd8lip.fsf@redhat.com> Message-ID: <87h7wv7jny.fsf@redhat.com> Thanks for the careful review. > It would be good to put this pseudo-code above the definition > of PhaseIdealLoop::is_long_counted_loop, but with names > adjusted to match the code. > > Let me sketch it in terms of the names in the code: > > L: for (long phi = init; phi < limit; phi += stride) { > // phi := Phi(L, init, phi + stride) > ... use phi and (phi + stride) ... > } > > ==transform=> > > const long inner_iters_limit = INT_MAX - stride; > assert(stride <= inner_iters_limit); // else deopt The assertion above is statically determined to be true, right? It's not a runtime check that can cause a deopt. > In this webrev I didn't see a discussion of "what does concrete > mean"; I assume it is elsewhere, and that I would know it if I were > familiar with the loop optimizations (which very few people are!). > Maybe add: > + // See 'GraphKit::add_empty_predicates'. > > (Is a concrete predicate one which was inserted above an empty > place-holder by a predication transformation? In that case, I Yes, they are. I think Christian is the one that introduced the term "concrete" in one of his recent changes.
> might wish to call those "predication tests", or some other term > tied to the name of a specific optimization transform, rather > than "concrete predicates".) There are at least 3 kinds of predicates: the place holder inserted at parse time, the tests added by predication above the place holder (the concrete ones), skeleton predicates that are added between main loop and pre loop so C2 doesn't crash in some rare cases of over unrolling (a recent addition). Skeleton predicates themselves are expanded and updated as unrolling proceeds (creating a 4th kind of predicates?). They don't compile to any code. As you can tell, this has all gotten complicated and confusing and the naming scheme (or lack of naming scheme) hasn't helped. That would need to be revisited. > I think you should also check for "stride_con == 0" here. The 32-bit > code checks for zero strides but I don't see it here. Perhaps that would > fold up later after the 32-bit loop is created? But it seems tidier to keep > the 32-bit and 64-bit code as parallel as possible. Isn't stride_con == 0 actually a bug? IGVN should have folded it but it was missed so rather than silently ignoring it wouldn't it be better to assert stride_con != 0? > The following logic is hard for me to prove correct, and I would prefer it > to miss a few valid cases rather in a simpler form: > + if (phi_incr != NULL && iters_limit <= ABS(stride_con)) { > > I suggest either detuning the check in its current place: > ++ if (iters_limit <= ABS(stride_con)) { I don't think that test is needed actually. If the loop exit test is performed on the iv before increment, I now transform the exit test from iv < limit to iv + stride < limit + stride but before I settled for that I tried having the int counted loop handle that transformation. That test must be a left over from those attempts. > Also, the immediately following range check logic is very odd, > and doesn't correspond to anything in the 32-bit loop code.
> What you appear to be doing here is checking how limit_t > is going to behave with respect to the stride and there are > three possibilities: It can sometimes cause 64-bit overflow, > or it cannot cause 64-bit overflow, or it *must* cause 64-bit > overflow. The two parts of this check are widely separately > and not obviously connected, although they are linked by > a common task of predicate insertion. It's different because the 32 bit loop code checks that both the iv doesn't overflow and that the limit adjustment required when the compare point to the phi in one go. I don't need to the check that iv doesn't overflow (I set the limit of the inner loop so it doesn't). > > The new node inner_iters_max would read a little better with a > comment: > > + _igvn.register_new_node_with_optimizer(inner_iters_max); > ++ // inner_iters_max is MAX(0, adjusted_limit - iv), when stride > 0 > > The definition of adjusted_limit also deserves comment, for that > matter. Perhaps this one, lifted from the 32-bit code: > > // If compare points directly to the phi we need to adjust > // the compare so that it points to the incr. > > In fact, the 32-bit code near that comment should, in my opinion, > also be edited to use a different variable adjusted_limit, even though > that variable will have a short span. There is value to having the > 32-bit and 64-bit code look as similar as possible. The fact that > this comment already occurs twice in the 32-bit code is additional > evidence that there should be an adjusted_limit variable, defined > near the *first* occurrence of that comment, in the 32-bit code. > I think such a cleanup is a desirable way to reduce the cost of > making a new almost-copy of the 32-bit code, in its 64-bit form. 
> > Another milestone in the arithmetic that deserves a comment > is this definition: > > + _igvn.register_new_node_with_optimizer(inner_iters_actual); > ++ // inner_iters_actual is unsigned MIN(inner_iters_max, max_jint - ABS(stride)) > ++ // this is the 32-bit number of iterations to execute in the inner loop > > (Why is it unsigned? I think its operands are never negative.) inner_iters_max is an unsigned integer. If the loop is: for (long l = Long.MIN_VALUE; l < Long.MAX_VALUE; l++) { then inner_iters_max doesn't fit in the signed long integer range. > There's a backtrack at this point: > > + // That fails. Undo graph changes we've done so far. > > I think that should collect a count somewhere, to be reported as part > of statistics. That way, when you run stress tests, you can ensure that > they include this backtrack path. I see that's counted, in part, by the > difference of _long_loops_success and _long_loops, but maybe another > counter here wouldn't be a bad idea, since there is lot of work to undo > at this very late point. One issue with the counters I added is that they are hard to interpret: if the transformation of long loops fails, it's attempted again at every pass of loop opts and so _long_loops is incremented every time. So _long_loops_success < _long_loops tells us there were some failures but not how many. > BTW, I like StressLongCountedLoop a lot; it's a nice test. I suggest > a second stress mode, in which the pinning against the jint range > (in places like max_jint - ABS(stride)) is replaced by pinning against > a value like max_jint/100. The point would be to ensure that both > outer and inner layers of the decomposed loops get a chance to run. > As StressLongCountedLoop is currently formulated, it will only > run the outer loop once, right? The stress mode logic also deserves > a counter, so we can tell how many loops were actually promoted. Right, the outer loop should be executed only once. I'm working on an updated patch. Roland.
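For readers following the thread, here is a minimal Java sketch of the decomposition discussed above: a loop with a long induction variable rewritten as an outer long loop around an inner int loop. It is an illustration only, not code from the webrev or from C2. The class name, the method names and the `cap` parameter are hypothetical; `cap` plays the role of inner_iters_limit and can be set to a small value so that the outer loop runs more than once, in the spirit of the stress mode suggested above. The sketch assumes stride > 0, 0 < cap <= Integer.MAX_VALUE - stride, and that limit + stride does not overflow. The remaining distance is compared as an unsigned value because, as noted in the message, it may not fit in the signed long range.

```java
final class LongLoopSketch {
    // Reference form: a single loop with a long induction variable.
    // Assumes stride > 0 and that limit + stride does not overflow.
    static long sumPlain(long init, long limit, int stride) {
        long sum = 0;
        for (long l = init; l < limit; l += stride) {
            sum += l;
        }
        return sum;
    }

    // Distance the inner int loop may cover when the outer variable is 'outer'.
    // Requires outer < limit. limit - outer can exceed Long.MAX_VALUE
    // (e.g. Long.MIN_VALUE .. Long.MAX_VALUE), so it is compared as unsigned.
    static int innerSpan(long outer, long limit, int cap) {
        long remaining = limit - outer;
        return (int) (Long.compareUnsigned(remaining, cap) < 0 ? remaining : cap);
    }

    // Nested form: outer long loop, inner int loop bounded by 'cap'.
    // Assumes 0 < cap <= Integer.MAX_VALUE - stride, so i += stride cannot overflow.
    static long sumNested(long init, long limit, int stride, int cap) {
        long sum = 0;
        long outer = init;
        while (outer < limit) {
            int span = innerSpan(outer, limit, cap);
            int i = 0;
            for (; i < span; i += stride) {
                sum += outer + i; // the body sees the same values as the plain loop
            }
            outer += i; // advance by a multiple of stride, not by span
        }
        return sum;
    }

    public static void main(String[] args) {
        long plain = sumPlain(-7, 1_000, 3);
        long nested = sumNested(-7, 1_000, 3, 10);
        System.out.println(plain == nested);
    }
}
```

Advancing the outer variable by the final value of i, rather than by the span, keeps every outer start at init plus a multiple of the stride even when the span is not itself a multiple of the stride.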
From vladimir.kozlov at oracle.com Mon May 4 19:01:17 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 4 May 2020 12:01:17 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <25d5e2fa-8909-0b94-e0b2-6b5aaa224492@oracle.com> JIT, AOT, JVMCI and Graal changes seem fine to me. It would be interesting to see shared code execution coverage change. There are places where we use flags and setting instead of #ifdef SPARC which may not be executed now or executed partially. We may simplify such code too. Thanks, Vladimir On 5/3/20 10:12 PM, Mikael Vidstedt wrote: > > Please review this change which implements part of JEP 381: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 > > > Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! > > > Background: > > Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. 
> > For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. > > In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself. > > A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! > > Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. > > Testing: > > A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. > > Cheers, > Mikael > > [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ > From manc at google.com Mon May 4 19:21:00 2020 From: manc at google.com (Man Cao) Date: Mon, 4 May 2020 12:21:00 -0700 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: Message-ID: Hi, Thanks for the review! Yes, the code change is trivial, but runtime behavior change is considerable. In particular, throughput and CPU usage could improve, but code cache usage could increase a lot. In our experience, the improvement in throughput and CPU is well worth the code cache increase. I have attached some benchmarking results in JBS. They are based on JDK11. We have not rolled out this fix to our production JDK11 yet, as I'd like to confirm that this large change in runtime behavior is OK with the OpenJDK community. We are happy to share some performance numbers from production workload once we have them.
-Man On Mon, May 4, 2020 at 4:11 AM Laurent Bourgès wrote: > Hi, > > Do you have performance results to justify your assumption "could > significantly improve performance" ? > > Please share numbers in the jbs bug > > Laurent > > On Sat, May 2, 2020 at 07:35, Man Cao wrote: > >> Hi all, >> >> Can I have reviews for this one-line change that fixes a bug and could >> significantly improve performance? >> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 >> >> It passes tier-1 tests locally, as well as >> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809). >> >> -Man >> > From john.r.rose at oracle.com Mon May 4 20:43:27 2020 From: john.r.rose at oracle.com (John Rose) Date: Mon, 4 May 2020 13:43:27 -0700 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <87h7wv7jny.fsf@redhat.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> Message-ID: <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> On May 4, 2020, at 9:24 AM, Roland Westrelin wrote: > > Thanks for the careful review. You're welcome! > >> It would be good to put this pseudo-code above the definition >> of PhaseIdealLoop::is_long_counted_loop, but with names >> adjusted to match the code. >> >> Let me sketch it in terms of the names in the code: >> >> L: for (long phi = init; phi < limit; phi += stride) { >> // phi := Phi(L, init, phi + stride) >> ... use phi and (phi + stride) ... >> } >> >> ==transform=> >> >> const long inner_iters_limit = INT_MAX - stride; >> assert(stride <= inner_iters_limit); // else deopt > > The assertion above is statically determined to be true, right? It's not > a runtime check that can cause a deopt. Correct, no "else deopt". > >> In this webrev I didn't see a discussion of "what does concrete >> mean"; I assume it is elsewhere, and that I would know it if I were >> familiar with the loop optimizations (which very few people are!).
>> Maybe add: >> + // See 'GraphKit::add_empty_predicates'. >> >> (Is a concrete predicate one which was inserted above an empty >> place-holder by a predication transformation? In that case, I > > Yes, they are. I think Christian is the one that introduced the term > "concrete" in one of his recent changes. > >> might wish to call those "predication tests", or some other term >> tied to the name of a specific optimization transform, rather >> than "concrete predicates".) > > There are at least 3 kinds of predicates: the place holder inserted at parse > time, the tests added by predication above the place holder (the > concrete ones), skeleton predicates that are added between main loop and > pre loop so C2 doesn't crash in some rare cases of over unrolling (a > recent addition). Skeleton predicates themselves are expanded and > updated as unrolling proceeds (creating a 4th kind of predicates?). They > don't compile to any code. As you can tell, this has all gotten > complicated and confusing and the naming scheme (or lack of naming > scheme) hasn't helped. That would need to be revisited. I'm OK with these terms as long as they are defined somewhere in the source code, somewhere discoverable. You could put the above summary into a comment at an appropriate central place and refer to it as needed, and that would make me happy. > >> I think you should also check for "stride_con == 0" here. The 32-bit >> code checks for zero strides but I don't see it here. Perhaps that would >> fold up later after the 32-bit loop is created? But it seems tidier to keep >> the 32-bit and 64-bit code as parallel as possible. > > Isn't stride_con == 0 actually a bug? IGVN should have folded it but it > was missed so rather than silently ignoring it wouldn't it be better to > assert stride_con != 0? Probably a bug, yes. Whether it should be (a) an assert, or (b) a bail-out depends on how much the loop opts phase trusts the IGVN phase.
Sometimes IGVN leaves left-overs, either from bugs or unforeseen corner cases (strange transition states that don't quite fold up). I'll leave it to you which kind of bug this would be. Either way, there should be a line of code in loop opts which defends itself against problems in IGVN. > >> The following logic is hard for me to prove correct, and I would prefer it >> to miss a few valid cases rather in a simpler form: >> + if (phi_incr != NULL && iters_limit <= ABS(stride_con)) { >> >> I suggest either detuning the check in its current place: >> ++ if (iters_limit <= ABS(stride_con)) { > > I don't think that test is needed actually. If the loop exit test is > performed on the iv before increment, I now transform the exit test from > iv < limit to iv + stride < limit + stride > but before I settled for that I tried having the int counted loop handle > that transformation. That test must be a left over from those attempts. That's fine. I didn't prove to myself that the test was needed, let alone correct. But when I looked at the corresponding 32-bit code I found that there were common checks which did not look common at all in the current webrev. My later comments in the email were the result of this growing realization that the algorithms could be made more parallel between 32-bit and 64-bit versions. > >> Also, the immediately following range check logic is very odd, >> and doesn't correspond to anything in the 32-bit loop code. >> What you appear to be doing here is checking how limit_t >> is going to behave with respect to the stride and there are >> three possibilities: It can sometimes cause 64-bit overflow, >> or it cannot cause 64-bit overflow, or it *must* cause 64-bit >> overflow. The two parts of this check are widely separately >> and not obviously connected, although they are linked by >> a common task of predicate insertion.
> > It's different because the 32 bit loop code checks that both the iv > doesn't overflow and that the limit adjustment required when the compare > point to the phi in one go. I don't need to the check that iv doesn't > overflow (I set the limit of the inner loop so it doesn't). I see the difference. I'd still prefer some sort of factored algorithm for overflow checking that is either common or closely parallel (near-duplicate subroutine pair), since this kind of check is best reasoned about as a separate lemma, rather than a small, lost detail in the large mosh-pit of IR transformation logic. > >> >> The new node inner_iters_max would read a little better with a >> comment: >> >> + _igvn.register_new_node_with_optimizer(inner_iters_max); >> ++ // inner_iters_max is MAX(0, adjusted_limit - iv), when stride > 0 >> >> The definition of adjusted_limit also deserves comment, for that >> matter. Perhaps this one, lifted from the 32-bit code: >> >> // If compare points directly to the phi we need to adjust >> // the compare so that it points to the incr. >> >> In fact, the 32-bit code near that comment should, in my opinion, >> also be edited to use a different variable adjusted_limit, even though >> that variable will have a short span. There is value to having the >> 32-bit and 64-bit code look as similar as possible. The fact that >> this comment already occurs twice in the 32-bit code is additional >> evidence that there should be an adjusted_limit variable, defined >> near the *first* occurrence of that comment, in the 32-bit code. >> I think such a cleanup is a desirable way to reduce the cost of >> making a new almost-copy of the 32-bit code, in its 64-bit form.
>> >> Another milestone in the arithmetic that deserves a comment >> is this definition: >> >> + _igvn.register_new_node_with_optimizer(inner_iters_actual); >> ++ // inner_iters_actual is unsigned MIN(inner_iters_max, max_jint - ABS(stride)) >> ++ // this is the 32-bit number of iterations to execute in the inner loop >> >> (Why is it unsigned? I think its operands are never negative.) > > inner_iters_max is an unsigned integer. If the loop is: > > for (long l = Long.MIN_VALUE; l < Long.MAX_VALUE; l++) { > > then inner_iters_max doesn't fit in the signed long integer range. Oops, missed that; thanks. Maybe a one-line comment is useful here? > >> There's a backtrack at this point: >> >> + // That fails. Undo graph changes we've done so far. >> >> I think that should collect a count somewhere, to be reported as part >> of statistics. That way, when you run stress tests, you can ensure that >> they include this backtrack path. I see that's counted, in part, by the >> difference of _long_loops_success and _long_loops, but maybe another >> counter here wouldn't be a bad idea, since there is lot of work to undo >> at this very late point. > > One issue with the counters I added is that they are hard to interpret: > if the transformation of long loops fails, it's attempted again at every > pass of loop opts and so _long_loops is incremented every time. So > _long_loops_success < _long_loops tells us there were some failures but > not how many. Yes, that's true. Idea: Log a message to the compiler log. Then at least the failure events are in context of a particular compilation task. But that's out of character for the loop opts; the only use of C->log() is in loopnode for log_loop_tree. Your call? > >> BTW, I like StressLongCountedLoop a lot; it's a nice test. I suggest >> a second stress mode, in which the pinning against the jint range >> (in places like max_jint - ABS(stride)) is replaced by pinning against >> a value like max_jint/100.
The point would be to ensure that both >> outer and inner layers of the decomposed loops get a chance to run. >> As StressLongCountedLoop is currently formulated, it will only >> run the outer loop once, right? The stress mode logic also deserves >> a counter, so we can tell how many loops were actually promoted. > > Right, the outer loop should be executed only once. So can we arrange to run it more than once, by setting the inner trip count to be smaller? I'm afraid the optimizer could detect a one-trip loop and take it apart (in a later pass), and then the goal of the stress test won't be achieved. > I'm working on an updated patch. Thanks, Roland. -- John From martin.doerr at sap.com Mon May 4 20:47:11 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 4 May 2020 20:47:11 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> Message-ID: Hi lx, the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 builds fine. I've only run the build. No tests. Should this feature be debug-only? Do we want the lengthy code emitted in product build? Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Liu, Xin > Sent: Thursday, 30 April 2020 06:03 > To: hotspot-compiler-dev at openjdk.java.net > Subject: RFR(XS): Provide information when hitting a HaltNode for > architectures other than x86 > > Hi, > > Could you review this small patch? It unifies codegen of HaltNode for other > architectures. > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ > > I tested on aarch64. It generates the same crash report as x86_64 when it > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > I ran hotspot:tier1 on aarch64 fastdebug build.
It passed except for 3 > relevant failures[1]. > > I plan to do that on aarch64 only, but it's trivial on other architectures, so I > bravely modified them all. May I invite s390, SPARC arm32 maintainers take a > look at it? > If it goes through the review, I hope a sponsor can help me to push the > submit repo and see if it works. > > [1] those 3 tests failed on aarch64 with/without my changes. > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 > > thanks, > -lx > From igor.ignatyev at oracle.com Mon May 4 21:29:39 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 4 May 2020 14:29:39 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: Hi Mikael, the changes in /test/ look good to me. I have a question regarding src/jdk.internal.vm.compiler/*, aren't these files part of graal-compiler and hence will be brought back by the next graal update? Thanks, -- Igor > On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: > > > Please review this change which implements part of JEP 381: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 > > > Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! > > > Background: > > Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports.
The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. > > For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. > > In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself. > > A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! > > Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. > > Testing: > > A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed.
> > Cheers, > Mikael > > [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ > From vladimir.kozlov at oracle.com Mon May 4 21:49:47 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 4 May 2020 14:49:47 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <60d3a22f-d4b4-280d-a8a7-1a306b7ad483@oracle.com> I filed Graal issue to change mx script to filter out SPARC code when we do sync Graal changes into JDK. For Graal shared code we may need to have versioning for latest JDK as we do in other cases. Regards, Vladimir On 5/4/20 2:29 PM, Igor Ignatyev wrote: > Hi Mikael, > > the changes in /test/ look good to me. > > I have a question regarding src/jdk.internal.vm.compiler/*, aren't these files part of graal-compiler and hence will be brought back by the next graal update? > > Thanks, > -- Igor > >> On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: >> >> >> Please review this change which implements part of JEP 381: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >> >> >> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >> >> >> Background: >> >> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. 
To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >> >> For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. >> >> In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself. >> >> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! >> >> Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >> >> Testing: >> >> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >> >> Cheers, >> Mikael >> >> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ >> > From xxinliu at amazon.com Mon May 4 23:26:40 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Mon, 4 May 2020 23:26:40 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> Message-ID: <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> Hi, Martin, Thank you for reviewing it and building it on s390 and PPC! If I delete size(4) in ppc.ad, hotspot can work out the correct size of instruction chunk, can't it?
I found that most instructions in ppc.ad have size(xx), but there are a couple of exceptions: cacheWB & cacheWBPreSync.

I think it should be in product hotspot. Here are my arguments.

1. Crash reports from release builds are more common.
Some customers don't even bother trying again with a debug build.

Let me take the crash report on aarch64 as an example. I have pasted a before/after comparison here:
https://bugs.openjdk.java.net/browse/JDK-8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14334977

Without stop(_halt_reason), all we know is "there's a bug in C2-generated code". If the method is big, which is very likely because of inlining, it's easy to get lost.

I feel it's more helpful with the patch. We can locate which HaltNode it is by searching for the _halt_reason string in the codebase.
Hopefully, we can find the culprit method from "Compilation events".

2. HaltNode is rarely generated and will be removed if it's dead.
IMHO, the semantics of that node are "halt". If it survives the optimizer or the lowering to mach nodes, something wrong and unrecoverable has happened in the compiler. After we fix the compiler bug, it should be gone. That is to say, it shouldn't cause any code size problem in the ideal case.

In reality, I observe that a HaltNode always follows the uncommon_trap call. Christian also observed that in JDK-8022574.
Isn't uncommon trap a one-way ticket for all architectures? I believe control flow never returns after uncommon_trap, so why do we generate a HaltNode after it? Nevertheless, that's a separate issue.

Let me make another revision to fix PPC. I also see that sparc.ad is going away.

Thanks,
--lx

On 5/4/20, 1:47 PM, "Doerr, Martin" wrote:

    CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.

    Hi lx,

    the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 builds fine.
I've only run the build. No tests. Should this feature be debug-only? Do we want the lengthy code emitted in product build? Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Liu, Xin > Sent: Donnerstag, 30. April 2020 06:03 > To: hotspot-compiler-dev at openjdk.java.net > Subject: RFR(XS): Provide information when hitting a HaltNode for > architectures other than x86 > > Hi, > > Could you review this small patch? It unifies codegen of HaltNode for other > architectures. > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ > > I tested on aarch64. It generates the same crash report as x86_64 when it > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3 > relevant failures[1]. > > I plan to do that on aarch64 only, but it?s trivial on other architectures, so I > bravely modified them all. May I invite s390, SPARC arm32 maintainers take a > look at it? > If it goes through the review, I hope a sponsor can help me to push the > submit repo and see if it works. > > [1] those 3 tests failed on aarch64 with/without my changes. > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 > > thanks, > -lx > From rwestrel at redhat.com Tue May 5 07:20:12 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 05 May 2020 09:20:12 +0200 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> <87wo5s6tvs.fsf@redhat.com> Message-ID: <87eery7sr7.fsf@redhat.com> Hi Martin, > Given the huge amount of problems related to LSM I wonder if it's worth maintaining it. It comes with a high price. 
And default back to a safepoint in every loop iteration for latency sensitive gcs? People involved with gcs would need to comment on whether they think it's acceptable. Anyway, I disagree with the "huge amount of problems" comment. All non trivial C2 changes that I've been involved with have had a long bug tail. I don't think LSM is better or worse in that regard. > Nevertheless, your version looks at least correct to me. Thanks for the review. > loopnode.cpp: > Please add your formula to the comment. Actually in the review thread for 8223051, John suggested some refactoring that would apply here as well. I'll update this webrev. > TestStripMinedLimitBelowInit.java: > I suggest to use -XX:-TieredCompilation Why that? Roland. From martin.doerr at sap.com Tue May 5 07:23:19 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 5 May 2020 07:23:19 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> Message-ID: Hi lx, > If I delete size(4) in ppc.ad, hotspot can work out the correct size of > instruction chunk, can't it? Yes. > In reality, I observe that a HaltNode always follows the uncommon_trap call. > Christian also observed that in JDK-8022574. > Isn't uncommon trap a one-way ticket for all architectures? I feel the control > flow never returns after uncommon_trap, why do we generate a HaltNode > after that? Nevertheless, it's a separated issue. I think the HaltNode insertion should get removed from GraphKit::uncommon_trap before or with your change. Uncommon_traps are often used. Other usages of HaltNode may be rare enough. (I haven't checked that.) Best regards, Martin > -----Original Message----- > From: Liu, Xin > Sent: Dienstag, 5. 
Mai 2020 01:27 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(XS): Provide information when hitting a HaltNode for > architectures other than x86 > > Hi, Martin, > > Thank you to review it and build it on s390 and PPC! > > If I delete size(4) in ppc.ad, hotspot can work out the correct size of > instruction chunk, can't it? > I found most of instructions in ppc.ad have size(xx), but there're a couple of > exceptions: cacheWB & cacheWBPreSync. > > I think it should be in product hotspot. Here are my arguments. > 1. Crash reports of release build are more common. > Some customers even don't bother trying again with a debug build. > > Let me take the crash report on aarch64 as an example. I paste the > comparison before and after. > https://bugs.openjdk.java.net/browse/JDK- > 8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.syst > em.issuetabpanels%3Acomment-tabpanel#comment-14334977 > > Without stop(_halt_reason), what we know is "there's a bug in C2-generated > code". If the method is big, which is very likely because of inlining, it's easy to > get lost. > > I feel it's more helpful with the patch. We can locate which HaltNode it is by > searching _halt_reason in codebase. > Hopefully, we can find the culprit method from "Compilation events". > > 2. HaltNode is rarely generated and will be removed if it's dead. > IMHO, the semantic of that Node is "halt". If It remains after optimizer or > lowering to mach-nodes, something wrong and unrecoverable happened in > the compilers. After we fix the compiler bug, it should be gone. That's is too > say, it shouldn't cause any problem about code size in ideal cases. > > In reality, I observe that a HaltNode always follows the uncommon_trap call. > Christian also observed that in JDK-8022574. > Isn't uncommon trap a one-way ticket for all architectures? I feel the control > flow never returns after uncommon_trap, why do we generate a HaltNode > after that? 
Nevertheless, it's a separated issue. > > Let me make another revision to fix PPC and I found that sparc.ad is gonna > gone. > > Thanks, > --lx > > > ?On 5/4/20, 1:47 PM, "Doerr, Martin" wrote: > > CAUTION: This email originated from outside of the organization. Do not > click links or open attachments unless you can confirm the sender and know > the content is safe. > > > > Hi lx, > > the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 > builds fine. > I've only run the build. No tests. > > Should this feature be debug-only? > Do we want the lengthy code emitted in product build? > > Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Liu, Xin > > Sent: Donnerstag, 30. April 2020 06:03 > > To: hotspot-compiler-dev at openjdk.java.net > > Subject: RFR(XS): Provide information when hitting a HaltNode for > > architectures other than x86 > > > > Hi, > > > > Could you review this small patch? It unifies codegen of HaltNode for > other > > architectures. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 > > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ > > > > I tested on aarch64. It generates the same crash report as x86_64 when > it > > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3 > > relevant failures[1]. > > > > I plan to do that on aarch64 only, but it?s trivial on other architectures, so I > > bravely modified them all. May I invite s390, SPARC arm32 maintainers > take a > > look at it? > > If it goes through the review, I hope a sponsor can help me to push the > > submit repo and see if it works. > > > > [1] those 3 tests failed on aarch64 with/without my changes. 
> > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 > > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 > > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 > > > > thanks, > > -lx > > > From martin.doerr at sap.com Tue May 5 07:37:01 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 5 May 2020 07:37:01 +0000 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: <87eery7sr7.fsf@redhat.com> References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> <87wo5s6tvs.fsf@redhat.com> <87eery7sr7.fsf@redhat.com> Message-ID: Hi Roland, > Anyway, I disagree with the "huge amount of problems" comment. All non > trivial C2 changes that I've been involved with have had a long bug > tail. I don't think LSM is better or worse in that regard. Well, we have seen a lot of crashes in production since JDK 10. But we don't have to discuss this here. Let's just fix it. > Actually in the review thread for 8223051, John suggested some > refactoring that would apply here as well. I'll update this webrev. Ok. Thanks. > > TestStripMinedLimitBelowInit.java: > > I suggest to use -XX:-TieredCompilation > > Why that? Your test is constructed in a way such that it warms up and creates a highest tier nmethod. Then, this method is called once with other parameters. Why should we create a C1 nmethod, too? -XX:-TieredCompilation makes it easier to follow when using the test case for debugging. Best regards, Martin > -----Original Message----- > From: Roland Westrelin > Sent: Dienstag, 5. Mai 2020 09:20 > To: Doerr, Martin ; Pengfei Li > ; hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: RE: RFR(S): 8244086: Following 8241492, strip mined loop may run > extra iterations > > > Hi Martin, > > > Given the huge amount of problems related to LSM I wonder if it's worth > maintaining it. It comes with a high price. 
>
> And default back to a safepoint in every loop iteration for latency
> sensitive gcs? People involved with gcs would need to comment on whether
> they think it's acceptable.
>
> Anyway, I disagree with the "huge amount of problems" comment. All non
> trivial C2 changes that I've been involved with have had a long bug
> tail. I don't think LSM is better or worse in that regard.
>
> > Nevertheless, your version looks at least correct to me.
>
> Thanks for the review.
>
> > loopnode.cpp:
> > Please add your formula to the comment.
>
> Actually in the review thread for 8223051, John suggested some
> refactoring that would apply here as well. I'll update this webrev.
>
> > TestStripMinedLimitBelowInit.java:
> > I suggest to use -XX:-TieredCompilation
>
> Why that?
>
> Roland.

From xxinliu at amazon.com Tue May 5 09:37:57 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 5 May 2020 09:37:57 +0000
Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag
In-Reply-To:
References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com>
Message-ID:

Hello, David and Nils,

Thank you for reviewing the patch. I brushed up on my English grammar and then updated my patch to rev04.
https://cr.openjdk.java.net/~xliu/8151779/04/webrev/
Here is the incremental diff: https://cr.openjdk.java.net/~xliu/8151779/r3_to_r4.diff
It reflects changes based on David's feedback. I really appreciate that you reviewed so carefully and made so many invaluable suggestions.
TBH, I don't understand Amazon's copyright header either. I chose the simple way to dodge that problem.

Nils raises a very tricky question. Yes, I also noticed that each TriBool takes 4 bytes on x86_64. It's a natural machine word and supposed to be the most efficient form. As a result, the control_words vector takes about 1.3 KB for all intrinsics.
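For scale, a packed layout needs only 2 bits per tri-state value instead of 4 bytes. A minimal sketch of that idea, assuming made-up names (Tri, PackedTriArray); it is an illustration only and not the TriBool code from the patch:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tri-state value; the names are illustrative, not HotSpot's.
enum class Tri : std::uint8_t { Default = 0, False = 1, True = 2 };

// Stores N tri-state values using 2 bits each (4 values per byte).
template <std::size_t N>
class PackedTriArray {
    std::array<std::uint8_t, (N + 3) / 4> _bytes{};  // zero-initialized: all Default

public:
    Tri get(std::size_t i) const {
        const unsigned shift = (i % 4) * 2;          // bit offset inside the byte
        return static_cast<Tri>((_bytes[i / 4] >> shift) & 0x3u);
    }

    void set(std::size_t i, Tri v) {
        const unsigned shift = (i % 4) * 2;
        std::uint8_t& b = _bytes[i / 4];
        // Clear the two bits of slot i, then store the new value there.
        b = static_cast<std::uint8_t>((b & ~(0x3u << shift)) |
                                      (static_cast<unsigned>(v) << shift));
    }

    static constexpr std::size_t size_in_bytes() { return (N + 3) / 4; }
};
```

With N entries this takes (N + 3) / 4 bytes, roughly a 16x reduction over one 4-byte word per entry, at the cost of a shift and a mask on every access.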
I thought it was not a big deal, but Nils pointed out that each DirectiveSet will grow from 128 bytes to 1440 bytes. Theoretically, the user may provide a CompileCommandFile which consists of hundreds of directives. Will hotspot have hundreds of DirectiveSets in that case?

Actually, I do have a compacted container of TriBool. It's like a vector specialization.
https://cr.openjdk.java.net/~xliu/8151779/TriBool.cpp

The reason I didn't include it is that I still feel a few kilobytes of memory are not a big deal. Nowadays, hotspot allows Java programmers to allocate heaps of over 100G. Is it wise to increase software complexity to save KBs?

If you think it matters, I can integrate it. May I update TriBoolArray in a standalone JBS issue?
I have made a lot of changes. I hope I can verify them using KitchenSink.

For the second problem, I think it's because I used 'memset' to initialize an array of objects in rev01. Previously, I had code like this:

  memset(&_intrinsic_control_words[0], 0, sizeof(_intrinsic_control_words));

This kind of usage is rejected with -Werror=class-memaccess in g++-8. It has been fixed since rev02: I now use DirectiveSet::fill_in(). Please take a look.

Thanks,
--lx

From nils.eliasson at oracle.com Tue May 5 09:53:42 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 5 May 2020 11:53:42 +0200
Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags
In-Reply-To:
References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com>
Message-ID:

Hi Martin,

I think it looks good. Please go ahead!

Best regards,
Nils

On 2020-05-04 18:04, Doerr, Martin wrote:
> Hi Nils,
>
> thank you for looking at this and sorry for the late reply.
>
> I've added MaxTrivialSize and also updated the issue accordingly. Makes sense.
> Do you have more flags in mind?
>
> Moving the flags which are only used by C2 into c2_globals definitely makes sense.
> > Done in webrev.01: > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > Please take a look and let me know when my proposal is ready for a CSR. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Nils Eliasson >> Sent: Dienstag, 28. April 2020 18:29 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> Hi, >> >> Thanks for addressing this! This has been an annoyance for a long time. >> >> Have you though about including other flags - like MaxTrivialSize? >> MaxInlineSize is tested against it. >> >> Also - you should move the flags that are now c2-only to c2_globals.hpp. >> >> Best regards, >> Nils Eliasson >> >> On 2020-04-27 15:06, Doerr, Martin wrote: >>> Hi, >>> >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we had >> discussed impact on C1. >>> I still think it's bad to share them between both compilers. We may want to >> do further C2 tuning without negative impact on C1 in the future. >>> C1 has issues with substantial inlining because of the lack of uncommon >> traps. When C1 inlines a lot, stack frames may get large and code cache space >> may get wasted for cold or even never executed code. The situation gets >> worse when many patching stubs get used for such code. >>> I had opened the following issue: >>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>> >>> And my initial proposal is here: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>> >>> >>> Part of my proposal is to add an additional flag which I called >> C1InlineStackLimit to reduce stack utilization for C1 methods. >>> I have a simple example which shows wasted stack space (java example >> TestStack at the end). >>> It simply counts stack frames until a stack overflow occurs. 
With the current >> implementation, only 1283 frames fit on the stack because the never >> executed method bogus_test with local variables gets inlined. >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get 2310 >> frames until stack overflow. (I only used C1 for this example. Can be >> reproduced as shown below.) >>> I didn't notice any performance regression even with the aggressive setting >> of C1InlineStackLimit=5 with TieredCompilation. >>> I know that I'll need a CSR for this change, but I'd like to get feedback in >> general and feedback about the flag names before creating a CSR. >>> I'd also be glad about feedback regarding the performance impact. >>> >>> Best regards, >>> Martin >>> >>> >>> >>> Command line: >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) inline >>> caught java.lang.StackOverflowError >>> 1283 activations were on stack, sum = 0 >>> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too >> much stack >>> caught java.lang.StackOverflowError >>> 2310 activations were on stack, sum = 0 >>> >>> >>> TestStack.java: >>> public class TestStack { >>> >>> static long cnt = 0, >>> sum = 0; >>> >>> public static void bogus_test() { >>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>> sum += c1 + c2 + c3 + c4; >>> } >>> >>> 
public static void triggerStackOverflow() { >>> cnt++; >>> triggerStackOverflow(); >>> bogus_test(); >>> } >>> >>> >>> public static void main(String args[]) { >>> try { >>> triggerStackOverflow(); >>> } catch (StackOverflowError e) { >>> System.out.println("caught " + e); >>> } >>> System.out.println(cnt + " activations were on stack, sum = " + sum); >>> } >>> } >>> From nils.eliasson at oracle.com Tue May 5 10:11:06 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 5 May 2020 12:11:06 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: Message-ID: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> Hi Man, Why do you expect code cache usage would increase a lot? The sweeper still wakes up regularly and cleans the code cache. The code path fixed is just about sweeping extra aggressively under some circumstances. Some nmethod might live a little longer, but they will still be cleaned. Without your bugfix the sweeper will be notified for every new allocation in the codecache as soon as code cache usages has gone beyond 10%. That could in the worst case be one sweep for every allocation. One number you could add to your benchmark numbers is the number of nmethods reclaimed and code cache usage. I expect both to remain the same. Best regards, Nils On 2020-05-04 21:21, Man Cao wrote: > Hi, > > Thanks for the review! > Yes, the code change is trivial, but runtime behavior change is > considerable. > In particular, throughput and CPU usage could improve, but code cache usage > could increase a lot. In our experience, the improvement in throughput and > CPU is well worth the code cache increase. > > I have attached some benchmarking results in JBS. They are based on JDK11. > We have not rolled out this fix to our production JDK11 yet, as I'd like to > confirm that this large change in runtime behavior is OK with the OpenJDK > community. 
> We are happy to share some performance numbers from production workload
> once we have them.
>
> -Man
>
>
> On Mon, May 4, 2020 at 4:11 AM Laurent Bourgès
> wrote:
>
>> Hi,
>>
>> Do you have performance results to justify your assumption "could
>> significantly improve performance" ?
>>
>> Please share numbers in the jbs bug
>>
>> Laurent
>>
>> On Sat, May 2, 2020 at 07:35, Man Cao wrote:
>>
>>> Hi all,
>>>
>>> Can I have reviews for this one-line change that fixes a bug and could
>>> significantly improve performance?
>>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278
>>>
>>> It passes tier-1 tests locally, as well as
>>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809).
>>>
>>> -Man
>>>

From bourges.laurent at gmail.com Tue May 5 10:14:00 2020
From: bourges.laurent at gmail.com (Laurent Bourgès)
Date: Tue, 5 May 2020 12:14:00 +0200
Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps
In-Reply-To:
References:
Message-ID:

Thanks Man for sharing your early benchmark results.

This one-liner fix seems to give nice throughput improvements.

A 40 MB code cache looks quite small compared to the default values (256 MB?).
Could you run one quick experiment with the default code cache settings to check that there is no regression in the main use case?

PS: I am not a reviewer, but I am very involved in Java JIT performance.

Cheers,
Laurent

On Mon, May 4, 2020 at 21:21, Man Cao wrote:

> Hi,
>
> Thanks for the review!
> Yes, the code change is trivial, but runtime behavior change is
> considerable.
> In particular, throughput and CPU usage could improve, but code cache
> usage could increase a lot. In our experience, the improvement in
> throughput and CPU is well worth the code cache increase.
>
> I have attached some benchmarking results in JBS. They are based on JDK11.
> We have not rolled out this fix to our production JDK11 yet, as I'd like > to confirm that this large change in runtime behavior is OK with the > OpenJDK community. > We are happy to share some performance numbers from production workload > once we have them. > > -Man > > > On Mon, May 4, 2020 at 4:11 AM Laurent Bourg?s > wrote: > >> Hi, >> >> Do you have performance results to justify your assumption "could >> significantly improve performance" ? >> >> Please share numbers in the jbs bug >> >> Laurent >> >> Le sam. 2 mai 2020 ? 07:35, Man Cao a ?crit : >> >>> Hi all, >>> >>> Can I have reviews for this one-line change that fixes a bug and could >>> significantly improve performance? >>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 >>> >>> It passes tier-1 tests locally, as well as >>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809). >>> >>> -Man >>> >> From manc at google.com Tue May 5 18:57:10 2020 From: manc at google.com (Man Cao) Date: Tue, 5 May 2020 11:57:10 -0700 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> Message-ID: Hi Laurent, Nils, > Using 40mb code cache looks quite small compared to default values (256mb ?). The 40MB code cache size is for -XX:-TieredCompilation, for the DaCapo runs. "java -XX:-TieredCompilation -XX:+PrintFlagsFinal |& grep ReservedCodeCacheSize" shows the default size is 48MB for -TieredCompilation. So 40MB is not particularly small compared to 48MB. The Blaze runs use "-XX:+TieredCompilation" and a Google-default code cache size of 480MB (we have doubled the default size). This actually tests a case that is closer to the OpenJDK default. > Could you run 1 quick experiment with default code cache settings to see if there is no regression in the main use case? 
Yes, I just launched an experiment running DaCapo at JDK source tip, with only "-Xms4g -Xmx4g" and some logging flags.
This is a meaningful experiment as it uses the real OpenJDK default flags and the most up-to-date source.

> Why do you expect code cache usage would increase a lot? The sweeper
> still wakes up regularly and cleans the code cache. The code path fixed
> is just about sweeping extra aggressively under some circumstances. Some
> nmethod might live a little longer, but they will still be cleaned.

I found that the aggressive sweeping deoptimized a non-trivial number of nmethods, which kept the usage low.
After my bugfix, the sweeper only wakes up when the usage reaches (100-StartAggressiveSweepingAt)% of code cache capacity, which defaults to 90%.
This means many nmethods will not be cleaned until there is pressure on code cache usage.

> One number you could add to your benchmark numbers is the number of
> nmethods reclaimed and code cache usage. I expect both to remain the same.

The benchmark result HTML files already show the code cache usage (Code Cache Used (MB)). You can CTRL-F for "Code Cache" in the browser.
There is a significant increase in code cache usage:
For DaCapo, the increase is up to 13.5MB (for tradesoap) for the 40MB ReservedCodeCacheSize.
For Blaze, the increase is 33MB-80MB for the 480MB ReservedCodeCacheSize.

The code cache usage metric is measured like this at the end of a benchmark run (we added an hsperfdata counter for it):

  size_t result = 0;
  FOR_ALL_ALLOCABLE_HEAPS(heap) {
    result += (*heap)->allocated_capacity();
  }
  _code_cache_used_size->set_value(result);

I also looked at the logs; they show that the bugfix eliminated almost all of the code cache flushes. They also contain the number of nmethods reclaimed.
E.g., for tradesoap:

Without the fix:
  Code cache sweeper statistics:
    Total sweep time:                1902 ms
    Total number of full sweeps:     23301
    Total number of flushed methods: 7080 (thereof 7080 C2 methods)
    Total size of flushed methods:   20468 kB

With the fix:
  Code cache sweeper statistics:
    Total sweep time:                0 ms
    Total number of full sweeps:     0
    Total number of flushed methods: 0 (thereof 0 C2 methods)
    Total size of flushed methods:   0 kB

Anyway, just to reiterate, we think the improvement in throughput and CPU usage is well worth the increase in code cache usage.
If the increase causes any problem, we would advise users to accept the increase and fully provision memory for the entire ReservedCodeCacheSize.

-Man

On Tue, May 5, 2020 at 3:18 AM Nils Eliasson wrote:

> Hi Man,
>
> Why do you expect code cache usage would increase a lot? The sweeper
> still wakes up regularly and cleans the code cache. The code path fixed
> is just about sweeping extra aggressively under some circumstances. Some
> nmethod might live a little longer, but they will still be cleaned.
>
> Without your bugfix the sweeper will be notified for every new
> allocation in the codecache as soon as code cache usages has gone beyond
> 10%. That could in the worst case be one sweep for every allocation.
>
> One number you could add to your benchmark numbers is the number of
> nmethods reclaimed and code cache usage. I expect both to remain the same.
>
> Best regards,
> Nils
>
>
>
> On 2020-05-04 21:21, Man Cao wrote:
> > Hi,
> >
> > Thanks for the review!
> > Yes, the code change is trivial, but runtime behavior change is
> > considerable.
> > In particular, throughput and CPU usage could improve, but code cache usage
> > could increase a lot. In our experience, the improvement in throughput and
> > CPU is well worth the code cache increase.
> >
> > I have attached some benchmarking results in JBS. They are based on JDK11.
> > We have not rolled out this fix to our production JDK11 yet, as I'd like to
> > confirm that this large change in runtime behavior is OK with the OpenJDK
> > community.
> > We are happy to share some performance numbers from production workload
> > once we have them.
> >
> > -Man
> >
> >
> > On Mon, May 4, 2020 at 4:11 AM Laurent Bourgès <bourges.laurent at gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Do you have performance results to justify your assumption "could
> >> significantly improve performance" ?
> >>
> >> Please share numbers in the jbs bug
> >>
> >> Laurent
> >>
> >> On Sat, May 2, 2020 at 07:35, Man Cao wrote:
> >>
> >>> Hi all,
> >>>
> >>> Can I have reviews for this one-line change that fixes a bug and could
> >>> significantly improve performance?
> >>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278
> >>>
> >>> It passes tier-1 tests locally, as well as
> >>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original JDK-8046809).
> >>>
> >>> -Man
> >>>
>
>

From felix.yang at huawei.com Wed May 6 01:32:12 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Wed, 6 May 2020 01:32:12 +0000
Subject: RFR(S): 8244407: JVM crashes after transformation in C2 IdealLoopTree::split_fall_in
Message-ID:

Hi,

Please help review this patch, which fixes a C2 crash.
Bug: https://bugs.openjdk.java.net/browse/JDK-8244407
Webrev: http://cr.openjdk.java.net/~fyang/8244407/webrev.00

After the fix for JDK-8240576, an irreducible loop tree might be structurally changed by split_fall_in() in IdealLoopTree::beautify_loops, but the loop tree is not rebuilt after that.

Take the reported test case as an example. The irreducible loop tree looks like this:

1: Loop: N649/N644

Loop: N649/N644  IRREDUCIBLE       <-- this
  Loop: N649/N797  sfpts={ 683 }

With the fix for JDK-8240576, we won't do merge_many_backedges in IdealLoopTree::beautify_loops for this irreducible loop tree.

  if( _head->req() > 3 && !_irreducible) {
    // Merge the many backedges into a single backedge but leave
    // the hottest backedge as separate edge for the following peel.
    merge_many_backedges( phase );
    result = true;
  }

   N660       N644       N797
     |          |          |
     |          |          |
     |          v          |
     |      +---+---+      |
     +----> + N649  + <----+
            +-------+

 649 Region === 649 660 797 644 [[ .... ]] !jvms: Test::testMethod @ bci:543

Then we come to the children:

  // Now recursively beautify nested loops
  if( _child ) result |= _child->beautify_loops( phase );

2: Loop: N649/N797

Loop: N649/N644  IRREDUCIBLE
  Loop: N649/N797  sfpts={ 683 }   <-- this

After split_fall_in(), N660 and N644 are merged.

  if( fall_in_cnt > 1 )  // Need a loop landing pad to merge fall-ins
    split_fall_in( phase, fall_in_cnt );

   N660                N644
     |                   +
     |                   |
     |                   |
     |   +---------+     |
     +-->+  N946   +<----+
         +----+----+
              |            N797
              |              |
              |              |
              |              |
              |  +--------+  |
              +->+  N649  +<-+
                 +--------+

The loop tree is now structurally changed into:

Loop: N946/N644  IRREDUCIBLE
  Loop: N649/N797  sfpts={ 683 }

But the local variable 'result' in IdealLoopTree::beautify_loops never gets a chance to be set to true, since _head->req() is not bigger than 3 after split_fall_in.
C2 then does not rebuild the loop tree after IdealLoopTree::beautify_loops, which eventually leads to the crash.

Instead of adding extra checks for loop tree structure changes, the proposed fix sets 'result' to true when we meet an irreducible loop with multiple backedges.
This should be safer and simpler (and thus good for JIT compile time).
Tier 1-3 tested on x86-64 and aarch64 Linux.

Comments?
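
To make the sequence above concrete, here is a toy model of the flag logic. It is not HotSpot code: ToyHead, ToyLoop and toy_beautify are made-up names, the loop tree is reduced to sets of node names, and split_fall_in is modelled as replacing all fall-ins with one placeholder predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: names mirror the thread, but none of this is HotSpot code.
struct ToyHead {
    std::vector<std::string> preds;   // predecessors of the shared loop head
    // HotSpot's req() also counts input slot 0, hence the +1.
    std::size_t req() const { return preds.size() + 1; }
};

struct ToyLoop {
    std::set<std::string> members;    // nodes that belong to this loop
    bool irreducible = false;
    ToyLoop* child = nullptr;
};

// Returns true when the caller should rebuild the loop tree.
// 'changed' records whether the head was actually restructured.
bool toy_beautify(ToyLoop& loop, ToyHead& head, bool with_fix, bool& changed) {
    bool result = false;

    // Fall-ins are predecessors that are not members of this loop.
    std::vector<std::string> fall_ins, backedges;
    for (const auto& p : head.preds)
        (loop.members.count(p) ? backedges : fall_ins).push_back(p);

    if (fall_ins.size() > 1) {        // stands in for split_fall_in()
        head.preds = backedges;
        head.preds.insert(head.preds.begin(), "landing_pad");
        changed = true;               // structure changed, 'result' untouched
    }

    if (head.req() > 3) {
        // Before the fix, only reducible loops reach 'result = true' here
        // (the toy skips the backedge merge itself).
        if (!loop.irreducible || with_fix) result = true;
    }

    if (loop.child) result |= toy_beautify(*loop.child, head, with_fix, changed);
    return result;
}
```

Run on the shape from the test case (head predecessors N660, N797, N644; outer loop irreducible with members N644 and N797; inner loop with member N797), the model without the fix restructures the head yet returns false, while with the fix it returns true because the outer irreducible loop already sees more than two predecessors.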
Thanks, Felix From ningsheng.jian at arm.com Wed May 6 06:35:06 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Wed, 6 May 2020 14:35:06 +0800 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> Message-ID: <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Hi Xin, Martin's review comments reminds me that we should worry about code size increase on AArch64 with your patch, given that the HaltNode will be generated in many cases now. Currently in AArch64 MacroAssembler::stop(), it will generate many register saving instructions by pusha() before calling to debug64(). But I think debug64() only uses the regs[] and pc arguments when ShowMessageBoxOnError is on. Maybe we should at least only do the saving and pc arg passing when ShowMessageBoxOnError is on in MacroAssembler::stop(), as what x86 does in macroAssembler_x86.cpp? Thanks, Ningsheng On 5/5/20 7:26 AM, Liu, Xin wrote: > Hi, Martin, > > Thank you to review it and build it on s390 and PPC! > > If I delete size(4) in ppc.ad, hotspot can work out the correct size of instruction chunk, can't it? > I found most of instructions in ppc.ad have size(xx), but there're a couple of exceptions: cacheWB & cacheWBPreSync. > > I think it should be in product hotspot. Here are my arguments. > 1. Crash reports of release build are more common. > Some customers even don't bother trying again with a debug build. > > Let me take the crash report on aarch64 as an example. I paste the comparison before and after. > https://bugs.openjdk.java.net/browse/JDK-8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14334977 > > Without stop(_halt_reason), what we know is "there's a bug in C2-generated code". 
If the method is big, which is very likely because of inlining, it's easy to get lost. > > I feel it's more helpful with the patch. We can locate which HaltNode it is by searching _halt_reason in codebase. > Hopefully, we can find the culprit method from "Compilation events". > > 2. HaltNode is rarely generated and will be removed if it's dead. > IMHO, the semantic of that Node is "halt". If It remains after optimizer or lowering to mach-nodes, something wrong and unrecoverable happened in the compilers. After we fix the compiler bug, it should be gone. That's is too say, it shouldn't cause any problem about code size in ideal cases. > > In reality, I observe that a HaltNode always follows the uncommon_trap call. Christian also observed that in JDK-8022574. > Isn't uncommon trap a one-way ticket for all architectures? I feel the control flow never returns after uncommon_trap, why do we generate a HaltNode after that? Nevertheless, it's a separated issue. > > Let me make another revision to fix PPC and I found that sparc.ad is gonna gone. > > Thanks, > --lx > > > ?On 5/4/20, 1:47 PM, "Doerr, Martin" wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > Hi lx, > > the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 builds fine. > I've only run the build. No tests. > > Should this feature be debug-only? > Do we want the lengthy code emitted in product build? > > Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Liu, Xin > > Sent: Donnerstag, 30. April 2020 06:03 > > To: hotspot-compiler-dev at openjdk.java.net > > Subject: RFR(XS): Provide information when hitting a HaltNode for > > architectures other than x86 > > > > Hi, > > > > Could you review this small patch? It unifies codegen of HaltNode for other > > architectures. 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 > > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ > > > > I tested on aarch64. It generates the same crash report as x86_64 when it > > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3 > > relevant failures[1]. > > > > I plan to do that on aarch64 only, but it?s trivial on other architectures, so I > > bravely modified them all. May I invite s390, SPARC arm32 maintainers take a > > look at it? > > If it goes through the review, I hope a sponsor can help me to push the > > submit repo and see if it works. > > > > [1] those 3 tests failed on aarch64 with/without my changes. > > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 > > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 > > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 > > > > thanks, > > -lx > > > > From xxinliu at amazon.com Wed May 6 07:25:18 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Wed, 6 May 2020 07:25:18 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: ?On 5/5/20, 11:35 PM, "Ningsheng Jian" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Hi Xin, Martin's review comments reminds me that we should worry about code size increase on AArch64 with your patch, given that the HaltNode will be generated in many cases now. > Yes, I inspected aarch64's stop() too. HaltNode will increase from 1 instruction to 7 instructions. I acknowledge that code size is a big issue for both aarch64 and ppc. 
I agree with Martin, so I'd like to work on JDK-8022574 first. I think It's feasible. I have 2 ideas. 1) get rid of the HaltNode after the callnode of uncommon_trap. 2) give that special HaltNode a mark, so codegen can skip stop() for it. Currently in AArch64 MacroAssembler::stop(), it will generate many register saving instructions by pusha() before calling to debug64(). But I think debug64() only uses the regs[] and pc arguments when ShowMessageBoxOnError is on. Maybe we should at least only do the saving and pc arg passing when ShowMessageBoxOnError is on in MacroAssembler::stop(), as what x86 does in macroAssembler_x86.cpp? Thanks, Ningsheng > It makes sense, I will add if (ShowMessageBoxOnError) when I come back to this issue. Thanks, --lx On 5/5/20 7:26 AM, Liu, Xin wrote: > Hi, Martin, > > Thank you to review it and build it on s390 and PPC! > > If I delete size(4) in ppc.ad, hotspot can work out the correct size of instruction chunk, can't it? > I found most of instructions in ppc.ad have size(xx), but there're a couple of exceptions: cacheWB & cacheWBPreSync. > > I think it should be in product hotspot. Here are my arguments. > 1. Crash reports of release build are more common. > Some customers even don't bother trying again with a debug build. > > Let me take the crash report on aarch64 as an example. I paste the comparison before and after. > https://bugs.openjdk.java.net/browse/JDK-8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14334977 > > Without stop(_halt_reason), what we know is "there's a bug in C2-generated code". If the method is big, which is very likely because of inlining, it's easy to get lost. > > I feel it's more helpful with the patch. We can locate which HaltNode it is by searching _halt_reason in codebase. > Hopefully, we can find the culprit method from "Compilation events". > > 2. HaltNode is rarely generated and will be removed if it's dead. 
> IMHO, the semantic of that Node is "halt". If It remains after optimizer or lowering to mach-nodes, something wrong and unrecoverable happened in the compilers. After we fix the compiler bug, it should be gone. That's is too say, it shouldn't cause any problem about code size in ideal cases. > > In reality, I observe that a HaltNode always follows the uncommon_trap call. Christian also observed that in JDK-8022574. > Isn't uncommon trap a one-way ticket for all architectures? I feel the control flow never returns after uncommon_trap, why do we generate a HaltNode after that? Nevertheless, it's a separated issue. > > Let me make another revision to fix PPC and I found that sparc.ad is gonna gone. > > Thanks, > --lx > > > On 5/4/20, 1:47 PM, "Doerr, Martin" wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > Hi lx, > > the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 builds fine. > I've only run the build. No tests. > > Should this feature be debug-only? > Do we want the lengthy code emitted in product build? > > Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Liu, Xin > > Sent: Donnerstag, 30. April 2020 06:03 > > To: hotspot-compiler-dev at openjdk.java.net > > Subject: RFR(XS): Provide information when hitting a HaltNode for > > architectures other than x86 > > > > Hi, > > > > Could you review this small patch? It unifies codegen of HaltNode for other > > architectures. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 > > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ > > > > I tested on aarch64. It generates the same crash report as x86_64 when it > > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > > I ran hotspot:tier1 on aarch64 fastdebug build. 
It passed except for 3 > > relevant failures[1]. > > > > I plan to do that on aarch64 only, but it?s trivial on other architectures, so I > > bravely modified them all. May I invite s390, SPARC arm32 maintainers take a > > look at it? > > If it goes through the review, I hope a sponsor can help me to push the > > submit repo and see if it works. > > > > [1] those 3 tests failed on aarch64 with/without my changes. > > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 > > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 > > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 > > > > thanks, > > -lx > > > > From christian.hagedorn at oracle.com Wed May 6 07:26:57 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 6 May 2020 09:26:57 +0200 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> Hi Ningsheng On 06.05.20 08:35, Ningsheng Jian wrote: > Currently in AArch64 MacroAssembler::stop(), it will generate many > register saving instructions by pusha() before calling to debug64(). But > I think debug64() only uses the regs[] and pc arguments when > ShowMessageBoxOnError is on. Maybe we should at least only do the saving > and pc arg passing when ShowMessageBoxOnError is on in > MacroAssembler::stop(), as what x86 does in macroAssembler_x86.cpp? That's a good point. The change from Assembler::ud2() to MacroAssembler::stop() for a HaltNode in JDK-8225653 indeed resulted in some performance regressions for x86 since MacroAssembler::debug64() set up those arguments [1]. Especially, it called Assembler::pusha() for 64-bit which emits a move instructions for each of the 16 registers. 
Even though the code was never called, it had a bad influence on code cache space which had a bad influence on performance. The fix was to only do it if ShowMessageBoxOnError was set. So, I think it's indeed a good idea to also do the same for other architectures. Best regards, Christian [1] https://bugs.openjdk.java.net/browse/JDK-8231720 > > > Thanks, > Ningsheng > > On 5/5/20 7:26 AM, Liu, Xin wrote: >> Hi, Martin, >> >> Thank you to review it and build it on s390 and PPC! >> >> If I delete size(4) in ppc.ad,? hotspot can work out the correct size >> of instruction chunk, can't it? >> I found most of instructions in ppc.ad have size(xx), but there're a >> couple of exceptions:? cacheWB & cacheWBPreSync. >> >> I think it should be in product hotspot.? Here are my arguments. >> 1.? Crash reports of release build are more common. >> Some customers even don't bother trying again with a debug build. >> >> Let me take the crash report on aarch64 as an example. I paste the >> comparison before and after. >> https://bugs.openjdk.java.net/browse/JDK-8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14334977 >> >> >> Without stop(_halt_reason), what we know is "there's a bug in >> C2-generated code". If the method is big,? which is very likely >> because of inlining, it's easy to get lost. >> >> I feel it's more helpful with the patch. We can locate which HaltNode >> it is by searching _halt_reason in codebase. >> Hopefully, we can find the culprit method from "Compilation events". >> >> 2. HaltNode is rarely generated and will be removed if it's dead. >> IMHO, the semantic of that Node is "halt". If It remains after >> optimizer or lowering to mach-nodes,? something wrong and >> unrecoverable happened in the compilers. After we fix the compiler >> bug,? it should be gone.? That's is too say, it shouldn't cause any >> problem about code size in ideal cases. 
>> >> In reality, I observe that a HaltNode always follows the uncommon_trap >> call.? Christian also observed that in JDK-8022574. >> Isn't uncommon trap a one-way ticket for all architectures? I feel the >> control flow never returns after uncommon_trap, why do we generate a >> HaltNode after that? Nevertheless, it's a separated issue. >> >> Let me make another revision to fix PPC and I found that sparc.ad is >> gonna gone. >> >> Thanks, >> --lx >> >> >> ?On 5/4/20, 1:47 PM, "Doerr, Martin" wrote: >> >> ???? CAUTION: This email originated from outside of the organization. >> Do not click links or open attachments unless you can confirm the >> sender and know the content is safe. >> >> >> >> ???? Hi lx, >> >> ???? the size attribute is wrong on PPC64 (stop is larger than 4 >> Bytes). S390 builds fine. >> ???? I've only run the build. No tests. >> >> ???? Should this feature be debug-only? >> ???? Do we want the lengthy code emitted in product build? >> >> ???? Best regards, >> ???? Martin >> >> >> ???? > -----Original Message----- >> ???? > From: hotspot-compiler-dev > ???? > bounces at openjdk.java.net> On Behalf Of Liu, Xin >> ???? > Sent: Donnerstag, 30. April 2020 06:03 >> ???? > To: hotspot-compiler-dev at openjdk.java.net >> ???? > Subject: RFR(XS): Provide information when hitting a HaltNode for >> ???? > architectures other than x86 >> ???? > >> ???? > Hi, >> ???? > >> ???? > Could you review this small patch?? It unifies codegen of >> HaltNode for other >> ???? > architectures. >> ???? > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552 >> ???? > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/ >> ???? > >> ???? > I tested on aarch64.? It generates the same crash report as >> x86_64 when it >> ???? > does hit HaltNode.? Halt reason is displayed. I paste report on >> the JBS. >> ???? > I ran hotspot:tier1 on aarch64 fastdebug build.? It passed >> except for 3 >> ???? > relevant failures[1]. >> ???? > >> ???? 
> I plan to do that on aarch64 only, but it?s trivial on other >> architectures, so I >> ???? > bravely modified them all.? May I invite s390, SPARC arm32 >> maintainers take a >> ???? > look at it? >> ???? > If it goes through the review, I hope a sponsor can help me to >> push the >> ???? > submit repo and see if it works. >> ???? > >> ???? > [1] those 3 tests failed on aarch64 with/without my changes. >> ???? > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2 >> ???? > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1 >> ???? > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1 >> ???? > >> ???? > thanks, >> ???? > -lx >> ???? > >> >> > From aph at redhat.com Wed May 6 08:19:26 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 6 May 2020 09:19:26 +0100 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: On 5/6/20 8:25 AM, Liu, Xin wrote: > Currently in AArch64 MacroAssembler::stop(), it will generate many > register saving instructions by pusha() before calling to debug64(). But > I think debug64() only uses the regs[] and pc arguments when > ShowMessageBoxOnError is on. Maybe we should at least only do the saving > and pc arg passing when ShowMessageBoxOnError is on in > MacroAssembler::stop(), as what x86 does in macroAssembler_x86.cpp? Maybe we should think about a better way to do it. All we have to do, after all, is put the reason into, say, r8, and execute a trap. We don't need to push and pop anything because the trap handler will do that. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed May 6 08:40:40 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 6 May 2020 09:40:40 +0100 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> Message-ID: <55cd970b-5a4e-4d1a-e17c-77289a412022@redhat.com> On 5/6/20 8:26 AM, Christian Hagedorn wrote: > The fix was to > only do it if ShowMessageBoxOnError was set. So, I think it's indeed a > good idea to also do the same for other architectures. Really? Can't we fix stop() so that it generates only a tiny little bit of code? I can do it if you like. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Yang.Zhang at arm.com Wed May 6 08:46:28 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Wed, 6 May 2020 08:46:28 +0000 Subject: [aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs Message-ID: Hi, Could you please help to review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8243597 Webrev: http://cr.openjdk.java.net/~yzhang/8243597/webrev.00/ In JDK-8222074 [1], x86 enables auto vectorization for integer vector abs, and jtreg tests are also added. In this patch, the missing AbsVB/S/I/L support for AArch64 is added. 
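For reference, the kind of loop that this auto-vectorization targets can be sketched for the int and long element types as follows. This is only an illustrative sketch under assumptions: the class name AbsLoops and the methods absvi and absvl are hypothetical and are not taken from the jtreg tests, and whether C2 actually emits a NEON abs instruction for them depends on the JVM build, the platform and the flags in use.

```java
import java.util.Arrays;

public class AbsLoops {
    // Hypothetical int variant: c[i] = |a[i] + b[i]|.
    public static void absvi(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = Math.abs(a[i] + b[i]);
        }
    }

    // Hypothetical long variant of the same loop shape.
    public static void absvl(long[] a, long[] b, long[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = Math.abs(a[i] + b[i]);
        }
    }

    public static void main(String[] args) {
        int[] ia = {1, -5, 7, -9};
        int[] ib = {-3, 2, -7, 4};
        int[] ic = new int[ia.length];
        absvi(ia, ib, ic);
        System.out.println(Arrays.toString(ic));

        long[] la = {10L, -20L};
        long[] lb = {-30L, 5L};
        long[] lc = new long[la.length];
        absvl(la, lb, lc);
        System.out.println(Arrays.toString(lc));
    }
}
```

The loop bodies are deliberately simple (one add, one abs, one store) so that the vectorizer has a straight-line body to work with. Note that Math.abs returns the argument unchanged for Integer.MIN_VALUE and Long.MIN_VALUE, so any vectorized code has to preserve that wrap-around behaviour.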
Testing:
Full jtreg test
Vector API tests which cover vector abs

Test case:
public static void absvs(short[] a, short[] b, short[] c) {
    for (int i = 0; i < a.length; i++) {
        c[i] = (short)Math.abs((a[i] + b[i]));
    }
}

Assembly code generated by C2:
0x0000ffffaca3f3ac: ldr q17, [x16, #16]
0x0000ffffaca3f3b0: ldr q16, [x15, #16]
0x0000ffffaca3f3b4: add v16.8h, v16.8h, v17.8h
0x0000ffffaca3f3b8: abs v16.8h, v16.8h
0x0000ffffaca3f3c0: str q16, [x12, #16]

Similar test cases for byte/int/long are also tested and NEON abs instruction is generated by C2.

Performance:
JMH tests are uploaded.
http://cr.openjdk.java.net/~yzhang/8243597/TestScalar.java
http://cr.openjdk.java.net/~yzhang/8243597/TestVect.java

Vector abs:
Before:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB    1024  avgt    5  1041.720 ± 2.606  us/op
TestVect.testVectAbsVI    1024  avgt    5   659.788 ± 2.057  us/op
TestVect.testVectAbsVL    1024  avgt    5   711.043 ± 5.489  us/op
TestVect.testVectAbsVS    1024  avgt    5   659.157 ± 2.531  us/op

After:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB    1024  avgt    5    88.821 ± 1.886  us/op
TestVect.testVectAbsVI    1024  avgt    5   199.081 ± 2.539  us/op
TestVect.testVectAbsVL    1024  avgt    5   447.536 ± 1.195  us/op
TestVect.testVectAbsVS    1024  avgt    5   119.172 ± 0.340  us/op

Scalar abs:
Before:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI       1024  avgt    5  3770.345 ± 6.760  us/op
TestScalar.testAbsL       1024  avgt    5  3767.570 ± 9.097  us/op

After:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI       1024  avgt    5  3141.312 ± 2.000  us/op
TestScalar.testAbsL       1024  avgt    5  3103.143 ± 8.989  us/op

[1] https://bugs.openjdk.java.net/browse/JDK-8222074

Regards
Yang

From christian.hagedorn at oracle.com Wed May 6 09:14:16 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 6 May 2020 11:14:16 +0200 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <55cd970b-5a4e-4d1a-e17c-77289a412022@redhat.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> <55cd970b-5a4e-4d1a-e17c-77289a412022@redhat.com> Message-ID: <2c9c0c59-73cc-4c88-c568-deba5d73e4f5@oracle.com>

Hi Andrew

On 5/6/20 8:26 AM, Christian Hagedorn wrote
>> [..] since MacroAssembler::debug64() set up those arguments [..]

Sorry, I mixed that up. I meant that MacroAssembler::stop() on x86 emitted those additional instructions as arguments needed by debug64(). This was then guarded by ShowMessageBoxOnError to only emit the message argument for debug64() by default.

On 06.05.20 10:40, Andrew Haley wrote:
> On 5/6/20 8:26 AM, Christian Hagedorn wrote
>> The fix was to
>> only do it if ShowMessageBoxOnError was set. So, I think it's indeed a
>> good idea to also do the same for other architectures.
>
> Really? Can't we fix stop() so that it generates only a tiny little bit
> of code? I can do it if you like.

This is probably what you've meant. I agree with you, the fewer instructions emitted by stop() the better.

Best regards,
Christian

From bourges.laurent at gmail.com Wed May 6 09:34:07 2020 From: bourges.laurent at gmail.com (Laurent Bourgès) Date: Wed, 6 May 2020 11:34:07 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> Message-ID:

Thanks Man for your results.
I will try your fix on jdk15 repo and run my Marlin tests & benchmark to see if there are any gain in these cases. Cheers, Laurent Le mar. 5 mai 2020 ? 21:00, Man Cao a ?crit : > Hi Laurent, Nils, > > > Using 40mb code cache looks quite small compared to default values (256mb > ?). > The 40MB code cache size is for -XX:-TieredCompilation, for the DaCapo > runs. > "java -XX:-TieredCompilation -XX:+PrintFlagsFinal |& grep > ReservedCodeCacheSize" shows the default size is 48MB for > -TieredCompilation. > So 40MB is not particularly small compared to 48MB. > > The Blaze runs use "-XX:+TieredCompilation" and a Google-default code cache > size of 480MB (we have doubled the default size). > This actually tests a case that is closer to the OpenJDK default. > > > Could you run 1 quick experiment with default code cache settings to see > if there is no regression in the main use case? > Yes, I just launched an experiment running DaCapo at JDK source tip, only > with "-Xms4g -Xmx4g" and some logging flags. > This is a meaningful experiment as it uses the real OpenJDK default flags, > and the most up-to-date source. > > > Why do you expect code cache usage would increase a lot? The sweeper > > still wakes up regularly and cleans the code cache. The code path fixed > > is just about sweeping extra aggressively under some circumstances. Some > > nmethod might live a little longer, but they will still be cleaned. > I found the aggressive sweeping deoptimized a non-trivial number of > nmethods, which kept the usage low. > After my bugfix, the sweeper only wakes up when the usage reaches > (100-StartAggressiveSweepingAt)% of code cache capacity, which default to > 90%. > This means many nmethods will not be cleaned until there is pressure in > code cache usage. > > > One number you could add to your benchmark numbers is the number of > > nmethods reclaimed and code cache usage. I expect both to remain the > same. 
> The benchmark result htmls already show the code cache usage (Code Cache > Used (MB)). > You can CTRL-F for "Code Cache" in the browser. > > There is a significant increase in code cache usage: > For DaCapo, up to 13.5MB (for tradesoap) increase for the 40MB > ReservedCodeCacheSize. > For Blaze, the increase is 33MB-80MB for the 480MB ReservedCodeCacheSize. > > The code cache usage metric is measure like this (we added an hsperfdata > counter for it), at the end of a benchmark run: > size_t result = 0; > FOR_ALL_ALLOCABLE_HEAPS(heap) { > result += (*heap)->allocated_capacity(); > } > _code_cache_used_size->set_value(result); > > I also looked at the logs, it shows that the bugfix eliminated almost all > of the code cache flushes. They also contain the number of nmethods > reclaimed. > E.g., for tradesoap: > > Without the fix: > Code cache sweeper statistics: > > Total sweep time: 1902 ms > Total number of full sweeps: 23301 > Total number of flushed methods: 7080 (thereof 7080 C2 methods) > Total size of flushed methods: 20468 kB > > With the fix: > > Code cache sweeper statistics: > Total sweep time: 0 ms > Total number of full sweeps: 0 > Total number of flushed methods: 0 (thereof 0 C2 methods) > Total size of flushed methods: 0 kB > > > Anyway, just to reiterate, we think the improvement in throughput and CPU > usage is well worth the increase in code cache usage. > If the increase causes any problem, we would advise users to accept the > increase and fully provision memory for the entire ReservedCodeCacheSize. > > -Man > > > On Tue, May 5, 2020 at 3:18 AM Nils Eliasson > wrote: > > > Hi Man, > > > > Why do you expect code cache usage would increase a lot? The sweeper > > still wakes up regularly and cleans the code cache. The code path fixed > > is just about sweeping extra aggressively under some circumstances. Some > > nmethod might live a little longer, but they will still be cleaned. 
> > > > Without your bugfix the sweeper will be notified for every new > > allocation in the codecache as soon as code cache usages has gone beyond > > 10%. That could in the worst case be one sweep for every allocation. > > > > One number you could add to your benchmark numbers is the number of > > nmethods reclaimed and code cache usage. I expect both to remain the > same. > > > > Best regards, > > Nils > > > > > > > > On 2020-05-04 21:21, Man Cao wrote: > > > Hi, > > > > > > Thanks for the review! > > > Yes, the code change is trivial, but runtime behavior change is > > > considerable. > > > In particular, throughput and CPU usage could improve, but code cache > > usage > > > could increase a lot. In our experience, the improvement in throughput > > and > > > CPU is well worth the code cache increase. > > > > > > I have attached some benchmarking results in JBS. They are based on > > JDK11. > > > We have not rolled out this fix to our production JDK11 yet, as I'd > like > > to > > > confirm that this large change in runtime behavior is OK with the > OpenJDK > > > community. > > > We are happy to share some performance numbers from production workload > > > once we have them. > > > > > > -Man > > > > > > > > > On Mon, May 4, 2020 at 4:11 AM Laurent Bourg?s < > > bourges.laurent at gmail.com> > > > wrote: > > > > > >> Hi, > > >> > > >> Do you have performance results to justify your assumption "could > > >> significantly improve performance" ? > > >> > > >> Please share numbers in the jbs bug > > >> > > >> Laurent > > >> > > >> Le sam. 2 mai 2020 ? 07:35, Man Cao a ?crit : > > >> > > >>> Hi all, > > >>> > > >>> Can I have reviews for this one-line change that fixes a bug and > could > > >>> significantly improve performance? 
> > >>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ > > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 > > >>> > > >>> It passes tier-1 tests locally, as well as > > >>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original > > JDK-8046809). > > >>> > > >>> -Man > > >>> > > > > > From rwestrel at redhat.com Wed May 6 09:33:59 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 06 May 2020 11:33:59 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop Message-ID: <871rnx76go.fsf@redhat.com> https://bugs.openjdk.java.net/browse/JDK-8244504 http://cr.openjdk.java.net/~roland/8244504/webrev.00/ This is some refactoring in the counted loop code to prepare for 8223051 (support loops with long (64b) trip counts). Some of the changes came up in the review of 8223051 (that patch used to be part of 8223051). Roland. From rwestrel at redhat.com Wed May 6 09:41:51 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 06 May 2020 11:41:51 +0200 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> Message-ID: <87y2q55rj4.fsf@redhat.com> > So can we arrange to run it more than once, by setting the inner trip > count to be smaller? I'm afraid the optimizer could detect a one-trip loop > and take it apart (in a later pass), and then the goal of the stress test > won't be achieved. Sure. Here is a new patch that should take all of your comments into account: http://cr.openjdk.java.net/~roland/8223051/webrev.01/ I split the change in 2. The counted loop refactoring code is out for review under 8244504 (I just sent it out for review). Your review of that other one is very much welcome. Roland.
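To make the stress-test concern above concrete, here is a source-level sketch of a long loop rewritten as a loop nest with an artificially small inner trip limit, so that the outer loop runs more than once. It is only an illustration under assumptions: C2 performs this transformation on its IR rather than on Java source, the sketch advances the outer variable by the inner counter after the inner loop (a variation, not the exact pseudo-code from this thread), and the names LongLoopNest, sumSingleLoop, sumLoopNest and innerLimit are hypothetical. It assumes a positive stride that fits in an int, an innerLimit between 1 and Integer.MAX_VALUE - stride, and bounds that do not overflow.

```java
public class LongLoopNest {
    // Reference: a single loop with a long induction variable.
    static long sumSingleLoop(long start, long stop, long stride) {
        long sum = 0;
        for (long l = start; l < stop; l += stride) {
            sum += l;
        }
        return sum;
    }

    // Same computation as a nest: long outer loop, int inner loop.
    // innerLimit (hypothetical knob) caps how far one inner loop may advance.
    static long sumLoopNest(long start, long stop, long stride, int innerLimit) {
        long sum = 0;
        int intStride = (int) stride;
        for (long l = start; l < stop; ) {
            // Distance the inner loop may cover in this strip.
            int intStop = (int) Math.min(stop - l, (long) innerLimit);
            int i = 0;
            for (; i < intStop; i += intStride) {
                sum += l + i;
            }
            // i is now the first multiple of the stride >= intStop,
            // so the outer variable stays aligned with the original loop.
            l += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = 1L << 40;
        long stop = start + 1000;
        System.out.println(sumSingleLoop(start, stop, 7));
        System.out.println(sumLoopNest(start, stop, 7, 64));
    }
}
```

With innerLimit set to 64 and a range of 1000, the outer loop in this sketch runs many times, which is the property a stress mode would want so that neither loop level degenerates into a single trip.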
From nils.eliasson at oracle.com Wed May 6 09:52:05 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 6 May 2020 11:52:05 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> Message-ID: On 2020-05-05 20:57, Man Cao wrote: > Hi Laurent, Nils, > > > Using 40mb code cache looks quite small compared to default values > (256mb ?). > The 40MB code cache size is for -XX:-TieredCompilation, for the DaCapo > runs. > "java -XX:-TieredCompilation -XX:+PrintFlagsFinal |& grep > ReservedCodeCacheSize" shows the default size is 48MB for > -TieredCompilation. > So 40MB is not particularly small compared to 48MB. > > The Blaze runs use "-XX:+TieredCompilation" and a Google-default?code > cache size of 480MB (we have doubled the default size). > This actually tests a case that is closer to the OpenJDK default. > > > Could you run 1 quick experiment with default code cache settings to > see if there is no regression in the main use case? > Yes, I just launched an experiment running DaCapo at JDK source tip, > only with "-Xms4g -Xmx4g" and some logging flags. > This is a meaningful experiment as it uses the real OpenJDK default > flags, and the most up-to-date source. > > > Why do you expect code cache usage would increase a lot? The sweeper > > still wakes up regularly and cleans the code cache. The code path fixed > > is just about sweeping extra aggressively under some circumstances. Some > > nmethod might live a little longer, but they will still be cleaned. > I found the aggressive sweeping deoptimized a non-trivial number of > nmethods, which kept the usage low. > After my bugfix, the sweeper only wakes up when the usage reaches > (100-StartAggressiveSweepingAt)% of code cache capacity, which default > to 90%. > This means many nmethods?will not be cleaned until there is pressure > in code cache usage. 
> > > One number you could add to your benchmark numbers is the number of > > nmethods reclaimed and code cache usage. I expect both to remain the > same. > The benchmark result htmls already show the code cache usage (Code > Cache Used (MB)). > You can CTRL-F for "Code Cache" in the browser. > > There is a significant increase in code cache usage: > For DaCapo, up to 13.5MB (for tradesoap) increase for the 40MB > ReservedCodeCacheSize. > For Blaze, the increase is 33MB-80MB for the 480MB ReservedCodeCacheSize. > The code cache usage metric is measured like this (we added an > hsperfdata counter for it), at the end of a benchmark run:
>   size_t result = 0;
>   FOR_ALL_ALLOCABLE_HEAPS(heap) {
>     result += (*heap)->allocated_capacity();
>   }
>   _code_cache_used_size->set_value(result);
>
> I also looked at the logs, it shows that the bugfix eliminated almost > all of the code cache flushes. They also contain the number of > nmethods reclaimed. > E.g., for tradesoap:
>
> Without the fix:
> Code cache sweeper statistics:
> Total sweep time: 1902 ms
> Total number of full sweeps: 23301
> Total number of flushed methods: 7080 (thereof 7080 C2 methods)
> Total size of flushed methods: 20468 kB
> With the fix:
> Code cache sweeper statistics:
> Total sweep time: 0 ms
> Total number of full sweeps: 0
> Total number of flushed methods: 0 (thereof 0 C2 methods)
> Total size of flushed methods: 0 kB
>

Interesting results and a clear indication that something is broken in the sweeper heuristics - Nmethods should still be flushed! Looking at sweeper.cpp I see something that looks wrong. The _last_sweep counter is updated even if no sweep was done. In low code cache usage scenarios that means we might never reach the threshold. I think the curly brace should be moved down like this:

--- a/src/hotspot/share/runtime/sweeper.cpp	Tue May 05 21:28:46 2020 +0200
+++ b/src/hotspot/share/runtime/sweeper.cpp	Wed May 06 11:46:01 2020 +0200
@@ -445,16 +445,17 @@
   if (_should_sweep || forced) {
     init_sweeper_log();
     sweep_code_cache();
+
+    // We are done with sweeping the code cache once.
+    _total_nof_code_cache_sweeps++;
+    _last_sweep = _time_counter;
+    // Reset flag; temporarily disables sweeper
+    _should_sweep = false;
+    // If there was enough state change, 'possibly_enable_sweeper()'
+    // sets '_should_sweep' to true
+    possibly_enable_sweeper();
   }
-  // We are done with sweeping the code cache once.
-  _total_nof_code_cache_sweeps++;
-  _last_sweep = _time_counter;
-  // Reset flag; temporarily disables sweeper
-  _should_sweep = false;
-  // If there was enough state change, 'possibly_enable_sweeper()'
-  // sets '_should_sweep' to true
-  possibly_enable_sweeper();
   // Reset _bytes_changed only if there was enough state change. _bytes_changed
   // can further increase by calls to 'report_state_change'.
   if (_should_sweep) {

Can you try it out and see if things improve?

Best regards,
Nils Eliasson

> Anyway, just to reiterate, we think the improvement in throughput and > CPU usage is well worth the increase in code cache usage. > If the increase causes any problem, we would advise users to accept > the increase and fully provision memory for the > entire ReservedCodeCacheSize. > > -Man > > > On Tue, May 5, 2020 at 3:18 AM Nils Eliasson > wrote: > > Hi Man, > > Why do you expect code cache usage would increase a lot? The sweeper > still wakes up regularly and cleans the code cache. The code path > fixed > is just about sweeping extra aggressively under some > circumstances. Some > nmethods might live a little longer, but they will still be cleaned. > > Without your bugfix the sweeper will be notified for every new > allocation in the codecache as soon as code cache usage has gone > beyond > 10%. That could in the worst case be one sweep for every allocation. > > One number you could add to your benchmark numbers is the number of > nmethods reclaimed and code cache usage.
I expect both to remain > the same. > > Best regards, > Nils > > > > On 2020-05-04 21:21, Man Cao wrote: > > Hi, > > > > Thanks for the review! > > Yes, the code change is trivial, but runtime behavior change is > > considerable. > > In particular, throughput and CPU usage could improve, but code > cache usage > > could increase a lot. In our experience, the improvement in > throughput and > > CPU is well worth the code cache increase. > > > > I have attached some benchmarking results in JBS. They are based > on JDK11. > > We have not rolled out this fix to our production JDK11 yet, as > I'd like to > > confirm that this large change in runtime behavior is OK with > the OpenJDK > > community. > > We are happy to share some performance numbers from production > workload > > once we have them. > > > > -Man > > > > > > On Mon, May 4, 2020 at 4:11 AM Laurent Bourgès > > > > wrote: > > > >> Hi, > >> > >> Do you have performance results to justify your assumption "could > >> significantly improve performance"? > >> > >> Please share numbers in the JBS bug > >> > >> Laurent > >> > >> On Sat, 2 May 2020 at 07:35, Man Cao > wrote: > >> > >>> Hi all, > >>> > >>> Can I have reviews for this one-line change that fixes a bug > and could > >>> significantly improve performance? > >>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 > >>> > >>> It passes tier-1 tests locally, as well as > >>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original > JDK-8046809). > >>> > >>> -Man > >>> > From martin.doerr at sap.com Wed May 6 10:19:15 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 6 May 2020 10:19:15 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> Message-ID: Hi Nils, I've created CSR https://bugs.openjdk.java.net/browse/JDK-8244507 and set it to "Proposed". Feel free to modify it if needed.
I will need reviewers for it, too. Best regards, Martin > -----Original Message----- > From: Nils Eliasson > Sent: Tuesday, 5 May 2020 11:54 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin, > > I think it looks good. > > Please go ahead! > > Best regards, > Nils > > > On 2020-05-04 18:04, Doerr, Martin wrote: > > Hi Nils, > > > > thank you for looking at this and sorry for the late reply. > > > > I've added MaxTrivialSize and also updated the issue accordingly. Makes > sense. > > Do you have more flags in mind? > > > > Moving the flags which are only used by C2 into c2_globals definitely makes > sense. > > > > Done in webrev.01: > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > > > Please take a look and let me know when my proposal is ready for a CSR. > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: hotspot-compiler-dev >> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >> Sent: Tuesday, 28 April 2020 18:29 > >> To: hotspot-compiler-dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> Hi, > >> > >> Thanks for addressing this! This has been an annoyance for a long time. > >> > >> Have you thought about including other flags - like MaxTrivialSize? > >> MaxInlineSize is tested against it. > >> > >> Also - you should move the flags that are now c2-only to c2_globals.hpp. > >> > >> Best regards, > >> Nils Eliasson > >> > >> On 2020-04-27 15:06, Doerr, Martin wrote: > >>> Hi, > >>> > >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we > had > >> discussed impact on C1. > >>> I still think it's bad to share them between both compilers. We may want > to > >> do further C2 tuning without negative impact on C1 in the future. > >>> C1 has issues with substantial inlining because of the lack of uncommon > >> traps.
When C1 inlines a lot, stack frames may get large and code cache > space > >> may get wasted for cold or even never executed code. The situation gets > >> worse when many patching stubs get used for such code. > >>> I had opened the following issue: > >>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>> > >>> And my initial proposal is here: > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>> > >>> > >>> Part of my proposal is to add an additional flag which I called > >> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>> I have a simple example which shows wasted stack space (java example > >> TestStack at the end). > >>> It simply counts stack frames until a stack overflow occurs. With the > current > >> implementation, only 1283 frames fit on the stack because the never > >> executed method bogus_test with local variables gets inlined. > >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get > 2310 > >> frames until stack overflow. (I only used C1 for this example. Can be > >> reproduced as shown below.) > >>> I didn't notice any performance regression even with the aggressive > setting > >> of C1InlineStackLimit=5 with TieredCompilation. > >>> I know that I'll need a CSR for this change, but I'd like to get feedback in > >> general and feedback about the flag names before creating a CSR. > >>> I'd also be glad about feedback regarding the performance impact. 
> >>>
> >>> Best regards,
> >>> Martin
> >>>
> >>>
> >>>
> >>> Command line:
> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 -XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining -XX:CompileCommand=compileonly,TestStack::triggerStackOverflow TestStack
> >>> CompileCommand: compileonly TestStack.triggerStackOverflow
> >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive inlining too deep
> >>> @ 11 TestStack::bogus_test (33 bytes) inline
> >>> caught java.lang.StackOverflowError
> >>> 1283 activations were on stack, sum = 0
> >>>
> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 -XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining -XX:CompileCommand=compileonly,TestStack::triggerStackOverflow TestStack
> >>> CompileCommand: compileonly TestStack.triggerStackOverflow
> >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive inlining too deep
> >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too much stack
> >>> caught java.lang.StackOverflowError
> >>> 2310 activations were on stack, sum = 0
> >>>
> >>>
> >>> TestStack.java:
> >>> public class TestStack {
> >>>
> >>>     static long cnt = 0,
> >>>                 sum = 0;
> >>>
> >>>     public static void bogus_test() {
> >>>         long c1 = 1, c2 = 2, c3 = 3, c4 = 4;
> >>>         sum += c1 + c2 + c3 + c4;
> >>>     }
> >>>
> >>>     public static void triggerStackOverflow() {
> >>>         cnt++;
> >>>         triggerStackOverflow();
> >>>         bogus_test();
> >>>     }
> >>>
> >>>
> >>>     public static void main(String args[]) {
> >>>         try {
> >>>             triggerStackOverflow();
> >>>         } catch (StackOverflowError e) {
> >>>             System.out.println("caught " + e);
> >>>         }
> >>>         System.out.println(cnt + " activations were on stack, sum = " + sum);
> >>>     }
> >>> }
> >>>
From goetz.lindenmaier at sap.com Wed May 6 10:27:43 2020 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 6 May 2020 10:27:43 +0000 Subject: RFR(L) 8227745: Enable Escape
Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID:

Hi Richard,

I had a look at your change. It's complex, but not that big. A lot of code is just passing info through layers of abstraction. Also, one can tell this went through some iterations by now; I think it's very well engineered.

I had a look at webrev.05. Unfortunately "8242425: JVMTI monitor operations should use Thread-Local Handshakes" breaks webrev.05. I updated to before that change and took that as base of my review.

I see four parts of the change that can be looked at rather individually.
 * Refactoring the scopeDesc constructors. Trivial.
 * Persisting information about the optimizations done by the compilers. Large and mostly trivial.
 * Deoptimizing. The most complicated part. Really well abstracted, though.
 * DeoptimizeObjectsALot for testing and the tests.

Review of compiler changes:

I understand you annotate at safepoints where the escape analysis finds out that an object is "better" than global escape. These are the cases where the analysis identifies optimization opportunities. These annotations are then used to deoptimize frames and the objects referenced by them. Doesn't this overestimate the optimized objects? E.g., eliminate_alloc_node has many cases where it bails out.

c1_IR.hpp
OK, nothing to do for C1, just adapt to extended method signature. Break line once more so that it matches above line length.

ciEnv.h|cpp
Pass through another jvmti capability. Trivial & good.

debugInfoRec.hpp
Pass through escape info that must be recorded. OK.

pcDesc.hpp
I would like to see some documentation of the methods. Maybe:
  // There is an object in the scope that does not escape globally.
  // It either does not escape at all or it escapes as argument.
and
  // One of the arguments is an object that is not globally visible
  // but escapes to the callee.

scopeDesc.cpp
Besides refactoring, copy escape info from pcDesc to scopeDesc and add accessors. Trivial.
In scopeDesc.hpp you talk about NoEscape and ArgEscape. These are opto terms, but scopeDesc is a shared data structure that does not depend on a specific compiler. Please explain what is going on without using these terms.

jvmciCodeInstaller.cpp
OK, nothing for JVMCI. Here support for Object Optimizations for JVMCI compilers could be added. Leave this to Graal people.

callnode.hpp
You add functionality to annotate callnodes with escape information. This is carried through code generation to final output where it is added to the compiled method's meta information.

At Safepoints in general jvmti can access
 - Objects that were scalar replaced. They must be reallocated. (Flag EliminateAllocations)
 - Objects that should be locked but are not because they never escape the thread. They need to be relocked.

At calls, Objects where locks have been removed escape to callees. We must persist this information so that if jvmti accesses the object in a callee, we can determine by looking at the caller that it needs to be relocked.

A side comment: I think the flag handling in Opto is not very intuitive. DoEscapeAnalysis depends on the jvmti capabilities. This makes no sense. It is only an analysis. The optimizations should depend on the jvmti capabilities. The correct setup would be to handle this in CompilerConfig::ergo_initialize(): If the jvmti capabilities allow, enable the optimizations EliminateAllocations or EliminateLocks/EliminateNestedLocks. If one of these optimizations is on, enable EscapeAnalysis. -- end side comment.

So I would propose the following comments:
  // In the scope of this safepoint there are objects
  // that do not globally escape. They are either NoEscape or
  // ArgEscape. As such, they might be subject to optimizations.
  // Persist this information here so that the frame and the
  // objects in scope can be deoptimized if jvmti accesses an
  // object at this safepoint.
  void set_not_global_escape_in_scope(bool b) {

  // This call passes objects that do not globally escape
  // to its callee. The object might be subject to optimization,
  // e.g. a lock might be omitted. Persist this information here
  // so that on a jvmti access to the callee frame we can deoptimize
  // the object and this frame.
  void set_arg_escape(bool f) { _arg_escape = f; }

Actually I am not sure whether the name of these fields (and all the others in the course of this change) should refer to escape analysis. I think the term "Object deoptimization" you also use is much better. You could call these properties (throughout the whole change) set_optimized_objects_in_scope() and set_passes_optimized_objects(). I think this would make the whole matter much easier to understand. Anyways, locks can already be removed without running escape analysis at all. C2 recognizes some local patterns that allow this.

escape.h|cpp
The code looks good.
Line 325: The comment could be a bit more elaborate:
  // Annotate at safepoints if they have <= ArgEscape objects in their
  // scope. Additionally, if the safepoint is a java call, annotate
  // whether it passes ArgEscape objects as parameters.
And maybe add these comments?:
  // Returns true if an oop in the scope of sfn does not escape
  // globally.
  bool ConnectionGraph::has_not_global_escape_in_scope(SafePointNode* sfn) {

  // Returns true if at least one of the arguments to the call is an oop
  // that does not escape globally.
  bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {

General question: You collect the information you want to annotate to the method during escape analysis. Don't you overestimate the optimized objects by this? E.g. elimination of allocations does bail out for various reasons.
At the end, no optimization might have happened, but then during runtime the frame is deoptimized nevertheless.

machnode.hpp:
Extends MachSafePointNode similar to the ideal version. Good.

matcher.cpp
Copy info from ideal to mach node. good.

output.cpp
Now finally the information is written to the debug info. Good.

---------------------------------------------------------

So now let's have a look at the runtime part (including relaxing constraints to escape analysis):

rootResolver.cpp
Adapt to changed interface. good.

c2compiler.cpp / macro.cpp
Make EscapeAnalysis independent of jvmti capabilities. Good.

jvmtiEnv.cpp/jvmtiEnvBase.cpp
You add deoptimization of objects where they are accessed. good.

jvmtiImpl.cpp
In deoptimize_objects, you check for DoEscapeAnalysis. This is correct given the current design of the flag handling in the compiler. It's not really nice to have a dependency to C2 here, though. I understand it's an optimization, the code could be run anyways, it would check but not find anything. But actually I would expect dependencies on EliminateLocks and EliminateAllocations (if they were set according to jvmti capabilities as I elaborated above). Would it make sense to protect the ArgEscape loop by if (EliminateLocks)?

jvmtiTagMap.cpp
Deoptimize for jvmti operations. Good.

deoptimization.cpp
I guess this is the core of your work. You add a new mode that just deoptimizes objects but not frames. Good idea. You have to use reallocated objects in upper frames, or by jvmti accesses to inner frames, which can not easily be replaced by interpreter frames. This way you can wait with replacing the frame until just before execution returns.

eliminate_allocations():
(Strange method name, should at least be in past tense, even better reallocate_eliminated_allocations() or allocate_scalarized_objects(). Confused me until I grokked the code. Legacy though, not your business.)
It's not that nice to return whether you only deoptimized objects by the boolean reference argument. After all, it again depends on the mode you pass in. A different design would be to clone the method and have an eliminate_allocations_no_unpack() variant, but that would not be better as some code would be duplicated.
Maybe a comment for argument eliminate_allocations:
  // deoptimized_objects is set to true if objects were deoptimized
  // but not the frame. It is unchanged if there are no objects to
  // be deoptimized, or if the frame was deoptim
Similar for eliminate_locks():
  // deoptimized_objects is set to true if objects were relocked,
  // else it is left unchanged.
You reuse and extend the existing realloc/relock_objects.

deoptimize_objects_internal()
Simple version of fetch_unroll_info_helper for EscapeBarrier. Good.
I attributed the comment "Then relock objects if synchronization on them was eliminated." to the if() just below. Add an empty line to make clear the comment refers to the next 10 lines. Alternatively, replace the whole comment by
  // At first, reallocate the non-escaping objects and restore their fields
  // so they are available for relocking.
And add
  // Now relock objects with eliminated locks.
before the if ((DoEscape... below.

In fetch_unroll_info_helper, I don't understand why you need
  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
for eliminated locks, but not for scalar replaced objects? I would guess it is because the eliminated locks can be applied to argEscape, but scalar replacement only to noescape objects? I.e. it might have been done before? But why isn't this the case for eliminate_allocations? deoptimize_objects_internal does both unconditionally, so both can happen to inner frames, right?

relock_objects()
Ok, you need to undo biased locking. Also, you remember the lock nesting for later relocking if waiting for lock.
revoke_for_object_deoptimization()
I like it when boolean operators are at the beginning of broken lines, but I think hotspot convention is to have them at the end. Code will get much simpler if BiasedLocking is removed.

EscapeBarrier:: ...
(This class maybe would qualify for a file of its own.)

deoptimize_objects()
I would mention escape analysis only as side remark. Also, as I understand, there is only one frame at given depth?
  // Deoptimize frames with optimized objects. This can be omitted locks and
  // objects not allocated but replaced by scalars. In C2, these optimizations
  // are based on escape analysis.
  // Up to depth, deoptimize frames with any optimized objects.
  // From depth to entry_frame, deoptimize only frames that
  // pass optimized objects to their callees.
(First part similar for the comment above EscapeBarrier::deoptimize_objects_internal().)

What is the check (cur_depth <= depth) good for? Can you ever walk past entry_frame?

Isn't vf->is_compiled_frame() prerequisite that "Move to next physical frame" is needed? You could move it into the other check. If so, similar for deoptimize_objects_all_threads().

Synchronization: looks good. I think others had a look at this before.

EscapeBarrier::deoptimize_objects_internal()
The method name is misleading, it is not used by deoptimize_objects(). Also, a method with the same name is in Deoptimization. Proposal: deoptimize_objects_thread() ?

C1 stubs: this really shows you tested all configurations, great!

mutexLocker: ok.
objectMonitor.cpp: ok

stackValue.hpp
Is this missing clearing a bug?

thread.hpp
I would remove "_ea" from the flag and method names. Renaming deferred_locals to deferred_updates is good, as well as adding a data structure for it. (Adding this data structure might be a breakout, too.) good.

thread.cpp
good.

vframe.cpp
Is this a bug in existing code? Makes sense.

vframe_hp.hpp
(What does _hp stand for? helper? The file should be named compiledVFrame ...)

not_global_escape_in_scope() ...
Again, you mention escape analysis here. Comments above hold, too.

You introduce JvmtiDeferredUpdates. Good.

vframe_hp.cpp
Changes for JvmtiDeferredUpdates, escape state accessors,
line 422: Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?

macros.hpp
Good.

Test coding
============

compileBroker.h|cpp
You introduce a third class of threads handled here and add a new flag to distinguish it. Before, the two kinds of threads were distinguished implicitly by passing in a compiler for compiler threads. The new thread kind is only used for testing in debug.
make_thread: You could assert (comp != NULL...) to assure previous conditions.
line 989 indentation broken

escape.cpp
You enable the optimization in case of testruns. good.

whitebox.cpp
ok.

deoptimization.cpp
deoptimize_objects_alot_loop() Good.

globals.hpp
Nice documentation of flags, but please mention "for testing purposes" or the like in DeoptimizeObjectsALot. I would place the flags next to each other.

interfaceSupport.cpp: good.

I'll look at the tests themselves in an extra mail (learning from Martin).

Best regards,
Goetz.

> -----Original Message----- > From: Reingruber, Richard > Sent: Wednesday, April 1, 2020 8:15 AM > To: Doerr, Martin ; 'Robbin Ehn' > ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) ; > serviceability-dev at openjdk.java.net; hotspot-compiler- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Martin, > > > thanks for addressing all my points. I've looked over webrev.5 and I'm > satisfied with your changes. > > Thanks! > > > I had also promised to review the tests. > > Thanks++ > I appreciate it very much, the tests are many lines of code. > > > test/jdk/com/sun/jdi/EATests.java > > This is a substantial amount of tests which is appropriate for such a large > change.
Skipping some subtests with UseJVMCICompiler makes sense > because it doesn't provide the necessary JVMTI functionality, yet. > > Nice work! > > I also like that you test with and without BiasedLocking. Your tests will still > be fine after BiasedLocking deprecation. > > Hope so :) > > > Very minor nits: > > - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are > ommitted" (sounds funny) > > - You sometimes write "graal" and sometimes "Graal". I guess the capital G > is better. (Also in EATestsJVMCI.java.) > > > test/jdk/com/sun/jdi/EATestsJVMCI.java > > EATests with Graal enabled. Nice that you support Graal to some extent. > Maybe Graal folks want to enhance them in the future. I think this is a good > starting point. > > Will change this in the next webrev. > > > Conclusion: Looks good and not trivial :-) > > Now, you have one full review. I'd be ok with covering 2nd review by partial > reviews. > > Compiler and JVMTI parts are not too complicated IMHO. > > Runtime part should get at least one additional careful review. > > Thanks a lot, > Richard. > > -----Original Message----- > From: Doerr, Martin > Sent: Tuesday, 31 March 2020 16:01 > To: Reingruber, Richard ; 'Robbin Ehn' > ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) ; > serviceability-dev at openjdk.java.net; hotspot-compiler- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Richard, > > thanks for addressing all my points. I've looked over webrev.5 and I'm > satisfied with your changes. > > > I had also promised to review the tests. > > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java > Thanks for updating the @summary comment. Looks good in webrev.5.
> > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c > JVMTI agent for object tagging and heap iteration. Good. > > test/jdk/com/sun/jdi/EATests.java > This is a substantial amount of tests which is appropriate for such a large > change. Skipping some subtests with UseJVMCICompiler makes sense > because it doesn't provide the necessary JVMTI functionality, yet. > Nice work! > I also like that you test with and without BiasedLocking. Your tests will still be > fine after BiasedLocking deprecation. > > Very minor nits: > - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are > ommitted" (sounds funny) > - You sometimes write "graal" and sometimes "Graal". I guess the capital G is > better. (Also in EATestsJVMCI.java.) > > test/jdk/com/sun/jdi/EATestsJVMCI.java > EATests with Graal enabled. Nice that you support Graal to some extent. > Maybe Graal folks want to enhance them in the future. I think this is a good > starting point. > > > Conclusion: Looks good and not trivial :-) > Now, you have one full review. I'd be ok with covering 2nd review by partial > reviews. > Compiler and JVMTI parts are not too complicated IMHO. > Runtime part should get at least one additional careful review. > > Best regards, > Martin > > > > -----Original Message----- > > From: Reingruber, Richard > > Sent: Monday, 30 March 2020 10:32 > > To: Doerr, Martin ; 'Robbin Ehn' > > ; Lindenmaier, Goetz > > ; David Holmes > ; > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > ; serviceability-dev at openjdk.java.net; > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > dev at openjdk.java.net > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > > in the Presence of JVMTI Agents > > > > Hi, > > > > this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :) > > > > The change affects jvmti, hotspot and c2. Partial reviews are very welcome > > too.
> > > > Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/ > > Delta: > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/ > > > > Robbin, Martin, please let me know, if anything shouldn't be quite as you > > wanted it. Also find my > > comments on your feedback below. > > > > Robbin, can I count you as Reviewer for the runtime part? > > > > Thanks, Richard. > > > > -- > > > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > > You can move both declaration and definition to that file, no need to > > clobber > > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > > > Done. > > > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in > it's > > own > > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > > > I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting > > jvmtiDeferredLocalVariableSet is > > declared. > > > > > src/hotspot/share/code/compiledMethod.cpp > > > Nice cleanup! > > > > Thanks :) > > > > > src/hotspot/share/code/debugInfoRec.cpp > > > src/hotspot/share/code/debugInfoRec.hpp > > > Additional parmeters. (Remark: I think "non_global_escape_in_scope" > > would read better than "not_global_escape_in_scope", but your version is > > consistent with existing code, so no change request from my side.) Ok. > > > > I've been thinking about this too and finally stayed with > > not_global_escape_in_scope. It's supposed > > to mean an object whose escape state is not GlobalEscape is in scope. > > > > > src/hotspot/share/compiler/compileBroker.cpp > > > src/hotspot/share/compiler/compileBroker.hpp > > > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into > > a follow up change together with the test in order to make this webrev > > smaller, but since it is included, I'm reviewing everything at once. Not a big > > deal.) Ok. > > > > Yes the change would be a little smaller. 
And if it helps I'll split it off. In > > general I prefer > > patches that bring along a suitable amount of tests. > > > > > src/hotspot/share/opto/c2compiler.cpp > > > Make do_escape_analysis independent of JVMCI capabilities. Nice! > > > > It is the main goal of the enhancement. It is done for C2, but could be done > > for JVMCI compilers > > with just a small effort as well. > > > > > src/hotspot/share/opto/escape.cpp > > > Annotation for MachSafePointNodes. Your added functionality looks > > correct. > > > But I'd prefer to move the bulky code out of the large function. > > > I suggest to factor out something like has_not_global_escape and > > has_arg_escape. So the code could look like this: > > > SafePointNode* sfn = sfn_worklist.at(next); > > > sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); > > > if (sfn->is_CallJava()) { > > > CallJavaNode* call = sfn->as_CallJava(); > > > call->set_arg_escape(has_arg_escape(call)); > > > } > > > This would also allow us to get rid of the found_..._escape_in_args > > variables making the loops better readable. > > > > Done. > > > > > It's kind of ugly to use strcmp to recognize uncommon trap, but that > seems > > to be the way to do it (there are more such places). So it's ok. > > > > Yeah. I copied the snippet. > > > > > src/hotspot/share/prims/jvmtiImpl.cpp > > > src/hotspot/share/prims/jvmtiImpl.hpp > > > The sequence is pretty complex: > > > VM_GetOrSetLocal element initialization executes EscapeBarrier code > > which suspends the target thread (extra VM Operation). > > > > Note that the target threads have to be suspended already for > > VM_GetOrSetLocal*. So it's mainly the > > synchronization effect of EscapeBarrier::sync_and_suspend_one() that is > > required here. Also no extra > > _handshake_ is executed, since sync_and_suspend_one() will find the > > target threads already > > suspended. 
> > > > > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by > VM > > Thread to prepare VM Operation with frame deoptimization). > > > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor > > which resumes the target thread. > > > But I don't have any improvement proposal. Performance is probably not > a > > concern, here. So it's ok. > > > > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it > > has non-globally escaping objects and other frames if they have arg > escaping > > ones. Good. > > > > It's not specifically the top frame, but the frame that is accessed. > > > > > src/hotspot/share/runtime/deoptimization.cpp > > > Object deoptimization. I have more comments and proposals, here. > > > First of all, handling recursive and waiting locks in relock_objects is tricky, > > but looks correct. > > > Comments are sufficient to understand why things are done as they are > > implemented. > > > > > BiasedLocking related parts are complex, but we may get rid of them in > the > > future (with BiasedLocking removal). > > > Anyway, looks correct, too. > > > > > Typo in comment: "regularily" => "regularly" > > > > > Deoptimization::fetch_unroll_info_helper is the only place where > > _jvmti_deferred_updates get deallocated (except JavaThread destructor). > > But I think we always go through it, so I can't see a memory leak or such > kind > > of issues. > > > > That's correct. The compiled frame for which deferred updates are > allocated > > is always deoptimized > > before (see EscapeBarrier::deoptimize_objects()). This is also asserted in > > compiledVFrame::update_deferred_value(). I've added the same assertion > > to > > Deoptimization::relock_objects(). So we can be sure that > > _jvmti_deferred_updates are deallocated > > again in fetch_unroll_info_helper(). > > > > > EscapeBarrier::deoptimize_objects: ResourceMark should use > > calling_thread(). > > > > Sure, well spotted! 
> > > > > You can use MutexLocker and MonitorLocker with Thread* to save the > > Thread::current() call. > > > > Right, good hint. This was recently introduced with 8235678. I even had to > > resolve conflicts. Should > > have done this then. > > > > > I'd make set_objs_are_deoptimized static and remove it from the > > EscapeBarrier interface because I think it shouldn't be used outside of > > EscapeBarrier::deoptimize_objects. > > > > Done. > > > > > Typo in comment: "we must only deoptimize" => "we only have to > > deoptimize" > > > > Replaced with "[...] we deoptimize iff local objects are passed as args" > > > > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and > > barrier_active() is redundant. Implementation can get moved to hpp file. > > > > Ok. Done. > > > > > I'll get back to suspend flags, later. > > > > > There are weird cases regarding _self_deoptimization_in_progress. > > > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. > > C can set _self_deoptimization_in_progress while A performs the > handshake > > for suspending C. I think this doesn't lead to errors, but it's probably not > > desired. > > > I think it would be better to use only one "wait" call in > > sync_and_suspend_one and sync_and_suspend_all. > > > > You're right. We've discussed that face-to-face, but couldn't find a real > issue. 
> > But now, thinking again, I reckon I found one:
> >
> > 2808   // Sync with other threads that might be doing deoptimizations
> > 2809   {
> > 2810     // Need to switch to _thread_blocked for the wait() call
> > 2811     ThreadBlockInVM tbivm(_calling_thread);
> > 2812     MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
> > 2813     while (_self_deoptimization_in_progress) {
> > 2814       ml.wait();
> > 2815     }
> > 2816
> > 2817     if (self_deopt()) {
> > 2818       _self_deoptimization_in_progress = true;
> > 2819     }
> > 2820
> > 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> > 2822       ml.wait();
> > 2823     }
> > 2824
> > 2825     if (self_deopt()) {
> > 2826       return;
> > 2827     }
> > 2828
> > 2829     // set suspend flag for target thread
> > 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> > 2831   }
> >
> > - A waits in 2822
> > - C is suspended
> > - B notifies all in resume_one()
> > - A and C wake up
> > - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> > - C does the self deoptimization
> > - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
> >
> > C will self suspend at some undefined point. The resulting state is illegal.
> >
> > > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > > reduce thread state transitions, but that seems to be problematic because
> > > ThreadBlockInVM destructor contains a safepoint check which we shouldn't
> > > do while holding EscapeBarrier_lock. So no change request.
> >
> > Yes, would be nice to have the state change only if needed, but for the
> > reason you mentioned it is not quite as easy as it seems to be. I
> > experimented as well with a second lock, but did not succeed.
> >
> > > Change in thread_added:
> > > I think the sequence would be more comprehensive if we waited for
> > > deopt_all_threads in Thread::start and all other places where a new thread
> > > can run into Java code (e.g. JVMTI attach).
> > > Your version makes new threads come up with suspend flag set.
> > > That looks correct, too. Advantage is that you only have to change one place
> > > (thread_added). It'll be interesting to see what it will look like when we
> > > use async handshakes instead of suspend flags.
> > > For now, I'm ok with your version.
> >
> > I had a version that did what you are suggesting. The current version also
> > has the advantage that there are fewer places where a thread has to wait for
> > ongoing object deoptimization. This means fewer places where you have to
> > worry about correct thread state transitions, possible deadlocks, and
> > whether all oops are properly Handle'ed.
> >
> > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after
> > > if (!jt->is_hidden_from_external_view()).
> >
> > Done.
> >
> > > Having 4 different deoptimize_objects functions makes it a little hard to
> > > keep an overview of which one is used for what.
> > > Maybe adding suffixes would help a little bit, but I can also live with what
> > > you have.
> > > Implementation looks correct to me.
> >
> > 2 are internal. I added the suffix _internal to them. This leaves 2 to choose
> > from.
> >
> > > src/hotspot/share/runtime/deoptimization.hpp
> > > Escape barriers and object deoptimization functions.
> > > Typo in comment: "helt" => "held"
> >
> > Done in place already.
> >
> > > src/hotspot/share/runtime/interfaceSupport.cpp
> > > InterfaceSupport::deoptimizeAllObjects() is only used for
> > > DeoptimizeObjectsALot = 1.
> > > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad
> > > to have DeoptimizeObjectsALot = 1 in addition. Ok.
> >
> > I never used DeoptimizeObjectsALot = 1 that much. It could be more
> > deterministic in single threaded scenarios. I wouldn't object to getting rid
> > of it though.
> >
> > > src/hotspot/share/runtime/stackValue.hpp
> > > Better reinitialization in StackValue. Good.
> >
> > StackValue::obj_is_scalar_replaced() should not return true after calling
> > set_obj().
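The race in the numbered listing earlier in this message (lines 2808 to 2831) arises because the two conditions are awaited in two separate wait loops, so another thread can change the state between them. The suggestion to "use only one wait call" can be modelled with standard C++ primitives as below. This is a minimal sketch under assumptions, not the HotSpot code: `BarrierState`, `sync_and_claim`, and `release` are hypothetical names, and `std::mutex` plus `std::condition_variable` stand in for EscapeBarrier_lock.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical model of the barrier state; names only loosely follow the thread.
struct BarrierState {
  std::mutex lock;                      // plays the role of EscapeBarrier_lock
  std::condition_variable cv;
  bool self_deopt_in_progress = false;  // deoptee is deoptimizing its own objects
  bool deoptee_suspended = false;       // deoptee was claimed by another thread
};

// Claims the deoptee for the caller. Both conditions are checked by a single
// wait, so no other thread can claim the deoptee between two separate waits.
void sync_and_claim(BarrierState& s, bool self_deopt) {
  std::unique_lock<std::mutex> ml(s.lock);
  s.cv.wait(ml, [&s] {
    return !s.self_deopt_in_progress && !s.deoptee_suspended;
  });
  if (self_deopt) {
    s.self_deopt_in_progress = true;
  } else {
    s.deoptee_suspended = true;
  }
  // Invariant: the deoptee is never both self-deoptimizing and suspended.
  assert(!(s.self_deopt_in_progress && s.deoptee_suspended));
}

// Gives up the claim and wakes all waiters so they re-check the predicate.
void release(BarrierState& s, bool self_deopt) {
  {
    std::lock_guard<std::mutex> ml(s.lock);
    if (self_deopt) {
      s.self_deopt_in_progress = false;
    } else {
      s.deoptee_suspended = false;
    }
  }
  s.cv.notify_all();
}
```

In this model the interleaving from the A/B/C scenario cannot set both flags, because whichever of A and C wakes up first claims the deoptee while still holding the lock, and the other keeps waiting.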
> >
> > > src/hotspot/share/runtime/thread.cpp
> > > src/hotspot/share/runtime/thread.hpp
> > > src/hotspot/share/runtime/thread.inline.hpp
> > > wait_for_object_deoptimization, suspend flag, deferred updates and test
> > > feature to deoptimize objects.
> >
> > > In the long term, we want to get rid of suspend flags, so it's not so nice to
> > > introduce a new one. But I agree with Götz that it should be acceptable as
> > > a temporary solution until async handshakes are available (which takes more
> > > time). So I'm ok with your change.
> >
> > I'm keen to build the feature on async handshakes when they arrive.
> >
> > > You can use MutexLocker with Thread*.
> >
> > Done.
> >
> > > JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class
> > > out of thread.hpp.
> >
> > Done.
> >
> > > src/hotspot/share/runtime/vframe.cpp
> > > Added support for entry frame to new_vframe. Ok.
> >
> > > src/hotspot/share/runtime/vframe_hp.cpp
> > > src/hotspot/share/runtime/vframe_hp.hpp
> >
> > > I think code()->as_nmethod() in not_global_escape_in_scope() and
> > > arg_escape() should better be under #ifdef ASSERT or inside the assert
> > > statement (no need for code cache walking in product build).
> >
> > Done.
> >
> > > jvmtiDeferredLocalVariableSet::update_monitors:
> > > Please add a comment explaining that owner referenced by original info
> > > may be scalar replaced, but it is deoptimized in the vframe.
> >
> > Done.
> >
> > -----Original Message-----
> > From: Doerr, Martin
> > Sent: Donnerstag, 12.
M?rz 2020 17:28 > > To: Reingruber, Richard ; 'Robbin Ehn' > > ; Lindenmaier, Goetz > > ; David Holmes > ; > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > ; serviceability-dev at openjdk.java.net; > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > dev at openjdk.java.net > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > > in the Presence of JVMTI Agents > > > > Hi Richard, > > > > > > I managed to find time for a (almost) complete review of webrev.4. (I'll > > review the tests separately.) > > > > First of all, the change seems to be in pretty good quality for its significant > > complexity. I couldn't find any real bugs. But I'd like to propose minor > > improvements. > > I'm convinced that it's mature because we did substantial testing. > > > > I like the new functionality for object deoptimization. It can possibly be > > reused for future escape analysis based optimizations. So I appreciate > having > > it available in the code base. > > In addition to that, your change makes the JVMTI implementation better > > integrated into the VM. > > > > > > Now to the details: > > > > > > src/hotspot/share/c1/c1_IR.hpp > > describe_scope parameters. Ok. > > > > > > src/hotspot/share/ci/ciEnv.cpp > > src/hotspot/share/ci/ciEnv.hpp > > Fix for JvmtiExport::can_walk_any_space() capability. Ok. > > > > > > src/hotspot/share/code/compiledMethod.cpp > > Nice cleanup! > > > > > > src/hotspot/share/code/debugInfoRec.cpp > > src/hotspot/share/code/debugInfoRec.hpp > > Additional parmeters. (Remark: I think "non_global_escape_in_scope" > > would read better than "not_global_escape_in_scope", but your version is > > consistent with existing code, so no change request from my side.) Ok. > > > > > > src/hotspot/share/code/nmethod.cpp > > Nice cleanup! > > > > > > src/hotspot/share/code/pcDesc.hpp > > Additional parameters. Ok. 
> > > > > > src/hotspot/share/code/scopeDesc.cpp > > src/hotspot/share/code/scopeDesc.hpp > > Improved implementation + additional parameters. Ok. > > > > > > src/hotspot/share/compiler/compileBroker.cpp > > src/hotspot/share/compiler/compileBroker.hpp > > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a > > follow up change together with the test in order to make this webrev > > smaller, but since it is included, I'm reviewing everything at once. Not a big > > deal.) Ok. > > > > > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp > > Additional parameters. Ok. > > > > > > src/hotspot/share/opto/c2compiler.cpp > > Make do_escape_analysis independent of JVMCI capabilities. Nice! > > > > > > src/hotspot/share/opto/callnode.hpp > > Additional fields for MachSafePointNodes. Ok. > > > > > > src/hotspot/share/opto/escape.cpp > > Annotation for MachSafePointNodes. Your added functionality looks > correct. > > But I'd prefer to move the bulky code out of the large function. > > I suggest to factor out something like has_not_global_escape and > > has_arg_escape. So the code could look like this: > > SafePointNode* sfn = sfn_worklist.at(next); > > sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); > > if (sfn->is_CallJava()) { > > CallJavaNode* call = sfn->as_CallJava(); > > call->set_arg_escape(has_arg_escape(call)); > > } > > This would also allow us to get rid of the found_..._escape_in_args > variables > > making the loops better readable. > > > > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems > > to be the way to do it (there are more such places). So it's ok. > > > > > > src/hotspot/share/opto/machnode.hpp > > Additional fields for MachSafePointNodes. Ok. > > > > > > src/hotspot/share/opto/macro.cpp > > Allow elimination of non-escaping allocations. Ok. > > > > > > src/hotspot/share/opto/matcher.cpp > > src/hotspot/share/opto/output.cpp > > Copy attribute / pass parameters. Ok. 
> > > > > > src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp > > Nice cleanup! > > > > > > src/hotspot/share/prims/jvmtiEnv.cpp > > src/hotspot/share/prims/jvmtiEnvBase.cpp > > Escape barriers + deoptimize objects for target thread. Good. > > > > > > src/hotspot/share/prims/jvmtiImpl.cpp > > src/hotspot/share/prims/jvmtiImpl.hpp > > The sequence is pretty complex: > > VM_GetOrSetLocal element initialization executes EscapeBarrier code > which > > suspends the target thread (extra VM Operation). > > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM > > Thread to prepare VM Operation with frame deoptimization). > > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor > which > > resumes the target thread. > > But I don't have any improvement proposal. Performance is probably not a > > concern, here. So it's ok. > > > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has > > non-globally escaping objects and other frames if they have arg escaping > > ones. Good. > > > > > > src/hotspot/share/prims/jvmtiTagMap.cpp > > Escape barriers + deoptimize objects for all threads. Ok. > > > > > > src/hotspot/share/prims/whitebox.cpp > > Added WB_IsFrameDeoptimized to API. Ok. > > > > > > src/hotspot/share/runtime/deoptimization.cpp > > Object deoptimization. I have more comments and proposals, here. > > First of all, handling recursive and waiting locks in relock_objects is tricky, > but > > looks correct. > > Comments are sufficient to understand why things are done as they are > > implemented. > > > > BiasedLocking related parts are complex, but we may get rid of them in the > > future (with BiasedLocking removal). > > Anyway, looks correct, too. > > > > Typo in comment: "regularily" => "regularly" > > > > Deoptimization::fetch_unroll_info_helper is the only place where > > _jvmti_deferred_updates get deallocated (except JavaThread destructor). 
> > But I think we always go through it, so I can't see a memory leak or such > kind > > of issues. > > > > EscapeBarrier::deoptimize_objects: ResourceMark should use > > calling_thread(). > > > > You can use MutexLocker and MonitorLocker with Thread* to save the > > Thread::current() call. > > > > I'd make set_objs_are_deoptimized static and remove it from the > > EscapeBarrier interface because I think it shouldn't be used outside of > > EscapeBarrier::deoptimize_objects. > > > > Typo in comment: "we must only deoptimize" => "we only have to > > deoptimize" > > > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and > > barrier_active() is redundant. Implementation can get moved to hpp file. > > > > I'll get back to suspend flags, later. > > > > There are weird cases regarding _self_deoptimization_in_progress. > > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. > C > > can set _self_deoptimization_in_progress while A performs the handshake > > for suspending C. I think this doesn't lead to errors, but it's probably not > > desired. > > I think it would be better to use only one "wait" call in > > sync_and_suspend_one and sync_and_suspend_all. > > > > I first thought it'd be better to move ThreadBlockInVM before wait() to > > reduce thread state transitions, but that seems to be problematic because > > ThreadBlockInVM destructor contains a safepoint check which we > shouldn't > > do while holding EscapeBarrier_lock. So no change request. > > > > Change in thred_added: > > I think the sequence would be more comprehensive if we waited for > > deopt_all_threads in Thread::start and all other places where a new thread > > can run into Java code (e.g. JVMTI attach). > > Your version makes new threads come up with suspend flag set. That looks > > correct, too. Advantage is that you only have to change one place > > (thread_added). 
It'll be interesting to see how it will look like when we use > > async handshakes instead of suspend flags. > > For now, I'm ok with your version. > > > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt- > > >is_hidden_from_external_view()). > > > > Having 4 different deoptimize_objects functions makes it a little hard to > keep > > an overview of which one is used for what. > > Maybe adding suffixes would help a little bit, but I can also live with what > you > > have. > > Implementation looks correct to me. > > > > > > src/hotspot/share/runtime/deoptimization.hpp > > Escape barriers and object deoptimization functions. > > Typo in comment: "helt" => "held" > > > > > > src/hotspot/share/runtime/globals.hpp > > Addition of develop flag DeoptimizeObjectsALotInterval. Ok. > > > > > > src/hotspot/share/runtime/interfaceSupport.cpp > > InterfaceSupport::deoptimizeAllObjects() is only used for > > DeoptimizeObjectsALot = 1. > > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad > > to have DeoptimizeObjectsALot = 1 in addition. Ok. > > > > > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > > Addition of deoptimizeAllObjects. Ok. > > > > > > src/hotspot/share/runtime/mutexLocker.cpp > > src/hotspot/share/runtime/mutexLocker.hpp > > Addition of EscapeBarrier_lock. Ok. > > > > > > src/hotspot/share/runtime/objectMonitor.cpp > > Make recursion count relock aware. Ok. > > > > > > src/hotspot/share/runtime/stackValue.hpp > > Better reinitilization in StackValue. Good. > > > > > > src/hotspot/share/runtime/thread.cpp > > src/hotspot/share/runtime/thread.hpp > > src/hotspot/share/runtime/thread.inline.hpp > > wait_for_object_deoptimization, suspend flag, deferred updates and test > > feature to deoptimize objects. > > > > In the long term, we want to get rid of suspend flags, so it's not so nice to > > introduce a new one. 
But I agree with G?tz that it should be acceptable as > > temporary solution until async handshakes are available (which takes more > > time). So I'm ok with your change. > > > > You can use MutexLocker with Thread*. > > > > JVMTIDeferredUpdates: I agree with Robin. It'd be nice to move the class > out > > of thread.hpp. > > > > > > src/hotspot/share/runtime/vframe.cpp > > Added support for entry frame to new_vframe. Ok. > > > > > > src/hotspot/share/runtime/vframe_hp.cpp > > src/hotspot/share/runtime/vframe_hp.hpp > > > > I think code()->as_nmethod() in not_global_escape_in_scope() and > > arg_escape() should better be under #ifdef ASSERT or inside the assert > > statement (no need for code cache walking in product build). > > > > jvmtiDeferredLocalVariableSet::update_monitors: > > Please add a comment explaining that owner referenced by original info > may > > be scalar replaced, but it is deoptimized in the vframe. > > > > > > src/hotspot/share/utilities/macros.hpp > > Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok. > > > > > > > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysi > > sEnabled.java > > > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnal > > ysisEnabled.c > > New test. Will review separately. > > > > > > test/jdk/TEST.ROOT > > Addition of vm.jvmci as required property. Ok. > > > > > > test/jdk/com/sun/jdi/EATests.java > > test/jdk/com/sun/jdi/EATestsJVMCI.java > > New test. Will review separately. > > > > > > test/lib/sun/hotspot/WhiteBox.java > > Added isFrameDeoptimized to API. Ok. > > > > > > That was it. Best regards, > > Martin > > > > > > > -----Original Message----- > > > From: hotspot-compiler-dev > > bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > > > Sent: Dienstag, 3. 
M?rz 2020 21:23 > > > To: 'Robbin Ehn' ; Lindenmaier, Goetz > > > ; David Holmes > > ; > > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > > ; serviceability-dev at openjdk.java.net; > > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > > dev at openjdk.java.net > > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better > > > Performance in the Presence of JVMTI Agents > > > > > > Hi Robbin, > > > > > > > > I understand that Robbin proposed to replace the usage of > > > > > _suspend_flag with handshakes. Apparently, async handshakes > > > > > are needed to do so. We have been waiting a while for removal > > > > > of the _suspend_flag / introduction of async handshakes [2]. > > > > > What is the status here? > > > > > > > I have an old prototype which I would like to continue to work on. > > > > So do not assume asynch handshakes will make 15. > > > > Even if it would, I think there are a lot more investigate work to remove > > > > _suspend_flag. > > > > > > Let us know, if we can be of any help to you and be it only testing. > > > > > > > >> Full: > > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > > > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > > > You can move both declaration and definition to that file, no need to > > > clobber > > > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > > > > > Will do. > > > > > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in > > it's > > > own > > > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > > > > > You are right. It shouldn't be declared in thread.hpp. I will look into that. > > > > > > > Note that we also think we may have a bug in deopt: > > > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > > > > > I think it would be best, if possible, to push after that is resolved. > > > > > > Sure. 
> > > > > > > Not even nearly a full review :) > > > > > > I know :) > > > > > > Anyways, thanks a lot, > > > Richard. > > > > > > > > > -----Original Message----- > > > From: Robbin Ehn > > > Sent: Monday, March 2, 2020 11:17 AM > > > To: Lindenmaier, Goetz ; Reingruber, > > Richard > > > ; David Holmes > > ; > > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > > ; serviceability-dev at openjdk.java.net; > > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > > dev at openjdk.java.net > > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance > > > in the Presence of JVMTI Agents > > > > > > Hi, > > > > > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > > > > Hi, > > > > > > > > I had a look at the progress of this change. Nothing > > > > happened since Richard posted his update using more > > > > handshakes [1]. > > > > But we (SAP) would appreciate a lot if this change could > > > > be successfully reviewed and pushed. > > > > > > > > I think there is basic understanding that this > > > > change is helpful. It fixes a number of issues with JVMTI, > > > > and will deliver the same performance benefits as EA > > > > does in current production mode for debugging scenarios. > > > > > > > > This is important for us as we run our VMs prepared > > > > for debugging in production mode. > > > > > > > > I understand that Robbin proposed to replace the usage of > > > > _suspend_flag with handshakes. Apparently, async handshakes > > > > are needed to do so. We have been waiting a while for removal > > > > of the _suspend_flag / introduction of async handshakes [2]. > > > > What is the status here? > > > > > > I have an old prototype which I would like to continue to work on. > > > So do not assume asynch handshakes will make 15. > > > Even if it would, I think there are a lot more investigate work to remove > > > _suspend_flag. > > > > > > > > > > > I think we should no longer wait, but proceed with > > > > this change. 
We will look into removing the usage of > > > > suspend_flag introduced here once it is possible to implement > > > > it with handshakes. > > > > > > Yes, sure. > > > > > > >> Full: > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > > You can move both declaration and definition to that file, no need to > > clobber > > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > > > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in > it's > > > own > > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > > > > > Note that we also think we may have a bug in deopt: > > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > > > > I think it would be best, if possible, to push after that is resolved. > > > > > > Not even nearly a full review :) > > > > > > Thanks, Robbin > > > > > > > > > >> Incremental: > > > >> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > > > >> > > > >> I was not able to eliminate the additional suspend flag now. I'll take > care > > > of this > > > >> as soon as the > > > >> existing suspend-resume-mechanism is reworked. > > > >> > > > >> Testing: > > > >> > > > >> Nightly tests @SAP: > > > >> > > > >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, > > > Renaissance > > > >> Suite, SAP specific tests > > > >> with fastdebug and release builds on all platforms > > > >> > > > >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x > > > parallel > > > >> for 24h > > > >> > > > >> Thanks, Richard. > > > >> > > > >> > > > >> More details on the changes: > > > >> > > > >> * Hide DeoptimizeObjectsALotThread from external view. > > > >> > > > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > > > >> It used to be _safepoint_check_sometimes, which will be eliminated > > > sooner or > > > >> later. 
> > > >> I added explicit thread state changes with ThreadBlockInVM to code > > > paths > > > >> where we can wait() > > > >> on EscapeBarrier_lock to become safepoint safe. > > > >> > > > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target > > > threads > > > >> instead of vm operation > > > >> VM_ThreadSuspendAllForObjDeopt. > > > >> > > > >> * Removed uses of Threads_lock. When adding a new thread we > > suspend > > > it iff > > > >> EA optimizations are > > > >> being reverted. In the previous version we were waiting on > > > Threads_lock > > > >> while EA optimizations > > > >> were reverted. See EscapeBarrier::thread_added(). > > > >> > > > >> * Made tests require Xmixed compilation mode. > > > >> > > > >> * Made tests agnostic regarding tiered compilation. > > > >> I.e. tc isn't disabled anymore, and the tests can be run with tc > enabled > > or > > > >> disabled. > > > >> > > > >> * Exercising EATests.java as well with stress test options > > > >> DeoptimizeObjectsALot* > > > >> Due to the non-deterministic deoptimizations some tests need to be > > > skipped. > > > >> We do this to prevent bit-rot of the stress test code. > > > >> > > > >> * Executing EATests.java as well with graal if available. Driver for this is > > > >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not > > > provide all > > > >> the new debug info > > > >> (namely not_global_escape_in_scope and arg_escape in > > > scopeDesc.hpp). > > > >> And graal does not yet support the JVMTI operations force early > > return > > > and > > > >> pop frame. > > > >> > > > >> * Removed tracing from new jdi tests in EATests.java. Too much trace > > > output > > > >> before the debugging > > > >> connection is established can cause deadlock because output buffers > > fill > > > up. 
> > > >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) > > > >> > > > >> * Many copyright year changes and smaller clean-up changes of > testing > > > code > > > >> (trailing white-space and > > > >> the like). > > > >> > > > >> > > > >> -----Original Message----- > > > >> From: David Holmes > > > >> Sent: Donnerstag, 19. Dezember 2019 03:12 > > > >> To: Reingruber, Richard ; serviceability- > > > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > > > hotspot- > > > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > > > (vladimir.kozlov at oracle.com) > > > >> > > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > > Performance in > > > >> the Presence of JVMTI Agents > > > >> > > > >> Hi Richard, > > > >> > > > >> I think my issue is with the way EliminateNestedLocks works so I'm > going > > > >> to look into that more deeply. > > > >> > > > >> Thanks for the explanations. > > > >> > > > >> David > > > >> > > > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: > > > >>> Hi David, > > > >>> > > > >>> > > > Some further queries/concerns: > > > >>> > > > > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp > > > >>> > > > > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: > > > >>> > > > > > > >>> > > > ! _recursions = save // restore the old recursion count > > > >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // > > > >>> > > > increased by the deferred relock count > > > >>> > > > > > > >>> > > > what is the "deferred relock count"? I gather it relates to > > > >>> > > > > > > >>> > > > "The code was extended to be able to deoptimize objects of > a > > > >>> > > frame that > > > >>> > > > is not the top frame and to let another thread than the > > owning > > > >>> > > thread do > > > >>> > > > it." > > > >>> > > > > > >>> > > Yes, these relate. 
Currently EA based optimizations are reverted, > > > when a > > > >> compiled frame is > > > >>> > > replaced with corresponding interpreter frames. Part of this is > > > relocking > > > >> objects with eliminated > > > >>> > > locking. New with the enhancement is that we do this also just > > > before > > > >> object references are > > > >>> > > acquired through JVMTI. In this case we deoptimize also the > > > owning > > > >> compiled frame C and we > > > >>> > > register deoptimized objects as deferred updates. When control > > > returns > > > >> to C it gets deoptimized, > > > >>> > > we notice that objects are already deoptimized (reallocated and > > > >> relocked), so we don't do it again > > > >>> > > (relocking twice would be incorrect of course). Deferred > updates > > > are > > > >> copied into the new > > > >>> > > interpreter frames. > > > >>> > > > > > >>> > > Problem: relocking is not possible if the target thread T is > waiting > > > on the > > > >> monitor that needs to > > > >>> > > be relocked. This happens only with non-local objects with > > > >> EliminateNestedLocks. Instead relocking > > > >>> > > is deferred until T owns the monitor again. This is what the > piece > > of > > > >> code above does. > > > >>> > > > > >>> > Sorry I need some more detail here. How can you wait() on an > > > object > > > >>> > monitor if the object allocation and/or locking was optimised > > away? > > > And > > > >>> > what is a "non-local object" in this context? Isn't EA restricted to > > > >>> > thread-confined objects? > > > >>> > > > >>> "Non-local object" is an object that escapes its thread. The issue I'm > > > >> addressing with the changes > > > >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused > by > > > >> EliminateNestedLocks, where C2 > > > >>> eliminates recursive locking of an already owned lock. The lock > owning > > > object > > > >> exists on the heap, it > > > >>> is locked and you can call wait() on it. 
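The deferred relock count described above can be pictured with a small model. It is a sketch under assumptions, not the ObjectMonitor implementation: `JavaThreadModel`, `MonitorModel`, `begin_wait`, `end_wait`, and `defer_relock` are hypothetical names. Only the restore expression (saved recursions plus the thread's relock count, which is reset on read) follows the snippet quoted in this thread.

```cpp
#include <cassert>

// Hypothetical model; not the HotSpot classes with similar names.
struct JavaThreadModel {
  int relock_count_after_wait = 0;

  // Returns the number of deferred relocks and resets the counter.
  int get_and_reset_relock_count_after_wait() {
    int n = relock_count_after_wait;
    relock_count_after_wait = 0;
    return n;
  }
};

struct MonitorModel {
  JavaThreadModel* owner = nullptr;
  int recursions = 0;  // re-entries beyond the first acquisition

  // wait(): remember the recursion count and give up the monitor completely.
  int begin_wait(JavaThreadModel* jt) {
    assert(owner == jt);
    int save = recursions;
    recursions = 0;
    owner = nullptr;
    return save;
  }

  // After reacquiring: restore the old recursion count, increased by the
  // relocks that were deferred while the thread was waiting.
  void end_wait(JavaThreadModel* jt, int save) {
    assert(owner == nullptr);
    owner = jt;
    recursions = save + jt->get_and_reset_relock_count_after_wait();
  }
};

// Called on behalf of a deoptimizing thread that would have to relock a
// monitor the target thread is currently waiting on: record it for later.
void defer_relock(JavaThreadModel* waiter) {
  waiter->relock_count_after_wait++;
}
```

For example, if two nested lock operations were eliminated and the thread waits while holding the monitor once, two deferred relocks bring the recursion count to 2 once the wait returns.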
> > > >>>
> > > >>> EliminateLocks is the C2 option that controls lock elimination based on EA.
> > > >>> Both optimizations have in common that objects with eliminated locking need
> > > >>> to be relocked when deoptimizing a frame, i.e. when replacing a compiled
> > > >>> frame with equivalent interpreter frames. Deoptimization::relock_objects
> > > >>> does that job for /all/ eliminated locks in scope. /All/ can be a mix of
> > > >>> eliminated nested locks and locks of not-escaping objects.
> > > >>>
> > > >>> New with the enhancement: I call relock_objects earlier, just before objects
> > > >>> potentially escape. But then later when the owning compiled frame gets
> > > >>> deoptimized, I must not do it again:
> > > >>>
> > > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> > > >>>
> > > >>>   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> > > >>>       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > > >>>     bool unused;
> > > >>>     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> > > >>>   }
> > > >>>
> > > >>> Now when calling relock_objects early it is quite possible that I have to
> > > >>> relock an object the target thread currently waits for. Obviously I cannot
> > > >>> relock in this case, instead I chose to introduce relock_count_after_wait
> > > >>> to JavaThread.
> > > >>>
> > > >>> > Is it just that some of the locking gets optimized away e.g.
> > > >>> >
> > > >>> > synchronized(obj) {
> > > >>> >   synchronized(obj) {
> > > >>> >     synchronized(obj) {
> > > >>> >       obj.wait();
> > > >>> >     }
> > > >>> >   }
> > > >>> > }
> > > >>> >
> > > >>> > If this is reduced to a form as-if it were a single lock of the monitor
> > > >>> > (due to EA) and the wait() triggers a JVM TI event which leads to the
> > > >>> > escape of "obj" then we need to reconstruct the true lock state, and so
> > > >>> > when the wait() internally unblocks and reacquires the monitor it has to
> > > >>> > set the true recursion count to 3, not the 1 that it appeared to be when
> > > >>> > wait() was initially called. Is that the scenario?
> > > >>>
> > > >>> Kind of... except that the locking is not eliminated due to EA and there is
> > > >>> no JVM TI event triggered by wait.
> > > >>>
> > > >>> Add
> > > >>>
> > > >>>   LocalObject l1 = new LocalObject();
> > > >>>
> > > >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1.
> > > >>> This triggers the code in question.
> > > >>>
> > > >>> See that relocking/reallocating is transactional. If it is done, then it is
> > > >>> done for /all/ objects in scope and it is done at most once. It wouldn't be
> > > >>> quite so easy to split this into relocking of nested and EA-based
> > > >>> eliminated locks.
> > > >>>
> > > >>> > If so I find this truly awful. Anyone using wait() in a realistic form
> > > >>> > requires a notification and so the object cannot be thread confined. In
> > > >>>
> > > >>> It is not thread confined.
> > > >>>
> > > >>> > which case I would strongly argue that upon hitting the wait() the deopt
> > > >>> > should occur unconditionally and so the lock state is correct before we
> > > >>> > wait and so we don't need to mess with the recursion count internally
> > > >>> > when we reacquire the monitor.
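
The recursion-count question raised above can be observed outside HotSpot. The following is a minimal sketch, not code from the webrev: it uses java.util.concurrent.locks.ReentrantLock as an observable stand-in for an object monitor, and the class name NestedLockWaitDemo and its run() helper are made up for illustration. A thread acquires the lock three times and then waits on a condition; the wait gives up all three holds (otherwise the main thread could never acquire the lock), and the hold count is back at 3 when the wait returns. That restored count is the state a relock after wait() would have to reproduce if the nested acquisitions had been eliminated in compiled code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: ReentrantLock stands in for an object monitor so that
// the hold (recursion) count can be observed. This is not HotSpot code.
final class NestedLockWaitDemo {

    // Returns {hold count before the wait, hold count after the wait}.
    static int[] run() {
        final ReentrantLock lock = new ReentrantLock();
        final Condition cond = lock.newCondition();
        final CountDownLatch lockedThreeTimes = new CountDownLatch(1);
        final boolean[] signalled = {false};     // guarded by lock
        final int[] counts = new int[2];

        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.lock();
            lock.lock();                         // three nested acquisitions
            try {
                counts[0] = lock.getHoldCount();
                lockedThreeTimes.countDown();
                while (!signalled[0]) {
                    cond.awaitUninterruptibly(); // releases all three holds
                }
                counts[1] = lock.getHoldCount(); // restored before returning
            } finally {
                lock.unlock();
                lock.unlock();
                lock.unlock();
            }
        });
        waiter.start();

        try {
            lockedThreeTimes.await();
            // Succeeds only once the waiter has given up every hold in await.
            lock.lock();
            try {
                signalled[0] = true;
                cond.signalAll();
            } finally {
                lock.unlock();
            }
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = run();
        System.out.println("before wait: " + c[0] + ", after wait: " + c[1]);
    }
}
```

Running main prints "before wait: 3, after wait: 3". With intrinsic monitors the count is not visible from Java, which is why the sketch swaps in ReentrantLock.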
> > > >>> >
> > > >>> > >
> > > >>> > > > which I don't like the sound of at all when it comes to ObjectMonitor
> > > >>> > > > state. So I'd like to understand in detail exactly what is going on here
> > > >>> > > > and why. This is a very intrusive change that seems to badly break
> > > >>> > > > encapsulation and impacts future changes to ObjectMonitor that are under
> > > >>> > > > investigation.
> > > >>> > >
> > > >>> > > I would not regard this as breaking encapsulation. Certainly not badly.
> > > >>> > >
> > > >>> > > I've added a property relock_count_after_wait to JavaThread. The property is
> > > >>> > > well encapsulated. Future ObjectMonitor implementations have to deal with
> > > >>> > > recursion too. They are free in choosing a way to do that as long as that
> > > >>> > > property is taken into account. This is hardly a limitation.
> > > >>> >
> > > >>> > I do think this badly breaks encapsulation as you have to add a callout
> > > >>> > from the guts of the ObjectMonitor code to reach into the thread to get
> > > >>> > this lock count adjustment. I understand why you have had to do this but
> > > >>> > I would much rather see a change to the EA optimisation strategy so that
> > > >>> > this is not needed.
> > > >>> >
> > > >>> > > Note also that the property is a straightforward extension of the existing
> > > >>> > > concept of deferred local updates. It is embedded into the structure holding
> > > >>> > > them. So not even the footprint of a JavaThread is enlarged if no deferred
> > > >>> > > updates are generated.
> > > >>> >
> > > >>> > [...]
> > > >>> >
> > > >>> > >
> > > >>> > > I'm actually duplicating the existing external suspend mechanism, because a
> > > >>> > > thread can be suspended at most once. And hey, I don't like that either!
> > > >>> > > But it seems not unlikely that the duplicate can be removed together with
> > > >>> > > the original and the new type of handshakes that will be used for thread
> > > >>> > > suspend can be used for object deoptimization too. See today's discussion in
> > > >>> > > JDK-8227745 [2].
> > > >>> >
> > > >>> > I hope that discussion bears some fruit, at the moment it seems not to
> > > >>> > be possible to use handshakes here. :(
> > > >>> >
> > > >>> > The external suspend mechanism is a royal pain in the proverbial that we
> > > >>> > have to carefully live with. The idea that we're duplicating that for
> > > >>> > use in another fringe area of functionality does not thrill me at all.
> > > >>> >
> > > >>> > To be clear, I understand the problem that exists and that you wish to
> > > >>> > solve, but for the runtime parts I balk at the complexity cost of
> > > >>> > solving it.
> > > >>>
> > > >>> I know it's complex, but it is by far not rocket science.
> > > >>>
> > > >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing
> > > >>> the JVM TI specification.
> > > >>>
> > > >>> Thanks, Richard.
> > > >>>
> > > >>> -----Original Message-----
> > > >>> From: David Holmes
> > > >>> Sent: Tuesday, 17 December 2019 08:03
> > > >>> To: Reingruber, Richard; serviceability-dev at openjdk.java.net;
> > > >>> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net;
> > > >>> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > > >>> in the Presence of JVMTI Agents
> > > >>>
> > > >>> David
> > > >>>
> > > >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> > > >>>> Hi Richard,
> > > >>>>
> > > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> > > >>>>> Hi David,
> > > >>>>>
> > > >>>>>    > Some further queries/concerns:
> > > >>>>>    >
> > > >>>>>    > src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>    >
> > > >>>>>    > Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>    >
> > > >>>>>    > !   _recursions = save      // restore the old recursion count
> > > >>>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >>>>>    >
> > > >>>>>    > what is the "deferred relock count"? I gather it relates to
> > > >>>>>    >
> > > >>>>>    > "The code was extended to be able to deoptimize objects of a frame that
> > > >>>>>    > is not the top frame and to let another thread than the owning thread do
> > > >>>>>    > it."
> > > >>>>>
> > > >>>>> Yes, these relate. Currently EA based optimizations are reverted, when
> > > >>>>> a compiled frame is replaced with corresponding interpreter frames. Part
> > > >>>>> of this is relocking objects with eliminated locking. New with the
> > > >>>>> enhancement is that we do this also just before object references are
> > > >>>>> acquired through JVMTI. In this case we deoptimize also the owning compiled
> > > >>>>> frame C and we register deoptimized objects as deferred updates. When
> > > >>>>> control returns to C it gets deoptimized, we notice that objects are
> > > >>>>> already deoptimized (reallocated and relocked), so we don't do it again
> > > >>>>> (relocking twice would be incorrect of course). Deferred updates are copied
> > > >>>>> into the new interpreter frames.
> > > >>>>>
> > > >>>>> Problem: relocking is not possible if the target thread T is waiting
> > > >>>>> on the monitor that needs to be relocked. This happens only with non-local
> > > >>>>> objects with EliminateNestedLocks. Instead relocking is deferred until T
> > > >>>>> owns the monitor again.
> > > >>>>> This is what the piece of code above does.
> > > >>>>
> > > >>>> Sorry I need some more detail here. How can you wait() on an object
> > > >>>> monitor if the object allocation and/or locking was optimised away? And
> > > >>>> what is a "non-local object" in this context? Isn't EA restricted to
> > > >>>> thread-confined objects?
> > > >>>>
> > > >>>> Is it just that some of the locking gets optimized away e.g.
> > > >>>>
> > > >>>> synchronized(obj) {
> > > >>>>   synchronized(obj) {
> > > >>>>     synchronized(obj) {
> > > >>>>       obj.wait();
> > > >>>>     }
> > > >>>>   }
> > > >>>> }
> > > >>>>
> > > >>>> If this is reduced to a form as-if it were a single lock of the monitor
> > > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> > > >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> > > >>>> when the wait() internally unblocks and reacquires the monitor it has to
> > > >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> > > >>>> wait() was initially called. Is that the scenario?
> > > >>>>
> > > >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> > > >>>> requires a notification and so the object cannot be thread confined. In
> > > >>>> which case I would strongly argue that upon hitting the wait() the deopt
> > > >>>> should occur unconditionally and so the lock state is correct before we
> > > >>>> wait and so we don't need to mess with the recursion count internally
> > > >>>> when we reacquire the monitor.
> > > >>>>
> > > >>>>>
> > > >>>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
> > > >>>>>    > state. So I'd like to understand in detail exactly what is going on here
> > > >>>>>    > and why. This is a very intrusive change that seems to badly break
> > > >>>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
> > > >>>>>    > investigation.
> > > >>>>>
> > > >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> > > >>>>>
> > > >>>>> I've added a property relock_count_after_wait to JavaThread. The
> > > >>>>> property is well encapsulated. Future ObjectMonitor implementations have
> > > >>>>> to deal with recursion too. They are free in choosing a way to do that as
> > > >>>>> long as that property is taken into account. This is hardly a limitation.
> > > >>>>
> > > >>>> I do think this badly breaks encapsulation as you have to add a callout
> > > >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> > > >>>> this lock count adjustment. I understand why you have had to do this but
> > > >>>> I would much rather see a change to the EA optimisation strategy so that
> > > >>>> this is not needed.
> > > >>>>
> > > >>>>> Note also that the property is a straightforward extension of the
> > > >>>>> existing concept of deferred local updates. It is embedded into the
> > > >>>>> structure holding them. So not even the footprint of a JavaThread is
> > > >>>>> enlarged if no deferred updates are generated.
> > > >>>>>
> > > >>>>>    > ---
> > > >>>>>    >
> > > >>>>>    > src/hotspot/share/runtime/thread.cpp
> > > >>>>>    >
> > > >>>>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
> > > >>>>>    > has to be handcrafted in this way rather than using proper transitions.
> > > >>>>>    >
> > > >>>>>
> > > >>>>> I wrote wait_for_object_deoptimization taking
> > > >>>>> JavaThread::java_suspend_self_with_safepoint_check as template.
> > > >>>>> So in short: for the same reasons :)
> > > >>>>>
> > > >>>>> Threads reach both methods as part of thread state transitions,
> > > >>>>> therefore special handling is required to change thread state on top of
> > > >>>>> ongoing transitions.
> > > >>>>>
> > > >>>>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
> > > >>>>>    > it being added back (effectively). This seems like it may be something
> > > >>>>>    > that handshakes could be used for.
> > > >>>>>
> > > >>>>> Deopt suspend used to be something rather different with a similar
> > > >>>>> name[1]. It is not being added back.
> > > >>>>
> > > >>>> I stand corrected. Despite comments in the code to the contrary
> > > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> > > >>>> cleanup in this area 13 years ago :)
> > > >>>>
> > > >>>>>
> > > >>>>> I'm actually duplicating the existing external suspend mechanism,
> > > >>>>> because a thread can be suspended at most once. And hey, I don't like that
> > > >>>>> either! But it seems not unlikely that the duplicate can be removed
> > > >>>>> together with the original and the new type of handshakes that will be
> > > >>>>> used for thread suspend can be used for object deoptimization too. See
> > > >>>>> today's discussion in JDK-8227745 [2].
> > > >>>>
> > > >>>> I hope that discussion bears some fruit, at the moment it seems not to
> > > >>>> be possible to use handshakes here. :(
> > > >>>>
> > > >>>> The external suspend mechanism is a royal pain in the proverbial that we
> > > >>>> have to carefully live with. The idea that we're duplicating that for
> > > >>>> use in another fringe area of functionality does not thrill me at all.
> > > >>>>
> > > >>>> To be clear, I understand the problem that exists and that you wish to
> > > >>>> solve, but for the runtime parts I balk at the complexity cost of
> > > >>>> solving it.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> David
> > > >>>> -----
> > > >>>>
> > > >>>>> Thanks, Richard.
> > > >>>>>
> > > >>>>> [1] Deopt suspend was something like an async. handshake for
> > > >>>>>     architectures with register windows, where patching the return pc for
> > > >>>>>     deoptimization of a compiled frame was racy if the owner thread was in
> > > >>>>>     native code. Instead a "deopt" suspend flag was set on which the thread
> > > >>>>>     patched its own frame upon return from native. So no thread was
> > > >>>>>     suspended. It got its name only from the name of the flags.
> > > >>>>>
> > > >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> > > >>>>>
> > > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> > > >>>>>
> > > >>>>>
> > > >>>>> -----Original Message-----
> > > >>>>> From: David Holmes
> > > >>>>> Sent: Friday, 13 December 2019 00:56
> > > >>>>> To: Reingruber, Richard;
> > > >>>>> serviceability-dev at openjdk.java.net;
> > > >>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>
> > > >>>>> Hi Richard,
> > > >>>>>
> > > >>>>> Some further queries/concerns:
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>
> > > >>>>> Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>
> > > >>>>> !   _recursions = save      // restore the old recursion count
> > > >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >>>>>
> > > >>>>> what is the "deferred relock count"? I gather it relates to
> > > >>>>>
> > > >>>>> "The code was extended to be able to deoptimize objects of a frame that
> > > >>>>> is not the top frame and to let another thread than the owning thread do
> > > >>>>> it."
> > > >>>>>
> > > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> > > >>>>> state. So I'd like to understand in detail exactly what is going on here
> > > >>>>> and why. This is a very intrusive change that seems to badly break
> > > >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> > > >>>>> investigation.
> > > >>>>>
> > > >>>>> ---
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/thread.cpp
> > > >>>>>
> > > >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> > > >>>>> has to be handcrafted in this way rather than using proper transitions.
> > > >>>>>
> > > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> > > >>>>> it being added back (effectively). This seems like it may be something
> > > >>>>> that handshakes could be used for.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> David
> > > >>>>> -----
> > > >>>>>
> > > >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> > > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> > > >>>>>>> Hi David,
> > > >>>>>>>
> > > >>>>>>>     > Most of the details here are in areas I can comment on in detail,
> > > >>>>>>>     > but I did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> Thanks for taking the time!
> > > >>>>>>
> > > >>>>>> Apologies the above should read:
> > > >>>>>>
> > > >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> > > >>>>>> ..."
> > > >>>>>>
> > > >>>>>> David
> > > >>>>>>
> > > >>>>>>>     > The only thing that jumped out at me is that I think the
> > > >>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>     >
> > > >>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Yes, it should. Will add the method like above.
> > > >>>>>>>
> > > >>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>>     > Without active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> > > >>>>>>> workload. I will add a minimal test to keep it fresh.
> > > >>>>>>>
> > > >>>>>>>     > Also on the tests I don't understand your @requires clause:
> > > >>>>>>>     >
> > > >>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>>     > (vm.opt.TieredCompilation != true))
> > > >>>>>>>     >
> > > >>>>>>>     > This seems to require that TieredCompilation is disabled, but tiered
> > > >>>>>>>     > is our normal mode of operation. ??
> > > >>>>>>>     >
> > > >>>>>>>
> > > >>>>>>> I removed the clause. I guess I wanted to target the tests towards the
> > > >>>>>>> code they are supposed to test, and it's easier to analyze failures w/o
> > > >>>>>>> tiered compilation and with just one compiler thread.
> > > >>>>>>>
> > > >>>>>>> Additionally I will make use of
> > > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Richard.
> > > >>>>>>>
> > > >>>>>>> -----Original Message-----
> > > >>>>>>> From: David Holmes
> > > >>>>>>> Sent: Wednesday, 11 December 2019 08:03
> > > >>>>>>> To: Reingruber, Richard;
> > > >>>>>>> serviceability-dev at openjdk.java.net;
> > > >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>>>
> > > >>>>>>> Hi Richard,
> > > >>>>>>>
> > > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> I would like to get reviews please for
> > > >>>>>>>>
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> > > >>>>>>>>
> > > >>>>>>>> Corresponding RFE:
> > > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> > > >>>>>>>>
> > > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> > > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> > > >>>>>>>>
> > > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> > > >>>>>>>> issues (thanks!). In addition the change is being tested at SAP since
> > > >>>>>>>> I posted the first RFR some months ago.
> > > >>>>>>>>
> > > >>>>>>>> The intention of this enhancement is to benefit performance wise from
> > > >>>>>>>> escape analysis even if JVMTI agents request capabilities that allow
> > > >>>>>>>> them to access local variable values. E.g. if you start-up with
> > > >>>>>>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape
> > > >>>>>>>> analysis is disabled right from the beginning, well before a debugger
> > > >>>>>>>> attaches -- if ever one should do so. With the enhancement, escape
> > > >>>>>>>> analysis will remain enabled until and after a debugger attaches.
> > > >>>>>>>> EA based optimizations are reverted just before an agent acquires the
> > > >>>>>>>> reference to an object. In the JBS item you'll find more details.
> > > >>>>>>>
> > > >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> > > >>>>>>> did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> The only thing that jumped out at me is that I think the
> > > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>
> > > >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>> Without active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> Also on the tests I don't understand your @requires clause:
> > > >>>>>>>
> > > >>>>>>>   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>> (vm.opt.TieredCompilation != true))
> > > >>>>>>>
> > > >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> > > >>>>>>> our normal mode of operation. ??
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> David
> > > >>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Richard.
> > > >>>>>>>>
> > > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> > > >>>>>>>>
> > > >>>>>>>>

From aph at redhat.com Wed May 6 14:38:23 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 6 May 2020 15:38:23 +0100
Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To: <2c9c0c59-73cc-4c88-c568-deba5d73e4f5@oracle.com>
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> <55cd970b-5a4e-4d1a-e17c-77289a412022@redhat.com> <2c9c0c59-73cc-4c88-c568-deba5d73e4f5@oracle.com>
Message-ID: <408cac5b-0151-b0cb-a1a2-5bce673e42b0@redhat.com>

On 5/6/20 10:14 AM, Christian Hagedorn wrote:
> On 06.05.20 10:40, Andrew Haley wrote:
>> On 5/6/20 8:26 AM, Christian Hagedorn wrote
>>> The fix was to
>>> only do it if ShowMessageBoxOnError was set. So, I think it's indeed a
>>> good idea to also do the same for other architectures.
>> Really? Can't we fix stop() so that it generates only a tiny little bit
>> of code? I can do it if you like.
> This is probably what you've meant. I agree with you, the fewer
> instructions emitted by stop() the better.

ShowMessageBoxOnError is completely the wrong thing to use. I run in
GDB all the time, and I want accurate error messages, and I never use
ShowMessageBoxOnError.

If we get this right we can have near-zero code expansion, and keep
this on at all times, even in production.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From derekw at marvell.com Wed May 6 15:47:24 2020
From: derekw at marvell.com (Derek White)
Date: Wed, 6 May 2020 15:47:24 +0000
Subject: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To: <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com>
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com>
Message-ID:

Hi Ningsheng,

Completely agree! I've had a patch related to this on the back-burner for a
long time. If Andrew's trap version works out that would be good too.

 - Derek

-----Original Message-----
From: aarch64-port-dev On Behalf Of Ningsheng Jian
Sent: Wednesday, May 6, 2020 2:35 AM
To: Liu, Xin; Doerr, Martin; hotspot-compiler-dev at openjdk.java.net
Cc: aarch64-port-dev at openjdk.java.net
Subject: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86

Hi Xin,

Martin's review comments remind me that we should worry about the code size
increase on AArch64 with your patch, given that the HaltNode will be
generated in many cases now.

Currently the AArch64 MacroAssembler::stop() generates many register saving
instructions via pusha() before calling debug64(). But I think debug64() only
uses the regs[] and pc arguments when ShowMessageBoxOnError is on. Maybe we
should at least only do the saving and pc arg passing when
ShowMessageBoxOnError is on in MacroAssembler::stop(), as x86 does in
macroAssembler_x86.cpp?

Thanks,
Ningsheng

On 5/5/20 7:26 AM, Liu, Xin wrote:
> Hi, Martin,
>
> Thank you for reviewing it and building it on s390 and PPC!
>
> If I delete size(4) in ppc.ad, hotspot can work out the correct size of instruction chunk, can't it?
> I found most of the instructions in ppc.ad have size(xx), but there are a couple of exceptions: cacheWB & cacheWBPreSync.
>
> I think it should be in product hotspot. Here are my arguments.
> 1. Crash reports of release builds are more common.
> Some customers don't even bother trying again with a debug build.
>
> Let me take the crash report on aarch64 as an example. I pasted the comparison before and after.
> https://bugs.openjdk.java.net/browse/JDK-8230552?focusedCommentId=14334977&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14334977
>
> Without stop(_halt_reason), what we know is "there's a bug in C2-generated code". If the method is big, which is very likely because of inlining, it's easy to get lost.
>
> I feel it's more helpful with the patch. We can locate which HaltNode it is by searching _halt_reason in the codebase.
> Hopefully, we can find the culprit method from "Compilation events".
>
> 2. HaltNode is rarely generated and will be removed if it's dead.
> IMHO, the semantics of that node is "halt". If it remains after the optimizer or lowering to mach-nodes, something wrong and unrecoverable happened in the compilers. After we fix the compiler bug, it should be gone. That is to say, it shouldn't cause any problem with code size in ideal cases.
>
> In reality, I observe that a HaltNode always follows the uncommon_trap call. Christian also observed that in JDK-8022574.
> Isn't uncommon trap a one-way ticket for all architectures? I feel the control flow never returns after uncommon_trap, so why do we generate a HaltNode after that? Nevertheless, it's a separate issue.
>
> Let me make another revision to fix PPC. I also found that sparc.ad is going to be gone.
>
> Thanks,
> --lx
>
>
> On 5/4/20, 1:47 PM, "Doerr, Martin" wrote:
>
> Hi lx,
>
> the size attribute is wrong on PPC64 (stop is larger than 4 Bytes). S390 builds fine.
> I've only run the build. No tests.
>
> Should this feature be debug-only?
> Do we want the lengthy code emitted in product build?
>
> Best regards,
> Martin
>
>
> > -----Original Message-----
> > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Liu, Xin
> > Sent: Thursday, 30 April 2020 06:03
> > To: hotspot-compiler-dev at openjdk.java.net
> > Subject: RFR(XS): Provide information when hitting a HaltNode for
> > architectures other than x86
> >
> > Hi,
> >
> > Could you review this small patch? It unifies codegen of HaltNode for other
> > architectures.
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8230552
> > Webrev: https://cr.openjdk.java.net/~xliu/8230552/00/webrev/
> >
> > I tested on aarch64. It generates the same crash report as x86_64 when it
> > does hit HaltNode. Halt reason is displayed. I pasted the report on the JBS.
> > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3
> > relevant failures[1].
> >
> > I planned to do that on aarch64 only, but it's trivial on other architectures, so I
> > bravely modified them all. May I invite s390, SPARC, arm32 maintainers to take a
> > look at it?
> > If it goes through the review, I hope a sponsor can help me to push to the
> > submit repo and see if it works.
> >
> > [1] those 3 tests failed on aarch64 with/without my changes.
> > gc/shenandoah/mxbeans/TestChurnNotifications.java#id2
> > gc/shenandoah/mxbeans/TestChurnNotifications.java#id1
> > gc/shenandoah/mxbeans/TestPauseNotifications.java#id1
> >
> > thanks,
> > -lx
> >
>

From nils.eliasson at oracle.com Wed May 6 17:12:26 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 6 May 2020 19:12:26 +0200
Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps
In-Reply-To:
References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com>
Message-ID: <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com>

After posting this I noticed another problem. The sweeper should wake up
regularly, but now it is only awakened when hitting the SweepAggressive
threshold. This is wrong.

I suggest holding off on the fix until all the problems have been ironed out.

Best regards,
Nils Eliasson

On 2020-05-06 11:52, Nils Eliasson wrote:
>
>
> On 2020-05-05 20:57, Man Cao wrote:
>> Hi Laurent, Nils,
>>
>> > Using 40mb code cache looks quite small compared to default values
>> (256mb ?).
>> The 40MB code cache size is for -XX:-TieredCompilation, for the
>> DaCapo runs.
>> "java -XX:-TieredCompilation -XX:+PrintFlagsFinal |& grep
>> ReservedCodeCacheSize" shows the default size is 48MB for
>> -TieredCompilation.
>> So 40MB is not particularly small compared to 48MB.
>>
>> The Blaze runs use "-XX:+TieredCompilation" and a Google-default code
>> cache size of 480MB (we have doubled the default size).
>> This actually tests a case that is closer to the OpenJDK default.
>>
>> > Could you run 1 quick experiment with default code cache settings
>> to see if there is no regression in the main use case?
>> Yes, I just launched an experiment running DaCapo at JDK source tip,
>> only with "-Xms4g -Xmx4g" and some logging flags.
>> This is a meaningful experiment as it uses the real OpenJDK default
>> flags, and the most up-to-date source.
>>
>> > Why do you expect code cache usage would increase a lot? The sweeper
>> > still wakes up regularly and cleans the code cache. The code path fixed
>> > is just about sweeping extra aggressively under some circumstances. Some
>> > nmethod might live a little longer, but they will still be cleaned.
>> I found the aggressive sweeping deoptimized a non-trivial number of
>> nmethods, which kept the usage low.
>> After my bugfix, the sweeper only wakes up when the usage reaches
>> (100-StartAggressiveSweepingAt)% of code cache capacity, which
>> defaults to 90%.
>> This means many nmethods will not be cleaned until there is pressure
>> in code cache usage.
>>
>> > One number you could add to your benchmark numbers is the number of
>> > nmethods reclaimed and code cache usage. I expect both to remain the same.
>> The benchmark result htmls already show the code cache usage (Code
>> Cache Used (MB)).
>> You can CTRL-F for "Code Cache" in the browser.
>>
>> There is a significant increase in code cache usage:
>> For DaCapo, up to 13.5MB (for tradesoap) increase for the 40MB
>> ReservedCodeCacheSize.
>> For Blaze, the increase is 33MB-80MB for the
>> 480MB ReservedCodeCacheSize.
>> The code cache usage metric is measured like this (we added an
>> hsperfdata counter for it), at the end of a benchmark run:
>>   size_t result = 0;
>>   FOR_ALL_ALLOCABLE_HEAPS(heap) {
>>     result += (*heap)->allocated_capacity();
>>   }
>>   _code_cache_used_size->set_value(result);
>>
>> I also looked at the logs; they show that the bugfix eliminated almost
>> all of the code cache flushes. They also contain the number of
>> nmethods reclaimed.
>> E.g., for tradesoap:
>>
>> Without the fix:
>> Code cache sweeper statistics:
>>   Total sweep time:                1902 ms
>>   Total number of full sweeps:     23301
Total number of flushed methods: 7080 (thereof 7080 C2 methods)
>>    Total size of flushed methods:   20468 kB
>> With the fix:
>> Code cache sweeper statistics:
>>    Total sweep time:                0 ms
>>    Total number of full sweeps:     0
>>    Total number of flushed methods: 0 (thereof 0 C2 methods)
>>    Total size of flushed methods:   0 kB
>>
>
> Interesting results and a clear indication that something is broken in
> the sweeper heuristics - Nmethods should still be flushed!
>
> Looking at sweeper.cpp I see something that looks wrong. The
> _last_sweep counter is updated even if no sweep was done. In low code
> cache usage scenarios that means it might never reach the threshold.
>
> I think the curly brace should be moved down like this:
>
> --- a/src/hotspot/share/runtime/sweeper.cpp     Tue May 05 21:28:46 2020 +0200
> +++ b/src/hotspot/share/runtime/sweeper.cpp     Wed May 06 11:46:01 2020 +0200
> @@ -445,16 +445,17 @@
>    if (_should_sweep || forced) {
>      init_sweeper_log();
>      sweep_code_cache();
> +
> +    // We are done with sweeping the code cache once.
> +    _total_nof_code_cache_sweeps++;
> +    _last_sweep = _time_counter;
> +    // Reset flag; temporarily disables sweeper
> +    _should_sweep = false;
> +    // If there was enough state change, 'possibly_enable_sweeper()'
> +    // sets '_should_sweep' to true
> +    possibly_enable_sweeper();
>    }
>
> -  // We are done with sweeping the code cache once.
> -  _total_nof_code_cache_sweeps++;
> -  _last_sweep = _time_counter;
> -  // Reset flag; temporarily disables sweeper
> -  _should_sweep = false;
> -  // If there was enough state change, 'possibly_enable_sweeper()'
> -  // sets '_should_sweep' to true
> -  possibly_enable_sweeper();
>    // Reset _bytes_changed only if there was enough state change. _bytes_changed
>    // can further increase by calls to 'report_state_change'.
>    if (_should_sweep) {
>
> Can you try it out and see if things improve?
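[Editorial sketch] The concern about _last_sweep can be illustrated with a toy model. This is not HotSpot code: ToySweeper, poll(), count_sweeps() and the fixed wait threshold are hypothetical names and simplifications, assumed only for illustration. The model shows why resetting the "last sweep" timestamp on every poll, whether or not a sweep ran, can keep the elapsed time from ever reaching the threshold.

```cpp
#include <cassert>

// Toy model only; every name here is hypothetical and not part of HotSpot.
struct ToySweeper {
  long time_counter = 0;        // advances once per poll
  long last_sweep = 0;          // recorded time of the "last sweep"
  long sweeps = 0;              // number of sweeps that actually ran
  long wait_threshold;          // polls that must elapse before a sweep is due
  bool reset_only_after_sweep;  // true models the proposed brace placement

  ToySweeper(long threshold, bool fixed)
      : wait_threshold(threshold), reset_only_after_sweep(fixed) {
    assert(threshold > 0);
  }

  void poll() {
    ++time_counter;
    const bool should_sweep = (time_counter - last_sweep) >= wait_threshold;
    if (should_sweep) {
      ++sweeps;
      if (reset_only_after_sweep) {
        last_sweep = time_counter;  // bookkeeping inside the 'if'
      }
    }
    if (!reset_only_after_sweep) {
      last_sweep = time_counter;    // old placement: runs on every poll
    }
  }
};

// Runs the model for a number of polls and reports how many sweeps happened.
long count_sweeps(long polls, long threshold, bool fixed) {
  ToySweeper s(threshold, fixed);
  for (long i = 0; i < polls; ++i) {
    s.poll();
  }
  return s.sweeps;
}
```

In this model, with a threshold of 10 polls, the old placement never sweeps in 100 polls because the elapsed time is always 1, while the proposed placement sweeps on every tenth poll. The real heuristic also takes code cache occupancy into account, which this sketch deliberately leaves out.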
> > Best regards, > Nils Eliasson > > >> Anyway, just to reiterate, we think the improvement in throughput and >> CPU usage is well worth the increase in code cache usage. >> If the increase causes any problem, we would advise users to accept >> the increase and fully provision memory for the >> entire?ReservedCodeCacheSize. >> >> -Man >> >> >> On Tue, May 5, 2020 at 3:18 AM Nils Eliasson >> > wrote: >> >> ??? Hi Man, >> >> ??? Why do you expect code cache usage would increase a lot? The sweeper >> ??? still wakes up regularly and cleans the code cache. The code path >> ??? fixed >> ??? is just about sweeping extra aggressively under some >> ??? circumstances. Some >> ??? nmethod might live a little longer, but they will still be cleaned. >> >> ??? Without your bugfix the sweeper will be notified for every new >> ??? allocation in the codecache as soon as code cache usages has gone >> ??? beyond >> ??? 10%. That could in the worst case be one sweep for every allocation. >> >> ??? One number you could add to your benchmark numbers is the number of >> ??? nmethods reclaimed and code cache usage. I expect both to remain >> ??? the same. >> >> ??? Best regards, >> ??? Nils >> >> >> >> ??? On 2020-05-04 21:21, Man Cao wrote: >> ??? > Hi, >> ??? > >> ??? > Thanks for the review! >> ??? > Yes, the code change is trivial, but runtime behavior change is >> ??? > considerable. >> ??? > In particular, throughput and CPU usage could improve, but code >> ??? cache usage >> ??? > could increase a lot. In our experience, the improvement in >> ??? throughput and >> ??? > CPU is well worth the code cache increase. >> ??? > >> ??? > I have attached some benchmarking results in JBS. They are based >> ??? on JDK11. >> ??? > We have not rolled out this fix to our production JDK11 yet, as >> ??? I'd like to >> ??? > confirm that this large change in runtime behavior is OK with >> ??? the OpenJDK >> ??? > community. >> ??? 
> We are happy to share some performance numbers from production >> ??? workload >> ??? > once we have them. >> ??? > >> ??? > -Man >> ??? > >> ??? > >> ??? > On Mon, May 4, 2020 at 4:11 AM Laurent Bourg?s >> ??? > >> ??? > wrote: >> ??? > >> ??? >> Hi, >> ??? >> >> ??? >> Do you have performance results to justify your assumption "could >> ??? >> significantly improve performance" ? >> ??? >> >> ??? >> Please share numbers in the jbs bug >> ??? >> >> ??? >> Laurent >> ??? >> >> ??? >> Le sam. 2 mai 2020 ? 07:35, Man Cao > ??? > a ?crit : >> ??? >> >> ??? >>> Hi all, >> ??? >>> >> ??? >>> Can I have reviews for this one-line change that fixes a bug >> ??? and could >> ??? >>> significantly improve performance? >> ??? >>> Webrev: https://cr.openjdk.java.net/~manc/8244278/webrev.00/ >> ??? >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244278 >> ??? >>> >> ??? >>> It passes tier-1 tests locally, as well as >> ??? >>> "vm/mlvm/meth/stress/compiler/deoptimize" (for the original >> ??? JDK-8046809). >> ??? >>> >> ??? >>> -Man >> ??? >>> >> > From aph at redhat.com Wed May 6 17:27:58 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 6 May 2020 18:27:58 +0100 Subject: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> On 5/6/20 4:47 PM, Derek White wrote: > If Andrew's trap version works out that would be good too. Oh, uh, I guess I'll have to write it, then. :-) -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From john.r.rose at oracle.com Wed May 6 20:00:52 2020 From: john.r.rose at oracle.com (John Rose) Date: Wed, 6 May 2020 13:00:52 -0700 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <871rnx76go.fsf@redhat.com> References: <871rnx76go.fsf@redhat.com> Message-ID: <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> On May 6, 2020, at 2:33 AM, Roland Westrelin wrote: > > > https://bugs.openjdk.java.net/browse/JDK-8244504 > http://cr.openjdk.java.net/~roland/8244504/webrev.00/ > > This is some refactoring in the counted loop code to prepare for 8223051 > (support loops with long (64b) trip counts). Some of the changes came up > in the review of 8223051 (that patch used to be part of 8223051). Very quick comment: this looks wrong: + jlong hi = uhi > (julong)max_jint && ulo < (julong)max_jint ? max_jint : MAX2((jlong)uhi, (jlong)ulo); max_jint should be max_jlong, no? I suppose it?s from an incompletely edited copy/paste. From john.r.rose at oracle.com Wed May 6 20:19:03 2020 From: john.r.rose at oracle.com (John Rose) Date: Wed, 6 May 2020 13:19:03 -0700 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> Message-ID: <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> On May 6, 2020, at 1:00 PM, John Rose wrote: > > On May 6, 2020, at 2:33 AM, Roland Westrelin > wrote: >> >> >> https://bugs.openjdk.java.net/browse/JDK-8244504 >> http://cr.openjdk.java.net/~roland/8244504/webrev.00/ >> >> This is some refactoring in the counted loop code to prepare for 8223051 >> (support loops with long (64b) trip counts). Some of the changes came up >> in the review of 8223051 (that patch used to be part of 8223051). 
>
> Very quick comment: this looks wrong:
>
> + jlong hi = uhi > (julong)max_jint && ulo < (julong)max_jint ? max_jint : MAX2((jlong)uhi, (jlong)ulo);
>
> max_jint should be max_jlong, no? I suppose it's from an incompletely
> edited copy/paste.

Looking a little more at the interval arithmetic subroutines,
I think it would be reasonable to leave out most of the linkage
to TypeInt/TypeLong, and isolate the logic that does the min-ing
and max-ing, with separate routines for translating to and from
the Type* world. Maybe:

struct MinMaxInterval {
  julong shi, slo, uhi, ulo;
  bool is_int;
  void signedMaxWith(const MinMaxInterval& that);
  void unsignedMaxWith(const MinMaxInterval& that);
  void signedMinWith(const MinMaxInterval& that);
  void unsignedMinWith(const MinMaxInterval& that);
  MinMaxInterval(TypeInt*);
  MinMaxInterval(TypeLong*);
  const TypeInt* asTypeInt();
  const TypeLong* asTypeLong();
};

It would be overkill in many cases to put such small routines
into their own class, but in this case the min/max interval logic
is very subtle and deserves a little display platform.

From manc at google.com Wed May 6 20:20:30 2020
From: manc at google.com (Man Cao)
Date: Wed, 6 May 2020 13:20:30 -0700
Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps
In-Reply-To: <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com>
References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com>
Message-ID:

Hi,

[@Laurent]
> Thanks Man for your results.
> I will try your fix on jdk15 repo and run my Marlin tests & benchmark to see if there is any gain in these cases.
You are welcome. I have run DaCapo at JDK tip, with default JVM options.
I didn't see any noticeable difference in performance with and without my bugfix.
This is probably due to significantly reduced code cache flushes with a
large default ReservedCodeCacheSize (240MB) for +TieredCompilation.
I checked the logs for tradesoap for runs without my bugfix, to count the
number of completed flushes (NMethodSweeper::sweep_code_cache()):
~550 for runs with -XX:-TieredCompilation -XX:ReservedCodeCacheSize=40m on JDK11
~35 for runs with default options on JDK tip (+TieredCompilation, ReservedCodeCacheSize=240m)
The flushes are reduced by more than 15X with the default options.

[@Nils]
> Looking at sweeper.cpp I see something that looks wrong. The _last_sweep counter is updated even if no sweep was done. In low code cache usage scenarios that means it might never reach the threshold.
> Can you try it out and see if things improve?
The change makes sense to me. I can try it out together after resolving the next issue.

> The sweeper should wake up
> regularly, but now it is only awakened when hitting the SweepAggressive
> threshold. This is wrong.
> I suggest holding off on the fix until all the problems have been ironed out.
Could you elaborate on the expected frequency for waking up the sweeper?
Should we increase the default value for StartAggressiveSweepingAt instead?
As you suggested before, currently the sweeper is awakened for every new
allocation in the code cache, after the usage is above 10%.
This is definitely so frequent that it hurts performance, especially for
the -XX:-TieredCompilation case.

We do find that turning off code cache flushing in JDK11
(-XX:-UseCodeCacheFlushing) could significantly improve performance, by ~20%
for an important production workload configured with -XX:-TieredCompilation!
Thus, we strongly support keeping the default flushing frequency low.
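[Editorial sketch] The two wake-up behaviours described in this thread can be written down as a small model. It is built only from the wording of the messages: roughly "above 10% usage" without the bugfix, and "(100-StartAggressiveSweepingAt)% of capacity" with it. The function notifies_sweeper() and its parameters are hypothetical and do not correspond to any HotSpot function.

```cpp
#include <cassert>

// Hypothetical helper, not a HotSpot function. Returns true if an allocation
// at the given code cache usage would wake the sweeper, following only the
// descriptions given in this thread.
bool notifies_sweeper(double used_mb, double capacity_mb,
                      int start_aggressive_sweeping_at, bool with_bugfix) {
  assert(capacity_mb > 0 && used_mb >= 0 && used_mb <= capacity_mb);
  const double used_percent = 100.0 * used_mb / capacity_mb;
  const double threshold_percent =
      with_bugfix ? 100.0 - start_aggressive_sweeping_at  // 90% for a value of 10
                  : 10.0;  // "above 10%", as described in the thread
  return used_percent >= threshold_percent;
}
```

Assuming a 240MB code cache and StartAggressiveSweepingAt=10, a usage of 30MB (12.5%) wakes the sweeper on every allocation in the "without bugfix" model but not in the "with bugfix" model, while a usage of 220MB (about 92%) wakes it in both. This gap is what the question about the expected wake-up frequency is about.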
-Man From martin.doerr at sap.com Wed May 6 20:45:35 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 6 May 2020 20:45:35 +0000 Subject: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: Hi Andrew, I had also thought about using a trap based implementation. Maybe it makes sense to add a feature to shared code for that. E.g. we could emit an illegal instruction (which raises SIGILL) followed by some kind of index into a descriptor array. PPC64 would also benefit from a more compact solution. Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Mittwoch, 6. Mai 2020 19:28 > To: Derek White ; Ningsheng Jian > ; Liu, Xin ; Doerr, Martin > ; hotspot-compiler-dev at openjdk.java.net > Cc: aarch64-port-dev at openjdk.java.net > Subject: Re: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information > when hitting a HaltNode for architectures other than x86 > > On 5/6/20 4:47 PM, Derek White wrote: > > If Andrew's trap version works out that would be good too. > > Oh, uh, I guess I'll have to write it, then. :-) > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From mikael.vidstedt at oracle.com Thu May 7 05:16:27 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:16:27 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: Thank you for the review! Comments inline.. 
> On May 4, 2020, at 1:28 AM, Stefan Karlsson wrote: > > Hi Mikael, > > On 2020-05-04 07:12, Mikael Vidstedt wrote: >> Please review this change which implements part of JEP 381: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > > I went over this patch and collected some comments: > > src/hotspot/share/adlc/output_c.cpp > src/hotspot/share/adlc/output_h.cpp > > Awkward code layout after change to. Indeed - fixed! > src/hotspot/share/c1/c1_Runtime1.cpp > src/hotspot/share/classfile/classListParser.cpp > src/hotspot/share/memory/arena.hpp > src/hotspot/share/opto/chaitin.cpp > test/hotspot/jtreg/gc/TestCardTablePageCommits.java > > Surrounding comments still refers to Sparc and/or Solaris. > > There are even more places if you search in the rest of the HotSpot source. Are we leaving those for a separate cleanup pass? Correct - I deliberately avoided changing comments that were not immediately ?obvious? how to address and/or that were pre-existing issues, since it?s not necessarily wrong for a comment to refer to Solaris or SPARC even after these changes. I would prefer to do that as follow-ups. Fair? > src/hotspot/share/gc/g1/g1HeapRegionAttr.hpp > > Remove comment: > // We use different types to represent the state value depending on platform as > // some have issues loading parts of words. Fixed. > src/hotspot/share/gc/shared/memset_with_concurrent_readers.hpp > > Fuse the declaration and definition, now that we only have one implementation. Maybe even remove function/file at some point. Fixed (fused). > src/hotspot/share/utilities/globalDefinitions.hpp > > Now that STACK_BIAS is always 0, should we remove its usages? Follow-up RFE? Yes, this is one of the things I have on my list to file a follow-up for. 
> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/HotSpotGraphBuilderPlugins.java > > Maybe remove decryptSuffix? Fixed. > src/utils/hsdis/Makefile > > Is this really correct? > > Shouldn't: > ARCH1=$(CPU:x86_64=amd64) > ARCH2=$(ARCH1:i686=i386) > ARCH=$(ARCH2:sparc64=sparcv9) > > be changed to: > ARCH1=$(CPU:x86_64=amd64) > ARCH=$(ARCH1:i686=i386) > > so that we have ARCH defined? Very good catch! This Makefile could use some indentation love or just a plain rewrite.. In either case I fixed the ARCH definition and tested it to make sure the end result seemed to do the right thing (and AFAICT it does). > Other than that this looks good to me. Thank you! Cheers, Mikael > >> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >> Background: >> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >> For convenience, I?m including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. 
>> In case it helps, the changes were effectively produced by searching for and updating any code mentioning ?solaris", ?sparc?, ?solstudio?, ?sunos?, etc. More information about the areas impacted can be found in the JEP itself. >> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! >> Also, I have a short list of follow-ups which I?m going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >> Testing: >> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >> Cheers, >> Mikael >> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ From mikael.vidstedt at oracle.com Thu May 7 05:19:07 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:19:07 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <2DDEE6DB-577A-4E23-B78D-0D19A15083F2@oracle.com> > On May 4, 2020, at 2:11 AM, Thomas Schatzl wrote: > > Hi, > > On 04.05.20 10:28, Stefan Karlsson wrote: >> Hi Mikael, >> On 2020-05-04 07:12, Mikael Vidstedt wrote: >>> >>> Please review this change which implements part of JEP 381: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> I went over this patch and collected some comments: >> src/hotspot/share/adlc/output_c.cpp >> src/hotspot/share/adlc/output_h.cpp >> Awkward code layout after change to. 
>> src/hotspot/share/c1/c1_Runtime1.cpp >> src/hotspot/share/classfile/classListParser.cpp >> src/hotspot/share/memory/arena.hpp >> src/hotspot/share/opto/chaitin.cpp >> test/hotspot/jtreg/gc/TestCardTablePageCommits.jav > > > Surrounding comments still refers to Sparc and/or Solaris. > > > > There are even more places if you search in the rest of the HotSpot > > source. Are we leaving those for a separate cleanup pass? > > In addition to "sparc", "solaris", also "solstudio"/"Sun Studio"/"SS compiler bug"/"niagara" yield some search (=grep) results. > > Some of these locations look like additional RFEs. Ah good! I found and fixed a few additional places based on those strings, but would like to handling the remaining comment updates as RFEs. Thank you for having a look! Cheers, Mikael From mikael.vidstedt at oracle.com Thu May 7 05:23:13 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:23:13 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <553D0344-188E-455B-A03E-D080C1484B41@oracle.com> References: <553D0344-188E-455B-A03E-D080C1484B41@oracle.com> Message-ID: <40FCCD0E-5C84-482D-9231-428898E1E0AD@oracle.com> Kim, thank you for the review! Comments inline.. > On May 4, 2020, at 3:47 AM, Kim Barrett wrote: > >> On May 4, 2020, at 1:12 AM, Mikael Vidstedt wrote: >> >> >> Please review this change which implements part of JEP 381: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 > > I've only looked at the src/hotspot changes so far. I've not > duplicated comments already made by Stefan. > > Looks good, other than a few very minor issues, some of which might > already be covered by planned followup RFEs. 
>
> ------------------------------------------------------------------------------
>
> I think with sparc removal, c1's pack64/unpack64 stuff is no longer
> used. So I think that can be removed from c1_LIR.[ch]pp too.

Good catch. Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/opto/generateOptoStub.cpp
> 225 // Clear last_Java_pc and (optionally)_flags
>
> The sparc-specific clearing of "flags" is gone.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/deoptimization.cpp
> 1086 *((jlong *) check_alignment_get_addr(obj, index, 8)) = (jlong) *((jlong *) &val);
>
> [pre-existing]
> The rhs cast to jlong is unnecessary, since it's dereferencing a jlong*.

When I first updated the code I actually removed the cast, but it just ended up looking asymmetrical so I chose to leave it there. Let me know if you feel strongly that it should go. (I don't like these casts in general).

> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/flags/jvmFlagConstraintsCompiler.cpp
> 236 JVMFlag::Error CompilerThreadPriorityConstraintFunc(intx value, bool verbose) {
> 237 return JVMFlag::SUCCESS;
> 238 }
>
> After SOLARIS code removal we no longer need this constraint function.

Fixed. (I had that on my follow-up list, but included it in the upcoming webrev.)

> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/globals.hpp
> 2392 experimental(size_t, ArrayAllocatorMallocLimit, \
> 2393 (size_t)-1, \
>
> Combine these lines.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/utilities/dtrace.hpp
>
> Should just eliminate all traces of HS_DTRACE_WORKAROUND_TAIL_CALL_BUG.

Fixed - more code removed!
Cheers, Mikael From mikael.vidstedt at oracle.com Thu May 7 05:25:22 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:25:22 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <25d5e2fa-8909-0b94-e0b2-6b5aaa224492@oracle.com> References: <25d5e2fa-8909-0b94-e0b2-6b5aaa224492@oracle.com> Message-ID: <4B24DD51-8AA4-4896-B1F2-59FA5233F125@oracle.com> Vladimir, thank you for the review! Note that based on Stefan?s comments I have removed the decryptSuffix variable in src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/HotSpotGraphBuilderPlugins.java in the upcoming webrev. Cheers, Mikael > On May 4, 2020, at 12:01 PM, Vladimir Kozlov wrote: > > JIT, AOT, JVMCI and Graal changes seem fine to me. > > It would be interesting to see shared code execution coverage change. There are places where we use flags and setting instead of #ifdef SPARC which may not be executed now or executed partially. We may simplify such code too. > > Thanks, > Vladimir > > On 5/3/20 10:12 PM, Mikael Vidstedt wrote: >> Please review this change which implements part of JEP 381: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >> Background: >> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. 
The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >> For convenience, I?m including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. >> In case it helps, the changes were effectively produced by searching for and updating any code mentioning ?solaris", ?sparc?, ?solstudio?, ?sunos?, etc. More information about the areas impacted can be found in the JEP itself. >> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! >> Also, I have a short list of follow-ups which I?m going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >> Testing: >> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >> Cheers, >> Mikael >> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ From mikael.vidstedt at oracle.com Thu May 7 05:27:04 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:27:04 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <7F17A7EF-C9B3-40A0-816B-53614A56B7CA@oracle.com> Igor, thank you for the review, and again for helping make the test changes in the first place! 
:) I hope Vladimir?s reply clarifies how we?re planning on handling the Graal related changes. Cheers, Mikael > On May 4, 2020, at 2:29 PM, Igor Ignatyev wrote: > > Hi Mikael, > > the changes in /test/ look good to me. > > I have a question regarding src/jdk.internal.vm.compiler/*, aren't these files part of graal-compiler and hence will be brought back by the next graal update? > > Thanks, > -- Igor > >> On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: >> >> >> Please review this change which implements part of JEP 381: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >> >> >> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >> >> >> Background: >> >> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >> >> For convenience, I?m including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. 
>> >> In case it helps, the changes were effectively produced by searching for and updating any code mentioning ?solaris", ?sparc?, ?solstudio?, ?sunos?, etc. More information about the areas impacted can be found in the JEP itself. >> >> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! >> >> Also, I have a short list of follow-ups which I?m going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >> >> Testing: >> >> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >> >> Cheers, >> Mikael >> >> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ >> > From mikael.vidstedt at oracle.com Thu May 7 05:35:30 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 6 May 2020 22:35:30 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: References: Message-ID: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> New webrev here: webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot/open/webrev/ incremental: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot.incr/open/webrev/ Remaining items: * File follow-up to remove STACK_BIAS * File follow-ups to change/update/remove flags and/or flag documentation: UseLWPSynchronization, BranchOnRegister, LIRFillDelaySlots, ArrayAllocatorMallocLimit, ThreadPriorityPolicy * File follow-up(s) to update comments ("solaris", ?sparc?, ?solstudio?, ?sunos?, ?sun studio?, ?s compiler bug?, ?niagara?, ?) Please let me know if there?s something I have missed! 
Cheers, Mikael > On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: > > > Please review this change which implements part of JEP 381: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ > JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 > > > Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! > > > Background: > > Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas the touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. > > For convenience, I?m including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. > > In case it helps, the changes were effectively produced by searching for and updating any code mentioning ?solaris", ?sparc?, ?solstudio?, ?sunos?, etc. More information about the areas impacted can be found in the JEP itself. > > A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! 
> > Also, I have a short list of follow-ups which I?m going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. > > Testing: > > A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. > > Cheers, > Mikael > > [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ > From kim.barrett at oracle.com Thu May 7 06:35:22 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 7 May 2020 02:35:22 -0400 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> References: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> Message-ID: > On May 7, 2020, at 1:35 AM, Mikael Vidstedt wrote: > > > New webrev here: > > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot/open/webrev/ > incremental: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot.incr/open/webrev/ > > Remaining items: > > * File follow-up to remove STACK_BIAS > > * File follow-ups to change/update/remove flags and/or flag documentation: UseLWPSynchronization, BranchOnRegister, LIRFillDelaySlots, ArrayAllocatorMallocLimit, ThreadPriorityPolicy > > * File follow-up(s) to update comments ("solaris", ?sparc?, ?solstudio?, ?sunos?, ?sun studio?, ?s compiler bug?, ?niagara?, ?) > > > Please let me know if there?s something I have missed! Looks good. From xxinliu at amazon.com Thu May 7 06:53:25 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Thu, 7 May 2020 06:53:25 +0000 Subject: RFR(S): 8022574: remove HaltNode code after uncommon trap calls Message-ID: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> Hi, Could you please review this patch? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8022574 Webrev: https://cr.openjdk.java.net/~xliu/8022574/00/webrev/ This is a prerequisite for JDK-8230552. I agree with Martin's comment. I also want to remove the HaltNode after uncommon_trap. https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038104.html Because the HaltNode after the uncommon_trap call node also plays a special role in Matcher::Fixup_Save_On_Entry, it's not easy to get rid of it. I took another approach. I mark that HaltNode as not reachable. As a result, backends can choose not to expand it. My assumption is that the control flow should never return from uncommon_trap because it must go to the interpreter. HaltNode is rarely generated except in this case. If this kind of HaltNode becomes a dummy, we are safe to use "stop()" for the instruct "ShouldNotReachHere". stop() could provide more debugging information if C2 crashes due to HaltNode. I apply this optimization for all architectures except for SPARC. Currently, x86 generates 4 instructions, or 20 bytes, for that HaltNode. https://bugs.openjdk.java.net/browse/JDK-8022574?focusedCommentId=14336533&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14336533 Usually, C2 generates a lot of uncommon trap calls for a variety of reasons. This patch can generate more compact code. E.g., I observe reduced code cache usage even when java executes '-version'. ./linux-x86_64-server-release/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:+PrintCodeCache -version Before: CodeCache: size=49152Kb used=4242Kb max_used=4478Kb free=44909Kb After: CodeCache: size=49152Kb used=3988Kb max_used=4474Kb free=45163Kb Testing: I ran gtest and hotspot:tier1 on x86_64/linux and aarch64/linux. No regression was identified.
Thanks, --lx From christian.hagedorn at oracle.com Thu May 7 08:11:32 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Thu, 7 May 2020 10:11:32 +0200 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <408cac5b-0151-b0cb-a1a2-5bce673e42b0@redhat.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <6f0ed258-8fad-43df-60cb-3281397ddf9b@oracle.com> <55cd970b-5a4e-4d1a-e17c-77289a412022@redhat.com> <2c9c0c59-73cc-4c88-c568-deba5d73e4f5@oracle.com> <408cac5b-0151-b0cb-a1a2-5bce673e42b0@redhat.com> Message-ID: <680dafab-56ee-043b-4555-a092c67c4b57@oracle.com> Hi Andrew On 06.05.20 16:38, Andrew Haley wrote: > On 5/6/20 10:14 AM, Christian Hagedorn wrote: >> On 06.05.20 10:40, Andrew Haley wrote: >>> On 5/6/20 8:26 AM, Christian Hagedorn wrote >>>> The fix was to >>>> only do it if ShowMessageBoxOnError was set. So, I think it's indeed a >>>> good idea to also do the same for other architectures. >>> Really? Can't we fix stop() so that it generates only a tiny little bit >>> of code? I can do it if you like. >> This is probably what you've meant. I agree with you, the fewer >> instructions emitted by stop() the better. > > ShowMessageBoxOnError is completely the wrong thing to use. I run > in GDB all the time, and I want accurate error messages, and I never > use ShowMessageBoxOnError. If we get this right we can have near-zero > code expansion, and keep this on at all times, even in production. I like your suggestion of a trap solution as suggested in the other thread. This sounds much better than the original fix for HaltNodes on x86 which only had in mind to adapt stop() in such a way that it works with debug64() while avoiding the unnecessary regs[] and pc argument when ShowMessageBoxOnError is not set (which is probably not used much anyways). 
If a trap approach could get rid of all of it and emit minimal code, we should go for it. Best regards, Christian From aph at redhat.com Thu May 7 08:11:41 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 7 May 2020 09:11:41 +0100 Subject: [EXT] Re: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: On 5/6/20 9:45 PM, Doerr, Martin wrote: > I had also thought about using a trap based implementation. > Maybe it makes sense to add a feature to shared code for that. > E.g. we could emit an illegal instruction (which raises SIGILL) followed by some kind of index into a descriptor array. > PPC64 would also benefit from a more compact solution. Most of the stuff to handle this would be in the back end, I would have thought. I'll have a look. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rwestrel at redhat.com Thu May 7 08:20:46 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 07 May 2020 10:20:46 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> Message-ID: <878si45f6p.fsf@redhat.com> Thanks for looking at this. > Looking a little more at the interval arithmetic subroutines, > I think it would be reasonable to leave out most of the linkage to > TypeInt/TypeLong, and isolate the logic that does the min-ing > and max-ing, with separate routines for translating to and from > the Type* world. 
> > Maybe: > > struct MinMaxInterval { > julong shi, slo, uhi, ulo; > bool is_int; > void signedMaxWith(const MinMaxInterval& that); > void unsignedMaxWith(const MinMaxInterval& that); > void signedMinWith(const MinMaxInterval& that); > void unsignedMinWith(const MinMaxInterval& that); > MinMaxInterval(const TypeInt*); > MinMaxInterval(const TypeLong*); > const TypeInt* asTypeInt(); > const TypeLong* asTypeLong(); > }; > > It would be overkill in many cases to put such small routines > into their own class, but in this case the min/max interval logic > is very subtle and deserves a little display platform. I added this code because of the code that computes the number of iterations of the inner loop of a transformed long loop: inner_iters_max = GraphKit::max_diff_with_zero(adjusted_limit, outer_phi, _igvn); Node* inner_iters_actual = GraphKit::unsigned_min(inner_iters_max, inner_iters_limit, _igvn); Here inner_iters_actual is in the interval [0, max_jint-stride]. Because of that, there's no need for the inner int counted loop to be guarded by a limit check. But the method constructing the int counted loop emits a limit check unless the type of inner_iters_actual is accurate enough. Another way to deal with that would be to pass the expected result type as an argument to unsigned_min() rather than compute it. Or, we don't care about the useless loop limit check... Roland.
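[Editorial illustration] The range claim above can be checked on plain integers. The following is only a sketch: the functions in namespace sketch are hypothetical stand-ins that work on int64_t values, not the GraphKit routines (which build ideal graph nodes), and a positive int stride is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

namespace sketch {

// Hypothetical plain-integer stand-in for max_diff_with_zero: max(0, a - b).
// The difference is taken in unsigned arithmetic so that a large positive a
// minus a large negative b does not overflow a signed 64-bit value.
inline uint64_t max_diff_with_zero(int64_t a, int64_t b) {
  if (a <= b) return 0;
  return static_cast<uint64_t>(a) - static_cast<uint64_t>(b);
}

// Hypothetical stand-in for unsigned_min: minimum under unsigned comparison.
inline uint64_t unsigned_min(uint64_t a, uint64_t b) {
  return a < b ? a : b;
}

// Number of inner-loop iterations for one pass of the outer loop.
// The result always lies in [0, INT32_MAX - stride], so it fits in an int.
inline int32_t inner_iters_actual(int64_t adjusted_limit, int64_t outer_phi,
                                  int32_t stride) {
  assert(stride > 0);  // assumption of this sketch
  const uint64_t inner_iters_limit =
      static_cast<uint64_t>(std::numeric_limits<int32_t>::max() - stride);
  const uint64_t inner_iters_max = max_diff_with_zero(adjusted_limit, outer_phi);
  return static_cast<int32_t>(unsigned_min(inner_iters_max, inner_iters_limit));
}

}  // namespace sketch
```

Because the minimum is taken against INT32_MAX - stride, the int loop bound can never be large enough for phi2 + stride to overflow, which is why a separate limit check on the inner loop is redundant in this model.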
From tobias.hartmann at oracle.com Thu May 7 08:28:39 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 May 2020 10:28:39 +0200 Subject: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none In-Reply-To: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com> References: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com> Message-ID: <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> Hi Zhuoren, On 26.04.20 14:31, Wang Zhuo(Zhuoren) wrote: > I met continuous deoptimization w/ Reason_unstable_if and Action_none in an online application and significant performance drop was observed. > It was found in JDK8 but I think it also existed in tip. But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? Why don't we hit the too_many_traps limit? Also, there is a too_many_traps_or_recompiles method that should be used instead. Thanks, Tobias From martin.doerr at sap.com Thu May 7 09:05:14 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 7 May 2020 09:05:14 +0000 Subject: RFR: CSR 8244507 [C1, C2] Split inlining control flags Message-ID: Hi compiler folks, can I get reviews for my CSR, please? https://bugs.openjdk.java.net/browse/JDK-8244507 Thanks and best regards, Martin From tobias.hartmann at oracle.com Thu May 7 09:10:18 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 May 2020 11:10:18 +0200 Subject: RFR(S): 8243670: Unexpected test result caused by C2 MergeMemNode::Ideal In-Reply-To: References: Message-ID: <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com> Hi Felix, were you able to figure out how we ended up with two Phis with same input but different _adr_type? Thanks, Tobias On 28.04.20 08:02, Yangfei (Felix) wrote: > Hi, > > Please help review this patch fixing a C2 issue. 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8243670 > Webrev: http://cr.openjdk.java.net/~fyang/8243670/webrev.00/ > > As described on the issue, C2 generates incorrect code for the following OSR compile: > 420 4 % b 4 TestReplaceEquivPhis::test @ 25 (107 bytes) > > v = iFld; // load from field "iFld" > iFld = TestReplaceEquivPhis.instanceCount; // store to field "iFld" > > From the C2 JIT code, load and store of field "iFld" are misplaced. > Looks like this is initially caused by the replace equivalent phis transformation in MergeMemNode::Ideal. > > Call trace: > #0 MergeMemNode::Ideal (this=0x7fff580c70c0, phase=0x7fff7d9407b0, can_reshape=true) at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/memnode.cpp:4621 > #1 0x00007ffff6020bcd in PhaseGVN::apply_ideal (this=0x7fff7d9407b0, k=0x7fff580c70c0, can_reshape=true) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/phaseX.cpp:806 > #2 0x00007ffff60223ef in PhaseIterGVN::transform_old (this=0x7fff7d9407b0, n=0x7fff580c70c0) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/phaseX.cpp:1229 > #3 0x00007ffff602217a in PhaseIterGVN::optimize (this=0x7fff7d9407b0) at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/phaseX.cpp:1175 > #4 0x00007ffff5e32618 in PhaseIdealLoop::build_and_optimize (this=0x7fff7d93fa90, mode=LoopOptsDefault) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/loopnode.cpp:3192 > #5 0x00007ffff5800831 in PhaseIdealLoop::PhaseIdealLoop (this=0x7fff7d93fa90, igvn=..., mode=LoopOptsDefault) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/loopnode.hpp:951 > #6 0x00007ffff580092c in PhaseIdealLoop::optimize (igvn=..., mode=LoopOptsDefault) at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/loopnode.hpp:1026 > #7 0x00007ffff57f4553 in Compile::optimize_loops (this=0x7fff7d942d00, igvn=..., mode=LoopOptsDefault) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/compile.cpp:1970 > #8 0x00007ffff57f5308 in Compile::Optimize (this=0x7fff7d942d00) at 
/home/yangfei/openjdk-jdk/src/hotspot/share/opto/compile.cpp:2182 > #9 0x00007ffff57ee5a3 in Compile::Compile (this=0x7fff7d942d00, ci_env=0x7fff7d943810, target=0x7fff580eeb70, osr_bci=25, subsume_loads=true, > do_escape_analysis=true, eliminate_boxing=true, directive=0x7ffff031b430) at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/compile.cpp:736 > #10 0x00007ffff56ebc23 in C2Compiler::compile_method (this=0x7ffff035e940, env=0x7fff7d943810, target=0x7fff580eeb70, entry_bci=25, directive=0x7ffff031b430) > at /home/yangfei/openjdk-jdk/src/hotspot/share/opto/c2compiler.cpp:111 > #11 0x00007ffff5808cd0 in CompileBroker::invoke_compiler_on_method (task=0x7ffff03b6a10) > at /home/yangfei/openjdk-jdk/src/hotspot/share/compiler/compileBroker.cpp:2210 > #12 0x00007ffff58079eb in CompileBroker::compiler_thread_loop () at /home/yangfei/openjdk-jdk/src/hotspot/share/compiler/compileBroker.cpp:1894 > #13 0x00007ffff62557f1 in compiler_thread_entry (thread=0x7ffff035f800, __the_thread__=0x7ffff035f800) > at /home/yangfei/openjdk-jdk/src/hotspot/share/runtime/thread.cpp:3454 > #14 0x00007ffff6250a11 in JavaThread::thread_main_inner (this=0x7ffff035f800) at /home/yangfei/openjdk-jdk/src/hotspot/share/runtime/thread.cpp:1969 > #15 0x00007ffff62508bf in JavaThread::run (this=0x7ffff035f800) at /home/yangfei/openjdk-jdk/src/hotspot/share/runtime/thread.cpp:1952 > #16 0x00007ffff624cb2c in Thread::call_run (this=0x7ffff035f800) at /home/yangfei/openjdk-jdk/src/hotspot/share/runtime/thread.cpp:399 > #17 0x00007ffff5fbf288 in thread_native_entry (thread=0x7ffff035f800) at /home/yangfei/openjdk-jdk/src/hotspot/os/linux/os_linux.cpp:789 > #18 0x00007ffff71976db in start_thread (arg=0x7fff7d944700) at pthread_create.c:463 > #19 0x00007ffff78f188f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 > > graph before the transformation looks like: > > 1: Phi1( Phi1 and Phi2 have the same input edges) #memory Memory: @TestReplaceEquivPhis+12 *, name=iFld, idx=5; > 2: Phi2( Phi1 
and Phi2 have the same input edges) #memory Memory: @BotPTR *+bot, idx=Bot; > 3: LoadI( input: 1) => name=iFld, idx=5 > 4: MergeMem( input:1, 2) > 5: MemBarAcqure( input: 4) > 6: Proj( input: 5) > 7: StoreI( input: 6) => name=iFld, idx=5 > > Here Phi1 and Phi2 have same input edges. Input from Phi1 to MergeMem is simplified by MergeMemoryNode::Ideal. > > graph after the transformation looks like: > > 1: Phi1( Phi1 and Phi2 have the same input edges) #memory Memory: @TestReplaceEquivPhis+12 *, name=iFld, idx=5; > 2: Phi2( Phi1 and Phi2 have the same input edges) #memory Memory: @BotPTR *+bot, idx=Bot; > 3: LoadI( input: 1) => name=iFld, idx=5 > 4: MergeMem( input: 2) > 5: MemBarAcqure( input: 4) > 6: Proj( input: 5) > 7: StoreI( input: 6) => name=iFld, idx=5 > > As a result, PhaseCFG::insert_anti_dependences won't insert an anti-dependence edge between 3 and 4 to place the load correctly. > This transformation is there from day one. With -XX:-SplitIfBlocks option, it triggers more errors. Proposed webrev simply disables it. > Tier1-3 tested on x86-64-linux-gnu. Specjbb2015 shows no performance regression with this change. > Suggestions? > > Thanks, > Felix > From nils.eliasson at oracle.com Thu May 7 09:31:17 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 7 May 2020 11:31:17 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com> Message-ID: <7cd68157-36fd-d000-f420-2e1e9e0f0143@oracle.com> On 2020-05-06 22:20, Man Cao wrote: > Hi, > > [@Laurent] >> Thanks Man for your results. >> I will try your fix on jdk15 repo and run my Marlin tests & benchmark to > see if there are any gain in these cases. > You are welcome. > I have run DaCapo at JDK tip, with default JVM options. I didn't see any > noticeable difference in performance with and without my bugfix. 
> This is probably due to significantly reduced code cache flushes with a > large default ReservedCodeCacheSize (240MB) for +TieredCompilation. > I checked the logs for tradesoap for runs without my bugfix, to count the > number of completed flushes (NMethodSweeper::sweep_code_cache()): > ~550 for runs with -XX:-TieredCompilation -XX:ReservedCodeCacheSize=40m on > JDK11 > ~35 for runs with default options on JDK tip > (+TieredCompilation, ReservedCodeCacheSize=240m) > The flushes are reduced by more than 15X with the default options. > > [@Nils] >> Looking at sweeper.cpp I see something that looks wrong. The _last_sweep > counter is updated even if no sweep was done. In low code cache usage > scenarios that means we might never reach the threshold. >> Can you try it out and see if things improve? > The change makes sense to me. I can try it out together after resolving the > next issue. > >> The sweeper should wake up >> regularly, but now it is only awakened when hitting the SweepAggressive >> threshold. This is wrong. >> I suggest holding off the fix until all the problems have been ironed out. > Could you elaborate on the expected frequency for waking up the sweeper? > Should we increase the default value for StartAggressiveSweepingAt instead? The expected behaviour is that the sweeper should be invoked depending on how many sweeper stack scans have been done - that is tracked by the _time_counter field. In NMethodSweeper::possibly_sweep there is a heuristic that triggers a sweep more often if the code cache is getting full. The expected behaviour for StartAggressiveSweepingAt is that we possibly start a sweep with a stack scan on every codeblob/nmethod allocation when the code cache is getting full. That is an attempt to quickly free up additional space in the code cache. The first bug I found is that the _last_sweep counter should only be set when a sweep has been performed; otherwise the threshold might never be reached.
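[Editorial illustration] As a rough picture of this first bug, here is a toy model. It is not the HotSpot sweeper: the field names only echo _time_counter and _last_sweep from the discussion, and the fixed threshold and the update rule are simplifying assumptions.

```cpp
namespace sketch {

// Toy model only, under stated assumptions; not NMethodSweeper.
struct ToySweeper {
  long time_counter = 0;  // bumped on every stack scan
  long last_sweep = 0;    // value of time_counter at the last recorded sweep
  long threshold;         // scans that must elapse before a sweep is due
  bool buggy;             // true: record last_sweep even when no sweep ran
  int sweeps = 0;         // number of sweeps actually performed

  ToySweeper(long threshold_, bool buggy_)
      : threshold(threshold_), buggy(buggy_) {}

  void on_stack_scan() {
    ++time_counter;
    const bool should_sweep = (time_counter - last_sweep) >= threshold;
    if (should_sweep) {
      ++sweeps;
    }
    // The buggy variant resets the baseline on every scan, so the distance
    // to the last "sweep" never grows beyond 1.
    if (should_sweep || buggy) {
      last_sweep = time_counter;
    }
  }
};

// Runs the model for the given number of scans and reports the sweep count.
inline int count_sweeps(long threshold, bool buggy, int scans) {
  ToySweeper sweeper(threshold, buggy);
  for (int i = 0; i < scans; ++i) {
    sweeper.on_stack_scan();
  }
  return sweeper.sweeps;
}

}  // namespace sketch
```

In this model, any threshold above 1 is never reached by the buggy variant, while the variant that updates the baseline only after a real sweep fires once every threshold scans.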
The second bug is that the sweeper thread sleeps. The only way to wake it up is through code paths that can only be reached when it is awake, or when StartAggressiveSweepingAt has been reached. This bug has gone unnoticed because the bug you found made SweepAggressive trigger when the code cache is 10% full, which most often happens fairly quickly. To fix this the sweeper thread needs to be awakened whenever _time_counter has been updated. However, that can only be a temporary fix: the transition to using mostly handshakes for synchronization can make the safepoints very rare. And then the heuristic is broken again. Best regards, Nils Eliasson > > As you suggested before, currently the sweeper is awakened for every new > allocation in the code cache, after the usage is above 10%. > This is definitely so frequent that it hurts performance, especially for > the -XX:-TieredCompilation case. > We do find that turning off code cache flushing in JDK11 > (-XX:-UseCodeCacheFlushing) > could significantly improve performance, by ~20% for an important > production workload configured with -XX:-TieredCompilation! > Thus, we strongly support keeping the default flushing frequency low. > > -Man From tobias.hartmann at oracle.com Thu May 7 09:28:19 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 May 2020 11:28:19 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> Message-ID: <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> Hi Martin, looks good to me too but I'm a bit concerned about C1InlineStackLimit affecting performance. What benchmarks did you run? Did you verify that tests using these flags are still working as expected (i.e., intend to only adjust C2's behavior)? Thanks, Tobias On 04.05.20 18:04, Doerr, Martin wrote: > Hi Nils, > > thank you for looking at this and sorry for the late reply.
> > I've added MaxTrivialSize and also updated the issue accordingly. Makes sense. > Do you have more flags in mind? > > Moving the flags which are only used by C2 into c2_globals definitely makes sense. > > Done in webrev.01: > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > Please take a look and let me know when my proposal is ready for a CSR. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Nils Eliasson >> Sent: Tuesday, 28 April 2020 18:29 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> Hi, >> >> Thanks for addressing this! This has been an annoyance for a long time. >> >> Have you thought about including other flags - like MaxTrivialSize? >> MaxInlineSize is tested against it. >> >> Also - you should move the flags that are now c2-only to c2_globals.hpp. >> >> Best regards, >> Nils Eliasson >> >> On 2020-04-27 15:06, Doerr, Martin wrote: >>> Hi, >>> >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we had >> discussed impact on C1. >>> I still think it's bad to share them between both compilers. We may want to >> do further C2 tuning without negative impact on C1 in the future. >>> >>> C1 has issues with substantial inlining because of the lack of uncommon >> traps. When C1 inlines a lot, stack frames may get large and code cache space >> may get wasted for cold or even never executed code. The situation gets >> worse when many patching stubs get used for such code. >>> >>> I had opened the following issue: >>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>> >>> And my initial proposal is here: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>> >>> >>> Part of my proposal is to add an additional flag which I called >> C1InlineStackLimit to reduce stack utilization for C1 methods.
>>> I have a simple example which shows wasted stack space (java example >> TestStack at the end). >>> >>> It simply counts stack frames until a stack overflow occurs. With the current >> implementation, only 1283 frames fit on the stack because the never >> executed method bogus_test with local variables gets inlined. >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get 2310 >> frames until stack overflow. (I only used C1 for this example. Can be >> reproduced as shown below.) >>> >>> I didn't notice any performance regression even with the aggressive setting >> of C1InlineStackLimit=5 with TieredCompilation. >>> >>> I know that I'll need a CSR for this change, but I'd like to get feedback in >> general and feedback about the flag names before creating a CSR. >>> I'd also be glad about feedback regarding the performance impact. >>> >>> Best regards, >>> Martin >>> >>> >>> >>> Command line: >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) inline >>> caught java.lang.StackOverflowError >>> 1283 activations were on stack, sum = 0 >>> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too >> much stack >>> caught java.lang.StackOverflowError >>> 2310 activations were on stack, sum = 0 >>> >>> >>> TestStack.java: >>> public class TestStack 
{ >>> >>> static long cnt = 0, >>> sum = 0; >>> >>> public static void bogus_test() { >>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>> sum += c1 + c2 + c3 + c4; >>> } >>> >>> public static void triggerStackOverflow() { >>> cnt++; >>> triggerStackOverflow(); >>> bogus_test(); >>> } >>> >>> >>> public static void main(String args[]) { >>> try { >>> triggerStackOverflow(); >>> } catch (StackOverflowError e) { >>> System.out.println("caught " + e); >>> } >>> System.out.println(cnt + " activations were on stack, sum = " + sum); >>> } >>> } >>> > From tobias.hartmann at oracle.com Thu May 7 09:49:36 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 May 2020 11:49:36 +0200 Subject: RFR(S): 8022574: remove HaltNode code after uncommon trap calls In-Reply-To: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> References: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> Message-ID: <4e5149d2-b366-f5d0-0eff-f15d8270994f@oracle.com> Hi Xin, looks good to me. Best regards, Tobias On 07.05.20 08:53, Liu, Xin wrote: > Hi, > > Could you please review this patch? > JBS: https://bugs.openjdk.java.net/browse/JDK-8022574 > Webrev: https://cr.openjdk.java.net/~xliu/8022574/00/webrev/ > > This is the prerequisite of JDK-8230552. I agree with Martin's comment. I also want to remove the HaltNode after uncommon_trap. > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038104.html > > Because the HaltNode after uncommon_trap callnode also takes a special role in Matcher::Fixup_Save_On_Entry, > it's not easy to get rid of it. I took another approach. I mark that HaltNode not reachable. As a result, backends can choose not to expand it. > > My assumption is that the control flow should never return from uncommon_trap because it must go to the interpreter. HaltNode is rarely generated except this case. If this kind of HaltNode becomes dummy, we are safe to use "stop()" for the instruct "ShouldNotReachHere". 
Stop() could provide more debugging information if C2 crashes due to HaltNode. > > I apply this optimization for all architectures except for SPARC. > > Currently, x86 generates 4 instruction, or 20 bytes for that HaltNode. > https://bugs.openjdk.java.net/browse/JDK-8022574?focusedCommentId=14336533&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14336533 > > Usually, C2 generates a lot of uncommon trap calls for a variety of reasons. This patch can generate more compact code. > Eg. I observe the codeCache usage reduced even when java executes '-version'. > ./linux-x86_64-server-release/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:+PrintCodeCache -version > Before: > CodeCache: size=49152Kb used=4242Kb max_used=4478Kb free=44909Kb > After: > CodeCache: size=49152Kb used=3988Kb max_used=4474Kb free=45163Kb > > Testing: > I ran gtest and hotspot:tier1 on x86_64/linux and aarch64/linux. No regression was identified. > > Thanks, > --lx > > From christian.hagedorn at oracle.com Thu May 7 12:09:23 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Thu, 7 May 2020 14:09:23 +0200 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr Message-ID: Hi Please review the following debugging enhancement: https://bugs.openjdk.java.net/browse/JDK-8244207 http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ This enhancement simplifies the usage for printing the ideal graph for visualization with the Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options depending on where the graph should be printed to (file or over network/locally to an opened Ideal Graph Visualizer). 
When choosing file, the output is always printed to a file named custom_debug.xml. I think the flexibility to choose another file name is not really required since it's only used while debugging. These new igv_print() methods can also be called from gdb without setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) to work. The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a program trace with rr (and is probably also problematic with related replay tools). The call gets stuck somewhere. rr allows to alter some data at a breakpoint but as soon as execution continues on the replayed trace, the modifications are undone (except for file writes). This enhancement also addresses this such that the new igv_print() methods can be used with rr. However, when printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next execution stop. To avoid that I added additional rr-specific igv_append() and igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file without setting up a file header again. This allows all printed graphs to be kept in one file which makes it easier to navigate through them. Thank you! Best regards, Christian [1] https://rr-project.org/ From martin.doerr at sap.com Thu May 7 13:42:05 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 7 May 2020 13:42:05 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> Message-ID: Hi Tobias, thanks for looking at my change. It is only expected to influence startup, not peak performance. Nevertheless, I've run benchmarks to check peak performance as well: SPEC jvm 2008, SPEC jbb 2015 No regressions observable, as expected. 
For startup performance, I've run SPEC jbb 2005 with throughput measurements every 1.5 seconds like I had shown in my FOSDEM talk (https://fosdem.org/2020/schedule/event/jit2020/). I couldn't observe any regression, either. It would be very helpful if other people (e.g. from Oracle) could run additional benchmarks. I don't know what you use to check startup performance. > Did you verify that tests using these flags are still working as expected (i.e., > intend to only adjust C2's behavior)? Using the existing flags still works for C2. So there's no issue with C2 tests. I'm not aware of any test which requires one of these flags to modify C1 behavior. I've run a substantial number of tests and couldn't find any related issues: jtreg, jck (normal, with -Xcomp, with -Xcomp -XX:-TieredCompilation), SAP's proprietary tests. Best regards, Martin > -----Original Message----- > From: Tobias Hartmann > Sent: Thursday, 7 May 2020 11:28 > To: Doerr, Martin ; Nils Eliasson > ; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin, > > looks good to me too but I'm a bit concerned about C1InlineStackLimit > affecting performance. What > benchmarks did you run? > > Did you verify that tests using these flags are still working as expected (i.e., > intend to only > adjust C2's behavior)? > > Thanks, > Tobias > > On 04.05.20 18:04, Doerr, Martin wrote: > > Hi Nils, > > > > thank you for looking at this and sorry for the late reply. > > > > I've added MaxTrivialSize and also updated the issue accordingly. Makes > sense. > > Do you have more flags in mind? > > > > Moving the flags which are only used by C2 into c2_globals definitely makes > sense. > > > > Done in webrev.01: > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > > > Please take a look and let me know when my proposal is ready for a CSR.
> > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: hotspot-compiler-dev >> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >> Sent: Dienstag, 28. April 2020 18:29 > >> To: hotspot-compiler-dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> Hi, > >> > >> Thanks for addressing this! This has been an annoyance for a long time. > >> > >> Have you though about including other flags - like MaxTrivialSize? > >> MaxInlineSize is tested against it. > >> > >> Also - you should move the flags that are now c2-only to c2_globals.hpp. > >> > >> Best regards, > >> Nils Eliasson > >> > >> On 2020-04-27 15:06, Doerr, Martin wrote: > >>> Hi, > >>> > >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we > had > >> discussed impact on C1. > >>> I still think it's bad to share them between both compilers. We may want > to > >> do further C2 tuning without negative impact on C1 in the future. > >>> > >>> C1 has issues with substantial inlining because of the lack of uncommon > >> traps. When C1 inlines a lot, stack frames may get large and code cache > space > >> may get wasted for cold or even never executed code. The situation gets > >> worse when many patching stubs get used for such code. > >>> > >>> I had opened the following issue: > >>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>> > >>> And my initial proposal is here: > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>> > >>> > >>> Part of my proposal is to add an additional flag which I called > >> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>> I have a simple example which shows wasted stack space (java example > >> TestStack at the end). > >>> > >>> It simply counts stack frames until a stack overflow occurs. 
With the > current > >> implementation, only 1283 frames fit on the stack because the never > >> executed method bogus_test with local variables gets inlined. > >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get > 2310 > >> frames until stack overflow. (I only used C1 for this example. Can be > >> reproduced as shown below.) > >>> > >>> I didn't notice any performance regression even with the aggressive > setting > >> of C1InlineStackLimit=5 with TieredCompilation. > >>> > >>> I know that I'll need a CSR for this change, but I'd like to get feedback in > >> general and feedback about the flag names before creating a CSR. > >>> I'd also be glad about feedback regarding the performance impact. > >>> > >>> Best regards, > >>> Martin > >>> > >>> > >>> > >>> Command line: > >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >> TestStack > >>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>> @ 8 TestStack::triggerStackOverflow (15 bytes) > recursive > >> inlining too deep > >>> @ 11 TestStack::bogus_test (33 bytes) inline > >>> caught java.lang.StackOverflowError > >>> 1283 activations were on stack, sum = 0 > >>> > >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >> TestStack > >>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>> @ 8 TestStack::triggerStackOverflow (15 bytes) > recursive > >> inlining too deep > >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too > >> much stack > >>> caught java.lang.StackOverflowError > >>> 2310 activations were on stack, sum = 0 > >>> > >>> > >>> TestStack.java: > >>> public class TestStack { > >>> > >>> static long cnt = 0, > >>> sum = 0; > >>> > >>> public 
static void bogus_test() { > >>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > >>> sum += c1 + c2 + c3 + c4; > >>> } > >>> > >>> public static void triggerStackOverflow() { > >>> cnt++; > >>> triggerStackOverflow(); > >>> bogus_test(); > >>> } > >>> > >>> > >>> public static void main(String args[]) { > >>> try { > >>> triggerStackOverflow(); > >>> } catch (StackOverflowError e) { > >>> System.out.println("caught " + e); > >>> } > >>> System.out.println(cnt + " activations were on stack, sum = " + sum); > >>> } > >>> } > >>> > > From felix.yang at huawei.com Thu May 7 13:42:20 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Thu, 7 May 2020 13:42:20 +0000 Subject: RFR(S): 8243670: Unexpected test result caused by C2 MergeMemNode::Ideal In-Reply-To: <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com> References: <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com> Message-ID: Hi Tobias, > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Thursday, May 7, 2020 5:10 PM > To: Yangfei (Felix) ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2 > MergeMemNode::Ideal > > Hi Felix, > > were you able to figure out how we ended up with two Phis with same input > but different _adr_type? As far as I remember, there are two major transformations which lead to this: 1. During Iter GVN1, a new phi is created with a narrowed memory type through PhiNode::slice_memory. The new phi and the old phi have different _adr_type and different inputs. 2. Then C2 peels the first iteration of the given loop through PhaseIdealLoop::do_peeling. After that, the new phi and the old phi have the same inputs but different _adr_type. Hope this helps.
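The end state described in step 2 can be sketched with a small toy model. This is only an illustration under stated assumptions, not HotSpot code: the classes Phi and ValueNumbering, the string labels for inputs and memory slices, and the choice to make the address type part of node equality are all hypothetical. The sketch does not model slicing or peeling; it just constructs two phis that share region and inputs but cover different memory slices, and shows that a hash-based value-numbering table keeps them as separate nodes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class PhiSliceSketch {

    /** Toy stand-in for a memory phi (hypothetical, not the HotSpot PhiNode). */
    static final class Phi {
        final String region;        // controlling region, e.g. the loop head
        final List<String> inputs;  // labels of the incoming memory states
        final String adrType;       // label of the memory slice this phi covers

        Phi(String region, List<String> inputs, String adrType) {
            this.region = region;
            this.inputs = inputs;
            this.adrType = adrType;
        }

        // Assumption of this sketch: the slice label takes part in equality,
        // so phis on different slices never compare equal.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Phi)) {
                return false;
            }
            Phi other = (Phi) o;
            return region.equals(other.region)
                && inputs.equals(other.inputs)
                && adrType.equals(other.adrType);
        }

        @Override
        public int hashCode() {
            return Objects.hash(region, inputs, adrType);
        }
    }

    /** Toy hash-consing table: returns an already registered equal node, if any. */
    static final class ValueNumbering {
        private final Map<Phi, Phi> table = new HashMap<>();

        Phi intern(Phi p) {
            Phi existing = table.putIfAbsent(p, p);
            return existing != null ? existing : p;
        }

        int size() {
            return table.size();
        }
    }

    /** True if both phis hang off the same region and merge the same inputs. */
    static boolean sameInputs(Phi a, Phi b) {
        return a.region.equals(b.region) && a.inputs.equals(b.inputs);
    }

    public static void main(String[] args) {
        // State after step 2: same region and inputs, different memory slices.
        Phi wide = new Phi("Loop", Arrays.asList("mem_entry", "mem_backedge"), "all-memory");
        Phi narrow = new Phi("Loop", Arrays.asList("mem_entry", "mem_backedge"), "int-array-slice");

        ValueNumbering vn = new ValueNumbering();
        vn.intern(wide);
        vn.intern(narrow);

        System.out.println("same inputs: " + sameInputs(wide, narrow));
        System.out.println("distinct nodes after value numbering: " + vn.size());
    }
}
```

In this model the two phis survive side by side, so any later transformation that compares only their inputs would have to look at the slice label as well before treating them as interchangeable.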
Thanks, Felix From martin.doerr at sap.com Thu May 7 14:06:14 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 7 May 2020 14:06:14 +0000 Subject: RFR(S): 8022574: remove HaltNode code after uncommon trap calls In-Reply-To: <4e5149d2-b366-f5d0-0eff-f15d8270994f@oracle.com> References: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> <4e5149d2-b366-f5d0-0eff-f15d8270994f@oracle.com> Message-ID: Hi, I'm fine with it, too. Thanks and best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Tobias Hartmann > Sent: Donnerstag, 7. Mai 2020 11:50 > To: Liu, Xin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8022574: remove HaltNode code after uncommon trap > calls > > Hi Xin, > > looks good to me. > > Best regards, > Tobias > > On 07.05.20 08:53, Liu, Xin wrote: > > Hi, > > > > Could you please review this patch? > > JBS: https://bugs.openjdk.java.net/browse/JDK-8022574 > > Webrev: https://cr.openjdk.java.net/~xliu/8022574/00/webrev/ > > > > This is the prerequisite of JDK-8230552. I agree with Martin's comment. I > also want to remove the HaltNode after uncommon_trap. > > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020- > May/038104.html > > > > Because the HaltNode after uncommon_trap callnode also takes a special > role in Matcher::Fixup_Save_On_Entry, > > it's not easy to get rid of it. I took another approach. I mark that HaltNode > not reachable. As a result, backends can choose not to expand it. > > > > My assumption is that the control flow should never return from > uncommon_trap because it must go to the interpreter. HaltNode is rarely > generated except this case. If this kind of HaltNode becomes dummy, we are > safe to use "stop()" for the instruct "ShouldNotReachHere". Stop() could > provide more debugging information if C2 crashes due to HaltNode. > > > > I apply this optimization for all architectures except for SPARC. 
> > > > Currently, x86 generates 4 instruction, or 20 bytes for that HaltNode. > > https://bugs.openjdk.java.net/browse/JDK-8022574?focusedCommentId=14336533&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14336533 > > > > Usually, C2 generates a lot of uncommon trap calls for a variety of reasons. > This patch can generate more compact code. > > Eg. I observe the codeCache usage reduced even when java executes '-version'. > > ./linux-x86_64-server-release/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:+PrintCodeCache -version > > Before: > > CodeCache: size=49152Kb used=4242Kb max_used=4478Kb free=44909Kb > > After: > > CodeCache: size=49152Kb used=3988Kb max_used=4474Kb free=45163Kb > > > > Testing: > > I ran gtest and hotspot:tier1 on x86_64/linux and aarch64/linux. No > regression was identified. > > > > Thanks, > > --lx > > > > From igor.ignatyev at oracle.com Thu May 7 14:39:50 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 7 May 2020 07:39:50 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <7F17A7EF-C9B3-40A0-816B-53614A56B7CA@oracle.com> References: <7F17A7EF-C9B3-40A0-816B-53614A56B7CA@oracle.com> Message-ID: <95E6D883-D696-466A-80BD-4ADE420DBA43@oracle.com> Hi Mikael, yes, Vladimir's reply made it clear, let's hope all the needed changes got upstream before the next graal update so it goes smoothly. Cheers, -- Igor > On May 6, 2020, at 10:27 PM, Mikael Vidstedt wrote: > > > Igor, thank you for the review, and again for helping make the test changes in the first place! :) > > I hope Vladimir's reply clarifies how we're planning on handling the Graal related changes. > > Cheers, > Mikael > >> On May 4, 2020, at 2:29 PM, Igor Ignatyev wrote: >> >> Hi Mikael, >> >> the changes in /test/ look good to me.
>> >> I have a question regarding src/jdk.internal.vm.compiler/*, aren't these files part of graal-compiler and hence will be brought back by the next graal update? >> >> Thanks, >> -- Igor >> >>> On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: >>> >>> >>> Please review this change which implements part of JEP 381: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >>> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >>> >>> >>> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >>> >>> >>> Background: >>> >>> Because of the size of the total patch and wide range of areas touched, this patch is one of six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas they touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >>> >>> For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. >>> >>> In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself.
>>> >>> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests! >>> >>> Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >>> >>> Testing: >>> >>> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >>> >>> Cheers, >>> Mikael >>> >>> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ >>> >> > From volker.simonis at gmail.com Thu May 7 14:55:19 2020 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 7 May 2020 16:55:19 +0200 Subject: RFR(S): 8022574: remove HaltNode code after uncommon trap calls In-Reply-To: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> References: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> Message-ID: Hi Xin, nice cleanup! And the space reduction for the CodeCache is not bad as well :) As this only removes the HaltNode after uncommon traps in the product build, we still have to take the code size of the HaltNode into account when you do 8230552 (and maybe improve the way how __stop works). But that can be done in the patch for 8230552. This looks good and is useful in itself. Thank you and best regards, Volker On Thu, May 7, 2020 at 9:02 AM Liu, Xin wrote: > > Hi, > > Could you please review this patch? > JBS: https://bugs.openjdk.java.net/browse/JDK-8022574 > Webrev: https://cr.openjdk.java.net/~xliu/8022574/00/webrev/ > > This is the prerequisite of JDK-8230552. I agree with Martin's comment. I also want to remove the HaltNode after uncommon_trap. > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038104.html > > Because the HaltNode after uncommon_trap callnode also takes a special role in Matcher::Fixup_Save_On_Entry, > it's not easy to get rid of it.
I took another approach. I mark that HaltNode not reachable. As a result, backends can choose not to expand it. > > My assumption is that the control flow should never return from uncommon_trap because it must go to the interpreter. HaltNode is rarely generated except this case. If this kind of HaltNode becomes dummy, we are safe to use "stop()" for the instruct "ShouldNotReachHere". Stop() could provide more debugging information if C2 crashes due to HaltNode. > > I apply this optimization for all architectures except for SPARC. > > Currently, x86 generates 4 instruction, or 20 bytes for that HaltNode. > https://bugs.openjdk.java.net/browse/JDK-8022574?focusedCommentId=14336533&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14336533 > > Usually, C2 generates a lot of uncommon trap calls for a variety of reasons. This patch can generate more compact code. > Eg. I observe the codeCache usage reduced even when java executes '-version'. > ./linux-x86_64-server-release/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:+PrintCodeCache -version > Before: > CodeCache: size=49152Kb used=4242Kb max_used=4478Kb free=44909Kb > After: > CodeCache: size=49152Kb used=3988Kb max_used=4474Kb free=45163Kb > > Testing: > I ran gtest and hotspot:tier1 on x86_64/linux and aarch64/linux. No regression was identified. > > Thanks, > --lx > > From vladimir.kozlov at oracle.com Thu May 7 17:11:05 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 May 2020 10:11:05 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> Message-ID: <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> I would suggest to build VM without C2 and run tests. 
I grepped the tests for these flags and found the following tests, where we need to either fix the test's command line (add -XX:+IgnoreUnrecognizedVMOptions), add @requires vm.compiler2.enabled, or duplicate the test for C1 with the corresponding C1 flags (by using an additional @test block): runtime/ReservedStack/ReservedStackTest.java compiler/intrinsics/string/TestStringIntrinsics2.java compiler/c2/Test6792161.java compiler/c2/Test5091921.java There is also an issue with the compiler/compilercontrol tests, which use InlineSmallCode; I am not sure how to handle it: http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/compiler/compilercontrol/share/scenario/Command.java#l36 Thanks, Vladimir On 5/4/20 9:04 AM, Doerr, Martin wrote: > Hi Nils, > > thank you for looking at this and sorry for the late reply. > > I've added MaxTrivialSize and also updated the issue accordingly. Makes sense. > Do you have more flags in mind? > > Moving the flags which are only used by C2 into c2_globals definitely makes sense. > > Done in webrev.01: > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > Please take a look and let me know when my proposal is ready for a CSR. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Nils Eliasson >> Sent: Dienstag, 28. April 2020 18:29 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> Hi, >> >> Thanks for addressing this! This has been an annoyance for a long time. >> >> Have you though about including other flags - like MaxTrivialSize? >> MaxInlineSize is tested against it. >> >> Also - you should move the flags that are now c2-only to c2_globals.hpp. >> >> Best regards, >> Nils Eliasson >> >> On 2020-04-27 15:06, Doerr, Martin wrote: >>> Hi, >>> >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we had >> discussed impact on C1.
>>> I still think it's bad to share them between both compilers. We may want to >> do further C2 tuning without negative impact on C1 in the future. >>> >>> C1 has issues with substantial inlining because of the lack of uncommon >> traps. When C1 inlines a lot, stack frames may get large and code cache space >> may get wasted for cold or even never executed code. The situation gets >> worse when many patching stubs get used for such code. >>> >>> I had opened the following issue: >>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>> >>> And my initial proposal is here: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>> >>> >>> Part of my proposal is to add an additional flag which I called >> C1InlineStackLimit to reduce stack utilization for C1 methods. >>> I have a simple example which shows wasted stack space (java example >> TestStack at the end). >>> >>> It simply counts stack frames until a stack overflow occurs. With the current >> implementation, only 1283 frames fit on the stack because the never >> executed method bogus_test with local variables gets inlined. >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get 2310 >> frames until stack overflow. (I only used C1 for this example. Can be >> reproduced as shown below.) >>> >>> I didn't notice any performance regression even with the aggressive setting >> of C1InlineStackLimit=5 with TieredCompilation. >>> >>> I know that I'll need a CSR for this change, but I'd like to get feedback in >> general and feedback about the flag names before creating a CSR. >>> I'd also be glad about feedback regarding the performance impact. 
>>> >>> Best regards, >>> Martin >>> >>> >>> >>> Command line: >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) inline >>> caught java.lang.StackOverflowError >>> 1283 activations were on stack, sum = 0 >>> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >> TestStack >>> CompileCommand: compileonly TestStack.triggerStackOverflow >>> @ 8 TestStack::triggerStackOverflow (15 bytes) recursive >> inlining too deep >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too >> much stack >>> caught java.lang.StackOverflowError >>> 2310 activations were on stack, sum = 0 >>> >>> >>> TestStack.java: >>> public class TestStack { >>> >>> static long cnt = 0, >>> sum = 0; >>> >>> public static void bogus_test() { >>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>> sum += c1 + c2 + c3 + c4; >>> } >>> >>> public static void triggerStackOverflow() { >>> cnt++; >>> triggerStackOverflow(); >>> bogus_test(); >>> } >>> >>> >>> public static void main(String args[]) { >>> try { >>> triggerStackOverflow(); >>> } catch (StackOverflowError e) { >>> System.out.println("caught " + e); >>> } >>> System.out.println(cnt + " activations were on stack, sum = " + sum); >>> } >>> } >>> > From vladimir.kozlov at oracle.com Thu May 7 17:18:52 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 May 2020 10:18:52 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> Message-ID: 
<1176826a-fc72-a1b5-9b2f-92d0c3402956@oracle.com> I reviewed CSR. For Oracle's sponsor (Nils?): we need to create doc subtask to add these changes to release notes and update java man page. Thanks, Vladimir On 5/6/20 3:19 AM, Doerr, Martin wrote: > Hi Nils, > > I've created CSR > https://bugs.openjdk.java.net/browse/JDK-8244507 > and set it to "Proposed". > > Feel free to modify it if needed. I will need reviewers for it, too. > > Best regards, > Martin > > >> -----Original Message----- >> From: Nils Eliasson >> Sent: Dienstag, 5. Mai 2020 11:54 >> To: Doerr, Martin ; hotspot-compiler- >> dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> Hi Martin, >> >> I think it looks good. >> >> Please go ahead! >> >> Best regards, >> Nils >> >> >> On 2020-05-04 18:04, Doerr, Martin wrote: >>> Hi Nils, >>> >>> thank you for looking at this and sorry for the late reply. >>> >>> I've added MaxTrivialSize and also updated the issue accordingly. Makes >> sense. >>> Do you have more flags in mind? >>> >>> Moving the flags which are only used by C2 into c2_globals definitely makes >> sense. >>> >>> Done in webrev.01: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>> >>> Please take a look and let me know when my proposal is ready for a CSR. >>> >>> Best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: hotspot-compiler-dev >>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>> Sent: Dienstag, 28. April 2020 18:29 >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>> >>>> Hi, >>>> >>>> Thanks for addressing this! This has been an annoyance for a long time. >>>> >>>> Have you though about including other flags - like MaxTrivialSize? >>>> MaxInlineSize is tested against it. >>>> >>>> Also - you should move the flags that are now c2-only to c2_globals.hpp. 
>>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>> Hi, >>>>> >>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 we >> had >>>> discussed impact on C1. >>>>> I still think it's bad to share them between both compilers. We may want >> to >>>> do further C2 tuning without negative impact on C1 in the future. >>>>> C1 has issues with substantial inlining because of the lack of uncommon >>>> traps. When C1 inlines a lot, stack frames may get large and code cache >> space >>>> may get wasted for cold or even never executed code. The situation gets >>>> worse when many patching stubs get used for such code. >>>>> I had opened the following issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>> >>>>> And my initial proposal is here: >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>> >>>>> >>>>> Part of my proposal is to add an additional flag which I called >>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>> I have a simple example which shows wasted stack space (java example >>>> TestStack at the end). >>>>> It simply counts stack frames until a stack overflow occurs. With the >> current >>>> implementation, only 1283 frames fit on the stack because the never >>>> executed method bogus_test with local variables gets inlined. >>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get >> 2310 >>>> frames until stack overflow. (I only used C1 for this example. Can be >>>> reproduced as shown below.) >>>>> I didn't notice any performance regression even with the aggressive >> setting >>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>> I know that I'll need a CSR for this change, but I'd like to get feedback in >>>> general and feedback about the flag names before creating a CSR. >>>>> I'd also be glad about feedback regarding the performance impact. 
>>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> >>>>> Command line: >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>> TestStack >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >> recursive >>>> inlining too deep >>>>> @ 11 TestStack::bogus_test (33 bytes) inline >>>>> caught java.lang.StackOverflowError >>>>> 1283 activations were on stack, sum = 0 >>>>> >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>> TestStack >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >> recursive >>>> inlining too deep >>>>> @ 11 TestStack::bogus_test (33 bytes) callee uses too >>>> much stack >>>>> caught java.lang.StackOverflowError >>>>> 2310 activations were on stack, sum = 0 >>>>> >>>>> >>>>> TestStack.java: >>>>> public class TestStack { >>>>> >>>>> static long cnt = 0, >>>>> sum = 0; >>>>> >>>>> public static void bogus_test() { >>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>> sum += c1 + c2 + c3 + c4; >>>>> } >>>>> >>>>> public static void triggerStackOverflow() { >>>>> cnt++; >>>>> triggerStackOverflow(); >>>>> bogus_test(); >>>>> } >>>>> >>>>> >>>>> public static void main(String args[]) { >>>>> try { >>>>> triggerStackOverflow(); >>>>> } catch (StackOverflowError e) { >>>>> System.out.println("caught " + e); >>>>> } >>>>> System.out.println(cnt + " activations were on stack, sum = " + >> sum); >>>>> } >>>>> } >>>>> > From xxinliu at amazon.com Thu May 7 17:38:50 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Thu, 7 May 2020 17:38:50 +0000 Subject: RFR(S): 8022574: remove HaltNode code after uncommon 
trap calls In-Reply-To: References: <50CEE8CA-911D-4BAB-BC90-DAC90030A708@amazon.com> Message-ID: Thank you for reviewing it. Yes, I will take care of the implementation of stop(). Thanks, --lx On 5/7/20, 7:56 AM, "Volker Simonis" wrote: Hi Xin, nice cleanup! And the space reduction for the CodeCache is not bad as well :) As this only removes the HaltNode after uncommon traps in the product build, we still have to take the code size of the HaltNode into account when you do 8230552 (and maybe improve the way how __stop works). But that can be done in the patch for 8230552. This looks good and is useful in itself. Thank you and best regards, Volker On Thu, May 7, 2020 at 9:02 AM Liu, Xin wrote: > > Hi, > > Could you please review this patch? > JBS: https://bugs.openjdk.java.net/browse/JDK-8022574 > Webrev: https://cr.openjdk.java.net/~xliu/8022574/00/webrev/ > > This is the prerequisite of JDK-8230552. I agree with Martin's comment. I also want to remove the HaltNode after uncommon_trap. > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038104.html > > Because the HaltNode after uncommon_trap callnode also takes a special role in Matcher::Fixup_Save_On_Entry, > it's not easy to get rid of it. I took another approach. I mark that HaltNode not reachable. As a result, backends can choose not to expand it. > > My assumption is that the control flow should never return from uncommon_trap because it must go to the interpreter. HaltNode is rarely generated except this case. If this kind of HaltNode becomes dummy, we are safe to use "stop()" for the instruct "ShouldNotReachHere". Stop() could provide more debugging information if C2 crashes due to HaltNode. > > I apply this optimization for all architectures except for SPARC.
> > Currently, x86 generates 4 instruction, or 20 bytes for that HaltNode. > https://bugs.openjdk.java.net/browse/JDK-8022574?focusedCommentId=14336533&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14336533 > > Usually, C2 generates a lot of uncommon trap calls for a variety of reasons. This patch can generate more compact code. > Eg. I observe the codeCache usage reduced even when java executes '-version'. > ./linux-x86_64-server-release/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:+PrintCodeCache -version > Before: > CodeCache: size=49152Kb used=4242Kb max_used=4478Kb free=44909Kb > After: > CodeCache: size=49152Kb used=3988Kb max_used=4474Kb free=45163Kb > > Testing: > I ran gtest and hotspot:tier1 on x86_64/linux and aarch64/linux. No regression was identified. > > Thanks, > --lx > > From aph at redhat.com Thu May 7 17:48:29 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 7 May 2020 18:48:29 +0100 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: On 5/7/20 9:11 AM, Andrew Haley wrote: > On 5/6/20 9:45 PM, Doerr, Martin wrote: >> I had also thought about using a trap based implementation. >> Maybe it makes sense to add a feature to shared code for that. >> E.g. we could emit an illegal instruction (which raises SIGILL) followed by some kind of index into a descriptor array. >> PPC64 would also benefit from a more compact solution. > > Most of the stuff to handle this would be in the back end, I would > have thought. I'll have a look. This attached patch does the job for Linux/AArch64. It has two disadvantages: 1. It corrupts R8, rscratch1. This doesn't really matter for AArch64. 2. 
If we execute stop() in the context of a signal handler triggered by another trap, we'll have a double fault and the OS will kill our process. 1 is fixable by pushing rscratch1 before executing the trap. I'm not sure it's worth doing; I'd rather have a tiny implementation of stop() that we can use guilt-free in release code. I don't think 2 is an issue because we never call out to generated code from a signal handler. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 -------------- next part -------------- diff -r ed406ec0c5cd src/hotspot/cpu/aarch64/aarch64.ad --- a/src/hotspot/cpu/aarch64/aarch64.ad Thu May 07 14:44:09 2020 +0100 +++ b/src/hotspot/cpu/aarch64/aarch64.ad Thu May 07 13:47:05 2020 -0400 @@ -15333,9 +15333,7 @@ format %{ "ShouldNotReachHere" %} ins_encode %{ - // +1 so NativeInstruction::is_sigill_zombie_not_entrant() doesn't - // return true - __ dpcs1(0xdead + 1); + __ stop(_halt_reason); %} ins_pipe(pipe_class_default); diff -r ed406ec0c5cd src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp Thu May 07 14:44:09 2020 +0100 +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp Thu May 07 13:47:05 2020 -0400 @@ -2223,14 +2223,8 @@ } void MacroAssembler::stop(const char* msg) { - address ip = pc(); - pusha(); - mov(c_rarg0, (address)msg); - mov(c_rarg1, (address)ip); - mov(c_rarg2, sp); - mov(c_rarg3, CAST_FROM_FN_PTR(address, MacroAssembler::debug64)); - blr(c_rarg3); - hlt(0); + mov(rscratch1, (address)msg); + dpcs1(0xdeaf); } void MacroAssembler::warn(const char* msg) { @@ -2610,7 +2604,7 @@ BREAKPOINT; } } - fatal("DEBUG MESSAGE: %s", msg); + report_fatal(__FILE__, __LINE__, "DEBUG MESSAGE: %s", msg); } void MacroAssembler::push_call_clobbered_registers() { diff -r ed406ec0c5cd src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp --- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp Thu May 07 
14:44:09 2020 +0100 +++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp Thu May 07 13:47:05 2020 -0400 @@ -458,6 +458,10 @@ return uint_at(0) == 0xd4bbd5a1; // dcps1 #0xdead } +bool NativeInstruction::is_stop() { + return uint_at(0) == 0xd4bbd5e1; // dcps1 #0xdeaf +} + void NativeIllegalInstruction::insert(address code_pos) { *(juint*)code_pos = 0xd4bbd5a1; // dcps1 #0xdead } diff -r ed406ec0c5cd src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp --- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp Thu May 07 14:44:09 2020 +0100 +++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp Thu May 07 13:47:05 2020 -0400 @@ -76,6 +76,7 @@ bool is_movz(); bool is_movk(); bool is_sigill_zombie_not_entrant(); + bool is_stop(); protected: address addr_at(int offset) const { return address(this) + offset; } diff -r ed406ec0c5cd src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp --- a/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp Thu May 07 14:44:09 2020 +0100 +++ b/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp Thu May 07 13:47:05 2020 -0400 @@ -381,6 +381,10 @@ } stub = SharedRuntime::handle_unsafe_access(thread, next_pc); } + } else if (sig == SIGILL && nativeInstruction_at(pc)->is_stop()) { + int64_t *regs = (int64_t *)(uc->uc_mcontext.regs); + MacroAssembler::debug64((char*)regs[rscratch1->encoding()], + (int64_t)pc, regs); } else From vladimir.kozlov at oracle.com Thu May 7 18:38:36 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 May 2020 11:38:36 -0700 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: References: Message-ID: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> Looks good. 
Thanks, Vladimir On 5/7/20 5:09 AM, Christian Hagedorn wrote: > Hi > > Please review the following debugging enhancement: > https://bugs.openjdk.java.net/browse/JDK-8244207 > http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ > > This enhancement simplifies the usage for printing the ideal graph for visualization with the Ideal Graph Visualizer > when debugging with gdb and enables graph printing with rr [1]. > > Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call igv_print() or > igv_print(phase_name) with a custom phase name. There are multiple options depending on where the graph should be > printed to (file or over network/locally to an opened Ideal Graph Visualizer). When choosing file, the output is always > printed to a file named custom_debug.xml. I think the flexibility to choose another file name is not really required > since it's only used while debugging. These new igv_print() methods can also be called from gdb without setting any > flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) to work. > > The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a program trace with rr > (and is probably also problematic with related replay tools). The call gets stuck somewhere. rr allows to alter some > data at a breakpoint but as soon as execution continues on the replayed trace, the modifications are undone (except for > file writes). This enhancement also addresses this such that the new igv_print() methods can be used with rr. However, > when printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next execution stop. To > avoid that I added additional rr-specific igv_append() and igv_append(phase_name) methods that simply append a graph to > the existing custom_debug.xml file without setting up a file header again. 
This allows all printed graphs to be kept in > one file which makes it easier to navigate through them. > > Thank you! > > Best regards, > Christian > > > [1] https://rr-project.org/ From vladimir.kozlov at oracle.com Thu May 7 20:29:12 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 May 2020 13:29:12 -0700 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: <66EF8963-4CB3-414D-B620-B6E56C4454CF@amazon.com> References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> <747eaea2-36bf-dfbd-00e7-dcd9ff016dde@oracle.com> <66EF8963-4CB3-414D-B620-B6E56C4454CF@amazon.com> Message-ID: <660d9ee3-6eac-9fc1-e66c-f7ba546dfe49@oracle.com> On 5/3/20 3:48 PM, Liu, Xin wrote: > Hi, Vladimir, > > Thank you to review the patch! > > For the failure, it's actually a bug. _updateCRC32 should be enabled no matter how on 32bit x86. Here is the new revision with bugfix. > http://cr.openjdk.java.net/~xliu/8151779/03/webrev/ > I made an incremental diff between rev02 and rev03: http://cr.openjdk.java.net/~xliu/8151779/r2_to_r3.diff > > 1. The bug was because abstractCompiler miss out to check vm_intrinsic_control_words[]. > Previously, Class vmIntrinsics provided multiple interfaces for intrinsic availability. (https://hg.openjdk.java.net/jdk/jdk/file/4198213fc371/src/hotspot/share/classfile/vmSymbols.hpp#l1696) > > abstractCompiler.cpp and library_call.cpp is_disabled_by_flags () but stubGenerator_x86_64.cpp uses is_intrinsic_available(). > I promote "is_disabled_by_flags()" to the core interface, leave more comments on it, and keep is_intrinsic_available() for compatibility. okay > > 2. add +/- UseCRC32Intrinsics to IntrinsicAvailableTest.java > The purpose of that test is not to generate a CRC32 intrinsic. 
Its purpose is to check if compilers determine to intrinsify _updateCRC32 or not. > Mathematically, "UseCRC32Intrinsics" is a set = [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32]. > "-XX:-UseCRC32Intrinsics" disables all 3 of them. If users use -XX:ControlIntrinsic=+_updateCRC32 and -XX:-UseCRC32Intrinsics, _updateCRC32 should be enabled individually. No, I think we should preserve current behavior when UseCRC32Intrinsics is off then all corresponding intrinsics are also should be off. This is the purpose of such flags - to be able control several intrinsics with one flag. Otherwise you have to check each individual intrinsic if CPU does not support them. Even if code for some of these intrinsics can be generated on this CPU. We should be consistent, otherwise code can become very complex to support. > > Yes, it will crash if compilers do generate _updateCRC32 without UseCRC32Intrinsics. It's because hotspot doesn't generate corresponding stubs, which are controlled by UseCRC32Intrinsics. We should not allow crashes in all cases. JVM should exit gracefully with error message. > > That's by design. hotspot has made huge efforts to enable as many intrinsics as it can. If a user explicitly enables an unsupported intrinsics, he/she must do it for reasons. One possible scenario is that he is developing a new intrinsic. No, people make mistakes. We should notify them that they made mistake. These flags rules are for Java users who most likely don't know how correctly control JVM features. And JVM developers are smart enough to bypass all restrictions we put into VM. We don't need to relax VM checks for them. > > Actually, the reason c1/c2 crash because both of them choose hard-landing. Eg. LibraryCallKit::inline_updateCRC32() assumes it has the stub. > I don't know why, but templateIntercept seems to choose to tolerate it. Interpreter checks UseCRC32Intrinsics flag for code generation. C2 and C1 do the same in AbstractCompiler::is_intrinsic_available(). 
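[Editorial sketch, not part of the original message.] The precedence argued for in the reply above can be written down in a few lines: when the group flag (for example UseCRC32Intrinsics) is off, every intrinsic in the group is off, whatever -XX:ControlIntrinsic requested, and an explicit "+" request that cannot be honored is reported as a conflict rather than left to crash the VM. The names TriState, intrinsic_available and is_conflicting_request are hypothetical and do not come from HotSpot or from the webrev; this only illustrates the rule being discussed.

```cpp
#include <cassert>

// Hypothetical tri-state for one -XX:ControlIntrinsic entry:
// Default = not mentioned, Enabled = "+_id", Disabled = "-_id".
enum class TriState { Default, Enabled, Disabled };

// group_flag_on: value of the coarse flag (e.g. UseCRC32Intrinsics)
// after CPU feature detection has had a chance to turn it off.
inline bool intrinsic_available(bool group_flag_on, TriState individual) {
  if (!group_flag_on) {
    return false;  // group flag off wins over any per-intrinsic request
  }
  return individual != TriState::Disabled;
}

// True when the user explicitly enabled an intrinsic whose group flag is
// off. A VM could use this to print an error and exit gracefully.
inline bool is_conflicting_request(bool group_flag_on, TriState individual) {
  return !group_flag_on && individual == TriState::Enabled;
}
```

With this rule, -XX:-UseCRC32Intrinsics together with -XX:ControlIntrinsic=+_updateCRC32 yields "not available" plus a reportable conflict.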
> > On legacy hosts without CLMUL, -XX:+UseCRC32Intrinsics will be drop. It's still safe to run because IntrinsicAvailableTest.java doesn't attempt to compile the intrinsic method. The test can check for negative results too. > > 3. I found an interesting optimization. > We can use vm_intrinsic_control_words[] as a cache. I assume that no one changes those UseXXXIntrinsics options at the runtime. > It can skip the mega-switch of vmIntrinsics::disabled_by_jvm_flags(), which might not be a big deal for optimizing compilers, but it can guarantee O(1) complexity for all toolchains. Yes, after VM_Version_init() call intrinsics flags should not change. Thanks, Vladimir > > Thanks, > --lx > > ?On 5/1/20, 8:36 PM, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > Hi Xin > > compiler/intrinsics/IntrinsicAvailableTest.java test failed on old x86 machine which does not have CLMUL and as result > can't use CRC32 intrinsic (UseCRC32Intrinsics flag is false). With -XX:ControlIntrinsic=+_updateCRC32 test failed with: > > java.lang.RuntimeException: Unexpected result: intrinsic for java.util.zip.CRC32.update() is enabled but intrinsic is > not available at compilation level 1 > at compiler.intrinsics.IntrinsicAvailableTest.checkIntrinsicForCompilationLevel(IntrinsicAvailableTest.java:128) > at compiler.intrinsics.IntrinsicAvailableTest.test(IntrinsicAvailableTest.java:138) > at compiler.intrinsics.IntrinsicAvailableTest.main(IntrinsicAvailableTest.java:150) > > Regards, > Vladimir > > On 5/1/20 6:00 PM, Vladimir Kozlov wrote: > > Hi > > > > I am CCing to runtime group too. I would like to see comments about these changes. No need to look on compiler's changes. > > > > The latest https://cr.openjdk.java.net/~xliu/8151779/02/webrev/ > > > > Good work. 
> > > > On 4/24/20 1:33 AM, Liu, Xin wrote: > >> Hi, > >> > >> May I get reviewed for this new revision? > >> JBS: https://bugs.openjdk.java.net/browse/JDK-8151779 > >> webrev: https://cr.openjdk.java.net/~xliu/8151779/01/webrev/ > >> > >> I introduce a new option -XX:ControlIntrinsic=+_id1,-id2... > >> The id is vmIntrinsics::ID. As prior discussion, ControlIntrinsic is expected to replace DisableIntrinsic. > >> I keep DisableIntrinsic in this revision. DisableIntrinsic prevails when an intrinsic appears on both lists. > > > > Yes, you have to keep DisableIntrinsic for now. We will deprecate it later. > > > >> > >> I use an array of tribool to mark each intrinsic is enabled or not. In this way, hotspot can avoid expensive string > >> querying among intrinsics. > >> A Tribool value has 3 states: Default, true, or false. > >> If developers don't explicitly set an intrinsic, it will be available unless is disabled by the corresponding > >> UseXXXIntrinsics. > >> Traditional Boolean value can't express fine/coarse-grained control. Ie. We only go through those auxiliary options > >> UseXXXIntrinsics if developers don't control a specific intrinsic. > >> > >> I also add the support of ControlIntrinsic to CompilerDirectives. > >> > >> Test: > >> I reuse jtreg tests of DisableIntrinsic. Add add more @run annotations to verify ControlIntrinsics. > >> I passed hotspot:Tier1 test and all tests on x86_64/linux. > > > > Good. I submitted hotspot tier1-3 testing. > > > > Thanks, > > Vladimir > > > >> > >> Thanks, > >> --lx > >> > >> On 4/17/20, 7:22 PM, "hotspot-compiler-dev on behalf of Liu, Xin" >> behalf of xxinliu at amazon.com> wrote: > >> > >> Hi, Vladimir, > >> > >> Thanks for the clarification. > >> Oh, yes, it's theoretically possible, but it's tedious. I am wrong at that point. > >> I think I got your point. ControlIntrinsics will make developer's life easier. I will implement it. 
> >> > >> Thanks, > >> --lx > >> > >> > >> On 4/17/20, 6:46 PM, "Vladimir Kozlov" wrote: > >> > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments > >> unless you can confirm the sender and know the content is safe. > >> > >> > >> > >> I withdraw my suggestion about EnableIntrinsic from JDK-8151779 because ControlIntrinsics will provide such > >> functionality and will replace existing DisableIntrinsic. > >> > >> Note, we can start deprecating Use*Intrinsic flags (and DisableIntrinsic) later in other changes. You don't > >> need to do > >> everything at once. What we need now a mechanism to replace them. > >> > >> On 4/16/20 11:58 PM, Liu, Xin wrote: > >> > Hi, Corey and Vladimir, > >> > > >> > I recently go through vmSymbols.hpp/cpp. I think I understand your comments. > >> > Each UseXXXIntrinsics does control a bunch of intrinsics (plural). Thanks for the hint. > >> > > >> > Even though I feel I know intrinsics mechanism of hotspot better, I still need a clarification of JDK- > >> 8151779. > >> > > >> > There're 321 intrinsics (https://chriswhocodes.com/hotspot_intrinsics_jdk15.html). > >> > If there's no any option, they are all available for compilers. That makes sense because intrinsics are > >> always beneficial. > >> > But there're reasons we need to disable a subset of them. A specific architecture may miss efficient > >> instructions or fixed functions. Or simply because an intrinsic is buggy. > >> > > >> > Currently, JDK provides developers 2 ways to control intrinsics. > 1. Some diagnostic options. Eg. > >> InlineMathNatives, UseBase64Intrinsics. > >> > Developers can use one option to disable a group of intrinsics. That is to say, it's a coarse-grained > >> approach. > >> > > >> > 2. DisableIntrinsic="a,b,c" > >> > By passing a string list of vmIntrinsics::IDs, it's capable of disabling any specified intrinsic. 
> >> > > >> > But even putting above 2 approaches together, we still can't precisely control any intrinsic. > >> > >> Yes, you are right. We seems are trying to put these 2 different ways into one flag which may be mistake. > >> > >> -XX:ControlIntrinsic=-_updateBytesCRC32C,-_updateDirectByteBufferCRC32C is a similar to > >> -XX:-UseCRC32CIntrinsics but it > >> requires more detailed knowledge about intrinsics ids. > >> > >> May be we can have 2nd flag, as you suggested -XX:UseIntrinsics=-AESCTR,+CRC32C, for such cases. > >> > >> > If we want to enable an intrinsic which is under control of InlineMathNatives but keep others disable, it's > >> impossible now. [please correct if I am wrong here]. > >> > >> You can disable all other from 321 intrinsics with DisableIntrinsic flag which is very tedious I agree. > >> > >> > I think that the motivation JDK-8151779 tried to solve. > >> > >> The idea is that instead of flags we use to control particular intrinsics depending on CPU we will use > >> vmIntrinsics::IDs > >> or other tables as you showed in your changes. It will require changes in vm_version_ codes. > >> > >> > > >> > If we provide a new option EnableIntrinsic and put it least priority, then we can precisely control any > >> intrinsic. > >> > Quote Vladimir Kozlov "DisableIntrinsic list prevails if an intrinsic is specified on both EnableIntrinsic > >> and DisableIntrinsic." > >> > > >> > "-XX:ControlIntrinsic=+_dabs,-_fabs,-_getClass" looks more elegant, but it will confuse developers with > >> DisableIntrinsic. > >> > If we decide to deprecate DisableIntrinsic, I think ControlIntrinsic may be a better option. Now I prefer > >> to provide EnableIntrinsic for simplicity and symmetry. > >> > >> I prefer to have one ControlIntrinsic flag and deprecate DisableIntrinsic. I don't think it is confusing. > >> > >> Thanks, > >> Vladimir > >> > >> > What do you think? 
> >> > > >> > Thanks, > >> > --lx > >> > > >> > > >> > On 4/13/20, 1:47 PM, "hotspot-compiler-dev on behalf of Corey Ashford" > >> wrote: > >> > > >> > CAUTION: This email originated from outside of the organization. Do not click links or open > >> attachments unless you can confirm the sender and know the content is safe. > >> > > >> > > >> > > >> > On 4/13/20 10:33 AM, Liu, Xin wrote: > >> > > Hi, compiler developers, > >> > > I attempt to refactor UseXXXIntrinsics for JDK-8151779. I think we still need to keep > >> UseXXXIntrinsics options because many applications may be using them. > >> > > > >> > > My change provide 2 new features: > >> > > 1) a shorthand to enable/disable intrinsics. > >> > > A comma-separated string. Each one is an intrinsic. An optional tailing symbol + or '-' denotes > >> enabling or disabling. > >> > > If the tailing symbol is missing, it means enable. > >> > > Eg. -XX:UseIntrinsics="AESCTR-,CRC32C+,CRC32-,MathExact" > >> > > This jvm option will expand to multiple options -XX:-UseAESCTRIntrinsics, -XX:+UseCRC32CIntrinsics, > >> -XX:-UseCRC32Intrinsics, -XX:UseMathExactIntrinsics > >> > > > >> > > 2) provide a set of macro to declare intrinsic options > >> > > Developers declare once in intrinsics.hpp and macros will take care all other places. > >> > > Here are example: > >> https://cr.openjdk.java.net/~xliu/8151779/00/webrev/src/hotspot/share/compiler/intrinsics.hpp.html > >> > > Ion Lam is overhauling jvm options. I am thinking how to be consistent with his proposal. > >> > > > >> > > >> > Great idea, though to be consistent with the original syntax, I think > >> > the +/- should be in front of the name: > >> > > >> > -XX:UseIntrinsics=-AESCTR,+CRC32C,... > >> > > >> > > >> > > I handle UseIntrinsics before VM_Version::initialize. It means that platform-specific initialization > >> still has a chance to correct those options. > >> > > If we do that after VM_Version::initialize, some intrinsics may cause JVM crash. Eg. 
> >> +UseBase64Intrinsics on x86_64 Linux. > >> > > Even though this behavior is same as -XX:+UseXXXIntrinsics, from user's perspective, it's not > >> straightforward when JVM overrides what users specify implicitly. It's dilemma here, stable jvm or fidelity of > >> cmdline. What do you think? > >> > > > >> > > Another problem is naming convention. Almost all intrinsics options use UseXXXIntrinsics. One > >> exception is UseVectorizedMismatchIntrinsic. > >> > > Personally, I think it should be "UseXXXIntrinsic" because one option is for one intrinsic, right? > >> Is it possible to change this name convention? > >> > > >> > Some (many?) intrinsic options turn on more than one .ad instruct > >> > instrinsic, or library instrinsics at the same time. I think that's why > >> > the plural is there. Also, consistently adding the plural allows you to > >> > add more capabilities to a flag that initially only had one intrinsic > >> > without changing the plurality (and thus backward compatibility). > >> > > >> > Regards, > >> > > >> > - Corey > >> > > >> > > >> > >> > From vladimir.kozlov at oracle.com Thu May 7 20:37:50 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 May 2020 13:37:50 -0700 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> Message-ID: On 5/5/20 2:37 AM, Liu, Xin wrote: > Hello, David and Nils > > Thank you to review the patch. I went to brush up my English grammar and then update my patch to rev04. > https://cr.openjdk.java.net/~xliu/8151779/04/webrev/ > Here is the incremental diff: https://cr.openjdk.java.net/~xliu/8151779/r3_to_r4.diff It reflect changes based on David's feedbacks. I really appreciate that you review so carefully and found so many invaluable suggestions. 
TBH, I don't understand Amazon's copyright header neither. I choose the simple way to dodge that problem. > > Nils points out a very tricky question. Yes, I also notice that each TriBool takes 4 bytes on x86_64. It's a natural machine word and supposed to be the most efficient form. As a result, the vector control_words take about 1.3Kb for all intrinsics. I thought it's not a big deal, but Nils brought up that each DirectiveSet will increase from 128b to 1440b. Theoretically, the user may provide a CompileCommandFile which consists of hundreds of directives. Will hotspot have hundreds of DirectiveSet in that case? May be use hashtable instead of whole array in DirectiveSet. > > Actually, I do have a compacted container of TriBool. It's like a vector specialization. > https://cr.openjdk.java.net/~xliu/8151779/TriBool.cpp > > The reason I didn't include it because I still feel that a few KiloBytes memories are not a big deal. Nowadays, hotspot allows Java programmers allocate over 100G heap. Is it wise to increase software complexity to save KBs? > > If you think it matters, I can integrate it. May I update TriBoolArray in a standalone JBS? I have made a lot of changes. I hope I can verify them using KitchenSink? Yes, you can file separate issue for improvements. > > For the second problem, I think it's because I used 'memset' to initialize an array of objects in rev01. Previously, I had code like this: > memset(&_intrinsic_control_words[0], 0, sizeof(_intrinsic_control_words)); > > This kind of usage will be warned as -Werror=class-memaccess in g++-8. I have fixed it since rev02. I use DirectiveSet::fill_in(). Please check out. Okay. 
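[Editorial sketch, not part of the original message.] To make the size trade-off above concrete, here is one way a compact tri-state array could look, storing each entry in 2 bits. It is not the TriBool.cpp linked above, whose contents are not shown in this thread; PackedTriBoolArray and its members are hypothetical names. Initialization goes through ordinary member functions rather than memset on an array of class objects, which is the pattern g++ 8 flags with -Wclass-memaccess.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Three states fit in 2 bits; Default is 0 so zero-filled storage means
// "not explicitly set".
enum class TriBool : std::uint8_t { Default = 0, False = 1, True = 2 };

template <std::size_t N>
class PackedTriBoolArray {
 public:
  TriBool get(std::size_t i) const {
    assert(i < N);
    const unsigned shift = (i % 4) * 2;  // four entries per byte
    return static_cast<TriBool>((bytes_[i / 4] >> shift) & 0x3u);
  }

  void set(std::size_t i, TriBool v) {
    assert(i < N);
    const unsigned shift = (i % 4) * 2;
    std::uint8_t& b = bytes_[i / 4];
    // Clear the 2-bit slot, then store the new value in it.
    b = static_cast<std::uint8_t>((b & ~(0x3u << shift)) |
                                  (static_cast<unsigned>(v) << shift));
  }

  void fill(TriBool v) {
    for (std::size_t i = 0; i < N; ++i) set(i, v);
  }

  static constexpr std::size_t size_in_bytes() { return (N + 3) / 4; }

 private:
  std::array<std::uint8_t, (N + 3) / 4> bytes_{};  // all entries Default
};
```

For 321 entries (the intrinsic count quoted earlier in the thread) this takes 81 bytes, against 1284 bytes with one 4-byte word per entry. The cost is a shift and a mask on every access.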
Thanks, Vladimir > > Thanks, > --lx > From Xiaohong.Gong at arm.com Fri May 8 03:38:34 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Fri, 8 May 2020 03:38:34 +0000 Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option Message-ID: Hi, Please review this patch, which obsoletes the product flag "-XX:UseBarriersForVolatile" and its related code: Webrev: http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8243339 https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR) As described in the CSR, using "-XX:+UseBarriersForVolatile" might cause memory consistency issues like the one mentioned in [1]. Fixing that issue and maintaining memory consistency in the future would take more effort. Since "ldar/stlr" has worked well for a long time, as has "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation of volatile accesses. Its only significant use, on one kind of CPU, is also slated for removal (see [2]), so things work well without this option. We therefore obsolete the option and remove the code directly, rather than deprecating it first. Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is no longer required (see [2]).
[1] https://bugs.openjdk.java.net/browse/JDK-8241137 [2] https://bugs.openjdk.java.net/browse/JDK-8242469 Testing: Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1 JCStress: tests-all Thanks, Xiaohong Gong From tobias.hartmann at oracle.com Fri May 8 06:24:00 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 8 May 2020 08:24:00 +0200 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> References: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> Message-ID: Hi Christian, looks good to me too but please use (jio_)snprintf instead of unsafe sprintf. Small typo: compile.cpp:4605 "set up" -> "sets up" Best regards, Tobias On 07.05.20 20:38, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 5/7/20 5:09 AM, Christian Hagedorn wrote: >> Hi >> >> Please review the following debugging enhancement: >> https://bugs.openjdk.java.net/browse/JDK-8244207 >> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ >> >> This enhancement simplifies the usage for printing the ideal graph for visualization with the >> Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. >> >> Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call >> igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options >> depending on where the graph should be printed to (file or over network/locally to an opened Ideal >> Graph Visualizer). When choosing file, the output is always printed to a file named >> custom_debug.xml. I think the flexibility to choose another file name is not really required since >> it's only used while debugging. These new igv_print() methods can also be called from gdb without >> setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) >> to work. 
>> >> The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a >> program trace with rr (and is probably also problematic with related replay tools). The call gets >> stuck somewhere. rr allows to alter some data at a breakpoint but as soon as execution continues >> on the replayed trace, the modifications are undone (except for file writes). This enhancement >> also addresses this such that the new igv_print() methods can be used with rr. However, when >> printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next >> execution stop. To avoid that I added additional rr-specific igv_append() and >> igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file >> without setting up a file header again. This allows all printed graphs to be kept in one file >> which makes it easier to navigate through them. >> >> Thank you! >> >> Best regards, >> Christian >> >> >> [1] https://rr-project.org/ From tobias.hartmann at oracle.com Fri May 8 06:40:07 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 8 May 2020 08:40:07 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> Message-ID: <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> Hi Martin, On 07.05.20 15:42, Doerr, Martin wrote: > Hi Tobias, > > thanks for looking at my change. > It is only expected to influence startup, not peak performance. > > Nevertheless, I've run benchmarks to check peak performance as well: SPEC jvm 2008, SPEC jbb 2015 > No regressions observable, as expected. Okay, great. > For startup performance, I've ran SPEC jbb 2005 with throughput measurements every 1.5 seconds like I had shown in my fosdem talk (https://fosdem.org/2020/schedule/event/jit2020/). > I couldn't observe any regression, either. 
> > It would be very helpful if other people (e.g. from Oracle) could run additional benchmarks. I don't know what you use to check startup performance. Okay, I'll run it through our startup benchmark suite. Will report back once it finished. >> Did you verify that tests using these flags are still working as expected (i.e., >> intend to only adjust C2's behavior)? > Using the existing flags still works for C2. So there's no issue with C2 tests. > I'm not aware of any test which requires one of these flags to modify C1 behavior. > > I've run a substantial amount of tests and couldn't find any related issues: > jtreg, jck (normal, with -Xcomp, with -Xcomp -XX:-TieredCompilation), SAP's proprietary tests I meant that the authors of these tests might have intended to tweak C1 behavior when adding the flag whereas with your change, only C2 behavior is affected. That doesn't necessarily mean that the test will fail now but it could mean that the regression the test was written for is not triggered anymore. You might just want to add the C1 flag as well (in another @run), Vladimir K. also mentioned this in his review. Best regards, Tobias From rickard.backman at oracle.com Fri May 8 08:25:00 2020 From: rickard.backman at oracle.com (Rickard Bäckman) Date: Fri, 8 May 2020 10:25:00 +0200 Subject: RFR(S): 8147018: Better reporting for compiler control tests. In-Reply-To: <861622ba-c03d-92d4-c562-6582ea82a034@oracle.com> References: <861622ba-c03d-92d4-c562-6582ea82a034@oracle.com> Message-ID: <20200508082500.GW23956@rbackman> Looks good. /R On 04/29, Evgeny Nikitin wrote: > Hi, > > Bug: https://bugs.openjdk.java.net/browse/JDK-8147018 > Webrev: http://cr.openjdk.java.net/~enikitin/8147018/webrev.00/ > > The patch enhances the compiler control tests reporting by adding compile > commands and expected states reporting.
> > Sample output (in the .jtr file) for the compile commands reporting: > > > (CompileCommand COMPILEONLY Type: JCMD Compiler: null MethodDescriptor: > _compiler/compilercontrol/share/pool/sub/Klass$Internal,*- IsValid: true > JCMDType: ADD) > > (CompileCommand COMPILEONLY Type: JCMD Compiler: null MethodDescriptor: > *Klass *met@%hod IsValid: true JCMDType: ADD > ) > > (CompileCommand COMPILEONLY Type: JCMD Compiler: null MethodDescriptor: > +*::* IsValid: true JCMDType: ADD) > > (CompileCommand NONEXISTENT Type: JCMD Compiler: null MethodDescriptor: > null IsValid: false JCMDType: REMOVE) > > (ability to print removals also added by the change) > > Sample output for expected compilation state reporting: > > > Checking expected compilation state: { > > method: public void > compiler.compilercontrol.share.pool.sub.Klass.method() > > compile [Optional.empty, Optional.empty] > > force_inline [Optional.empty, Optional.empty] > > dont_inline [Optional.empty, Optional.empty] > > log Optional.empty > > print_assembly Optional.empty > > print_inline Optional.empty > > } > > > Other input parameters are already printed by the child VM's start command > and the child VM's output. > > The change had been tested via mach5 test runs for the compiler control > tests and tier1 run. > > Please review, > /Evgeny Nikitin From christian.hagedorn at oracle.com Fri May 8 08:45:02 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Fri, 8 May 2020 10:45:02 +0200 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: References: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> Message-ID: Thank you Vladimir and Tobias for your reviews! On 08.05.20 08:24, Tobias Hartmann wrote: > looks good to me too but please use (jio_)snprintf instead of unsafe sprintf. 
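[Editorial sketch, not part of the original message.] The difference between sprintf and a bounded variant, as raised in the review comment above, is that the bounded call never writes past the buffer and lets the caller detect truncation. jio_snprintf is HotSpot-internal, so this standalone sketch uses std::snprintf as a stand-in; concat_phase_name and its parameters are hypothetical and are not the code in compile.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Concatenates prefix and name into out, writing at most out_size bytes
// including the terminating NUL. Returns true only if nothing was cut off.
inline bool concat_phase_name(char* out, std::size_t out_size,
                              const char* prefix, const char* name) {
  assert(out != nullptr && out_size > 0);
  // snprintf returns the length the full result would have had, so a
  // value >= out_size signals truncation; a negative value is an error.
  const int n = std::snprintf(out, out_size, "%s%s", prefix, name);
  return n >= 0 && static_cast<std::size_t>(n) < out_size;
}
```

With a 16-byte buffer, concat_phase_name(buf, sizeof(buf), "Before ", "Matching") fits exactly (15 characters plus the NUL). With an 8-byte buffer the same call stores "Before " and returns false, where sprintf would have overrun the buffer.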
Thinking of which, we could probably directly use the phase_name parameter and call print_method() without allocating a char array and using jio_snprintf. However, it might be a good idea to change sprintf into jio_snprintf in Compile::print_method (where I got this code from). There it makes sense to use an additional char array to concatenate two strings. I included those changes together with the typo fix in a new webrev: http://cr.openjdk.java.net/~chagedorn/8244207/webrev.01/ Best regards, Christian > On 07.05.20 20:38, Vladimir Kozlov wrote: >> Looks good. >> >> Thanks, >> Vladimir >> >> On 5/7/20 5:09 AM, Christian Hagedorn wrote: >>> Hi >>> >>> Please review the following debugging enhancement: >>> https://bugs.openjdk.java.net/browse/JDK-8244207 >>> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ >>> >>> This enhancement simplifies the usage for printing the ideal graph for visualization with the >>> Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. >>> >>> Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call >>> igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options >>> depending on where the graph should be printed to (file or over network/locally to an opened Ideal >>> Graph Visualizer). When choosing file, the output is always printed to a file named >>> custom_debug.xml. I think the flexibility to choose another file name is not really required since >>> it's only used while debugging. These new igv_print() methods can also be called from gdb without >>> setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) >>> to work. >>> >>> The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a >>> program trace with rr (and is probably also problematic with related replay tools). The call gets >>> stuck somewhere. 
rr allows to alter some data at a breakpoint but as soon as execution continues >>> on the replayed trace, the modifications are undone (except for file writes). This enhancement >>> also addresses this such that the new igv_print() methods can be used with rr. However, when >>> printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next >>> execution stop. To avoid that I added additional rr-specific igv_append() and >>> igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file >>> without setting up a file header again. This allows all printed graphs to be kept in one file >>> which makes it easier to navigate through them. >>> >>> Thank you! >>> >>> Best regards, >>> Christian >>> >>> >>> [1] https://rr-project.org/ From martin.doerr at sap.com Fri May 8 08:50:02 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 8 May 2020 08:50:02 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> Message-ID: Hi Tobias, thanks a lot for your help. > Okay, I'll run it through our startup benchmark suite. Will report back once it > finished. I'm using an aggressive setting of C1InlineStackLimit=5 with TieredCompilation. This saves more stack space, but should be evaluated carefully. So I highly appreciate your startup benchmarks. > I meant that the authors of these tests might have intended to tweak C1 > behavior when adding the > flag whereas with your change, only C2 behavior is affected. That doesn't > necessarily mean that the > test will fail now but it could mean that the regression the test was written > for is not triggered > anymore. You might just want to add the C1 flag as well (in another @run), > Vladimir K. also > mentioned this in his review. 
I got your point. I'll take a look and reply to Vladimir's email. Best regards, Martin > -----Original Message----- > From: Tobias Hartmann > Sent: Freitag, 8. Mai 2020 08:40 > To: Doerr, Martin ; Nils Eliasson > ; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin, > > On 07.05.20 15:42, Doerr, Martin wrote: > > Hi Tobias, > > > > thanks for looking at my change. > > It is only expected to influence startup, not peak performance. > > > > Nevertheless, I've run benchmarks to check peak performance as well: > SPEC jvm 2008, SPEC jbb 2015 > > No regressions observable, as expected. > > Okay, great. > > > For startup performance, I've ran SPEC jbb 2005 with throughput > measurements every 1.5 seconds like I had shown in my fosdem talk > (https://fosdem.org/2020/schedule/event/jit2020/). > > I couldn't observe any regression, either. > > > > It would be very helpful if other people (e.g. from Oracle) could run > additional benchmarks. I don't know what you use to check startup > performance. > > Okay, I'll run it through our startup benchmark suite. Will report back once it > finished. > > >> Did you verify that tests using these flags are still working as expected > (i.e., > >> intend to only adjust C2's behavior)? > > Using the existing flags still works for C2. So there's no issue with C2 tests. > > I'm not aware of any test which requires one of these flags to modify C1 > behavior. > > > > I've run a substantial amount of tests and couldn't find any related issues: > > jtreg, jck (normal, with -Xcomp, with -Xcomp -XX:-TieredCompilation), > SAP's proprietary tests > > I meant that the authors of these tests might have intended to tweak C1 > behavior when adding the > flag whereas with your change, only C2 behavior is affected. That doesn't > necessarily mean that the > test will fail now but it could mean that the regression the test was written > for is not triggered > anymore. 
You might just want to add the C1 flag as well (in another @run), > Vladimir K. also > mentioned this in his review. > > Best regards, > Tobias From thomas.schatzl at oracle.com Fri May 8 09:34:15 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 8 May 2020 11:34:15 +0200 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> References: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> Message-ID: <28de2cd7-6209-d780-0975-ab701e6f24f9@oracle.com> Hi, On 07.05.20 07:35, Mikael Vidstedt wrote: > > New webrev here: > > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot/open/webrev/ > incremental: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot.incr/open/webrev/ > > Remaining items: > > * File follow-up to remove STACK_BIAS > > * File follow-ups to change/update/remove flags and/or flag documentation: UseLWPSynchronization, BranchOnRegister, LIRFillDelaySlots, ArrayAllocatorMallocLimit, ThreadPriorityPolicy > > * File follow-up(s) to update comments ("solaris", "sparc", "solstudio", "sunos", "sun studio", "s compiler bug", "niagara", ...) > > > Please let me know if there's something I have missed! Looks good. Thomas From zhuoren.wz at alibaba-inc.com Fri May 8 09:26:58 2020 From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren)) Date: Fri, 08 May 2020 17:26:58 +0800 Subject: Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none In-Reply-To: <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> References: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com>, <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> Message-ID: Thanks for your comments. > But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set?
In GraphKit::uncommon_trap, if too_many_recompiles is true, Action_reinterpret will be changed to Action_none. > Why don't we hit the too_many_traps limit? Most likely the recompilations were caused by other reasons, not Reason_unstable_if, so we did not hit the too_many_traps limit. This raises an interesting question: which bytecode causes so many recompilations? checkcast is very suspicious because I saw a large number of deoptimizations with Reason_null_check in checkcast. Putting it together, the whole process may be as follows: 1. Some bytecode (maybe checkcast) caused a lot of deoptimizations and recompilations. 2. The number of deoptimizations and recompilations reached the threshold, and in the last compilation of this method, Reason_unstable_if + Action_reinterpret was changed to Reason_unstable_if + Action_none. 3. We went into the uncommon trap for Reason_unstable_if, but no more recompilation happened because of Action_none. Unfortunately I failed to write a standalone test to reproduce Reason_null_check in checkcast, so the test attached to the bug may be a little different from the process described above. > Also, there is a too_many_traps_or_recompiles method that should be used instead. Updated patch http://cr.openjdk.java.net/~wzhuo/8243615/webrev.02/ Regards, Zhuoren ------------------------------------------------------------------ From:Tobias Hartmann Sent At:2020 May 7 (Thu.) 16:31 To:Sandler ; hotspot-compiler-dev at openjdk.java.net Subject:Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none Hi Zhuoren, On 26.04.20 14:31, Wang Zhuo(Zhuoren) wrote: > I met continuous deoptimization w/ Reason_unstable_if and Action_none in an online application and significant performance drop was observed. > It was found in JDK8 but I think it also existed in tip. But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? Why don't we hit the too_many_traps limit?
Also, there is a too_many_traps_or_recompiles method that should be used instead. Thanks, Tobias From martin.doerr at sap.com Fri May 8 12:56:01 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 8 May 2020 12:56:01 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> Message-ID: Hi Vladimir, thanks a lot for looking at this, for finding the test issues and for reviewing the CSR. For me, C2 is a fundamental part of the JVM. I would usually never build without it. (Except if we want to use C1 + GraalVM compiler only.) But you're right, the --with-jvm-variants=client configuration should still be supported. We can fix it by making the flags obsolete if C2 is not included: diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 2020 +0200 +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 2020 +0200 @@ -562,6 +562,16 @@ { "dup option", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() }, #endif +#ifndef COMPILER2 + // These flags were generally available, but are C2 only, now.
+ { "MaxInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, + { "InlineSmallCode", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, + { "MaxInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, + { "FreqInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, + { "MaxTrivialSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() }, +#endif + { NULL, JDK_Version(0), JDK_Version(0) } }; This makes the VM accept the flags with a warning: jdk/bin/java -XX:MaxInlineLevel=9 -version OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; support was removed in 15.0 If we do it this way, the only test which I think should get fixed is ReservedStackTest. I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to preserve the inlining behavior. (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. compiler/c2 tests: Also written to test C2 specific things.) What do you think? Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > Sent: Donnerstag, 7. Mai 2020 19:11 > To: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > I would suggest building the VM without C2 and running the tests. > > I grepped for tests with these flags and found the following tests where we need to fix > the test's command (add > -XX:+IgnoreUnrecognizedVMOptions), add @requires > vm.compiler2.enabled, or duplicate the test for C1 with the corresponding C1 > flags (by using an additional @test block).
> > runtime/ReservedStack/ReservedStackTest.java > compiler/intrinsics/string/TestStringIntrinsics2.java > compiler/c2/Test6792161.java > compiler/c2/Test5091921.java > > And there is issue with compiler/compilercontrol tests which use > InlineSmallCode and I am not sure how to handle: > > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > ompiler/compilercontrol/share/scenario/Command.java#l36 > > Thanks, > Vladimir > > On 5/4/20 9:04 AM, Doerr, Martin wrote: > > Hi Nils, > > > > thank you for looking at this and sorry for the late reply. > > > > I've added MaxTrivialSize and also updated the issue accordingly. Makes > sense. > > Do you have more flags in mind? > > > > Moving the flags which are only used by C2 into c2_globals definitely makes > sense. > > > > Done in webrev.01: > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > > > Please take a look and let me know when my proposal is ready for a CSR. > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: hotspot-compiler-dev >> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >> Sent: Dienstag, 28. April 2020 18:29 > >> To: hotspot-compiler-dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> Hi, > >> > >> Thanks for addressing this! This has been an annoyance for a long time. > >> > >> Have you though about including other flags - like MaxTrivialSize? > >> MaxInlineSize is tested against it. > >> > >> Also - you should move the flags that are now c2-only to c2_globals.hpp. > >> > >> Best regards, > >> Nils Eliasson > >> > >> On 2020-04-27 15:06, Doerr, Martin wrote: > >>> Hi, > >>> > >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we > had > >> discussed impact on C1. > >>> I still think it's bad to share them between both compilers. We may want > to > >> do further C2 tuning without negative impact on C1 in the future. 
> >>> > >>> C1 has issues with substantial inlining because of the lack of uncommon > >> traps. When C1 inlines a lot, stack frames may get large and code cache > space > >> may get wasted for cold or even never executed code. The situation gets > >> worse when many patching stubs get used for such code. > >>> > >>> I had opened the following issue: > >>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>> > >>> And my initial proposal is here: > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>> > >>> > >>> Part of my proposal is to add an additional flag which I called > >> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>> I have a simple example which shows wasted stack space (java example > >> TestStack at the end). > >>> > >>> It simply counts stack frames until a stack overflow occurs. With the > current > >> implementation, only 1283 frames fit on the stack because the never > >> executed method bogus_test with local variables gets inlined. > >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get > 2310 > >> frames until stack overflow. (I only used C1 for this example. Can be > >> reproduced as shown below.) > >>> > >>> I didn't notice any performance regression even with the aggressive > setting > >> of C1InlineStackLimit=5 with TieredCompilation. > >>> > >>> I know that I'll need a CSR for this change, but I'd like to get feedback in > >> general and feedback about the flag names before creating a CSR. > >>> I'd also be glad about feedback regarding the performance impact. 
> >>> > >>> Best regards, > >>> Martin > >>> > >>> > >>> > >>> Command line: > >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >> TestStack > >>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>> @ 8 TestStack::triggerStackOverflow (15 bytes) > recursive > >> inlining too deep > >>> @ 11 TestStack::bogus_test (33 bytes) inline > >>> caught java.lang.StackOverflowError > >>> 1283 activations were on stack, sum = 0 > >>> > >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >> TestStack > >>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>> @ 8 TestStack::triggerStackOverflow (15 bytes) > recursive > >> inlining too deep > >>> @ 11 TestStack::bogus_test (33 bytes) callee uses too > >> much stack > >>> caught java.lang.StackOverflowError > >>> 2310 activations were on stack, sum = 0 > >>> > >>> > >>> TestStack.java: > >>> public class TestStack { > >>> > >>> static long cnt = 0, > >>> sum = 0; > >>> > >>> public static void bogus_test() { > >>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > >>> sum += c1 + c2 + c3 + c4; > >>> } > >>> > >>> public static void triggerStackOverflow() { > >>> cnt++; > >>> triggerStackOverflow(); > >>> bogus_test(); > >>> } > >>> > >>> > >>> public static void main(String args[]) { > >>> try { > >>> triggerStackOverflow(); > >>> } catch (StackOverflowError e) { > >>> System.out.println("caught " + e); > >>> } > >>> System.out.println(cnt + " activations were on stack, sum = " + > sum); > >>> } > >>> } > >>> > > From nils.eliasson at oracle.com Fri May 8 14:39:10 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 May 2020 16:39:10 +0200 Subject: RFR(S): 8244658: Remove dead code in 
code cache sweeper Message-ID: <01a3117b-679a-9888-8b3a-c293cf5dec19@oracle.com> Hi, Please review this removal of dead code in the sweeper. Some are leftovers from the transition to handshakes: - class VM_MarkActiveNMethods - class NMethodMarkingTask - NMethodSweeper::mark_active_nmethods And some are just dead: - NMethodSweeper::report_events Bug: https://bugs.openjdk.java.net/browse/JDK-8244658 Webrev: http://cr.openjdk.java.net/~neliasso/8244658/webrev.01/ Best regards, Nils Eliasson From martin.doerr at sap.com Fri May 8 15:21:35 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 8 May 2020 15:21:35 +0000 Subject: RFR(S): 8244658: Remove dead code in code cache sweeper In-Reply-To: <01a3117b-679a-9888-8b3a-c293cf5dec19@oracle.com> References: <01a3117b-679a-9888-8b3a-c293cf5dec19@oracle.com> Message-ID: Hi Nils, thanks for cleaning this up. Looks good to me. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Nils Eliasson > Sent: Freitag, 8. Mai 2020 16:39 > To: hotspot-compiler-dev at openjdk.java.net > Subject: RFR(S): 8244658: Remove dead code in code cache sweeper > > Hi, > > Please review this removal of dead code in the sweeper. > > Some are leftovers from the transition to handshakes: > - class VM_MarkActiveNMethods > - class NMethodMarkingTask > - NMethodSweeper::mark_active_nmethods > > And some are just dead: > - NMethodSweeper::report_events > > Bug: https://bugs.openjdk.java.net/browse/JDK-8244658 > Webrev: http://cr.openjdk.java.net/~neliasso/8244658/webrev.01/ > > Best regards, > Nils Eliasson From nils.eliasson at oracle.com Fri May 8 15:31:30 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 May 2020 17:31:30 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken Message-ID: Hi, In a recent bug (JDK-8244278) it was discovered that the heuristics for StartAggressiveSweepingAt was wrong. 
Aggressive sweeping means that the sweeper tries to start a sweep for every new allocation. With default settings before the fix, aggressive sweeping started already at 10% full. However, with the fix applied, there were no sweeps at all. This is also wrong: the sweeper should trigger regularly. The problem is that the sweeper thread is sleeping and there was no code path in which it was awakened (except through StartAggressiveSweepingAt). So the counters counted, thresholds were reached, but the sweeper slept on. Investigating this I encountered the next problem. The old heuristics had two different thresholds: 1) Number of safepoints with stack scans. With the transition from safepoints to handshakes we can no longer rely on safepoints happening. I have removed all of that code. 2) Number of bytes of nmethods that have been marked as zombie or not entrant (see NMethodSweeper::possibly_enable_sweeper). This is now the only regular way that sweeps are triggered. (The other two are aggressive sweeping and whitebox testing.) I changed the threshold test to be against a new flag, SweeperThreshold. This is because users with differently sized code caches might want different thresholds. (Otherwise there would be no way to control the sweeper's intensity.) The default is MIN2(1 * M, ReservedCodeCacheSize / 256). At the default tiered ReservedCodeCacheSize that is about 1M. At a small code cache of 40M that is about 150k. The threshold is capped at 1M because even if you have an enormous code cache, you don't want to fragment it, and you probably don't want to commit more than needed. This patch simplifies the sweeper quite a bit. Since the sweeper is only awakened when there is actual work to do, no complex heuristics are needed in the sweeper thread any more. This also has the benefit that the sweeper will always sleep when there is no work. To be able to notify the sweeper thread from NMethodSweeper::possibly_enable_sweeper I had to use a different monitor than CodeCache_lock.
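As a rough illustration of the SweeperThreshold default quoted above, the following sketch works through the arithmetic. It is not HotSpot code: the class and method names are hypothetical, and it assumes that M means 1024 * 1024 bytes and that the default tiered ReservedCodeCacheSize is 240M, as mentioned in the related JDK-8244278 discussion.

```java
public class SweeperThresholdSketch {
    // Assumption: M is one mebibyte, matching HotSpot's usual K/M/G size constants.
    static final long M = 1024L * 1024L;

    // Hypothetical helper mirroring the quoted default:
    //   MIN2(1 * M, ReservedCodeCacheSize / 256)
    static long defaultSweeperThreshold(long reservedCodeCacheSize) {
        return Math.min(1 * M, reservedCodeCacheSize / 256);
    }

    public static void main(String[] args) {
        // A small cache, the assumed tiered default, and a very large cache.
        long[] cacheSizes = { 40 * M, 240 * M, 2048 * M };
        for (long size : cacheSizes) {
            long threshold = defaultSweeperThreshold(size);
            System.out.println((size / M) + "M code cache -> threshold "
                    + (threshold / 1024) + "K");
        }
    }
}
```

Under these assumptions the sketch yields 160K for a 40M cache, 960K for a 240M cache, and the 1M cap for any cache of 256M or more, which is in line with the "about 150k" and "about 1M" figures given above.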
I added a new monitor that replaces the CodeCache_lock for signaling. This monitor (CodeSweeper_lock) is only used for signaling the sweeper thread, and the lock is always released before doing anything else. With this patch, JDK-8244278 can be applied on top to fix the aggressive sweeping, and the sweeper will continue to work. This patch applies on top of JDK-8244658, which removes dead code in the sweeper. Bug: https://bugs.openjdk.java.net/browse/JDK-8244660 Webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.01/ Please review, Nils Eliasson From nils.eliasson at oracle.com Fri May 8 15:32:26 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 May 2020 17:32:26 +0200 Subject: RFR(S): 8244658: Remove dead code in code cache sweeper In-Reply-To: References: <01a3117b-679a-9888-8b3a-c293cf5dec19@oracle.com> Message-ID: Thank you Martin! Best regards, Nils Eliasson On 2020-05-08 17:21, Doerr, Martin wrote: > Hi Nils, > > thanks for cleaning this up. Looks good to me. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Nils Eliasson >> Sent: Freitag, 8. Mai 2020 16:39 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: RFR(S): 8244658: Remove dead code in code cache sweeper >> >> Hi, >> >> Please review this removal of dead code in the sweeper.
>> >> Some are leftovers from the transition to handshakes: >> - class VM_MarkActiveNMethods >> - class NMethodMarkingTask >> - NMethodSweeper::mark_active_nmethods >> >> And some are just dead: >> - NMethodSweeper::report_events >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8244658 >> Webrev: http://cr.openjdk.java.net/~neliasso/8244658/webrev.01/ >> >> Best regards, >> Nils Eliasson From nils.eliasson at oracle.com Fri May 8 15:36:10 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 May 2020 17:36:10 +0200 Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps In-Reply-To: <7cd68157-36fd-d000-f420-2e1e9e0f0143@oracle.com> References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com> <7cd68157-36fd-d000-f420-2e1e9e0f0143@oracle.com> Message-ID: Hi again, I have now posted request for reviews for: 8244658: Remove dead code in code cache sweeper - http://cr.openjdk.java.net/~neliasso/8244658/webrev.01/ 8244660: Code cache sweeper heuristics is broken - http://cr.openjdk.java.net/~neliasso/8244660/webrev.01/ Man - Your patch applies on top of these. When all have passed reviews, I will push your patch together with my patches so that they end up in the same build. Please try this new heuristics out and give me feedback. Best regards, Nils Eliasson On 2020-05-07 11:31, Nils Eliasson wrote: > > > On 2020-05-06 22:20, Man Cao wrote: >> Hi, >> >> [@Laurent] >>> Thanks Man for your results. >>> I will try your fix on jdk15 repo and run my Marlin tests & >>> benchmark to >> see if there are any gain in these cases. >> You are welcome. >> I have run DaCapo at JDK tip, with default JVM options. I didn't see any >> noticeable difference in performance with and without my bugfix. >> This is probably due to significantly reduced code cache flushes with a >> large default ReservedCodeCacheSize (240MB) for +TieredCompilation. 
>> I checked the logs for tradesoap for runs without my bugfix, to count >> the >> number of completed flushes (NMethodSweeper::sweep_code_cache()): >> ~550 for runs with -XX:-TieredCompilation >> -XX:ReservedCodeCacheSize=40m on >> JDK11 >> ~35 for runs with default options on JDK tip >> (+TieredCompilation, ReservedCodeCacheSize=240m) >> The flushes are reduced by more than 15X with the default options. >> >> [@Nils] >>> Looking at sweeper.cpp I see something that looks wrong. The >>> _last_sweep >> counter is updated even if no sweep was done. In low code cache usage >> scenarios that means we might never reach the threshold. >>> Can you try it out and see if things improve? >> The change makes sense to me. I can try it out together after >> resolving the >> next issue. >> >>> The sweeper should wake up >>> regularly, but now it is only awakened when hitting the SweepAggressive >>> threshold. This is wrong. >>> I suggest holding off the fix until all the problems have been ironed >>> out. >> Could you elaborate what is the expected frequency to wake up the >> sweeper? >> Should we increase the default value for StartAggressiveSweepingAt >> instead? > The expected behaviour is that the sweeper should be invoked depending on > how many sweeper stack scans have been done; that is tracked by the > _time_counter field. In NMethodSweeper::possibly_sweep there is a > heuristic that triggers a sweep more often if the code cache is > getting full. > > The expected behaviour for StartAggressiveSweepingAt is that we > possibly start a sweep with a stack scan on every codeblob/nmethod > allocation when the code cache is getting full. That is an attempt to > quickly free up additional space in the code cache. > > The first bug I found is that the _last_sweep counter should only be > set when a sweep has been performed, otherwise the threshold might > never be reached. > > The second bug is that the sweeper thread sleeps.
The only way to wake > it up is through code paths that can only be reached when it is awake, > or when StartAggressiveSweepingAt has been reached. This bug has gone > unnoticed because the bug you found made SweepAggressive trigger when > the codecache is 10% full, which most often happen fairly quickly. > > To fix this the sweeper thread needs to be awakened whenever > _time_counter has been updated. > > However - that can only be a temporary fix - the transition to using > mostly handshakes for synchronization can make the safepoints very > rare. And then the heuristics is broken again. > > Best regards, > Nils Eliasson > >> >> As you suggested before, currently the sweeper is awakened for every new >> allocation in the code cache, after the usage is above 10%. >> This is definitely too frequent that it hurts performance, especially >> for >> the -XX:-TieredCompilation case. >> We do find that turning off code cache flushing in JDK11 >> (-XX:-UseCodeCacheFlushing) >> could significantly improve performance, by ~20% for an important >> production workload configured with -XX:-TieredCompilation! >> Thus, we strongly support keeping the default flushing frequency low. >> >> -Man > From nils.eliasson at oracle.com Fri May 8 15:40:37 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 May 2020 17:40:37 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <1176826a-fc72-a1b5-9b2f-92d0c3402956@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1176826a-fc72-a1b5-9b2f-92d0c3402956@oracle.com> Message-ID: <44c3307a-2f87-6d39-0c35-fa2a88a857f2@oracle.com> Yes, I'll do that. I just have to find the documentation for how to change the documentation :P Thanks, Nils Eliasson On 2020-05-07 19:18, Vladimir Kozlov wrote: > I reviewed CSR. > > For Oracle's sponsor (Nils?): we need to create doc subtask to add > these changes to release notes and update java man page. 
> > Thanks, > Vladimir > > On 5/6/20 3:19 AM, Doerr, Martin wrote: >> Hi Nils, >> >> I've created CSR >> https://bugs.openjdk.java.net/browse/JDK-8244507 >> and set it to "Proposed". >> >> Feel free to modify it if needed. I will need reviewers for it, too. >> >> Best regards, >> Martin >> >> >>> -----Original Message----- >>> From: Nils Eliasson >>> Sent: Dienstag, 5. Mai 2020 11:54 >>> To: Doerr, Martin ; hotspot-compiler- >>> dev at openjdk.java.net >>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>> >>> Hi Martin, >>> >>> I think it looks good. >>> >>> Please go ahead! >>> >>> Best regards, >>> Nils >>> >>> >>> On 2020-05-04 18:04, Doerr, Martin wrote: >>>> Hi Nils, >>>> >>>> thank you for looking at this and sorry for the late reply. >>>> >>>> I've added MaxTrivialSize and also updated the issue accordingly. >>>> Makes >>> sense. >>>> Do you have more flags in mind? >>>> >>>> Moving the flags which are only used by C2 into c2_globals >>>> definitely makes >>> sense. >>>> >>>> Done in webrev.01: >>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>> >>>> Please take a look and let me know when my proposal is ready for a >>>> CSR. >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>>> -----Original Message----- >>>>> From: hotspot-compiler-dev >>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>> >>>>> Hi, >>>>> >>>>> Thanks for addressing this! This has been an annoyance for a long >>>>> time. >>>>> >>>>> Have you though about including other flags - like MaxTrivialSize? >>>>> MaxInlineSize is tested against it. >>>>> >>>>> Also - you should move the flags that are now c2-only to >>>>> c2_globals.hpp. 
>>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>> Hi, >>>>>> >>>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 we >>> had >>>>> discussed impact on C1. >>>>>> I still think it's bad to share them between both compilers. We >>>>>> may want >>> to >>>>> do further C2 tuning without negative impact on C1 in the future. >>>>>> C1 has issues with substantial inlining because of the lack of >>>>>> uncommon >>>>> traps. When C1 inlines a lot, stack frames may get large and code >>>>> cache >>> space >>>>> may get wasted for cold or even never executed code. The situation >>>>> gets >>>>> worse when many patching stubs get used for such code. >>>>>> I had opened the following issue: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>> >>>>>> And my initial proposal is here: >>>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>> >>>>>> >>>>>> Part of my proposal is to add an additional flag which I called >>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>> I have a simple example which shows wasted stack space (java example >>>>> TestStack at the end). >>>>>> It simply counts stack frames until a stack overflow occurs. With >>>>>> the >>> current >>>>> implementation, only 1283 frames fit on the stack because the never >>>>> executed method bogus_test with local variables gets inlined. >>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get >>> 2310 >>>>> frames until stack overflow. (I only used C1 for this example. Can be >>>>> reproduced as shown below.) >>>>>> I didn't notice any performance regression even with the aggressive >>> setting >>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>>>> feedback in >>>>> general and feedback about the flag names before creating a CSR. 
>>>>>> I'd also be glad about feedback regarding the performance impact. >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> >>>>>> Command line: >>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 >>>>> -XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining >>>>> -XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>> TestStack >>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>> @ 8 >>>>>> TestStack::triggerStackOverflow (15 bytes) >>> recursive >>>>> inlining too deep >>>>>> @ 11 TestStack::bogus_test (33 >>>>>> bytes) inline >>>>>> caught java.lang.StackOverflowError >>>>>> 1283 activations were on stack, sum = 0 >>>>>> >>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 >>>>> -XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining >>>>> -XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>> TestStack >>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>> @ 8 >>>>>> TestStack::triggerStackOverflow (15 bytes) >>> recursive >>>>> inlining too deep >>>>>> @ 11 TestStack::bogus_test (33 >>>>>> bytes) callee uses too >>>>> much stack >>>>>> caught java.lang.StackOverflowError >>>>>> 2310 activations were on stack, sum = 0 >>>>>> >>>>>> >>>>>> TestStack.java: >>>>>> public class TestStack { >>>>>> >>>>>> static long cnt = 0, >>>>>> sum = 0; >>>>>> >>>>>> public static void bogus_test() { >>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>> sum += c1 + c2 + c3 + c4; >>>>>> } >>>>>> >>>>>> public static void triggerStackOverflow() { >>>>>> cnt++; >>>>>> triggerStackOverflow(); >>>>>> bogus_test(); >>>>>> } >>>>>> >>>>>> >>>>>>
public static void main(String args[]) { >>>>>> try { >>>>>> triggerStackOverflow(); >>>>>> } catch (StackOverflowError e) { >>>>>> System.out.println("caught " + e); >>>>>> } >>>>>> System.out.println(cnt + " activations were on stack, >>>>>> sum = " + >>> sum); >>>>>> } >>>>>> } >>>>>> >> From vladimir.kozlov at oracle.com Fri May 8 18:26:53 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 May 2020 11:26:53 -0700 Subject: RFR(S): 8244658: Remove dead code in code cache sweeper In-Reply-To: References: <01a3117b-679a-9888-8b3a-c293cf5dec19@oracle.com> Message-ID: <724c5218-ca3c-b15c-777f-e343e1ff29ed@oracle.com> +1 Thanks, Vladimir On 5/8/20 8:21 AM, Doerr, Martin wrote: > Hi Nils, > > thanks for cleaning this up. Looks good to me. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Nils Eliasson >> Sent: Freitag, 8. Mai 2020 16:39 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: RFR(S): 8244658: Remove dead code in code cache sweeper >> >> Hi, >> >> Please review this removal of dead code in the sweeper. >> >> Some are leftovers from the transition to handshakes: >> - class VM_MarkActiveNMethods >> - class NMethodMarkingTask >> - NMethodSweeper::mark_active_nmethods >> >> And some are just dead: >> - NMethodSweeper::report_events >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8244658 >> Webrev: http://cr.openjdk.java.net/~neliasso/8244658/webrev.01/ >> >> Best regards, >> Nils Eliasson From vladimir.kozlov at oracle.com Fri May 8 18:33:23 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 May 2020 11:33:23 -0700 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: References: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> Message-ID: Good.
Thanks, Vladimir On 5/8/20 1:45 AM, Christian Hagedorn wrote: > Thank you Vladimir and Tobias for your reviews! > > On 08.05.20 08:24, Tobias Hartmann wrote: >> looks good to me too but please use (jio_)snprintf instead of unsafe sprintf. > > Thinking of which, we could probably directly use the phase_name parameter and call print_method() without allocating a > char array and using jio_snprintf. However, it might be a good idea to change sprintf into jio_snprintf in > Compile::print_method (where I got this code from). There it makes sense to use an additional char array to concatenate > two strings. > > I included those changes together with the typo fix in a new webrev: > http://cr.openjdk.java.net/~chagedorn/8244207/webrev.01/ > > Best regards, > Christian > >> On 07.05.20 20:38, Vladimir Kozlov wrote: >>> Looks good. >>> >>> Thanks, >>> Vladimir >>> >>> On 5/7/20 5:09 AM, Christian Hagedorn wrote: >>>> Hi >>>> >>>> Please review the following debugging enhancement: >>>> https://bugs.openjdk.java.net/browse/JDK-8244207 >>>> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ >>>> >>>> This enhancement simplifies the usage for printing the ideal graph for visualization with the >>>> Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. >>>> >>>> Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call >>>> igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options >>>> depending on where the graph should be printed to (file or over network/locally to an opened Ideal >>>> Graph Visualizer). When choosing file, the output is always printed to a file named >>>> custom_debug.xml. I think the flexibility to choose another file name is not really required since >>>> it's only used while debugging. 
These new igv_print() methods can also be called from gdb without >>>> setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) >>>> to work. >>>> >>>> The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a >>>> program trace with rr (and is probably also problematic with related replay tools). The call gets >>>> stuck somewhere. rr allows to alter some data at a breakpoint but as soon as execution continues >>>> on the replayed trace, the modifications are undone (except for file writes). This enhancement >>>> also addresses this such that the new igv_print() methods can be used with rr. However, when >>>> printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next >>>> execution stop. To avoid that I added additional rr-specific igv_append() and >>>> igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file >>>> without setting up a file header again. This allows all printed graphs to be kept in one file >>>> which makes it easier to navigate through them. >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> Christian >>>> >>>> >>>> [1] https://rr-project.org/ From vladimir.kozlov at oracle.com Fri May 8 19:42:37 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 May 2020 12:42:37 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> Message-ID: Hi Martin On 5/8/20 5:56 AM, Doerr, Martin wrote: > Hi Vladimir, > > thanks a lot for looking at this, for finding the test issues and for reviewing the CSR. > > For me, C2 is a fundamental part of the JVM. I would usually never build without it ?? > (Except if we want to use C1 + GraalVM compiler only.) Yes it is one of cases. > But your right, --with-jvm-variants=client configuration should still be supported. 
Yes.

>
> We can fix it by making the flags as obsolete if C2 is not included:
> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp
> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 2020 +0200
> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 2020 +0200
> @@ -562,6 +562,16 @@
>    { "dup option", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() },
>  #endif
>
> +#ifndef COMPILER2
> +  // These flags were generally available, but are C2 only, now.
> +  { "MaxInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +  { "MaxRecursiveInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +  { "InlineSmallCode", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +  { "MaxInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +  { "FreqInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +  { "MaxTrivialSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
> +#endif
> +
>    { NULL, JDK_Version(0), JDK_Version(0) }
>  };

Right. I think you should do full process for these product flags deprecation with obsoleting in JDK 16 for VM builds which do not include C2. You need update your CSR - add information about this and above code change. Example:

https://bugs.openjdk.java.net/browse/JDK-8238840

>
> This makes the VM accept the flags with warning:
> jdk/bin/java -XX:MaxInlineLevel=9 -version
> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; support was removed in 15.0
>
> If we do it this way, the only test which I think should get fixed is ReservedStackTest.
> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to preserve the inlining behavior.
>
> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. compiler/c2 tests: Also written to test C2 specific things.)
> > What do you think? I would suggest to fix tests anyway (there are only few) because new warning output could be unexpected. And it will be future-proof when warning will be converted into error (if/when C2 goes away). Thanks, Vladimir > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >> Sent: Donnerstag, 7. Mai 2020 19:11 >> To: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> I would suggest to build VM without C2 and run tests. >> >> I grepped tests with these flags I found next tests where we need to fix >> test's command (add >> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 >> flags (by ussing additional @test block). >> >> runtime/ReservedStack/ReservedStackTest.java >> compiler/intrinsics/string/TestStringIntrinsics2.java >> compiler/c2/Test6792161.java >> compiler/c2/Test5091921.java >> >> And there is issue with compiler/compilercontrol tests which use >> InlineSmallCode and I am not sure how to handle: >> >> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >> ompiler/compilercontrol/share/scenario/Command.java#l36 >> >> Thanks, >> Vladimir >> >> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>> Hi Nils, >>> >>> thank you for looking at this and sorry for the late reply. >>> >>> I've added MaxTrivialSize and also updated the issue accordingly. Makes >> sense. >>> Do you have more flags in mind? >>> >>> Moving the flags which are only used by C2 into c2_globals definitely makes >> sense. >>> >>> Done in webrev.01: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>> >>> Please take a look and let me know when my proposal is ready for a CSR. 
>>> >>> Best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: hotspot-compiler-dev >>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>> Sent: Dienstag, 28. April 2020 18:29 >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>> >>>> Hi, >>>> >>>> Thanks for addressing this! This has been an annoyance for a long time. >>>> >>>> Have you though about including other flags - like MaxTrivialSize? >>>> MaxInlineSize is tested against it. >>>> >>>> Also - you should move the flags that are now c2-only to c2_globals.hpp. >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>> Hi, >>>>> >>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 we >> had >>>> discussed impact on C1. >>>>> I still think it's bad to share them between both compilers. We may want >> to >>>> do further C2 tuning without negative impact on C1 in the future. >>>>> >>>>> C1 has issues with substantial inlining because of the lack of uncommon >>>> traps. When C1 inlines a lot, stack frames may get large and code cache >> space >>>> may get wasted for cold or even never executed code. The situation gets >>>> worse when many patching stubs get used for such code. >>>>> >>>>> I had opened the following issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>> >>>>> And my initial proposal is here: >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>> >>>>> >>>>> Part of my proposal is to add an additional flag which I called >>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>> I have a simple example which shows wasted stack space (java example >>>> TestStack at the end). >>>>> >>>>> It simply counts stack frames until a stack overflow occurs. 
With the >> current >>>> implementation, only 1283 frames fit on the stack because the never >>>> executed method bogus_test with local variables gets inlined. >>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get >> 2310 >>>> frames until stack overflow. (I only used C1 for this example. Can be >>>> reproduced as shown below.) >>>>> >>>>> I didn't notice any performance regression even with the aggressive >> setting >>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>> >>>>> I know that I'll need a CSR for this change, but I'd like to get feedback in >>>> general and feedback about the flag names before creating a CSR. >>>>> I'd also be glad about feedback regarding the performance impact. >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> >>>>> Command line: >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>> TestStack >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >> recursive >>>> inlining too deep >>>>> @ 11 TestStack::bogus_test (33 bytes) inline >>>>> caught java.lang.StackOverflowError >>>>> 1283 activations were on stack, sum = 0 >>>>> >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>> TestStack >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >> recursive >>>> inlining too deep >>>>> @ 11 TestStack::bogus_test (33 bytes) callee uses too >>>> much stack >>>>> caught java.lang.StackOverflowError >>>>> 2310 activations were on stack, sum = 0 >>>>> >>>>> >>>>> TestStack.java: >>>>> public class TestStack { >>>>> >>>>> static long cnt = 0, >>>>> sum = 0; >>>>> >>>>> 
public static void bogus_test() {
>>>>>         long c1 = 1, c2 = 2, c3 = 3, c4 = 4;
>>>>>         sum += c1 + c2 + c3 + c4;
>>>>>     }
>>>>>
>>>>>     public static void triggerStackOverflow() {
>>>>>         cnt++;
>>>>>         triggerStackOverflow();
>>>>>         bogus_test();
>>>>>     }
>>>>>
>>>>>
>>>>>     public static void main(String args[]) {
>>>>>         try {
>>>>>             triggerStackOverflow();
>>>>>         } catch (StackOverflowError e) {
>>>>>             System.out.println("caught " + e);
>>>>>         }
>>>>>         System.out.println(cnt + " activations were on stack, sum = " +
>> sum);
>>>>>     }
>>>>> }
>>>>>
>>>

From daniel.daugherty at oracle.com Fri May 8 19:48:02 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 8 May 2020 15:48:02 -0400
Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot)
In-Reply-To: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com>
References: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com>
Message-ID: <048a8e25-fec4-5304-8763-6a4472bfbf54@oracle.com>

On 5/7/20 1:35 AM, Mikael Vidstedt wrote:
> New webrev here:
>
> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot/open/webrev/

This pretty much says it all:

> Summary of changes: 90904 lines changed: 8 ins; 90725 del; 171 mod; 103780 unchg

My review is focused on looking at the changes and not looking for missed changes. I figure there's enough work here just looking at the changes to keep me occupied for a while and enough people have posted comments about finding other things to be examined, etc...

Unlike my normal reviews, I won't be listing all the touched files; (there's _only_ 427 of them...)

Don't forget to make a copyright year update pass before you push.

src/hotspot/os/posix/os_posix.hpp
        L174
    old L175 #ifndef SOLARIS
        L176
        nit - on most of this style of deletion you also got rid of
        one of the blank lines, but not here.

src/hotspot/share/utilities/dtrace.hpp
    old L42: #elif defined(__APPLE__)
    old L44: #include
    old L45: #else
    new L32: #include
        was previous included only for __APPLE__ and it
        is now there for every platform. Any particular reason?

Thumbs up!

Dan

> incremental: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot.incr/open/webrev/
>
> Remaining items:
>
> * File follow-up to remove STACK_BIAS
>
> * File follow-ups to change/update/remove flags and/or flag documentation: UseLWPSynchronization, BranchOnRegister, LIRFillDelaySlots, ArrayAllocatorMallocLimit, ThreadPriorityPolicy
>
> * File follow-up(s) to update comments ("solaris", "sparc", "solstudio", "sunos", "sun studio", "s compiler bug", "niagara", ...)
>
>
> Please let me know if there's something I have missed!
>
> Cheers,
> Mikael
>
>> On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote:
>>
>>
>> Please review this change which implements part of JEP 381:
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224
>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/
>> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787
>>
>>
>> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)!
>>
>>
>> Background:
>>
>> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas they touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas.
>>
>> For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible.
>>
>> In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself.
>>
>> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests!
>>
>> Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed.
>>
>> Testing:
>>
>> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed.
>>
>> Cheers,
>> Mikael
>>
>> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/
>>

From evgeny.nikitin at oracle.com Fri May 8 19:51:39 2020
From: evgeny.nikitin at oracle.com (Evgeny Nikitin)
Date: Fri, 8 May 2020 21:51:39 +0200
Subject: RFR(XS): 8242150: Add jtreg "serviceability/sa/ClhsdbJstackXcompStress.java" to graal problem list.
Message-ID: <24d6b004-3166-db60-d9a0-b3f67e866969@oracle.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8242150
Webrev: http://cr.openjdk.java.net/~enikitin/8242150/webrev.00/

The test times out and it was suggested to add it to the problem list.
A new bug has been raised in order to investigate the possibilities to make the test active again:

https://bugs.openjdk.java.net/browse/JDK-8244656

Please review,
/Evgeny Nikitin.
From martin.doerr at sap.com Fri May 8 21:06:40 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 8 May 2020 21:06:40 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> Message-ID: Hi Vladimir, > You need update your CSR - add information about this and above code change. Example: > https://bugs.openjdk.java.net/browse/JDK-8238840 I've updated the CSR with obsolete and expired flags as in the example. > I would suggest to fix tests anyway (there are only few) because new > warning output could be unexpected. Ok. I'll prepare a webrev with fixed tests. Best regards, Martin > -----Original Message----- > From: Vladimir Kozlov > Sent: Freitag, 8. Mai 2020 21:43 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin > > On 5/8/20 5:56 AM, Doerr, Martin wrote: > > Hi Vladimir, > > > > thanks a lot for looking at this, for finding the test issues and for reviewing > the CSR. > > > > For me, C2 is a fundamental part of the JVM. I would usually never build > without it ?? > > (Except if we want to use C1 + GraalVM compiler only.) > > Yes it is one of cases. > > > But your right, --with-jvm-variants=client configuration should still be > supported. > > Yes. > > > > > We can fix it by making the flags as obsolete if C2 is not included: > > diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp > > --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 2020 > +0200 > > +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 > 2020 +0200 > > @@ -562,6 +562,16 @@ > > { "dup option", JDK_Version::jdk(9), JDK_Version::undefined(), > JDK_Version::undefined() }, > > #endif > > > > +#ifndef COMPILER2 > > + // These flags were generally available, but are C2 only, now. 
> > + { "MaxInlineLevel", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > + { "InlineSmallCode", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > + { "MaxInlineSize", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > + { "FreqInlineSize", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > + { "MaxTrivialSize", JDK_Version::undefined(), > JDK_Version::jdk(15), JDK_Version::undefined() }, > > +#endif > > + > > { NULL, JDK_Version(0), JDK_Version(0) } > > }; > > Right. I think you should do full process for these product flags deprecation > with obsoleting in JDK 16 for VM builds > which do not include C2. You need update your CSR - add information about > this and above code change. Example: > > https://bugs.openjdk.java.net/browse/JDK-8238840 > > > > > This makes the VM accept the flags with warning: > > jdk/bin/java -XX:MaxInlineLevel=9 -version > > OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; > support was removed in 15.0 > > > > If we do it this way, the only test which I think should get fixed is > ReservedStackTest. > > I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to > preserve the inlining behavior. > > > > (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. > compiler/c2 tests: Also written to test C2 specific things.) > > > > What do you think? > > I would suggest to fix tests anyway (there are only few) because new > warning output could be unexpected. > And it will be future-proof when warning will be converted into error > (if/when C2 goes away). 
> > Thanks, > Vladimir > > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: hotspot-compiler-dev >> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > >> Sent: Donnerstag, 7. Mai 2020 19:11 > >> To: hotspot-compiler-dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> I would suggest to build VM without C2 and run tests. > >> > >> I grepped tests with these flags I found next tests where we need to fix > >> test's command (add > >> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > >> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 > >> flags (by ussing additional @test block). > >> > >> runtime/ReservedStack/ReservedStackTest.java > >> compiler/intrinsics/string/TestStringIntrinsics2.java > >> compiler/c2/Test6792161.java > >> compiler/c2/Test5091921.java > >> > >> And there is issue with compiler/compilercontrol tests which use > >> InlineSmallCode and I am not sure how to handle: > >> > >> > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > >> ompiler/compilercontrol/share/scenario/Command.java#l36 > >> > >> Thanks, > >> Vladimir > >> > >> On 5/4/20 9:04 AM, Doerr, Martin wrote: > >>> Hi Nils, > >>> > >>> thank you for looking at this and sorry for the late reply. > >>> > >>> I've added MaxTrivialSize and also updated the issue accordingly. Makes > >> sense. > >>> Do you have more flags in mind? > >>> > >>> Moving the flags which are only used by C2 into c2_globals definitely > makes > >> sense. > >>> > >>> Done in webrev.01: > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > >>> > >>> Please take a look and let me know when my proposal is ready for a CSR. > >>> > >>> Best regards, > >>> Martin > >>> > >>> > >>>> -----Original Message----- > >>>> From: hotspot-compiler-dev >>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >>>> Sent: Dienstag, 28. 
April 2020 18:29 > >>>> To: hotspot-compiler-dev at openjdk.java.net > >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>> > >>>> Hi, > >>>> > >>>> Thanks for addressing this! This has been an annoyance for a long time. > >>>> > >>>> Have you though about including other flags - like MaxTrivialSize? > >>>> MaxInlineSize is tested against it. > >>>> > >>>> Also - you should move the flags that are now c2-only to > c2_globals.hpp. > >>>> > >>>> Best regards, > >>>> Nils Eliasson > >>>> > >>>> On 2020-04-27 15:06, Doerr, Martin wrote: > >>>>> Hi, > >>>>> > >>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 we > >> had > >>>> discussed impact on C1. > >>>>> I still think it's bad to share them between both compilers. We may > want > >> to > >>>> do further C2 tuning without negative impact on C1 in the future. > >>>>> > >>>>> C1 has issues with substantial inlining because of the lack of > uncommon > >>>> traps. When C1 inlines a lot, stack frames may get large and code cache > >> space > >>>> may get wasted for cold or even never executed code. The situation > gets > >>>> worse when many patching stubs get used for such code. > >>>>> > >>>>> I had opened the following issue: > >>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>>>> > >>>>> And my initial proposal is here: > >>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>>>> > >>>>> > >>>>> Part of my proposal is to add an additional flag which I called > >>>> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>>>> I have a simple example which shows wasted stack space (java > example > >>>> TestStack at the end). > >>>>> > >>>>> It simply counts stack frames until a stack overflow occurs. With the > >> current > >>>> implementation, only 1283 frames fit on the stack because the never > >>>> executed method bogus_test with local variables gets inlined. 
> >>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get > >> 2310 > >>>> frames until stack overflow. (I only used C1 for this example. Can be > >>>> reproduced as shown below.) > >>>>> > >>>>> I didn't notice any performance regression even with the aggressive > >> setting > >>>> of C1InlineStackLimit=5 with TieredCompilation. > >>>>> > >>>>> I know that I'll need a CSR for this change, but I'd like to get feedback > in > >>>> general and feedback about the flag names before creating a CSR. > >>>>> I'd also be glad about feedback regarding the performance impact. > >>>>> > >>>>> Best regards, > >>>>> Martin > >>>>> > >>>>> > >>>>> > >>>>> Command line: > >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>> TestStack > >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > >> recursive > >>>> inlining too deep > >>>>> @ 11 TestStack::bogus_test (33 bytes) inline > >>>>> caught java.lang.StackOverflowError > >>>>> 1283 activations were on stack, sum = 0 > >>>>> > >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining - > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>> TestStack > >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > >> recursive > >>>> inlining too deep > >>>>> @ 11 TestStack::bogus_test (33 bytes) callee uses > too > >>>> much stack > >>>>> caught java.lang.StackOverflowError > >>>>> 2310 activations were on stack, sum = 0 > >>>>> > >>>>> > >>>>> TestStack.java: > >>>>> public class TestStack { > >>>>> > >>>>> static long cnt = 0, > >>>>> sum = 0; > >>>>> > >>>>> public static void bogus_test() { > >>>>> long c1 = 
1, c2 = 2, c3 = 3, c4 = 4; > >>>>> sum += c1 + c2 + c3 + c4; > >>>>> } > >>>>> > >>>>> public static void triggerStackOverflow() { > >>>>> cnt++; > >>>>> triggerStackOverflow(); > >>>>> bogus_test(); > >>>>> } > >>>>> > >>>>> > >>>>> public static void main(String args[]) { > >>>>> try { > >>>>> triggerStackOverflow(); > >>>>> } catch (StackOverflowError e) { > >>>>> System.out.println("caught " + e); > >>>>> } > >>>>> System.out.println(cnt + " activations were on stack, sum = " + > >> sum); > >>>>> } > >>>>> } > >>>>> > >>> From vladimir.kozlov at oracle.com Fri May 8 22:56:32 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 May 2020 15:56:32 -0700 Subject: RFR(XS): 8242150: Add jtreg "serviceability/sa/ClhsdbJstackXcompStress.java" to graal problem list. In-Reply-To: <24d6b004-3166-db60-d9a0-b3f67e866969@oracle.com> References: <24d6b004-3166-db60-d9a0-b3f67e866969@oracle.com> Message-ID: <38c7e57c-687d-81c2-5118-15191877dd6d@oracle.com> Good. Thanks, Vladimir On 5/8/20 12:51 PM, Evgeny Nikitin wrote: > Hi, > > Bug: https://bugs.openjdk.java.net/browse/JDK-8242150 > Webrev: http://cr.openjdk.java.net/~enikitin/8242150/webrev.00/ > > The test times out and it was suggested to add it to the problem list. > A new bug have been raised in order to investigate the possibilities to make the test active again: > > https://bugs.openjdk.java.net/browse/JDK-8244656 > > Please review, > /Evgeny Nikitin. From manc at google.com Sat May 9 01:28:52 2020 From: manc at google.com (Man Cao) Date: Fri, 8 May 2020 18:28:52 -0700 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: Message-ID: Hi Nils, Thanks for fixing this so quickly, and simplifying the logic! Some high-level questions and suggestions: *1. Sweep frequency * With this new approach, is the sweeping expected to be less frequent than the current approach, or more frequent? It looks more frequent to me. 
If I understand correctly, for the current approach, the conditions to sweep are: - Bytes from make_not_entrant_or_zombie() and make_unloaded() reach 1/100 of ReservedCodeCacheSize OR - Heuristics based on ReservedCodeCacheSize/16M, _time_counter and _last_sweep. I suppose the second condition is the "Number of safepoints with stack scans" condition you mentioned, which is not currently triggered. With the new approach, the condition is: - Bytes from make_not_entrant_or_zombie() and make_unloaded() reach SweeperThreshold (1/256 of ReservedCodeCacheSize) Is it better to make SweeperThreshold default to 1/100 of ReservedCodeCacheSize like before? Also, could it be a percentage instead of a byte-size value? In our experience, a percentage value is also easier to maintain for production users. We also would like to reduce the default sweep frequency, especially for -XX:-TieredCompilation. Because in JDK11, we have seen the higher sweep frequency caused regression compared to JDK8, and turning off code cache flushing could significantly improve performance. *2. Sweep and make non-entrant* > The threshold is capped at 1M because even if you have an enormous code > cache - you don't want to fragment it, and you probably don't want to > commit more than needed. It is possible that sweeping will deoptimize some cold nmethods that will be used soon. Such deoptimizations could hurt performance more than fragmenting the code cache. Taking a closer look, perhaps the root problem is not just the sweep frequency itself, but coupled with the logic in NMethodSweeper::possibly_flush() to determine when to make an nmethod not-entrant. Perhaps the two flags NmethodSweepActivity and MinPassesBeforeFlush could be adjusted accordingly to the higher sweep frequency, to make JVM deoptimize fewer cold but usable nmethods. Do you think I should open a CR to investigate changing the default values of these flags later? 
It would be better if we could deprecate one of these two flags if they serve the same purpose.

-Man

From manc at google.com Sat May 9 01:32:36 2020
From: manc at google.com (Man Cao)
Date: Fri, 8 May 2020 18:32:36 -0700
Subject: RFR(XXS): 8244278: Excessive code cache flushes and sweeps
In-Reply-To:
References: <380ca47b-4143-e98f-ff81-461b394aaf0c@oracle.com> <2d849c3a-f971-b644-16fb-72aba5a919c0@oracle.com> <7cd68157-36fd-d000-f420-2e1e9e0f0143@oracle.com>
Message-ID:

Thanks, Nils!
I've posted some feedback/questions for 8244660.
I'll run some benchmarks with all these patches.

-Man

From eric.c.liu at arm.com Sat May 9 09:44:14 2020
From: eric.c.liu at arm.com (Eric Liu)
Date: Sat, 9 May 2020 09:44:14 +0000
Subject: RFR(S):8242429:Better implementation for signed extract
In-Reply-To:
References: <420844d8-fad2-7e60-2353-398957e965e7@oracle.com>,
Message-ID:

Hi Tobias,

Thanks for your review, and sorry for the extreme delay. I took a holiday in the last couple of days :P

> Could you convert that test to a jtreg test and add it to the webrev?
> And maybe also add test cases for the "0-(A>>31)" into "(A>>>31)" optimization.

I added a jtreg test only; no other files have been changed.
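
For anyone following along, here is a small sanity check of the identity behind the "0-(A>>31)" into "(A>>>31)" rewrite: for a 32-bit int, A >> 31 is either 0 or -1, so negating it gives 0 or 1, which is exactly the sign bit that A >>> 31 extracts. This is only a sketch and not the jtreg test from the webrev. The class and method names are made up for illustration, and the long variant with a shift of 63 is an assumption that the same reasoning carries over to 64-bit values.

```java
// Hypothetical illustration; not the jtreg test from the webrev.
class SignedExtractIdentity {
    // The shape being optimized: negate the arithmetic shift of the sign bit.
    static int negatedArithmeticShift(int a) {
        return 0 - (a >> 31);
    }

    // The cheaper equivalent: a single logical (unsigned) shift.
    static int logicalShift(int a) {
        return a >>> 31;
    }

    // Assumed 64-bit analogue, using a shift of 63 instead of 31.
    static long negatedArithmeticShift(long a) {
        return 0L - (a >> 63);
    }

    static long logicalShift(long a) {
        return a >>> 63;
    }

    // Compares both forms on a handful of boundary values.
    static boolean identityHolds() {
        int[] ints = {Integer.MIN_VALUE, -2, -1, 0, 1, 2, Integer.MAX_VALUE};
        for (int a : ints) {
            if (negatedArithmeticShift(a) != logicalShift(a)) {
                return false;
            }
        }
        long[] longs = {Long.MIN_VALUE, -1L, 0L, 1L, Long.MAX_VALUE};
        for (long a : longs) {
            if (negatedArithmeticShift(a) != logicalShift(a)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("identity holds on samples: " + identityHolds());
    }
}
```

The sample values only spot-check the boundaries; the argument in the paragraph above is what covers every input.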
JBS: https://bugs.openjdk.java.net/browse/JDK-8242429 Webrev: http://cr.openjdk.java.net/~yzhang/ericliu/8242429/webrev.02/ Thanks, Eric From eric.c.liu at arm.com Sat May 9 09:48:22 2020 From: eric.c.liu at arm.com (Eric Liu) Date: Sat, 9 May 2020 09:48:22 +0000 Subject: RFR(S):8242429:Better implementation for signed extract In-Reply-To: References: <420844d8-fad2-7e60-2353-398957e965e7@oracle.com>, , Message-ID: Original link: https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037849.html Thanks, Eric From Yang.Zhang at arm.com Mon May 11 01:50:16 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Mon, 11 May 2020 01:50:16 +0000 Subject: [aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs In-Reply-To: References: Message-ID: Hi, Ping it again. Could you please help to review this? Regards Yang -----Original Message----- From: aarch64-port-dev On Behalf Of Yang Zhang Sent: Wednesday, May 6, 2020 4:46 PM To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: [aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs Hi, Could you please help to review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8243597 Webrev: http://cr.openjdk.java.net/~yzhang/8243597/webrev.00/ In JDK-8222074 [1], x86 enables auto vectorization for integer vector abs, and jtreg tests are also added. In this patch, the missing AbsVB/S/I/L support for AArch64 is added. 
Testing:
  Full jtreg test
  Vector API tests which cover vector abs

Test case:
  public static void absvs(short[] a, short[] b, short[] c) {
      for (int i = 0; i < a.length; i++) {
          c[i] = (short)Math.abs((a[i] + b[i]));
      }
  }

Assembly code generated by C2:
  0x0000ffffaca3f3ac: ldr q17, [x16, #16]
  0x0000ffffaca3f3b0: ldr q16, [x15, #16]
  0x0000ffffaca3f3b4: add v16.8h, v16.8h, v17.8h
  0x0000ffffaca3f3b8: abs v16.8h, v16.8h
  0x0000ffffaca3f3c0: str q16, [x12, #16]

Similar test cases for byte/int/long are also tested and NEON abs instruction is generated by C2.

Performance:
JMH tests are uploaded.
http://cr.openjdk.java.net/~yzhang/8243597/TestScalar.java
http://cr.openjdk.java.net/~yzhang/8243597/TestVect.java

Vector abs:
Before:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB    1024  avgt    5  1041.720 ± 2.606  us/op
TestVect.testVectAbsVI    1024  avgt    5   659.788 ± 2.057  us/op
TestVect.testVectAbsVL    1024  avgt    5   711.043 ± 5.489  us/op
TestVect.testVectAbsVS    1024  avgt    5   659.157 ± 2.531  us/op
After:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB    1024  avgt    5    88.821 ± 1.886  us/op
TestVect.testVectAbsVI    1024  avgt    5   199.081 ± 2.539  us/op
TestVect.testVectAbsVL    1024  avgt    5   447.536 ± 1.195  us/op
TestVect.testVectAbsVS    1024  avgt    5   119.172 ± 0.340  us/op

Scalar abs:
Before:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI       1024  avgt    5  3770.345 ± 6.760  us/op
TestScalar.testAbsL       1024  avgt    5  3767.570 ± 9.097  us/op
After:
Benchmark               (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI       1024  avgt    5  3141.312 ± 2.000  us/op
TestScalar.testAbsL       1024  avgt    5  3103.143 ±
8.989 us/op [1] https://bugs.openjdk.java.net/browse/JDK-8222074 Regards Yang From tobias.hartmann at oracle.com Mon May 11 06:04:30 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 May 2020 08:04:30 +0200 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: References: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> Message-ID: <08b454ed-f960-7e89-c4c9-cd63396de875@oracle.com> +1 Best regards, Tobias On 08.05.20 20:33, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 5/8/20 1:45 AM, Christian Hagedorn wrote: >> Thank you Vladimir and Tobias for your reviews! >> >> On 08.05.20 08:24, Tobias Hartmann wrote: >>> looks good to me too but please use (jio_)snprintf instead of unsafe sprintf. >> >> Thinking of which, we could probably directly use the phase_name parameter and call print_method() >> without allocating a char array and using jio_snprintf. However, it might be a good idea to change >> sprintf into jio_snprintf in Compile::print_method (where I got this code from). There it makes >> sense to use an additional char array to concatenate two strings. >> >> I included those changes together with the typo fix in a new webrev: >> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.01/ >> >> Best regards, >> Christian >> >>> On 07.05.20 20:38, Vladimir Kozlov wrote: >>>> Looks good. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 5/7/20 5:09 AM, Christian Hagedorn wrote: >>>>> Hi >>>>> >>>>> Please review the following debugging enhancement: >>>>> https://bugs.openjdk.java.net/browse/JDK-8244207 >>>>> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ >>>>> >>>>> This enhancement simplifies the usage for printing the ideal graph for visualization with the >>>>> Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. 
>>>>> >>>>> Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call >>>>> igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options >>>>> depending on where the graph should be printed to (file or over network/locally to an opened Ideal >>>>> Graph Visualizer). When choosing file, the output is always printed to a file named >>>>> custom_debug.xml. I think the flexibility to choose another file name is not really required since >>>>> it's only used while debugging. These new igv_print() methods can also be called from gdb without >>>>> setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) >>>>> to work. >>>>> >>>>> The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a >>>>> program trace with rr (and is probably also problematic with related replay tools). The call gets >>>>> stuck somewhere. rr allows to alter some data at a breakpoint but as soon as execution continues >>>>> on the replayed trace, the modifications are undone (except for file writes). This enhancement >>>>> also addresses this such that the new igv_print() methods can be used with rr. However, when >>>>> printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next >>>>> execution stop. To avoid that I added additional rr-specific igv_append() and >>>>> igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file >>>>> without setting up a file header again. This allows all printed graphs to be kept in one file >>>>> which makes it easier to navigate through them. >>>>> >>>>> Thank you! 
>>>>> >>>>> Best regards, >>>>> Christian >>>>> >>>>> >>>>> [1] https://rr-project.org/ From nick.gasson at arm.com Mon May 11 07:56:23 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 11 May 2020 15:56:23 +0800 Subject: RFR: 8244164: AArch64: jaotc generates incorrect code for compressed OOPs with non-zero heap base Message-ID: <858shy9a6w.fsf@arm.com> Hi, Bug: https://bugs.openjdk.java.net/browse/JDK-8244164 Webrev: http://cr.openjdk.java.net/~ngasson/8244164/webrev.0/ On AArch64 if the VM jaotc is running in uses compressed oops with a zero base it will not emit instructions to add/subtract the heap base when compressing/decompressing oops. This causes a crash if the AOT'd shared library is loaded into a VM with non-zero heap base. Tested jtreg hotspot_all_no_apps, jdk_core with no new failures. I've made a separate pull request for Graal [1] but I'm submitting this here too as I want to backport to 14 and 11u and also add a jtreg test. Not sure if this is the right process to follow? [1] https://github.com/oracle/graal/pull/2446 -- Thanks, Nick From christian.hagedorn at oracle.com Mon May 11 08:21:48 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Mon, 11 May 2020 10:21:48 +0200 Subject: [15] RFR(M): 8244207: Simplify usage of Compile::print_method() when debugging with gdb and enable its use with rr In-Reply-To: <08b454ed-f960-7e89-c4c9-cd63396de875@oracle.com> References: <02684646-72d5-43a6-1d8b-6e21b3ba159a@oracle.com> <08b454ed-f960-7e89-c4c9-cd63396de875@oracle.com> Message-ID: Thank you Vladimir and Tobias for reviewing it again! Best regards, Christian On 11.05.20 08:04, Tobias Hartmann wrote: > +1 > > Best regards, > Tobias > > On 08.05.20 20:33, Vladimir Kozlov wrote: >> Good. >> >> Thanks, >> Vladimir >> >> On 5/8/20 1:45 AM, Christian Hagedorn wrote: >>> Thank you Vladimir and Tobias for your reviews! 
>>> >>> On 08.05.20 08:24, Tobias Hartmann wrote: >>>> looks good to me too but please use (jio_)snprintf instead of unsafe sprintf. >>> >>> Thinking of which, we could probably directly use the phase_name parameter and call print_method() >>> without allocating a char array and using jio_snprintf. However, it might be a good idea to change >>> sprintf into jio_snprintf in Compile::print_method (where I got this code from). There it makes >>> sense to use an additional char array to concatenate two strings. >>> >>> I included those changes together with the typo fix in a new webrev: >>> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.01/ >>> >>> Best regards, >>> Christian >>> >>>> On 07.05.20 20:38, Vladimir Kozlov wrote: >>>>> Looks good. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 5/7/20 5:09 AM, Christian Hagedorn wrote: >>>>>> Hi >>>>>> >>>>>> Please review the following debugging enhancement: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8244207 >>>>>> http://cr.openjdk.java.net/~chagedorn/8244207/webrev.00/ >>>>>> >>>>>> This enhancement simplifies the usage for printing the ideal graph for visualization with the >>>>>> Ideal Graph Visualizer when debugging with gdb and enables graph printing with rr [1]. >>>>>> >>>>>> Instead of calling Compile::current()->print_method(PHASE_X, y, z) from gdb, one can now just call >>>>>> igv_print() or igv_print(phase_name) with a custom phase name. There are multiple options >>>>>> depending on where the graph should be printed to (file or over network/locally to an opened Ideal >>>>>> Graph Visualizer). When choosing file, the output is always printed to a file named >>>>>> custom_debug.xml. I think the flexibility to choose another file name is not really required since >>>>>> it's only used while debugging. These new igv_print() methods can also be called from gdb without >>>>>> setting any flags required for the usual calls to Compile::current()->print_method(PHASE_X, y, z) >>>>>> to work. 
>>>>>> >>>>>> The standard Compile::current()->print_method(PHASE_X, y, z) call does not work while debugging a >>>>>> program trace with rr (and is probably also problematic with related replay tools). The call gets >>>>>> stuck somewhere. rr allows to alter some data at a breakpoint but as soon as execution continues >>>>>> on the replayed trace, the modifications are undone (except for file writes). This enhancement >>>>>> also addresses this such that the new igv_print() methods can be used with rr. However, when >>>>>> printing to a file is chosen, igv_print() will overwrite custom_debug.xml again at the next >>>>>> execution stop. To avoid that I added additional rr-specific igv_append() and >>>>>> igv_append(phase_name) methods that simply append a graph to the existing custom_debug.xml file >>>>>> without setting up a file header again. This allows all printed graphs to be kept in one file >>>>>> which makes it easier to navigate through them. >>>>>> >>>>>> Thank you! >>>>>> >>>>>> Best regards, >>>>>> Christian >>>>>> >>>>>> >>>>>> [1] https://rr-project.org/ From tobias.hartmann at oracle.com Mon May 11 08:24:37 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 May 2020 10:24:37 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> Message-ID: <50b961a5-fc3a-c525-1ea4-03e65001b1da@oracle.com> Hi Martin, On 08.05.20 08:40, Tobias Hartmann wrote: > Okay, I'll run it through our startup benchmark suite. Will report back once it finished. The run finished and didn't find any regressions. 
Best regards, Tobias From martin.doerr at sap.com Mon May 11 08:47:17 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 11 May 2020 08:47:17 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <50b961a5-fc3a-c525-1ea4-03e65001b1da@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <3559b0c8-7c40-47f4-e9c5-e1edf2ac1461@oracle.com> <77855ca4-8f7d-81c9-0d7f-67adefedd456@oracle.com> <50b961a5-fc3a-c525-1ea4-03e65001b1da@oracle.com> Message-ID: Hi Tobias, excellent. Thanks a lot for running it. Best regards, Martin > -----Original Message----- > From: Tobias Hartmann > Sent: Montag, 11. Mai 2020 10:25 > To: Doerr, Martin ; Nils Eliasson > ; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin, > > On 08.05.20 08:40, Tobias Hartmann wrote: > > Okay, I'll run it through our startup benchmark suite. Will report back once > it finished. > > The run finished and didn't find any regressions. > > Best regards, > Tobias From tobias.hartmann at oracle.com Mon May 11 08:58:39 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 May 2020 10:58:39 +0200 Subject: RFR(S): 8244407: JVM crashes after transformation in C2 IdealLoopTree::split_fall_in In-Reply-To: References: Message-ID: <05661edd-8cf6-dd1e-0b1a-e2399cbfd767@oracle.com> Hi Felix, your fix looks reasonable to me but Vladimir K. (who reviewed your original fix), should also have a look. Best regards, Tobias On 06.05.20 03:32, Yangfei (Felix) wrote: > Hi, > > Please help review this patch fixing a C2 crash issue. > Bug: https://bugs.openjdk.java.net/browse/JDK-8244407 > Webrev: http://cr.openjdk.java.net/~fyang/8244407/webrev.00 > > After the fix for JDK-8240576, irreducible loop tree might be structurally changed by split_fall_in() in IdealLoopTree::beautify_loops. > But loop tree is not rebuilt after that. 
Take the reported test case for example, irreducible loop tree looks like: > > 1: Loop: N649/N644 > > Loop: N649/N644 IRREDUCIBLE <-- this > Loop: N649/N797 sfpts={ 683 } > > With the fix for JDK-8240576, we won't do merge_many_backedges in IdealLoopTree::beautify_loops for this irreducible loop tree. > > if( _head->req() > 3 && !_irreducible) { > // Merge the many backedges into a single backedge but leave > // the hottest backedge as separate edge for the following peel. > merge_many_backedges( phase ); > result = true; > } > > N660 N644 N797 > | | | > | | | > | v | > | +---+---+ | > +-----> + N649 + <-----+ > +--------+ > > 649 Region === 649 660 797 644 [[ .... ]] !jvms: Test::testMethod @ bci:543 > > Then we come to the children: > > // Now recursively beautify nested loops > if( _child ) result |= _child->beautify_loops( phase ); > > 2: Loop: N649/N797 > > Loop: N649/N644 IRREDUCIBLE > Loop: N649/N797 sfpts={ 683 } <-- this > > After split_fall_in(), N660 and N644 are merged. > > if( fall_in_cnt > 1 ) // Need a loop landing pad to merge fall-ins > split_fall_in( phase, fall_in_cnt ); > > N660 N644 > | + > | | > | | > | +---------+ | > +---->+ N946 +<-----+ > +----+---+ > | N797 > | | > | | > | | > | +--------+ | > +----> + N649 + <-----+ > +--------+ > > Loop tree is now structurally changed into: > > Loop: N946/N644 IRREDUCIBLE > Loop: N649/N797 sfpts={ 683 } > > But local variable 'result' in IdealLoopTree::beautify_loops hasn't got a chance to be set to true since _head->req() is not bigger than 3 after split_fall_in. > Then C2 won't rebuild loop tree after IdealLoopTree::beautify_loops, which further leads to the crash. > Instead of adding extra checking for loop tree structure changes, proposed fix sets 'result' to true when we meet irreducible loop with multiple backedges. > This should be safer and simpler (thus good for JIT compile time). > Tiered 1-3 tested on x86-64 and aarch64 linux platform. Comments? 
> > Thanks, > Felix > From tobias.hartmann at oracle.com Mon May 11 09:03:16 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 May 2020 11:03:16 +0200 Subject: RFR(S):8242429:Better implementation for signed extract In-Reply-To: References: <420844d8-fad2-7e60-2353-398957e965e7@oracle.com> Message-ID: Hi Eric, thanks for adding the test. Looks good to me. Best regards, Tobias On 09.05.20 11:44, Eric Liu wrote: > Hi Tobias, > > Thanks for your review, and sorry for the extreme delay. I took a holiday in the last couple of days:P > > > >> Could you convert that test to a jtreg test and add it to the webrev? > >> And maybe also add test cases for the "0-(A>>31)" into "(A>>>31)" optimization. > > Add a jtreg test only, no other files has been changed. > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8242429 > Webrev: http://cr.openjdk.java.net/~yzhang/ericliu/8242429/webrev.02/ > > Thanks, > Eric > From aph at redhat.com Mon May 11 09:19:33 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 11 May 2020 10:19:33 +0100 Subject: [aarch64-port-dev ] RFR: 8244164: AArch64: jaotc generates incorrect code for compressed OOPs with non-zero heap base In-Reply-To: <858shy9a6w.fsf@arm.com> References: <858shy9a6w.fsf@arm.com> Message-ID: <9ad52446-7779-1fe2-db26-2640f4da5b97@redhat.com> On 5/11/20 8:56 AM, Nick Gasson wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8244164 > Webrev: http://cr.openjdk.java.net/~ngasson/8244164/webrev.0/ > > On AArch64 if the VM jaotc is running in uses compressed oops with a > zero base it will not emit instructions to add/subtract the heap base > when compressing/decompressing oops. This causes a crash if the AOT'd > shared library is loaded into a VM with non-zero heap base. > > Tested jtreg hotspot_all_no_apps, jdk_core with no new failures. OK. > I've made a separate pull request for Graal [1] but I'm submitting this > here too as I want to backport to 14 and 11u and also add a jtreg > test. 
Not sure if this is the right process to follow? > > [1] https://github.com/oracle/graal/pull/2446 Sure. It's good to get input from both sides. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From tobias.hartmann at oracle.com Mon May 11 09:52:01 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 May 2020 11:52:01 +0200 Subject: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none In-Reply-To: References: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com> <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> Message-ID: Hi Zhuoren, thanks for the details. The same problem also affects other optimizations that only check too_many_traps() and not too_many_recompiles(), right? Best regards, Tobias On 08.05.20 11:26, Wang Zhuo(Zhuoren) wrote: > Thanks for your comments. > >> But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? > In GraphKit::uncommon_trap, if too_many_recompiles is true, Action_reinterpret will be changed > to Action_none. > >> Why don't we hit the too_many_traps limit? > Likely that recompilations were caused by other reasons, not Reason_unstable_if. So we did not > hit too_many_traps limit. > Here comes an interesting thing, which byte code causes so many recompilations? > Checkcast is very suspicious because I met a large number of deoptimizations with Reason_null_check > in checkcast. > Put it together, the whole process may be as follows, > 1. Some byte code(maybe checkcast) caused a lot of deoptimizations and recompilations. > 2. The number of deoptimizations and recompilations reached threshold, and in the last compilation > for this method, Reason_unstable_if + Action_reinterpret was changed to Reason_unstable_if > + Action_none. > 3. We went into uncommon trap of Reason_unstable_if, but no more recompilation happened because > of Action_none. 
> Unfortunately I failed to write a standalone test to reproduce Reason_null_check in checkcast. So > the test attached in bug link may be a little different from the process I mentioned above. > >> Also, there is a too_many_traps_or_recompiles method that should be used instead. > Updated patch > http://cr.openjdk.java.net/~wzhuo/8243615/webrev.02/ > > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:Tobias Hartmann > Sent At:2020 May 7 (Thu.) 16:31 > To:Sandler ; hotspot-compiler-dev at openjdk.java.net > > Subject:Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none > > Hi Zhuoren, > > On 26.04.20 14:31, Wang Zhuo(Zhuoren) wrote: > > I met continuous deoptimization w/ Reason_unstable_if and Action_none in an online application and significant performance drop was observed. > > It was found in JDK8 but I think it also existed in tip. > > But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? > > Why don't we hit the too_many_traps limit? > > Also, there is a too_many_traps_or_recompiles method that should be used instead. > > Thanks, > Tobias > From martin.doerr at sap.com Mon May 11 13:32:31 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 11 May 2020 13:32:31 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> Message-ID: Hi Vladimir, are you ok with the updated CSR (https://bugs.openjdk.java.net/browse/JDK-8244507)? Should I set it to proposed? Here's a new webrev with obsoletion + expiration for C2 flags in ClientVM: http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ I've added the new C1 flags to the tests which should test C1 compiler as well. And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which set C2 flags. 
I think this is the best solution because it still allows running the tests with GraalVM compiler. Best regards, Martin > -----Original Message----- > From: Doerr, Martin > Sent: Freitag, 8. Mai 2020 23:07 > To: Vladimir Kozlov ; hotspot-compiler- > dev at openjdk.java.net > Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Vladimir, > > > You need update your CSR - add information about this and above code > change. Example: > > https://bugs.openjdk.java.net/browse/JDK-8238840 > I've updated the CSR with obsolete and expired flags as in the example. > > > I would suggest to fix tests anyway (there are only few) because new > > warning output could be unexpected. > Ok. I'll prepare a webrev with fixed tests. > > Best regards, > Martin > > > > -----Original Message----- > > From: Vladimir Kozlov > > Sent: Freitag, 8. Mai 2020 21:43 > > To: Doerr, Martin ; hotspot-compiler- > > dev at openjdk.java.net > > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > > > Hi Martin > > > > On 5/8/20 5:56 AM, Doerr, Martin wrote: > > > Hi Vladimir, > > > > > > thanks a lot for looking at this, for finding the test issues and for > reviewing > > the CSR. > > > > > > For me, C2 is a fundamental part of the JVM. I would usually never build > > without it ?? > > > (Except if we want to use C1 + GraalVM compiler only.) > > > > Yes it is one of cases. > > > > > But your right, --with-jvm-variants=client configuration should still be > > supported. > > > > Yes. 
> > > > > > > > We can fix it by making the flags as obsolete if C2 is not included: > > > diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp > > > --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 > 2020 > > +0200 > > > +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 > > 2020 +0200 > > > @@ -562,6 +562,16 @@ > > > { "dup option", JDK_Version::jdk(9), > JDK_Version::undefined(), > > JDK_Version::undefined() }, > > > #endif > > > > > > +#ifndef COMPILER2 > > > + // These flags were generally available, but are C2 only, now. > > > + { "MaxInlineLevel", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > + { "InlineSmallCode", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > + { "MaxInlineSize", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > + { "FreqInlineSize", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > + { "MaxTrivialSize", JDK_Version::undefined(), > > JDK_Version::jdk(15), JDK_Version::undefined() }, > > > +#endif > > > + > > > { NULL, JDK_Version(0), JDK_Version(0) } > > > }; > > > > Right. I think you should do full process for these product flags deprecation > > with obsoleting in JDK 16 for VM builds > > which do not include C2. You need update your CSR - add information > about > > this and above code change. Example: > > > > https://bugs.openjdk.java.net/browse/JDK-8238840 > > > > > > > > This makes the VM accept the flags with warning: > > > jdk/bin/java -XX:MaxInlineLevel=9 -version > > > OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; > > support was removed in 15.0 > > > > > > If we do it this way, the only test which I think should get fixed is > > ReservedStackTest. 
> > > I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to > > preserve the inlining behavior. > > > > > > (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. > > compiler/c2 tests: Also written to test C2 specific things.) > > > > > > What do you think? > > > > I would suggest to fix tests anyway (there are only few) because new > > warning output could be unexpected. > > And it will be future-proof when warning will be converted into error > > (if/when C2 goes away). > > > > Thanks, > > Vladimir > > > > > > > > Best regards, > > > Martin > > > > > > > > >> -----Original Message----- > > >> From: hotspot-compiler-dev > >> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > > >> Sent: Donnerstag, 7. Mai 2020 19:11 > > >> To: hotspot-compiler-dev at openjdk.java.net > > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > >> > > >> I would suggest to build VM without C2 and run tests. > > >> > > >> I grepped tests with these flags I found next tests where we need to fix > > >> test's command (add > > >> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > > >> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 > > >> flags (by ussing additional @test block). > > >> > > >> runtime/ReservedStack/ReservedStackTest.java > > >> compiler/intrinsics/string/TestStringIntrinsics2.java > > >> compiler/c2/Test6792161.java > > >> compiler/c2/Test5091921.java > > >> > > >> And there is issue with compiler/compilercontrol tests which use > > >> InlineSmallCode and I am not sure how to handle: > > >> > > >> > > > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > > >> ompiler/compilercontrol/share/scenario/Command.java#l36 > > >> > > >> Thanks, > > >> Vladimir > > >> > > >> On 5/4/20 9:04 AM, Doerr, Martin wrote: > > >>> Hi Nils, > > >>> > > >>> thank you for looking at this and sorry for the late reply. 
> > >>> > > >>> I've added MaxTrivialSize and also updated the issue accordingly. > Makes > > >> sense. > > >>> Do you have more flags in mind? > > >>> > > >>> Moving the flags which are only used by C2 into c2_globals definitely > > makes > > >> sense. > > >>> > > >>> Done in webrev.01: > > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > > >>> > > >>> Please take a look and let me know when my proposal is ready for a > CSR. > > >>> > > >>> Best regards, > > >>> Martin > > >>> > > >>> > > >>>> -----Original Message----- > > >>>> From: hotspot-compiler-dev > >>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > > >>>> Sent: Dienstag, 28. April 2020 18:29 > > >>>> To: hotspot-compiler-dev at openjdk.java.net > > >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > >>>> > > >>>> Hi, > > >>>> > > >>>> Thanks for addressing this! This has been an annoyance for a long > time. > > >>>> > > >>>> Have you though about including other flags - like MaxTrivialSize? > > >>>> MaxInlineSize is tested against it. > > >>>> > > >>>> Also - you should move the flags that are now c2-only to > > c2_globals.hpp. > > >>>> > > >>>> Best regards, > > >>>> Nils Eliasson > > >>>> > > >>>> On 2020-04-27 15:06, Doerr, Martin wrote: > > >>>>> Hi, > > >>>>> > > >>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 > we > > >> had > > >>>> discussed impact on C1. > > >>>>> I still think it's bad to share them between both compilers. We may > > want > > >> to > > >>>> do further C2 tuning without negative impact on C1 in the future. > > >>>>> > > >>>>> C1 has issues with substantial inlining because of the lack of > > uncommon > > >>>> traps. When C1 inlines a lot, stack frames may get large and code > cache > > >> space > > >>>> may get wasted for cold or even never executed code. The situation > > gets > > >>>> worse when many patching stubs get used for such code. 
> > >>>>> > > >>>>> I had opened the following issue: > > >>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > > >>>>> > > >>>>> And my initial proposal is here: > > >>>>> > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > > >>>>> > > >>>>> > > >>>>> Part of my proposal is to add an additional flag which I called > > >>>> C1InlineStackLimit to reduce stack utilization for C1 methods. > > >>>>> I have a simple example which shows wasted stack space (java > > example > > >>>> TestStack at the end). > > >>>>> > > >>>>> It simply counts stack frames until a stack overflow occurs. With the > > >> current > > >>>> implementation, only 1283 frames fit on the stack because the never > > >>>> executed method bogus_test with local variables gets inlined. > > >>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get > > >> 2310 > > >>>> frames until stack overflow. (I only used C1 for this example. Can be > > >>>> reproduced as shown below.) > > >>>>> > > >>>>> I didn't notice any performance regression even with the aggressive > > >> setting > > >>>> of C1InlineStackLimit=5 with TieredCompilation. > > >>>>> > > >>>>> I know that I'll need a CSR for this change, but I'd like to get > feedback > > in > > >>>> general and feedback about the flag names before creating a CSR. > > >>>>> I'd also be glad about feedback regarding the performance impact. 
> > >>>>> > > >>>>> Best regards, > > >>>>> Martin > > >>>>> > > >>>>> > > >>>>> > > >>>>> Command line: > > >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > > >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining > - > > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > > >>>> TestStack > > >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > > >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > > >> recursive > > >>>> inlining too deep > > >>>>> @ 11 TestStack::bogus_test (33 bytes) inline > > >>>>> caught java.lang.StackOverflowError > > >>>>> 1283 activations were on stack, sum = 0 > > >>>>> > > >>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > > >>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining > - > > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > > >>>> TestStack > > >>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > > >>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > > >> recursive > > >>>> inlining too deep > > >>>>> @ 11 TestStack::bogus_test (33 bytes) callee uses > > too > > >>>> much stack > > >>>>> caught java.lang.StackOverflowError > > >>>>> 2310 activations were on stack, sum = 0 > > >>>>> > > >>>>> > > >>>>> TestStack.java: > > >>>>> public class TestStack { > > >>>>> > > >>>>> static long cnt = 0, > > >>>>> sum = 0; > > >>>>> > > >>>>> public static void bogus_test() { > > >>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > > >>>>> sum += c1 + c2 + c3 + c4; > > >>>>> } > > >>>>> > > >>>>> public static void triggerStackOverflow() { > > >>>>> cnt++; > > >>>>> triggerStackOverflow(); > > >>>>> bogus_test(); > > >>>>> } > > >>>>> > > >>>>> > > >>>>> public static void main(String args[]) { > > >>>>> try { > > >>>>> triggerStackOverflow(); > > >>>>> } catch (StackOverflowError e) { > > >>>>> System.out.println("caught " + e); > > >>>>> } > > >>>>> System.out.println(cnt + " activations were on 
stack, sum = " + > > >> sum); > > >>>>> } > > >>>>> } > > >>>>> > > >>> From nils.eliasson at oracle.com Mon May 11 13:42:39 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 11 May 2020 15:42:39 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: Message-ID: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> Hi Man, On 2020-05-09 03:28, Man Cao wrote: > Hi Nils, > > Thanks for fixing this so quickly, and simplifying the logic! > > Some high-level questions and suggestions: > *1. Sweep frequency * > With this new approach, is the sweeping expected to be less > frequent than the current approach, or more frequent? It looks more > frequent to me. It should be a lot less frequent than when the StartAggressiveSweep kicked in because of the bug. I hope that it is about the same as before. The byte threshold is lower, but there are no longer any safepoints that can trigger additional sweeps. I am open to adjusting the threshold until a good balance is achieved. > > If I understand correctly, for the current approach, the conditions to > sweep are: > - Bytes from make_not_entrant_or_zombie() and make_unloaded() reach 1/100 > of ReservedCodeCacheSize > OR > - Heuristics based on ReservedCodeCacheSize/16M, _time_counter > and _last_sweep. > I suppose the second condition is the "Number of safepoints with stack > scans" condition you mentioned, > which is not currently triggered. > > With the new approach, the condition is: > - Bytes from make_not_entrant_or_zombie() and make_unloaded() > reach SweeperThreshold (1/256 of ReservedCodeCacheSize) You understand it correctly. I also capped the threshold at 1 MB, because a user running with a large reserved code cache still wants to clean out L3 code during startup. > > Is it better to make SweeperThreshold default to 1/100 > of ReservedCodeCacheSize like before? I think 1/100 of a 240 MB default code cache seems a bit high.
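To put rough numbers on the two thresholds discussed above, here is a minimal Java sketch. It is not HotSpot code: the class name SweeperThresholdSketch and the method thresholdBytes are hypothetical, and it simply assumes the threshold is ReservedCodeCacheSize divided by a fixed divisor and then capped, as described in this thread.

```java
public class SweeperThresholdSketch {
    static final long MB = 1024L * 1024L;

    // Hypothetical helper: bytes of invalidated code that must accumulate
    // before a sweep is requested. Assumes threshold = min(reserved / divisor, cap).
    static long thresholdBytes(long reservedCodeCacheBytes, long divisor, long capBytes) {
        return Math.min(reservedCodeCacheBytes / divisor, capBytes);
    }

    public static void main(String[] args) {
        long reserved = 240 * MB; // the 240 MB default code cache mentioned above

        // Old rule from the thread: 1/100 of the reserved size, no cap.
        long oldRule = thresholdBytes(reserved, 100, Long.MAX_VALUE);
        // New rule from the thread: 1/256 of the reserved size, capped at 1 MB.
        long newRule = thresholdBytes(reserved, 256, MB);

        System.out.println("1/100 of 240 MB        : " + oldRule + " bytes");
        System.out.println("1/256 of 240 MB, capped: " + newRule + " bytes");
    }
}
```

With a 240 MB cache the 1/256 rule stays just under the 1 MB cap, so the cap only matters for larger reserved sizes.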
During startup we produce a lot of L3 code that will be thrown away. We want to recycle it fairly quickly, to avoid fragmenting the code cache, but not so often that we affect startup. I've done some startup measurements, and then we sweep about every other second in a benchmark that produces a lot of code. What results are you seeing? > Also, could it be a percentage instead of a byte-size value? > In our experience, a percentage value is also easier to maintain for > production users. SweeperThreshold could absolutely be a percentage. I will change that. > > We also would like to reduce the default sweep frequency, especially for > -XX:-TieredCompilation. > Because in JDK11, we have seen the higher sweep frequency caused regression > compared to JDK8, > and turning off code cache flushing could significantly improve performance. Code cache flushing has another heuristic - it might be broken too. But it would be interesting to see how it works with the new sweep heuristic. If you know that you have enough code cache - turning it off is no loss. It only helps when you are running out of code cache. > > *2. Sweep and make non-entrant* >> The threshold is capped at 1M because even if you have an enormous code >> cache - you don't want to fragment it, and you probably don't want to >> commit more than needed. > It is possible that sweeping will deoptimize some cold nmethods that will > be used soon. > Such deoptimizations could hurt performance more than fragmenting the code > cache. When we are doing normal sweeping - we don't deoptimize cold code. That is handled by the method flushing - it should only kick in when we start to run out of code cache. > Taking a closer look, perhaps the root problem is not just the sweep > frequency itself, but coupled > with the logic in NMethodSweeper::possibly_flush() to determine when to > make an nmethod not-entrant.
> Perhaps the two flags NmethodSweepActivity and MinPassesBeforeFlush could > be adjusted > accordingly to the higher sweep frequency, to make JVM deoptimize fewer > cold but usable nmethods. > > Do you think I should open a CR to investigate changing the default values > of these flags later? > It would be better if we could deprecate one of these two flags if they > serve the same purpose. I think we should address MethodFlushing in a separate RFE/BUG. The heuristics for CodeAging may have been negatively affected by the transition to handshakes. Also the SetHotnessClosure should be replaced by a mechanism using the NMethodEntry barriers. I see that we are missing JFR events for MethodFlushing. I have created another patch for that. Best regards, Nils Eliasson > -Man From zhuoren.wz at alibaba-inc.com Mon May 11 12:58:47 2020 From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren)) Date: Mon, 11 May 2020 20:58:47 +0800 Subject: Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none In-Reply-To: References: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com> <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> , Message-ID: Hi Tobias, Theoretically speaking other optimizations, with Action_maybe_recompile or Action_reinterpret, can be affected, because in uncommon_trap, Action_maybe_recompile and Action_reinterpret will be changed to Action_none if too many recompiles happened. While I have only met this issue with Reason_unstable_if so far. Regards, Zhuoren ------------------------------------------------------------------ From:Tobias Hartmann Sent At:2020 May 11 (Mon.) 17:52 To:Sandler ; hotspot-compiler-dev at openjdk.java.net Subject:Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none Hi Zhuoren, thanks for the details.
The same problem also affects other optimizations that only check too_many_traps() and not too_many_recompiles(), right? Best regards, Tobias On 08.05.20 11:26, Wang Zhuo(Zhuoren) wrote: > Thanks for your comments. > >> But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? > In GraphKit::uncommon_trap, if too_many_recompiles is true, Action_reinterpret will be changed > to Action_none. > >> Why don't we hit the too_many_traps limit? > Likely that recompilations were caused by other reasons, not Reason_unstable_if. So we did not > hit too_many_traps limit. > Here comes an interesting thing, which byte code causes so many recompilations? > Checkcast is very suspicious because I met a large number of deoptimizations with Reason_null_check > in checkcast. > Put it together, the whole process may be as follows, > 1. Some byte code(maybe checkcast) caused a lot of deoptimizations and recompilations. > 2. The number of deoptimizations and recompilations reached threshold, and in the last compilation > for this method, Reason_unstable_if + Action_reinterpret was changed to Reason_unstable_if > + Action_none. > 3. We went into uncommon trap of Reason_unstable_if, but no more recompilation happened because > of Action_none. > Unfortunately I failed to write a standalone test to reproduce Reason_null_check in checkcast. So > the test attached in bug link maybe a little different from the process I mentioned above. > >> Also, there is a too_many_traps_or_recompiles method that should be used instead. > Updated patch > http://cr.openjdk.java.net/~wzhuo/8243615/webrev.02/ > > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:Tobias Hartmann > Sent At:2020 May 7 (Thu.) 
16:31 > To:Sandler ; hotspot-compiler-dev at openjdk.java.net > > Subject:Re: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none > > Hi Zhuoren, > > On 26.04.20 14:31, Wang Zhuo(Zhuoren) wrote: > > I met continuous deoptimization w/ Reason_unstable_if and Action_none in an online application and significant performance drop was observed. > > It was found in JDK8 but I think it also existed in tip. > > But the Reason_unstable_if uncommon trap has Action_reinterpret (not Action_none) set? > > Why don't we hit the too_many_traps limit? > > Also, there is a too_many_traps_or_recompiles method that should be used instead. > > Thanks, > Tobias > From evgeny.nikitin at oracle.com Mon May 11 15:07:38 2020 From: evgeny.nikitin at oracle.com (Evgeny Nikitin) Date: Mon, 11 May 2020 17:07:38 +0200 Subject: RFR(XS): 8244282: Add modules to a jtreg test. Message-ID: <26ab81b5-02f2-6e7b-1f71-0faa6e41a42d@oracle.com> Hi, Bug: https://bugs.openjdk.java.net/browse/JDK-8244282 Webrev: http://cr.openjdk.java.net/~enikitin/8244282/webrev.00/ Test fails with '--illegal-access=deny' due to necessary module being not specified. Fixed, tested with jtreg (fails without the change, passes with it). Please review. Thanks in advance, /Evgeny Nikitin. From mikael.vidstedt at oracle.com Mon May 11 18:18:30 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 11 May 2020 11:18:30 -0700 Subject: RFR: 8244224: Implementation of JEP 381: Remove the Solaris and SPARC Ports (hotspot) In-Reply-To: <048a8e25-fec4-5304-8763-6a4472bfbf54@oracle.com> References: <394BD86E-EC91-440E-9936-696B2B453093@oracle.com> <048a8e25-fec4-5304-8763-6a4472bfbf54@oracle.com> Message-ID: <3810FA34-8AC5-45F0-B0DB-57C28324FECB@oracle.com> > On May 8, 2020, at 12:48 PM, Daniel D. 
Daugherty wrote: > > On 5/7/20 1:35 AM, Mikael Vidstedt wrote: >> New webrev here: >> >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot/open/webrev/ > > This pretty much says it all: > > > Summary of changes: 90904 lines changed: 8 ins; 90725 del; 171 mod; 103780 unchg > > > My review is focused on looking at the changes and not looking for missed > changes. I figure there's enough work here just looking at the changes to > keep me occupied for a while and enough people have posted comments about > finding other things to be examined, etc... > > Unlike my normal reviews, I won't be listing all the touched files; > (there's _only_ 427 of them...) > > Don't forget to make a copyright year update pass before you push. Yup - I have added it in 10 different places on my TODO list to maximize the likelihood of me remembering it :) > > src/hotspot/os/posix/os_posix.hpp > L174 > old L175 #ifndef SOLARIS > L176 > nit - on most of this style of deletion you also got rid of > one of the blank lines, but not here. Oops, will fix. > src/hotspot/share/utilities/dtrace.hpp > old L42: #elif defined(__APPLE__) > old L44: #include > old L45: #else > new L32: #include > was previous included only for __APPLE__ and it > is now there for every platform. Any particular reason? No particular reason other than "it looks cleaner". I guess we could see if the include can be removed altogether. > Thumbs up! Thanks for the review!! Cheers, Mikael > >> incremental: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.01/hotspot.incr/open/webrev/ >> >> Remaining items: >> >> * File follow-up to remove STACK_BIAS >> >> * File follow-ups to change/update/remove flags and/or flag documentation: UseLWPSynchronization, BranchOnRegister, LIRFillDelaySlots, ArrayAllocatorMallocLimit, ThreadPriorityPolicy >> >> * File follow-up(s) to update comments ("solaris", "sparc", "solstudio", "sunos", "sun studio", "s compiler bug", "niagara", ...)
>> >> >> Please let me know if there's something I have missed! >> >> Cheers, >> Mikael >> >>> On May 3, 2020, at 10:12 PM, Mikael Vidstedt wrote: >>> >>> >>> Please review this change which implements part of JEP 381: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8244224 >>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/hotspot/open/webrev/ >>> JEP: https://bugs.openjdk.java.net/browse/JDK-8241787 >>> >>> >>> Note: When reviewing this, please be aware that this exercise was *extremely* mind-numbing, so I appreciate your help reviewing all the individual changes carefully. You may want to get that coffee cup filled up (or whatever keeps you awake)! >>> >>> >>> Background: >>> >>> Because of the size of the total patch and wide range of areas touched, this patch is one out of in total six partial patches which together make up the necessary changes to remove the Solaris and SPARC ports. The other patches are being sent out for review to mailing lists appropriate for the respective areas they touch. An email will be sent to jdk-dev summarizing all the patches/reviews. To be clear: this patch is *not* in itself complete and stand-alone - all of the (six) patches are needed to form a complete patch. Some changes in this patch may look wrong or incomplete unless also looking at the corresponding changes in other areas. >>> >>> For convenience, I'm including a link below[1] to the full webrev, but in case you have comments on changes in other areas, outside of the files included in this thread, please provide those comments directly in the thread on the appropriate mailing list for that area if possible. >>> >>> In case it helps, the changes were effectively produced by searching for and updating any code mentioning "solaris", "sparc", "solstudio", "sunos", etc. More information about the areas impacted can be found in the JEP itself. >>> >>> A big thank you to Igor Ignatyev for helping make the changes to the hotspot tests!
>>> >>> Also, I have a short list of follow-ups which I'm going to look at separately from this JEP/patch, mainly related to command line options/flags which are no longer relevant and should be deprecated/obsoleted/removed. >>> >>> Testing: >>> >>> A slightly earlier version of this change successfully passed tier1-8, as well as client tier1-2. Additional testing will be done after the first round of reviews has been completed. >>> >>> Cheers, >>> Mikael >>> >>> [1] http://cr.openjdk.java.net/~mikael/webrevs/8244224/webrev.00/all/open/webrev/ >>> > From john.r.rose at oracle.com Mon May 11 20:15:05 2020 From: john.r.rose at oracle.com (John Rose) Date: Mon, 11 May 2020 13:15:05 -0700 Subject: RFR:8243615 Continuous deoptimizations with Reason=unstable_if and Action=none In-Reply-To: References: <272f8207-0b1e-4b34-b1d4-0f562b4da9d1.zhuoren.wz@alibaba-inc.com> <4dc2e0ef-315b-a72b-bb8c-6b5f418765ed@oracle.com> Message-ID: On May 11, 2020, at 5:58 AM, Wang Zhuo(Zhuoren) wrote: > > Theoretically speaking other optimizations, with Action_maybe_recompile or Action_reinterpret, can be affected, because in uncommon_trap, Action_maybe_recompile and Action_reinterpret will be changed to Action_none if too many recompiles happened. > While I have only met this issue with Reason_unstable_if so far. Here's some background: The too_many_traps logic is like those barrels full of sand or water at the edges of freeway intersections, or a backstop on a baseball field. It's better to have a backstop than to have none at all, but something is wrong if you are hitting the backstop. In short, the too_many_traps logic is present to prevent trap storms from lasting forever. But even short trap storms are a problem, if they happen often enough. Also, the too_many_traps logic has in the past failed to terminate trap storms. I think the bug here is probably whatever specific factor is causing small trap storms, which in turn are triggering too_many_traps.
Maybe there's a bytecode that is trapping too often, and that bytecode individually is not throttling its own traps, and so the generic backstop logic is being called into play. (Less likely, the generic throttling logic needs some fix. But usually the right fix is at the root cause, with a single optimization or bytecode that is going wrong too often.) Sometimes it's one bytecode running one corner case optimization that is trapping too many times, as if the JIT's optimizer were saying to itself "last time this optimization failed, but this time for sure!", or as if the JIT's optimizer has no feedback path at all to see that the optimization has failed in the past. Sometimes it is a whole class of bytecodes, such as "all check-casts arising from generic erasure." HTH - John From eric.c.liu at arm.com Tue May 12 03:57:49 2020 From: eric.c.liu at arm.com (Eric Liu) Date: Tue, 12 May 2020 03:57:49 +0000 Subject: RFR(S):8242429:Better implementation for signed extract In-Reply-To: References: <420844d8-fad2-7e60-2353-398957e965e7@oracle.com> , Message-ID: Hi Tobias, Thanks for your review. Ningsheng has helped me to push this patch. Also thanks @Vladimir:P B&R, Eric From ioi.lam at oracle.com Tue May 12 05:36:28 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 11 May 2020 22:36:28 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp Message-ID: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8244775 http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ Currently 231 .o files depend on jfrEvents.hpp, which pulls in a lot of stuff and slows down the HotSpot build. I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. Now the number is down to 65 .o files. On my machine, the debug build goes from 2m19s to 2m01s. Testing: passed mach5 tiers 1/2/3.
Thanks - Ioi From david.holmes at oracle.com Tue May 12 07:58:52 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 12 May 2020 17:58:52 +1000 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> Message-ID: <65dcfd1f-5e7e-b9e1-8298-5daafcda8a81@oracle.com> Hi, Sorry for the delay in getting back to this. On 5/05/2020 7:37 pm, Liu, Xin wrote: > Hello, David and Nils > > Thank you to review the patch. I went to brush up my English grammar and then update my patch to rev04. > https://cr.openjdk.java.net/~xliu/8151779/04/webrev/ > Here is the incremental diff: https://cr.openjdk.java.net/~xliu/8151779/r3_to_r4.diff It reflect changes based on David's feedbacks. I really appreciate that you review so carefully and found so many invaluable suggestions. TBH, I don't understand Amazon's copyright header neither. I choose the simple way to dodge that problem. In vmSymbols.hpp + // 1. Disable/Control Intrinsic accept a list of intrinsic IDs. s/accept/accepts/ + // their final value are subject to hardware inspection (VM_Version::initialize). s/value/values/ Otherwise all my nits have been addressed - thanks. I don't need to see a further webrev. Thanks, David ----- > Nils points out a very tricky question. Yes, I also notice that each TriBool takes 4 bytes on x86_64. It's a natural machine word and supposed to be the most efficient form. As a result, the vector control_words take about 1.3Kb for all intrinsics. I thought it's not a big deal, but Nils brought up that each DirectiveSet will increase from 128b to 1440b. Theoretically, the user may provide a CompileCommandFile which consists of hundreds of directives. Will hotspot have hundreds of DirectiveSet in that case? 
> > Actually, I do have a compacted container of TriBool. It's like a vector specialization. > https://cr.openjdk.java.net/~xliu/8151779/TriBool.cpp > > The reason I didn't include it because I still feel that a few KiloBytes memories are not a big deal. Nowadays, hotspot allows Java programmers allocate over 100G heap. Is it wise to increase software complexity to save KBs? > > If you think it matters, I can integrate it. May I update TriBoolArray in a standalone JBS? I have made a lot of changes. I hope I can verify them using KitchenSink? > > For the second problem, I think it's because I used 'memset' to initialize an array of objects in rev01. Previously, I had code like this: > memset(&_intrinsic_control_words[0], 0, sizeof(_intrinsic_control_words)); > > This kind of usage will be warned as -Werror=class-memaccess in g++-8. I have fixed it since rev02. I use DirectiveSet::fill_in(). Please check out. > > Thanks, > --lx > From martin.doerr at sap.com Tue May 12 08:42:31 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 12 May 2020 08:42:31 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> Message-ID: Hi Richard, I had already reviewed webrev.1. Looks good to me. Thanks for contributing it. Best regards, Martin > -----Original Message----- > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > Sent: Montag, 4. 
Mai 2020 12:33 > To: David Holmes ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > Subject: RE: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > // Trimmed the list of recipients. If the list gets too long then the message > needs to be approved > // by a moderator. > > Hi David, > > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: > > > Hi David, > > > > > >> Not a review but some general commentary ... > > > > > > That's welcome. > > > Having had to take an even closer look now I have a review comment too :) > > > src/hotspot/share/prims/jvmtiThreadState.cpp > > > void JvmtiThreadState::invalidate_cur_stack_depth() { > > ! assert(SafepointSynchronize::is_at_safepoint() || > > ! (Thread::current()->is_VM_thread() && > > get_thread()->is_vmthread_processing_handshake()) || > > (JavaThread *)Thread::current() == get_thread(), > > "must be current thread or at safepoint"); > > You're looking at an outdated webrev, I'm afraid. > > This would be the post with the current webrev.1 > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > April/031245.html > > Thanks, Richard. > > -----Original Message----- > From: David Holmes > Sent: Montag, 4. 
Mai 2020 08:51 > To: Reingruber, Richard ; Yasumasa Suenaga > ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > Hi Richard, > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: > > Hi David, > > > >> Not a review but some general commentary ... > > > > That's welcome. > > Having had to take an even closer look now I have a review comment too :) > > src/hotspot/share/prims/jvmtiThreadState.cpp > > void JvmtiThreadState::invalidate_cur_stack_depth() { > ! assert(SafepointSynchronize::is_at_safepoint() || > ! (Thread::current()->is_VM_thread() && > get_thread()->is_vmthread_processing_handshake()) || > (JavaThread *)Thread::current() == get_thread(), > "must be current thread or at safepoint"); > > The message needs updating to include handshakes. > > More below ... > > >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: > >>> Hi Yasumasa, Patricio, > >>> > >>>>>> I will send review request to replace VM_SetFramePop to > handshake in early next week in JDK-8242427. > >>>>>> Does it help you? I think it gives you to remove workaround. > >>>>> > >>>>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. So you would have to > >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt > to these changes. > >>> > >>>> Thanks for your information. > >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >>>> I will modify and will test it after yours. 
> >>> > >>> Thanks :) > >>> > >>>>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>>>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>>>> to me, how this has to be handled. > >>> > >>>> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >>> > >>> Yes. To me it is unclear what synchronization is necessary, if it is called > during a handshake. And > >>> also I'm unsure if a thread should do safepoint checks while executing a > handshake. > > > >> I'm growing increasingly concerned that use of direct handshakes to > >> replace VM operations needs a much greater examination for correctness > >> than might initially be thought. I see a number of issues: > > > > I agree. I'll address your concerns in the context of this review thread for > JDK-8238585 below. > > > > In addition I would suggest to take the general part of the discussion to a > dedicated thread or to > > the review thread for JDK-8242427. I would like to keep this thread closer > to its subject. > > I will focus on the issues in the context of this particular change > then, though the issues themselves are applicable to all handshake > situations (and more so with direct handshakes). This is mostly just > discussion. > > >> First, the VMThread executes (most) VM operations with a clean stack in > >> a clean state, so it has lots of room to work. If we now execute the > >> same logic in a JavaThread then we risk hitting stackoverflows if > >> nothing else. 
But we are also now executing code in a JavaThread and so > >> we have to be sure that code is not going to act differently (in a bad > >> way) if executed by a JavaThread rather than the VMThread. For > example, > >> may it be possible that if executing in the VMThread we defer some > >> activity that might require execution of Java code, or else hand it off > >> to one of the service threads? If we execute that code directly in the > >> current JavaThread instead we may not be in a valid state (e.g. consider > >> re-entrancy to various subsystems that is not allowed). > > > > It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is > doing. I already added a > > paragraph to the JBS-Item [1] explaining why the direct handshake is > sufficient from a > > synchronization point of view. > > Just to be clear, your proposed change is not using a direct handshake. > > > Furthermore the stack is walked and the return pc of compiled frames is > replaced with the address of > > the deopt handler. > > > > I can't see why this cannot be done with a direct handshake. Something > very similar is already done > > in JavaThread::deoptimize_marked_methods() which is executed as part > of an ordinary handshake. > > Note that existing non-direct handshakes may also have issues that not > have been fully investigated. > > > The demand on stack-space should be very modest. I would not expect a > higher risk for stackoverflow. > > For the target thread if you use more stack than would be used stopping > at a safepoint then you are at risk. For the thread initiating the > direct handshake if you use more stack than would be used enqueuing a VM > operation, then you are at risk. As we have not quantified these > numbers, nor have any easy way to establish the stack use of the actual > code to be executed, we're really just hoping for the best. This is a > general problem with handshakes that needs to be investigated more > deeply. 
As a simple, general, example just imagine if the code involves > logging that might utilise an on-stack buffer. > > >> Second, we have this question mark over what happens if the operation > >> hits further safepoint or handshake polls/checks? Are there constraints > >> on what is allowed here? How can we recognise this problem may exist > and > >> so deal with it? > > > > The thread in EnterInterpOnlyModeClosure::do_thread() can't become > safepoint/handshake safe. I > > tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a > NoSafepointVerifier. > > That's good to hear but such tests are not exhaustive, they will detect > if you do reach a safepoint/handshake but they can't prove that you > cannot reach one. What you have done is necessary but may not be > sufficient. Plus you didn't actually add the NSV to the code - is there > a reason we can't actually keep it in do_thread? (I'm not sure if the > NSV also acts as a NoHandshakeVerifier?) > > >> Third, while we are generally considering what appear to be > >> single-thread operations, which should be amenable to a direct > >> handshake, we also have to be careful that some of the code involved > >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may > >> not need to take a lock where a direct handshake might! > > > > See again my arguments in the JBS item [1]. > > Yes I see the reasoning and that is good. My point is a general one as > it may not be obvious when such assumptions exist in the current code. > > Thanks, > David > > > Thanks, > > Richard. > > > > [1] https://bugs.openjdk.java.net/browse/JDK-8238585 > > > > -----Original Message----- > > From: David Holmes > > Sent: Montag, 27. 
April 2020 07:16 > > To: Reingruber, Richard ; Yasumasa Suenaga > ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > > Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > > > Hi all, > > > > Not a review but some general commentary ... > > > > On 25/04/2020 2:08 am, Reingruber, Richard wrote: > >> Hi Yasumasa, Patricio, > >> > >>>>> I will send review request to replace VM_SetFramePop to handshake > in early next week in JDK-8242427. > >>>>> Does it help you? I think it gives you to remove workaround. > >>>> > >>>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. So you would have to > >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt > to these changes. > >> > >>> Thanks for your information. > >>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >>> I will modify and will test it after yours. > >> > >> Thanks :) > >> > >>>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>>> to me, how this has to be handled. 
> >> > >>> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >> > >> Yes. To me it is unclear what synchronization is necessary, if it is called > during a handshake. And > >> also I'm unsure if a thread should do safepoint checks while executing a > handshake. > > > > I'm growing increasingly concerned that use of direct handshakes to > > replace VM operations needs a much greater examination for correctness > > than might initially be thought. I see a number of issues: > > > > First, the VMThread executes (most) VM operations with a clean stack in > > a clean state, so it has lots of room to work. If we now execute the > > same logic in a JavaThread then we risk hitting stackoverflows if > > nothing else. But we are also now executing code in a JavaThread and so > > we have to be sure that code is not going to act differently (in a bad > > way) if executed by a JavaThread rather than the VMThread. For example, > > may it be possible that if executing in the VMThread we defer some > > activity that might require execution of Java code, or else hand it off > > to one of the service threads? If we execute that code directly in the > > current JavaThread instead we may not be in a valid state (e.g. consider > > re-entrancy to various subsystems that is not allowed). > > > > Second, we have this question mark over what happens if the operation > > hits further safepoint or handshake polls/checks? Are there constraints > > on what is allowed here? How can we recognise this problem may exist and > > so deal with it? > > > > Third, while we are generally considering what appear to be > > single-thread operations, which should be amenable to a direct > > handshake, we also have to be careful that some of the code involved > > doesn't already expect/assume we are at a safepoint - e.g. a VM op may > > not need to take a lock where a direct handshake might! 
> > > > Cheers, > > David > > ----- > > > >> @Patricio, coming back to my question [1]: > >> > >> In the example you gave in your answer [2]: the java thread would > execute a vm operation during a > >> direct handshake operation, while the VMThread is actually in the middle > of a VM_HandshakeAllThreads > >> operation, waiting to handshake the same handshakee: why can't the > VMThread just proceed? The > >> handshakee would be safepoint safe, wouldn't it? > >> > >> Thanks, Richard. > >> > >> [1] https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14301677 > >> > >> [2] https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14301763 > >> > >> -----Original Message----- > >> From: Yasumasa Suenaga > >> Sent: Freitag, 24. April 2020 17:23 > >> To: Reingruber, Richard ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >> > >> Hi Richard, > >> > >> On 2020/04/24 23:44, Reingruber, Richard wrote: > >>> Hi Yasumasa, > >>> > >>>> I will send review request to replace VM_SetFramePop to handshake > in early next week in JDK-8242427. > >>>> Does it help you? I think it gives you to remove workaround. > >>> > >>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. 
So you would have to > >>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to > these changes. > >> > >> Thanks for your information. > >> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >> I will modify and will test it after yours. > >> > >> > >>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>> to me, how this has to be handled. > >> > >> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >> > >> > >> Thanks, > >> > >> Yasumasa > >> > >> > >>> So it appears to me that it would be easier to push JDK-8242427 after this > (JDK-8238585). > >>> > >>>> (The patch is available, but I want to see the result of PIT in this > weekend whether JDK-8242425 works fine.) > >>> > >>> Would be interesting to see how you handled the issues above :) > >>> > >>> Thanks, Richard. > >>> > >>> [1] See question in comment > https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14302030 > >>> > >>> -----Original Message----- > >>> From: Yasumasa Suenaga > >>> Sent: Freitag, 24. 
April 2020 13:34 > >>> To: Reingruber, Richard ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>> > >>> Hi Richard, > >>> > >>> I will send review request to replace VM_SetFramePop to handshake in > early next week in JDK-8242427. > >>> Does it help you? I think it gives you to remove workaround. > >>> > >>> (The patch is available, but I want to see the result of PIT in this weekend > whether JDK-8242425 works fine.) > >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>> On 2020/04/24 17:18, Reingruber, Richard wrote: > >>>> Hi Patricio, Vladimir, and Serguei, > >>>> > >>>> now that direct handshakes are available, I've updated the patch to > make use of them. > >>>> > >>>> In addition I have done some clean-up changes I missed in the first > webrev. > >>>> > >>>> Finally I have implemented the workaround suggested by Patricio to > avoid nesting the handshake > >>>> into the vm operation VM_SetFramePop [1] > >>>> > >>>> Kindly review again: > >>>> > >>>> Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > >>>> Webrev(delta): > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > >>>> > >>>> I updated the JBS item explaining why the vm operation > VM_EnterInterpOnlyMode can be replaced with a > >>>> direct handshake: > >>>> > >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 > >>>> > >>>> Testing: > >>>> > >>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release > builds on all platforms. > >>>> > >>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 > >>>> > >>>> Thanks, > >>>> Richard. 
> >>>> > >>>> [1] An assertion in Handshake::execute_direct() fails, if called by > VMThread, because it is not a JavaThread. > >>>> > >>>> -----Original Message----- > >>>> From: hotspot-dev On > Behalf Of Reingruber, Richard > >>>> Sent: Freitag, 14. Februar 2020 19:47 > >>>> To: Patricio Chilano ; > serviceability-dev at openjdk.java.net; hotspot-compiler- > dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > >>>> Subject: RE: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>> > >>>> Hi Patricio, > >>>> > >>>> > > I'm really glad you noticed the problematic nesting. This seems > to be a general issue: currently a > >>>> > > handshake cannot be nested in a vm operation. Maybe it should > be asserted in the > >>>> > > Handshake::execute() methods that they are not called by the > vm thread evaluating a vm operation? > >>>> > > > >>>> > > > Alternatively I think you could do something similar to what > we do in > >>>> > > > Deoptimization::deoptimize_all_marked(): > >>>> > > > > >>>> > > > EnterInterpOnlyModeClosure hs; > >>>> > > > if (SafepointSynchronize::is_at_safepoint()) { > >>>> > > > hs.do_thread(state->get_thread()); > >>>> > > > } else { > >>>> > > > Handshake::execute(&hs, state->get_thread()); > >>>> > > > } > >>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to > the > >>>> > > > HandshakeClosure() constructor) > >>>> > > > >>>> > > Maybe this could be used also in the Handshake::execute() > methods as general solution? > >>>> > Right, we could also do that.
Avoiding to clear the polling page in > >>>> > HandshakeState::clear_handshake() should be enough to fix this > issue and > >>>> > execute a handshake inside a safepoint, but adding that "if" > statement > >>>> > in Handshake::execute() sounds good to avoid all the extra code > that we > >>>> > go through when executing a handshake. I filed 8239084 to make > that change. > >>>> > >>>> Thanks for taking care of this and creating the RFE. > >>>> > >>>> > > >>>> > > > I don't know JVMTI code so I'm not sure if > VM_EnterInterpOnlyMode is > >>>> > > > always called in a nested operation or just sometimes. > >>>> > > > >>>> > > At least one execution path without vm operation exists: > >>>> > > > >>>> > > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) > : void > >>>> > > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState > *) : jlong > >>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void > >>>> > > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 > matches) > >>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, > bool) : void > >>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : > jvmtiError > >>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : > jvmtiError > >>>> > > > >>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't > my main intent to replace it with a > >>>> > > handshake, but to avoid making the compiled methods on stack > not_entrant.... unless I'm further > >>>> > > encouraged to do it with a handshake :) > >>>> > Ah! I think you can still do it with a handshake with the > >>>> > Deoptimization::deoptimize_all_marked() like solution. I can > change the > >>>> > if-else statement with just the Handshake::execute() call in > 8239084. > >>>> > But up to you. : ) > >>>> > >>>> Well, I think that's enough encouragement :) > >>>> I'll wait for 8239084 and try then again.
> >>>> (no urgency and all) > >>>> > >>>> Thanks, > >>>> Richard. > >>>> > >>>> -----Original Message----- > >>>> From: Patricio Chilano > >>>> Sent: Freitag, 14. Februar 2020 15:54 > >>>> To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>> > >>>> Hi Richard, > >>>> > >>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: > >>>>> Hi Patricio, > >>>>> > >>>>> thanks for having a look. > >>>>> > >>>>> > I'm only commenting on the handshake changes. > >>>>> > I see that operation VM_EnterInterpOnlyMode can be called > inside > >>>>> > operation VM_SetFramePop which also allows nested > operations. Here is a > >>>>> > comment in VM_SetFramePop definition: > >>>>> > > >>>>> > // Nested operation must be allowed for the > VM_EnterInterpOnlyMode that is > >>>>> > // called from the > JvmtiEventControllerPrivate::recompute_thread_enabled. > >>>>> > > >>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, > then now we > >>>>> > could have a handshake inside a safepoint operation. The issue I > see > >>>>> > there is that at the end of the handshake the polling page of the > target > >>>>> > thread could be disarmed. So if the target thread happens to be > in a > >>>>> > blocked state just transiently and wakes up then it will not stop > for > >>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the > >>>>> > polling page is armed at the beginning of disarm_safepoint(). > >>>>> > >>>>> I'm really glad you noticed the problematic nesting. This seems to be a > general issue: currently a > >>>>> handshake cannot be nested in a vm operation.
Maybe it should be > asserted in the > >>>>> Handshake::execute() methods that they are not called by the vm > thread evaluating a vm operation? > >>>>> > >>>>> > Alternatively I think you could do something similar to what we > do in > >>>>> > Deoptimization::deoptimize_all_marked(): > >>>>> > > >>>>> > EnterInterpOnlyModeClosure hs; > >>>>> > if (SafepointSynchronize::is_at_safepoint()) { > >>>>> > hs.do_thread(state->get_thread()); > >>>>> > } else { > >>>>> > Handshake::execute(&hs, state->get_thread()); > >>>>> > } > >>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the > >>>>> > HandshakeClosure() constructor) > >>>>> > >>>>> Maybe this could be used also in the Handshake::execute() methods > as general solution? > >>>> Right, we could also do that. Avoiding to clear the polling page in > >>>> HandshakeState::clear_handshake() should be enough to fix this issue > and > >>>> execute a handshake inside a safepoint, but adding that "if" statement > >>>> in Handshake::execute() sounds good to avoid all the extra code that we > >>>> go through when executing a handshake. I filed 8239084 to make that > change. > >>>> > >>>>> > I don't know JVMTI code so I'm not sure if > VM_EnterInterpOnlyMode is > >>>>> > always called in a nested operation or just sometimes.
> >>>>> > >>>>> At least one execution path without vm operation exists: > >>>>> > >>>>> > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) > : void > >>>>> > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState > *) : jlong > >>>>> JvmtiEventControllerPrivate::recompute_enabled() : void > >>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, > bool) : void (2 matches) > >>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : > void > >>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError > >>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : > jvmtiError > >>>>> > >>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my > main intent to replace it with a > >>>>> handshake, but to avoid making the compiled methods on stack > not_entrant.... unless I'm further > >>>>> encouraged to do it with a handshake :) > >>>> Ah! I think you can still do it with a handshake with the > >>>> Deoptimization::deoptimize_all_marked() like solution. I can change > the > >>>> if-else statement with just the Handshake::execute() call in 8239084. > >>>> But up to you. : ) > >>>> > >>>> Thanks, > >>>> Patricio > >>>>> Thanks again, > >>>>> Richard. > >>>>> > >>>>> -----Original Message----- > >>>>> From: Patricio Chilano > >>>>> Sent: Donnerstag, 13. Februar 2020 18:47 > >>>>> To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>>>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> I'm only commenting on the handshake changes.
> >>>>> I see that operation VM_EnterInterpOnlyMode can be called inside > >>>>> operation VM_SetFramePop which also allows nested operations. > Here is a > >>>>> comment in VM_SetFramePop definition: > >>>>> > >>>>> // Nested operation must be allowed for the > VM_EnterInterpOnlyMode that is > >>>>> // called from the > JvmtiEventControllerPrivate::recompute_thread_enabled. > >>>>> > >>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then > now we > >>>>> could have a handshake inside a safepoint operation. The issue I see > >>>>> there is that at the end of the handshake the polling page of the > target > >>>>> thread could be disarmed. So if the target thread happens to be in a > >>>>> blocked state just transiently and wakes up then it will not stop for > >>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the > >>>>> polling page is armed at the beginning of disarm_safepoint(). > >>>>> > >>>>> I think one option could be to remove > >>>>> SafepointMechanism::disarm_if_needed() in > >>>>> HandshakeState::clear_handshake() and let each JavaThread disarm > itself > >>>>> for the handshake case. > >>>>> > >>>>> Alternatively I think you could do something similar to what we do in > >>>>> Deoptimization::deoptimize_all_marked(): > >>>>> > >>>>> EnterInterpOnlyModeClosure hs; > >>>>> if (SafepointSynchronize::is_at_safepoint()) { > >>>>> hs.do_thread(state->get_thread()); > >>>>> } else { > >>>>> Handshake::execute(&hs, state->get_thread()); > >>>>> } > >>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the > >>>>> HandshakeClosure() constructor) > >>>>> > >>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode > is > >>>>> always called in a nested operation or just sometimes. > >>>>> > >>>>> Thanks, > >>>>> Patricio > >>>>> > >>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: > >>>>>> // Repost including hotspot runtime and gc lists.
> >>>>>> // Dean Long suggested to do so, because the enhancement > replaces a vm operation > >>>>>> // with a handshake. > >>>>>> // Original thread: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > February/030359.html > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> could I please get reviews for this small enhancement in hotspot's > jvmti implementation: > >>>>>> > >>>>>> Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > >>>>>> > >>>>>> The change avoids making all compiled methods on stack > not_entrant when switching a java thread to > >>>>>> interpreter only execution for jvmti purposes. It is sufficient to > deoptimize the compiled frames on stack. > >>>>>> > >>>>>> Additionally a handshake is used instead of a vm operation to walk > the stack and do the deoptimizations. > >>>>>> > >>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug > and release builds on all platforms. > >>>>>> > >>>>>> Thanks, Richard. > >>>>>> > >>>>>> See also my question if anyone knows a reason for making the > compiled methods not_entrant: > >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > January/030339.html > >>>> From richard.reingruber at sap.com Tue May 12 09:42:39 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 12 May 2020 09:42:39 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> Message-ID: Thanks Martin. Cheers, Richard. -----Original Message----- From: Doerr, Martin Sent: Dienstag, 12. 
Mai 2020 10:43 To: Reingruber, Richard ; David Holmes ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Hi Richard, I had already reviewed webrev.1. Looks good to me. Thanks for contributing it. Best regards, Martin > -----Original Message----- > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > Sent: Montag, 4. Mai 2020 12:33 > To: David Holmes ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > Subject: RE: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > // Trimmed the list of recipients. If the list gets too long then the message > needs to be approved > // by a moderator. > > Hi David, > > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: > > > Hi David, > > > > > >> Not a review but some general commentary ... > > > > > > That's welcome. > > > Having had to take an even closer look now I have a review comment too :) > > > src/hotspot/share/prims/jvmtiThreadState.cpp > > > void JvmtiThreadState::invalidate_cur_stack_depth() { > > ! assert(SafepointSynchronize::is_at_safepoint() || > > ! (Thread::current()->is_VM_thread() && > > get_thread()->is_vmthread_processing_handshake()) || > > (JavaThread *)Thread::current() == get_thread(), > > "must be current thread or at safepoint"); > > You're looking at an outdated webrev, I'm afraid. 
> > This would be the post with the current webrev.1 > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > April/031245.html > > Thanks, Richard. > > -----Original Message----- > From: David Holmes > Sent: Montag, 4. Mai 2020 08:51 > To: Reingruber, Richard ; Yasumasa Suenaga > ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > Hi Richard, > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: > > Hi David, > > > >> Not a review but some general commentary ... > > > > That's welcome. > > Having had to take an even closer look now I have a review comment too :) > > src/hotspot/share/prims/jvmtiThreadState.cpp > > void JvmtiThreadState::invalidate_cur_stack_depth() { > ! assert(SafepointSynchronize::is_at_safepoint() || > ! (Thread::current()->is_VM_thread() && > get_thread()->is_vmthread_processing_handshake()) || > (JavaThread *)Thread::current() == get_thread(), > "must be current thread or at safepoint"); > > The message needs updating to include handshakes. > > More below ... > > >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: > >>> Hi Yasumasa, Patricio, > >>> > >>>>>> I will send review request to replace VM_SetFramePop to > handshake in early next week in JDK-8242427. > >>>>>> Does it help you? I think it gives you to remove workaround. > >>>>> > >>>>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. So you would have to > >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt > to these changes. 
> >>> > >>>> Thanks for your information. > >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >>>> I will modify and will test it after yours. > >>> > >>> Thanks :) > >>> > >>>>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>>>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>>>> to me, how this has to be handled. > >>> > >>>> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >>> > >>> Yes. To me it is unclear what synchronization is necessary, if it is called > during a handshake. And > >>> also I'm unsure if a thread should do safepoint checks while executing a > handshake. > > > >> I'm growing increasingly concerned that use of direct handshakes to > >> replace VM operations needs a much greater examination for correctness > >> than might initially be thought. I see a number of issues: > > > > I agree. I'll address your concerns in the context of this review thread for > JDK-8238585 below. > > > > In addition I would suggest to take the general part of the discussion to a > dedicated thread or to > > the review thread for JDK-8242427. I would like to keep this thread closer > to its subject. > > I will focus on the issues in the context of this particular change > then, though the issues themselves are applicable to all handshake > situations (and more so with direct handshakes). This is mostly just > discussion. > > >> First, the VMThread executes (most) VM operations with a clean stack in > >> a clean state, so it has lots of room to work. 
If we now execute the > >> same logic in a JavaThread then we risk hitting stackoverflows if > >> nothing else. But we are also now executing code in a JavaThread and so > >> we have to be sure that code is not going to act differently (in a bad > >> way) if executed by a JavaThread rather than the VMThread. For > example, > >> may it be possible that if executing in the VMThread we defer some > >> activity that might require execution of Java code, or else hand it off > >> to one of the service threads? If we execute that code directly in the > >> current JavaThread instead we may not be in a valid state (e.g. consider > >> re-entrancy to various subsystems that is not allowed). > > > > It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is > doing. I already added a > > paragraph to the JBS-Item [1] explaining why the direct handshake is > sufficient from a > > synchronization point of view. > > Just to be clear, your proposed change is not using a direct handshake. > > > Furthermore the stack is walked and the return pc of compiled frames is > replaced with the address of > > the deopt handler. > > > > I can't see why this cannot be done with a direct handshake. Something > very similar is already done > > in JavaThread::deoptimize_marked_methods() which is executed as part > of an ordinary handshake. > > Note that existing non-direct handshakes may also have issues that have > not been fully investigated. > > > The demand on stack-space should be very modest. I would not expect a > higher risk for stackoverflow. > > For the target thread if you use more stack than would be used stopping > at a safepoint then you are at risk. For the thread initiating the > direct handshake if you use more stack than would be used enqueuing a VM > operation, then you are at risk. As we have not quantified these > numbers, nor have any easy way to establish the stack use of the actual > code to be executed, we're really just hoping for the best.
This is a > general problem with handshakes that needs to be investigated more > deeply. As a simple, general, example just imagine if the code involves > logging that might utilise an on-stack buffer. > > >> Second, we have this question mark over what happens if the operation > >> hits further safepoint or handshake polls/checks? Are there constraints > >> on what is allowed here? How can we recognise this problem may exist > and > >> so deal with it? > > > > The thread in EnterInterpOnlyModeClosure::do_thread() can't become > safepoint/handshake safe. I > > tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a > NoSafepointVerifier. > > That's good to hear but such tests are not exhaustive, they will detect > if you do reach a safepoint/handshake but they can't prove that you > cannot reach one. What you have done is necessary but may not be > sufficient. Plus you didn't actually add the NSV to the code - is there > a reason we can't actually keep it in do_thread? (I'm not sure if the > NSV also acts as a NoHandshakeVerifier?) > > >> Third, while we are generally considering what appear to be > >> single-thread operations, which should be amenable to a direct > >> handshake, we also have to be careful that some of the code involved > >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may > >> not need to take a lock where a direct handshake might! > > > > See again my arguments in the JBS item [1]. > > Yes I see the reasoning and that is good. My point is a general one as > it may not be obvious when such assumptions exist in the current code. > > Thanks, > David > > > Thanks, > > Richard. > > > > [1] https://bugs.openjdk.java.net/browse/JDK-8238585 > > > > -----Original Message----- > > From: David Holmes > > Sent: Montag, 27. 
April 2020 07:16 > > To: Reingruber, Richard ; Yasumasa Suenaga > ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > > Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > > > > Hi all, > > > > Not a review but some general commentary ... > > > > On 25/04/2020 2:08 am, Reingruber, Richard wrote: > >> Hi Yasumasa, Patricio, > >> > >>>>> I will send review request to replace VM_SetFramePop to handshake > in early next week in JDK-8242427. > >>>>> Does it help you? I think it gives you to remove workaround. > >>>> > >>>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. So you would have to > >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt > to these changes. > >> > >>> Thanks for your information. > >>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >>> I will modify and will test it after yours. > >> > >> Thanks :) > >> > >>>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>>> to me, how this has to be handled. 
> >> > >>> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >> > >> Yes. To me it is unclear what synchronization is necessary, if it is called > during a handshake. And > >> also I'm unsure if a thread should do safepoint checks while executing a > handshake. > > > > I'm growing increasingly concerned that use of direct handshakes to > > replace VM operations needs a much greater examination for correctness > > than might initially be thought. I see a number of issues: > > > > First, the VMThread executes (most) VM operations with a clean stack in > > a clean state, so it has lots of room to work. If we now execute the > > same logic in a JavaThread then we risk hitting stackoverflows if > > nothing else. But we are also now executing code in a JavaThread and so > > we have to be sure that code is not going to act differently (in a bad > > way) if executed by a JavaThread rather than the VMThread. For example, > > may it be possible that if executing in the VMThread we defer some > > activity that might require execution of Java code, or else hand it off > > to one of the service threads? If we execute that code directly in the > > current JavaThread instead we may not be in a valid state (e.g. consider > > re-entrancy to various subsystems that is not allowed). > > > > Second, we have this question mark over what happens if the operation > > hits further safepoint or handshake polls/checks? Are there constraints > > on what is allowed here? How can we recognise this problem may exist and > > so deal with it? > > > > Third, while we are generally considering what appear to be > > single-thread operations, which should be amenable to a direct > > handshake, we also have to be careful that some of the code involved > > doesn't already expect/assume we are at a safepoint - e.g. a VM op may > > not need to take a lock where a direct handshake might! 
> > > > Cheers, > > David > > ----- > > > >> @Patricio, coming back to my question [1]: > >> > >> In the example you gave in your answer [2]: the java thread would > execute a vm operation during a > >> direct handshake operation, while the VMThread is actually in the middle > of a VM_HandshakeAllThreads > >> operation, waiting to handshake the same handshakee: why can't the > VMThread just proceed? The > >> handshakee would be safepoint safe, wouldn't it? > >> > >> Thanks, Richard. > >> > >> [1] https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14301677 > >> > >> [2] https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14301763 > >> > >> -----Original Message----- > >> From: Yasumasa Suenaga > >> Sent: Freitag, 24. April 2020 17:23 > >> To: Reingruber, Richard ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >> > >> Hi Richard, > >> > >> On 2020/04/24 23:44, Reingruber, Richard wrote: > >>> Hi Yasumasa, > >>> > >>>> I will send review request to replace VM_SetFramePop to handshake > in early next week in JDK-8242427. > >>>> Does it help you? I think it gives you to remove workaround. > >>> > >>> I think it would not help that much. Note that when replacing > VM_SetFramePop with a direct handshake > >>> you could not just execute VM_EnterInterpOnlyMode as a nested vm > operation [1]. 
So you would have to > >>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to > these changes. > >> > >> Thanks for your information. > >> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and > vmTestbase/nsk/jvmti/NotifyFramePop. > >> I will modify and will test it after yours. > >> > >> > >>> Also my first impression was that it won't be that easy from a > synchronization point of view to > >>> replace VM_SetFramePop with a direct handshake. E.g. > VM_SetFramePop::doit() indirectly calls > >>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, > JvmtiFramePop fpop) where > >>> JvmtiThreadState_lock is acquired with safepoint check, if not at > safepoint. It's not directly clear > >>> to me, how this has to be handled. > >> > >> I think JvmtiEventController::set_frame_pop() should hold > JvmtiThreadState_lock because it affects other JVMTI operation especially > FramePop event. > >> > >> > >> Thanks, > >> > >> Yasumasa > >> > >> > >>> So it appears to me that it would be easier to push JDK-8242427 after this > (JDK-8238585). > >>> > >>>> (The patch is available, but I want to see the result of PIT in this > weekend whether JDK-8242425 works fine.) > >>> > >>> Would be interesting to see how you handled the issues above :) > >>> > >>> Thanks, Richard. > >>> > >>> [1] See question in comment > https://bugs.openjdk.java.net/browse/JDK- > 8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.syst > em.issuetabpanels:comment-tabpanel#comment-14302030 > >>> > >>> -----Original Message----- > >>> From: Yasumasa Suenaga > >>> Sent: Freitag, 24. 
April 2020 13:34 > >>> To: Reingruber, Richard ; Patricio Chilano > ; serguei.spitsyn at oracle.com; Vladimir > Ivanov ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>> > >>> Hi Richard, > >>> > >>> I will send review request to replace VM_SetFramePop to handshake in > early next week in JDK-8242427. > >>> Does it help you? I think it gives you to remove workaround. > >>> > >>> (The patch is available, but I want to see the result of PIT in this weekend > whether JDK-8242425 works fine.) > >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>> On 2020/04/24 17:18, Reingruber, Richard wrote: > >>>> Hi Patricio, Vladimir, and Serguei, > >>>> > >>>> now that direct handshakes are available, I've updated the patch to > make use of them. > >>>> > >>>> In addition I have done some clean-up changes I missed in the first > webrev. > >>>> > >>>> Finally I have implemented the workaround suggested by Patricio to > avoid nesting the handshake > >>>> into the vm operation VM_SetFramePop [1] > >>>> > >>>> Kindly review again: > >>>> > >>>> Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > >>>> Webrev(delta): > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > >>>> > >>>> I updated the JBS item explaining why the vm operation > VM_EnterInterpOnlyMode can be replaced with a > >>>> direct handshake: > >>>> > >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 > >>>> > >>>> Testing: > >>>> > >>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release > builds on all platforms. > >>>> > >>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 > >>>> > >>>> Thanks, > >>>> Richard. 
> >>>> > >>>> [1] An assertion in Handshake::execute_direct() fails, if called by > VMThread, because it is not a JavaThread. > >>>> > >>>> -----Original Message----- > >>>> From: hotspot-dev On > Behalf Of Reingruber, Richard > >>>> Sent: Freitag, 14. Februar 2020 19:47 > >>>> To: Patricio Chilano ; > serviceability-dev at openjdk.java.net; hotspot-compiler- > dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > >>>> Subject: RE: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>> > >>>> Hi Patricio, > >>>> > >>>> > > I'm really glad you noticed the problematic nesting. This seems > to be a general issue: currently a > >>>> > > handshake cannot be nested in a vm operation. Maybe it should > be asserted in the > >>>> > > Handshake::execute() methods that they are not called by the > vm thread evaluating a vm operation? > >>>> > > > >>>> > > > Alternatively I think you could do something similar to what > we do in > >>>> > > > Deoptimization::deoptimize_all_marked(): > >>>> > > > > >>>> > > > EnterInterpOnlyModeClosure hs; > >>>> > > > if (SafepointSynchronize::is_at_safepoint()) { > >>>> > > > hs.do_thread(state->get_thread()); > >>>> > > > } else { > >>>> > > > Handshake::execute(&hs, state->get_thread()); > >>>> > > > } > >>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to > the > >>>> > > > HandshakeClosure() constructor) > >>>> > > > >>>> > > Maybe this could be used also in the Handshake::execute() > methods as general solution? > >>>> > Right, we could also do that.
Not clearing the polling page in > >>>> > HandshakeState::clear_handshake() should be enough to fix this > issue and > >>>> > execute a handshake inside a safepoint, but adding that "if" > statement > >>>> > in Handshake::execute() sounds good to avoid all the extra code > that we > >>>> > go through when executing a handshake. I filed 8239084 to make > that change. > >>>> > >>>> Thanks for taking care of this and creating the RFE. > >>>> > >>>> > > >>>> > > > I don't know JVMTI code so I'm not sure if > VM_EnterInterpOnlyMode is > >>>> > > > always called in a nested operation or just sometimes. > >>>> > > > >>>> > > At least one execution path without vm operation exists: > >>>> > > > >>>> > > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) > : void > >>>> > > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState > *) : jlong > >>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void > >>>> > > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 > matches) > >>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, > bool) : void > >>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : > jvmtiError > >>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : > jvmtiError > >>>> > > > >>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't > my main intent to replace it with a > >>>> > > handshake, but to avoid making the compiled methods on stack > not_entrant.... unless I'm further > >>>> > > encouraged to do it with a handshake :) > >>>> > Ah! I think you can still do it with a handshake with the > >>>> > Deoptimization::deoptimize_all_marked() like solution. I can > replace the > >>>> > if-else statement with just the Handshake::execute() call in > 8239084. > >>>> > But up to you. : ) > >>>> > >>>> Well, I think that's enough encouragement :) > >>>> I'll wait for 8239084 and then try again.
> >>>> (no urgency and all) > >>>> > >>>> Thanks, > >>>> Richard. > >>>> > >>>> -----Original Message----- > >>>> From: Patricio Chilano > >>>> Sent: Freitag, 14. Februar 2020 15:54 > >>>> To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>> > >>>> Hi Richard, > >>>> > >>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: > >>>>> Hi Patricio, > >>>>> > >>>>> thanks for having a look. > >>>>> > >>>>> > I'm only commenting on the handshake changes. > >>>>> > I see that operation VM_EnterInterpOnlyMode can be called > inside > >>>>> > operation VM_SetFramePop which also allows nested > operations. Here is a > >>>>> > comment in VM_SetFramePop definition: > >>>>> > > >>>>> > // Nested operation must be allowed for the > VM_EnterInterpOnlyMode that is > >>>>> > // called from the > JvmtiEventControllerPrivate::recompute_thread_enabled. > >>>>> > > >>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, > then now we > >>>>> > could have a handshake inside a safepoint operation. The issue I > see > >>>>> > there is that at the end of the handshake the polling page of the > target > >>>>> > thread could be disarmed. So if the target thread happens to be > in a > >>>>> > blocked state just transiently and wakes up then it will not stop > for > >>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the > >>>>> > polling page is armed at the beginning of disarm_safepoint(). > >>>>> > >>>>> I'm really glad you noticed the problematic nesting. This seems to be a > general issue: currently a > >>>>> handshake cannot be nested in a vm operation.
Maybe it should be > asserted in the > >>>>> Handshake::execute() methods that they are not called by the vm > thread evaluating a vm operation? > >>>>> > >>>>> > Alternatively I think you could do something similar to what we > do in > >>>>> > Deoptimization::deoptimize_all_marked(): > >>>>> > > >>>>> > EnterInterpOnlyModeClosure hs; > >>>>> > if (SafepointSynchronize::is_at_safepoint()) { > >>>>> > hs.do_thread(state->get_thread()); > >>>>> > } else { > >>>>> > Handshake::execute(&hs, state->get_thread()); > >>>>> > } > >>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the > >>>>> > HandshakeClosure() constructor) > >>>>> > >>>>> Maybe this could be used also in the Handshake::execute() methods > as general solution? > >>>> Right, we could also do that. Not clearing the polling page in > >>>> HandshakeState::clear_handshake() should be enough to fix this issue > and > >>>> execute a handshake inside a safepoint, but adding that "if" statement > >>>> in Handshake::execute() sounds good to avoid all the extra code that we > >>>> go through when executing a handshake. I filed 8239084 to make that > change. > >>>> > >>>>> > I don't know JVMTI code so I'm not sure if > VM_EnterInterpOnlyMode is > >>>>> > always called in a nested operation or just sometimes.
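
The if/else dispatch quoted above can be modelled outside HotSpot roughly as follows. This is only a minimal sketch: `ToyThread`, `ToyClosure`, `ToyVM` and its `handshake_execute()` are hypothetical stand-ins invented for illustration, not the real `JavaThread`, `HandshakeClosure` or `Handshake::execute()`. It shows only the control flow: when the caller is already at a safepoint, the closure runs directly on the target instead of going through a handshake.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the target JavaThread.
struct ToyThread {
  bool interp_only = false;
};

// Hypothetical stand-in for EnterInterpOnlyModeClosure.
struct ToyClosure {
  void do_thread(ToyThread* target) { target->interp_only = true; }
};

// Hypothetical stand-in for the VM-side state and the handshake machinery.
struct ToyVM {
  bool at_safepoint = false;     // models SafepointSynchronize::is_at_safepoint()
  std::vector<std::string> log;  // records which path was taken

  // Models Handshake::execute(&hs, thread): the closure runs via a handshake.
  void handshake_execute(ToyClosure* cl, ToyThread* target) {
    log.push_back("handshake");
    cl->do_thread(target);
  }

  // The dispatch from the mail: run the closure directly when already at a
  // safepoint, otherwise go through a handshake.
  void enter_interp_only_mode(ToyThread* target) {
    ToyClosure hs;
    if (at_safepoint) {
      log.push_back("direct");
      hs.do_thread(target);
    } else {
      handshake_execute(&hs, target);
    }
  }
};
```

In this toy model both paths leave the target in the same state; only the route differs, which is the point of the suggestion: no handshake is started while a safepoint operation is in progress.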
> >>>>> > >>>>> At least one execution path without vm operation exists: > >>>>> > >>>>> > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) > : void > >>>>> > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState > *) : jlong > >>>>> JvmtiEventControllerPrivate::recompute_enabled() : void > >>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, > bool) : void (2 matches) > >>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : > void > >>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError > >>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : > jvmtiError > >>>>> > >>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my > main intent to replace it with a > >>>>> handshake, but to avoid making the compiled methods on stack > not_entrant.... unless I'm further > >>>>> encouraged to do it with a handshake :) > >>>> Ah! I think you can still do it with a handshake with the > >>>> Deoptimization::deoptimize_all_marked() like solution. I can replace > the > >>>> if-else statement with just the Handshake::execute() call in 8239084. > >>>> But up to you. : ) > >>>> > >>>> Thanks, > >>>> Patricio > >>>>> Thanks again, > >>>>> Richard. > >>>>> > >>>>> -----Original Message----- > >>>>> From: Patricio Chilano > >>>>> Sent: Donnerstag, 13. Februar 2020 18:47 > >>>>> To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot- > gc-dev at openjdk.java.net > >>>>> Subject: Re: RFR(S) 8238585: Use handshake for > JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make > compiled methods on stack not_entrant > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> I'm only commenting on the handshake changes.
> >>>>> I see that operation VM_EnterInterpOnlyMode can be called inside > >>>>> operation VM_SetFramePop which also allows nested operations. > Here is a > >>>>> comment in VM_SetFramePop definition: > >>>>> > >>>>> // Nested operation must be allowed for the > VM_EnterInterpOnlyMode that is > >>>>> // called from the > JvmtiEventControllerPrivate::recompute_thread_enabled. > >>>>> > >>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then > now we > >>>>> could have a handshake inside a safepoint operation. The issue I see > >>>>> there is that at the end of the handshake the polling page of the > target > >>>>> thread could be disarmed. So if the target thread happens to be in a > >>>>> blocked state just transiently and wakes up then it will not stop for > >>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the > >>>>> polling page is armed at the beginning of disarm_safepoint(). > >>>>> > >>>>> I think one option could be to remove > >>>>> SafepointMechanism::disarm_if_needed() in > >>>>> HandshakeState::clear_handshake() and let each JavaThread disarm > itself > >>>>> for the handshake case. > >>>>> > >>>>> Alternatively I think you could do something similar to what we do in > >>>>> Deoptimization::deoptimize_all_marked(): > >>>>> > >>>>> EnterInterpOnlyModeClosure hs; > >>>>> if (SafepointSynchronize::is_at_safepoint()) { > >>>>> hs.do_thread(state->get_thread()); > >>>>> } else { > >>>>> Handshake::execute(&hs, state->get_thread()); > >>>>> } > >>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the > >>>>> HandshakeClosure() constructor) > >>>>> > >>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode > is > >>>>> always called in a nested operation or just sometimes. > >>>>> > >>>>> Thanks, > >>>>> Patricio > >>>>> > >>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: > >>>>>> // Repost including hotspot runtime and gc lists.
> >>>>>> // Dean Long suggested to do so, because the enhancement > replaces a vm operation > >>>>>> // with a handshake. > >>>>>> // Original thread: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > February/030359.html > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> could I please get reviews for this small enhancement in hotspot's > jvmti implementation: > >>>>>> > >>>>>> Webrev: > http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > >>>>>> > >>>>>> The change avoids making all compiled methods on stack > not_entrant when switching a java thread to > >>>>>> interpreter only execution for jvmti purposes. It is sufficient to > deoptimize the compiled frames on stack. > >>>>>> > >>>>>> Additionally a handshake is used instead of a vm operation to walk > the stack and do the deoptimizations. > >>>>>> > >>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug > and release builds on all platforms. > >>>>>> > >>>>>> Thanks, Richard. > >>>>>> > >>>>>> See also my question if anyone knows a reason for making the > compiled methods not_entrant: > >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020- > January/030339.html > >>>> From suenaga at oss.nttdata.com Tue May 12 13:12:10 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 12 May 2020 22:12:10 +0900 Subject: RFR: 8244819: hsdis does not compile with binutils 2.34+ Message-ID: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com> Hi all, Please review this change: JBS: https://bugs.openjdk.java.net/browse/JDK-8244819 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8244819/webrev.00/ binutils 2.34 introduces new section flag: SEC_ELF_OCTETS, and it affects arguments of bfd_octets_per_byte() [1]. 
So we see a new compiler error, as below:

```
hsdis.c:571:28: error: too few arguments to function 'bfd_octets_per_byte'
  571 |   dinfo->octets_per_byte = bfd_octets_per_byte (abfd);
      |                            ^~~~~~~~~~~~~~~~~~~
In file included from hsdis.c:58:
build/linux-amd64/bfd/bfd.h:1999:14: note: declared here
 1999 | unsigned int bfd_octets_per_byte (const bfd *abfd,
      |              ^~~~~~~~~~~~~~~~~~~
```

Thanks,

Yasumasa

[1] https://sourceware.org/git/?p=binutils-gdb.git;h=618265039f697eab9e72bb58b95fc2d32925df58

From vladimir.kozlov at oracle.com Tue May 12 16:23:59 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 May 2020 09:23:59 -0700
Subject: RFR(S): 8244407: JVM crashes after transformation in C2 IdealLoopTree::split_fall_in
In-Reply-To: <05661edd-8cf6-dd1e-0b1a-e2399cbfd767@oracle.com>
References: <05661edd-8cf6-dd1e-0b1a-e2399cbfd767@oracle.com>
Message-ID:

Hi, Felix

Yes, the fix looks good.
Please, use {} for if() body.

Thanks,
Vladimir

On 5/11/20 1:58 AM, Tobias Hartmann wrote:
> Hi Felix,
>
> your fix looks reasonable to me but Vladimir K. (who reviewed your original fix), should also have a
> look.
>
> Best regards,
> Tobias
>
> On 06.05.20 03:32, Yangfei (Felix) wrote:
>> Hi,
>>
>> Please help review this patch fixing a C2 crash issue.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8244407
>> Webrev: http://cr.openjdk.java.net/~fyang/8244407/webrev.00
>>
>> After the fix for JDK-8240576, irreducible loop tree might be structurally changed by split_fall_in() in IdealLoopTree::beautify_loops.
>> But loop tree is not rebuilt after that. Take the reported test case for example, irreducible loop tree looks like:
>>
>> 1: Loop: N649/N644
>>
>> Loop: N649/N644 IRREDUCIBLE <-- this
>>   Loop: N649/N797 sfpts={ 683 }
>>
>> With the fix for JDK-8240576, we won't do merge_many_backedges in IdealLoopTree::beautify_loops for this irreducible loop tree.
>>
>>   if( _head->req() > 3 && !_irreducible) {
>>     // Merge the many backedges into a single backedge but leave
>>     // the hottest backedge as separate edge for the following peel.
>>     merge_many_backedges( phase );
>>     result = true;
>>   }
>>
>>   N660       N644        N797
>>     |          |           |
>>     |          |           |
>>     |          v           |
>>     |      +---+---+       |
>>     +-----> + N649 + <-----+
>>            +--------+
>>
>> 649 Region === 649 660 797 644 [[ .... ]] !jvms: Test::testMethod @ bci:543
>>
>> Then we come to the children:
>>
>>   // Now recursively beautify nested loops
>>   if( _child ) result |= _child->beautify_loops( phase );
>>
>> 2: Loop: N649/N797
>>
>> Loop: N649/N644 IRREDUCIBLE
>>   Loop: N649/N797 sfpts={ 683 } <-- this
>>
>> After split_fall_in(), N660 and N644 are merged.
>>
>>   if( fall_in_cnt > 1 ) // Need a loop landing pad to merge fall-ins
>>     split_fall_in( phase, fall_in_cnt );
>>
>>   N660                N644
>>     |                   +
>>     |                   |
>>     |                   |
>>     |    +---------+    |
>>     +---->+ N946 +<-----+
>>          +----+---+
>>               |                   N797
>>               |                     |
>>               |                     |
>>               |                     |
>>               |     +--------+      |
>>               +----> + N649 + <-----+
>>                     +--------+
>>
>> Loop tree is now structurally changed into:
>>
>> Loop: N946/N644 IRREDUCIBLE
>>   Loop: N649/N797 sfpts={ 683 }
>>
>> But local variable 'result' in IdealLoopTree::beautify_loops hasn't got a chance to be set to true since _head->req() is not bigger than 3 after split_fall_in.
>> Then C2 won't rebuild loop tree after IdealLoopTree::beautify_loops, which further leads to the crash.
>> Instead of adding extra checking for loop tree structure changes, proposed fix sets 'result' to true when we meet irreducible loop with multiple backedges.
>> This should be safer and simpler (thus good for JIT compile time).
>> Tiered 1-3 tested on x86-64 and aarch64 linux platform. Comments?
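
The effect of the proposed change on the 'result' flag can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `ToyLoop`, `changed_before_fix()` and `changed_after_fix()` are hypothetical names, and the exact shape of the patched condition is assumed from the description in the mail, not copied from the webrev. Only the flag logic is modelled; no real merging or loop-tree rebuilding takes place.

```cpp
#include <cassert>

// Hypothetical stand-in for an IdealLoopTree node.
struct ToyLoop {
  int  req;          // models _head->req(): self edge plus inputs of the head Region
  bool irreducible;  // models the _irreducible flag
};

// Flag logic as quoted in the mail: 'result' only becomes true when the
// backedges are actually merged, which is skipped for irreducible loops.
bool changed_before_fix(const ToyLoop& loop) {
  bool result = false;
  if (loop.req > 3 && !loop.irreducible) {
    // merge_many_backedges(phase) would run here
    result = true;
  }
  return result;
}

// Assumed shape of the proposed fix: an irreducible loop with multiple
// backedges still skips the merge, but reports a change so that the caller
// rebuilds the loop tree.
bool changed_after_fix(const ToyLoop& loop) {
  bool result = false;
  if (loop.req > 3) {
    if (!loop.irreducible) {
      // merge_many_backedges(phase) would run here
    }
    result = true;
  }
  return result;
}
```

For the head Region in the report (`649 Region === 649 660 797 644`, so `req == 4`) with the irreducible flag set, the first function reports no change while the second one does, which is the difference the mail relies on to force the rebuild.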
>>
>> Thanks,
>> Felix
>>

From felix.yang at huawei.com Wed May 13 01:11:04 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Wed, 13 May 2020 01:11:04 +0000
Subject: RFR(S): 8244407: JVM crashes after transformation in C2 IdealLoopTree::split_fall_in
In-Reply-To: <05661edd-8cf6-dd1e-0b1a-e2399cbfd767@oracle.com>
References: <05661edd-8cf6-dd1e-0b1a-e2399cbfd767@oracle.com>
Message-ID:

Thank you Vladimir and Tobias for reviewing this.

I have modified accordingly using {} for the if() body.
Also removed a pair of redundant curly braces in the newly added test case.
Pushed to the submit repo, and the test results received look good:

Job: mach5-one-fyang-JDK-8244407-20200511-2157-10959986
BuildId: 2020-05-11-2156071.felix.yang.source

No failed tests

Tasks Summary

* EXECUTED_WITH_FAILURE: 0
* NOTHING_TO_RUN: 0
* KILLED: 0
* HARNESS_ERROR: 0
* FAILED: 0
* PASSED: 101
* UNABLE_TO_RUN: 0
* NA: 0

Pushed as: http://hg.openjdk.java.net/jdk/jdk/rev/c9f5a16d6980

Best regards,
Felix

> Hi, Felix
> Yes, the fix looks good.
> Please, use {} for if() body.
> Thanks,
> Vladimir
> On 5/11/20 1:58 AM, Tobias Hartmann wrote:
>> Hi Felix,
>>
>> your fix looks reasonable to me but Vladimir K. (who reviewed your original fix), should also have a
>> look.
>>
>> Best regards,
>> Tobias

From david.holmes at oracle.com Wed May 13 05:42:50 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 13 May 2020 15:42:50 +1000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References: <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com>
 <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com>
 <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com>
 <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com>
Message-ID: <36d5e2c0-c724-7ff7-d37e-decb5cc0005b@oracle.com>

On 4/05/2020 8:33 pm, Reingruber, Richard wrote:
> // Trimmed the list of recipients.
If the list gets too long then the message needs to be approved
> // by a moderator.

Yes I noticed that too :) In general if you send to hotspot-dev you
shouldn't need to also send to hotspot-X-dev.

> Hi David,

Hi Richard,

>> On 28/04/2020 12:09 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>> Not a review but some general commentary ...
>>>
>>> That's welcome.
>
>> Having had to take an even closer look now I have a review comment too :)
>
>> src/hotspot/share/prims/jvmtiThreadState.cpp
>
>> void JvmtiThreadState::invalidate_cur_stack_depth() {
>> ! assert(SafepointSynchronize::is_at_safepoint() ||
>> ! (Thread::current()->is_VM_thread() &&
>> get_thread()->is_vmthread_processing_handshake()) ||
>> (JavaThread *)Thread::current() == get_thread(),
>> "must be current thread or at safepoint");
>
> You're looking at an outdated webrev, I'm afraid.
>
> This would be the post with the current webrev.1
>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html

Sorry I missed that update. Okay so this is working with direct
handshakes now.

One style nit in jvmtiThreadState.cpp:

    assert(SafepointSynchronize::is_at_safepoint() ||
!     (JavaThread *)Thread::current() == get_thread() ||
!     Thread::current() == get_thread()->active_handshaker(),
!     "bad synchronization with owner thread");

the ! lines should indent as follows

    assert(SafepointSynchronize::is_at_safepoint() ||
           (JavaThread *)Thread::current() == get_thread() ||
           Thread::current() == get_thread()->active_handshaker(),
!          "bad synchronization with owner thread");

Let's see how this plays out.

Cheers,
David

> Thanks, Richard.
>
> -----Original Message-----
> From: David Holmes
> Sent: Montag, 4.
Mai 2020 08:51 > To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: >> Hi David, >> >>> Not a review but some general commentary ... >> >> That's welcome. > > Having had to take an even closer look now I have a review comment too :) > > src/hotspot/share/prims/jvmtiThreadState.cpp > > void JvmtiThreadState::invalidate_cur_stack_depth() { > ! assert(SafepointSynchronize::is_at_safepoint() || > ! (Thread::current()->is_VM_thread() && > get_thread()->is_vmthread_processing_handshake()) || > (JavaThread *)Thread::current() == get_thread(), > "must be current thread or at safepoint"); > > The message needs updating to include handshakes. > > More below ... > >>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>> Hi Yasumasa, Patricio, >>>> >>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>> >>>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>> >>>>> Thanks for your information. >>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>> I will modify and will test it after yours. 
>>>> >>>> Thanks :) >>>> >>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>> to me, how this has to be handled. >>>> >>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >> >>> I'm growing increasingly concerned that use of direct handshakes to >>> replace VM operations needs a much greater examination for correctness >>> than might initially be thought. I see a number of issues: >> >> I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. >> >> In addition I would suggest to take the general part of the discussion to a dedicated thread or to >> the review thread for JDK-8242427. I would like to keep this thread closer to its subject. > > I will focus on the issues in the context of this particular change > then, though the issues themselves are applicable to all handshake > situations (and more so with direct handshakes). This is mostly just > discussion. > >>> First, the VMThread executes (most) VM operations with a clean stack in >>> a clean state, so it has lots of room to work. If we now execute the >>> same logic in a JavaThread then we risk hitting stackoverflows if >>> nothing else. 
But we are also now executing code in a JavaThread and so >>> we have to be sure that code is not going to act differently (in a bad >>> way) if executed by a JavaThread rather than the VMThread. For example, >>> may it be possible that if executing in the VMThread we defer some >>> activity that might require execution of Java code, or else hand it off >>> to one of the service threads? If we execute that code directly in the >>> current JavaThread instead we may not be in a valid state (e.g. consider >>> re-entrancy to various subsystems that is not allowed). >> >> It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a >> paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a >> synchronization point of view. > > Just to be clear, your proposed change is not using a direct handshake. > >> Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of >> the deopt handler. >> >> I can't see why this cannot be done with a direct handshake. Something very similar is already done >> in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. > > Note that existing non-direct handshakes may also have issues that not > have been fully investigated. > >> The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. > > For the target thread if you use more stack than would be used stopping > at a safepoint then you are at risk. For the thread initiating the > direct handshake if you use more stack than would be used enqueuing a VM > operation, then you are at risk. As we have not quantified these > numbers, nor have any easy way to establish the stack use of the actual > code to be executed, we're really just hoping for the best. This is a > general problem with handshakes that needs to be investigated more > deeply. 
As a simple, general, example just imagine if the code involves > logging that might utilise an on-stack buffer. > >>> Second, we have this question mark over what happens if the operation >>> hits further safepoint or handshake polls/checks? Are there constraints >>> on what is allowed here? How can we recognise this problem may exist and >>> so deal with it? >> >> The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I >> tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. > > That's good to hear but such tests are not exhaustive, they will detect > if you do reach a safepoint/handshake but they can't prove that you > cannot reach one. What you have done is necessary but may not be > sufficient. Plus you didn't actually add the NSV to the code - is there > a reason we can't actually keep it in do_thread? (I'm not sure if the > NSV also acts as a NoHandshakeVerifier?) > >>> Third, while we are generally considering what appear to be >>> single-thread operations, which should be amenable to a direct >>> handshake, we also have to be careful that some of the code involved >>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>> not need to take a lock where a direct handshake might! >> >> See again my arguments in the JBS item [1]. > > Yes I see the reasoning and that is good. My point is a general one as > it may not be obvious when such assumptions exist in the current code. > > Thanks, > David > >> Thanks, >> Richard. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> -----Original Message----- >> From: David Holmes >> Sent: Montag, 27. 
April 2020 07:16 >> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi all, >> >> Not a review but some general commentary ... >> >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>> Hi Yasumasa, Patricio, >>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. >>> >>> Thanks :) >>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> Yes. 
To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >> >> I'm growing increasingly concerned that use of direct handshakes to >> replace VM operations needs a much greater examination for correctness >> than might initially be thought. I see a number of issues: >> >> First, the VMThread executes (most) VM operations with a clean stack in >> a clean state, so it has lots of room to work. If we now execute the >> same logic in a JavaThread then we risk hitting stackoverflows if >> nothing else. But we are also now executing code in a JavaThread and so >> we have to be sure that code is not going to act differently (in a bad >> way) if executed by a JavaThread rather than the VMThread. For example, >> may it be possible that if executing in the VMThread we defer some >> activity that might require execution of Java code, or else hand it off >> to one of the service threads? If we execute that code directly in the >> current JavaThread instead we may not be in a valid state (e.g. consider >> re-entrancy to various subsystems that is not allowed). >> >> Second, we have this question mark over what happens if the operation >> hits further safepoint or handshake polls/checks? Are there constraints >> on what is allowed here? How can we recognise this problem may exist and >> so deal with it? >> >> Third, while we are generally considering what appear to be >> single-thread operations, which should be amenable to a direct >> handshake, we also have to be careful that some of the code involved >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >> not need to take a lock where a direct handshake might! 
>> >> Cheers, >> David >> ----- >> >>> @Patricio, coming back to my question [1]: >>> >>> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >>> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >>> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >>> handshakee would be safepoint safe, wouldn't it? >>> >>> Thanks, Richard. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >>> >>> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >>> >>> -----Original Message----- >>> From: Yasumasa Suenaga >>> Sent: Freitag, 24. April 2020 17:23 >>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi Richard, >>> >>> On 2020/04/24 23:44, Reingruber, Richard wrote: >>>> Hi Yasumasa, >>>> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>> Thanks for your information. 
>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>> I will modify and will test it after yours. >>> >>> >>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>> to me, how this has to be handled. >>> >>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>>> >>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>> >>>> Would be interesting to see how you handled the issues above :) >>>> >>>> Thanks, Richard. >>>> >>>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Freitag, 24. 
April 2020 13:34 >>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>>> Hi Patricio, Vladimir, and Serguei, >>>>> >>>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>>> >>>>> In addition I have done some clean-up changes I missed in the first webrev. >>>>> >>>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>>> into the vm operation VM_SetFramePop [1] >>>>> >>>>> Kindly review again: >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>>> >>>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>>> direct handshake: >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>> >>>>> Testing: >>>>> >>>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>> >>>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>>> >>>>> Thanks, >>>>> Richard. 
>>>>>
>>>>> [1] An assertion in Handshake::execute_direct() fails if called by the VMThread, because it is not a JavaThread.
>>>>>
>>>>> -----Original Message-----
>>>>> From: hotspot-dev On Behalf Of Reingruber, Richard
>>>>> Sent: Friday, 14. February 2020 19:47
>>>>> To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
>>>>>
>>>>> Hi Patricio,
>>>>>
>>>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a
>>>>> > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the
>>>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?
>>>>> > >
>>>>> > > > Alternatively I think you could do something similar to what we do in
>>>>> > > > Deoptimization::deoptimize_all_marked():
>>>>> > > >
>>>>> > > >   EnterInterpOnlyModeClosure hs;
>>>>> > > >   if (SafepointSynchronize::is_at_safepoint()) {
>>>>> > > >     hs.do_thread(state->get_thread());
>>>>> > > >   } else {
>>>>> > > >     Handshake::execute(&hs, state->get_thread());
>>>>> > > >   }
>>>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the
>>>>> > > > HandshakeClosure() constructor)
>>>>> > >
>>>>> > > Maybe this could be used also in the Handshake::execute() methods as general solution?
>>>>> > Right, we could also do that. Avoiding to clear the polling page in
>>>>> > HandshakeState::clear_handshake() should be enough to fix this issue and
>>>>> > execute a handshake inside a safepoint, but adding that "if" statement
>>>>> > in Handshake::execute() sounds good to avoid all the extra code that we
>>>>> > go through when executing a handshake. I filed 8239084 to make that change.
>>>>>
>>>>> Thanks for taking care of this and creating the RFE.
>>>>>
>>>>> >
>>>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>>>>> > > > always called in a nested operation or just sometimes.
>>>>> > >
>>>>> > > At least one execution path without vm operation exists:
>>>>> > >
>>>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void
>>>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong
>>>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void
>>>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches)
>>>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void
>>>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError
>>>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError
>>>>> > >
>>>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a
>>>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further
>>>>> > > encouraged to do it with a handshake :)
>>>>> > Ah! I think you can still do it with a handshake with the
>>>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the
>>>>> > if-else statement with just the Handshake::execute() call in 8239084.
>>>>> > But up to you. : )
>>>>>
>>>>> Well, I think that's enough encouragement :)
>>>>> I'll wait for 8239084 and try then again.
>>>>> (no urgency and all)
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Patricio Chilano
>>>>> Sent: Friday, 14.
February 2020 15:54
>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> thanks for having a look.
>>>>>>
>>>>>> > I'm only commenting on the handshake changes.
>>>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside
>>>>>> > operation VM_SetFramePop which also allows nested operations. Here is a
>>>>>> > comment in VM_SetFramePop definition:
>>>>>> >
>>>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
>>>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
>>>>>> >
>>>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
>>>>>> > could have a handshake inside a safepoint operation. The issue I see
>>>>>> > there is that at the end of the handshake the polling page of the target
>>>>>> > thread could be disarmed. So if the target thread happens to be in a
>>>>>> > blocked state just transiently and wakes up then it will not stop for
>>>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the
>>>>>> > polling page is armed at the beginning of disarm_safepoint().
>>>>>>
>>>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a
>>>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the
>>>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?
>>>>>>
>>>>>> > Alternatively I think you could do something similar to what we do in
>>>>>> > Deoptimization::deoptimize_all_marked():
>>>>>> >
>>>>>> >   EnterInterpOnlyModeClosure hs;
>>>>>> >   if (SafepointSynchronize::is_at_safepoint()) {
>>>>>> >     hs.do_thread(state->get_thread());
>>>>>> >   } else {
>>>>>> >     Handshake::execute(&hs, state->get_thread());
>>>>>> >   }
>>>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the
>>>>>> > HandshakeClosure() constructor)
>>>>>>
>>>>>> Maybe this could be used also in the Handshake::execute() methods as general solution?
>>>>> Right, we could also do that. Avoiding to clear the polling page in
>>>>> HandshakeState::clear_handshake() should be enough to fix this issue and
>>>>> execute a handshake inside a safepoint, but adding that "if" statement
>>>>> in Handshake::execute() sounds good to avoid all the extra code that we
>>>>> go through when executing a handshake. I filed 8239084 to make that change.
>>>>>
>>>>>> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>>>>>> > always called in a nested operation or just sometimes.
>>>>>>
>>>>>> At least one execution path without vm operation exists:
>>>>>>
>>>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void
>>>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong
>>>>>> JvmtiEventControllerPrivate::recompute_enabled() : void
>>>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches)
>>>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void
>>>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError
>>>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError
>>>>>>
>>>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a
>>>>>> handshake, but to avoid making the compiled methods on stack not_entrant....
unless I'm further
>>>>>> encouraged to do it with a handshake :)
>>>>> Ah! I think you can still do it with a handshake with the
>>>>> Deoptimization::deoptimize_all_marked() like solution. I can change the
>>>>> if-else statement with just the Handshake::execute() call in 8239084.
>>>>> But up to you. : )
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>> Thanks again,
>>>>>> Richard.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Patricio Chilano
>>>>>> Sent: Thursday, 13. February 2020 18:47
>>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> I'm only commenting on the handshake changes.
>>>>>> I see that operation VM_EnterInterpOnlyMode can be called inside
>>>>>> operation VM_SetFramePop which also allows nested operations. Here is a
>>>>>> comment in VM_SetFramePop definition:
>>>>>>
>>>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
>>>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
>>>>>>
>>>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
>>>>>> could have a handshake inside a safepoint operation. The issue I see
>>>>>> there is that at the end of the handshake the polling page of the target
>>>>>> thread could be disarmed. So if the target thread happens to be in a
>>>>>> blocked state just transiently and wakes up then it will not stop for
>>>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the
>>>>>> polling page is armed at the beginning of disarm_safepoint().
>>>>>>
>>>>>> I think one option could be to remove
>>>>>> SafepointMechanism::disarm_if_needed() in
>>>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself
>>>>>> for the handshake case.
>>>>>>
>>>>>> Alternatively I think you could do something similar to what we do in
>>>>>> Deoptimization::deoptimize_all_marked():
>>>>>>
>>>>>>   EnterInterpOnlyModeClosure hs;
>>>>>>   if (SafepointSynchronize::is_at_safepoint()) {
>>>>>>     hs.do_thread(state->get_thread());
>>>>>>   } else {
>>>>>>     Handshake::execute(&hs, state->get_thread());
>>>>>>   }
>>>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the
>>>>>> HandshakeClosure() constructor)
>>>>>>
>>>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>>>>>> always called in a nested operation or just sometimes.
>>>>>>
>>>>>> Thanks,
>>>>>> Patricio
>>>>>>
>>>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote:
>>>>>>> // Repost including hotspot runtime and gc lists.
>>>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation
>>>>>>> // with a handshake.
>>>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation:
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585
>>>>>>>
>>>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to
>>>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.
>>>>>>>
>>>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.
>>>>>>>
>>>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>>>>>>>
>>>>>>> Thanks, Richard.
>>>>>>> >>>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >>>>> From xxinliu at amazon.com Wed May 13 07:01:38 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Wed, 13 May 2020 07:01:38 +0000 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: <65dcfd1f-5e7e-b9e1-8298-5daafcda8a81@oracle.com> References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> <65dcfd1f-5e7e-b9e1-8298-5daafcda8a81@oracle.com> Message-ID: <1EBE66E6-9AA7-4EC5-9B91-45F884071FAC@amazon.com> Hi, Vladimir, > 2. add +/- UseCRC32Intrinsics to IntrinsicAvailableTest.java > The purpose of that test is not to generate a CRC32 intrinsic. Its purpose is to check if compilers determine to intrinsify _updateCRC32 or not. > Mathematically, "UseCRC32Intrinsics" is a set = [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32]. > "-XX:-UseCRC32Intrinsics" disables all 3 of them. If users use -XX:ControlIntrinsic=+_updateCRC32 and -XX:-UseCRC32Intrinsics, _updateCRC32 should be enabled individually. No, I think we should preserve current behavior when UseCRC32Intrinsics is off then all corresponding intrinsics are also should be off. This is the purpose of such flags - to be able control several intrinsics with one flag. Otherwise you have to check each individual intrinsic if CPU does not support them. Even if code for some of these intrinsics can be generated on this CPU. We should be consistent, otherwise code can become very complex to support. ---- If -XX:ControlIntrinsic=+_updateBytesCRC32 can't win over -XX:-UseCRC32Intrinsics, it will come back the justification of JBS-8151779: Why do we need to support the usage -XX:ControlIntrinsic=+_updateBytesCRC32? 
If a user doesn't set +_updateBytesCRC32, it's still enabled.

I read the descriptions of JBS-8235981 and JBS-8151779 again, and I try to understand it this way. The option 'UseCRC32Intrinsics' is the consolidation of 3 real intrinsics [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32]. It represents some sort of hardware capability that makes those intrinsics optimal. If UseCRC32Intrinsics is OFF, it does not make sense to intrinsify them anymore because the inliner can deliver a similar result.

Quote from JBS-8235981:
"Right now, there's no way to introduce experimental intrinsics which are turned off by default and let users enable them on their side."

Currently, once a user declares a new intrinsic in VM_INTRINSICS_DO, it's enabled. That might not be true in the future, i.e. a developer can declare an intrinsic but mark it turned off by default, and then try it out with -XX:ControlIntrinsic=+_myNewIntrinsic during development.

Did I catch your intention this time? If yes, could you take a look at this new revision? I think it meets the requirement.

Webrev: http://cr.openjdk.java.net/~xliu/8151779/05/webrev/
Incremental diff: http://cr.openjdk.java.net/~xliu/8151779/r4_to_r5.diff

Here is the change log from rev04.

1) An intrinsic is enabled if and only if neither ControlIntrinsic nor the corresponding UseXXXIntrinsics disables it. The implementation is still in vmIntrinsics::is_disabled_by_flags(vmIntrinsics::ID id).
2) I introduce a compact data structure, TriBoolArray. It compresses an array of TriBool. Each TriBool only takes 2 bits now. I also took Coleen's suggestion to put TriBool and TriBoolArray in a standalone file "utilities/tribool.hpp". A new gtest is attached.
3) Correct some typos. Thank you, David, for pointing them out.

Thanks,
--lx

On 5/12/20, 12:59 AM, "David Holmes" wrote:
Hi,

Sorry for the delay in getting back to this.

On 5/05/2020 7:37 pm, Liu, Xin wrote:
> Hello, David and Nils
>
> Thank you for reviewing the patch. I went to brush up my English grammar and then updated my patch to rev04.
> https://cr.openjdk.java.net/~xliu/8151779/04/webrev/
> Here is the incremental diff: https://cr.openjdk.java.net/~xliu/8151779/r3_to_r4.diff It reflects changes based on David's feedback. I really appreciate that you reviewed so carefully and made so many invaluable suggestions. TBH, I don't understand Amazon's copyright header either. I chose the simple way to dodge that problem.

In vmSymbols.hpp

+ // 1. Disable/Control Intrinsic accept a list of intrinsic IDs.

s/accept/accepts/

+ // their final value are subject to hardware inspection (VM_Version::initialize).

s/value/values/

Otherwise all my nits have been addressed - thanks.

I don't need to see a further webrev.

Thanks,
David
-----

> Nils points out a very tricky question. Yes, I also noticed that each TriBool takes 4 bytes on x86_64. It's a natural machine word and supposed to be the most efficient form. As a result, the vector control_words takes about 1.3Kb for all intrinsics. I thought it was not a big deal, but Nils brought up that each DirectiveSet will increase from 128b to 1440b. Theoretically, the user may provide a CompileCommandFile which consists of hundreds of directives. Will hotspot have hundreds of DirectiveSets in that case?
>
> Actually, I do have a compacted container of TriBool. It's like a vector specialization.
> https://cr.openjdk.java.net/~xliu/8151779/TriBool.cpp
>
> The reason I didn't include it is that I still feel a few kilobytes of memory are not a big deal. Nowadays, hotspot allows Java programmers to allocate over 100G of heap. Is it wise to increase software complexity to save KBs?
>
> If you think it matters, I can integrate it. May I update TriBoolArray in a standalone JBS? I have made a lot of changes. I hope I can verify them using KitchenSink?
> > For the second problem, I think it's because I used 'memset' to initialize an array of objects in rev01. Previously, I had code like this: > memset(&_intrinsic_control_words[0], 0, sizeof(_intrinsic_control_words)); > > This kind of usage will be warned as -Werror=class-memaccess in g++-8. I have fixed it since rev02. I use DirectiveSet::fill_in(). Please check out. > > Thanks, > --lx > From kim.barrett at oracle.com Wed May 13 08:50:37 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 May 2020 04:50:37 -0400 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> Message-ID: <3DB275C0-9FD8-4685-9EBE-952D98BCD488@oracle.com> > On May 12, 2020, at 1:36 AM, Ioi Lam wrote: > > https://bugs.openjdk.java.net/browse/JDK-8244775 > http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ > > Currently 231 .o files depends on jfrEvents.hpp, which pulls in a lot of stuff > and slows down HotSpot build. > > I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. Now the > number is down to 65 .o files. > > On my machine, debug build goes from 2m19s to 2m01s. > > Testing: passed mach5 tiers 1/2/3. > > Thanks > - Ioi GC changes look good. There are some implicit includes that might be nice to address, but those are pretty much pre-existing. Consider adding utilities/ticks.hpp to g1GCParPhaseTimesTracker.hpp. 
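
The compact TriBoolArray from the 8151779 change log above (each TriBool taking 2 bits instead of a 4-byte word) could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in the webrev: the enum values, the template parameter, the 32-bit backing word, and the names get(), set() and size_in_bytes() are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical three-state value: Default means "no explicit setting".
enum class TriBool : std::uint8_t { Default = 0, False = 1, True = 2 };

// Hypothetical packed container: N entries, 2 bits each,
// i.e. 16 entries per 32-bit backing word.
template <std::size_t N>
class TriBoolArray {
  static constexpr std::size_t kBitsPerEntry = 2;
  static constexpr std::size_t kEntriesPerWord = 32 / kBitsPerEntry;
  static constexpr std::size_t kWords =
      (N + kEntriesPerWord - 1) / kEntriesPerWord;
  static constexpr std::uint32_t kMask = 0x3u;

  // Zero-initialized, so every entry starts out as TriBool::Default.
  std::uint32_t _words[kWords] = {};

 public:
  TriBool get(std::size_t i) const {
    assert(i < N);
    const std::size_t shift = (i % kEntriesPerWord) * kBitsPerEntry;
    return static_cast<TriBool>((_words[i / kEntriesPerWord] >> shift) & kMask);
  }

  void set(std::size_t i, TriBool v) {
    assert(i < N);
    const std::size_t shift = (i % kEntriesPerWord) * kBitsPerEntry;
    std::uint32_t& w = _words[i / kEntriesPerWord];
    // Clear the 2-bit slot, then store the new value in it.
    w = (w & ~(kMask << shift)) | (static_cast<std::uint32_t>(v) << shift);
  }

  static constexpr std::size_t size_in_bytes() {
    return kWords * sizeof(std::uint32_t);
  }
};
```

With this layout, an array of 320 entries occupies 80 bytes rather than the 1280 bytes needed at 4 bytes per entry, which is the kind of saving the DirectiveSet size concern is about. Because the backing store is a plain array of integers, it can also be cleared without the class-memaccess warning mentioned above.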
From rwestrel at redhat.com Wed May 13 14:48:44 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 13 May 2020 16:48:44 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <878si45f6p.fsf@redhat.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com>
Message-ID: <87zhab3n77.fsf@redhat.com>

>> Looking a little more at the interval arithmetic subroutines,
>> I think it would be reasonable to leave out most of the linkage to
>> TypeInt/TypeLong, and isolate the logic that does the min-ing
>> and max-ing, with separate routines for translating to and from
>> the Type* world.
>>
>> Maybe:
>>
>> struct MinMaxInterval {
>>   julong shi, slo, uhi, ulo;
>>   bool is_int;
>>   void signedMaxWith(const MinMaxInterval& that);
>>   void unsignedMaxWith(const MinMaxInterval& that);
>>   void signedMinWith(const MinMaxInterval& that);
>>   void unsignedMinWith(const MinMaxInterval& that);
>>   MinMaxInterval(TypeInt*);
>>   MinMaxInterval(TypeLong*);
>>   const TypeInt* asTypeInt();
>>   const TypeLong* asTypeLong();
>> };
>>
>> It would be overkill in many cases to put such small routines
>> into their own class, but in this case the min/max interval logic
>> is very subtle and deserves a little display platform.

After spending some time trying to get this to work, I have to admit my
initial logic was very much wrong, and getting this to work in a way that
I'm confident about is a lot less straightforward than I thought.

> Another way to deal with that, would be to pass the expected result type
> as argument to unsigned_min() rather than compute it.

So I went that way and pass the type as an argument to the min/max
methods in case the caller knows something about the result:

http://cr.openjdk.java.net/~roland/8244504/webrev.01/

Roland.
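
To make the two options in this message concrete, here is a minimal sketch of the signed half of the interval logic, together with the alternative of letting the caller supply a result range. It assumes plain 64-bit signed bounds and does not attempt the unsigned cases that proved subtle; SignedInterval, signed_min(), signed_max() and with_expected_range() are hypothetical names, not functions from the webrev.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [lo, hi] of signed 64-bit values.
struct SignedInterval {
  std::int64_t lo;
  std::int64_t hi;
};

// For x in a and y in b, min(x, y) always lies in
// [min(a.lo, b.lo), min(a.hi, b.hi)].
inline SignedInterval signed_min(SignedInterval a, SignedInterval b) {
  return { std::min(a.lo, b.lo), std::min(a.hi, b.hi) };
}

// For x in a and y in b, max(x, y) always lies in
// [max(a.lo, b.lo), max(a.hi, b.hi)].
inline SignedInterval signed_max(SignedInterval a, SignedInterval b) {
  return { std::max(a.lo, b.lo), std::max(a.hi, b.hi) };
}

// Assumed illustration of "pass the expected result type as an argument":
// when the caller already knows a range for the result, it is used to
// tighten whatever was computed instead of relying on the computation alone.
inline SignedInterval with_expected_range(SignedInterval computed,
                                          SignedInterval expected) {
  SignedInterval r = { std::max(computed.lo, expected.lo),
                       std::min(computed.hi, expected.hi) };
  assert(r.lo <= r.hi);  // the caller's knowledge must overlap the computed range
  return r;
}
```

For example, signed_min over [0, 10] and [-5, 3] gives [-5, 3], and a caller that knows the result is non-negative can narrow it to [0, 3] with with_expected_range.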
From ioi.lam at oracle.com Wed May 13 15:31:13 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 13 May 2020 08:31:13 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: <3DB275C0-9FD8-4685-9EBE-952D98BCD488@oracle.com> References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> <3DB275C0-9FD8-4685-9EBE-952D98BCD488@oracle.com> Message-ID: On 5/13/20 1:50 AM, Kim Barrett wrote: >> On May 12, 2020, at 1:36 AM, Ioi Lam wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8244775 >> http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ >> >> Currently 231 .o files depends on jfrEvents.hpp, which pulls in a lot of stuff >> and slows down HotSpot build. >> >> I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. Now the >> number is down to 65 .o files. >> >> On my machine, debug build goes from 2m19s to 2m01s. >> >> Testing: passed mach5 tiers 1/2/3. >> >> Thanks >> - Ioi > GC changes look good. > > There are some implicit includes that might be nice to address, but those are pretty much > pre-existing. Consider adding utilities/ticks.hpp to g1GCParPhaseTimesTracker.hpp. > Hi Kim, Thanks for the review. I've added utilities/ticks.hpp to g1GCParPhaseTimesTracker.hpp. 
- Ioi

From richard.reingruber at sap.com Wed May 13 15:37:59 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 13 May 2020 15:37:59 +0000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To: <36d5e2c0-c724-7ff7-d37e-decb5cc0005b@oracle.com>
References: <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> <36d5e2c0-c724-7ff7-d37e-decb5cc0005b@oracle.com>
Message-ID:

Hi David,

> On 4/05/2020 8:33 pm, Reingruber, Richard wrote:
> > // Trimmed the list of recipients. If the list gets too long then the message needs to be approved
> > // by a moderator.

> Yes I noticed that too :) In general if you send to hotspot-dev you
> shouldn't need to also send to hotspot-X-dev.

Makes sense. Will do so next time.

> >
> > This would be the post with the current webrev.1
> >
> > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html

> Sorry I missed that update. Okay so this is working with direct
> handshakes now.

> One style nit in jvmtiThreadState.cpp:

>    assert(SafepointSynchronize::is_at_safepoint() ||
> ! (JavaThread *)Thread::current() == get_thread() ||
> ! Thread::current() == get_thread()->active_handshaker(),
> ! "bad synchronization with owner thread");

> the ! lines should indent as follows

>    assert(SafepointSynchronize::is_at_safepoint() ||
>           (JavaThread *)Thread::current() == get_thread() ||
>           Thread::current() == get_thread()->active_handshaker(),
> !         "bad synchronization with owner thread");

Sure.

> Let's see how this plays out.

Hopefully not too bad... :)

>> Not a review but some general commentary ...

Still not a review, or is it now?

Thanks, Richard.

-----Original Message-----
From: David Holmes
Sent: Wednesday, 13.
May 2020 07:43
To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant

On 4/05/2020 8:33 pm, Reingruber, Richard wrote:
> // Trimmed the list of recipients. If the list gets too long then the message needs to be approved
> // by a moderator.

Yes I noticed that too :) In general if you send to hotspot-dev you
shouldn't need to also send to hotspot-X-dev.

> Hi David,

Hi Richard,

>> On 28/04/2020 12:09 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>> Not a review but some general commentary ...
>>>
>>> That's welcome.
>
>> Having had to take an even closer look now I have a review comment too :)
>
>> src/hotspot/share/prims/jvmtiThreadState.cpp
>
>> void JvmtiThreadState::invalidate_cur_stack_depth() {
>> ! assert(SafepointSynchronize::is_at_safepoint() ||
>> ! (Thread::current()->is_VM_thread() &&
>> get_thread()->is_vmthread_processing_handshake()) ||
>> (JavaThread *)Thread::current() == get_thread(),
>> "must be current thread or at safepoint");
>
> You're looking at an outdated webrev, I'm afraid.
>
> This would be the post with the current webrev.1
>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html

Sorry I missed that update. Okay so this is working with direct
handshakes now.

One style nit in jvmtiThreadState.cpp:

   assert(SafepointSynchronize::is_at_safepoint() ||
! (JavaThread *)Thread::current() == get_thread() ||
! Thread::current() == get_thread()->active_handshaker(),
! "bad synchronization with owner thread");

the ! lines should indent as follows

   assert(SafepointSynchronize::is_at_safepoint() ||
          (JavaThread *)Thread::current() == get_thread() ||
          Thread::current() == get_thread()->active_handshaker(),
!
"bad synchronization with owner thread"); Lets see how this plays out. Cheers, David > Thanks, Richard. > > -----Original Message----- > From: David Holmes > Sent: Montag, 4. Mai 2020 08:51 > To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > On 28/04/2020 12:09 am, Reingruber, Richard wrote: >> Hi David, >> >>> Not a review but some general commentary ... >> >> That's welcome. > > Having had to take an even closer look now I have a review comment too :) > > src/hotspot/share/prims/jvmtiThreadState.cpp > > void JvmtiThreadState::invalidate_cur_stack_depth() { > ! assert(SafepointSynchronize::is_at_safepoint() || > ! (Thread::current()->is_VM_thread() && > get_thread()->is_vmthread_processing_handshake()) || > (JavaThread *)Thread::current() == get_thread(), > "must be current thread or at safepoint"); > > The message needs updating to include handshakes. > > More below ... > >>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>> Hi Yasumasa, Patricio, >>>> >>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>> >>>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>> >>>>> Thanks for your information. 
>>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>> I will modify and will test it after yours. >>>> >>>> Thanks :) >>>> >>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>> to me, how this has to be handled. >>>> >>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >> >>> I'm growing increasingly concerned that use of direct handshakes to >>> replace VM operations needs a much greater examination for correctness >>> than might initially be thought. I see a number of issues: >> >> I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. >> >> In addition I would suggest to take the general part of the discussion to a dedicated thread or to >> the review thread for JDK-8242427. I would like to keep this thread closer to its subject. > > I will focus on the issues in the context of this particular change > then, though the issues themselves are applicable to all handshake > situations (and more so with direct handshakes). This is mostly just > discussion. > >>> First, the VMThread executes (most) VM operations with a clean stack in >>> a clean state, so it has lots of room to work. If we now execute the >>> same logic in a JavaThread then we risk hitting stackoverflows if >>> nothing else. 
But we are also now executing code in a JavaThread and so >>> we have to be sure that code is not going to act differently (in a bad >>> way) if executed by a JavaThread rather than the VMThread. For example, >>> may it be possible that if executing in the VMThread we defer some >>> activity that might require execution of Java code, or else hand it off >>> to one of the service threads? If we execute that code directly in the >>> current JavaThread instead we may not be in a valid state (e.g. consider >>> re-entrancy to various subsystems that is not allowed). >> >> It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a >> paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a >> synchronization point of view. > > Just to be clear, your proposed change is not using a direct handshake. > >> Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of >> the deopt handler. >> >> I can't see why this cannot be done with a direct handshake. Something very similar is already done >> in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. > > Note that existing non-direct handshakes may also have issues that not > have been fully investigated. > >> The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. > > For the target thread if you use more stack than would be used stopping > at a safepoint then you are at risk. For the thread initiating the > direct handshake if you use more stack than would be used enqueuing a VM > operation, then you are at risk. As we have not quantified these > numbers, nor have any easy way to establish the stack use of the actual > code to be executed, we're really just hoping for the best. This is a > general problem with handshakes that needs to be investigated more > deeply. 
As a simple, general, example just imagine if the code involves > logging that might utilise an on-stack buffer. > >>> Second, we have this question mark over what happens if the operation >>> hits further safepoint or handshake polls/checks? Are there constraints >>> on what is allowed here? How can we recognise this problem may exist and >>> so deal with it? >> >> The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I >> tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. > > That's good to hear but such tests are not exhaustive, they will detect > if you do reach a safepoint/handshake but they can't prove that you > cannot reach one. What you have done is necessary but may not be > sufficient. Plus you didn't actually add the NSV to the code - is there > a reason we can't actually keep it in do_thread? (I'm not sure if the > NSV also acts as a NoHandshakeVerifier?) > >>> Third, while we are generally considering what appear to be >>> single-thread operations, which should be amenable to a direct >>> handshake, we also have to be careful that some of the code involved >>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>> not need to take a lock where a direct handshake might! >> >> See again my arguments in the JBS item [1]. > > Yes I see the reasoning and that is good. My point is a general one as > it may not be obvious when such assumptions exist in the current code. > > Thanks, > David > >> Thanks, >> Richard. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> -----Original Message----- >> From: David Holmes >> Sent: Montag, 27. 
April 2020 07:16 >> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi all, >> >> Not a review but some general commentary ... >> >> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>> Hi Yasumasa, Patricio, >>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. >>> >>> Thanks :) >>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> Yes. 
To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >> >> I'm growing increasingly concerned that use of direct handshakes to >> replace VM operations needs a much greater examination for correctness >> than might initially be thought. I see a number of issues: >> >> First, the VMThread executes (most) VM operations with a clean stack in >> a clean state, so it has lots of room to work. If we now execute the >> same logic in a JavaThread then we risk hitting stackoverflows if >> nothing else. But we are also now executing code in a JavaThread and so >> we have to be sure that code is not going to act differently (in a bad >> way) if executed by a JavaThread rather than the VMThread. For example, >> may it be possible that if executing in the VMThread we defer some >> activity that might require execution of Java code, or else hand it off >> to one of the service threads? If we execute that code directly in the >> current JavaThread instead we may not be in a valid state (e.g. consider >> re-entrancy to various subsystems that is not allowed). >> >> Second, we have this question mark over what happens if the operation >> hits further safepoint or handshake polls/checks? Are there constraints >> on what is allowed here? How can we recognise this problem may exist and >> so deal with it? >> >> Third, while we are generally considering what appear to be >> single-thread operations, which should be amenable to a direct >> handshake, we also have to be careful that some of the code involved >> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >> not need to take a lock where a direct handshake might! 
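The third concern can be made concrete with a minimal sketch. It assumes simplified stand-in types; `SketchState`, `set_frame_pop_sketch` and the counters are hypothetical names for illustration, not HotSpot code. The point is only that code written for a safepoint may skip a lock that the same code must take once it runs in a handshake while other threads keep running.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for shared JVMTI-like state; not a HotSpot type.
struct SketchState {
  std::mutex state_lock;      // plays the role of a lock such as JvmtiThreadState_lock
  bool at_safepoint = false;  // true only while every other thread is stopped
  int lock_acquisitions = 0;  // how often the lock actually had to be taken
  int frame_pops = 0;         // the shared data being updated
};

// Updates shared state. At a safepoint no other thread can run, so the lock
// is skipped; outside a safepoint (e.g. in a handshake) it must be held.
void set_frame_pop_sketch(SketchState& s) {
  if (s.at_safepoint) {
    ++s.frame_pops;
  } else {
    std::lock_guard<std::mutex> guard(s.state_lock);
    ++s.lock_acquisitions;
    ++s.frame_pops;
  }
}
```

Calling `set_frame_pop_sketch` once with `at_safepoint` false and once with it true updates `frame_pops` twice but takes the lock only once. Code that silently assumed the safepoint branch would be the part needing review when moved to a handshake.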
>> >> Cheers, >> David >> ----- >> >>> @Patricio, coming back to my question [1]: >>> >>> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >>> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >>> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >>> handshakee would be safepoint safe, wouldn't it? >>> >>> Thanks, Richard. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >>> >>> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >>> >>> -----Original Message----- >>> From: Yasumasa Suenaga >>> Sent: Freitag, 24. April 2020 17:23 >>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi Richard, >>> >>> On 2020/04/24 23:44, Reingruber, Richard wrote: >>>> Hi Yasumasa, >>>> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>> >>> Thanks for your information. 
>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>> I will modify and will test it after yours. >>> >>> >>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>> to me, how this has to be handled. >>> >>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>>> >>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>> >>>> Would be interesting to see how you handled the issues above :) >>>> >>>> Thanks, Richard. >>>> >>>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Freitag, 24. 
April 2020 13:34 >>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>> Does it help you? I think it gives you to remove workaround. >>>> >>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>>> Hi Patricio, Vladimir, and Serguei, >>>>> >>>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>>> >>>>> In addition I have done some clean-up changes I missed in the first webrev. >>>>> >>>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>>> into the vm operation VM_SetFramePop [1] >>>>> >>>>> Kindly review again: >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>>> >>>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>>> direct handshake: >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>> >>>>> Testing: >>>>> >>>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>> >>>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>>> >>>>> Thanks, >>>>> Richard. 
>>>>> >>>>> [1] An assertion in Handshake::execute_direct() fails, if called by the VMThread, because it is not a JavaThread. >>>>> >>>>> -----Original Message----- >>>>> From: hotspot-dev On Behalf Of Reingruber, Richard >>>>> Sent: Friday, 14 February 2020 19:47 >>>>> To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Patricio, >>>>> >>>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>> > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>>> > > >>>>> > > > Alternatively I think you could do something similar to what we do in >>>>> > > > Deoptimization::deoptimize_all_marked(): >>>>> > > > >>>>> > > > EnterInterpOnlyModeClosure hs; >>>>> > > > if (SafepointSynchronize::is_at_safepoint()) { >>>>> > > > hs.do_thread(state->get_thread()); >>>>> > > > } else { >>>>> > > > Handshake::execute(&hs, state->get_thread()); >>>>> > > > } >>>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>> > > > HandshakeClosure() constructor) >>>>> > > >>>>> > > Maybe this could be used also in the Handshake::execute() methods as a general solution? >>>>> > Right, we could also do that. Not clearing the polling page in >>>>> > HandshakeState::clear_handshake() should be enough to fix this issue and >>>>> > execute a handshake inside a safepoint, but adding that "if" statement >>>>> > in Handshake::execute() sounds good to avoid all the extra code that we >>>>> > go through when executing a handshake. I filed 8239084 to make that change. 
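The "if" dispatch quoted above can be sketched in isolation. This is a minimal illustration under stated assumptions: `FakeVM`, `FakeThread`, `EnterInterpOnlyModeSketch`, `fake_handshake_execute` and `execute_or_run_directly` are hypothetical stand-ins, not the HotSpot classes, and the "handshake" here only logs and runs the closure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a JavaThread; not a HotSpot type.
struct FakeThread {
  std::string name;
  bool interp_only;
  explicit FakeThread(const std::string& n) : name(n), interp_only(false) {}
};

// Hypothetical VM state: a safepoint flag plus a log of the path taken.
struct FakeVM {
  bool at_safepoint = false;
  std::vector<std::string> log;
};

// Plays the role of a handshake closure: the per-thread work itself.
struct EnterInterpOnlyModeSketch {
  void do_thread(FakeThread* t) { t->interp_only = true; }
};

// Stand-in for executing a handshake: records the request, then runs the closure.
template <typename Closure>
void fake_handshake_execute(FakeVM& vm, Closure& cl, FakeThread* target) {
  vm.log.push_back("handshake:" + target->name);
  cl.do_thread(target);
}

// The pattern from the thread: at a safepoint the target is already stopped,
// so the closure runs directly; otherwise a handshake is requested.
template <typename Closure>
void execute_or_run_directly(FakeVM& vm, Closure& cl, FakeThread* target) {
  if (vm.at_safepoint) {
    vm.log.push_back("direct:" + target->name);
    cl.do_thread(target);
  } else {
    fake_handshake_execute(vm, cl, target);
  }
}
```

With `at_safepoint` false the log records a handshake entry, with it true a direct entry; in both cases the closure's effect on the target is the same, which is what makes hoisting the check into a common entry point attractive.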
>>>>> >>>>> Thanks for taking care of this and creating the RFE. >>>>> >>>>> > >>>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>> > > > always called in a nested operation or just sometimes. >>>>> > > >>>>> > > At least one execution path without vm operation exists: >>>>> > > >>>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void >>>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>> > > >>>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>>> > > encouraged to do it with a handshake :) >>>>> > Ah! I think you can still do it with a handshake with the >>>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the >>>>> > if-else statement with just the Handshake::execute() call in 8239084. >>>>> > But up to you. : ) >>>>> >>>>> Well, I think that's enough encouragement :) >>>>> I'll wait for 8239084 and try then again. >>>>> (no urgency and all) >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: Patricio Chilano >>>>> Sent: Friday, 14 February 2020 15:54 
>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Richard, >>>>> >>>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: >>>>>> Hi Patricio, >>>>>> >>>>>> thanks for having a look. >>>>>> >>>>>> > I'm only commenting on the handshake changes. >>>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside >>>>>> > operation VM_SetFramePop which also allows nested operations. Here is a >>>>>> > comment in VM_SetFramePop definition: >>>>>> > >>>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>>> > >>>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>>> > could have a handshake inside a safepoint operation. The issue I see >>>>>> > there is that at the end of the handshake the polling page of the target >>>>>> > thread could be disarmed. So if the target thread happens to be in a >>>>>> > blocked state just transiently and wakes up then it will not stop for >>>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>>> > polling page is armed at the beginning of disarm_safepoint(). >>>>>> >>>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? 
>>>>>> >>>>>> > Alternatively I think you could do something similar to what we do in >>>>>> > Deoptimization::deoptimize_all_marked(): >>>>>> > >>>>>> > EnterInterpOnlyModeClosure hs; >>>>>> > if (SafepointSynchronize::is_at_safepoint()) { >>>>>> > hs.do_thread(state->get_thread()); >>>>>> > } else { >>>>>> > Handshake::execute(&hs, state->get_thread()); >>>>>> > } >>>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>>> > HandshakeClosure() constructor) >>>>>> >>>>>> Maybe this could be used also in the Handshake::execute() methods as a general solution? >>>>> Right, we could also do that. Not clearing the polling page in >>>>> HandshakeState::clear_handshake() should be enough to fix this issue and >>>>> execute a handshake inside a safepoint, but adding that "if" statement >>>>> in Handshake::execute() sounds good to avoid all the extra code that we >>>>> go through when executing a handshake. I filed 8239084 to make that change. >>>>> >>>>>> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>>> > always called in a nested operation or just sometimes. >>>>>> >>>>>> At least one execution path without vm operation exists: >>>>>> >>>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>>> JvmtiEventControllerPrivate::recompute_enabled() : void >>>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>>> >>>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>>> handshake, but to avoid making the compiled methods on stack not_entrant.... 
unless I'm further >>>>>> encouraged to do it with a handshake :) >>>>> Ah! I think you can still do it with a handshake with the >>>>> Deoptimization::deoptimize_all_marked() like solution. I can change the >>>>> if-else statement with just the Handshake::execute() call in 8239084. >>>>> But up to you. : ) >>>>> >>>>> Thanks, >>>>> Patricio >>>>>> Thanks again, >>>>>> Richard. >>>>>> >>>>>> -----Original Message----- >>>>>> From: Patricio Chilano >>>>>> Sent: Thursday, 13 February 2020 18:47 >>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> I'm only commenting on the handshake changes. >>>>>> I see that operation VM_EnterInterpOnlyMode can be called inside >>>>>> operation VM_SetFramePop which also allows nested operations. Here is a >>>>>> comment in VM_SetFramePop definition: >>>>>> >>>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>>> >>>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>>> could have a handshake inside a safepoint operation. The issue I see >>>>>> there is that at the end of the handshake the polling page of the target >>>>>> thread could be disarmed. So if the target thread happens to be in a >>>>>> blocked state just transiently and wakes up then it will not stop for >>>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>>> polling page is armed at the beginning of disarm_safepoint(). 
>>>>>> >>>>>> I think one option could be to remove >>>>>> SafepointMechanism::disarm_if_needed() in >>>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself >>>>>> for the handshake case. >>>>>> >>>>>> Alternatively I think you could do something similar to what we do in >>>>>> Deoptimization::deoptimize_all_marked(): >>>>>> >>>>>> EnterInterpOnlyModeClosure hs; >>>>>> if (SafepointSynchronize::is_at_safepoint()) { >>>>>> hs.do_thread(state->get_thread()); >>>>>> } else { >>>>>> Handshake::execute(&hs, state->get_thread()); >>>>>> } >>>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>>> HandshakeClosure() constructor) >>>>>> >>>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>>> always called in a nested operation or just sometimes. >>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>> >>>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: >>>>>>> // Repost including hotspot runtime and gc lists. >>>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation >>>>>>> // with a handshake. >>>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>>>>>> >>>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>>> >>>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>>>>>> >>>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>>>>>> >>>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>>>> >>>>>>> Thanks, Richard. 
>>>>>>> >>>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >>>>> From vladimir.kozlov at oracle.com Wed May 13 16:02:41 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 May 2020 09:02:41 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> Message-ID: <2ef824a1-e297-d2d8-d6ca-3048a1e21840@oracle.com> Hi Ioi, This looks fine to me but why not put methods into compile.cpp instead of .inline.hpp? It should not be performance critical. thanks, Vladimir On 5/11/20 10:36 PM, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8244775 > http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ > > Currently 231 .o files depends on jfrEvents.hpp, which pulls in a lot of stuff > and slows down HotSpot build. > > I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. Now the > number is down to 65 .o files. > > On my machine, debug build goes from 2m19s to 2m01s. > > Testing: passed mach5 tiers 1/2/3. > > Thanks > - Ioi From gnu.andrew at redhat.com Wed May 13 16:52:16 2020 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Wed, 13 May 2020 17:52:16 +0100 Subject: [8u] RFR: 8146612: C2: Precedence edges specification violated In-Reply-To: <87v9n4ie2h.fsf@redhat.com> References: <87v9n4ie2h.fsf@redhat.com> Message-ID: On 16/03/2020 14:12, Roland Westrelin wrote: > > Change does not apply cleanly to 8u in node.hpp (copyright change) and > node.cpp: (copyright change and methods > Node::del_req()/Node::del_req_ordered()). The 8u change is identical to > the one from 11u but code in Node::del_req()/Node::del_req_ordered() > changed abit. 
This is required in order to backport 8214862: assert(proj > != __null) at compile.cpp:3251. > > http://cr.openjdk.java.net/~roland/8146612.8u/webrev.00/ > > Initial change: > https://bugs.openjdk.java.net/browse/JDK-8146612 > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f6615ec051d9 > > Tested with tier1. > > Roland. > Looks ok to me (change the same, bar addition of err_msg wrapper) Approved for 8u. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From ioi.lam at oracle.com Wed May 13 17:18:33 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 13 May 2020 10:18:33 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: <2ef824a1-e297-d2d8-d6ca-3048a1e21840@oracle.com> References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> <2ef824a1-e297-d2d8-d6ca-3048a1e21840@oracle.com> Message-ID: Hi Vladimir, Thanks for your comments. I've moved the functions into compile.cpp and removed compile.inline.hpp http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v02/index.html Thanks - Ioi On 5/13/20 9:02 AM, Vladimir Kozlov wrote: > Hi Ioi, > > This looks fine to me but why not put methods into compile.cpp instead > of .inline.hpp? It should not be performance critical. > > thanks, > Vladimir > > On 5/11/20 10:36 PM, Ioi Lam wrote: >> https://bugs.openjdk.java.net/browse/JDK-8244775 >> http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ >> >> >> Currently 231 .o files depends on jfrEvents.hpp, which pulls in a lot >> of stuff >> and slows down HotSpot build. >> >> I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. >> Now the >> number is down to 65 .o files. >> >> On my machine, debug build goes from 2m19s to 2m01s. >> >> Testing: passed mach5 tiers 1/2/3. 
>> >> Thanks >> - Ioi From vladimir.kozlov at oracle.com Wed May 13 17:24:56 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 May 2020 10:24:56 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> <2ef824a1-e297-d2d8-d6ca-3048a1e21840@oracle.com> Message-ID: Good. Thanks, Vladimir On 5/13/20 10:18 AM, Ioi Lam wrote: > Hi Vladimir, > > Thanks for your comments. I've moved the functions into compile.cpp and removed compile.inline.hpp > > http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v02/index.html > > Thanks > - Ioi > > > On 5/13/20 9:02 AM, Vladimir Kozlov wrote: >> Hi Ioi, >> >> This looks fine to me but why not put methods into compile.cpp instead of .inline.hpp? It should not be performance >> critical. >> >> thanks, >> Vladimir >> >> On 5/11/20 10:36 PM, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8244775 >>> http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ >>> >>> Currently 231 .o files depends on jfrEvents.hpp, which pulls in a lot of stuff >>> and slows down HotSpot build. >>> >>> I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. Now the >>> number is down to 65 .o files. >>> >>> On my machine, debug build goes from 2m19s to 2m01s. >>> >>> Testing: passed mach5 tiers 1/2/3. >>> >>> Thanks >>> - Ioi > From ioi.lam at oracle.com Wed May 13 17:54:30 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 13 May 2020 10:54:30 -0700 Subject: RFR(S) 8244775 Remove unnecessary dependency to jfrEvents.hpp In-Reply-To: References: <01aa2284-b62f-27ba-4f22-58e69c65a002@oracle.com> <2ef824a1-e297-d2d8-d6ca-3048a1e21840@oracle.com> Message-ID: <5adfc93c-a2aa-57a4-e84f-ef1583ef152a@oracle.com> Thanks Vladimir! - Ioi On 5/13/20 10:24 AM, Vladimir Kozlov wrote: > Good. 
> > Thanks, > Vladimir > > On 5/13/20 10:18 AM, Ioi Lam wrote: >> Hi Vladimir, >> >> Thanks for your comments. I've moved the functions into compile.cpp >> and removed compile.inline.hpp >> >> http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v02/index.html >> >> >> Thanks >> - Ioi >> >> >> On 5/13/20 9:02 AM, Vladimir Kozlov wrote: >>> Hi Ioi, >>> >>> This looks fine to me but why not put methods into compile.cpp >>> instead of .inline.hpp? It should not be performance critical. >>> >>> thanks, >>> Vladimir >>> >>> On 5/11/20 10:36 PM, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8244775 >>>> http://cr.openjdk.java.net/~iklam/jdk15/8244775-remove-dependency-jfrEvents.hpp.v01/ >>>> >>>> >>>> Currently 231 .o files depends on jfrEvents.hpp, which pulls in a >>>> lot of stuff >>>> and slows down HotSpot build. >>>> >>>> I refactored compile.hpp, compilerEvent.hpp and g1GCPhaseTimes.hpp. >>>> Now the >>>> number is down to 65 .o files. >>>> >>>> On my machine, debug build goes from 2m19s to 2m01s. >>>> >>>> Testing: passed mach5 tiers 1/2/3. >>>> >>>> Thanks >>>> - Ioi >> From vladimir.kozlov at oracle.com Wed May 13 19:46:09 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 May 2020 12:46:09 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> Message-ID: <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> Hi Martin, On 5/11/20 6:32 AM, Doerr, Martin wrote: > Hi Vladimir, > > are you ok with the updated CSR (https://bugs.openjdk.java.net/browse/JDK-8244507)? > Should I set it to proposed? Yes. > > Here's a new webrev with obsoletion + expiration for C2 flags in ClientVM: > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ > > I've added the new C1 flags to the tests which should test C1 compiler as well. Good. 
Why not do the same for C1MaxInlineSize?

> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which set C2 flags. I think this is the best solution because it still allows running the tests with GraalVM compiler.

Yes.

Thanks,
Vladimir

>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: Doerr, Martin
>> Sent: Friday, 8 May 2020 23:07
>> To: Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net
>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags
>>
>> Hi Vladimir,
>>
>>> You need to update your CSR - add information about this and above code change. Example:
>>> https://bugs.openjdk.java.net/browse/JDK-8238840
>> I've updated the CSR with obsolete and expired flags as in the example.
>>
>>> I would suggest to fix tests anyway (there are only few) because new
>>> warning output could be unexpected.
>> Ok. I'll prepare a webrev with fixed tests.
>>
>> Best regards,
>> Martin
>>
>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov
>>> Sent: Friday, 8 May 2020 21:43
>>> To: Doerr, Martin; hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
>>>
>>> Hi Martin
>>>
>>> On 5/8/20 5:56 AM, Doerr, Martin wrote:
>>>> Hi Vladimir,
>>>>
>>>> thanks a lot for looking at this, for finding the test issues and for reviewing the CSR.
>>>>
>>>> For me, C2 is a fundamental part of the JVM. I would usually never build without it.
>>>> (Except if we want to use C1 + GraalVM compiler only.)
>>>
>>> Yes, it is one of the cases.
>>>
>>>> But you're right, --with-jvm-variants=client configuration should still be supported.
>>>
>>> Yes.
>>>
>>>>
>>>> We can fix it by marking the flags as obsolete if C2 is not included:
>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp
>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 2020 +0200
>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 2020 +0200
>>>> @@ -562,6 +562,16 @@
>>>>    { "dup option", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() },
>>>>  #endif
>>>>
>>>> +#ifndef COMPILER2
>>>> +  // These flags were generally available, but are C2 only, now.
>>>> +  { "MaxInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +  { "MaxRecursiveInlineLevel", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +  { "InlineSmallCode", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +  { "MaxInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +  { "FreqInlineSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +  { "MaxTrivialSize", JDK_Version::undefined(), JDK_Version::jdk(15), JDK_Version::undefined() },
>>>> +#endif
>>>> +
>>>>    { NULL, JDK_Version(0), JDK_Version(0) }
>>>>  };
>>>
>>> Right. I think you should do the full deprecation process for these product flags, with obsoleting in JDK 16 for VM builds which do not include C2. You need to update your CSR - add information about this and above code change. Example:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8238840
>>>
>>>>
>>>> This makes the VM accept the flags with warning:
>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version
>>>> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; support was removed in 15.0
>>>>
>>>> If we do it this way, the only test which I think should get fixed is ReservedStackTest.
>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to >>> preserve the inlining behavior. >>>> >>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>> compiler/c2 tests: Also written to test C2 specific things.) >>>> >>>> What do you think? >>> >>> I would suggest to fix tests anyway (there are only few) because new >>> warning output could be unexpected. >>> And it will be future-proof when warning will be converted into error >>> (if/when C2 goes away). >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>>> -----Original Message----- >>>>> From: hotspot-compiler-dev >>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>> >>>>> I would suggest to build VM without C2 and run tests. >>>>> >>>>> I grepped tests with these flags I found next tests where we need to fix >>>>> test's command (add >>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 >>>>> flags (by ussing additional @test block). >>>>> >>>>> runtime/ReservedStack/ReservedStackTest.java >>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>> compiler/c2/Test6792161.java >>>>> compiler/c2/Test5091921.java >>>>> >>>>> And there is issue with compiler/compilercontrol tests which use >>>>> InlineSmallCode and I am not sure how to handle: >>>>> >>>>> >>> >> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>> Hi Nils, >>>>>> >>>>>> thank you for looking at this and sorry for the late reply. >>>>>> >>>>>> I've added MaxTrivialSize and also updated the issue accordingly. >> Makes >>>>> sense. 
>>>>>> Do you have more flags in mind? >>>>>> >>>>>> Moving the flags which are only used by C2 into c2_globals definitely >>> makes >>>>> sense. >>>>>> >>>>>> Done in webrev.01: >>>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>> >>>>>> Please take a look and let me know when my proposal is ready for a >> CSR. >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-compiler-dev >>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Thanks for addressing this! This has been an annoyance for a long >> time. >>>>>>> >>>>>>> Have you though about including other flags - like MaxTrivialSize? >>>>>>> MaxInlineSize is tested against it. >>>>>>> >>>>>>> Also - you should move the flags that are now c2-only to >>> c2_globals.hpp. >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 >> we >>>>> had >>>>>>> discussed impact on C1. >>>>>>>> I still think it's bad to share them between both compilers. We may >>> want >>>>> to >>>>>>> do further C2 tuning without negative impact on C1 in the future. >>>>>>>> >>>>>>>> C1 has issues with substantial inlining because of the lack of >>> uncommon >>>>>>> traps. When C1 inlines a lot, stack frames may get large and code >> cache >>>>> space >>>>>>> may get wasted for cold or even never executed code. The situation >>> gets >>>>>>> worse when many patching stubs get used for such code. 
>>>>>>>> >>>>>>>> I had opened the following issue: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>> >>>>>>>> And my initial proposal is here: >>>>>>>> >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>>>> I have a simple example which shows wasted stack space (java >>> example >>>>>>> TestStack at the end). >>>>>>>> >>>>>>>> It simply counts stack frames until a stack overflow occurs. With the >>>>> current >>>>>>> implementation, only 1283 frames fit on the stack because the never >>>>>>> executed method bogus_test with local variables gets inlined. >>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get >>>>> 2310 >>>>>>> frames until stack overflow. (I only used C1 for this example. Can be >>>>>>> reproduced as shown below.) >>>>>>>> >>>>>>>> I didn't notice any performance regression even with the aggressive >>>>> setting >>>>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>>>> >>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >> feedback >>> in >>>>>>> general and feedback about the flag names before creating a CSR. >>>>>>>> I'd also be glad about feedback regarding the performance impact. 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Command line: >>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining >> - >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>> TestStack >>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >>>>> recursive >>>>>>> inlining too deep >>>>>>>> @ 11 TestStack::bogus_test (33 bytes) inline >>>>>>>> caught java.lang.StackOverflowError >>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>> >>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining >> - >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>> TestStack >>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >>>>> recursive >>>>>>> inlining too deep >>>>>>>> @ 11 TestStack::bogus_test (33 bytes) callee uses >>> too >>>>>>> much stack >>>>>>>> caught java.lang.StackOverflowError >>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>> >>>>>>>> >>>>>>>> TestStack.java: >>>>>>>> public class TestStack { >>>>>>>> >>>>>>>> static long cnt = 0, >>>>>>>> sum = 0; >>>>>>>> >>>>>>>> public static void bogus_test() { >>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>>>> sum += c1 + c2 + c3 + c4; >>>>>>>> } >>>>>>>> >>>>>>>> public static void triggerStackOverflow() { >>>>>>>> cnt++; >>>>>>>> triggerStackOverflow(); >>>>>>>> bogus_test(); >>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>>> public static void main(String args[]) { >>>>>>>> try { >>>>>>>> triggerStackOverflow(); >>>>>>>> } catch (StackOverflowError e) { >>>>>>>> System.out.println("caught " + e); >>>>>>>> } >>>>>>>> System.out.println(cnt + " activations were on stack, sum = " + >>>>> sum); >>>>>>>> } >>>>>>>> } 
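For a rough sense of scale, the two frame counts reported above (1283 and 2310 activations with -Xss256k) can be turned into an approximate stack cost per activation. The sketch below is only back-of-the-envelope arithmetic. It assumes the whole 256k thread stack is available for frames, which ignores guard and reserved zones, and the class and method names are hypothetical.

```java
public class FrameSizeEstimate {

    // Approximate bytes per activation: usable stack size divided by the
    // number of activations that fit before StackOverflowError.
    // Assumption: the full -Xss value is usable, which overstates the result.
    static long bytesPerFrame(long stackBytes, long frames) {
        return stackBytes / frames;
    }

    public static void main(String[] args) {
        long stackBytes = 256L * 1024L; // -Xss256k from the command lines in the thread

        long withInlining = bytesPerFrame(stackBytes, 1283);    // bogus_test inlined
        long withoutInlining = bytesPerFrame(stackBytes, 2310); // bogus_test not inlined

        System.out.println("approx. bytes/frame with bogus_test inlined:    " + withInlining);
        System.out.println("approx. bytes/frame with bogus_test not inlined: " + withoutInlining);
        System.out.println("approx. extra bytes per activation:              "
                + (withInlining - withoutInlining));
    }
}
```

Under these assumptions the estimate comes out at about 204 versus 113 bytes per activation, so the inlined but never executed locals of bogus_test cost roughly 90 bytes in every frame of triggerStackOverflow.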
>>>>>>>> >>>>>> From martin.doerr at sap.com Wed May 13 20:10:40 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 13 May 2020 20:10:40 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> Message-ID: Hi Vladimir, thanks for reviewing it. > > Should I set it to proposed? > > Yes. I've set it to "Finalized". Hope this was correct. > > I've added the new C1 flags to the tests which should test C1 compiler as > well. > > Good. Why not do the same for C1MaxInlineSize? Looks like MaxInlineSize is only used by tests which test C2 specific things. So I think C1MaxInlineSize would be pointless. In addition to that, the C2 values are probably not appropriate for C1 in some tests. Would you like to have C1MaxInlineSize configured in some tests? Best regards, Martin > -----Original Message----- > From: Vladimir Kozlov > Sent: Mittwoch, 13. Mai 2020 21:46 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > Hi Martin, > > On 5/11/20 6:32 AM, Doerr, Martin wrote: > > Hi Vladimir, > > > > are you ok with the updated CSR > (https://bugs.openjdk.java.net/browse/JDK-8244507)? > > Should I set it to proposed? > > Yes. > > > > > Here's a new webrev with obsoletion + expiration for C2 flags in ClientVM: > > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ > > > > I've added the new C1 flags to the tests which should test C1 compiler as > well. > > Good. Why not do the same for C1MaxInlineSize? > > > And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which set > C2 flags. I think this is the best solution because it still allows running the tests > with GraalVM compiler. > > Yes. 
> > Thanks, > Vladimir > > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: Doerr, Martin > >> Sent: Freitag, 8. Mai 2020 23:07 > >> To: Vladimir Kozlov ; hotspot-compiler- > >> dev at openjdk.java.net > >> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> Hi Vladimir, > >> > >>> You need update your CSR - add information about this and above code > >> change. Example: > >>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >> I've updated the CSR with obsolete and expired flags as in the example. > >> > >>> I would suggest to fix tests anyway (there are only few) because new > >>> warning output could be unexpected. > >> Ok. I'll prepare a webrev with fixed tests. > >> > >> Best regards, > >> Martin > >> > >> > >>> -----Original Message----- > >>> From: Vladimir Kozlov > >>> Sent: Freitag, 8. Mai 2020 21:43 > >>> To: Doerr, Martin ; hotspot-compiler- > >>> dev at openjdk.java.net > >>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>> > >>> Hi Martin > >>> > >>> On 5/8/20 5:56 AM, Doerr, Martin wrote: > >>>> Hi Vladimir, > >>>> > >>>> thanks a lot for looking at this, for finding the test issues and for > >> reviewing > >>> the CSR. > >>>> > >>>> For me, C2 is a fundamental part of the JVM. I would usually never > build > >>> without it ?? > >>>> (Except if we want to use C1 + GraalVM compiler only.) > >>> > >>> Yes it is one of cases. > >>> > >>>> But your right, --with-jvm-variants=client configuration should still be > >>> supported. > >>> > >>> Yes. 
> >>> > >>>> > >>>> We can fix it by making the flags as obsolete if C2 is not included: > >>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp > >>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 > >> 2020 > >>> +0200 > >>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 > >>> 2020 +0200 > >>>> @@ -562,6 +562,16 @@ > >>>> { "dup option", JDK_Version::jdk(9), > >> JDK_Version::undefined(), > >>> JDK_Version::undefined() }, > >>>> #endif > >>>> > >>>> +#ifndef COMPILER2 > >>>> + // These flags were generally available, but are C2 only, now. > >>>> + { "MaxInlineLevel", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> + { "InlineSmallCode", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> + { "MaxInlineSize", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> + { "FreqInlineSize", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> + { "MaxTrivialSize", JDK_Version::undefined(), > >>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>> +#endif > >>>> + > >>>> { NULL, JDK_Version(0), JDK_Version(0) } > >>>> }; > >>> > >>> Right. I think you should do full process for these product flags > deprecation > >>> with obsoleting in JDK 16 for VM builds > >>> which do not include C2. You need update your CSR - add information > >> about > >>> this and above code change. 
Example: > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>> > >>>> > >>>> This makes the VM accept the flags with warning: > >>>> jdk/bin/java -XX:MaxInlineLevel=9 -version > >>>> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; > >>> support was removed in 15.0 > >>>> > >>>> If we do it this way, the only test which I think should get fixed is > >>> ReservedStackTest. > >>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to > >>> preserve the inlining behavior. > >>>> > >>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. > >>> compiler/c2 tests: Also written to test C2 specific things.) > >>>> > >>>> What do you think? > >>> > >>> I would suggest to fix tests anyway (there are only few) because new > >>> warning output could be unexpected. > >>> And it will be future-proof when warning will be converted into error > >>> (if/when C2 goes away). > >>> > >>> Thanks, > >>> Vladimir > >>> > >>>> > >>>> Best regards, > >>>> Martin > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: hotspot-compiler-dev >>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > >>>>> Sent: Donnerstag, 7. Mai 2020 19:11 > >>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>> > >>>>> I would suggest to build VM without C2 and run tests. > >>>>> > >>>>> I grepped tests with these flags I found next tests where we need to > fix > >>>>> test's command (add > >>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > >>>>> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 > >>>>> flags (by ussing additional @test block). 
> >>>>> > >>>>> runtime/ReservedStack/ReservedStackTest.java > >>>>> compiler/intrinsics/string/TestStringIntrinsics2.java > >>>>> compiler/c2/Test6792161.java > >>>>> compiler/c2/Test5091921.java > >>>>> > >>>>> And there is issue with compiler/compilercontrol tests which use > >>>>> InlineSmallCode and I am not sure how to handle: > >>>>> > >>>>> > >>> > >> > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > >>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: > >>>>>> Hi Nils, > >>>>>> > >>>>>> thank you for looking at this and sorry for the late reply. > >>>>>> > >>>>>> I've added MaxTrivialSize and also updated the issue accordingly. > >> Makes > >>>>> sense. > >>>>>> Do you have more flags in mind? > >>>>>> > >>>>>> Moving the flags which are only used by C2 into c2_globals definitely > >>> makes > >>>>> sense. > >>>>>> > >>>>>> Done in webrev.01: > >>>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > >>>>>> > >>>>>> Please take a look and let me know when my proposal is ready for a > >> CSR. > >>>>>> > >>>>>> Best regards, > >>>>>> Martin > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: hotspot-compiler-dev >>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >>>>>>> Sent: Dienstag, 28. April 2020 18:29 > >>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> Thanks for addressing this! This has been an annoyance for a long > >> time. > >>>>>>> > >>>>>>> Have you though about including other flags - like MaxTrivialSize? > >>>>>>> MaxInlineSize is tested against it. > >>>>>>> > >>>>>>> Also - you should move the flags that are now c2-only to > >>> c2_globals.hpp. 
> >>>>>>> > >>>>>>> Best regards, > >>>>>>> Nils Eliasson > >>>>>>> > >>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 > >> we > >>>>> had > >>>>>>> discussed impact on C1. > >>>>>>>> I still think it's bad to share them between both compilers. We > may > >>> want > >>>>> to > >>>>>>> do further C2 tuning without negative impact on C1 in the future. > >>>>>>>> > >>>>>>>> C1 has issues with substantial inlining because of the lack of > >>> uncommon > >>>>>>> traps. When C1 inlines a lot, stack frames may get large and code > >> cache > >>>>> space > >>>>>>> may get wasted for cold or even never executed code. The > situation > >>> gets > >>>>>>> worse when many patching stubs get used for such code. > >>>>>>>> > >>>>>>>> I had opened the following issue: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>>>>>>> > >>>>>>>> And my initial proposal is here: > >>>>>>>> > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>>>>>>> > >>>>>>>> > >>>>>>>> Part of my proposal is to add an additional flag which I called > >>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>>>>>>> I have a simple example which shows wasted stack space (java > >>> example > >>>>>>> TestStack at the end). > >>>>>>>> > >>>>>>>> It simply counts stack frames until a stack overflow occurs. With > the > >>>>> current > >>>>>>> implementation, only 1283 frames fit on the stack because the > never > >>>>>>> executed method bogus_test with local variables gets inlined. > >>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we > get > >>>>> 2310 > >>>>>>> frames until stack overflow. (I only used C1 for this example. Can > be > >>>>>>> reproduced as shown below.) > >>>>>>>> > >>>>>>>> I didn't notice any performance regression even with the > aggressive > >>>>> setting > >>>>>>> of C1InlineStackLimit=5 with TieredCompilation. 
> >>>>>>>> > >>>>>>>> I know that I'll need a CSR for this change, but I'd like to get > >> feedback > >>> in > >>>>>>> general and feedback about the flag names before creating a CSR. > >>>>>>>> I'd also be glad about feedback regarding the performance > impact. > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Martin > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> Command line: > >>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - > >>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > XX:+PrintInlining > >> - > >>>>>>> > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>> TestStack > >>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > >>>>> recursive > >>>>>>> inlining too deep > >>>>>>>> @ 11 TestStack::bogus_test (33 bytes) inline > >>>>>>>> caught java.lang.StackOverflowError > >>>>>>>> 1283 activations were on stack, sum = 0 > >>>>>>>> > >>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - > >>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > XX:+PrintInlining > >> - > >>>>>>> > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>> TestStack > >>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow > >>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) > >>>>> recursive > >>>>>>> inlining too deep > >>>>>>>> @ 11 TestStack::bogus_test (33 bytes) callee > uses > >>> too > >>>>>>> much stack > >>>>>>>> caught java.lang.StackOverflowError > >>>>>>>> 2310 activations were on stack, sum = 0 > >>>>>>>> > >>>>>>>> > >>>>>>>> TestStack.java: > >>>>>>>> public class TestStack { > >>>>>>>> > >>>>>>>> static long cnt = 0, > >>>>>>>> sum = 0; > >>>>>>>> > >>>>>>>> public static void bogus_test() { > >>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > >>>>>>>> sum += c1 + c2 + c3 + c4; > >>>>>>>> } > >>>>>>>> > >>>>>>>> public static void triggerStackOverflow() { > >>>>>>>> cnt++; > >>>>>>>> 
triggerStackOverflow(); > >>>>>>>> bogus_test(); > >>>>>>>> } > >>>>>>>> > >>>>>>>> > >>>>>>>> public static void main(String args[]) { > >>>>>>>> try { > >>>>>>>> triggerStackOverflow(); > >>>>>>>> } catch (StackOverflowError e) { > >>>>>>>> System.out.println("caught " + e); > >>>>>>>> } > >>>>>>>> System.out.println(cnt + " activations were on stack, sum = " > + > >>>>> sum); > >>>>>>>> } > >>>>>>>> } > >>>>>>>> > >>>>>> From derekw at marvell.com Wed May 13 20:20:37 2020 From: derekw at marvell.com (Derek White) Date: Wed, 13 May 2020 20:20:37 +0000 Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: Message-ID: Hi Xiaohong, This looks good to me (not a (R)eviewer). Thanks for including the patch for ThunderX ! This is a nice cleanup of the code and especially the volatile tests. I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side? Thanks, - Derek -----Original Message----- From: hotspot-compiler-dev On Behalf Of Xiaohong Gong Sent: Thursday, May 7, 2020 11:39 PM To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: [EXT] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option External Email ---------------------------------------------------------------------- Hi, Please help to review this patch which obsoletes the product flag "-XX:UseBarrierssForVolatile" and its related code: Webrev: https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Exgong_rfr_8243339_webrev.00_&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=gW0hANMfJfyELYt_X2mceubwzCNjT0vmaU97kngYUJk&m=OT2KWgejq1kc_YJt_NWWZBKwYqCqlRvzSlfRO04igpk&s=oI-OcgRUa25GiBZaU5V4OmuS8aewSxBaMLbKw3A7lnA&e= JBS: 
https://bugs.openjdk.java.net/browse/JDK-8243339
https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR)

As described in the CSR, using "-XX:+UseBarriersForVolatile" might have a memory consistency issue like the one mentioned in [1]. It needs more effort to fix the issue and maintain memory consistency in the future. Since "ldar/stlr" has worked well for a long time, and so has "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation for volatile access. Its only significant usage, on one kind of CPU, is also going to be removed (see [2]), so that CPU can work well without this option. So we directly obsolete this option and remove the code, rather than deprecating it first.

Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is no longer required (see [2]).
[1] https://bugs.openjdk.java.net/browse/JDK-8241137
[2] https://bugs.openjdk.java.net/browse/JDK-8242469

Testing:
Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
JCStress: tests-all

Thanks,
Xiaohong Gong

From vladimir.kozlov at oracle.com Wed May 13 21:33:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 13 May 2020 14:33:31 -0700
Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags
In-Reply-To:
References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com>
Message-ID: <702038f7-7942-9c94-c507-bd36241db180@oracle.com>

On 5/13/20 1:10 PM, Doerr, Martin wrote:
> Hi Vladimir,
>
> thanks for reviewing it.
>
>>> Should I set it to proposed?
>>
>> Yes.
> I've set it to "Finalized". Hope this was correct.
>
>>> I've added the new C1 flags to the tests which should test C1 compiler as well.
>>
>> Good. Why not do the same for C1MaxInlineSize?
> Looks like MaxInlineSize is only used by tests which test C2 specific things. So I think C1MaxInlineSize would be pointless.
> In addition to that, the C2 values are probably not appropriate for C1 in some tests.
> Would you like to have C1MaxInlineSize configured in some tests?

You are right in cases where tests switch off TieredCompilation and use only C2 (Test6792161.java) or test intrinsics.
But we can use it in Test5091921.java. C1 compiled the test code with the specified value before, so let's keep it.

And this is not related to these changes, but having range(0, max_jint) for all these flags is questionable. I think nobody ran tests with 0 or max_jint values. A bunch of tests may simply time out (which is understandable), but in the worst case they may crash instead of exiting gracefully.

Thanks,
Vladimir

>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: Vladimir Kozlov
>> Sent: Wednesday, 13 May 2020 21:46
>> To: Doerr, Martin; hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
>>
>> Hi Martin,
>>
>> On 5/11/20 6:32 AM, Doerr, Martin wrote:
>>> Hi Vladimir,
>>>
>>> are you ok with the updated CSR (https://bugs.openjdk.java.net/browse/JDK-8244507)?
>>> Should I set it to proposed?
>>
>> Yes.
>>
>>>
>>> Here's a new webrev with obsoletion + expiration for C2 flags in ClientVM:
>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/
>>>
>>> I've added the new C1 flags to the tests which should test C1 compiler as well.
>>
>> Good. Why not do the same for C1MaxInlineSize?
>>
>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which set C2 flags. I think this is the best solution because it still allows running the tests with GraalVM compiler.
>>
>> Yes.
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Best regards,
>>> Martin
>>>
>>>
>>>> -----Original Message-----
>>>> From: Doerr, Martin
>>>> Sent: Friday, 8 May 2020 23:07
>>>> To: Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net
>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags
>>>>
>>>> Hi Vladimir,
>>>>
>>>>> You need to update your CSR - add information about this and above code change. Example:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840
>>>> I've updated the CSR with obsolete and expired flags as in the example.
>>>>
>>>>> I would suggest to fix tests anyway (there are only few) because new
>>>>> warning output could be unexpected.
>>>> Ok.
I'll prepare a webrev with fixed tests. >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Freitag, 8. Mai 2020 21:43 >>>>> To: Doerr, Martin ; hotspot-compiler- >>>>> dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>> >>>>> Hi Martin >>>>> >>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> thanks a lot for looking at this, for finding the test issues and for >>>> reviewing >>>>> the CSR. >>>>>> >>>>>> For me, C2 is a fundamental part of the JVM. I would usually never >> build >>>>> without it ?? >>>>>> (Except if we want to use C1 + GraalVM compiler only.) >>>>> >>>>> Yes it is one of cases. >>>>> >>>>>> But your right, --with-jvm-variants=client configuration should still be >>>>> supported. >>>>> >>>>> Yes. >>>>> >>>>>> >>>>>> We can fix it by making the flags as obsolete if C2 is not included: >>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp >>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 >>>> 2020 >>>>> +0200 >>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 14:41:14 >>>>> 2020 +0200 >>>>>> @@ -562,6 +562,16 @@ >>>>>> { "dup option", JDK_Version::jdk(9), >>>> JDK_Version::undefined(), >>>>> JDK_Version::undefined() }, >>>>>> #endif >>>>>> >>>>>> +#ifndef COMPILER2 >>>>>> + // These flags were generally available, but are C2 only, now. 
>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> + { "InlineSmallCode", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> + { "MaxInlineSize", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> + { "FreqInlineSize", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>> +#endif >>>>>> + >>>>>> { NULL, JDK_Version(0), JDK_Version(0) } >>>>>> }; >>>>> >>>>> Right. I think you should do full process for these product flags >> deprecation >>>>> with obsoleting in JDK 16 for VM builds >>>>> which do not include C2. You need update your CSR - add information >>>> about >>>>> this and above code change. Example: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>> >>>>>> >>>>>> This makes the VM accept the flags with warning: >>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version >>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; >>>>> support was removed in 15.0 >>>>>> >>>>>> If we do it this way, the only test which I think should get fixed is >>>>> ReservedStackTest. >>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order to >>>>> preserve the inlining behavior. >>>>>> >>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>>>> compiler/c2 tests: Also written to test C2 specific things.) >>>>>> >>>>>> What do you think? >>>>> >>>>> I would suggest to fix tests anyway (there are only few) because new >>>>> warning output could be unexpected. >>>>> And it will be future-proof when warning will be converted into error >>>>> (if/when C2 goes away). 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-compiler-dev >>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>> >>>>>>> I would suggest to build VM without C2 and run tests. >>>>>>> >>>>>>> I grepped tests with these flags I found next tests where we need to >> fix >>>>>>> test's command (add >>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>>>> vm.compiler2.enabled or duplicate test for C1 with corresponding C1 >>>>>>> flags (by ussing additional @test block). >>>>>>> >>>>>>> runtime/ReservedStack/ReservedStackTest.java >>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>>>> compiler/c2/Test6792161.java >>>>>>> compiler/c2/Test5091921.java >>>>>>> >>>>>>> And there is issue with compiler/compilercontrol tests which use >>>>>>> InlineSmallCode and I am not sure how to handle: >>>>>>> >>>>>>> >>>>> >>>> >> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>>>> Hi Nils, >>>>>>>> >>>>>>>> thank you for looking at this and sorry for the late reply. >>>>>>>> >>>>>>>> I've added MaxTrivialSize and also updated the issue accordingly. >>>> Makes >>>>>>> sense. >>>>>>>> Do you have more flags in mind? >>>>>>>> >>>>>>>> Moving the flags which are only used by C2 into c2_globals definitely >>>>> makes >>>>>>> sense. >>>>>>>> >>>>>>>> Done in webrev.01: >>>>>>>> >> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>>>> >>>>>>>> Please take a look and let me know when my proposal is ready for a >>>> CSR. 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: hotspot-compiler-dev >>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Thanks for addressing this! This has been an annoyance for a long >>>> time. >>>>>>>>> >>>>>>>>> Have you though about including other flags - like MaxTrivialSize? >>>>>>>>> MaxInlineSize is tested against it. >>>>>>>>> >>>>>>>>> Also - you should move the flags that are now c2-only to >>>>> c2_globals.hpp. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Nils Eliasson >>>>>>>>> >>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK-8234863 >>>> we >>>>>>> had >>>>>>>>> discussed impact on C1. >>>>>>>>>> I still think it's bad to share them between both compilers. We >> may >>>>> want >>>>>>> to >>>>>>>>> do further C2 tuning without negative impact on C1 in the future. >>>>>>>>>> >>>>>>>>>> C1 has issues with substantial inlining because of the lack of >>>>> uncommon >>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and code >>>> cache >>>>>>> space >>>>>>>>> may get wasted for cold or even never executed code. The >> situation >>>>> gets >>>>>>>>> worse when many patching stubs get used for such code. >>>>>>>>>> >>>>>>>>>> I had opened the following issue: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>>>> >>>>>>>>>> And my initial proposal is here: >>>>>>>>>> >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. 
>>>>>>>>>> I have a simple example which shows wasted stack space (java >>>>> example >>>>>>>>> TestStack at the end). >>>>>>>>>> >>>>>>>>>> It simply counts stack frames until a stack overflow occurs. With >> the >>>>>>> current >>>>>>>>> implementation, only 1283 frames fit on the stack because the >> never >>>>>>>>> executed method bogus_test with local variables gets inlined. >>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we >> get >>>>>>> 2310 >>>>>>>>> frames until stack overflow. (I only used C1 for this example. Can >> be >>>>>>>>> reproduced as shown below.) >>>>>>>>>> >>>>>>>>>> I didn't notice any performance regression even with the >> aggressive >>>>>>> setting >>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>>>>>> >>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>> feedback >>>>> in >>>>>>>>> general and feedback about the flag names before creating a CSR. >>>>>>>>>> I'd also be glad about feedback regarding the performance >> impact. 
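A rough back-of-the-envelope reading of the two frame counts quoted above (1283 and 2310), sketched in Java. It assumes, simplistically, that the whole 256 KB thread stack from the -Xss256k reproduction setting is available to the recursive activations; guard pages and VM-internal frames are ignored, so the per-activation figures are only indicative. The class and method names are hypothetical and not part of the proposal.

```java
public class StackPerActivationSketch {

    /** Average stack bytes consumed per activation, under the simplifying assumption above. */
    static long bytesPerActivation(long stackBytes, long activations) {
        return stackBytes / activations;
    }

    public static void main(String[] args) {
        long stackBytes = 256L * 1024L;   // -Xss256k
        long withInlining = 1283;         // frame count reported when bogus_test is inlined
        long withoutInlining = 2310;      // frame count reported with the reduced C1InlineStackLimit

        System.out.println("bytes/activation, bogus_test inlined:     "
                + bytesPerActivation(stackBytes, withInlining));
        System.out.println("bytes/activation, bogus_test not inlined: "
                + bytesPerActivation(stackBytes, withoutInlining));
        System.out.println("recursion depth ratio: "
                + (double) withoutInlining / withInlining);
    }
}
```

Under that assumption, inlining the never-executed callee costs roughly 90 extra bytes of stack per activation in this example.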
>>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Martin >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Command line: >>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 - >>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >> XX:+PrintInlining >>>> - >>>>>>>>> >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>> TestStack >>>>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >>>>>>> recursive >>>>>>>>> inlining too deep >>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) inline >>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>>>> >>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 - >>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >> XX:+PrintInlining >>>> - >>>>>>>>> >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>> TestStack >>>>>>>>>> CompileCommand: compileonly TestStack.triggerStackOverflow >>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 bytes) >>>>>>> recursive >>>>>>>>> inlining too deep >>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) callee >> uses >>>>> too >>>>>>>>> much stack >>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> TestStack.java: >>>>>>>>>> public class TestStack { >>>>>>>>>> >>>>>>>>>> static long cnt = 0, >>>>>>>>>> sum = 0; >>>>>>>>>> >>>>>>>>>> public static void bogus_test() { >>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>>>>>> sum += c1 + c2 + c3 + c4; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> public static void triggerStackOverflow() { >>>>>>>>>> cnt++; >>>>>>>>>> triggerStackOverflow(); >>>>>>>>>> bogus_test(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> public static void main(String args[]) { >>>>>>>>>> try { >>>>>>>>>> triggerStackOverflow(); >>>>>>>>>> } catch (StackOverflowError e) { >>>>>>>>>> 
System.out.println("caught " + e); >>>>>>>>>> } >>>>>>>>>> System.out.println(cnt + " activations were on stack, sum = " >> + >>>>>>> sum); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>> From serguei.spitsyn at oracle.com Wed May 13 22:31:47 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 13 May 2020 15:31:47 -0700 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> Message-ID: <03c9a0ce-8f78-00e7-9db3-70d6f6cb8156@oracle.com> Hi Richard, Thank you for the bug report update - it is helpful. The fix/update looks good in general but I need more time to check some points. I'm thinking it would be more safe to run full tier5. I can do it after you get all thumbs ups. Thanks, Serguei On 4/24/20 01:18, Reingruber, Richard wrote: > Hi Patricio, Vladimir, and Serguei, > > now that direct handshakes are available, I've updated the patch to make use of them. > > In addition I have done some clean-up changes I missed in the first webrev. > > Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake > into the vm operation VM_SetFramePop [1] > > Kindly review again: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > > I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a > direct handshake: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 > > Testing: > > * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 > > Thanks, > Richard. 
> > [1] An assertion in Handshake::execute_direct() fails if called by the VMThread, because it is not a JavaThread. > > -----Original Message----- > From: hotspot-dev On Behalf Of Reingruber, Richard > Sent: Friday, 14 February 2020 19:47 > To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Patricio, > > > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a > > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the > > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? > > > > > > > Alternatively I think you could do something similar to what we do in > > > > Deoptimization::deoptimize_all_marked(): > > > > > > > > EnterInterpOnlyModeClosure hs; > > > > if (SafepointSynchronize::is_at_safepoint()) { > > > > hs.do_thread(state->get_thread()); > > > > } else { > > > > Handshake::execute(&hs, state->get_thread()); > > > > } > > > > (you could pass "EnterInterpOnlyModeClosure" directly to the > > > > HandshakeClosure() constructor) > > > > > > Maybe this could be used also in the Handshake::execute() methods as general solution? > > Right, we could also do that. Avoiding to clear the polling page in > > HandshakeState::clear_handshake() should be enough to fix this issue and > > execute a handshake inside a safepoint, but adding that "if" statement > > in Handshake::execute() sounds good to avoid all the extra code that we > > go through when executing a handshake. I filed 8239084 to make that change. > > Thanks for taking care of this and creating the RFE.
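The if/else quoted above ("run the closure directly when already at a safepoint, otherwise request a handshake") can be modeled outside the VM. What follows is a toy Java sketch of that dispatch shape only, not HotSpot code: ToyVm, ToyThread and ThreadClosure are hypothetical stand-ins, and the "handshake" here simply runs the closure, whereas a real handshake has to synchronize with the target thread.

```java
import java.util.ArrayList;
import java.util.List;

public class HandshakeOrDirectSketch {

    /** Work to run against one target thread (hypothetical stand-in for a handshake closure). */
    interface ThreadClosure {
        void doThread(ToyThread target);
    }

    /** Toy stand-in for a Java thread; it only tracks the flag this example cares about. */
    static final class ToyThread {
        boolean interpOnly;
    }

    /** Toy stand-in for VM state; records which path each request took. */
    static final class ToyVm {
        boolean atSafepoint;
        final List<String> paths = new ArrayList<>();

        void run(ThreadClosure closure, ToyThread target) {
            if (atSafepoint) {
                // Every thread is already stopped, so the closure runs directly.
                paths.add("direct");
                closure.doThread(target);
            } else {
                // Otherwise go through the (here simulated) handshake.
                paths.add("handshake");
                handshake(closure, target);
            }
        }

        private void handshake(ThreadClosure closure, ToyThread target) {
            // Simplification: no synchronization with the target, the toy just runs the closure.
            closure.doThread(target);
        }
    }

    public static void main(String[] args) {
        ToyVm vm = new ToyVm();
        ToyThread thread = new ToyThread();
        ThreadClosure enterInterpOnly = target -> target.interpOnly = true;

        vm.run(enterInterpOnly, thread);   // not at a safepoint: handshake path
        vm.atSafepoint = true;
        vm.run(enterInterpOnly, thread);   // at a safepoint: direct path

        System.out.println(vm.paths + " interpOnly=" + thread.interpOnly);
    }
}
```

The point of the shape is that the caller never requests a handshake while a safepoint operation is in progress, which is the nesting the thread identifies as problematic.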
> > > > > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is > > > > always called in a nested operation or just sometimes. > > > > > > At least one execution path without vm operation exists: > > > > > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void > > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong > > > JvmtiEventControllerPrivate::recompute_enabled() : void > > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) > > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void > > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError > > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError > > > > > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a > > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further > > > encouraged to do it with a handshake :) > > Ah! I think you can still do it with a handshake with the > > Deoptimization::deoptimize_all_marked() like solution. I can change the > > if-else statement with just the Handshake::execute() call in 8239084. > > But up to you. : ) > > Well, I think that's enough encouragement :) > I'll wait for 8239084 and try then again. > (no urgency and all) > > Thanks, > Richard. > > -----Original Message----- > From: Patricio Chilano > Sent: Friday, 14 February 2020 15:54 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > On 2/14/20 9:58 AM, Reingruber, Richard wrote: >> Hi Patricio, >> >> thanks for having a look.
>> >> > I'm only commenting on the handshake changes. >> > I see that operation VM_EnterInterpOnlyMode can be called inside >> > operation VM_SetFramePop which also allows nested operations. Here is a >> > comment in VM_SetFramePop definition: >> > >> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >> > >> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >> > could have a handshake inside a safepoint operation. The issue I see >> > there is that at the end of the handshake the polling page of the target >> > thread could be disarmed. So if the target thread happens to be in a >> > blocked state just transiently and wakes up then it will not stop for >> > the ongoing safepoint. Maybe I can file an RFE to assert that the >> > polling page is armed at the beginning of disarm_safepoint(). >> >> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >> >> > Alternatively I think you could do something similar to what we do in >> > Deoptimization::deoptimize_all_marked(): >> > >> > EnterInterpOnlyModeClosure hs; >> > if (SafepointSynchronize::is_at_safepoint()) { >> > hs.do_thread(state->get_thread()); >> > } else { >> > Handshake::execute(&hs, state->get_thread()); >> > } >> > (you could pass "EnterInterpOnlyModeClosure" directly to the >> > HandshakeClosure() constructor) >> >> Maybe this could be used also in the Handshake::execute() methods as general solution? > Right, we could also do that.
Avoiding to clear the polling page in > HandshakeState::clear_handshake() should be enough to fix this issue and > execute a handshake inside a safepoint, but adding that "if" statement > in Handshake::execute() sounds good to avoid all the extra code that we > go through when executing a handshake. I filed 8239084 to make that change. > >> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >> > always called in a nested operation or just sometimes. >> >> At least one execution path without vm operation exists: >> >> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >> JvmtiEventControllerPrivate::recompute_enabled() : void >> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >> >> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >> handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >> encouraged to do it with a handshake :) > Ah! I think you can still do it with a handshake with the > Deoptimization::deoptimize_all_marked() like solution. I can change the > if-else statement with just the Handshake::execute() call in 8239084. > But up to you. : ) > > Thanks, > Patricio >> Thanks again, >> Richard. >> >> -----Original Message----- >> From: Patricio Chilano >> Sent: Thursday, 13
February 2020 18:47 >> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> I'm only commenting on the handshake changes. >> I see that operation VM_EnterInterpOnlyMode can be called inside >> operation VM_SetFramePop which also allows nested operations. Here is a >> comment in VM_SetFramePop definition: >> >> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >> >> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >> could have a handshake inside a safepoint operation. The issue I see >> there is that at the end of the handshake the polling page of the target >> thread could be disarmed. So if the target thread happens to be in a >> blocked state just transiently and wakes up then it will not stop for >> the ongoing safepoint. Maybe I can file an RFE to assert that the >> polling page is armed at the beginning of disarm_safepoint(). >> >> I think one option could be to remove >> SafepointMechanism::disarm_if_needed() in >> HandshakeState::clear_handshake() and let each JavaThread disarm itself >> for the handshake case. >> >> Alternatively I think you could do something similar to what we do in >> Deoptimization::deoptimize_all_marked(): >> >>   EnterInterpOnlyModeClosure hs; >>   if (SafepointSynchronize::is_at_safepoint()) { >>     hs.do_thread(state->get_thread()); >>   } else { >>     Handshake::execute(&hs, state->get_thread()); >>   } >> (you could pass "EnterInterpOnlyModeClosure"
directly to the >> HandshakeClosure() constructor) >> >> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >> always called in a nested operation or just sometimes. >> >> Thanks, >> Patricio >> >> On 2/12/20 7:23 AM, Reingruber, Richard wrote: >>> // Repost including hotspot runtime and gc lists. >>> // Dean Long suggested to do so, because the enhancement replaces a vm operation >>> // with a handshake. >>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>> >>> Hi, >>> >>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>> >>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>> >>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>> >>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>> >>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>> >>> Thanks, Richard. >>> >>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From Xiaohong.Gong at arm.com Thu May 14 01:17:44 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Thu, 14 May 2020 01:17:44 +0000 Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: Message-ID: Hi Derek, Thanks for your review! > I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side? Yes, sure! I have checked the impact for Graal compiler. The removed interfaces are not used in Graal compiler.
The only used one is the CPU feature "DMB_ATOMICS", and it's used in Graal native-image. I will create another patch for graal substratevm to remove it. Thanks, Xiaohong -----Original Message----- From: Derek White Sent: Thursday, May 14, 2020 4:21 AM To: Xiaohong Gong ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: RE: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option Hi Xiaohong, This looks good to me (not a (R)eviewer). Thanks for including the patch for ThunderX! This is a nice cleanup of the code and especially the volatile tests. I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side? Thanks, - Derek -----Original Message----- From: hotspot-compiler-dev On Behalf Of Xiaohong Gong Sent: Thursday, May 7, 2020 11:39 PM To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: [EXT] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option External Email ---------------------------------------------------------------------- Hi, Please help to review this patch which obsoletes the product flag "-XX:UseBarriersForVolatile" and its related code: Webrev: http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8243339
https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR) As described in the CSR, using "-XX:+UseBarriersForVolatile" might cause memory consistency issues like the one mentioned in [1]. Fixing that issue and maintaining memory consistency in the future would need more effort. Since "ldar/stlr" has worked well for a long time, and so has "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation of volatile access. Its only significant usage, on one kind of CPU, is also going to be removed (see [2]), so everything can work well without this option. We therefore obsolete this option directly and remove the code, rather than deprecating it first. Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is no longer required (see [2]).
[1] https://bugs.openjdk.java.net/browse/JDK-8241137 [2] https://bugs.openjdk.java.net/browse/JDK-8242469 Testing: Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1 JCStress: tests-all Thanks, Xiaohong Gong From manc at google.com Thu May 14 02:48:20 2020 From: manc at google.com (Man Cao) Date: Wed, 13 May 2020 19:48:20 -0700 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> Message-ID: Hi Nils, I have done more DaCapo benchmarking with the patches. Overall, the results look good, and your fix indeed reduces sweep frequency compared with the current state. It retains the possible performance improvement and does not introduce an unnecessary increase in code cache usage. All results are available at https://cr.openjdk.java.net/~manc/8244660_benchmarks/. I have also included counters for used code cache size and sweeper statistics in the graphs. These metrics are collected using this patch: https://cr.openjdk.java.net/~manc/8244660_benchmarks/hsperfcounters_webrev/ All runs are with "-Xms4g -Xmx4g -XX:-TieredCompilation", because -TieredCompilation matters a lot for our workload. Also note that the numbers for throughput/CPU and GC exclude the warmup iterations. The codecache/sweeper statistics account for all iterations (including warmups).
Comparing 3 JDK builds: https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-sweeperPatches.html base: current state with no pending patches allFixes: with patches for JDK-8244660, JDK-8244278 and JDK-8244658 sweepAt90: with only the patch for JDK-8244278, so it's the same as the config I used in previous results in JDK-8244278. "allFixes" has a lower sweep frequency than "base", without introducing much increase in code cache usage. Same as above, but with -XX:ReservedCodeCacheSize=40m: https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200512-JDKHead-dacapoLarge4G-sweeperPatches-CodeCache40MB.html "allFixes" retains the throughput and CPU improvement for tradesoap, perhaps even better than not sweeping ("sweepAt90"). Code cache usage for tradesoap is between "base" and not sweeping, which is OK in my opinion. I think 1/100 of a 240mb default code cache seems a bit high. During > startup we produce a lot of L3 code that will be thrown away. We want to > recycle it fairly quickly, to avoid fragmenting the code cache, but not > that often that we affect startup. > I've done some startup measurements, and then we sweep about every other > second in a benchmark that produces a lot of code. > What results are you seeing? The 1/256 capped at 1MB seems OK. Even with 40MB or 48MB code cache size with -TieredCompilation, it does not flush too frequently. Code cache flushing has another heuristic - it might be broken too. But > it would be interesting to see how it works with the new sweep > heuristic. If you know that you have enough code cache - turning it off > is no loss. It only helps when you are running out of code cache. > When we are doing normal sweeping - we don't deoptimize cold code. That > is handled by the method flushing - it should only kick in when we start > to run out of code cache. I think we should address MethodFlushing in a separate RFE/BUG. Thanks for explaining this.
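For a feel of the two thresholds mentioned above ("1/100 of a 240mb default code cache" versus "1/256 capped at 1MB"), here is a small Java sketch of the arithmetic. It only restates the fractions as they are worded in this message; it is not taken from the patch, and the class and method names are hypothetical.

```java
public class SweepThresholdSketch {
    static final long MB = 1024L * 1024L;

    /** 1/100 of the reserved code cache, as in the quoted "1/100" remark. */
    static long oneHundredth(long reservedCodeCacheBytes) {
        return reservedCodeCacheBytes / 100;
    }

    /** 1/256 of the reserved code cache, capped at 1 MB, as worded in the message. */
    static long oneOver256CappedAt1Mb(long reservedCodeCacheBytes) {
        return Math.min(reservedCodeCacheBytes / 256, MB);
    }

    public static void main(String[] args) {
        long[] codeCacheSizesMb = {40, 48, 240, 512};
        for (long sizeMb : codeCacheSizesMb) {
            long bytes = sizeMb * MB;
            System.out.println(sizeMb + " MB code cache: 1/100 = "
                    + oneHundredth(bytes) / 1024 + " KB, 1/256 capped = "
                    + oneOver256CappedAt1Mb(bytes) / 1024 + " KB");
        }
    }
}
```

With the 240 MB default, the 1/256 rule stays just under the 1 MB cap, while the small 40 MB and 48 MB caches used in the benchmarks get thresholds of only a few hundred KB or less.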
I did some benchmarking with -XX:NmethodSweepActivity and -XX:MinPassesBeforeFlush, on top of the "allFixes" config: https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-NmethodSweepActivity.html https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-MinPassesBeforeFlush.html xalan, jython look better with small values, pmd looks worse. I'll follow up separately if I find anything wrong with the flushing/cold-code-deoptimization heuristic The heuristics for CodeAging may have been negatively affected by the > transition to handshakes. Also the SetHotnessClosure should be replaced > by a mechanism using the NMethodEntry barriers. > I see that we are missing JFR events for MethodFlushing. I have created > another patch for that. Although I'm not very familiar with these, thanks for identifying and fixing these issues! -Man From derekw at marvell.com Thu May 14 03:21:40 2020 From: derekw at marvell.com (Derek White) Date: Thu, 14 May 2020 03:21:40 +0000 Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: Message-ID: OK, that sounds good to me. - Derek -----Original Message----- From: Xiaohong Gong Sent: Wednesday, May 13, 2020 9:18 PM To: Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: RE: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option Hi Derek, Thanks for your review! > I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side? Yes, sure! I have checked the impact for Graal compiler. The removed interfaces are not used in Graal compiler. The only used one is the CPU feature "DMB_ATOMICS", and it's used in Graal native-image. I will create another patch for graal substratevm to remove it. 
Thanks,
Xiaohong

-----Original Message-----
From: Derek White
Sent: Thursday, May 14, 2020 4:21 AM
To: Xiaohong Gong; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd
Subject: RE: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option

Hi Xiaohong,

This looks good to me (not a (R)eviewer). Thanks for including the patch for ThunderX! This is a nice cleanup of the code and especially the volatile tests.

I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side?

Thanks,
- Derek

-----Original Message-----
From: hotspot-compiler-dev On Behalf Of Xiaohong Gong
Sent: Thursday, May 7, 2020 11:39 PM
To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd
Subject: [EXT] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option

Hi,

Please help to review this patch which obsoletes the product flag "-XX:UseBarriersForVolatile" and its related code:

Webrev: http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8243339
     https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR)

As described in the CSR, using "-XX:+UseBarriersForVolatile" might have a memory consistency issue like the one mentioned in [1]. Fixing that issue and maintaining memory consistency in the future would need more effort. Since "ldar/stlr" has worked well for a long time, and so does "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation for the volatile access. Since its only significant use, on one kind of CPU, is also going to be removed (see [2]), things can work well without this option. So we directly obsolete this option and remove the code, rather than deprecating it first.

Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is not required anymore (see [2]).

[1] https://bugs.openjdk.java.net/browse/JDK-8241137
[2] https://bugs.openjdk.java.net/browse/JDK-8242469

Testing:
Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
JCStress: tests-all

Thanks,
Xiaohong Gong

From david.holmes at oracle.com Thu May 14 05:29:22 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 14 May 2020 15:29:22 +1000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References: <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com>
 <36d5e2c0-c724-7ff7-d37e-decb5cc0005b@oracle.com>
Message-ID: <9e64b51d-8ac8-9e9a-1f89-52ca897932a4@oracle.com>

> Still not a review, or is it now?

I'd say still not a review as I'm only looking at the general structure.

Cheers,
David

On 14/05/2020 1:37 am, Reingruber, Richard wrote:
> Hi David,
>
>> On 4/05/2020 8:33 pm, Reingruber, Richard wrote:
>>> // Trimmed the list of recipients. If the list gets too long then the message needs to be approved
>>> // by a moderator.
>
>> Yes I noticed that too :) In general if you send to hotspot-dev you
>> shouldn't need to also send to hotspot-X-dev.
>
> Makes sense. Will do so next time.
>
>>>
>>> This would be the post with the current webrev.1
>>>
>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html
>
>> Sorry I missed that update. Okay so this is working with direct
>> handshakes now.
>
>> One style nit in jvmtiThreadState.cpp:
>
>> assert(SafepointSynchronize::is_at_safepoint() ||
>> ! (JavaThread *)Thread::current() == get_thread() ||
>> ! Thread::current() == get_thread()->active_handshaker(),
>> ! "bad synchronization with owner thread");
>
>> the ! lines should indent as follows
>
>> assert(SafepointSynchronize::is_at_safepoint() ||
>> (JavaThread *)Thread::current() == get_thread() ||
>> Thread::current() == get_thread()->active_handshaker(),
>> ! "bad synchronization with owner thread");
>
> Sure.
>
>> Let's see how this plays out.
>
> Hopefully not too bad... :)
>
>>> Not a review but some general commentary ...
>
> Still not a review, or is it now?
>
> Thanks, Richard.
>
> -----Original Message-----
> From: David Holmes
> Sent: Mittwoch, 13.
Mai 2020 07:43 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > On 4/05/2020 8:33 pm, Reingruber, Richard wrote: >> // Trimmed the list of recipients. If the list gets too long then the message needs to be approved >> // by a moderator. > > Yes I noticed that too :) In general if you send to hotspot-dev you > shouldn't need to also send to hotspot-X-dev. > >> Hi David, > > Hi Richard, > >>> On 28/04/2020 12:09 am, Reingruber, Richard wrote: >>>> Hi David, >>>> >>>>> Not a review but some general commentary ... >>>> >>>> That's welcome. >> >>> Having had to take an even closer look now I have a review comment too :) >> >>> src/hotspot/share/prims/jvmtiThreadState.cpp >> >>> void JvmtiThreadState::invalidate_cur_stack_depth() { >>> ! assert(SafepointSynchronize::is_at_safepoint() || >>> ! (Thread::current()->is_VM_thread() && >>> get_thread()->is_vmthread_processing_handshake()) || >>> (JavaThread *)Thread::current() == get_thread(), >>> "must be current thread or at safepoint"); >> >> You're looking at an outdated webrev, I'm afraid. >> >> This would be the post with the current webrev.1 >> >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html > > Sorry I missed that update. Okay so this is working with direct > handshakes now. > > One style nit in jvmtiThreadState.cpp: > > assert(SafepointSynchronize::is_at_safepoint() || > ! (JavaThread *)Thread::current() == get_thread() || > ! Thread::current() == get_thread()->active_handshaker(), > ! "bad synchronization with owner thread"); > > the ! 
lines should ident as follows > > assert(SafepointSynchronize::is_at_safepoint() || > (JavaThread *)Thread::current() == get_thread() || > Thread::current() == get_thread()->active_handshaker(), > ! "bad synchronization with owner thread"); > > Lets see how this plays out. > > Cheers, > David > >> Thanks, Richard. >> >> -----Original Message----- >> From: David Holmes >> Sent: Montag, 4. Mai 2020 08:51 >> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> On 28/04/2020 12:09 am, Reingruber, Richard wrote: >>> Hi David, >>> >>>> Not a review but some general commentary ... >>> >>> That's welcome. >> >> Having had to take an even closer look now I have a review comment too :) >> >> src/hotspot/share/prims/jvmtiThreadState.cpp >> >> void JvmtiThreadState::invalidate_cur_stack_depth() { >> ! assert(SafepointSynchronize::is_at_safepoint() || >> ! (Thread::current()->is_VM_thread() && >> get_thread()->is_vmthread_processing_handshake()) || >> (JavaThread *)Thread::current() == get_thread(), >> "must be current thread or at safepoint"); >> >> The message needs updating to include handshakes. >> >> More below ... >> >>>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>>> Hi Yasumasa, Patricio, >>>>> >>>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>>> >>>>>>> I think it would not help that much. 
Note that when replacing VM_SetFramePop with a direct handshake >>>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>>> >>>>>> Thanks for your information. >>>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>>> I will modify and will test it after yours. >>>>> >>>>> Thanks :) >>>>> >>>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>>> to me, how this has to be handled. >>>>> >>>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>>> >>>>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >>> >>>> I'm growing increasingly concerned that use of direct handshakes to >>>> replace VM operations needs a much greater examination for correctness >>>> than might initially be thought. I see a number of issues: >>> >>> I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. >>> >>> In addition I would suggest to take the general part of the discussion to a dedicated thread or to >>> the review thread for JDK-8242427. I would like to keep this thread closer to its subject. 
>> >> I will focus on the issues in the context of this particular change >> then, though the issues themselves are applicable to all handshake >> situations (and more so with direct handshakes). This is mostly just >> discussion. >> >>>> First, the VMThread executes (most) VM operations with a clean stack in >>>> a clean state, so it has lots of room to work. If we now execute the >>>> same logic in a JavaThread then we risk hitting stackoverflows if >>>> nothing else. But we are also now executing code in a JavaThread and so >>>> we have to be sure that code is not going to act differently (in a bad >>>> way) if executed by a JavaThread rather than the VMThread. For example, >>>> may it be possible that if executing in the VMThread we defer some >>>> activity that might require execution of Java code, or else hand it off >>>> to one of the service threads? If we execute that code directly in the >>>> current JavaThread instead we may not be in a valid state (e.g. consider >>>> re-entrancy to various subsystems that is not allowed). >>> >>> It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a >>> paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a >>> synchronization point of view. >> >> Just to be clear, your proposed change is not using a direct handshake. >> >>> Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of >>> the deopt handler. >>> >>> I can't see why this cannot be done with a direct handshake. Something very similar is already done >>> in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. >> >> Note that existing non-direct handshakes may also have issues that not >> have been fully investigated. >> >>> The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. 
>> >> For the target thread if you use more stack than would be used stopping >> at a safepoint then you are at risk. For the thread initiating the >> direct handshake if you use more stack than would be used enqueuing a VM >> operation, then you are at risk. As we have not quantified these >> numbers, nor have any easy way to establish the stack use of the actual >> code to be executed, we're really just hoping for the best. This is a >> general problem with handshakes that needs to be investigated more >> deeply. As a simple, general, example just imagine if the code involves >> logging that might utilise an on-stack buffer. >> >>>> Second, we have this question mark over what happens if the operation >>>> hits further safepoint or handshake polls/checks? Are there constraints >>>> on what is allowed here? How can we recognise this problem may exist and >>>> so deal with it? >>> >>> The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I >>> tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. >> >> That's good to hear but such tests are not exhaustive, they will detect >> if you do reach a safepoint/handshake but they can't prove that you >> cannot reach one. What you have done is necessary but may not be >> sufficient. Plus you didn't actually add the NSV to the code - is there >> a reason we can't actually keep it in do_thread? (I'm not sure if the >> NSV also acts as a NoHandshakeVerifier?) >> >>>> Third, while we are generally considering what appear to be >>>> single-thread operations, which should be amenable to a direct >>>> handshake, we also have to be careful that some of the code involved >>>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>>> not need to take a lock where a direct handshake might! >>> >>> See again my arguments in the JBS item [1]. >> >> Yes I see the reasoning and that is good. 
My point is a general one as >> it may not be obvious when such assumptions exist in the current code. >> >> Thanks, >> David >> >>> Thanks, >>> Richard. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8238585 >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Montag, 27. April 2020 07:16 >>> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi all, >>> >>> Not a review but some general commentary ... >>> >>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>> Hi Yasumasa, Patricio, >>>> >>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>> >>>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>> >>>>> Thanks for your information. >>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>> I will modify and will test it after yours. >>>> >>>> Thanks :) >>>> >>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>> replace VM_SetFramePop with a direct handshake. E.g. 
VM_SetFramePop::doit() indirectly calls >>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>> to me, how this has to be handled. >>>> >>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >>> >>> I'm growing increasingly concerned that use of direct handshakes to >>> replace VM operations needs a much greater examination for correctness >>> than might initially be thought. I see a number of issues: >>> >>> First, the VMThread executes (most) VM operations with a clean stack in >>> a clean state, so it has lots of room to work. If we now execute the >>> same logic in a JavaThread then we risk hitting stackoverflows if >>> nothing else. But we are also now executing code in a JavaThread and so >>> we have to be sure that code is not going to act differently (in a bad >>> way) if executed by a JavaThread rather than the VMThread. For example, >>> may it be possible that if executing in the VMThread we defer some >>> activity that might require execution of Java code, or else hand it off >>> to one of the service threads? If we execute that code directly in the >>> current JavaThread instead we may not be in a valid state (e.g. consider >>> re-entrancy to various subsystems that is not allowed). >>> >>> Second, we have this question mark over what happens if the operation >>> hits further safepoint or handshake polls/checks? Are there constraints >>> on what is allowed here? How can we recognise this problem may exist and >>> so deal with it? 
>>> >>> Third, while we are generally considering what appear to be >>> single-thread operations, which should be amenable to a direct >>> handshake, we also have to be careful that some of the code involved >>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>> not need to take a lock where a direct handshake might! >>> >>> Cheers, >>> David >>> ----- >>> >>>> @Patricio, coming back to my question [1]: >>>> >>>> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >>>> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >>>> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >>>> handshakee would be safepoint safe, wouldn't it? >>>> >>>> Thanks, Richard. >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >>>> >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Freitag, 24. April 2020 17:23 >>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> On 2020/04/24 23:44, Reingruber, Richard wrote: >>>>> Hi Yasumasa, >>>>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? 
I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. >>>> >>>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>>>> >>>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>>> >>>>> Would be interesting to see how you handled the issues above :) >>>>> >>>>> Thanks, Richard. >>>>> >>>>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>>>> >>>>> -----Original Message----- >>>>> From: Yasumasa Suenaga >>>>> Sent: Freitag, 24. 
April 2020 13:34 >>>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Richard, >>>>> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>>>> Hi Patricio, Vladimir, and Serguei, >>>>>> >>>>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>>>> >>>>>> In addition I have done some clean-up changes I missed in the first webrev. >>>>>> >>>>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>>>> into the vm operation VM_SetFramePop [1] >>>>>> >>>>>> Kindly review again: >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>>>> >>>>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>>>> direct handshake: >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>> >>>>>> Testing: >>>>>> >>>>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>>> >>>>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>>>> >>>>>> Thanks, >>>>>> Richard. 
>>>>>>
>>>>>> [1] An assertion in Handshake::execute_direct() fails if called by the VMThread, because it is not a JavaThread.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: hotspot-dev On Behalf Of Reingruber, Richard
>>>>>> Sent: Freitag, 14. Februar 2020 19:47
>>>>>> To: Patricio Chilano; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>>>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
>>>>>>
>>>>>> Hi Patricio,
>>>>>>
>>>>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a
>>>>>> > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the
>>>>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?
>>>>>> > >
>>>>>> > > > Alternatively I think you could do something similar to what we do in
>>>>>> > > > Deoptimization::deoptimize_all_marked():
>>>>>> > > >
>>>>>> > > >   EnterInterpOnlyModeClosure hs;
>>>>>> > > >   if (SafepointSynchronize::is_at_safepoint()) {
>>>>>> > > >     hs.do_thread(state->get_thread());
>>>>>> > > >   } else {
>>>>>> > > >     Handshake::execute(&hs, state->get_thread());
>>>>>> > > >   }
>>>>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the
>>>>>> > > > HandshakeClosure() constructor)
>>>>>> > >
>>>>>> > > Maybe this could be used also in the Handshake::execute() methods as general solution?
>>>>>> > Right, we could also do that. Not clearing the polling page in
>>>>>> > HandshakeState::clear_handshake() should be enough to fix this issue and
>>>>>> > execute a handshake inside a safepoint, but adding that "if" statement
>>>>>> > in Handshake::execute() sounds good to avoid all the extra code that we
>>>>>> > go through when executing a handshake.
>>>>>> > I filed 8239084 to make that change.
>>>>>>
>>>>>> Thanks for taking care of this and creating the RFE.
>>>>>>
>>>>>> >
>>>>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>>>>>> > > > always called in a nested operation or just sometimes.
>>>>>> > >
>>>>>> > > At least one execution path without vm operation exists:
>>>>>> > >
>>>>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void
>>>>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong
>>>>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void
>>>>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches)
>>>>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void
>>>>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError
>>>>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError
>>>>>> > >
>>>>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a
>>>>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further
>>>>>> > > encouraged to do it with a handshake :)
>>>>>> > Ah! I think you can still do it with a handshake with the
>>>>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the
>>>>>> > if-else statement with just the Handshake::execute() call in 8239084.
>>>>>> > But up to you. : )
>>>>>>
>>>>>> Well, I think that's enough encouragement :)
>>>>>> I'll wait for 8239084 and try then again.
>>>>>> (no urgency and all)
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Patricio Chilano
>>>>>> Sent: Freitag, 14. Februar 2020 15:54
>>>>>> To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote:
>>>>>>> Hi Patricio,
>>>>>>>
>>>>>>> thanks for having a look.
>>>>>>>
>>>>>>> > I'm only commenting on the handshake changes.
>>>>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside
>>>>>>> > operation VM_SetFramePop which also allows nested operations. Here is a
>>>>>>> > comment in VM_SetFramePop definition:
>>>>>>> >
>>>>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
>>>>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
>>>>>>> >
>>>>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
>>>>>>> > could have a handshake inside a safepoint operation. The issue I see
>>>>>>> > there is that at the end of the handshake the polling page of the target
>>>>>>> > thread could be disarmed. So if the target thread happens to be in a
>>>>>>> > blocked state just transiently and wakes up then it will not stop for
>>>>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the
>>>>>>> > polling page is armed at the beginning of disarm_safepoint().
>>>>>>>
>>>>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a
>>>>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the
>>>>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?
>>>>>>> >>>>>>> > Alternatively I think you could do something similar to what we do in >>>>>>> > Deoptimization::deoptimize_all_marked(): >>>>>>> > >>>>>>> > EnterInterpOnlyModeClosure hs; >>>>>>> > if (SafepointSynchronize::is_at_safepoint()) { >>>>>>> > hs.do_thread(state->get_thread()); >>>>>>> > } else { >>>>>>> > Handshake::execute(&hs, state->get_thread()); >>>>>>> > } >>>>>>> > (you could pass ?EnterInterpOnlyModeClosure? directly to the >>>>>>> > HandshakeClosure() constructor) >>>>>>> >>>>>>> Maybe this could be used also in the Handshake::execute() methods as general solution? >>>>>> Right, we could also do that. Avoiding to clear the polling page in >>>>>> HandshakeState::clear_handshake() should be enough to fix this issue and >>>>>> execute a handshake inside a safepoint, but adding that "if" statement >>>>>> in Hanshake::execute() sounds good to avoid all the extra code that we >>>>>> go through when executing a handshake. I filed 8239084 to make that change. >>>>>> >>>>>>> > I don?t know JVMTI code so I?m not sure if VM_EnterInterpOnlyMode is >>>>>>> > always called in a nested operation or just sometimes. >>>>>>> >>>>>>> At least one execution path without vm operation exists: >>>>>>> >>>>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>>>> JvmtiEventControllerPrivate::recompute_enabled() : void >>>>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>>>> >>>>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>>>> handshake, but to avoid making the compiled methods on stack not_entrant.... 
unless I'm further >>>>>>> encouraged to do it with a handshake :) >>>>>> Ah! I think you can still do it with a handshake with the >>>>>> Deoptimization::deoptimize_all_marked() like solution. I can change the >>>>>> if-else statement with just the Handshake::execute() call in 8239084. >>>>>> But up to you. : ) >>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>>> Thanks again, >>>>>>> Richard. >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Patricio Chilano >>>>>>> Sent: Donnerstag, 13. Februar 2020 18:47 >>>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>>>> >>>>>>> Hi Richard, >>>>>>> >>>>>>> I'm only commenting on the handshake changes. >>>>>>> I see that operation VM_EnterInterpOnlyMode can be called inside >>>>>>> operation VM_SetFramePop which also allows nested operations. Here is a >>>>>>> comment in VM_SetFramePop definition: >>>>>>> >>>>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>>>> >>>>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>>>> could have a handshake inside a safepoint operation. The issue I see >>>>>>> there is that at the end of the handshake the polling page of the target >>>>>>> thread could be disarmed. So if the target thread happens to be in a >>>>>>> blocked state just transiently and wakes up then it will not stop for >>>>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>>>> polling page is armed at the beginning of disarm_safepoint().
>>>>>>> >>>>>>> I think one option could be to remove >>>>>>> SafepointMechanism::disarm_if_needed() in >>>>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself >>>>>>> for the handshake case. >>>>>>> >>>>>>> Alternatively I think you could do something similar to what we do in >>>>>>> Deoptimization::deoptimize_all_marked(): >>>>>>> >>>>>>> EnterInterpOnlyModeClosure hs; >>>>>>> if (SafepointSynchronize::is_at_safepoint()) { >>>>>>> hs.do_thread(state->get_thread()); >>>>>>> } else { >>>>>>> Handshake::execute(&hs, state->get_thread()); >>>>>>> } >>>>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>>>> HandshakeClosure() constructor) >>>>>>> >>>>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>>>> always called in a nested operation or just sometimes. >>>>>>> >>>>>>> Thanks, >>>>>>> Patricio >>>>>>> >>>>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote: >>>>>>>> // Repost including hotspot runtime and gc lists. >>>>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation >>>>>>>> // with a handshake. >>>>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>>>>>>> >>>>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>>>> >>>>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>>>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>>>>>>> >>>>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>>>>>>> >>>>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>>>>>>>> >>>>>>>> Thanks, Richard. >>>>>>>> >>>>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >>>>>> From richard.reingruber at sap.com Thu May 14 07:38:30 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 14 May 2020 07:38:30 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <9e64b51d-8ac8-9e9a-1f89-52ca897932a4@oracle.com> References: <81d7caa8-4244-85f3-4d4e-78117fe5e25b@oss.nttdata.com> <550b95ac-8b29-1eb8-a507-533e81d02322@oracle.com> <9c49ea2d-e3b8-b576-1d17-d18ad87cd6ed@oracle.com> <36d5e2c0-c724-7ff7-d37e-decb5cc0005b@oracle.com> <9e64b51d-8ac8-9e9a-1f89-52ca897932a4@oracle.com> Message-ID: Ok. Thanks for the feedback anyways. Cheers, Richard. -----Original Message----- From: David Holmes Sent: Donnerstag, 14. Mai 2020 07:29 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > Still not a review, or is it now? I'd say still not a review as I'm only looking at the general structure. Cheers, David On 14/05/2020 1:37 am, Reingruber, Richard wrote: > Hi David, > >> On 4/05/2020 8:33 pm, Reingruber, Richard wrote: >>> // Trimmed the list of recipients. If the list gets too long then the message needs to be approved >>> // by a moderator. > >> Yes I noticed that too :) In general if you send to hotspot-dev you >> shouldn't need to also send to hotspot-X-dev. > > Makes sense. Will do so next time. 
> >>> >>> This would be the post with the current webrev.1 >>> >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html > >> Sorry I missed that update. Okay so this is working with direct >> handshakes now. > >> One style nit in jvmtiThreadState.cpp: > >> assert(SafepointSynchronize::is_at_safepoint() || >> ! (JavaThread *)Thread::current() == get_thread() || >> ! Thread::current() == get_thread()->active_handshaker(), >> ! "bad synchronization with owner thread"); > >> the ! lines should ident as follows > >> assert(SafepointSynchronize::is_at_safepoint() || >> (JavaThread *)Thread::current() == get_thread() || >> Thread::current() == get_thread()->active_handshaker(), >> ! "bad synchronization with owner thread"); > > Sure. > >> Lets see how this plays out. > > Hopefully not too bad... :) > >>> Not a review but some general commentary ... > > Still not a review, or is it now? > > Thanks, Richard. > > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 13. Mai 2020 07:43 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > On 4/05/2020 8:33 pm, Reingruber, Richard wrote: >> // Trimmed the list of recipients. If the list gets too long then the message needs to be approved >> // by a moderator. > > Yes I noticed that too :) In general if you send to hotspot-dev you > shouldn't need to also send to hotspot-X-dev. > >> Hi David, > > Hi Richard, > >>> On 28/04/2020 12:09 am, Reingruber, Richard wrote: >>>> Hi David, >>>> >>>>> Not a review but some general commentary ... >>>> >>>> That's welcome. 
>> >>> Having had to take an even closer look now I have a review comment too :) >> >>> src/hotspot/share/prims/jvmtiThreadState.cpp >> >>> void JvmtiThreadState::invalidate_cur_stack_depth() { >>> ! assert(SafepointSynchronize::is_at_safepoint() || >>> ! (Thread::current()->is_VM_thread() && >>> get_thread()->is_vmthread_processing_handshake()) || >>> (JavaThread *)Thread::current() == get_thread(), >>> "must be current thread or at safepoint"); >> >> You're looking at an outdated webrev, I'm afraid. >> >> This would be the post with the current webrev.1 >> >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html > > Sorry I missed that update. Okay so this is working with direct > handshakes now. > > One style nit in jvmtiThreadState.cpp: > > assert(SafepointSynchronize::is_at_safepoint() || > ! (JavaThread *)Thread::current() == get_thread() || > ! Thread::current() == get_thread()->active_handshaker(), > ! "bad synchronization with owner thread"); > > the ! lines should ident as follows > > assert(SafepointSynchronize::is_at_safepoint() || > (JavaThread *)Thread::current() == get_thread() || > Thread::current() == get_thread()->active_handshaker(), > ! "bad synchronization with owner thread"); > > Lets see how this plays out. > > Cheers, > David > >> Thanks, Richard. >> >> -----Original Message----- >> From: David Holmes >> Sent: Montag, 4. 
Mai 2020 08:51 >> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> On 28/04/2020 12:09 am, Reingruber, Richard wrote: >>> Hi David, >>> >>>> Not a review but some general commentary ... >>> >>> That's welcome. >> >> Having had to take an even closer look now I have a review comment too :) >> >> src/hotspot/share/prims/jvmtiThreadState.cpp >> >> void JvmtiThreadState::invalidate_cur_stack_depth() { >> ! assert(SafepointSynchronize::is_at_safepoint() || >> ! (Thread::current()->is_VM_thread() && >> get_thread()->is_vmthread_processing_handshake()) || >> (JavaThread *)Thread::current() == get_thread(), >> "must be current thread or at safepoint"); >> >> The message needs updating to include handshakes. >> >> More below ... >> >>>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>>> Hi Yasumasa, Patricio, >>>>> >>>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>>> >>>>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>>> >>>>>> Thanks for your information. >>>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>>> I will modify and will test it after yours. 
>>>>> >>>>> Thanks :) >>>>> >>>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>>> to me, how this has to be handled. >>>>> >>>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>>> >>>>> Yes. To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >>> >>>> I'm growing increasingly concerned that use of direct handshakes to >>>> replace VM operations needs a much greater examination for correctness >>>> than might initially be thought. I see a number of issues: >>> >>> I agree. I'll address your concerns in the context of this review thread for JDK-8238585 below. >>> >>> In addition I would suggest to take the general part of the discussion to a dedicated thread or to >>> the review thread for JDK-8242427. I would like to keep this thread closer to its subject. >> >> I will focus on the issues in the context of this particular change >> then, though the issues themselves are applicable to all handshake >> situations (and more so with direct handshakes). This is mostly just >> discussion. >> >>>> First, the VMThread executes (most) VM operations with a clean stack in >>>> a clean state, so it has lots of room to work. If we now execute the >>>> same logic in a JavaThread then we risk hitting stackoverflows if >>>> nothing else. 
But we are also now executing code in a JavaThread and so >>>> we have to be sure that code is not going to act differently (in a bad >>>> way) if executed by a JavaThread rather than the VMThread. For example, >>>> may it be possible that if executing in the VMThread we defer some >>>> activity that might require execution of Java code, or else hand it off >>>> to one of the service threads? If we execute that code directly in the >>>> current JavaThread instead we may not be in a valid state (e.g. consider >>>> re-entrancy to various subsystems that is not allowed). >>> >>> It is not too complex, what EnterInterpOnlyModeClosure::do_thread() is doing. I already added a >>> paragraph to the JBS-Item [1] explaining why the direct handshake is sufficient from a >>> synchronization point of view. >> >> Just to be clear, your proposed change is not using a direct handshake. >> >>> Furthermore the stack is walked and the return pc of compiled frames is replaced with the address of >>> the deopt handler. >>> >>> I can't see why this cannot be done with a direct handshake. Something very similar is already done >>> in JavaThread::deoptimize_marked_methods() which is executed as part of an ordinary handshake. >> >> Note that existing non-direct handshakes may also have issues that not >> have been fully investigated. >> >>> The demand on stack-space should be very modest. I would not expect a higher risk for stackoverflow. >> >> For the target thread if you use more stack than would be used stopping >> at a safepoint then you are at risk. For the thread initiating the >> direct handshake if you use more stack than would be used enqueuing a VM >> operation, then you are at risk. As we have not quantified these >> numbers, nor have any easy way to establish the stack use of the actual >> code to be executed, we're really just hoping for the best. This is a >> general problem with handshakes that needs to be investigated more >> deeply. 
As a simple, general, example just imagine if the code involves >> logging that might utilise an on-stack buffer. >> >>>> Second, we have this question mark over what happens if the operation >>>> hits further safepoint or handshake polls/checks? Are there constraints >>>> on what is allowed here? How can we recognise this problem may exist and >>>> so deal with it? >>> >>> The thread in EnterInterpOnlyModeClosure::do_thread() can't become safepoint/handshake safe. I >>> tested locally test/hotspot/jtreg:vmTestbase_nsk_jvmti with a NoSafepointVerifier. >> >> That's good to hear but such tests are not exhaustive, they will detect >> if you do reach a safepoint/handshake but they can't prove that you >> cannot reach one. What you have done is necessary but may not be >> sufficient. Plus you didn't actually add the NSV to the code - is there >> a reason we can't actually keep it in do_thread? (I'm not sure if the >> NSV also acts as a NoHandshakeVerifier?) >> >>>> Third, while we are generally considering what appear to be >>>> single-thread operations, which should be amenable to a direct >>>> handshake, we also have to be careful that some of the code involved >>>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>>> not need to take a lock where a direct handshake might! >>> >>> See again my arguments in the JBS item [1]. >> >> Yes I see the reasoning and that is good. My point is a general one as >> it may not be obvious when such assumptions exist in the current code. >> >> Thanks, >> David >> >>> Thanks, >>> Richard. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8238585 >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Montag, 27. 
April 2020 07:16 >>> To: Reingruber, Richard ; Yasumasa Suenaga ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>> >>> Hi all, >>> >>> Not a review but some general commentary ... >>> >>> On 25/04/2020 2:08 am, Reingruber, Richard wrote: >>>> Hi Yasumasa, Patricio, >>>> >>>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>>> Does it help you? I think it gives you to remove workaround. >>>>>> >>>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. >>>> >>>>> Thanks for your information. >>>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>>> I will modify and will test it after yours. >>>> >>>> Thanks :) >>>> >>>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>>> to me, how this has to be handled. >>>> >>>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> Yes. 
To me it is unclear what synchronization is necessary, if it is called during a handshake. And >>>> also I'm unsure if a thread should do safepoint checks while executing a handshake. >>> >>> I'm growing increasingly concerned that use of direct handshakes to >>> replace VM operations needs a much greater examination for correctness >>> than might initially be thought. I see a number of issues: >>> >>> First, the VMThread executes (most) VM operations with a clean stack in >>> a clean state, so it has lots of room to work. If we now execute the >>> same logic in a JavaThread then we risk hitting stackoverflows if >>> nothing else. But we are also now executing code in a JavaThread and so >>> we have to be sure that code is not going to act differently (in a bad >>> way) if executed by a JavaThread rather than the VMThread. For example, >>> may it be possible that if executing in the VMThread we defer some >>> activity that might require execution of Java code, or else hand it off >>> to one of the service threads? If we execute that code directly in the >>> current JavaThread instead we may not be in a valid state (e.g. consider >>> re-entrancy to various subsystems that is not allowed). >>> >>> Second, we have this question mark over what happens if the operation >>> hits further safepoint or handshake polls/checks? Are there constraints >>> on what is allowed here? How can we recognise this problem may exist and >>> so deal with it? >>> >>> Third, while we are generally considering what appear to be >>> single-thread operations, which should be amenable to a direct >>> handshake, we also have to be careful that some of the code involved >>> doesn't already expect/assume we are at a safepoint - e.g. a VM op may >>> not need to take a lock where a direct handshake might! 
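The third point above can be made concrete with a small model. This is only a sketch under stated assumptions, not HotSpot code: MockJvmtiState, its at_safepoint flag and set_frame_pop_mock() are hypothetical names introduced here, and a std::mutex stands in for a lock such as JvmtiThreadState_lock. It shows a routine that takes the lock only when it is not running at a safepoint, which is the kind of hidden assumption that matters once the same routine is reached from a direct handshake.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of shared JVMTI state; not a HotSpot type.
struct MockJvmtiState {
    bool at_safepoint;       // assumed flag: true while a safepoint is in progress
    std::mutex state_lock;   // stands in for a lock such as JvmtiThreadState_lock
    int frame_pops;          // the shared data being updated
    int locked_updates;      // how many updates had to take the lock
    MockJvmtiState() : at_safepoint(false), frame_pops(0), locked_updates(0) {}
};

// Updates the shared state. At a safepoint all other threads are stopped,
// so the lock is skipped; otherwise the lock is acquired first.
void set_frame_pop_mock(MockJvmtiState& s) {
    std::unique_lock<std::mutex> guard(s.state_lock, std::defer_lock);
    if (!s.at_safepoint) {
        guard.lock();        // released automatically when guard goes out of scope
        ++s.locked_updates;
    }
    ++s.frame_pops;
}
```

In this model, a caller that used to run inside a VM operation (at_safepoint == true) and is moved to a direct handshake would start taking the locking branch, so any code that silently relied on the safepoint would need to be audited.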
>>> >>> Cheers, >>> David >>> ----- >>> >>>> @Patricio, coming back to my question [1]: >>>> >>>> In the example you gave in your answer [2]: the java thread would execute a vm operation during a >>>> direct handshake operation, while the VMThread is actually in the middle of a VM_HandshakeAllThreads >>>> operation, waiting to handshake the same handshakee: why can't the VMThread just proceed? The >>>> handshakee would be safepoint safe, wouldn't it? >>>> >>>> Thanks, Richard. >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301677 >>>> >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14301763&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14301763 >>>> >>>> -----Original Message----- >>>> From: Yasumasa Suenaga >>>> Sent: Freitag, 24. April 2020 17:23 >>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>> >>>> Hi Richard, >>>> >>>> On 2020/04/24 23:44, Reingruber, Richard wrote: >>>>> Hi Yasumasa, >>>>> >>>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> I think it would not help that much. Note that when replacing VM_SetFramePop with a direct handshake >>>>> you could not just execute VM_EnterInterpOnlyMode as a nested vm operation [1]. So you would have to >>>>> change/replace VM_EnterInterpOnlyMode and I would have to adapt to these changes. 
>>>> >>>> Thanks for your information. >>>> I tested my patch with both vmTestbase/nsk/jvmti/PopFrame and vmTestbase/nsk/jvmti/NotifyFramePop. >>>> I will modify and will test it after yours. >>>> >>>> >>>>> Also my first impression was that it won't be that easy from a synchronization point of view to >>>>> replace VM_SetFramePop with a direct handshake. E.g. VM_SetFramePop::doit() indirectly calls >>>>> JvmtiEventController::set_frame_pop(JvmtiEnvThreadState *ets, JvmtiFramePop fpop) where >>>>> JvmtiThreadState_lock is acquired with safepoint check, if not at safepoint. It's not directly clear >>>>> to me, how this has to be handled. >>>> >>>> I think JvmtiEventController::set_frame_pop() should hold JvmtiThreadState_lock because it affects other JVMTI operation especially FramePop event. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> So it appears to me that it would be easier to push JDK-8242427 after this (JDK-8238585). >>>>> >>>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>>> >>>>> Would be interesting to see how you handled the issues above :) >>>>> >>>>> Thanks, Richard. >>>>> >>>>> [1] See question in comment https://bugs.openjdk.java.net/browse/JDK-8230594?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030 >>>>> >>>>> -----Original Message----- >>>>> From: Yasumasa Suenaga >>>>> Sent: Freitag, 24. 
April 2020 13:34 >>>>> To: Reingruber, Richard ; Patricio Chilano ; serguei.spitsyn at oracle.com; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>> >>>>> Hi Richard, >>>>> >>>>> I will send review request to replace VM_SetFramePop to handshake in early next week in JDK-8242427. >>>>> Does it help you? I think it gives you to remove workaround. >>>>> >>>>> (The patch is available, but I want to see the result of PIT in this weekend whether JDK-8242425 works fine.) >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/04/24 17:18, Reingruber, Richard wrote: >>>>>> Hi Patricio, Vladimir, and Serguei, >>>>>> >>>>>> now that direct handshakes are available, I've updated the patch to make use of them. >>>>>> >>>>>> In addition I have done some clean-up changes I missed in the first webrev. >>>>>> >>>>>> Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake >>>>>> into the vm operation VM_SetFramePop [1] >>>>>> >>>>>> Kindly review again: >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ >>>>>> Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ >>>>>> >>>>>> I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a >>>>>> direct handshake: >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 >>>>>> >>>>>> Testing: >>>>>> >>>>>> * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>>>>> >>>>>> * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 >>>>>> >>>>>> Thanks, >>>>>> Richard. 
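For readers following the thread, the workaround suggested by Patricio (run the closure directly when already at a safepoint, otherwise go through a handshake) can be sketched in isolation. The following is a minimal, self-contained C++ model and not HotSpot code: MockThread, MockClosure, EnterInterpOnlyModeMock, MockVM and run_on_target() are hypothetical stand-ins introduced only for illustration, and the real Handshake::execute() does far more than record a log entry.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a JavaThread; not a HotSpot type.
struct MockThread {
    std::string name;
    bool interp_only;
    explicit MockThread(const std::string& n) : name(n), interp_only(false) {}
};

// Hypothetical stand-in for a handshake closure.
struct MockClosure {
    virtual ~MockClosure() {}
    virtual void do_thread(MockThread* target) = 0;
};

// Models the effect of entering interpreter-only mode on the target.
struct EnterInterpOnlyModeMock : MockClosure {
    void do_thread(MockThread* target) override { target->interp_only = true; }
};

// Hypothetical model of the VM: a safepoint flag plus a log of which path ran.
struct MockVM {
    bool at_safepoint;
    std::vector<std::string> log;
    MockVM() : at_safepoint(false) {}

    // Models a handshake with the target thread (here: just log and run).
    void handshake_execute(MockClosure* op, MockThread* target) {
        log.push_back("handshake:" + target->name);
        op->do_thread(target);
    }

    // The pattern from the thread: at a safepoint the target is already
    // stopped, so run the closure directly instead of nesting a handshake.
    void run_on_target(MockClosure* op, MockThread* target) {
        if (at_safepoint) {
            log.push_back("direct:" + target->name);
            op->do_thread(target);
        } else {
            handshake_execute(op, target);
        }
    }
};
```

The design point is that the closure body is shared between both paths; only the mechanism that guarantees the target thread is stopped differs.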
>>>>>> >>>>>> [1] An assertion in Handshake::execute_direct() fails if called by the VMThread, because it is not a JavaThread. >>>>>> >>>>>> -----Original Message----- >>>>>> From: hotspot-dev On Behalf Of Reingruber, Richard >>>>>> Sent: Freitag, 14. Februar 2020 19:47 >>>>>> To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>>> Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>>> >>>>>> Hi Patricio, >>>>>> >>>>>> > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>>> > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>>> > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >>>>>> > > >>>>>> > > > Alternatively I think you could do something similar to what we do in >>>>>> > > > Deoptimization::deoptimize_all_marked(): >>>>>> > > > >>>>>> > > > EnterInterpOnlyModeClosure hs; >>>>>> > > > if (SafepointSynchronize::is_at_safepoint()) { >>>>>> > > > hs.do_thread(state->get_thread()); >>>>>> > > > } else { >>>>>> > > > Handshake::execute(&hs, state->get_thread()); >>>>>> > > > } >>>>>> > > > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>>> > > > HandshakeClosure() constructor) >>>>>> > > >>>>>> > > Maybe this could be used also in the Handshake::execute() methods as a general solution? >>>>>> > Right, we could also do that. Avoiding clearing the polling page in >>>>>> > HandshakeState::clear_handshake() should be enough to fix this issue and >>>>>> > execute a handshake inside a safepoint, but adding that "if" statement >>>>>> > in Handshake::execute() sounds good to avoid all the extra code that we >>>>>> > go through when executing a handshake.
I filed 8239084 to make that change. >>>>>> >>>>>> Thanks for taking care of this and creating the RFE. >>>>>> >>>>>> > >>>>>> > > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>>> > > > always called in a nested operation or just sometimes. >>>>>> > > >>>>>> > > At least one execution path without vm operation exists: >>>>>> > > >>>>>> > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>>> > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>>> > > JvmtiEventControllerPrivate::recompute_enabled() : void >>>>>> > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>>> > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>>> > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>>> > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>>> > > >>>>>> > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>>> > > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >>>>>> > > encouraged to do it with a handshake :) >>>>>> > Ah! I think you can still do it with a handshake with the >>>>>> > Deoptimization::deoptimize_all_marked() like solution. I can change the >>>>>> > if-else statement with just the Handshake::execute() call in 8239084. >>>>>> > But up to you. : ) >>>>>> >>>>>> Well, I think that's enough encouragement :) >>>>>> I'll wait for 8239084 and then try again. >>>>>> (no urgency and all) >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>> -----Original Message----- >>>>>> From: Patricio Chilano >>>>>> Sent: Freitag, 14.
Februar 2020 15:54 >>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> On 2/14/20 9:58 AM, Reingruber, Richard wrote: >>>>>>> Hi Patricio, >>>>>>> >>>>>>> thanks for having a look. >>>>>>> >>>>>>> > I'm only commenting on the handshake changes. >>>>>>> > I see that operation VM_EnterInterpOnlyMode can be called inside >>>>>>> > operation VM_SetFramePop which also allows nested operations. Here is a >>>>>>> > comment in VM_SetFramePop definition: >>>>>>> > >>>>>>> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>>>> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>>>> > >>>>>>> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>>>> > could have a handshake inside a safepoint operation. The issue I see >>>>>>> > there is that at the end of the handshake the polling page of the target >>>>>>> > thread could be disarmed. So if the target thread happens to be in a >>>>>>> > blocked state just transiently and wakes up then it will not stop for >>>>>>> > the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>>>> > polling page is armed at the beginning of disarm_safepoint(). >>>>>>> >>>>>>> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >>>>>>> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >>>>>>> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?
>>>>>>> >>>>>>> > Alternatively I think you could do something similar to what we do in >>>>>>> > Deoptimization::deoptimize_all_marked(): >>>>>>> > >>>>>>> > EnterInterpOnlyModeClosure hs; >>>>>>> > if (SafepointSynchronize::is_at_safepoint()) { >>>>>>> > hs.do_thread(state->get_thread()); >>>>>>> > } else { >>>>>>> > Handshake::execute(&hs, state->get_thread()); >>>>>>> > } >>>>>>> > (you could pass "EnterInterpOnlyModeClosure" directly to the >>>>>>> > HandshakeClosure() constructor) >>>>>>> >>>>>>> Maybe this could be used also in the Handshake::execute() methods as a general solution? >>>>>> Right, we could also do that. Avoiding clearing the polling page in >>>>>> HandshakeState::clear_handshake() should be enough to fix this issue and >>>>>> execute a handshake inside a safepoint, but adding that "if" statement >>>>>> in Handshake::execute() sounds good to avoid all the extra code that we >>>>>> go through when executing a handshake. I filed 8239084 to make that change. >>>>>> >>>>>>> > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is >>>>>>> > always called in a nested operation or just sometimes. >>>>>>> >>>>>>> At least one execution path without vm operation exists: >>>>>>> >>>>>>> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >>>>>>> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >>>>>>> JvmtiEventControllerPrivate::recompute_enabled() : void >>>>>>> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >>>>>>> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >>>>>>> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >>>>>>> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >>>>>>> >>>>>>> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >>>>>>> handshake, but to avoid making the compiled methods on stack not_entrant....
unless I'm further >>>>>>> encouraged to do it with a handshake :) >>>>>> Ah! I think you can still do it with a handshake with the >>>>>> Deoptimization::deoptimize_all_marked() like solution. I can change the >>>>>> if-else statement with just the Handshake::execute() call in 8239084. >>>>>> But up to you.? : ) >>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>>> Thanks again, >>>>>>> Richard. >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Patricio Chilano >>>>>>> Sent: Donnerstag, 13. Februar 2020 18:47 >>>>>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >>>>>>> >>>>>>> Hi Richard, >>>>>>> >>>>>>> I?m only commenting on the handshake changes. >>>>>>> I see that operation VM_EnterInterpOnlyMode can be called inside >>>>>>> operation VM_SetFramePop which also allows nested operations. Here is a >>>>>>> comment in VM_SetFramePop definition: >>>>>>> >>>>>>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >>>>>>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >>>>>>> >>>>>>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >>>>>>> could have a handshake inside a safepoint operation. The issue I see >>>>>>> there is that at the end of the handshake the polling page of the target >>>>>>> thread could be disarmed. So if the target thread happens to be in a >>>>>>> blocked state just transiently and wakes up then it will not stop for >>>>>>> the ongoing safepoint. Maybe I can file an RFE to assert that the >>>>>>> polling page is armed at the beginning of disarm_safepoint(). 
>>>>>>>
>>>>>>> I think one option could be to remove
>>>>>>> SafepointMechanism::disarm_if_needed() in
>>>>>>> HandshakeState::clear_handshake() and let each JavaThread disarm itself
>>>>>>> for the handshake case.
>>>>>>>
>>>>>>> Alternatively I think you could do something similar to what we do in
>>>>>>> Deoptimization::deoptimize_all_marked():
>>>>>>>
>>>>>>>   EnterInterpOnlyModeClosure hs;
>>>>>>>   if (SafepointSynchronize::is_at_safepoint()) {
>>>>>>>     hs.do_thread(state->get_thread());
>>>>>>>   } else {
>>>>>>>     Handshake::execute(&hs, state->get_thread());
>>>>>>>   }
>>>>>>> (you could pass "EnterInterpOnlyModeClosure" directly to the
>>>>>>> HandshakeClosure() constructor)
>>>>>>>
>>>>>>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>>>>>>> always called in a nested operation or just sometimes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Patricio
>>>>>>>
>>>>>>> On 2/12/20 7:23 AM, Reingruber, Richard wrote:
>>>>>>>> // Repost including hotspot runtime and gc lists.
>>>>>>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation
>>>>>>>> // with a handshake.
>>>>>>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> could I please get reviews for this small enhancement in hotspot's jvmti implementation:
>>>>>>>>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585
>>>>>>>>
>>>>>>>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to
>>>>>>>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.
>>>>>>>>
>>>>>>>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.
>>>>>>>>
>>>>>>>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>>>>>>>>
>>>>>>>> Thanks, Richard.
>>>>>>>>
>>>>>>>> See also my question if anyone knows a reason for making the compiled methods not_entrant:
>>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html
>>>>>>

From richard.reingruber at sap.com Thu May 14 07:48:43 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 14 May 2020 07:48:43 +0000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To: <03c9a0ce-8f78-00e7-9db3-70d6f6cb8156@oracle.com>
References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> <03c9a0ce-8f78-00e7-9db3-70d6f6cb8156@oracle.com>
Message-ID:

Hi Serguei,

> Thank you for the bug report update - it is helpful.
> The fix/update looks good in general but I need more time to check some
> points.

Sure. I'd be happy if you could look at it again.

> I'm thinking it would be more safe to run full tier5.
> I can do it after you get all thumbs ups.

The patch has been going through extensive testing here at SAP every night for many weeks. Still, it would be great if you could run full tier5.

I'll wait then for a few more thumbs...

Thanks, Richard.

-----Original Message-----
From: serguei.spitsyn at oracle.com
Sent: Thursday, May 14, 2020 00:32
To: Reingruber, Richard ; Patricio Chilano ; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant

Hi Richard,

Thank you for the bug report update - it is helpful.
The fix/update looks good in general but I need more time to check some points.

I'm thinking it would be safer to run full tier5.
I can do it after you get all thumbs ups. Thanks, Serguei On 4/24/20 01:18, Reingruber, Richard wrote: > Hi Patricio, Vladimir, and Serguei, > > now that direct handshakes are available, I've updated the patch to make use of them. > > In addition I have done some clean-up changes I missed in the first webrev. > > Finally I have implemented the workaround suggested by Patricio to avoid nesting the handshake > into the vm operation VM_SetFramePop [1] > > Kindly review again: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > > I updated the JBS item explaining why the vm operation VM_EnterInterpOnlyMode can be replaced with a > direct handshake: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8238585 > > Testing: > > * JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > * Submit-repo: mach5-one-rrich-JDK-8238585-20200423-1436-10441737 > > Thanks, > Richard. > > [1] An assertion in Handshake::execute_direct() fails, if called be VMThread, because it is no JavaThread. > > -----Original Message----- > From: hotspot-dev On Behalf Of Reingruber, Richard > Sent: Freitag, 14. Februar 2020 19:47 > To: Patricio Chilano ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: RE: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Patricio, > > > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a > > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the > > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? 
> > > > > > > Alternatively I think you could do something similar to what we do in > > > > Deoptimization::deoptimize_all_marked(): > > > > > > > > EnterInterpOnlyModeClosure hs; > > > > if (SafepointSynchronize::is_at_safepoint()) { > > > > hs.do_thread(state->get_thread()); > > > > } else { > > > > Handshake::execute(&hs, state->get_thread()); > > > > } > > > > (you could pass ?EnterInterpOnlyModeClosure? directly to the > > > > HandshakeClosure() constructor) > > > > > > Maybe this could be used also in the Handshake::execute() methods as general solution? > > Right, we could also do that. Avoiding to clear the polling page in > > HandshakeState::clear_handshake() should be enough to fix this issue and > > execute a handshake inside a safepoint, but adding that "if" statement > > in Hanshake::execute() sounds good to avoid all the extra code that we > > go through when executing a handshake. I filed 8239084 to make that change. > > Thanks for taking care of this and creating the RFE. > > > > > > > I don?t know JVMTI code so I?m not sure if VM_EnterInterpOnlyMode is > > > > always called in a nested operation or just sometimes. > > > > > > At least one execution path without vm operation exists: > > > > > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void > > > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong > > > JvmtiEventControllerPrivate::recompute_enabled() : void > > > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) > > > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void > > > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError > > > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError > > > > > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a > > > handshake, but to avoid making the compiled methods on stack not_entrant.... 
unless I'm further > > > encouraged to do it with a handshake :) > > Ah! I think you can still do it with a handshake with the > > Deoptimization::deoptimize_all_marked() like solution. I can change the > > if-else statement with just the Handshake::execute() call in 8239084. > > But up to you. : ) > > Well, I think that's enough encouragement :) > I'll wait for 8239084 and try then again. > (no urgency and all) > > Thanks, > Richard. > > -----Original Message----- > From: Patricio Chilano > Sent: Freitag, 14. Februar 2020 15:54 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > On 2/14/20 9:58 AM, Reingruber, Richard wrote: >> Hi Patricio, >> >> thanks for having a look. >> >> > I?m only commenting on the handshake changes. >> > I see that operation VM_EnterInterpOnlyMode can be called inside >> > operation VM_SetFramePop which also allows nested operations. Here is a >> > comment in VM_SetFramePop definition: >> > >> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is >> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. >> > >> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we >> > could have a handshake inside a safepoint operation. The issue I see >> > there is that at the end of the handshake the polling page of the target >> > thread could be disarmed. So if the target thread happens to be in a >> > blocked state just transiently and wakes up then it will not stop for >> > the ongoing safepoint. Maybe I can file an RFE to assert that the >> > polling page is armed at the beginning of disarm_safepoint(). 
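The dispatch suggested earlier in this thread (run the closure directly when a safepoint is already in progress, otherwise go through a handshake) can be sketched in isolation. What follows is only a minimal illustration in plain C++: MockThread, MockClosure, EnterInterpOnlySketch and MockVM are hypothetical stand-ins invented for this sketch, not HotSpot types, and nothing here models polling pages or real thread suspension.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a JavaThread; not a HotSpot type.
struct MockThread {
  bool interp_only = false;
};

// Hypothetical stand-in for a per-thread closure.
struct MockClosure {
  virtual ~MockClosure() = default;
  virtual void do_thread(MockThread* target) = 0;
};

// Work to perform on the target thread (here: just set a flag).
struct EnterInterpOnlySketch : MockClosure {
  void do_thread(MockThread* target) override { target->interp_only = true; }
};

// Hypothetical stand-in for the VM; records which path was taken.
struct MockVM {
  bool at_safepoint = false;
  std::vector<std::string> log;

  // Models a handshake with a single target thread. In this sketch a
  // handshake must never be started while a safepoint is in progress.
  void handshake_execute(MockClosure* cl, MockThread* target) {
    assert(!at_safepoint && "sketch: handshake nested in a safepoint");
    log.push_back("handshake");
    cl->do_thread(target);
  }

  // The dispatch from the thread: all threads are already stopped at a
  // safepoint, so the closure can run directly; otherwise handshake.
  void run_on_thread(MockClosure* cl, MockThread* target) {
    if (at_safepoint) {
      log.push_back("direct");
      cl->do_thread(target);
    } else {
      handshake_execute(cl, target);
    }
  }
};
```

A caller would construct one closure and always go through run_on_thread(), so the nesting decision lives in one place instead of at every call site.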
>> >> I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a >> handshake cannot be nested in a vm operation. Maybe it should be asserted in the >> Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? >> >> > Alternatively I think you could do something similar to what we do in >> > Deoptimization::deoptimize_all_marked(): >> > >> > EnterInterpOnlyModeClosure hs; >> > if (SafepointSynchronize::is_at_safepoint()) { >> > hs.do_thread(state->get_thread()); >> > } else { >> > Handshake::execute(&hs, state->get_thread()); >> > } >> > (you could pass ?EnterInterpOnlyModeClosure? directly to the >> > HandshakeClosure() constructor) >> >> Maybe this could be used also in the Handshake::execute() methods as general solution? > Right, we could also do that. Avoiding to clear the polling page in > HandshakeState::clear_handshake() should be enough to fix this issue and > execute a handshake inside a safepoint, but adding that "if" statement > in Hanshake::execute() sounds good to avoid all the extra code that we > go through when executing a handshake. I filed 8239084 to make that change. > >> > I don?t know JVMTI code so I?m not sure if VM_EnterInterpOnlyMode is >> > always called in a nested operation or just sometimes. 
>> >> At least one execution path without vm operation exists: >> >> JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void >> JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong >> JvmtiEventControllerPrivate::recompute_enabled() : void >> JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) >> JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void >> JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError >> jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError >> >> I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a >> handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further >> encouraged to do it with a handshake :) > Ah! I think you can still do it with a handshake with the > Deoptimization::deoptimize_all_marked() like solution. I can change the > if-else statement with just the Handshake::execute() call in 8239084. > But up to you.? : ) > > Thanks, > Patricio >> Thanks again, >> Richard. >> >> -----Original Message----- >> From: Patricio Chilano >> Sent: Donnerstag, 13. Februar 2020 18:47 >> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant >> >> Hi Richard, >> >> I?m only commenting on the handshake changes. >> I see that operation VM_EnterInterpOnlyMode can be called inside >> operation VM_SetFramePop which also allows nested operations. 
Here is a
>> comment in VM_SetFramePop definition:
>>
>> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
>> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
>>
>> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
>> could have a handshake inside a safepoint operation. The issue I see
>> there is that at the end of the handshake the polling page of the target
>> thread could be disarmed. So if the target thread happens to be in a
>> blocked state just transiently and wakes up then it will not stop for
>> the ongoing safepoint. Maybe I can file an RFE to assert that the
>> polling page is armed at the beginning of disarm_safepoint().
>>
>> I think one option could be to remove
>> SafepointMechanism::disarm_if_needed() in
>> HandshakeState::clear_handshake() and let each JavaThread disarm itself
>> for the handshake case.
>>
>> Alternatively I think you could do something similar to what we do in
>> Deoptimization::deoptimize_all_marked():
>>
>>   EnterInterpOnlyModeClosure hs;
>>   if (SafepointSynchronize::is_at_safepoint()) {
>>     hs.do_thread(state->get_thread());
>>   } else {
>>     Handshake::execute(&hs, state->get_thread());
>>   }
>> (you could pass "EnterInterpOnlyModeClosure" directly to the
>> HandshakeClosure() constructor)
>>
>> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
>> always called in a nested operation or just sometimes.
>>
>> Thanks,
>> Patricio
>>
>> On 2/12/20 7:23 AM, Reingruber, Richard wrote:
>>> // Repost including hotspot runtime and gc lists.
>>> // Dean Long suggested to do so, because the enhancement replaces a vm operation
>>> // with a handshake.
>>> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >>> >>> Hi, >>> >>> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >>> >>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >>> >>> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >>> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >>> >>> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >>> >>> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >>> >>> Thanks, Richard. >>> >>> See also my question if anyone knows a reason for making the compiled methods not_entrant: >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From adinn at redhat.com Thu May 14 08:48:47 2020 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 14 May 2020 09:48:47 +0100 Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: Message-ID: Hi Xiaohong, Just for references a direct link to the webrev, issue and CSR are: https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8243339 https://bugs.openjdk.java.net/browse/JDK-8243456 The webrev looks fine to me. Nice work, thank you! regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill On 13/05/2020 21:20, Derek White wrote: > Hi Xiaohong, > > This looks good to me (not a (R)eviewer). Thanks for including the patch for ThunderX ! > > This is a nice cleanup of the code and especially the volatile tests. 
>
> I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side?
>
> Thanks,
> - Derek
>
> -----Original Message-----
> From: hotspot-compiler-dev On Behalf Of Xiaohong Gong
> Sent: Thursday, May 7, 2020 11:39 PM
> To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
> Cc: nd
> Subject: [EXT] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
>
> Hi,
>
> Please help to review this patch, which obsoletes the product flag "-XX:UseBarriersForVolatile" and its related code:
> Webrev:
> http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
> JBS:
> https://bugs.openjdk.java.net/browse/JDK-8243339
> https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR)
>
> As described in the CSR, using "-XX:+UseBarriersForVolatile" might cause memory consistency issues like the one mentioned in [1]. Fixing that issue and maintaining memory consistency in the future would take more effort. Since "ldar/stlr" has worked well for a long time, as has "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation of volatile accesses.
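The two code-generation styles contrasted above can be illustrated outside the JVM. The sketch below is plain C++, not HotSpot code, and the names AcqRelBox and FenceBox are made up for illustration. It publishes a value once with ordering attached to the accesses themselves, and once with relaxed accesses plus explicit fences. On AArch64, compilers typically lower the first style to load-acquire/store-release instructions and the second to DMB barriers, but that mapping is an assumption about the toolchain, and Java volatile semantics are stronger than the acquire/release ordering shown here.

```cpp
#include <atomic>
#include <cassert>

// Style 1: ordering is part of the access itself
// (the "ldar/stlr"-like style).
struct AcqRelBox {
  int payload = 0;
  std::atomic<bool> ready{false};

  void publish(int v) {
    payload = v;                                   // plain store
    ready.store(true, std::memory_order_release);  // release store
  }

  int consume() {
    while (!ready.load(std::memory_order_acquire)) {
      // spin until the flag is published
    }
    return payload;  // ordered after the acquire load
  }
};

// Style 2: relaxed accesses surrounded by explicit fences
// (the barrier-based style).
struct FenceBox {
  int payload = 0;
  std::atomic<bool> ready{false};

  void publish(int v) {
    payload = v;
    std::atomic_thread_fence(std::memory_order_release);  // barrier
    ready.store(true, std::memory_order_relaxed);
  }

  int consume() {
    while (!ready.load(std::memory_order_relaxed)) {
      // spin until the flag is published
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // barrier
    return payload;
  }
};
```

In real use, publish() would run on a producer thread and consume() on a consumer thread; both styles give the consumer the published payload, and they differ only in where the ordering constraint is expressed.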
>
> Since its only significant use, on one kind of CPU, is also going to be removed (see [2]), it can work well without this option. So we obsolete this option directly and remove the code, rather than deprecating it first.
>
> Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is no longer required (see [2]).
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8241137
> [2] https://bugs.openjdk.java.net/browse/JDK-8242469
>
> Testing:
> Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
> JCStress: tests-all
>
> Thanks,
> Xiaohong Gong
>
>

From Xiaohong.Gong at arm.com Thu May 14 08:52:28 2020
From: Xiaohong.Gong at arm.com (Xiaohong Gong)
Date: Thu, 14 May 2020 08:52:28 +0000
Subject: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To:
References:
Message-ID:

Hi Andrew,

Thanks for your review and the direct reference links!
Thanks, Xiaohong Gong -----Original Message----- From: Andrew Dinn Sent: Thursday, May 14, 2020 4:49 PM To: Derek White ; Xiaohong Gong ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option Hi Xiaohong, Just for references a direct link to the webrev, issue and CSR are: https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8243339 https://bugs.openjdk.java.net/browse/JDK-8243456 The webrev looks fine to me. Nice work, thank you! regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill On 13/05/2020 21:20, Derek White wrote: > Hi Xiaohong, > > This looks good to me (not a (R)eviewer). Thanks for including the patch for ThunderX ! > > This is a nice cleanup of the code and especially the volatile tests. > > I did NOT check that the removal of the Compiler Interface declarations does or doesn't impact compilation of the Graal compiler. Did you check that side? 
>
> Thanks,
> - Derek
>
> -----Original Message-----
> From: hotspot-compiler-dev On Behalf Of Xiaohong Gong
> Sent: Thursday, May 7, 2020 11:39 PM
> To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
> Cc: nd
> Subject: [EXT] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
>
> Hi,
>
> Please help to review this patch, which obsoletes the product flag "-XX:UseBarriersForVolatile" and its related code:
> Webrev:
> http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
> JBS:
> https://bugs.openjdk.java.net/browse/JDK-8243339
> https://bugs.openjdk.java.net/browse/JDK-8243456 (CSR)
>
> As described in the CSR, using "-XX:+UseBarriersForVolatile" might cause memory consistency issues like the one mentioned in [1]. Fixing that issue and maintaining memory consistency in the future would take more effort. Since "ldar/stlr" has worked well for a long time, as has "ldaxr/stlxr" for unsafe atomics, we'd better simplify things by removing this option and the alternative implementation of volatile accesses.
>
> Since its only significant use, on one kind of CPU, is also going to be removed (see [2]), it can work well without this option.
So we obsolete this option directly and remove the code, rather than deprecating it first.
>
> Besides obsoleting this option, this patch also removes the AArch64 CPU feature "CPU_DMB_ATOMICS". It is a workaround rather than an official AArch64 feature, and it is no longer required (see [2]).
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8241137
> [2] https://bugs.openjdk.java.net/browse/JDK-8242469
>
> Testing:
> Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
> JCStress: tests-all
>
> Thanks,
> Xiaohong Gong
>
>

From aph at redhat.com Thu May 14 09:37:09 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 14 May 2020 10:37:09 +0100
Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To:
References:
Message-ID: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com>

On 5/14/20 9:48 AM, Andrew Dinn wrote:
> Just for references a direct link to the webrev, issue and CSR are:
>
> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8243339
> https://bugs.openjdk.java.net/browse/JDK-8243456
>
> The webrev looks fine to me. Nice work, thank you!

There's a problem with C1: we generate unnecessary DMBs if we're using TieredStopAtLevel=1 or if we only have the client compiler. This is a performance regression, so I reject this patch.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Thu May 14 09:44:41 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 14 May 2020 10:44:41 +0100
Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com>
References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com>
Message-ID: <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com>

On 5/14/20 10:37 AM, Andrew Haley wrote:
> On 5/14/20 9:48 AM, Andrew Dinn wrote:
>> Just for references a direct link to the webrev, issue and CSR are:
>>
>> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8243339
>> https://bugs.openjdk.java.net/browse/JDK-8243456
>>
>> The webrev looks fine to me. Nice work, thank you!
>
> There's a problem with C1: we generate unnecessary DMBs if we're using
> TieredStopAtLevel=1 or if we only have the client compiler. This is a
> performance regression, so I reject this patch.

There are similar regressions in the interpreter.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From Xiaohong.Gong at arm.com Thu May 14 10:07:35 2020
From: Xiaohong.Gong at arm.com (Xiaohong Gong)
Date: Thu, 14 May 2020 10:07:35 +0000
Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To: <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com>
References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com>
Message-ID:

Hi Andrew,

Thanks for having a look at it!
> On 5/14/20 10:37 AM, Andrew Haley wrote: > > On 5/14/20 9:48 AM, Andrew Dinn wrote: > >> Just for references a direct link to the webrev, issue and CSR > are: > >> > >> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ > >> https://bugs.openjdk.java.net/browse/JDK-8243339 > >> https://bugs.openjdk.java.net/browse/JDK-8243456 > >> > >> The webrev looks fine to me. Nice work, thank you! > > > > There's a problem with C1: we generate unnecessary DMBs if we're > using > > TieredStopAtLevel=1 or if we only have the client compiler. This > is a > > performance regression, so I reject this patch. > > There are similar regressoins in the interpreter. Yes, I agree with you that regressions exist for interpreter and client compiler. So do you think if it's better to add the conditions to add DMBs for C1 and interpreter? How about just excluding the scenario like "interpreter only", "client compiler only" and "TieredStopAtLevel=1" ? Thanks, Xiaohong Gong From aph at redhat.com Thu May 14 10:15:26 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 14 May 2020 11:15:26 +0100 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> Message-ID: On 5/14/20 11:07 AM, Xiaohong Gong wrote: > Hi Andrew, > > Thanks for having a look at it! > > > On 5/14/20 10:37 AM, Andrew Haley wrote: > > > On 5/14/20 9:48 AM, Andrew Dinn wrote: > > >> Just for references a direct link to the webrev, issue and CSR > > are: > > >> > > >> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ > > >> https://bugs.openjdk.java.net/browse/JDK-8243339 > > >> https://bugs.openjdk.java.net/browse/JDK-8243456 > > >> > > >> The webrev looks fine to me. Nice work, thank you! > > > > > > There's a problem with C1: we generate unnecessary DMBs if > > > we're using TieredStopAtLevel=1 or if we only have the client > > > compiler. 
This is a performance regression, so I reject this > > > patch. > > > > There are similar regressoins in the interpreter. > > Yes, I agree with you that regressions exist for interpreter and > client compiler. So do you think if it's better to add the > conditions to add DMBs for C1 and interpreter? How about just > excluding the scenario like "interpreter only", "client compiler > only" and "TieredStopAtLevel=1" ? Yes, I think so. Is there some way simply to ask the question "Are we using C2 or JVMCI compilers?" That's what we need to know. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From adinn at redhat.com Thu May 14 10:39:06 2020 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 14 May 2020 11:39:06 +0100 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> Message-ID: On 14/05/2020 11:15, Andrew Haley wrote: > On 5/14/20 11:07 AM, Xiaohong Gong wrote: >> Hi Andrew, >> >> Thanks for having a look at it! >> >> > On 5/14/20 10:37 AM, Andrew Haley wrote: >> > > On 5/14/20 9:48 AM, Andrew Dinn wrote: >> > >> Just for references a direct link to the webrev, issue and CSR >> > are: >> > >> >> > >> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ >> > >> https://bugs.openjdk.java.net/browse/JDK-8243339 >> > >> https://bugs.openjdk.java.net/browse/JDK-8243456 >> > >> >> > >> The webrev looks fine to me. Nice work, thank you! >> > > >> > > There's a problem with C1: we generate unnecessary DMBs if >> > > we're using TieredStopAtLevel=1 or if we only have the client >> > > compiler. This is a performance regression, so I reject this >> > > patch. >> > >> > There are similar regressoins in the interpreter. >> >> Yes, I agree with you that regressions exist for interpreter and >> client compiler. 
So do you think if it's better to add the >> conditions to add DMBs for C1 and interpreter? How about just >> excluding the scenario like "interpreter only", "client compiler >> only" and "TieredStopAtLevel=1" ? > > Yes, I think so. Is there some way simply to ask the question "Are we > using C2 or JVMCI compilers?" That's what we need to know. This can be done using build time conditionality. Elsewhere in the code base we have: #ifdef COMPILER2 . . . #if INCLUDE_JVMCI . . . #if COMPILER2_OR_JVMCI . . . regards, Andrew Dinn ----------- From aph at redhat.com Thu May 14 10:46:54 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 14 May 2020 11:46:54 +0100 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> Message-ID: On 5/14/20 11:39 AM, Andrew Dinn wrote: > On 14/05/2020 11:15, Andrew Haley wrote: >> On 5/14/20 11:07 AM, Xiaohong Gong wrote: >>> Hi Andrew, >>> >>> Thanks for having a look at it! >>> >>> > On 5/14/20 10:37 AM, Andrew Haley wrote: >>> > > On 5/14/20 9:48 AM, Andrew Dinn wrote: >>> > >> Just for references a direct link to the webrev, issue and CSR >>> > are: >>> > >> >>> > >> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/ >>> > >> https://bugs.openjdk.java.net/browse/JDK-8243339 >>> > >> https://bugs.openjdk.java.net/browse/JDK-8243456 >>> > >> >>> > >> The webrev looks fine to me. Nice work, thank you! >>> > > >>> > > There's a problem with C1: we generate unnecessary DMBs if >>> > > we're using TieredStopAtLevel=1 or if we only have the client >>> > > compiler. This is a performance regression, so I reject this >>> > > patch. >>> > >>> > There are similar regressoins in the interpreter. >>> >>> Yes, I agree with you that regressions exist for interpreter and >>> client compiler. So do you think if it's better to add the >>> conditions to add DMBs for C1 and interpreter? 
How about just >>> excluding the scenario like "interpreter only", "client compiler >>> only" and "TieredStopAtLevel=1" ? >> >> Yes, I think so. Is there some way simply to ask the question "Are we >> using C2 or JVMCI compilers?" That's what we need to know. > This can be done using build time conditionality. Not entirely, because this is also switchable at runtime. > Elsewhere in the code > base we have: > > #ifdef COMPILER2 > . . . > > #if INCLUDE_JVMCI > . . . > > #if COMPILER2_OR_JVMCI > . . . > > > regards, > > > Andrew Dinn > ----------- > -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Thu May 14 19:14:19 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 14 May 2020 19:14:19 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <702038f7-7942-9c94-c507-bd36241db180@oracle.com> References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> Message-ID: Hi Vladimir, > But we can use it in Test5091921.java. C1 compiles the test code with > specified value before - lets keep it. Ok. That makes sense for this test. Updated webrev in place. > And this is not related to these changes but to have range(0, max_jint) for all > these flags is questionable. I think > nobody ran tests with 0 or max_jint values. Bunch of tests may simple > timeout (which is understandable) but in worst > case they may crash instead of graceful exit. I was wondering about that, too. But I haven't changed that. The previously global flags already had this range. I had also thought about guessing more reasonable values, but reasonable limits may depend on platform and future changes. 
I don't think we can define ranges such that everything works great while we stay inside and also such that nobody will ever want greater values. So I prefer keeping it this way unless somebody has a better proposal. Thanks and best regards, Martin > -----Original Message----- > From: Vladimir Kozlov > Sent: Mittwoch, 13. Mai 2020 23:34 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > On 5/13/20 1:10 PM, Doerr, Martin wrote: > > Hi Vladimir, > > > > thanks for reviewing it. > > > >>> Should I set it to proposed? > >> > >> Yes. > > I've set it to "Finalized". Hope this was correct. > > > >>> I've added the new C1 flags to the tests which should test C1 compiler as > >> well. > >> > >> Good. Why not do the same for C1MaxInlineSize? > > Looks like MaxInlineSize is only used by tests which test C2 specific things. > So I think C1MaxInlineSize would be pointless. > > In addition to that, the C2 values are probably not appropriate for C1 in > some tests. > > Would you like to have C1MaxInlineSize configured in some tests? > > You are right in cases when test switch off TieredCompilation and use only C2 > (Test6792161.java) or tests intrinsics. > > But we can use it in Test5091921.java. C1 compiles the test code with > specified value before - lets keep it. > > And this is not related to these changes but to have range(0, max_jint) for all > these flags is questionable. I think > nobody ran tests with 0 or max_jint values. Bunch of tests may simple > timeout (which is understandable) but in worst > case they may crash instead of graceful exit. > > Thanks, > Vladimir > > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Mittwoch, 13. 
Mai 2020 21:46 > >> To: Doerr, Martin ; hotspot-compiler- > >> dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> Hi Martin, > >> > >> On 5/11/20 6:32 AM, Doerr, Martin wrote: > >>> Hi Vladimir, > >>> > >>> are you ok with the updated CSR > >> (https://bugs.openjdk.java.net/browse/JDK-8244507)? > >>> Should I set it to proposed? > >> > >> Yes. > >> > >>> > >>> Here's a new webrev with obsoletion + expiration for C2 flags in > ClientVM: > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ > >>> > >>> I've added the new C1 flags to the tests which should test C1 compiler as > >> well. > >> > >> Good. Why not do the same for C1MaxInlineSize? > >> > >>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which > set > >> C2 flags. I think this is the best solution because it still allows running the > tests > >> with GraalVM compiler. > >> > >> Yes. > >> > >> Thanks, > >> Vladimir > >> > >>> > >>> Best regards, > >>> Martin > >>> > >>> > >>>> -----Original Message----- > >>>> From: Doerr, Martin > >>>> Sent: Freitag, 8. Mai 2020 23:07 > >>>> To: Vladimir Kozlov ; hotspot-compiler- > >>>> dev at openjdk.java.net > >>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>> > >>>> Hi Vladimir, > >>>> > >>>>> You need update your CSR - add information about this and above > code > >>>> change. Example: > >>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>> I've updated the CSR with obsolete and expired flags as in the example. > >>>> > >>>>> I would suggest to fix tests anyway (there are only few) because new > >>>>> warning output could be unexpected. > >>>> Ok. I'll prepare a webrev with fixed tests. > >>>> > >>>> Best regards, > >>>> Martin > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Freitag, 8. 
Mai 2020 21:43 > >>>>> To: Doerr, Martin ; hotspot-compiler- > >>>>> dev at openjdk.java.net > >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>> > >>>>> Hi Martin > >>>>> > >>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: > >>>>>> Hi Vladimir, > >>>>>> > >>>>>> thanks a lot for looking at this, for finding the test issues and for > >>>> reviewing > >>>>> the CSR. > >>>>>> > >>>>>> For me, C2 is a fundamental part of the JVM. I would usually never > >> build > >>>>> without it ?? > >>>>>> (Except if we want to use C1 + GraalVM compiler only.) > >>>>> > >>>>> Yes it is one of cases. > >>>>> > >>>>>> But your right, --with-jvm-variants=client configuration should still be > >>>>> supported. > >>>>> > >>>>> Yes. > >>>>> > >>>>>> > >>>>>> We can fix it by making the flags as obsolete if C2 is not included: > >>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp > >>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 > >>>> 2020 > >>>>> +0200 > >>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 > 14:41:14 > >>>>> 2020 +0200 > >>>>>> @@ -562,6 +562,16 @@ > >>>>>> { "dup option", JDK_Version::jdk(9), > >>>> JDK_Version::undefined(), > >>>>> JDK_Version::undefined() }, > >>>>>> #endif > >>>>>> > >>>>>> +#ifndef COMPILER2 > >>>>>> + // These flags were generally available, but are C2 only, now. 
> >>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> + { "InlineSmallCode", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> + { "MaxInlineSize", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> + { "FreqInlineSize", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), > >>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>> +#endif > >>>>>> + > >>>>>> { NULL, JDK_Version(0), JDK_Version(0) } > >>>>>> }; > >>>>> > >>>>> Right. I think you should do full process for these product flags > >> deprecation > >>>>> with obsoleting in JDK 16 for VM builds > >>>>> which do not include C2. You need update your CSR - add information > >>>> about > >>>>> this and above code change. Example: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>>> > >>>>>> > >>>>>> This makes the VM accept the flags with warning: > >>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version > >>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; > >>>>> support was removed in 15.0 > >>>>>> > >>>>>> If we do it this way, the only test which I think should get fixed is > >>>>> ReservedStackTest. > >>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order > to > >>>>> preserve the inlining behavior. > >>>>>> > >>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. > >>>>> compiler/c2 tests: Also written to test C2 specific things.) > >>>>>> > >>>>>> What do you think? > >>>>> > >>>>> I would suggest to fix tests anyway (there are only few) because new > >>>>> warning output could be unexpected. 
> >>>>> And it will be future-proof when warning will be converted into error > >>>>> (if/when C2 goes away). > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>>> > >>>>>> Best regards, > >>>>>> Martin > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: hotspot-compiler-dev >>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > >>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 > >>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>> > >>>>>>> I would suggest to build VM without C2 and run tests. > >>>>>>> > >>>>>>> I grepped tests with these flags I found next tests where we need > to > >> fix > >>>>>>> test's command (add > >>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > >>>>>>> vm.compiler2.enabled or duplicate test for C1 with corresponding > C1 > >>>>>>> flags (by ussing additional @test block). > >>>>>>> > >>>>>>> runtime/ReservedStack/ReservedStackTest.java > >>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java > >>>>>>> compiler/c2/Test6792161.java > >>>>>>> compiler/c2/Test5091921.java > >>>>>>> > >>>>>>> And there is issue with compiler/compilercontrol tests which use > >>>>>>> InlineSmallCode and I am not sure how to handle: > >>>>>>> > >>>>>>> > >>>>> > >>>> > >> > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > >>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Vladimir > >>>>>>> > >>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: > >>>>>>>> Hi Nils, > >>>>>>>> > >>>>>>>> thank you for looking at this and sorry for the late reply. > >>>>>>>> > >>>>>>>> I've added MaxTrivialSize and also updated the issue accordingly. > >>>> Makes > >>>>>>> sense. > >>>>>>>> Do you have more flags in mind? > >>>>>>>> > >>>>>>>> Moving the flags which are only used by C2 into c2_globals > definitely > >>>>> makes > >>>>>>> sense. 
> >>>>>>>> > >>>>>>>> Done in webrev.01: > >>>>>>>> > >> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > >>>>>>>> > >>>>>>>> Please take a look and let me know when my proposal is ready for > a > >>>> CSR. > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Martin > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: hotspot-compiler-dev >>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 > >>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>>>> > >>>>>>>>> Hi, > >>>>>>>>> > >>>>>>>>> Thanks for addressing this! This has been an annoyance for a > long > >>>> time. > >>>>>>>>> > >>>>>>>>> Have you though about including other flags - like > MaxTrivialSize? > >>>>>>>>> MaxInlineSize is tested against it. > >>>>>>>>> > >>>>>>>>> Also - you should move the flags that are now c2-only to > >>>>> c2_globals.hpp. > >>>>>>>>> > >>>>>>>>> Best regards, > >>>>>>>>> Nils Eliasson > >>>>>>>>> > >>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- > 8234863 > >>>> we > >>>>>>> had > >>>>>>>>> discussed impact on C1. > >>>>>>>>>> I still think it's bad to share them between both compilers. We > >> may > >>>>> want > >>>>>>> to > >>>>>>>>> do further C2 tuning without negative impact on C1 in the future. > >>>>>>>>>> > >>>>>>>>>> C1 has issues with substantial inlining because of the lack of > >>>>> uncommon > >>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and code > >>>> cache > >>>>>>> space > >>>>>>>>> may get wasted for cold or even never executed code. The > >> situation > >>>>> gets > >>>>>>>>> worse when many patching stubs get used for such code. 
> >>>>>>>>>> > >>>>>>>>>> I had opened the following issue: > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>>>>>>>>> > >>>>>>>>>> And my initial proposal is here: > >>>>>>>>>> > >>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Part of my proposal is to add an additional flag which I called > >>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>>>>>>>>> I have a simple example which shows wasted stack space (java > >>>>> example > >>>>>>>>> TestStack at the end). > >>>>>>>>>> > >>>>>>>>>> It simply counts stack frames until a stack overflow occurs. With > >> the > >>>>>>> current > >>>>>>>>> implementation, only 1283 frames fit on the stack because the > >> never > >>>>>>>>> executed method bogus_test with local variables gets inlined. > >>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and > we > >> get > >>>>>>> 2310 > >>>>>>>>> frames until stack overflow. (I only used C1 for this example. Can > >> be > >>>>>>>>> reproduced as shown below.) > >>>>>>>>>> > >>>>>>>>>> I didn't notice any performance regression even with the > >> aggressive > >>>>>>> setting > >>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. > >>>>>>>>>> > >>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get > >>>> feedback > >>>>> in > >>>>>>>>> general and feedback about the flag names before creating a > CSR. > >>>>>>>>>> I'd also be glad about feedback regarding the performance > >> impact. 
> >>>>>>>>>> > >>>>>>>>>> Best regards, > >>>>>>>>>> Martin > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Command line: > >>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > XX:C1InlineStackLimit=20 - > >>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >> XX:+PrintInlining > >>>> - > >>>>>>>>> > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>> TestStack > >>>>>>>>>> CompileCommand: compileonly > TestStack.triggerStackOverflow > >>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 > bytes) > >>>>>>> recursive > >>>>>>>>> inlining too deep > >>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) inline > >>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>> 1283 activations were on stack, sum = 0 > >>>>>>>>>> > >>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > XX:C1InlineStackLimit=10 - > >>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >> XX:+PrintInlining > >>>> - > >>>>>>>>> > >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>> TestStack > >>>>>>>>>> CompileCommand: compileonly > TestStack.triggerStackOverflow > >>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 > bytes) > >>>>>>> recursive > >>>>>>>>> inlining too deep > >>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) callee > >> uses > >>>>> too > >>>>>>>>> much stack > >>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>> 2310 activations were on stack, sum = 0 > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> TestStack.java: > >>>>>>>>>> public class TestStack { > >>>>>>>>>> > >>>>>>>>>> static long cnt = 0, > >>>>>>>>>> sum = 0; > >>>>>>>>>> > >>>>>>>>>> public static void bogus_test() { > >>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > >>>>>>>>>> sum += c1 + c2 + c3 + c4; > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> public static void triggerStackOverflow() { > >>>>>>>>>> cnt++; > >>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>> bogus_test(); > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> public static void main(String 
args[]) { > >>>>>>>>>> try { > >>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>> } catch (StackOverflowError e) { > >>>>>>>>>> System.out.println("caught " + e); > >>>>>>>>>> } > >>>>>>>>>> System.out.println(cnt + " activations were on stack, sum > = " > >> + > >>>>>>> sum); > >>>>>>>>>> } > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>> From martin.doerr at sap.com Thu May 14 19:24:13 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 14 May 2020 19:24:13 +0000 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: Hi, I have prepared a change to use SIGTRAP on PPC64 for stop functions etc. I'll test it and post a separate RFR at a later point of time. Just for you upfront in case you would like to take a look: https://bugs.openjdk.java.net/browse/JDK-8244949 http://cr.openjdk.java.net/~mdoerr/8244949_ppc64_asm_stop/webrev.00/ @lx: Btw. Please use bug id in subject line of RFR emails. Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Donnerstag, 7. Mai 2020 19:48 > To: Doerr, Martin ; Derek White > ; Ningsheng Jian ; Liu, > Xin ; hotspot-compiler-dev at openjdk.java.net > Cc: aarch64-port-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information > when hitting a HaltNode for architectures other than x86 > > On 5/7/20 9:11 AM, Andrew Haley wrote: > > On 5/6/20 9:45 PM, Doerr, Martin wrote: > >> I had also thought about using a trap based implementation. > >> Maybe it makes sense to add a feature to shared code for that. > >> E.g. we could emit an illegal instruction (which raises SIGILL) followed by > some kind of index into a descriptor array. 
> >> PPC64 would also benefit from a more compact solution. > > > > Most of the stuff to handle this would be in the back end, I would > > have thought. I'll have a look. > > This attached patch does the job for Linux/AArch64. It has two > disadvantages: > > 1. It corrupts R8, rscratch1. This doesn't really matter for AArch64. > > 2. If we execute stop() in the context of a signal handler triggered > by another trap, we'll have a double fault and the OS will kill our > process. > > 1 is fixable by pushing rscratch1 before executing the trap. I'm not > sure it's worth doing; I'd rather have a tiny implementation of stop() > that we can use guilt-free in release code. > > I don't think 2 is an issue because we never call out to generated > code from a signal handler. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From nils.eliasson at oracle.com Thu May 14 19:54:03 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 14 May 2020 21:54:03 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> Message-ID: Hi Man, Thank you for your very comprehensive measurements. This makes me comfortable that this change achieves the desired goals. Let's go with a default threshold of 0.5%. If we encounter any issues, it is easy to change. I changed SweeperThreshold to be a percentage of ReservedCodeCacheSize, capped at 1.2 MB. The default is 0.5%, which is 1.2 MB when running with tiered compilation. The cap only applies when the default is used, so a user has the freedom to increase it at will. I added a log_info(codecache, sweep) with the threshold in bytes during startup for convenience. I also added the threshold (in bytes) to the JFR CodeSweeperConfiguration event.
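To make the threshold rule above concrete, here is a rough sketch of the arithmetic as described in this mail. It is not the code from the webrev: the class name, method name and the percentIsDefault parameter are made up for illustration, and reading "1.2 MB" as 1.2 * 1024 * 1024 bytes is an assumption.

```java
// Hypothetical sketch of the threshold rule described above.
// Not taken from the webrev; all names here are illustrative only.
public class SweeperThresholdSketch {
    // Default percentage of ReservedCodeCacheSize, as stated in the mail.
    public static final double DEFAULT_PERCENT = 0.5;
    // Assumption: "1.2 MB" is read as 1.2 * 1024 * 1024 bytes.
    public static final long DEFAULT_CAP_BYTES = (long) (1.2 * 1024 * 1024);

    // Returns the sweep threshold in bytes. The cap is applied only when
    // the percentage was left at its default, so an explicitly chosen
    // larger percentage is honoured as given.
    public static long thresholdBytes(long reservedCodeCacheBytes,
                                      double percent,
                                      boolean percentIsDefault) {
        long threshold = (long) (reservedCodeCacheBytes * percent / 100.0);
        if (percentIsDefault) {
            threshold = Math.min(threshold, DEFAULT_CAP_BYTES);
        }
        return threshold;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 240 MB code cache (tiered default), default percentage.
        System.out.println(thresholdBytes(240 * mb, DEFAULT_PERCENT, true));
        // Smaller 48 MB code cache, default percentage: the cap is not reached.
        System.out.println(thresholdBytes(48 * mb, DEFAULT_PERCENT, true));
        // User-selected 2%: no cap is applied.
        System.out.println(thresholdBytes(240 * mb, 2.0, false));
    }
}
```

With a 240 MB code cache the default percentage lands right at the cap, which matches the "0.5% is 1.2 MB with tiered compilation" remark; larger code caches with the default percentage would stay at the cap under this reading.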
webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.02/ This patch applies on top of cleanup patch JDK-8244658 And patch JDK-8244278 must be used on top of this patch to get decent results. Best regards, Nils Eliasson On 2020-05-14 04:48, Man Cao wrote: > Hi Nils, > > I have done more DaCapo benchmarking with the patches. > Overall, the result looks good, and your fix indeed reduces sweep frequency > than the current state. > It retains possible performance improvement and does not introduce > unnecessary increase in code cache usage. > > All results are available at > https://cr.openjdk.java.net/~manc/8244660_benchmarks/. > I have also included counters for used code cache size and sweeper > statistics in the graphs. > These metrics are collected using this patch: > https://cr.openjdk.java.net/~manc/8244660_benchmarks/hsperfcounters_webrev/ > All runs are with "-Xms4g -Xmx4g -XX:-TieredCompilation", because > -TieredCompilation matters a lot for our workload. > Also note that the numbers for throughput/CPU and GC exclude the warmup > iterations. The codecache/sweeper statistics account for all iterations > (including warmups). > > Comparing 3 JDK builds: > https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-sweeperPatches.html > base: current state with no pending patches > allFixes: with patches for JDK-8244660, JDK-8244278 and JDK-8244658 > sweepAt90: with only the patch for JDK-8244278, so it's the same as the > config I used in previous results in JDK-8244278. > "allFixes" reduced sweep frequency than "base", without introducing much > increase in code cache usage. > > Same as above, but with -XX:ReservedCodeCacheSize=40m: > https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200512-JDKHead-dacapoLarge4G-sweeperPatches-CodeCache40MB.html > "allFixes" retains the throughput and CPU improvement for tradesoap, > perhaps even better than not sweeping ("sweepAt90"). 
> Code cache usage for tradesoap is between "base" and not sweeping, which is > OK in my opinion. > > I think 1/100 of a 240mb default code cache seems a bit high. During >> startup we produce a lot of L3 code that will be thrown away. We want to >> recycle it fairly quickly, to avoid fragmenting the code cache, but not >> that often that we affect startup. >> I've done some startup measurements, and then we sweep about every other >> second in a benchmark that produces a lot of code. >> What results are you seeing? > > The 1/256 capped at 1MB seems OK. > Even with 40MB or 48MB code cache size with -TieredCompilation, it does not > flush too frequently. > > Code cache flushing has another heuristic - it might be broken too. But >> it would be interesting too see how it works with the new sweep >> heuristic. If you know that you have enough code cache - turning it off >> is no loss. It only helps when you are running out of code cache. > > >> When we are doing normal sweeping - we don't deoptimize cold code. That >> is handled my the method flushing - it should only kick in when we start >> to run out of code cache. > > I think we should address MethodFlushing in a separate RFE/BUG. > > > Thanks for explaining this. > I did some benchmarking with -XX:NmethodSweepActivity and > -XX:MinPassesBeforeFlush, on top of the "allFixes" config: > https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-NmethodSweepActivity.html > https://cr.openjdk.java.net/~manc/8244660_benchmarks/20200508-JDKHead-dacapoLarge4G-MinPassesBeforeFlush.html > xalan, jython look better with small values, pmd looks worse. > I'll follow up separately if I find anything wrong with the > flushing/cold-code-deoptimization heuristic > > The heuristics for CodeAging may have been negatively affected by the >> transition to handshakes. Also the SetHotnessClosure should be replaced >> by a mechanism using the NMethodEntry barriers. 
>> I see that we are missing JFR events for MethodFlushing. I have created >> another patch for that. > Although I'm not very familiar with these, thanks for identifying and > fixing these issues! > > -Man From nils.eliasson at oracle.com Thu May 14 20:06:01 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 14 May 2020 22:06:01 +0200 Subject: RFR(S): 8245011: Add JFR event for cold methods flushed Message-ID: Hi, Please review this small patch that adds a JFR event for cold methods that are flushed and adds the total number of cold methods flushed to the sweeper statistics event. Bug: https://bugs.openjdk.java.net/browse/JDK-8245011 Webrev: http://cr.openjdk.java.net/~neliasso/8245011/ Please review, Nils Eliasson From erik.gahlin at oracle.com Thu May 14 20:11:17 2020 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 14 May 2020 22:11:17 +0200 Subject: RFR(S): 8245011: Add JFR event for cold methods flushed In-Reply-To: References: Message-ID: <99BAD5E5-8D8B-4B36-AC65-495042CADC13@oracle.com> Looks good. Erik > On 14 May 2020, at 22:06, Nils Eliasson wrote: > > Hi, > > Please review this small patch that adds a JFR event for cold methods that are flushed and adds the total number of cold methods flushed to the sweeper statistics event. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8245011 > Webrev: http://cr.openjdk.java.net/~neliasso/8245011/ > > Please review, > > Nils Eliasson From vladimir.kozlov at oracle.com Thu May 14 20:28:44 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 14 May 2020 13:28:44 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> Message-ID: On 5/14/20 12:14 PM, Doerr, Martin wrote: > Hi Vladimir, > >> But we can use it in Test5091921.java. 
C1 compiles the test code with >> specified value before - lets keep it. > Ok. That makes sense for this test. Updated webrev in place. Good. > >> And this is not related to these changes but to have range(0, max_jint) for all >> these flags is questionable. I think >> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >> timeout (which is understandable) but in worst >> case they may crash instead of graceful exit. > I was wondering about that, too. But I haven't changed that. The previously global flags already had this range. > I had also thought about guessing more reasonable values, but reasonable limits may depend on platform and future changes. > I don't think we can define ranges such that everything works great while we stay inside and also such that nobody will ever want greater values. > So I prefer keeping it this way unless somebody has a better proposal. I did not mean for that to be part of this change. The current changes are fine with me. I was thinking aloud that it would be nice for someone to investigate this later, at least for some flags. We could keep the current range as it is but maybe add dynamic checks based on platform and other conditions. This looks like a starter task for a junior engineer or student intern. Thanks, Vladimir > > Thanks and best regards, > Martin > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Mittwoch, 13. Mai 2020 23:34 >> To: Doerr, Martin ; hotspot-compiler- >> dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> On 5/13/20 1:10 PM, Doerr, Martin wrote: >>> Hi Vladimir, >>> >>> thanks for reviewing it. >>> >>>>> Should I set it to proposed? >>>> >>>> Yes. >>> I've set it to "Finalized". Hope this was correct. >>> >>>>> I've added the new C1 flags to the tests which should test C1 compiler as >>>> well. >>>> >>>> Good. Why not do the same for C1MaxInlineSize? >>> Looks like MaxInlineSize is only used by tests which test C2 specific things.
>> So I think C1MaxInlineSize would be pointless. >>> In addition to that, the C2 values are probably not appropriate for C1 in >> some tests. >>> Would you like to have C1MaxInlineSize configured in some tests? >> >> You are right in cases when test switch off TieredCompilation and use only C2 >> (Test6792161.java) or tests intrinsics. >> >> But we can use it in Test5091921.java. C1 compiles the test code with >> specified value before - lets keep it. >> >> And this is not related to these changes but to have range(0, max_jint) for all >> these flags is questionable. I think >> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >> timeout (which is understandable) but in worst >> case they may crash instead of graceful exit. >> >> Thanks, >> Vladimir >> >>> >>> Best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Mittwoch, 13. Mai 2020 21:46 >>>> To: Doerr, Martin ; hotspot-compiler- >>>> dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>> >>>> Hi Martin, >>>> >>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: >>>>> Hi Vladimir, >>>>> >>>>> are you ok with the updated CSR >>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? >>>>> Should I set it to proposed? >>>> >>>> Yes. >>>> >>>>> >>>>> Here's a new webrev with obsoletion + expiration for C2 flags in >> ClientVM: >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ >>>>> >>>>> I've added the new C1 flags to the tests which should test C1 compiler as >>>> well. >>>> >>>> Good. Why not do the same for C1MaxInlineSize? >>>> >>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which >> set >>>> C2 flags. I think this is the best solution because it still allows running the >> tests >>>> with GraalVM compiler. >>>> >>>> Yes. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Doerr, Martin >>>>>> Sent: Freitag, 8. Mai 2020 23:07 >>>>>> To: Vladimir Kozlov ; hotspot-compiler- >>>>>> dev at openjdk.java.net >>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>> >>>>>> Hi Vladimir, >>>>>> >>>>>>> You need update your CSR - add information about this and above >> code >>>>>> change. Example: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>> I've updated the CSR with obsolete and expired flags as in the example. >>>>>> >>>>>>> I would suggest to fix tests anyway (there are only few) because new >>>>>>> warning output could be unexpected. >>>>>> Ok. I'll prepare a webrev with fixed tests. >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Vladimir Kozlov >>>>>>> Sent: Freitag, 8. Mai 2020 21:43 >>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>> dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>> >>>>>>> Hi Martin >>>>>>> >>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>> thanks a lot for looking at this, for finding the test issues and for >>>>>> reviewing >>>>>>> the CSR. >>>>>>>> >>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually never >>>> build >>>>>>> without it ?? >>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) >>>>>>> >>>>>>> Yes it is one of cases. >>>>>>> >>>>>>>> But your right, --with-jvm-variants=client configuration should still be >>>>>>> supported. >>>>>>> >>>>>>> Yes. 
>>>>>>> >>>>>>>> >>>>>>>> We can fix it by making the flags as obsolete if C2 is not included: >>>>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp >>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 11:14:28 >>>>>> 2020 >>>>>>> +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 >> 14:41:14 >>>>>>> 2020 +0200 >>>>>>>> @@ -562,6 +562,16 @@ >>>>>>>> { "dup option", JDK_Version::jdk(9), >>>>>> JDK_Version::undefined(), >>>>>>> JDK_Version::undefined() }, >>>>>>>> #endif >>>>>>>> >>>>>>>> +#ifndef COMPILER2 >>>>>>>> + // These flags were generally available, but are C2 only, now. >>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>> +#endif >>>>>>>> + >>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } >>>>>>>> }; >>>>>>> >>>>>>> Right. I think you should do full process for these product flags >>>> deprecation >>>>>>> with obsoleting in JDK 16 for VM builds >>>>>>> which do not include C2. You need update your CSR - add information >>>>>> about >>>>>>> this and above code change. 
Example: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>> >>>>>>>> >>>>>>>> This makes the VM accept the flags with warning: >>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version >>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option MaxInlineLevel; >>>>>>> support was removed in 15.0 >>>>>>>> >>>>>>>> If we do it this way, the only test which I think should get fixed is >>>>>>> ReservedStackTest. >>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in order >> to >>>>>>> preserve the inlining behavior. >>>>>>>> >>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>>>>>> compiler/c2 tests: Also written to test C2 specific things.) >>>>>>>> >>>>>>>> What do you think? >>>>>>> >>>>>>> I would suggest to fix tests anyway (there are only few) because new >>>>>>> warning output could be unexpected. >>>>>>> And it will be future-proof when warning will be converted into error >>>>>>> (if/when C2 goes away). >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: hotspot-compiler-dev >>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>> >>>>>>>>> I would suggest to build VM without C2 and run tests. >>>>>>>>> >>>>>>>>> I grepped tests with these flags I found next tests where we need >> to >>>> fix >>>>>>>>> test's command (add >>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with corresponding >> C1 >>>>>>>>> flags (by ussing additional @test block). 
>>>>>>>>> >>>>>>>>> runtime/ReservedStack/ReservedStackTest.java >>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>>>>>> compiler/c2/Test6792161.java >>>>>>>>> compiler/c2/Test5091921.java >>>>>>>>> >>>>>>>>> And there is issue with compiler/compilercontrol tests which use >>>>>>>>> InlineSmallCode and I am not sure how to handle: >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>> >>>> >> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>>>>>> Hi Nils, >>>>>>>>>> >>>>>>>>>> thank you for looking at this and sorry for the late reply. >>>>>>>>>> >>>>>>>>>> I've added MaxTrivialSize and also updated the issue accordingly. >>>>>> Makes >>>>>>>>> sense. >>>>>>>>>> Do you have more flags in mind? >>>>>>>>>> >>>>>>>>>> Moving the flags which are only used by C2 into c2_globals >> definitely >>>>>>> makes >>>>>>>>> sense. >>>>>>>>>> >>>>>>>>>> Done in webrev.01: >>>>>>>>>> >>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>>>>>> >>>>>>>>>> Please take a look and let me know when my proposal is ready for >> a >>>>>> CSR. >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Martin >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Thanks for addressing this! This has been an annoyance for a >> long >>>>>> time. >>>>>>>>>>> >>>>>>>>>>> Have you though about including other flags - like >> MaxTrivialSize? >>>>>>>>>>> MaxInlineSize is tested against it. 
>>>>>>>>>>> >>>>>>>>>>> Also - you should move the flags that are now c2-only to >>>>>>> c2_globals.hpp. >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Nils Eliasson >>>>>>>>>>> >>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- >> 8234863 >>>>>> we >>>>>>>>> had >>>>>>>>>>> discussed impact on C1. >>>>>>>>>>>> I still think it's bad to share them between both compilers. We >>>> may >>>>>>> want >>>>>>>>> to >>>>>>>>>>> do further C2 tuning without negative impact on C1 in the future. >>>>>>>>>>>> >>>>>>>>>>>> C1 has issues with substantial inlining because of the lack of >>>>>>> uncommon >>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and code >>>>>> cache >>>>>>>>> space >>>>>>>>>>> may get wasted for cold or even never executed code. The >>>> situation >>>>>>> gets >>>>>>>>>>> worse when many patching stubs get used for such code. >>>>>>>>>>>> >>>>>>>>>>>> I had opened the following issue: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>>>>>> >>>>>>>>>>>> And my initial proposal is here: >>>>>>>>>>>> >>>>>>> >> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>>>>>>>> I have a simple example which shows wasted stack space (java >>>>>>> example >>>>>>>>>>> TestStack at the end). >>>>>>>>>>>> >>>>>>>>>>>> It simply counts stack frames until a stack overflow occurs. With >>>> the >>>>>>>>> current >>>>>>>>>>> implementation, only 1283 frames fit on the stack because the >>>> never >>>>>>>>>>> executed method bogus_test with local variables gets inlined. >>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and >> we >>>> get >>>>>>>>> 2310 >>>>>>>>>>> frames until stack overflow. 
(I only used C1 for this example. Can >>>> be >>>>>>>>>>> reproduced as shown below.) >>>>>>>>>>>> >>>>>>>>>>>> I didn't notice any performance regression even with the >>>> aggressive >>>>>>>>> setting >>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>>>>>>>> >>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>>>> feedback >>>>>>> in >>>>>>>>>>> general and feedback about the flag names before creating a >> CSR. >>>>>>>>>>>> I'd also be glad about feedback regarding the performance >>>> impact. >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Martin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Command line: >>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >> XX:C1InlineStackLimit=20 - >>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>> XX:+PrintInlining >>>>>> - >>>>>>>>>>> >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>> TestStack >>>>>>>>>>>> CompileCommand: compileonly >> TestStack.triggerStackOverflow >>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >> bytes) >>>>>>>>> recursive >>>>>>>>>>> inlining too deep >>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) inline >>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>>>>>> >>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >> XX:C1InlineStackLimit=10 - >>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>> XX:+PrintInlining >>>>>> - >>>>>>>>>>> >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>> TestStack >>>>>>>>>>>> CompileCommand: compileonly >> TestStack.triggerStackOverflow >>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >> bytes) >>>>>>>>> recursive >>>>>>>>>>> inlining too deep >>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) callee >>>> uses >>>>>>> too >>>>>>>>>>> much stack >>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>>>>>> 
>>>>>>>>>>>>
>>>>>>>>>>>> TestStack.java:
>>>>>>>>>>>> public class TestStack {
>>>>>>>>>>>>
>>>>>>>>>>>>     static long cnt = 0,
>>>>>>>>>>>>                 sum = 0;
>>>>>>>>>>>>
>>>>>>>>>>>>     public static void bogus_test() {
>>>>>>>>>>>>         long c1 = 1, c2 = 2, c3 = 3, c4 = 4;
>>>>>>>>>>>>         sum += c1 + c2 + c3 + c4;
>>>>>>>>>>>>     }
>>>>>>>>>>>>
>>>>>>>>>>>>     public static void triggerStackOverflow() {
>>>>>>>>>>>>         cnt++;
>>>>>>>>>>>>         triggerStackOverflow();
>>>>>>>>>>>>         bogus_test();
>>>>>>>>>>>>     }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>     public static void main(String args[]) {
>>>>>>>>>>>>         try {
>>>>>>>>>>>>             triggerStackOverflow();
>>>>>>>>>>>>         } catch (StackOverflowError e) {
>>>>>>>>>>>>             System.out.println("caught " + e);
>>>>>>>>>>>>         }
>>>>>>>>>>>>         System.out.println(cnt + " activations were on stack, sum = " + sum);
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>

From rwestrel at redhat.com Fri May 15 08:20:08 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 15 May 2020 10:20:08 +0200
Subject: RFR(S): 8245083: [REDO] Shenandoah: Remove null-handling in LRB expansion
Message-ID: <87o8qp38zr.fsf@redhat.com>

https://bugs.openjdk.java.net/browse/JDK-8245083
http://cr.openjdk.java.net/~roland/8245083/webrev.00/

Same as 8244523, this removes null handling code from LRB
expansion. The only difference is that to preserve implicit null checks
in:

a' = lrb(a);
if (a == null) {
}
f = a'.f

this patch transforms it to:

a' = lrb(a);
if (a' == null) {
}
f = a'.f

at expansion time.

Roland.
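
[Editorial sketch] The rewrite described above can be modelled in plain Java. This is only a toy model under stated assumptions, not Shenandoah or C2 code: `lrb` is a hypothetical stand-in for the load reference barrier that maps null to null and otherwise returns the forwarded copy if one exists, and `FORWARDING`, `Obj`, `before` and `after` are names invented for the illustration. Under that assumption, testing `a'` for null gives the same answer as testing `a`, and the test now guards the very value that the field load dereferences.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class LrbNullCheckModel {

    /** Toy heap object with a single field f. */
    static final class Obj {
        final int f;
        Obj(int f) { this.f = f; }
    }

    // Hypothetical forwarding table (from-space object -> to-space copy).
    static final Map<Obj, Obj> FORWARDING = new IdentityHashMap<>();

    // Stand-in for the barrier (an assumption of this sketch):
    // null stays null, otherwise the forwarded copy wins if there is one.
    static Obj lrb(Obj a) {
        if (a == null) {
            return null;
        }
        Obj copy = FORWARDING.get(a);
        return copy != null ? copy : a;
    }

    // Shape before the patch: the null test is on a, the load goes through a'.
    static int before(Obj a) {
        Obj aPrime = lrb(a);
        if (a == null) {
            return -1;
        }
        return aPrime.f;
    }

    // Shape after the patch: test and load use the same value a'.
    static int after(Obj a) {
        Obj aPrime = lrb(a);
        if (aPrime == null) {
            return -1;
        }
        return aPrime.f;
    }

    public static void main(String[] args) {
        Obj plain = new Obj(7);
        Obj moved = new Obj(9);
        FORWARDING.put(moved, new Obj(9)); // to-space copy carries the same field value
        System.out.println(before(null) == after(null));
        System.out.println(before(plain) == after(plain));
        System.out.println(before(moved) == after(moved));
    }
}
```

Both shapes return the same result for null, unforwarded and forwarded inputs; the second one simply keeps the check and the dereference on one value, which is the property the message says is needed to preserve implicit null checks.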

From rwestrel at redhat.com Fri May 15 08:31:52 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 15 May 2020 10:31:52 +0200
Subject: RFR(S): 8244721: CTW: C2 (Shenandoah) compilation fails with "unexpected infinite loop graph shape"
Message-ID: <87lflt38g7.fsf@redhat.com>

https://bugs.openjdk.java.net/browse/JDK-8244721
http://cr.openjdk.java.net/~roland/8244721/webrev.00/

Logic that finds raw memory in the case of an infinite loop fails if the
loop has more than one backedge. If there is no memory phi in the loop
head, that logic uses the memory input from the safepoint on the
backedge. With multiple backedges and multiple safepoints, all
safepoints should have the same memory input. I changed the logic so
it's robust to multiple backedges under that assumption.

Roland.

From rwestrel at redhat.com Fri May 15 08:46:08 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 15 May 2020 10:46:08 +0200
Subject: RFR(S): 8244663: Shenandoah: C2 assertion fails in Matcher::collect_null_checks
Message-ID: <87imgx37sf.fsf@redhat.com>

https://bugs.openjdk.java.net/browse/JDK-8244663
http://cr.openjdk.java.net/~roland/8244663/webrev.00/

When a load barrier is used after a call in both an exception handler
and the fallthrough path, it must be cloned in both paths. If the barrier
is used after both paths, a phi must be added to merge the barriers from
both paths. If that use is an If, the current code may create a Phi which
merges 2 Bool nodes. An If shouldn't have a Phi as input. This patch
adds some logic for this special case so the Bool->Cmp is cloned and the
Phi is created as input of the Cmp node.

Roland.
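
[Editorial sketch] The graph shape described in the 8244663 message can be pictured with a tiny node model in Java. Everything here is an assumption made for illustration: the classes `Value`, `Phi`, `Cmp` and `Bool` and the helper `mergeForIf` are hypothetical and are not the C2 node classes or the code in the webrev. The sketch only shows the idea that, instead of feeding an If with a Phi of two Bool nodes, the Phi is pushed down so that it merges the two cloned barriers as the input of a single Cmp.

```java
public class PhiThroughCmpModel {

    interface Node { }

    /** Leaf value, e.g. a cloned barrier or a constant. */
    static final class Value implements Node {
        final String name;
        Value(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Merge of the values coming from the two control paths. */
    static final class Phi implements Node {
        final Node fallthrough;
        final Node exception;
        Phi(Node fallthrough, Node exception) {
            this.fallthrough = fallthrough;
            this.exception = exception;
        }
        @Override public String toString() {
            return "Phi(" + fallthrough + ", " + exception + ")";
        }
    }

    static final class Cmp implements Node {
        final Node left;
        final Node right;
        Cmp(Node left, Node right) { this.left = left; this.right = right; }
        @Override public String toString() {
            return "Cmp(" + left + ", " + right + ")";
        }
    }

    static final class Bool implements Node {
        final Cmp cmp;
        final String test; // e.g. "ne" or "eq"
        Bool(Cmp cmp, String test) { this.cmp = cmp; this.test = test; }
        @Override public String toString() {
            return "Bool[" + test + "](" + cmp + ")";
        }
    }

    // Builds Bool -> Cmp -> Phi from the two per-path tests. The two tests
    // must differ only in the barrier operand, otherwise the merge is invalid.
    static Bool mergeForIf(Bool fallthrough, Bool exception) {
        if (!fallthrough.test.equals(exception.test)
                || fallthrough.cmp.right != exception.cmp.right) {
            throw new IllegalArgumentException("tests differ in more than the barrier input");
        }
        Phi phi = new Phi(fallthrough.cmp.left, exception.cmp.left);
        return new Bool(new Cmp(phi, fallthrough.cmp.right), fallthrough.test);
    }

    public static void main(String[] args) {
        Value nullConst = new Value("null");
        Bool b1 = new Bool(new Cmp(new Value("lrb_fallthrough"), nullConst), "ne");
        Bool b2 = new Bool(new Cmp(new Value("lrb_exception"), nullConst), "ne");
        // Prints: Bool[ne](Cmp(Phi(lrb_fallthrough, lrb_exception), null))
        System.out.println(mergeForIf(b1, b2));
    }
}
```

The result keeps a Bool directly above the If, with the Phi one level further down, which matches the message's statement that an If shouldn't have a Phi as input.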

From rwestrel at redhat.com Fri May 15 08:49:43 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 15 May 2020 10:49:43 +0200
Subject: RFR(XS): 8241070: Shenandoah: remove unused local variables in C2 support
Message-ID: <87ftc137mg.fsf@redhat.com>

https://bugs.openjdk.java.net/browse/JDK-8241070
http://cr.openjdk.java.net/~roland/8241070/webrev.00/

Some cleanup.

Roland.

From aph at redhat.com Fri May 15 08:57:26 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 15 May 2020 09:57:26 +0100
Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To:
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com>
Message-ID:

On 5/14/20 8:24 PM, Doerr, Martin wrote:
> Just for you upfront in case you would like to take a look:
> https://bugs.openjdk.java.net/browse/JDK-8244949
> http://cr.openjdk.java.net/~mdoerr/8244949_ppc64_asm_stop/webrev.00/

That's nice: it's a lot more full-featured than what I did, and AFAICS
you'll get the registers printed in the VM error log.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From christian.hagedorn at oracle.com Fri May 15 09:11:25 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 15 May 2020 11:11:25 +0200
Subject: [15] RFR(XS): 8239083: C1 assert(known_holder == NULL || (known_holder->is_instance_klass() && (!known_holder->is_interface() || ((ciInstanceKlass*)known_holder)->has_nonstatic_concrete_methods())), "should be non-static concrete method");
Message-ID: <1dd061e7-f872-877e-b574-08e578f006ba@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8239083
http://cr.openjdk.java.net/~chagedorn/8239083/webrev.00/

The assert fails in the test case when invoking the only static
interface method with a method handle. In this case, known_holder is
non-NULL. However, known_holder would be set to NULL at [1] since the
call returns NULL when known_holder is an interface.

In the failing test case, known_holder is non-NULL since
GraphBuilder::try_method_handle_inline() calls
GraphBuilder::try_inline() with holder_known set to true, which
eventually causes profile_call() to be called with a non-NULL
known_holder argument. On the other hand, when calling a static method
without a method handle, known_holder always seems to be NULL:
profile_call() is called directly at [2] with NULL or indirectly via
try_inline() [3]. In the latter case, cha_monomorphic_target and
exact_target are always NULL for static methods and therefore
known_holder will also always be NULL in profile_call().

We could therefore just remove the assert, which seems to be too strong
(it does not handle this edge case). Another option would be to change
the call to try_inline() in try_method_handle_inline() to only set
holder_known to true if the target is not static. The known_holder is
eventually only used in LIR_Assembler::emit_profile_call() [4] but only
if op->should_profile_receiver_type() holds [5].
This is only true if the callee is not static [6]. The webrev uses the
second approach.

What do you think?

Best regards,
Christian

[1] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l4386
[2] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l3571
[3] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l2039
[4] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3589
[5] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3584
[6] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_LIR.hpp#l1916

From shade at redhat.com Fri May 15 10:21:30 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 May 2020 12:21:30 +0200
Subject: RFR(XS): 8241070: Shenandoah: remove unused local variables in C2 support
In-Reply-To: <87ftc137mg.fsf@redhat.com>
References: <87ftc137mg.fsf@redhat.com>
Message-ID:

On 5/15/20 10:49 AM, Roland Westrelin wrote:
> https://bugs.openjdk.java.net/browse/JDK-8241070
> http://cr.openjdk.java.net/~roland/8241070/webrev.00/

Looks good to me.

--
Thanks,
-Aleksey

From shade at redhat.com Fri May 15 10:27:05 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 May 2020 12:27:05 +0200
Subject: RFR(S): 8245083: [REDO] Shenandoah: Remove null-handling in LRB expansion
In-Reply-To: <87o8qp38zr.fsf@redhat.com>
References: <87o8qp38zr.fsf@redhat.com>
Message-ID:

On 5/15/20 10:20 AM, Roland Westrelin wrote:
>
> https://bugs.openjdk.java.net/browse/JDK-8245083
> http://cr.openjdk.java.net/~roland/8245083/webrev.00/

This patch passes extended testing for me. Looks good.

Minor nit, feel free to ignore. Comments indent is a bit off here.

486 fields[TypeFunc::Parms+0] = TypeOopPtr::BOTTOM; // original field value
487 fields[TypeFunc::Parms+1] = TypeRawPtr::BOTTOM; // original load address

--
Thanks,
-Aleksey

From martin.doerr at sap.com Fri May 15 10:37:28 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 15 May 2020 10:37:28 +0000
Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To:
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com>
Message-ID:

Exactly, we get stop type + stop message + registers + instructions
(unfortunately not disassembled for some reason) + nice stack trace.

Best regards,
Martin

> -----Original Message-----
> From: Andrew Haley
> Sent: Friday, 15 May 2020 10:57
> To: Doerr, Martin ; Derek White
> ; Ningsheng Jian ; Liu,
> Xin ; hotspot-compiler-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information
> when hitting a HaltNode for architectures other than x86
>
> On 5/14/20 8:24 PM, Doerr, Martin wrote:
> > Just for you upfront in case you would like to take a look:
> > https://bugs.openjdk.java.net/browse/JDK-8244949
> >
> http://cr.openjdk.java.net/~mdoerr/8244949_ppc64_asm_stop/webrev.00/
>
> That's nice: it's a lot more full-featured than what I did, and AFAICS
> you'll get the registers printed in the VM error log.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Fri May 15 11:41:30 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 15 May 2020 11:41:30 +0000 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> Message-ID: Hi Vladimir, Nils and Tobias, Can I consider http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ reviewed by you? Submission repo testing was successful. Thanks and best regards, Martin > -----Original Message----- > From: Vladimir Kozlov > Sent: Donnerstag, 14. Mai 2020 22:29 > To: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > > On 5/14/20 12:14 PM, Doerr, Martin wrote: > > Hi Vladimir, > > > >> But we can use it in Test5091921.java. C1 compiles the test code with > >> specified value before - lets keep it. > > Ok. That makes sense for this test. Updated webrev in place. > > Good. > > > > >> And this is not related to these changes but to have range(0, max_jint) for > all > >> these flags is questionable. I think > >> nobody ran tests with 0 or max_jint values. Bunch of tests may simple > >> timeout (which is understandable) but in worst > >> case they may crash instead of graceful exit. > > I was wondering about that, too. But I haven't changed that. The previously > global flags already had this range. > > I had also thought about guessing more reasonable values, but reasonable > limits may depend on platform and future changes. > > I don't think we can define ranges such that everything works great while > we stay inside and also such that nobody will ever want greater values. 
> > So I prefer keeping it this way unless somebody has a better proposal. > > I did not mean to have that in these change. Current changes are fine for me. > > I was thinking aloud that it would be nice to investigate this later by > someone. At least for some flags. We may keep > current range as it is but may be add dynamic checks based on platform and > other conditions. This looks like starter > task for junior engineer or student intern. > > Thanks, > Vladimir > > > > > Thanks and best regards, > > Martin > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Mittwoch, 13. Mai 2020 23:34 > >> To: Doerr, Martin ; hotspot-compiler- > >> dev at openjdk.java.net > >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >> > >> On 5/13/20 1:10 PM, Doerr, Martin wrote: > >>> Hi Vladimir, > >>> > >>> thanks for reviewing it. > >>> > >>>>> Should I set it to proposed? > >>>> > >>>> Yes. > >>> I've set it to "Finalized". Hope this was correct. > >>> > >>>>> I've added the new C1 flags to the tests which should test C1 compiler > as > >>>> well. > >>>> > >>>> Good. Why not do the same for C1MaxInlineSize? > >>> Looks like MaxInlineSize is only used by tests which test C2 specific > things. > >> So I think C1MaxInlineSize would be pointless. > >>> In addition to that, the C2 values are probably not appropriate for C1 in > >> some tests. > >>> Would you like to have C1MaxInlineSize configured in some tests? > >> > >> You are right in cases when test switch off TieredCompilation and use only > C2 > >> (Test6792161.java) or tests intrinsics. > >> > >> But we can use it in Test5091921.java. C1 compiles the test code with > >> specified value before - lets keep it. > >> > >> And this is not related to these changes but to have range(0, max_jint) for > all > >> these flags is questionable. I think > >> nobody ran tests with 0 or max_jint values. 
Bunch of tests may simple > >> timeout (which is understandable) but in worst > >> case they may crash instead of graceful exit. > >> > >> Thanks, > >> Vladimir > >> > >>> > >>> Best regards, > >>> Martin > >>> > >>> > >>>> -----Original Message----- > >>>> From: Vladimir Kozlov > >>>> Sent: Mittwoch, 13. Mai 2020 21:46 > >>>> To: Doerr, Martin ; hotspot-compiler- > >>>> dev at openjdk.java.net > >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>> > >>>> Hi Martin, > >>>> > >>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: > >>>>> Hi Vladimir, > >>>>> > >>>>> are you ok with the updated CSR > >>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? > >>>>> Should I set it to proposed? > >>>> > >>>> Yes. > >>>> > >>>>> > >>>>> Here's a new webrev with obsoletion + expiration for C2 flags in > >> ClientVM: > >>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ > >>>>> > >>>>> I've added the new C1 flags to the tests which should test C1 compiler > as > >>>> well. > >>>> > >>>> Good. Why not do the same for C1MaxInlineSize? > >>>> > >>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which > >> set > >>>> C2 flags. I think this is the best solution because it still allows running > the > >> tests > >>>> with GraalVM compiler. > >>>> > >>>> Yes. > >>>> > >>>> Thanks, > >>>> Vladimir > >>>> > >>>>> > >>>>> Best regards, > >>>>> Martin > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: Doerr, Martin > >>>>>> Sent: Freitag, 8. Mai 2020 23:07 > >>>>>> To: Vladimir Kozlov ; hotspot- > compiler- > >>>>>> dev at openjdk.java.net > >>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>> > >>>>>> Hi Vladimir, > >>>>>> > >>>>>>> You need update your CSR - add information about this and above > >> code > >>>>>> change. Example: > >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>>>> I've updated the CSR with obsolete and expired flags as in the > example. 
> >>>>>> > >>>>>>> I would suggest to fix tests anyway (there are only few) because > new > >>>>>>> warning output could be unexpected. > >>>>>> Ok. I'll prepare a webrev with fixed tests. > >>>>>> > >>>>>> Best regards, > >>>>>> Martin > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Vladimir Kozlov > >>>>>>> Sent: Freitag, 8. Mai 2020 21:43 > >>>>>>> To: Doerr, Martin ; hotspot-compiler- > >>>>>>> dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>> > >>>>>>> Hi Martin > >>>>>>> > >>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: > >>>>>>>> Hi Vladimir, > >>>>>>>> > >>>>>>>> thanks a lot for looking at this, for finding the test issues and for > >>>>>> reviewing > >>>>>>> the CSR. > >>>>>>>> > >>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually never > >>>> build > >>>>>>> without it ?? > >>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) > >>>>>>> > >>>>>>> Yes it is one of cases. > >>>>>>> > >>>>>>>> But your right, --with-jvm-variants=client configuration should still > be > >>>>>>> supported. > >>>>>>> > >>>>>>> Yes. > >>>>>>> > >>>>>>>> > >>>>>>>> We can fix it by making the flags as obsolete if C2 is not included: > >>>>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp > >>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 > 11:14:28 > >>>>>> 2020 > >>>>>>> +0200 > >>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 > >> 14:41:14 > >>>>>>> 2020 +0200 > >>>>>>>> @@ -562,6 +562,16 @@ > >>>>>>>> { "dup option", JDK_Version::jdk(9), > >>>>>> JDK_Version::undefined(), > >>>>>>> JDK_Version::undefined() }, > >>>>>>>> #endif > >>>>>>>> > >>>>>>>> +#ifndef COMPILER2 > >>>>>>>> + // These flags were generally available, but are C2 only, now. 
> >>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), > >>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>> +#endif > >>>>>>>> + > >>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } > >>>>>>>> }; > >>>>>>> > >>>>>>> Right. I think you should do full process for these product flags > >>>> deprecation > >>>>>>> with obsoleting in JDK 16 for VM builds > >>>>>>> which do not include C2. You need update your CSR - add > information > >>>>>> about > >>>>>>> this and above code change. Example: > >>>>>>> > >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>>>>> > >>>>>>>> > >>>>>>>> This makes the VM accept the flags with warning: > >>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version > >>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option > MaxInlineLevel; > >>>>>>> support was removed in 15.0 > >>>>>>>> > >>>>>>>> If we do it this way, the only test which I think should get fixed is > >>>>>>> ReservedStackTest. > >>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in > order > >> to > >>>>>>> preserve the inlining behavior. > >>>>>>>> > >>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. > >>>>>>> compiler/c2 tests: Also written to test C2 specific things.) > >>>>>>>> > >>>>>>>> What do you think? 
> >>>>>>> > >>>>>>> I would suggest to fix tests anyway (there are only few) because > new > >>>>>>> warning output could be unexpected. > >>>>>>> And it will be future-proof when warning will be converted into > error > >>>>>>> (if/when C2 goes away). > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Vladimir > >>>>>>> > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Martin > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: hotspot-compiler-dev >>>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > >>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 > >>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>>>> > >>>>>>>>> I would suggest to build VM without C2 and run tests. > >>>>>>>>> > >>>>>>>>> I grepped tests with these flags I found next tests where we > need > >> to > >>>> fix > >>>>>>>>> test's command (add > >>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > >>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with > corresponding > >> C1 > >>>>>>>>> flags (by ussing additional @test block). > >>>>>>>>> > >>>>>>>>> runtime/ReservedStack/ReservedStackTest.java > >>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java > >>>>>>>>> compiler/c2/Test6792161.java > >>>>>>>>> compiler/c2/Test5091921.java > >>>>>>>>> > >>>>>>>>> And there is issue with compiler/compilercontrol tests which use > >>>>>>>>> InlineSmallCode and I am not sure how to handle: > >>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>> > >>>> > >> > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > >>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Vladimir > >>>>>>>>> > >>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: > >>>>>>>>>> Hi Nils, > >>>>>>>>>> > >>>>>>>>>> thank you for looking at this and sorry for the late reply. 
> >>>>>>>>>> > >>>>>>>>>> I've added MaxTrivialSize and also updated the issue > accordingly. > >>>>>> Makes > >>>>>>>>> sense. > >>>>>>>>>> Do you have more flags in mind? > >>>>>>>>>> > >>>>>>>>>> Moving the flags which are only used by C2 into c2_globals > >> definitely > >>>>>>> makes > >>>>>>>>> sense. > >>>>>>>>>> > >>>>>>>>>> Done in webrev.01: > >>>>>>>>>> > >>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > >>>>>>>>>> > >>>>>>>>>> Please take a look and let me know when my proposal is ready > for > >> a > >>>>>> CSR. > >>>>>>>>>> > >>>>>>>>>> Best regards, > >>>>>>>>>> Martin > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> -----Original Message----- > >>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >>>>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 > >>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>>>>>> > >>>>>>>>>>> Hi, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for addressing this! This has been an annoyance for a > >> long > >>>>>> time. > >>>>>>>>>>> > >>>>>>>>>>> Have you though about including other flags - like > >> MaxTrivialSize? > >>>>>>>>>>> MaxInlineSize is tested against it. > >>>>>>>>>>> > >>>>>>>>>>> Also - you should move the flags that are now c2-only to > >>>>>>> c2_globals.hpp. > >>>>>>>>>>> > >>>>>>>>>>> Best regards, > >>>>>>>>>>> Nils Eliasson > >>>>>>>>>>> > >>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- > >> 8234863 > >>>>>> we > >>>>>>>>> had > >>>>>>>>>>> discussed impact on C1. > >>>>>>>>>>>> I still think it's bad to share them between both compilers. > We > >>>> may > >>>>>>> want > >>>>>>>>> to > >>>>>>>>>>> do further C2 tuning without negative impact on C1 in the > future. 
> >>>>>>>>>>>> > >>>>>>>>>>>> C1 has issues with substantial inlining because of the lack of > >>>>>>> uncommon > >>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and > code > >>>>>> cache > >>>>>>>>> space > >>>>>>>>>>> may get wasted for cold or even never executed code. The > >>>> situation > >>>>>>> gets > >>>>>>>>>>> worse when many patching stubs get used for such code. > >>>>>>>>>>>> > >>>>>>>>>>>> I had opened the following issue: > >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>>>>>>>>>>> > >>>>>>>>>>>> And my initial proposal is here: > >>>>>>>>>>>> > >>>>>>> > >> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Part of my proposal is to add an additional flag which I called > >>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. > >>>>>>>>>>>> I have a simple example which shows wasted stack space > (java > >>>>>>> example > >>>>>>>>>>> TestStack at the end). > >>>>>>>>>>>> > >>>>>>>>>>>> It simply counts stack frames until a stack overflow occurs. > With > >>>> the > >>>>>>>>> current > >>>>>>>>>>> implementation, only 1283 frames fit on the stack because the > >>>> never > >>>>>>>>>>> executed method bogus_test with local variables gets inlined. > >>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and > >> we > >>>> get > >>>>>>>>> 2310 > >>>>>>>>>>> frames until stack overflow. (I only used C1 for this example. > Can > >>>> be > >>>>>>>>>>> reproduced as shown below.) > >>>>>>>>>>>> > >>>>>>>>>>>> I didn't notice any performance regression even with the > >>>> aggressive > >>>>>>>>> setting > >>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. > >>>>>>>>>>>> > >>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get > >>>>>> feedback > >>>>>>> in > >>>>>>>>>>> general and feedback about the flag names before creating a > >> CSR. 
> >>>>>>>>>>>> I'd also be glad about feedback regarding the performance > >>>> impact. > >>>>>>>>>>>> > >>>>>>>>>>>> Best regards, > >>>>>>>>>>>> Martin > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Command line: > >>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > >> XX:C1InlineStackLimit=20 - > >>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >>>> XX:+PrintInlining > >>>>>> - > >>>>>>>>>>> > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>>>> TestStack > >>>>>>>>>>>> CompileCommand: compileonly > >> TestStack.triggerStackOverflow > >>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 > >> bytes) > >>>>>>>>> recursive > >>>>>>>>>>> inlining too deep > >>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) > inline > >>>>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>>>> 1283 activations were on stack, sum = 0 > >>>>>>>>>>>> > >>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > >> XX:C1InlineStackLimit=10 - > >>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >>>> XX:+PrintInlining > >>>>>> - > >>>>>>>>>>> > >>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>>>> TestStack > >>>>>>>>>>>> CompileCommand: compileonly > >> TestStack.triggerStackOverflow > >>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 > >> bytes) > >>>>>>>>> recursive > >>>>>>>>>>> inlining too deep > >>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) > callee > >>>> uses > >>>>>>> too > >>>>>>>>>>> much stack > >>>>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>>>> 2310 activations were on stack, sum = 0 > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> TestStack.java: > >>>>>>>>>>>> public class TestStack { > >>>>>>>>>>>> > >>>>>>>>>>>> static long cnt = 0, > >>>>>>>>>>>> sum = 0; > >>>>>>>>>>>> > >>>>>>>>>>>> public static void bogus_test() { > >>>>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; > >>>>>>>>>>>> sum += c1 + c2 + c3 + c4; > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> 
public static void triggerStackOverflow() { > >>>>>>>>>>>> cnt++; > >>>>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>>>> bogus_test(); > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> public static void main(String args[]) { > >>>>>>>>>>>> try { > >>>>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>>>> } catch (StackOverflowError e) { > >>>>>>>>>>>> System.out.println("caught " + e); > >>>>>>>>>>>> } > >>>>>>>>>>>> System.out.println(cnt + " activations were on stack, > sum > >> = " > >>>> + > >>>>>>>>> sum); > >>>>>>>>>>>> } > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>> From tobias.hartmann at oracle.com Fri May 15 11:54:50 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 15 May 2020 13:54:50 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: References: <496a3bde-09ca-adbe-1d2c-93a759623118@oracle.com> <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> Message-ID: <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com> Hi Martin, yes, looks good to me. Best regards, Tobias On 15.05.20 13:41, Doerr, Martin wrote: > Hi Vladimir, Nils and Tobias, > > Can I consider http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ reviewed by you? > Submission repo testing was successful. > > Thanks and best regards, > Martin > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Donnerstag, 14. Mai 2020 22:29 >> To: Doerr, Martin ; hotspot-compiler- >> dev at openjdk.java.net >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >> >> On 5/14/20 12:14 PM, Doerr, Martin wrote: >>> Hi Vladimir, >>> >>>> But we can use it in Test5091921.java. C1 compiles the test code with >>>> specified value before - lets keep it. >>> Ok. That makes sense for this test. Updated webrev in place. >> >> Good. 
>> >>> >>>> And this is not related to these changes but to have range(0, max_jint) for >> all >>>> these flags is questionable. I think >>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>> timeout (which is understandable) but in worst >>>> case they may crash instead of graceful exit. >>> I was wondering about that, too. But I haven't changed that. The previously >> global flags already had this range. >>> I had also thought about guessing more reasonable values, but reasonable >> limits may depend on platform and future changes. >>> I don't think we can define ranges such that everything works great while >> we stay inside and also such that nobody will ever want greater values. >>> So I prefer keeping it this way unless somebody has a better proposal. >> >> I did not mean to have that in these change. Current changes are fine for me. >> >> I was thinking aloud that it would be nice to investigate this later by >> someone. At least for some flags. We may keep >> current range as it is but may be add dynamic checks based on platform and >> other conditions. This looks like starter >> task for junior engineer or student intern. >> >> Thanks, >> Vladimir >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Mittwoch, 13. Mai 2020 23:34 >>>> To: Doerr, Martin ; hotspot-compiler- >>>> dev at openjdk.java.net >>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>> >>>> On 5/13/20 1:10 PM, Doerr, Martin wrote: >>>>> Hi Vladimir, >>>>> >>>>> thanks for reviewing it. >>>>> >>>>>>> Should I set it to proposed? >>>>>> >>>>>> Yes. >>>>> I've set it to "Finalized". Hope this was correct. >>>>> >>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >> as >>>>>> well. >>>>>> >>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>> Looks like MaxInlineSize is only used by tests which test C2 specific >> things. 
>>>> So I think C1MaxInlineSize would be pointless. >>>>> In addition to that, the C2 values are probably not appropriate for C1 in >>>> some tests. >>>>> Would you like to have C1MaxInlineSize configured in some tests? >>>> >>>> You are right in cases when test switch off TieredCompilation and use only >> C2 >>>> (Test6792161.java) or tests intrinsics. >>>> >>>> But we can use it in Test5091921.java. C1 compiles the test code with >>>> specified value before - lets keep it. >>>> >>>> And this is not related to these changes but to have range(0, max_jint) for >> all >>>> these flags is questionable. I think >>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>> timeout (which is understandable) but in worst >>>> case they may crash instead of graceful exit. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Vladimir Kozlov >>>>>> Sent: Mittwoch, 13. Mai 2020 21:46 >>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>> dev at openjdk.java.net >>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>> >>>>>> Hi Martin, >>>>>> >>>>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> are you ok with the updated CSR >>>>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? >>>>>>> Should I set it to proposed? >>>>>> >>>>>> Yes. >>>>>> >>>>>>> >>>>>>> Here's a new webrev with obsoletion + expiration for C2 flags in >>>> ClientVM: >>>>>>> >> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ >>>>>>> >>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >> as >>>>>> well. >>>>>> >>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>>> >>>>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which >>>> set >>>>>> C2 flags. I think this is the best solution because it still allows running >> the >>>> tests >>>>>> with GraalVM compiler. >>>>>> >>>>>> Yes. 
>>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> Martin >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Doerr, Martin >>>>>>>> Sent: Freitag, 8. Mai 2020 23:07 >>>>>>>> To: Vladimir Kozlov ; hotspot- >> compiler- >>>>>>>> dev at openjdk.java.net >>>>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>> >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>>> You need update your CSR - add information about this and above >>>> code >>>>>>>> change. Example: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>> I've updated the CSR with obsolete and expired flags as in the >> example. >>>>>>>> >>>>>>>>> I would suggest to fix tests anyway (there are only few) because >> new >>>>>>>>> warning output could be unexpected. >>>>>>>> Ok. I'll prepare a webrev with fixed tests. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Vladimir Kozlov >>>>>>>>> Sent: Freitag, 8. Mai 2020 21:43 >>>>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>>>> dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>> >>>>>>>>> Hi Martin >>>>>>>>> >>>>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: >>>>>>>>>> Hi Vladimir, >>>>>>>>>> >>>>>>>>>> thanks a lot for looking at this, for finding the test issues and for >>>>>>>> reviewing >>>>>>>>> the CSR. >>>>>>>>>> >>>>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually never >>>>>> build >>>>>>>>> without it ?? >>>>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) >>>>>>>>> >>>>>>>>> Yes it is one of cases. >>>>>>>>> >>>>>>>>>> But your right, --with-jvm-variants=client configuration should still >> be >>>>>>>>> supported. >>>>>>>>> >>>>>>>>> Yes. 
>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We can fix it by making the flags as obsolete if C2 is not included: >>>>>>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp >>>>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 >> 11:14:28 >>>>>>>> 2020 >>>>>>>>> +0200 >>>>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 >>>> 14:41:14 >>>>>>>>> 2020 +0200 >>>>>>>>>> @@ -562,6 +562,16 @@ >>>>>>>>>> { "dup option", JDK_Version::jdk(9), >>>>>>>> JDK_Version::undefined(), >>>>>>>>> JDK_Version::undefined() }, >>>>>>>>>> #endif >>>>>>>>>> >>>>>>>>>> +#ifndef COMPILER2 >>>>>>>>>> + // These flags were generally available, but are C2 only, now. >>>>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), >>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>> +#endif >>>>>>>>>> + >>>>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } >>>>>>>>>> }; >>>>>>>>> >>>>>>>>> Right. I think you should do full process for these product flags >>>>>> deprecation >>>>>>>>> with obsoleting in JDK 16 for VM builds >>>>>>>>> which do not include C2. You need update your CSR - add >> information >>>>>>>> about >>>>>>>>> this and above code change. 
Example: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>>> >>>>>>>>>> >>>>>>>>>> This makes the VM accept the flags with warning: >>>>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version >>>>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option >> MaxInlineLevel; >>>>>>>>> support was removed in 15.0 >>>>>>>>>> >>>>>>>>>> If we do it this way, the only test which I think should get fixed is >>>>>>>>> ReservedStackTest. >>>>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in >> order >>>> to >>>>>>>>> preserve the inlining behavior. >>>>>>>>>> >>>>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>>>>>>>> compiler/c2 tests: Also written to test C2 specific things.) >>>>>>>>>> >>>>>>>>>> What do you think? >>>>>>>>> >>>>>>>>> I would suggest to fix tests anyway (there are only few) because >> new >>>>>>>>> warning output could be unexpected. >>>>>>>>> And it will be future-proof when warning will be converted into >> error >>>>>>>>> (if/when C2 goes away). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Martin >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>> >>>>>>>>>>> I would suggest to build VM without C2 and run tests. >>>>>>>>>>> >>>>>>>>>>> I grepped tests with these flags I found next tests where we >> need >>>> to >>>>>> fix >>>>>>>>>>> test's command (add >>>>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with >> corresponding >>>> C1 >>>>>>>>>>> flags (by ussing additional @test block). 
>>>>>>>>>>> >>>>>>>>>>> runtime/ReservedStack/ReservedStackTest.java >>>>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>>>>>>>> compiler/c2/Test6792161.java >>>>>>>>>>> compiler/c2/Test5091921.java >>>>>>>>>>> >>>>>>>>>>> And there is issue with compiler/compilercontrol tests which use >>>>>>>>>>> InlineSmallCode and I am not sure how to handle: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Vladimir >>>>>>>>>>> >>>>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>>>>>>>> Hi Nils, >>>>>>>>>>>> >>>>>>>>>>>> thank you for looking at this and sorry for the late reply. >>>>>>>>>>>> >>>>>>>>>>>> I've added MaxTrivialSize and also updated the issue >> accordingly. >>>>>>>> Makes >>>>>>>>>>> sense. >>>>>>>>>>>> Do you have more flags in mind? >>>>>>>>>>>> >>>>>>>>>>>> Moving the flags which are only used by C2 into c2_globals >>>> definitely >>>>>>>>> makes >>>>>>>>>>> sense. >>>>>>>>>>>> >>>>>>>>>>>> Done in webrev.01: >>>>>>>>>>>> >>>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> Please take a look and let me know when my proposal is ready >> for >>>> a >>>>>>>> CSR. >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Martin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for addressing this! This has been an annoyance for a >>>> long >>>>>>>> time. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Have you though about including other flags - like >>>> MaxTrivialSize? >>>>>>>>>>>>> MaxInlineSize is tested against it. >>>>>>>>>>>>> >>>>>>>>>>>>> Also - you should move the flags that are now c2-only to >>>>>>>>> c2_globals.hpp. >>>>>>>>>>>>> >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Nils Eliasson >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- >>>> 8234863 >>>>>>>> we >>>>>>>>>>> had >>>>>>>>>>>>> discussed impact on C1. >>>>>>>>>>>>>> I still think it's bad to share them between both compilers. >> We >>>>>> may >>>>>>>>> want >>>>>>>>>>> to >>>>>>>>>>>>> do further C2 tuning without negative impact on C1 in the >> future. >>>>>>>>>>>>>> >>>>>>>>>>>>>> C1 has issues with substantial inlining because of the lack of >>>>>>>>> uncommon >>>>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and >> code >>>>>>>> cache >>>>>>>>>>> space >>>>>>>>>>>>> may get wasted for cold or even never executed code. The >>>>>> situation >>>>>>>>> gets >>>>>>>>>>>>> worse when many patching stubs get used for such code. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I had opened the following issue: >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>>>>>>>> >>>>>>>>>>>>>> And my initial proposal is here: >>>>>>>>>>>>>> >>>>>>>>> >>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>>>>>>>>>> I have a simple example which shows wasted stack space >> (java >>>>>>>>> example >>>>>>>>>>>>> TestStack at the end). >>>>>>>>>>>>>> >>>>>>>>>>>>>> It simply counts stack frames until a stack overflow occurs. 
>> With >>>>>> the >>>>>>>>>>> current >>>>>>>>>>>>> implementation, only 1283 frames fit on the stack because the >>>>>> never >>>>>>>>>>>>> executed method bogus_test with local variables gets inlined. >>>>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and >>>> we >>>>>> get >>>>>>>>>>> 2310 >>>>>>>>>>>>> frames until stack overflow. (I only used C1 for this example. >> Can >>>>>> be >>>>>>>>>>>>> reproduced as shown below.) >>>>>>>>>>>>>> >>>>>>>>>>>>>> I didn't notice any performance regression even with the >>>>>> aggressive >>>>>>>>>>> setting >>>>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>>>>>> feedback >>>>>>>>> in >>>>>>>>>>>>> general and feedback about the flag names before creating a >>>> CSR. >>>>>>>>>>>>>> I'd also be glad about feedback regarding the performance >>>>>> impact. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Martin >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Command line: >>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>> XX:C1InlineStackLimit=20 - >>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>> XX:+PrintInlining >>>>>>>> - >>>>>>>>>>>>> >>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>> TestStack >>>>>>>>>>>>>> CompileCommand: compileonly >>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>> bytes) >>>>>>>>>>> recursive >>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >> inline >>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>> XX:C1InlineStackLimit=10 - >>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>> XX:+PrintInlining >>>>>>>> - >>>>>>>>>>>>> >>>>>> 
XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>> TestStack >>>>>>>>>>>>>> CompileCommand: compileonly >>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>> bytes) >>>>>>>>>>> recursive >>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >> callee >>>>>> uses >>>>>>>>> too >>>>>>>>>>>>> much stack >>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> TestStack.java: >>>>>>>>>>>>>> public class TestStack { >>>>>>>>>>>>>> >>>>>>>>>>>>>> static long cnt = 0, >>>>>>>>>>>>>> sum = 0; >>>>>>>>>>>>>> >>>>>>>>>>>>>> public static void bogus_test() { >>>>>>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>>>>>>>>>> sum += c1 + c2 + c3 + c4; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> public static void triggerStackOverflow() { >>>>>>>>>>>>>> cnt++; >>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>> bogus_test(); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> public static void main(String args[]) { >>>>>>>>>>>>>> try { >>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>> } catch (StackOverflowError e) { >>>>>>>>>>>>>> System.out.println("caught " + e); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> System.out.println(cnt + " activations were on stack, >> sum >>>> = " >>>>>> + >>>>>>>>>>> sum); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>> From shade at redhat.com Fri May 15 12:04:08 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 May 2020 14:04:08 +0200 Subject: RFR(S): 8244721: CTW: C2 (Shenandoah) compilation fails with "unexpected infinite loop graph shape" In-Reply-To: <87lflt38g7.fsf@redhat.com> References: <87lflt38g7.fsf@redhat.com> Message-ID: <87871673-81ac-27dc-29fc-190d72d291e1@redhat.com> On 5/15/20 10:31 AM, Roland Westrelin wrote: > > https://bugs.openjdk.java.net/browse/JDK-8244721 > 
http://cr.openjdk.java.net/~roland/8244721/webrev.00/ Looks fine. Is there any reason why assert on L2141 does not subsume assert at L2139? 2138 } 2139 assert(mem != NULL, "should have found safepoint"); 2140 } 2141 assert(mem != NULL, "should have found safepoint"); 2142 } else { 2143 mem = phi_mem; 2144 } -- Thanks, -Aleksey From shade at redhat.com Fri May 15 12:06:06 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 May 2020 14:06:06 +0200 Subject: RFR(S): 8244663: Shenandoah: C2 assertion fails in Matcher::collect_null_checks In-Reply-To: <87imgx37sf.fsf@redhat.com> References: <87imgx37sf.fsf@redhat.com> Message-ID: <9012d060-4fbc-0841-fa52-afe3a779132a@redhat.com> On 5/15/20 10:46 AM, Roland Westrelin wrote: > https://bugs.openjdk.java.net/browse/JDK-8244663 > http://cr.openjdk.java.net/~roland/8244663/webrev.00/ Looks OK to me. -- Thanks, -Aleksey From rwestrel at redhat.com Fri May 15 12:31:46 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 15 May 2020 14:31:46 +0200 Subject: RFR(S): 8244721: CTW: C2 (Shenandoah) compilation fails with "unexpected infinite loop graph shape" In-Reply-To: <87871673-81ac-27dc-29fc-190d72d291e1@redhat.com> References: <87lflt38g7.fsf@redhat.com> <87871673-81ac-27dc-29fc-190d72d291e1@redhat.com> Message-ID: <87d0752xcd.fsf@redhat.com> Thanks for looking at this. > Is there any reason why assert on L2141 does not subsume assert at L2139? > > 2138 } > 2139 assert(mem != NULL, "should have found safepoint"); > 2140 } > 2141 assert(mem != NULL, "should have found safepoint"); > 2142 } else { > 2143 mem = phi_mem; > 2144 } assert L2139 checks that every backedge has a safepoint. assert L2141 checks that there was indeed a backedge. Roland. 
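Editorial illustration of the point above, as a minimal sketch in Java rather than the HotSpot C++ under review. The class name, method name, and the use of strings to stand in for memory states are all hypothetical; exceptions stand in for the two asserts. The inner check fires once per back edge, while the outer check catches the case where the loop over back edges never ran at all, which is why neither check subsumes the other.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model, not HotSpot code: each back edge either yields a
// safepoint memory state (a non-null string here) or null.
class BackedgeAssertSketch {

    // Mirrors the shape of the quoted snippet: 'mem' starts out null, the
    // inner check runs once per back edge, the outer check runs once after
    // the loop.
    static String findSafepointMem(List<String> memPerBackedge) {
        String mem = null;
        for (String candidate : memPerBackedge) {
            mem = candidate;
            // Inner check (L2139 in the quoted code): this particular back
            // edge must have produced a safepoint.
            if (mem == null) {
                throw new IllegalStateException("backedge without safepoint");
            }
        }
        // Outer check (L2141 in the quoted code): if the loop body never
        // ran, 'mem' is still null. The inner check cannot see this case.
        if (mem == null) {
            throw new IllegalStateException("no backedge found");
        }
        return mem;
    }

    public static void main(String[] args) {
        // Two back edges, both with a safepoint: passes both checks.
        System.out.println(findSafepointMem(Arrays.asList("sp1", "sp2")));
        // No back edge at all: only the outer check can report this.
        try {
            findSafepointMem(Collections.<String>emptyList());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Conversely, in this model a list such as ("sp1", null, "sp2") would pass the outer check alone, because the last back edge leaves a non-null value behind; only the inner check rejects it.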
From shade at redhat.com Fri May 15 12:33:12 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 May 2020 14:33:12 +0200 Subject: RFR(S): 8244721: CTW: C2 (Shenandoah) compilation fails with "unexpected infinite loop graph shape" In-Reply-To: <87d0752xcd.fsf@redhat.com> References: <87lflt38g7.fsf@redhat.com> <87871673-81ac-27dc-29fc-190d72d291e1@redhat.com> <87d0752xcd.fsf@redhat.com> Message-ID: <57da106c-a8d1-2947-bb7f-36c7dcb66bb8@redhat.com> On 5/15/20 2:31 PM, Roland Westrelin wrote: >> Is there any reason why assert on L2141 does not subsume assert at L2139? >> >> 2138 } >> 2139 assert(mem != NULL, "should have found safepoint"); >> 2140 } >> 2141 assert(mem != NULL, "should have found safepoint"); >> 2142 } else { >> 2143 mem = phi_mem; >> 2144 } > > assert L2139 checks that every backedge has a safepoint. > assert L2141 checks that there was indeed a backedge. Ah. Looks good then! -- Thanks, -Aleksey From martin.doerr at sap.com Fri May 15 16:21:49 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 15 May 2020 16:21:49 +0000 Subject: RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) bug Message-ID: Hi, acquire barriers are missing for load*_reversed nodes on PPC64 (introduced by JDK-8179527 in JDK 10). Aleksey has reported the issue: https://bugs.openjdk.java.net/browse/JDK-8245047 Here's my proposed fix: http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquire/webrev.00/ Please review. I'd appreciate retesting, too, if possible. Best regards, Martin From shade at redhat.com Fri May 15 16:41:43 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 May 2020 18:41:43 +0200 Subject: RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) 
bug In-Reply-To: References: Message-ID: <9aa3c97f-ec3f-f45d-e601-2f484494ec31@redhat.com> On 5/15/20 6:21 PM, Doerr, Martin wrote: > Aleksey has reported the issue: > https://bugs.openjdk.java.net/browse/JDK-8245047 Well, I think it is a good idea to change the synopsis. I speculated in the provisional synopsis, and thought "(?)" would prompt the edit :) Looks to me, it is "[PPC64] C2: ReverseBytes(U)S/Load(U)S always match to unordered loads". > Here's my proposed fix: > http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquire/webrev.00/ Looks fine to me. It is a bit odd to me to see that "normal" loads are matched with isync, but it seems to fit the rest of ppc64.ad that has two versions of loads, unordered/followed_by_acquire explicitly excepted. > I'd appreciate retesting, too, if possible. I'll run a few jcstress tests here. -- Thanks, -Aleksey From vladimir.kozlov at oracle.com Fri May 15 17:55:39 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 15 May 2020 10:55:39 -0700 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com> References: <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com> Message-ID: +1 Vladimir On 5/15/20 4:54 AM, Tobias Hartmann wrote: > Hi Martin, > > yes, looks good to me. > > Best regards, > Tobias > > On 15.05.20 13:41, Doerr, Martin wrote: >> Hi Vladimir, Nils and Tobias, >> >> Can I consider http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ reviewed by you? >> Submission repo testing was successful. >> >> Thanks and best regards, >> Martin >> >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Donnerstag, 14. 
Mai 2020 22:29 >>> To: Doerr, Martin ; hotspot-compiler- >>> dev at openjdk.java.net >>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>> >>> On 5/14/20 12:14 PM, Doerr, Martin wrote: >>>> Hi Vladimir, >>>> >>>>> But we can use it in Test5091921.java. C1 compiles the test code with >>>>> specified value before - lets keep it. >>>> Ok. That makes sense for this test. Updated webrev in place. >>> >>> Good. >>> >>>> >>>>> And this is not related to these changes but to have range(0, max_jint) for >>> all >>>>> these flags is questionable. I think >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>>> timeout (which is understandable) but in worst >>>>> case they may crash instead of graceful exit. >>>> I was wondering about that, too. But I haven't changed that. The previously >>> global flags already had this range. >>>> I had also thought about guessing more reasonable values, but reasonable >>> limits may depend on platform and future changes. >>>> I don't think we can define ranges such that everything works great while >>> we stay inside and also such that nobody will ever want greater values. >>>> So I prefer keeping it this way unless somebody has a better proposal. >>> >>> I did not mean to have that in these change. Current changes are fine for me. >>> >>> I was thinking aloud that it would be nice to investigate this later by >>> someone. At least for some flags. We may keep >>> current range as it is but may be add dynamic checks based on platform and >>> other conditions. This looks like starter >>> task for junior engineer or student intern. >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Thanks and best regards, >>>> Martin >>>> >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Mittwoch, 13. 
Mai 2020 23:34 >>>>> To: Doerr, Martin ; hotspot-compiler- >>>>> dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>> >>>>> On 5/13/20 1:10 PM, Doerr, Martin wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> thanks for reviewing it. >>>>>> >>>>>>>> Should I set it to proposed? >>>>>>> >>>>>>> Yes. >>>>>> I've set it to "Finalized". Hope this was correct. >>>>>> >>>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >>> as >>>>>>> well. >>>>>>> >>>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>>> Looks like MaxInlineSize is only used by tests which test C2 specific >>> things. >>>>> So I think C1MaxInlineSize would be pointless. >>>>>> In addition to that, the C2 values are probably not appropriate for C1 in >>>>> some tests. >>>>>> Would you like to have C1MaxInlineSize configured in some tests? >>>>> >>>>> You are right in cases when test switch off TieredCompilation and use only >>> C2 >>>>> (Test6792161.java) or tests intrinsics. >>>>> >>>>> But we can use it in Test5091921.java. C1 compiles the test code with >>>>> specified value before - lets keep it. >>>>> >>>>> And this is not related to these changes but to have range(0, max_jint) for >>> all >>>>> these flags is questionable. I think >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>>> timeout (which is understandable) but in worst >>>>> case they may crash instead of graceful exit. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Vladimir Kozlov >>>>>>> Sent: Mittwoch, 13. 
Mai 2020 21:46 >>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>> dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>> >>>>>>> Hi Martin, >>>>>>> >>>>>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>> are you ok with the updated CSR >>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? >>>>>>>> Should I set it to proposed? >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>>> >>>>>>>> Here's a new webrev with obsoletion + expiration for C2 flags in >>>>> ClientVM: >>>>>>>> >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ >>>>>>>> >>>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >>> as >>>>>>> well. >>>>>>> >>>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>>>> >>>>>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which >>>>> set >>>>>>> C2 flags. I think this is the best solution because it still allows running >>> the >>>>> tests >>>>>>> with GraalVM compiler. >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Doerr, Martin >>>>>>>>> Sent: Freitag, 8. Mai 2020 23:07 >>>>>>>>> To: Vladimir Kozlov ; hotspot- >>> compiler- >>>>>>>>> dev at openjdk.java.net >>>>>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>> >>>>>>>>> Hi Vladimir, >>>>>>>>> >>>>>>>>>> You need update your CSR - add information about this and above >>>>> code >>>>>>>>> change. Example: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>>> I've updated the CSR with obsolete and expired flags as in the >>> example. >>>>>>>>> >>>>>>>>>> I would suggest to fix tests anyway (there are only few) because >>> new >>>>>>>>>> warning output could be unexpected. >>>>>>>>> Ok. I'll prepare a webrev with fixed tests. 
>>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Martin >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Vladimir Kozlov >>>>>>>>>> Sent: Freitag, 8. Mai 2020 21:43 >>>>>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>>>>> dev at openjdk.java.net >>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>> >>>>>>>>>> Hi Martin >>>>>>>>>> >>>>>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: >>>>>>>>>>> Hi Vladimir, >>>>>>>>>>> >>>>>>>>>>> thanks a lot for looking at this, for finding the test issues and for >>>>>>>>> reviewing >>>>>>>>>> the CSR. >>>>>>>>>>> >>>>>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually never >>>>>>> build >>>>>>>>>> without it ?? >>>>>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) >>>>>>>>>> >>>>>>>>>> Yes it is one of cases. >>>>>>>>>> >>>>>>>>>>> But your right, --with-jvm-variants=client configuration should still >>> be >>>>>>>>>> supported. >>>>>>>>>> >>>>>>>>>> Yes. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We can fix it by making the flags as obsolete if C2 is not included: >>>>>>>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp >>>>>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 >>> 11:14:28 >>>>>>>>> 2020 >>>>>>>>>> +0200 >>>>>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 >>>>> 14:41:14 >>>>>>>>>> 2020 +0200 >>>>>>>>>>> @@ -562,6 +562,16 @@ >>>>>>>>>>> { "dup option", JDK_Version::jdk(9), >>>>>>>>> JDK_Version::undefined(), >>>>>>>>>> JDK_Version::undefined() }, >>>>>>>>>>> #endif >>>>>>>>>>> >>>>>>>>>>> +#ifndef COMPILER2 >>>>>>>>>>> + // These flags were generally available, but are C2 only, now. 
>>>>>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> +#endif >>>>>>>>>>> + >>>>>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } >>>>>>>>>>> }; >>>>>>>>>> >>>>>>>>>> Right. I think you should do full process for these product flags >>>>>>> deprecation >>>>>>>>>> with obsoleting in JDK 16 for VM builds >>>>>>>>>> which do not include C2. You need update your CSR - add >>> information >>>>>>>>> about >>>>>>>>>> this and above code change. Example: >>>>>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This makes the VM accept the flags with warning: >>>>>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version >>>>>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option >>> MaxInlineLevel; >>>>>>>>>> support was removed in 15.0 >>>>>>>>>>> >>>>>>>>>>> If we do it this way, the only test which I think should get fixed is >>>>>>>>>> ReservedStackTest. >>>>>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in >>> order >>>>> to >>>>>>>>>> preserve the inlining behavior. >>>>>>>>>>> >>>>>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>>>>>>>>> compiler/c2 tests: Also written to test C2 specific things.) >>>>>>>>>>> >>>>>>>>>>> What do you think? 
>>>>>>>>>> >>>>>>>>>> I would suggest to fix tests anyway (there are only few) because >>> new >>>>>>>>>> warning output could be unexpected. >>>>>>>>>> And it will be future-proof when warning will be converted into >>> error >>>>>>>>>> (if/when C2 goes away). >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Vladimir >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Martin >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>>> >>>>>>>>>>>> I would suggest to build VM without C2 and run tests. >>>>>>>>>>>> >>>>>>>>>>>> I grepped tests with these flags I found next tests where we >>> need >>>>> to >>>>>>> fix >>>>>>>>>>>> test's command (add >>>>>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with >>> corresponding >>>>> C1 >>>>>>>>>>>> flags (by ussing additional @test block). >>>>>>>>>>>> >>>>>>>>>>>> runtime/ReservedStack/ReservedStackTest.java >>>>>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>>>>>>>>> compiler/c2/Test6792161.java >>>>>>>>>>>> compiler/c2/Test5091921.java >>>>>>>>>>>> >>>>>>>>>>>> And there is issue with compiler/compilercontrol tests which use >>>>>>>>>>>> InlineSmallCode and I am not sure how to handle: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Vladimir >>>>>>>>>>>> >>>>>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>>>>>>>>> Hi Nils, >>>>>>>>>>>>> >>>>>>>>>>>>> thank you for looking at this and sorry for the late reply. 
>>>>>>>>>>>>> >>>>>>>>>>>>> I've added MaxTrivialSize and also updated the issue >>> accordingly. >>>>>>>>> Makes >>>>>>>>>>>> sense. >>>>>>>>>>>>> Do you have more flags in mind? >>>>>>>>>>>>> >>>>>>>>>>>>> Moving the flags which are only used by C2 into c2_globals >>>>> definitely >>>>>>>>>> makes >>>>>>>>>>>> sense. >>>>>>>>>>>>> >>>>>>>>>>>>> Done in webrev.01: >>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>>>>>>>>> >>>>>>>>>>>>> Please take a look and let me know when my proposal is ready >>> for >>>>> a >>>>>>>>> CSR. >>>>>>>>>>>>> >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Martin >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for addressing this! This has been an annoyance for a >>>>> long >>>>>>>>> time. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Have you though about including other flags - like >>>>> MaxTrivialSize? >>>>>>>>>>>>>> MaxInlineSize is tested against it. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also - you should move the flags that are now c2-only to >>>>>>>>>> c2_globals.hpp. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Nils Eliasson >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- >>>>> 8234863 >>>>>>>>> we >>>>>>>>>>>> had >>>>>>>>>>>>>> discussed impact on C1. >>>>>>>>>>>>>>> I still think it's bad to share them between both compilers. >>> We >>>>>>> may >>>>>>>>>> want >>>>>>>>>>>> to >>>>>>>>>>>>>> do further C2 tuning without negative impact on C1 in the >>> future. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> C1 has issues with substantial inlining because of the lack of >>>>>>>>>> uncommon >>>>>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and >>> code >>>>>>>>> cache >>>>>>>>>>>> space >>>>>>>>>>>>>> may get wasted for cold or even never executed code. The >>>>>>> situation >>>>>>>>>> gets >>>>>>>>>>>>>> worse when many patching stubs get used for such code. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I had opened the following issue: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And my initial proposal is here: >>>>>>>>>>>>>>> >>>>>>>>>> >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>>>>>>>>>>> I have a simple example which shows wasted stack space >>> (java >>>>>>>>>> example >>>>>>>>>>>>>> TestStack at the end). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It simply counts stack frames until a stack overflow occurs. >>> With >>>>>>> the >>>>>>>>>>>> current >>>>>>>>>>>>>> implementation, only 1283 frames fit on the stack because the >>>>>>> never >>>>>>>>>>>>>> executed method bogus_test with local variables gets inlined. >>>>>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and >>>>> we >>>>>>> get >>>>>>>>>>>> 2310 >>>>>>>>>>>>>> frames until stack overflow. (I only used C1 for this example. >>> Can >>>>>>> be >>>>>>>>>>>>>> reproduced as shown below.) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I didn't notice any performance regression even with the >>>>>>> aggressive >>>>>>>>>>>> setting >>>>>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>>>>>>> feedback >>>>>>>>>> in >>>>>>>>>>>>>> general and feedback about the flag names before creating a >>>>> CSR. >>>>>>>>>>>>>>> I'd also be glad about feedback regarding the performance >>>>>>> impact. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Command line: >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>>> XX:C1InlineStackLimit=20 - >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>>> XX:+PrintInlining >>>>>>>>> - >>>>>>>>>>>>>> >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>>> TestStack >>>>>>>>>>>>>>> CompileCommand: compileonly >>>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>>> bytes) >>>>>>>>>>>> recursive >>>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >>> inline >>>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>>> XX:C1InlineStackLimit=10 - >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>>> XX:+PrintInlining >>>>>>>>> - >>>>>>>>>>>>>> >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>>> TestStack >>>>>>>>>>>>>>> CompileCommand: compileonly >>>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>>> bytes) >>>>>>>>>>>> recursive >>>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >>> callee >>>>>>> uses >>>>>>>>>> too >>>>>>>>>>>>>> much stack >>>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> TestStack.java: >>>>>>>>>>>>>>> public class TestStack { 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> static long cnt = 0, >>>>>>>>>>>>>>> sum = 0; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void bogus_test() { >>>>>>>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>>>>>>>>>>> sum += c1 + c2 + c3 + c4; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void triggerStackOverflow() { >>>>>>>>>>>>>>> cnt++; >>>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>>> bogus_test(); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void main(String args[]) { >>>>>>>>>>>>>>> try { >>>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>>> } catch (StackOverflowError e) { >>>>>>>>>>>>>>> System.out.println("caught " + e); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> System.out.println(cnt + " activations were on stack, >>> sum >>>>> = " >>>>>>> + >>>>>>>>>>>> sum); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>> From xxinliu at amazon.com Fri May 15 18:49:00 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Fri, 15 May 2020 18:49:00 +0000 Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse Message-ID: Hi, Please review the following patch. Jbs: https://bugs.openjdk.java.net/browse/JDK-8245051 Webrev: http://cr.openjdk.java.net/~xliu/8245051/00/webrev/ I found this problem from centos?s java-11-openjdk. https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327 '-std=gnu++98' is not a valid option for cc1. As a result, configure will fail to determine gcc supports -fno-lifetime-dse or not. C1 acts weird if it is compiled by GCC without that flag. After then, I built hotspot with -fsanitize=undefined. I found some interesting points for c1. With this patch, I can build the whole openjdk without -fno-lifetime-dse. I've tested hotspot:tier1 and "gtest:all". Even though everything looks fine, I don't think we reach a point to lift "-fno-lifetime-dse". This patch just attempts to fix C1. 
Thanks,
--lx

From shade at redhat.com  Fri May 15 19:20:19 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 May 2020 21:20:19 +0200
Subject: RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) bug
In-Reply-To: <9aa3c97f-ec3f-f45d-e601-2f484494ec31@redhat.com>
References: <9aa3c97f-ec3f-f45d-e601-2f484494ec31@redhat.com>
Message-ID: <5be25ae9-e171-0f4e-9254-df3d7934fe12@redhat.com>

On 5/15/20 6:41 PM, Aleksey Shipilev wrote:
> On 5/15/20 6:21 PM, Doerr, Martin wrote:
>> I'd appreciate retesting, too, if possible.
>
> I'll run a few jcstress tests here.

jcstress seems to run fine with this patch on ppc64le.

--
Thanks,
-Aleksey

From xxinliu at amazon.com  Sat May 16 00:25:00 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Sat, 16 May 2020 00:25:00 +0000
Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To: 
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com>
Message-ID: <55D731CD-5C89-47E5-B61A-83304E5123ED@amazon.com>

Hi, Martin,

It's my fault. I should associate with the bug id in the subject. I will double check the email subject before sending.
Now this thread becomes an RFR for JDK-8244949.

So, you just try it out in ppc? If it works well, we will apply this approach to both 390 and aarch64?

I am not reviewer. just my 2 cents.
1. nativeInst_ppc.hpp. I think the return type of get_stop_type() should be 'int' here.
   bool get_stop_type() {
     return MacroAssembler::tdi_get_si16(long_at(0), Assembler::traptoUnconditional, 0);
   }
2. nativeInst_ppc.cpp
   NativeInstruction::is_sigill_zombie_not_entrant_at() comment still says "iff !UseSIGTRAP".
3. TrapBasedNotEntrantChecks is a product option. Is it too abrupt to drop it?
This wiki defines the lifecycle of a Jvm option, but I believe this one is special. It's ppc only.
https://wiki.openjdk.java.net/display/HotSpot/Hotspot+Command-line+Flags%3A+Kinds%2C+Lifecycle+and+the+CSR+Process

Thanks,
--lx

On 5/15/20, 3:38 AM, "Doerr, Martin" wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.

Exactly, we get stop type + stop message + registers + instructions (unfortunately not disassembled for some reason) + nice stack trace.

Best regards,
Martin

> -----Original Message-----
> From: Andrew Haley
> Sent: Freitag, 15. Mai 2020 10:57
> To: Doerr, Martin; Derek White; Ningsheng Jian; Liu, Xin; hotspot-compiler-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86
>
> On 5/14/20 8:24 PM, Doerr, Martin wrote:
> > Just for you upfront in case you would like to take a look:
> > https://bugs.openjdk.java.net/browse/JDK-8244949
> > http://cr.openjdk.java.net/~mdoerr/8244949_ppc64_asm_stop/webrev.00/
>
> That's nice: it's a lot more full-featured than what I did. and AFAICS
> you'll get the registers printed in the VM error log.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From kim.barrett at oracle.com  Sat May 16 23:28:28 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 16 May 2020 19:28:28 -0400
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To: 
References: 
Message-ID: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>

> On May 15, 2020, at 2:49 PM, Liu, Xin wrote:
>
> Hi,
>
> Please review the following patch.
> Jbs: https://bugs.openjdk.java.net/browse/JDK-8245051
> Webrev: http://cr.openjdk.java.net/~xliu/8245051/00/webrev/
>
> I found this problem from centos's java-11-openjdk. https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327
> '-std=gnu++98' is not a valid option for cc1. As a result, configure will fail to determine gcc supports -fno-lifetime-dse or not.
> C1 acts weird if it is compiled by GCC without that flag.
>
> After then, I built hotspot with -fsanitize=undefined. I found some interesting points for c1. With this patch, I can build the whole openjdk without -fno-lifetime-dse.
> I've tested hotspot:tier1 and "gtest:all". Even though everything looks fine, I don't think we reach a point to lift "-fno-lifetime-dse". This patch just attempts to fix C1.
>
> Thanks,
> --lx

Thanks for finding these.

The change to c1_ValueMap.cpp looks good to me.

The changes to c1_Instruction.hpp also look good to me. In addition, Instruction no longer needs to befriend BlockBegin. Please make that change before pushing.

However, I have to question the configure options being used. Some (many? all?) of the options being passed in as --with-extra-cflags and friends are already used properly by the build system, and don't need to be added that way, and may even cause problems. That's what's happening with -std=gnu++98, which is not a valid option for compiling C code, but the build system will use it when compiling C++ code. That's been true since JDK 9 (JDK-8156980).

We definitely cannot remove -fno-lifetime-dse. There are a number of other places where we're doing weird things in allocators/deallocators that are similarly bogus. The ResourceObj allocation_type stuff is just one example; I'm pretty sure there are others.

I'm running these changes (including the friendship removal) through more extensive testing. I'll report back when done, but I don't expect that to find anything.

It looks like you will need a sponsor?
Hopefully a compiler person will also review and sponsor.

From xxinliu at amazon.com  Sun May 17 21:34:04 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Sun, 17 May 2020 21:34:04 +0000
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID: 

Hi, Kim,

Thank you for reviewing my patch. I have removed the friend class BlockBegin. Here is the new revision:
http://cr.openjdk.java.net/~xliu/8245051/01/webrev/

About --with-extra-cflags, I completely agree. However, how to configure OpenJDK is not under control. There're so many linux distributions.
In addition, it might cause subtle bugs if the toolchain is not gcc. From my side, I'd like to get rid of as many undefined behaviors as we can.

Yes, I still need a reviewer and a sponsor.

Thanks,
--lx

On 5/16/20, 4:29 PM, "Kim Barrett" wrote:

> On May 15, 2020, at 2:49 PM, Liu, Xin wrote:
>
> Hi,
>
> Please review the following patch.
> Jbs: https://bugs.openjdk.java.net/browse/JDK-8245051
> Webrev: http://cr.openjdk.java.net/~xliu/8245051/00/webrev/
>
> I found this problem from centos's java-11-openjdk. https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327
> '-std=gnu++98' is not a valid option for cc1. As a result, configure will fail to determine gcc supports -fno-lifetime-dse or not.
> C1 acts weird if it is compiled by GCC without that flag.
>
> After then, I built hotspot with -fsanitize=undefined. I found some interesting points for c1. With this patch, I can build the whole openjdk without -fno-lifetime-dse.
> I've tested hotspot:tier1 and "gtest:all". Even though everything looks fine, I don't think we reach a point to lift "-fno-lifetime-dse".
This patch just attempts to fix C1.
>
> Thanks,
> --lx

Thanks for finding these.

The change to c1_ValueMap.cpp looks good to me.

The changes to c1_Instruction.hpp also look good to me. In addition, Instruction no longer needs to befriend BlockBegin. Please make that change before pushing.

However, I have to question the configure options being used. Some (many? all?) of the options being passed in as --with-extra-cflags and friends are already used properly by the build system, and don't need to be added that way, and may even cause problems. That's what's happening with -std=gnu++98, which is not a valid option for compiling C code, but the build system will use it when compiling C++ code. That's been true since JDK 9 (JDK-8156980).

We definitely cannot remove -fno-lifetime-dse. There are a number of other places where we're doing weird things in allocators/deallocators that are similarly bogus. The ResourceObj allocation_type stuff is just one example; I'm pretty sure there are others.

I'm running these changes (including the friendship removal) through more extensive testing. I'll report back when done, but I don't expect that to find anything.

It looks like you will need a sponsor? Hopefully a compiler person will also review and sponsor.

From Yang.Zhang at arm.com  Mon May 18 05:51:03 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Mon, 18 May 2020 05:51:03 +0000
Subject: [aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs
In-Reply-To: 
References: 
Message-ID: 

Hi all

Re-ping. If anyone has bandwidth, would he be able to review this patch?

Regards
Yang

-----Original Message-----
From: aarch64-port-dev On Behalf Of Yang Zhang
Sent: Wednesday, May 6, 2020 4:46 PM
To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd
Subject: [aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8243597
Webrev: http://cr.openjdk.java.net/~yzhang/8243597/webrev.00/

In JDK-8222074 [1], x86 enables auto vectorization for integer vector abs, and jtreg tests are also added.
In this patch, the missing AbsVB/S/I/L support for AArch64 is added.

Testing:
Full jtreg test
Vector API tests which cover vector abs

Test case:
public static void absvs(short[] a, short[] b, short[] c) {
    for (int i = 0; i < a.length; i++) {
        c[i] = (short)Math.abs((a[i] + b[i]));
    }
}

Assembly code generated by C2:
0x0000ffffaca3f3ac: ldr q17, [x16, #16]
0x0000ffffaca3f3b0: ldr q16, [x15, #16]
0x0000ffffaca3f3b4: add v16.8h, v16.8h, v17.8h
0x0000ffffaca3f3b8: abs v16.8h, v16.8h
0x0000ffffaca3f3c0: str q16, [x12, #16]

Similar test cases for byte/int/long are also tested and NEON abs instruction is generated by C2.

Performance:
JMH tests are uploaded.
http://cr.openjdk.java.net/~yzhang/8243597/TestScalar.java
http://cr.openjdk.java.net/~yzhang/8243597/TestVect.java

Vector abs:
Before:
Benchmark                (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB     1024  avgt    5  1041.720 ± 2.606  us/op
TestVect.testVectAbsVI     1024  avgt    5   659.788 ± 2.057  us/op
TestVect.testVectAbsVL     1024  avgt    5   711.043 ± 5.489  us/op
TestVect.testVectAbsVS     1024  avgt    5   659.157 ± 2.531  us/op

After:
Benchmark                (size)  Mode  Cnt     Score   Error  Units
TestVect.testVectAbsVB     1024  avgt    5    88.821 ± 1.886  us/op
TestVect.testVectAbsVI     1024  avgt    5   199.081 ± 2.539  us/op
TestVect.testVectAbsVL     1024  avgt    5   447.536 ± 1.195  us/op
TestVect.testVectAbsVS     1024  avgt    5   119.172 ± 0.340  us/op

Scalar abs:
Before:
Benchmark                (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI        1024  avgt    5  3770.345 ± 6.760  us/op
TestScalar.testAbsL        1024  avgt    5  3767.570 ± 9.097  us/op

After:
Benchmark                (size)  Mode  Cnt     Score   Error  Units
TestScalar.testAbsI        1024  avgt    5  3141.312 ± 2.000  us/op
TestScalar.testAbsL        1024  avgt    5  3103.143 ± 8.989  us/op

[1] https://bugs.openjdk.java.net/browse/JDK-8222074

Regards
Yang

From martin.doerr at sap.com  Mon May 18 09:05:46 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 18 May 2020 09:05:46 +0000
Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags
In-Reply-To: 
References: <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com>
Message-ID: 

Thanks everyone for all assistance regarding reviews, tests, benchmarks and approving the CSR.

Pushed to jdk/jdk.
(I already had a "looks good" from Nils.)

If you would like to have anything from my side wrt. a release note, just let me know. I can also create a subtask for it if desired.

Best regards,
Martin


> -----Original Message-----
> From: Vladimir Kozlov
> Sent: Freitag, 15. Mai 2020 19:56
> To: Tobias Hartmann; Doerr, Martin; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
>
> +1
>
> Vladimir
>
> On 5/15/20 4:54 AM, Tobias Hartmann wrote:
> > Hi Martin,
> >
> > yes, looks good to me.
> >
> > Best regards,
> > Tobias
> >
> > On 15.05.20 13:41, Doerr, Martin wrote:
> >> Hi Vladimir, Nils and Tobias,
> >>
> >> Can I consider http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ reviewed by you?
> >> Submission repo testing was successful.
> >>
> >> Thanks and best regards,
> >> Martin
> >>
> >>
> >>> -----Original Message-----
> >>> From: Vladimir Kozlov
> >>> Sent: Donnerstag, 14. Mai 2020 22:29
> >>> To: Doerr, Martin; hotspot-compiler-dev at openjdk.java.net
> >>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
> >>>
> >>> On 5/14/20 12:14 PM, Doerr, Martin wrote:
> >>>> Hi Vladimir,
> >>>>
> >>>>> But we can use it in Test5091921.java. C1 compiles the test code with
> >>>>> specified value before - lets keep it.
> >>>> Ok.
That makes sense for this test. Updated webrev in place. > >>> > >>> Good. > >>> > >>>> > >>>>> And this is not related to these changes but to have range(0, > max_jint) for > >>> all > >>>>> these flags is questionable. I think > >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple > >>>>> timeout (which is understandable) but in worst > >>>>> case they may crash instead of graceful exit. > >>>> I was wondering about that, too. But I haven't changed that. The > previously > >>> global flags already had this range. > >>>> I had also thought about guessing more reasonable values, but > reasonable > >>> limits may depend on platform and future changes. > >>>> I don't think we can define ranges such that everything works great > while > >>> we stay inside and also such that nobody will ever want greater values. > >>>> So I prefer keeping it this way unless somebody has a better proposal. > >>> > >>> I did not mean to have that in these change. Current changes are fine for > me. > >>> > >>> I was thinking aloud that it would be nice to investigate this later by > >>> someone. At least for some flags. We may keep > >>> current range as it is but may be add dynamic checks based on platform > and > >>> other conditions. This looks like starter > >>> task for junior engineer or student intern. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>>> > >>>> Thanks and best regards, > >>>> Martin > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Mittwoch, 13. Mai 2020 23:34 > >>>>> To: Doerr, Martin ; hotspot-compiler- > >>>>> dev at openjdk.java.net > >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>> > >>>>> On 5/13/20 1:10 PM, Doerr, Martin wrote: > >>>>>> Hi Vladimir, > >>>>>> > >>>>>> thanks for reviewing it. > >>>>>> > >>>>>>>> Should I set it to proposed? > >>>>>>> > >>>>>>> Yes. > >>>>>> I've set it to "Finalized". Hope this was correct. 
> >>>>>> > >>>>>>>> I've added the new C1 flags to the tests which should test C1 > compiler > >>> as > >>>>>>> well. > >>>>>>> > >>>>>>> Good. Why not do the same for C1MaxInlineSize? > >>>>>> Looks like MaxInlineSize is only used by tests which test C2 specific > >>> things. > >>>>> So I think C1MaxInlineSize would be pointless. > >>>>>> In addition to that, the C2 values are probably not appropriate for C1 > in > >>>>> some tests. > >>>>>> Would you like to have C1MaxInlineSize configured in some tests? > >>>>> > >>>>> You are right in cases when test switch off TieredCompilation and use > only > >>> C2 > >>>>> (Test6792161.java) or tests intrinsics. > >>>>> > >>>>> But we can use it in Test5091921.java. C1 compiles the test code with > >>>>> specified value before - lets keep it. > >>>>> > >>>>> And this is not related to these changes but to have range(0, > max_jint) for > >>> all > >>>>> these flags is questionable. I think > >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple > >>>>> timeout (which is understandable) but in worst > >>>>> case they may crash instead of graceful exit. > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>>> > >>>>>> Best regards, > >>>>>> Martin > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Vladimir Kozlov > >>>>>>> Sent: Mittwoch, 13. Mai 2020 21:46 > >>>>>>> To: Doerr, Martin ; hotspot-compiler- > >>>>>>> dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>> > >>>>>>> Hi Martin, > >>>>>>> > >>>>>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: > >>>>>>>> Hi Vladimir, > >>>>>>>> > >>>>>>>> are you ok with the updated CSR > >>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? > >>>>>>>> Should I set it to proposed? > >>>>>>> > >>>>>>> Yes. 
> >>>>>>> > >>>>>>>> > >>>>>>>> Here's a new webrev with obsoletion + expiration for C2 flags in > >>>>> ClientVM: > >>>>>>>> > >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ > >>>>>>>> > >>>>>>>> I've added the new C1 flags to the tests which should test C1 > compiler > >>> as > >>>>>>> well. > >>>>>>> > >>>>>>> Good. Why not do the same for C1MaxInlineSize? > >>>>>>> > >>>>>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests > which > >>>>> set > >>>>>>> C2 flags. I think this is the best solution because it still allows > running > >>> the > >>>>> tests > >>>>>>> with GraalVM compiler. > >>>>>>> > >>>>>>> Yes. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Vladimir > >>>>>>> > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Martin > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: Doerr, Martin > >>>>>>>>> Sent: Freitag, 8. Mai 2020 23:07 > >>>>>>>>> To: Vladimir Kozlov ; hotspot- > >>> compiler- > >>>>>>>>> dev at openjdk.java.net > >>>>>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>>>> > >>>>>>>>> Hi Vladimir, > >>>>>>>>> > >>>>>>>>>> You need update your CSR - add information about this and > above > >>>>> code > >>>>>>>>> change. Example: > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>>>>>>> I've updated the CSR with obsolete and expired flags as in the > >>> example. > >>>>>>>>> > >>>>>>>>>> I would suggest to fix tests anyway (there are only few) > because > >>> new > >>>>>>>>>> warning output could be unexpected. > >>>>>>>>> Ok. I'll prepare a webrev with fixed tests. > >>>>>>>>> > >>>>>>>>> Best regards, > >>>>>>>>> Martin > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> -----Original Message----- > >>>>>>>>>> From: Vladimir Kozlov > >>>>>>>>>> Sent: Freitag, 8. 
Mai 2020 21:43 > >>>>>>>>>> To: Doerr, Martin ; hotspot-compiler- > >>>>>>>>>> dev at openjdk.java.net > >>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags > >>>>>>>>>> > >>>>>>>>>> Hi Martin > >>>>>>>>>> > >>>>>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: > >>>>>>>>>>> Hi Vladimir, > >>>>>>>>>>> > >>>>>>>>>>> thanks a lot for looking at this, for finding the test issues and > for > >>>>>>>>> reviewing > >>>>>>>>>> the CSR. > >>>>>>>>>>> > >>>>>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually > never > >>>>>>> build > >>>>>>>>>> without it ?? > >>>>>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) > >>>>>>>>>> > >>>>>>>>>> Yes it is one of cases. > >>>>>>>>>> > >>>>>>>>>>> But your right, --with-jvm-variants=client configuration should > still > >>> be > >>>>>>>>>> supported. > >>>>>>>>>> > >>>>>>>>>> Yes. > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> We can fix it by making the flags as obsolete if C2 is not > included: > >>>>>>>>>>> diff -r 5f5ed86d7883 > src/hotspot/share/runtime/arguments.cpp > >>>>>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 > >>> 11:14:28 > >>>>>>>>> 2020 > >>>>>>>>>> +0200 > >>>>>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 > >>>>> 14:41:14 > >>>>>>>>>> 2020 +0200 > >>>>>>>>>>> @@ -562,6 +562,16 @@ > >>>>>>>>>>> { "dup option", JDK_Version::jdk(9), > >>>>>>>>> JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::undefined() }, > >>>>>>>>>>> #endif > >>>>>>>>>>> > >>>>>>>>>>> +#ifndef COMPILER2 > >>>>>>>>>>> + // These flags were generally available, but are C2 only, > now. 
> >>>>>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), > >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, > >>>>>>>>>>> +#endif > >>>>>>>>>>> + > >>>>>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } > >>>>>>>>>>> }; > >>>>>>>>>> > >>>>>>>>>> Right. I think you should do full process for these product flags > >>>>>>> deprecation > >>>>>>>>>> with obsoleting in JDK 16 for VM builds > >>>>>>>>>> which do not include C2. You need update your CSR - add > >>> information > >>>>>>>>> about > >>>>>>>>>> this and above code change. Example: > >>>>>>>>>> > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> This makes the VM accept the flags with warning: > >>>>>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version > >>>>>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option > >>> MaxInlineLevel; > >>>>>>>>>> support was removed in 15.0 > >>>>>>>>>>> > >>>>>>>>>>> If we do it this way, the only test which I think should get fixed > is > >>>>>>>>>> ReservedStackTest. > >>>>>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in > >>> order > >>>>> to > >>>>>>>>>> preserve the inlining behavior. > >>>>>>>>>>> > >>>>>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics > anymore. > >>>>>>>>>> compiler/c2 tests: Also written to test C2 specific things.) 
> >>>>>>>>>>> > >>>>>>>>>>> What do you think? > >>>>>>>>>> > >>>>>>>>>> I would suggest to fix tests anyway (there are only few) > because > >>> new > >>>>>>>>>> warning output could be unexpected. > >>>>>>>>>> And it will be future-proof when warning will be converted into > >>> error > >>>>>>>>>> (if/when C2 goes away). > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Vladimir > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Best regards, > >>>>>>>>>>> Martin > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov > >>>>>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 > >>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control > flags > >>>>>>>>>>>> > >>>>>>>>>>>> I would suggest to build VM without C2 and run tests. > >>>>>>>>>>>> > >>>>>>>>>>>> I grepped tests with these flags I found next tests where we > >>> need > >>>>> to > >>>>>>> fix > >>>>>>>>>>>> test's command (add > >>>>>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires > >>>>>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with > >>> corresponding > >>>>> C1 > >>>>>>>>>>>> flags (by ussing additional @test block). 
> >>>>>>>>>>>> > >>>>>>>>>>>> runtime/ReservedStack/ReservedStackTest.java > >>>>>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java > >>>>>>>>>>>> compiler/c2/Test6792161.java > >>>>>>>>>>>> compiler/c2/Test5091921.java > >>>>>>>>>>>> > >>>>>>>>>>>> And there is issue with compiler/compilercontrol tests which > use > >>>>>>>>>>>> InlineSmallCode and I am not sure how to handle: > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>> > >>> > http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c > >>>>>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> Vladimir > >>>>>>>>>>>> > >>>>>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: > >>>>>>>>>>>>> Hi Nils, > >>>>>>>>>>>>> > >>>>>>>>>>>>> thank you for looking at this and sorry for the late reply. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I've added MaxTrivialSize and also updated the issue > >>> accordingly. > >>>>>>>>> Makes > >>>>>>>>>>>> sense. > >>>>>>>>>>>>> Do you have more flags in mind? > >>>>>>>>>>>>> > >>>>>>>>>>>>> Moving the flags which are only used by C2 into c2_globals > >>>>> definitely > >>>>>>>>>> makes > >>>>>>>>>>>> sense. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Done in webrev.01: > >>>>>>>>>>>>> > >>>>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ > >>>>>>>>>>>>> > >>>>>>>>>>>>> Please take a look and let me know when my proposal is > ready > >>> for > >>>>> a > >>>>>>>>> CSR. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Best regards, > >>>>>>>>>>>>> Martin > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson > >>>>>>>>>>>>>> Sent: Dienstag, 28. 
April 2020 18:29 > >>>>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net > >>>>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control > flags > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks for addressing this! This has been an annoyance > for a > >>>>> long > >>>>>>>>> time. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Have you though about including other flags - like > >>>>> MaxTrivialSize? > >>>>>>>>>>>>>> MaxInlineSize is tested against it. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Also - you should move the flags that are now c2-only to > >>>>>>>>>> c2_globals.hpp. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Best regards, > >>>>>>>>>>>>>> Nils Eliasson > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: > >>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> while tuning inlining parameters for C2 compiler with > JDK- > >>>>> 8234863 > >>>>>>>>> we > >>>>>>>>>>>> had > >>>>>>>>>>>>>> discussed impact on C1. > >>>>>>>>>>>>>>> I still think it's bad to share them between both > compilers. > >>> We > >>>>>>> may > >>>>>>>>>> want > >>>>>>>>>>>> to > >>>>>>>>>>>>>> do further C2 tuning without negative impact on C1 in the > >>> future. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> C1 has issues with substantial inlining because of the lack > of > >>>>>>>>>> uncommon > >>>>>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large > and > >>> code > >>>>>>>>> cache > >>>>>>>>>>>> space > >>>>>>>>>>>>>> may get wasted for cold or even never executed code. > The > >>>>>>> situation > >>>>>>>>>> gets > >>>>>>>>>>>>>> worse when many patching stubs get used for such code. 
> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I had opened the following issue: > >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> And my initial proposal is here: > >>>>>>>>>>>>>>> > >>>>>>>>>> > >>>>> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Part of my proposal is to add an additional flag which I > called > >>>>>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 > methods. > >>>>>>>>>>>>>>> I have a simple example which shows wasted stack space > >>> (java > >>>>>>>>>> example > >>>>>>>>>>>>>> TestStack at the end). > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> It simply counts stack frames until a stack overflow > occurs. > >>> With > >>>>>>> the > >>>>>>>>>>>> current > >>>>>>>>>>>>>> implementation, only 1283 frames fit on the stack because > the > >>>>>>> never > >>>>>>>>>>>>>> executed method bogus_test with local variables gets > inlined. > >>>>>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test > and > >>>>> we > >>>>>>> get > >>>>>>>>>>>> 2310 > >>>>>>>>>>>>>> frames until stack overflow. (I only used C1 for this > example. > >>> Can > >>>>>>> be > >>>>>>>>>>>>>> reproduced as shown below.) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I didn't notice any performance regression even with the > >>>>>>> aggressive > >>>>>>>>>>>> setting > >>>>>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to > get > >>>>>>>>> feedback > >>>>>>>>>> in > >>>>>>>>>>>>>> general and feedback about the flag names before > creating a > >>>>> CSR. > >>>>>>>>>>>>>>> I'd also be glad about feedback regarding the > performance > >>>>>>> impact. 
> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Best regards, > >>>>>>>>>>>>>>> Martin > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Command line: > >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > >>>>> XX:C1InlineStackLimit=20 - > >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >>>>>>> XX:+PrintInlining > >>>>>>>>> - > >>>>>>>>>>>>>> > >>>>>>> > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>>>>>>> TestStack > >>>>>>>>>>>>>>> CompileCommand: compileonly > >>>>> TestStack.triggerStackOverflow > >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow > (15 > >>>>> bytes) > >>>>>>>>>>>> recursive > >>>>>>>>>>>>>> inlining too deep > >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 > bytes) > >>> inline > >>>>>>>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>>>>>>> 1283 activations were on stack, sum = 0 > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - > >>>>> XX:C1InlineStackLimit=10 - > >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - > >>>>>>> XX:+PrintInlining > >>>>>>>>> - > >>>>>>>>>>>>>> > >>>>>>> > XX:CompileCommand=compileonly,TestStack::triggerStackOverflow > >>>>>>>>>>>>>> TestStack > >>>>>>>>>>>>>>> CompileCommand: compileonly > >>>>> TestStack.triggerStackOverflow > >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow > (15 > >>>>> bytes) > >>>>>>>>>>>> recursive > >>>>>>>>>>>>>> inlining too deep > >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 > bytes) > >>> callee > >>>>>>> uses > >>>>>>>>>> too > >>>>>>>>>>>>>> much stack > >>>>>>>>>>>>>>> caught java.lang.StackOverflowError > >>>>>>>>>>>>>>> 2310 activations were on stack, sum = 0 > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> TestStack.java: > >>>>>>>>>>>>>>> public class TestStack { > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> static long cnt = 0, > >>>>>>>>>>>>>>> sum = 0; > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> public static void bogus_test() { > >>>>>>>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 
= 4; > >>>>>>>>>>>>>>> sum += c1 + c2 + c3 + c4; > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> public static void triggerStackOverflow() { > >>>>>>>>>>>>>>> cnt++; > >>>>>>>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>>>>>>> bogus_test(); > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> public static void main(String args[]) { > >>>>>>>>>>>>>>> try { > >>>>>>>>>>>>>>> triggerStackOverflow(); > >>>>>>>>>>>>>>> } catch (StackOverflowError e) { > >>>>>>>>>>>>>>> System.out.println("caught " + e); > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> System.out.println(cnt + " activations were on > stack, > >>> sum > >>>>> = " > >>>>>>> + > >>>>>>>>>>>> sum); > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> > >>>>>>>>>>>>> From adinn at redhat.com Mon May 18 09:37:39 2020 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 18 May 2020 10:37:39 +0100 Subject: RFR(XXS):8244170: correct instruction typo for dcps1/2/3 In-Reply-To: <3F8C4202-6810-4CC6-BB77-656A6D71E9D3@amazon.com> References: <3F8C4202-6810-4CC6-BB77-656A6D71E9D3@amazon.com> Message-ID: On 30/04/2020 09:11, Liu, Xin wrote: > Please review the typo correction change for aarch64. > The change is trivial. It just makes the instruction name dcps same as armv8 manual. > > JBS: https://cr.openjdk.java.net/~xliu/8244170/webrev/ > webrev: https://bugs.openjdk.java.net/browse/JDK-8244170 > > I ran hotspot-tier1 and no regression found for fastdebug build on aarch64. Reviewed! regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From tobias.hartmann at oracle.com Mon May 18 12:17:28 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 18 May 2020 14:17:28 +0200 Subject: RFR(S): 8245011: Add JFR event for cold methods flushed In-Reply-To: <99BAD5E5-8D8B-4B36-AC65-495042CADC13@oracle.com> References: <99BAD5E5-8D8B-4B36-AC65-495042CADC13@oracle.com> Message-ID: <7ba7729c-ca70-7412-d0cc-3bac5a37f1b2@oracle.com> +1 Best regards, Tobias On 14.05.20 22:11, Erik Gahlin wrote: > Looks good. > > Erik > >> On 14 May 2020, at 22:06, Nils Eliasson wrote: >> >> Hi, >> >> Please review this small patch that adds a JFR event for cold methods that are flushed and adds the total number of cold methods flushed to the sweeper statistics event. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8245011 >> Webrev: http://cr.openjdk.java.net/~neliasso/8245011/ >> >> Please review, >> >> Nils Eliasson > From tobias.hartmann at oracle.com Mon May 18 12:20:41 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 18 May 2020 14:20:41 +0200 Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse In-Reply-To: References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com> Message-ID: <99f4fd44-6902-84e9-afcc-858c07c87596@oracle.com> Looks good to me. Best regards, Tobias On 17.05.20 23:34, Liu, Xin wrote: > Hi, Kim, > > Thank you to review my patch. I have removed the friend class BlockBegin. > Here is the new revision: http://cr.openjdk.java.net/~xliu/8245051/01/webrev/ > > About --with-extra-cflags, I completely agree. However, how to configure OpenJDK is not under control. There're so many linux distributions. > In addition, it might cause subtle bugs if the toolchain is not gcc. From my side, I'd like to get rid of undefine behaviors as many as we can. > > Yes, I still need a reviewer and a sponsor. 
> > Thanks, > --lx > > > ?On 5/16/20, 4:29 PM, "Kim Barrett" wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > On May 15, 2020, at 2:49 PM, Liu, Xin wrote: > > > > Hi, > > > > Please review the following patch. > > Jbs: https://bugs.openjdk.java.net/browse/JDK-8245051 > > Webrev: http://cr.openjdk.java.net/~xliu/8245051/00/webrev/ > > > > I found this problem from centos?s java-11-openjdk. https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327 > > '-std=gnu++98' is not a valid option for cc1. As a result, configure will fail to determine gcc supports -fno-lifetime-dse or not. > > C1 acts weird if it is compiled by GCC without that flag. > > > > After then, I built hotspot with -fsanitize=undefined. I found some interesting points for c1. With this patch, I can build the whole openjdk without -fno-lifetime-dse. > > I've tested hotspot:tier1 and "gtest:all". Even though everything looks fine, I don't think we reach a point to lift "-fno-lifetime-dse". This patch just attempts to fix C1. > > > > Thanks, > > --lx > > Thanks for finding these. > > The change to c1_ValueMap.cpp looks good to me. > > The changes to c1_Instruction.hpp also look good to me. In addition, > Instruction no longer needs to befriend BlockBegin. Please make that > change before pushing. > > However, I have to question the configure options being used. Some > (many? all?) of the options being passed in as --with-extra-cflags and > friends are already used properly by the build system, and don't need > to be added that way, and may even cause problems. That's what's > happening with -std=gnu++98, which is not a valid option for compling > C code, but the build system will use it when compiling C++ code. > That's been true since JDK 9 (JDK-8156980). > > We definitely cannot remove -fno-lifetime-dse. 
There are a number of > other places where we're doing weird things in allocators/deallocators > that are similarly bogus. The ResourceObj allocation_type stuff is > just one example; I'm pretty sure there are others. > > I'm running these changes (including the friendship removal) through > more extensive testing. I'll report back when done, but I don't expect > that to find anything. > > It looks like you will need a sponsor? Hopefully a compiler person > will also review and sponsor. > > From martin.doerr at sap.com Mon May 18 13:57:49 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 18 May 2020 13:57:49 +0000 Subject: RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) bug In-Reply-To: <9aa3c97f-ec3f-f45d-e601-2f484494ec31@redhat.com> References: <9aa3c97f-ec3f-f45d-e601-2f484494ec31@redhat.com> Message-ID: Hi Aleksey, thanks a lot for verifying my patch. I'll send a new RFR email with updated synopsis. Please reply to that one if you have further comments. > Looks fine to me. It is a bit odd to me to see that "normal" loads are matched > with isync, but it > seems to fit the rest of ppc64.ad that has two versions of loads, > unordered/followed_by_acquire > explicitly excepted. Well, these loads are not "normal" loads. They are "load.acquire". C2 uses 2 different things to express acquire semantics: - The LoadNodes have an attribute _mo which is set to "acquire" instead of "unordered". - A MemBarAcquireNode is attached to the LoadNode (via precedence edge). Platform implementors can choose whether to use one or the other. PPC64 uses empty implementation for MemBarAcquireNode, so the acquire barrier needs to be handled by the LoadNode. Load with acquire semantics is typically faster than Load + independent acquire barrier on platforms which have instructions (or tricky patterns) to support that. Best regards, Martin > -----Original Message----- > From: Aleksey Shipilev > Sent: Freitag, 15. 
Mai 2020 18:42 > To: Doerr, Martin ; 'hotspot-compiler- > dev at openjdk.java.net' ; > Michihiro Horie (HORIE at jp.ibm.com) ; > joserz at linux.ibm.com; Lindenmaier, Goetz > Subject: Re: RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering > tests due to C2 (?) bug > > On 5/15/20 6:21 PM, Doerr, Martin wrote: > > Aleksey has reported the issue: > > https://bugs.openjdk.java.net/browse/JDK-8245047 > > Well, I think it is a good idea to change the synopsis. I speculated in the > provisional synopsis, > and thought "(?)" would prompt the edit :) Looks to me, it is "[PPC64] C2: > ReverseBytes(U)S/Load(U)S > always match to unordered loads". > > > Here's my proposed fix: > > > http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquir > e/webrev.00/ > > Looks fine to me. It is a bit odd to me to see that "normal" loads are matched > with isync, but it > seems to fit the rest of ppc64.ad that has two versions of loads, > unordered/followed_by_acquire > explicitly excepted. > > > I'd appreciate retesting, too, if possible. > > I'll run a few jcstress tests here. > > -- > Thanks, > -Aleksey From martin.doerr at sap.com Mon May 18 14:03:24 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 18 May 2020 14:03:24 +0000 Subject: RFR(S): 8245047: [PPC64] C2: ReverseBytes + Load always match to unordered Load (acquire semantics missing) Message-ID: Hi, this issue was previously discussed with the preliminary synopsis "RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) bug": http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038264.html Bug id and webrev are still the same: https://bugs.openjdk.java.net/browse/JDK-8245047 http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquire/webrev.00/ Please send reviews replying to this RFR email. 
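
As a rough illustration of the source-level pattern behind this synopsis, the following Java sketch performs an acquiring load of a short and then byte-swaps the result. The class ReversedAcquireLoad and its members are hypothetical and are not taken from the webrev or from the jcstress tests; whether C2 actually matches the load together with Short.reverseBytes as one instruction depends on the platform and on the compilation, so this only shows the kind of code that may be affected.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class, not part of the JDK or of the webrev.
public class ReversedAcquireLoad {
    // Plain field; all ordering comes from the VarHandle access modes below.
    short data;

    private static final VarHandle DATA;
    static {
        try {
            DATA = MethodHandles.lookup()
                    .findVarHandle(ReversedAcquireLoad.class, "data", short.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Store with release semantics.
    void publish(short v) {
        DATA.setRelease(this, v);
    }

    // Load with acquire semantics, then reverse the two bytes.
    // The acquire ordering has to survive even if the JIT fuses
    // the load and the byte swap into a single instruction.
    short readSwapped() {
        return Short.reverseBytes((short) DATA.getAcquire(this));
    }

    public static void main(String[] args) {
        ReversedAcquireLoad r = new ReversedAcquireLoad();
        r.publish((short) 0x1234);
        System.out.println(Integer.toHexString(r.readSwapped() & 0xFFFF));
    }
}
```

A single-threaded run like the one in main only checks the byte swap; observing a missing acquire barrier needs a concurrent harness such as jcstress.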
Best regards, Martin From shade at redhat.com Mon May 18 14:04:27 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 18 May 2020 16:04:27 +0200 Subject: RFR(S): 8245047: [PPC64] C2: ReverseBytes + Load always match to unordered Load (acquire semantics missing) In-Reply-To: References: Message-ID: <6b41c2f2-7dbb-0e45-5a55-6352f91b5c2f@redhat.com> On 5/18/20 4:03 PM, Doerr, Martin wrote: > http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquire/webrev.00/ Still looks good. -- Thanks, -Aleksey From martin.doerr at sap.com Mon May 18 14:31:24 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 18 May 2020 14:31:24 +0000 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <55D731CD-5C89-47E5-B61A-83304E5123ED@amazon.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> <55D731CD-5C89-47E5-B61A-83304E5123ED@amazon.com> Message-ID: Hi lx, > It's my fault. I should associate with the bug id in the subject. I will double > check the email subject before sending. Thanks and no problem. > Now this thread becomes an RFR for JDK-8244949. So, you just try it out in > ppc? JDK-8244949 is just for PPC64. I think it's large enough to keep it separate. I've left out the addition of the HaltNode message which is subject of your change which is still wanted. I don't know if Andrew wants to integrate his aarch64 part into your change or push it separately. It would be nice to do it also for s390, but I have limited time for this platform. We could live without my enhancement or do it later. > I am not reviewer. just my 2 cents. Your review is appreciated. > 1. nativeInst_ppc.hpp. > I think the return type of get_stop_type() should be 'int' here. 
> > bool get_stop_type() { > return MacroAssembler::tdi_get_si16(long_at(0), > Assembler::traptoUnconditional, 0); > } > > 2. nativeInst_ppc.cpp > NativeInstruction::is_sigill_zombie_not_entrant_at() comment still says "iff > !UseSIGTRAP". Thanks a lot for reviewing my preliminary version. I'll update the webrev in place before sending a RFR email. > 3. TrapBasedNotEntrantChecks is a product option. Is it too abrupt to drop it? > This wiki defines the lifecycle of a Jvm option, but I believe this one is special. > It's ppc only. The point is "It's ppc only". We don't apply the strict rules as the PPC flags are not properly categorized (product, diagnostic, develop, ...) and we don't expect anyone to set this flag in production. Best regards, Martin > -----Original Message----- > From: Liu, Xin > Sent: Samstag, 16. Mai 2020 02:25 > To: Doerr, Martin ; Andrew Haley > ; Derek White ; Ningsheng Jian > ; hotspot-compiler-dev at openjdk.java.net > Cc: aarch64-port-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information > when hitting a HaltNode for architectures other than x86 > > Hi, Martin, > It's my fault. I should associate with the bug id in the subject. I will double > check the email subject before sending. > > Now this thread becomes an RFR for JDK-8244949. So, you just try it out in > ppc? > If it works well, we will apply this approach to both 390 and aarch64? > > I am not reviewer. just my 2 cents. > 1. nativeInst_ppc.hpp. > I think the return type of get_stop_type() should be 'int' here. > > bool get_stop_type() { > return MacroAssembler::tdi_get_si16(long_at(0), > Assembler::traptoUnconditional, 0); > } > > 2. nativeInst_ppc.cpp > NativeInstruction::is_sigill_zombie_not_entrant_at() comment still says "iff > !UseSIGTRAP". > > 3. TrapBasedNotEntrantChecks is a product option. Is it too abrupt to drop it? > This wiki defines the lifecycle of a Jvm option, but I believe this one is special. > It's ppc only. 
> https://wiki.openjdk.java.net/display/HotSpot/Hotspot+Command- > line+Flags%3A+Kinds%2C+Lifecycle+and+the+CSR+Process > > Thanks, > --lx > > > > ?On 5/15/20, 3:38 AM, "Doerr, Martin" wrote: > > CAUTION: This email originated from outside of the organization. Do not > click links or open attachments unless you can confirm the sender and know > the content is safe. > > > > Exactly, we get stop type + stop message + registers + instructions > (unfortunately not disassembled for some reason) + nice stack trace. > > Best regards, > Martin > > > > -----Original Message----- > > From: Andrew Haley > > Sent: Freitag, 15. Mai 2020 10:57 > > To: Doerr, Martin ; Derek White > > ; Ningsheng Jian ; > Liu, > > Xin ; hotspot-compiler-dev at openjdk.java.net > > Cc: aarch64-port-dev at openjdk.java.net > > Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information > > when hitting a HaltNode for architectures other than x86 > > > > On 5/14/20 8:24 PM, Doerr, Martin wrote: > > > Just for you upfront in case you would like to take a look: > > > https://bugs.openjdk.java.net/browse/JDK-8244949 > > > > > > http://cr.openjdk.java.net/~mdoerr/8244949_ppc64_asm_stop/webrev.00/ > > > > That's nice: it's a lot more full-featured than what I did. and AFAICS > > you'll get the registers printed in the VM error log. > > > > -- > > Andrew Haley (he/him) > > Java Platform Lead Engineer > > Red Hat UK Ltd. 
> > https://keybase.io/andrewhaley
> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>

From patric.hedlin at oracle.com Mon May 18 15:50:59 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 18 May 2020 17:50:59 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
Message-ID: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>

Dear all,

I would like to ask for help to review the following change/update:

Issue: https://bugs.openjdk.java.net/browse/JDK-8229495
Webrev: http://cr.openjdk.java.net/~phedlin/tr8229495/

8229495: SIGILL in C2 generated OSR compilation

The approach of inserting range-check guards (see JDK-8193130, JDK-8216135, JDK-8240335) between the pre- and the main-loop is somewhat problematic. The immediate problem here is due to an inherent dependency between the additional (template) range-check guards introduced (during RCE) and the state of the loop, such as the level of loop-unrolling. To keep the range-check guards valid through the compilation, these are re-generated when/if the main-loop is unrolled further. Here, the error is introduced when a guard is generated with an illegal offset, which erroneously cuts the path to the main-loop (resulting in a 'Halt'). The reason for range-checks to be present in the main-loop to begin with is a failing dominator search (this was also corrected in JDK-8231412, for JDK14).

Trying to solve this, we could encode part of the loop state (e.g. the stride) into the templates when first introduced, in order to track the progress correctly. However, this still leaves us with a starting-point problem (here, we cannot tell which range-checks are actually valid starting-points). Also, the set of range-check guards generated from these new templates has a similar problem to the original loop-guard in that the guards will grow in complexity, possibly to a point where they may no longer be reduced at compile time.

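To make the pre-/main-loop vocabulary above concrete, here is a hand-written Java analogy of splitting a loop so that the range check disappears from the main part. It is only a sketch under stated assumptions: it is not the C2 implementation and not code from the webrev, all names (LoopSplitSketch, checkedSum, splitSum, mainStart, mainStop) are hypothetical, and an out-of-range access is modelled as a skipped element instead of an exception.

```java
// Hypothetical illustration of a pre-/main-/post-loop split; not C2 code.
public class LoopSplitSketch {

    // Reference version: every iteration performs the range check.
    static long checkedSum(int[] a, int start, int stop, int offset) {
        long sum = 0;
        for (int i = start; i < stop; i++) {
            int idx = i + offset;
            if (idx >= 0 && idx < a.length) {
                sum += a[idx];
            }
        }
        return sum;
    }

    // Split version: only the pre- and post-loop keep the range check.
    static long splitSum(int[] a, int start, int stop, int offset) {
        // First iteration whose index is >= 0, clamped to the iteration space.
        int mainStart = Math.min(stop, Math.max(start, -offset));
        // First iteration whose index is >= a.length, clamped likewise.
        int mainStop = Math.max(mainStart, Math.min(stop, a.length - offset));

        long sum = 0;
        int i = start;

        // Pre-loop: checked iterations before the safe region.
        for (; i < mainStart; i++) {
            int idx = i + offset;
            if (idx >= 0 && idx < a.length) {
                sum += a[idx];
            }
        }

        // Guard between pre- and main-loop: enter the main-loop only if the
        // loop variable is still inside the (safe) iteration space.
        if (i < mainStop) {
            // Main-loop: no range check needed, 0 <= i + offset < a.length.
            for (; i < mainStop; i++) {
                sum += a[i + offset];
            }
        }

        // Post-loop: checked iterations after the safe region.
        for (; i < stop; i++) {
            int idx = i + offset;
            if (idx >= 0 && idx < a.length) {
                sum += a[idx];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(checkedSum(a, -2, 12, 1) == splitSum(a, -2, 12, 1));
    }
}
```

In this analogy the guard stays valid however the main-loop body is later rewritten, because it depends only on the loop variable and the bounds computed up front.
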
The approach taken in this patch is to replace the range-check asserts with a more conservative check that the loop variable stays inside the iteration space when passing between the pre- and the main-loop. The assert is introduced once (during RCE) and need not be updated to stay valid (it is supposed to be a proper tautology). The focus is on correctness, also in the degenerate case when the pre-loop depletes the iteration space. In particular, it will not determine the main-loop inaccessible in the cases when the main-loop would make too many iterations in its first trip (this could be supported by adjusting the upper bound of the conservative test to "ub - final-stride", after loop-optimisations are done).

Testing: hs-precheckin-comp, hs-tier1-6,8

Also including JDK 11, 12 and 14; however, the reproducing testcase has been run (along with regression testing) using a set of tweaks (debug flags) for JDK 14 and 15 to "open the path" to the kernel of the poodle (Des Pudels Kern). These tweaks are not included in the patch. (The reproducer works on JDK 11 and 12 without tweaks.)

Best regards,
Patric

From vladimir.kozlov at oracle.com Mon May 18 17:57:28 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 18 May 2020 10:57:28 -0700
Subject: RFR: 8244819: hsdis does not compile with binutils 2.34+
In-Reply-To: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com>
References: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com>
Message-ID: <37f96427-77f4-9b35-3b27-703f89c8af7e@oracle.com>

Looks good.

Thanks,
Vladimir

On 5/12/20 6:12 AM, Yasumasa Suenaga wrote:
> Hi all,
>
> Please review this change:
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8244819
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8244819/webrev.00/
>
> binutils 2.34 introduces new section flag: SEC_ELF_OCTETS, and it affects arguments of bfd_octets_per_byte() [1].
> So we can see new compiler error as below:
>
> ```
> hsdis.c:571:28: error: too few arguments to function 'bfd_octets_per_byte'
>   571 |   dinfo->octets_per_byte = bfd_octets_per_byte (abfd);
>       |                            ^~~~~~~~~~~~~~~~~~~
> In file included from hsdis.c:58:
> build/linux-amd64/bfd/bfd.h:1999:14: note: declared here
>  1999 | unsigned int bfd_octets_per_byte (const bfd *abfd,
>       |              ^~~~~~~~~~~~~~~~~~~
> ```
>
>
> Thanks,
>
> Yasumasa
>
>
> [1] https://sourceware.org/git/?p=binutils-gdb.git;h=618265039f697eab9e72bb58b95fc2d32925df58

From patric.hedlin at oracle.com Mon May 18 20:37:28 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 18 May 2020 22:37:28 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
Message-ID: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>

Dear all,

I would like to ask for help to review the following change/update:

Issue:  https://bugs.openjdk.java.net/browse/JDK-8245021
Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/

8245021: Add method 'remove_if_existing' to growableArray.

Minor improvement to simplify the code pattern "if contains then remove" found in a few places (in "compile.hpp").

Testing: hs-tier1-3

Best regards,
Patric

From hohensee at amazon.com Mon May 18 20:45:26 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 18 May 2020 20:45:26 +0000
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
Message-ID:

I'll push it for Xin once Kim comes back with an approval.

Thanks,
Paul

On 5/18/20, 5:22 AM, "hotspot-compiler-dev on behalf of Tobias Hartmann" wrote:

    Looks good to me.

    Best regards,
    Tobias

    On 17.05.20 23:34, Liu, Xin wrote:
    > Hi, Kim,
    >
    > Thank you to review my patch. I have removed the friend class BlockBegin.
    > Here is the new revision: http://cr.openjdk.java.net/~xliu/8245051/01/webrev/
    >
    > About --with-extra-cflags, I completely agree. However, how to configure OpenJDK is not under control.
There are so many Linux distributions.
> In addition, it might cause subtle bugs if the toolchain is not gcc. From my side, I'd like to get rid of as many undefined behaviors as we can.
>
> Yes, I still need a reviewer and a sponsor.
>
> Thanks,
> --lx
>
>
> On 5/16/20, 4:29 PM, "Kim Barrett" wrote:
>
> > On May 15, 2020, at 2:49 PM, Liu, Xin wrote:
> >
> > Hi,
> >
> > Please review the following patch.
> > Jbs: https://bugs.openjdk.java.net/browse/JDK-8245051
> > Webrev: http://cr.openjdk.java.net/~xliu/8245051/00/webrev/
> >
> > I found this problem from centos's java-11-openjdk. https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327
> > '-std=gnu++98' is not a valid option for cc1. As a result, configure will fail to determine whether gcc supports -fno-lifetime-dse or not.
> > C1 acts weird if it is compiled by GCC without that flag.
> >
> > After that, I built hotspot with -fsanitize=undefined. I found some interesting points for c1. With this patch, I can build the whole openjdk without -fno-lifetime-dse.
> > I've tested hotspot:tier1 and "gtest:all". Even though everything looks fine, I don't think we have reached the point where we can lift "-fno-lifetime-dse". This patch just attempts to fix C1.
> >
> > Thanks,
> > --lx
>
> Thanks for finding these.
>
> The change to c1_ValueMap.cpp looks good to me.
>
> The changes to c1_Instruction.hpp also look good to me. In addition,
> Instruction no longer needs to befriend BlockBegin. Please make that
> change before pushing.
>
> However, I have to question the configure options being used. Some
> (many? all?) of the options being passed in as --with-extra-cflags and
> friends are already used properly by the build system, and don't need
> to be added that way, and may even cause problems. That's what's
> happening with -std=gnu++98, which is not a valid option for compiling
> C code, but the build system will use it when compiling C++ code.
> That's been true since JDK 9 (JDK-8156980).
>
> We definitely cannot remove -fno-lifetime-dse. There are a number of
> other places where we're doing weird things in allocators/deallocators
> that are similarly bogus. The ResourceObj allocation_type stuff is
> just one example; I'm pretty sure there are others.
>
> I'm running these changes (including the friendship removal) through
> more extensive testing. I'll report back when done, but I don't expect
> that to find anything.
>
> It looks like you will need a sponsor? Hopefully a compiler person
> will also review and sponsor.
>
>

From xxinliu at amazon.com Mon May 18 21:43:21 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Mon, 18 May 2020 21:43:21 +0000
Subject: RFR(XXS):8244170: correct instruction typo for dcps1/2/3
In-Reply-To:
References: <3F8C4202-6810-4CC6-BB77-656A6D71E9D3@amazon.com>
Message-ID: <47B6CB94-62D9-4E66-A9C1-0055A43FFA24@amazon.com>

Hi, Andrew and Rahul,

Thank you for reviewing it. Here is the new revision, which applies to TIP.
It's almost the same; I just resolved a merge conflict with JDK-8022574.
http://cr.openjdk.java.net/~xliu/8244170/01/webrev/

I ran "hotspot:tier1" on aarch64.

thanks,
--lx

On 5/18/20, 2:38 AM, "Andrew Dinn" wrote:

On 30/04/2020 09:11, Liu, Xin wrote:
> Please review the typo correction change for aarch64.
> The change is trivial. It just makes the instruction name dcps the same as in the armv8 manual.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8244170
> webrev: https://cr.openjdk.java.net/~xliu/8244170/webrev/
>
> I ran hotspot-tier1 and no regression found for fastdebug build on aarch64.

Reviewed!
regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

From suenaga at oss.nttdata.com Mon May 18 23:46:15 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 19 May 2020 08:46:15 +0900
Subject: RFR: 8244819: hsdis does not compile with binutils 2.34+
In-Reply-To: <37f96427-77f4-9b35-3b27-703f89c8af7e@oracle.com>
References: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com> <37f96427-77f4-9b35-3b27-703f89c8af7e@oracle.com>
Message-ID: <8110da95-110f-49bb-0e0b-a63c74392bc2@oss.nttdata.com>

Thanks Vladimir!

Yasumasa

On 2020/05/19 2:57, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir

From kim.barrett at oracle.com Tue May 19 00:54:49 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 18 May 2020 20:54:49 -0400
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To:
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID:

> On May 17, 2020, at 5:34 PM, Liu, Xin wrote:
>
> Hi, Kim,
>
> Thank you for reviewing my patch. I have removed the friend class BlockBegin.
> Here is the new revision: http://cr.openjdk.java.net/~xliu/8245051/01/webrev/

Looks good.

> About --with-extra-cflags, I completely agree. However, how to configure OpenJDK is not under control. There are so many Linux distributions.

I'm not sure what you mean by that. The given configure options simply
aren't valid. And if some Linux distribution is patching the OpenJDK to
allow such a configuration, that's their lookout.

> In addition, it might cause subtle bugs if the toolchain is not gcc. From my side, I'd like to get rid of as many undefined behaviors as we can.

I agree that it's desirable to eliminate unnecessary or unintentional UB.

> Yes, I still need a reviewer and a sponsor.

Looks like Paul Hohensee has volunteered.

From xxinliu at amazon.com Tue May 19 01:44:27 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 19 May 2020 01:44:27 +0000
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To:
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID:

On 5/18/20, 5:56 PM, "Kim Barrett" wrote:

> On May 17, 2020, at 5:34 PM, Liu, Xin wrote:
>
> Hi, Kim,
>
> Thank you for reviewing my patch.
I have removed the friend class BlockBegin.
> Here is the new revision: http://cr.openjdk.java.net/~xliu/8245051/01/webrev/

Looks good.

> About --with-extra-cflags, I completely agree. However, how to configure OpenJDK is not under control. There are so many Linux distributions.

I'm not sure what you mean by that. The given configure options simply
aren't valid. And if some Linux distribution is patching the OpenJDK to
allow such a configuration, that's their lookout.

-- Recently, I found that different Linux distros have their own convenient ways to configure OpenJDK. E.g.
centos7: https://git.centos.org/rpms/java-11-openjdk/blob/c7/f/SPECS/java-11-openjdk.spec#_1327
ubuntu: https://git.launchpad.net/~openjdk/ubuntu/+source/openjdk/+git/openjdk/tree/debian/rules?h=openjdk-11#n223
Yes, they should take care of cflags. I will file a bug in the centos Bugzilla.

> In addition, it might cause subtle bugs if the toolchain is not gcc. From my side, I'd like to get rid of as many undefined behaviors as we can.

I agree that it's desirable to eliminate unnecessary or unintentional UB.

> Yes, I still need a reviewer and a sponsor.

Looks like Paul Hohensee has volunteered.

Thank you!
--lx

From tobias.hartmann at oracle.com Tue May 19 09:24:50 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 19 May 2020 11:24:50 +0200
Subject: RFR: 8244819: hsdis does not compile with binutils 2.34+
In-Reply-To: <8110da95-110f-49bb-0e0b-a63c74392bc2@oss.nttdata.com>
References: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com> <37f96427-77f4-9b35-3b27-703f89c8af7e@oracle.com> <8110da95-110f-49bb-0e0b-a63c74392bc2@oss.nttdata.com>
Message-ID:

+1

Best regards,
Tobias

On 19.05.20 01:46, Yasumasa Suenaga wrote:
> Thanks Vladimir!
>
> Yasumasa

From tobias.hartmann at oracle.com Tue May 19 09:33:49 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 19 May 2020 11:33:49 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
Message-ID: <243790ff-6640-8f48-b345-b195efc46ede@oracle.com>

Hi Patric,

Looks good to me but please add brackets around the for loop.

Also, there are some more cases of this code pattern. For example, JvmtiPendingMonitors::destroy/exit and
ShenandoahBarrierSetC2State::remove_enqueue_barrier/remove_load_reference_barrier.

Best regards,
Tobias

On 18.05.20 22:37, Patric Hedlin wrote:
> Dear all,
>
> I would like to ask for help to review the following change/update:
>
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8245021
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/
>
>
> 8245021: Add method 'remove_if_existing' to growableArray.
>
> Minor improvement to simplify the code pattern "if contains then remove" found in a few places (in
> "compile.hpp").
>
>
> Testing: hs-tier1-3
>
>
> Best regards,
> Patric
>

From adinn at redhat.com Tue May 19 09:42:06 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 19 May 2020 10:42:06 +0100
Subject: RFR(XXS):8244170: correct instruction typo for dcps1/2/3
In-Reply-To: <47B6CB94-62D9-4E66-A9C1-0055A43FFA24@amazon.com>
References: <3F8C4202-6810-4CC6-BB77-656A6D71E9D3@amazon.com> <47B6CB94-62D9-4E66-A9C1-0055A43FFA24@amazon.com>
Message-ID:

On 18/05/2020 22:43, Liu, Xin wrote:
> Hi, Andrew and Rahul,
>
> Thank you for reviewing it. Here is the new revision, which applies to TIP.
> It's almost the same; I just resolved a merge conflict with JDK-8022574.
> http://cr.openjdk.java.net/~xliu/8244170/01/webrev/
>
> I ran "hotspot:tier1" on aarch64.

Still reviewed!

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

From Yang.Zhang at arm.com Tue May 19 10:55:09 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Tue, 19 May 2020 10:55:09 +0000
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes
Message-ID:

Hi,

Following up on review requests of API [0], Java implementation and test [1], general HotSpot changes [2] for Vector API and x86 backend changes [3]. Here's a request for review of AArch64 backend changes required for supporting the Vector API:

JEP: https://openjdk.java.net/jeps/338
JBS: https://bugs.openjdk.java.net/browse/JDK-8223347
Webrev: http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.01/

Complete implementation resides in vector-unstable branch of panama/dev repository [4].

Looking forward to your feedback.
Best Regards,
Yang

[0] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html
[1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/065587.html
[2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037798.html
[3] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037801.html
[4] https://openjdk.java.net/projects/panama/
    $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable

From suenaga at oss.nttdata.com Tue May 19 11:54:02 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 19 May 2020 20:54:02 +0900
Subject: RFR: 8244819: hsdis does not compile with binutils 2.34+
In-Reply-To:
References: <3399c27a-3f55-9f30-1090-5fe6aea479c2@oss.nttdata.com> <37f96427-77f4-9b35-3b27-703f89c8af7e@oracle.com> <8110da95-110f-49bb-0e0b-a63c74392bc2@oss.nttdata.com>
Message-ID:

Thanks Tobias!

Yasumasa

On 2020/05/19 18:24, Tobias Hartmann wrote:
> +1
>
> Best regards,
> Tobias

From lutz.schmidt at sap.com Tue May 19 13:06:51 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Tue, 19 May 2020 13:06:51 +0000
Subject: [CAUTION] RFR(S): 8245047: [PPC64] C2: ReverseBytes + Load always match to unordered Load (acquire semantics missing)
Message-ID: <9C97BE97-62BD-4731-9A92-6F018A7808D6@sap.com>

Hi Martin,

your change looks good to me. Thanks for digging into this and fixing the issue.

Best,
Lutz

On 18.05.20, 16:03, "hotspot-compiler-dev on behalf of Doerr, Martin" wrote:

Hi,

this issue was previously discussed with the preliminary synopsis
"RFR(S): 8245047: PPC64 fails jcstress Unsafe (?) memory ordering tests due to C2 (?) bug":
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-May/038264.html

Bug id and webrev are still the same:
https://bugs.openjdk.java.net/browse/JDK-8245047
http://cr.openjdk.java.net/~mdoerr/8245047_ppc64_load_reversed_acquire/webrev.00/

Please send reviews replying to this RFR email.

Best regards,
Martin

From martin.doerr at sap.com Tue May 19 13:16:27 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 19 May 2020 13:16:27 +0000
Subject: RFR(S): 8245047: [PPC64] C2: ReverseBytes + Load always match to unordered Load (acquire semantics missing)
In-Reply-To: <6b41c2f2-7dbb-0e45-5a55-6352f91b5c2f@redhat.com>
References: <6b41c2f2-7dbb-0e45-5a55-6352f91b5c2f@redhat.com>
Message-ID:

Hi Aleksey and Lutz,

thanks for the reviews. Pushed to jdk/jdk. I'll request backports, too.
Best regards,
Martin

From rwestrel at redhat.com Tue May 19 14:57:21 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 19 May 2020 16:57:21 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <87zhab3n77.fsf@redhat.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com>
Message-ID: <874ksc2cry.fsf@redhat.com>

Hi John,

Can you confirm this:

> http://cr.openjdk.java.net/~roland/8244504/webrev.01/

looks ok to you?

Thanks,
Roland.

From hohensee at amazon.com Tue May 19 17:39:51 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 19 May 2020 17:39:51 +0000
Subject: [aarch64-port-dev ] RFR(XXS):8244170: correct instruction typo for dcps1/2/3
Message-ID:

Lgtm too. Pushed.

Paul

On 5/19/20, 2:43 AM, "aarch64-port-dev on behalf of Andrew Dinn" wrote:

On 18/05/2020 22:43, Liu, Xin wrote:
> Hi, Andrew and Rahul,
>
> Thank you for reviewing it. Here is the new revision, which applies to TIP.
> It's almost the same; I just resolved a merge conflict with JDK-8022574.
> http://cr.openjdk.java.net/~xliu/8244170/01/webrev/
>
> I ran "hotspot:tier1" on aarch64.

Still reviewed!

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

From xxinliu at amazon.com Tue May 19 19:03:01 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 19 May 2020 19:03:01 +0000
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
Message-ID: <1EEC80B7-9603-4B8C-A0D4-97D3DE51EBDB@amazon.com>

Hi, Patric,

I don't object to your change. I feel that the API 'remove' of GrowableArray is not good. Even though its complexity is still linear, you scan all elements and write some of them. The problem is that it has to retain order. Actually, I didn't run into any problem when I replaced the removed element with the last one. It suggests that probably nobody in hotspot makes use of a sorted GrowableArray.

I found another interesting point. There's an API delete_at which ignores order, so I tried replacing your remove_if_existing with it.

  bool delete_if_existing(const E& elem) {
    int index = find(elem);
    if (index != -1) {
      _data[index] = _data[--_len];
      return true;
    }
    return false;
  }

I didn't have any regression in jtreg:hotspot:tier1. Actually, CodeCache::unregister_old_nmethod uses the same trick.

Here is the current implementation of delete_at(). It checks if the index is the last element, and skips copying if so. I am not sure if an extra comparison is worth it here. Users should use pop() instead in that scenario.

  // The order is changed.
  void delete_at(int index) {
    assert(0 <= index && index < _len, "illegal index");
    if (index < --_len) {
      // Replace removed element with last one.
      _data[index] = _data[_len];
    }
  }

Thanks,
--lx

On 5/18/20, 1:38 PM, "hotspot-compiler-dev on behalf of Patric Hedlin" wrote:

Dear all,

I would like to ask for help to review the following change/update:

Issue: https://bugs.openjdk.java.net/browse/JDK-8245021
Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/

8245021: Add method 'remove_if_existing' to growableArray.

Minor improvement to simplify the code pattern "if contains then remove" found in a few places (in "compile.hpp").
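
To make the trade-off discussed in this thread concrete, here is a minimal sketch of the two removal strategies. It assumes a hypothetical TinyArray wrapper around std::vector; it is not HotSpot's GrowableArray, and the method bodies are illustrations of the idea rather than the code in the webrev. remove_if_existing keeps the remaining elements in order, while delete_if_existing fills the hole with the last element and therefore changes the order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a growable array; NOT HotSpot's GrowableArray.
template <typename E>
class TinyArray {
  std::vector<E> _data;

 public:
  void append(const E& elem) { _data.push_back(elem); }
  int length() const { return static_cast<int>(_data.size()); }
  const E& at(int i) const { return _data[static_cast<std::size_t>(i)]; }

  // Returns the index of the first match, or -1 if the element is absent.
  int find(const E& elem) const {
    for (int i = 0; i < length(); i++) {
      if (at(i) == elem) {
        return i;
      }
    }
    return -1;
  }

  // Order-preserving removal: the tail is shifted down by one slot.
  bool remove_if_existing(const E& elem) {
    int index = find(elem);
    if (index == -1) {
      return false;
    }
    _data.erase(_data.begin() + index);
    return true;
  }

  // Order-changing removal: the hole is overwritten with the last element.
  // The search is still linear, but no tail is shifted.
  bool delete_if_existing(const E& elem) {
    int index = find(elem);
    if (index == -1) {
      return false;
    }
    _data[static_cast<std::size_t>(index)] = _data.back();
    _data.pop_back();
    return true;
  }
};
```

Both variants return false when the element is not present, so a caller can replace an "if contains then remove" pair with a single call. Which variant is appropriate depends on whether any caller relies on element order, which is the open question raised above.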
Testing: hs-tier1-3

Best regards,
Patric

From xxinliu at amazon.com Wed May 20 00:18:47 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 20 May 2020 00:18:47 +0000
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To:
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID:

Hi, Kim,

Thank you for reviewing it. Did you catch any regression? Can I push this change?
On the other side, I filed a bug to centos7 as a follow-up.
https://bugs.centos.org/view.php?id=17375

thanks,
--lx

From Pengfei.Li at arm.com Wed May 20 09:42:55 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Wed, 20 May 2020 09:42:55 +0000
Subject: RFR: 8245158: C2: Enable SLP for some manually unrolled loops
Message-ID:

Hi C2 Reviewers,

Can I have a review of this enhancement of C2 SLP?
JBS: https://bugs.openjdk.java.net/browse/JDK-8245158
Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/

The Java loop below with stride = 1 can be vectorized by C2.

  for (int i = start; i < limit; i++) {
    c[i] = a[i] + b[i];
  }

But if it's manually unrolled once, like in the code below, SLP would fail to vectorize it.

  for (int i = start; i < limit; i += 2) {
    c[i] = a[i] + b[i];
    c[i + 1] = a[i + 1] + b[i + 1];
  }

Notably, if the induction variable's initial value "start" is replaced by a compile-time constant, the vectorization works.

The root cause is that in the current C2 SuperWord implementation, find_adjacent_refs() calls find_align_to_ref() to select a "best align to" memory reference to create packs, and, in particular, the reference selected must be "pre-loop alignable". In other words, C2 must be able to adjust the pre-loop trip count such that the vectorized access of this reference is aligned. Hence, in find_align_to_ref(), unalignable memory references are discarded. [1] Then SLP pack creation is aborted if no memory reference is eligible to be the "best align to". [2]

In the current C2 SLP code, the selected "best align to" reference has two uses. One is to compute alignment info in order to find adjacent memory references for pack creation. The other is to facilitate the pre-loop trip count adjustment to align vector memory accesses in the main-loop. But on some platforms, aligning vector accesses is not a mandatory requirement (after Roland's JDK-8215483 [3], this is usually checked by "!Matcher::misaligned_vectors_ok() || AlignVector").
So the "best align to" memory reference doesn't have to be "pre-loop alignable" on all platforms. In this patch, we only discard unalignable references when that platform-dependent check returns true.

After this patch, some manually unrolled loops can be vectorized on platforms with no alignment requirement. As almost all modern x86 CPUs support unaligned vector move, I suspect this can benefit the majority of today's CPUs.

Please note that this patch doesn't try to enable SLP for all manually unrolled loops. If the above case is unrolled more times, vectorization may still not work. The reason is that the current SLP applies only to main-loops produced by the iteration split. When the loop is manually unrolled many times, the node count may exceed LoopUnrollLimit, resulting in no iteration split at all. Although this can be worked around by relaxing the unrolling policy via slp_max_unroll_factor, we don't do it this way since splitting a big loop may increase code size too much. Anyone who wants to vectorize a heavily manually unrolled loop can use -XX:LoopUnrollLimit= with a greater value.

[Tests]
Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1 are tested and no new failure is found.

Below are the results of the JMH test [4] from the above case.

Before:
Benchmark              Mode  Cnt       Score       Error  Units
TestUnrolledLoop.bar  thrpt   25   58097.290 ±   128.802  ops/s

After:
Benchmark              Mode  Cnt       Score       Error  Units
TestUnrolledLoop.bar  thrpt   25  260110.139 ± 10902.284  ops/s

[1] http://hg.openjdk.java.net/jdk/jdk/file/a0a21978f3b9/src/hotspot/share/opto/superword.cpp#l780
[2] http://hg.openjdk.java.net/jdk/jdk/file/a0a21978f3b9/src/hotspot/share/opto/superword.cpp#l587
[3] http://hg.openjdk.java.net/jdk/jdk/rev/da7dc9e92d91
[4] http://cr.openjdk.java.net/~pli/rfr/8245158/TestUnrolledLoop.java

--
Thanks,
Pengfei

From kim.barrett at oracle.com Wed May 20 17:37:17 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 20 May 2020 13:37:17 -0400
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To:
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID:

> On May 19, 2020, at 8:18 PM, Liu, Xin wrote:
>
> Hi, Kim,
>
> Thank you for reviewing it. Did you catch any regression? Can I push this change?

I gave my approval with my message from 5/18. Sorry if that wasn't clear.

No problems found by additional testing.

From john.r.rose at oracle.com Wed May 20 22:24:05 2020
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 20 May 2020 15:24:05 -0700
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <874ksc2cry.fsf@redhat.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <874ksc2cry.fsf@redhat.com>
Message-ID: <86FF0784-87B5-45A9-95AC-E27C0EBD39F1@oracle.com>

On May 19, 2020, at 7:57 AM, Roland Westrelin wrote:
>
> Can you confirm this:
>
>> http://cr.openjdk.java.net/~roland/8244504/webrev.01/
>
> looks ok to you?

(Sorry to leave you hanging... I will have a few more comments shortly.)
From shravya.rukmannagari at intel.com Wed May 20 23:01:34 2020
From: shravya.rukmannagari at intel.com (Rukmannagari, Shravya)
Date: Wed, 20 May 2020 23:01:34 +0000
Subject: [15] RFR(M): 8245512: CRC32 optimization using AVX512 instructions
Message-ID:

Hi All,

We would like to contribute optimizations for the CRC32 algorithm for upcoming Intel x86_64 platforms.

Contributors:
Shravya Rukmannagari (shravya.rukmannagari at intel.com)
Greg B Tucker (greg.b.tucker at intel.com)

I have tested the patch to confirm correctness and performance. The patch also passes compiler/jtreg tests.
Please take a look and let me know if you have any questions or comments.

Bug Id: https://bugs.openjdk.java.net/browse/JDK-8245512
https://cr.openjdk.java.net/~srukmannagar/CRC32/webrev.01/

Regards,
Shravya Rukmannagari

From xxinliu at amazon.com Thu May 21 05:26:19 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 21 May 2020 05:26:19 +0000
Subject: RFR(XS): 8245051: c1 is broken if it is compiled by gcc without -fno-lifetime-dse
In-Reply-To:
References: <5389667A-123E-44A9-B3B2-44450953CC7E@oracle.com>
Message-ID: <8D783CDB-A924-454C-B010-EC131A6A14D2@amazon.com>

Hi, Kim,

Got it. Thanks!
Pushed by Paul. Thank you too!

Thanks,
--lx

On 5/20/20, 10:38 AM, "Kim Barrett" wrote:

> On May 19, 2020, at 8:18 PM, Liu, Xin wrote:
>
> Hi, Kim,
>
> Thank you for reviewing it. Did you catch any regression? Can I push this change?

I gave my approval with my message from 5/18. Sorry if that wasn't clear.

No problems found by additional testing.
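
For readers following the JDK-8245051 thread, the sketch below illustrates the general hazard that -fno-lifetime-dse papers over, as far as I understand GCC's documented behavior: with lifetime DSE enabled, the compiler may treat an object's storage as dead before its constructor starts, so a store made from a class-specific operator new into the future object can be discarded. The class and member names here (IdSource, Node) are hypothetical and are not taken from c1_Instruction.hpp. The sketch shows the conservative pattern, in which the allocator only obtains storage and every field is written inside the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical id generator; stands in for whatever hands out unique ids.
class IdSource {
  int _next = 0;

 public:
  int get_next_id() { return _next++; }
};

// Hypothetical node class; this is not the C1 Instruction class.
class Node {
  int _id;

 public:
  // Safe pattern: the field is written inside the constructor, that is,
  // after the object's lifetime has begun.
  explicit Node(IdSource& ids) : _id(ids.get_next_id()) {}

  int id() const { return _id; }

  // The allocator only obtains storage. Writing a field of the future
  // object here (for example through a cast of p) would be a store to an
  // object whose lifetime has not started yet, and a compiler performing
  // lifetime-based dead store elimination may drop it.
  static void* operator new(std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr) {
      throw std::bad_alloc();
    }
    return p;
  }

  static void operator delete(void* p) noexcept { std::free(p); }
};
```

With this layout the ids observed through id() do not depend on whether the compiler is invoked with or without -fno-lifetime-dse, because nothing is stored into the object before its constructor runs.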
From Xiaohong.Gong at arm.com Thu May 21 10:24:44 2020
From: Xiaohong.Gong at arm.com (Xiaohong Gong)
Date: Thu, 21 May 2020 10:24:44 +0000
Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To:
References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com>
Message-ID:

Hi Andrew,

> On 5/14/20 11:39 AM, Andrew Dinn wrote:
> > On 14/05/2020 11:15, Andrew Haley wrote:
> >> On 5/14/20 11:07 AM, Xiaohong Gong wrote:
> >>> Hi Andrew,
> >>>
> >>> Thanks for having a look at it!
> >>>
> >>> > On 5/14/20 10:37 AM, Andrew Haley wrote:
> >>> > > On 5/14/20 9:48 AM, Andrew Dinn wrote:
> >>> > >> Just for reference, direct links to the webrev, issue and CSR are:
> >>> > >>
> >>> > >> https://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.00/
> >>> > >> https://bugs.openjdk.java.net/browse/JDK-8243339
> >>> > >> https://bugs.openjdk.java.net/browse/JDK-8243456
> >>> > >>
> >>> > >> The webrev looks fine to me. Nice work, thank you!
> >>> > >
> >>> > > There's a problem with C1: we generate unnecessary DMBs if
> >>> > > we're using TieredStopAtLevel=1 or if we only have the client
> >>> > > compiler. This is a performance regression, so I reject this
> >>> > > patch.
> >>> >
> >>> > There are similar regressions in the interpreter.
> >>>
> >>> Yes, I agree with you that regressions exist for the interpreter and
> >>> client compiler. So do you think it's better to add
> >>> conditions for adding DMBs for C1 and the interpreter? How about just
> >>> excluding scenarios like "interpreter only", "client compiler
> >>> only" and "TieredStopAtLevel=1"?
> >>
> >> Yes, I think so. Is there some way simply to ask the question "Are we
> >> using C2 or JVMCI compilers?" That's what we need to know.
> > This can be done using build time conditionality.
>
> Not entirely, because this is also switchable at runtime.

I've created a new patch to add the condition when inserting DMBs before volatile
loads for C1/Interpreter.
The updated webrev: http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/

It adds a new function "is_c1_or_interpreter_only()", which can decide whether
C2/JVMCI is used. Besides, since AOT also uses the Graal compiler for codegen, it
always returns false if AOT mode is enabled.

Tests:
Tested jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
and jcstress:tests-all, and all tests pass without new failures. Besides the tests
with the default configuration, different compile modes (client mode, c2 disabled mode)
with different VM option groups (-XX:CompilationMode="quick-only", -XX:-TieredCompilation,
-XX:TieredStopAtLevel=1) are also tested.

Could you please take a look at it? Thank you!

Thanks,
Xiaohong

From aph at redhat.com Thu May 21 14:06:07 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 May 2020 15:06:07 +0100
Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option
In-Reply-To:
References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com>
Message-ID: <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com>

On 5/21/20 11:24 AM, Xiaohong Gong wrote:
> I've created a new patch to add the condition when inserting DMBs before volatile
> loads for C1/Interpreter.
> The updated webrev: http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/
>
> It adds a new function "is_c1_or_interpreter_only()", which can decide whether
> C2/JVMCI is used. Besides, since AOT also uses the Graal compiler for codegen, it
> always returns false if AOT mode is enabled.

Looks good to me, thanks.

As far as I remember, Graal does optimize volatile accesses to use
ldar/stlr, or at least it will do so in the future, so if we're using
AOT or JVMCI the safe thing to do is add the DMBs.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From john.r.rose at oracle.com Thu May 21 20:09:33 2020 From: john.r.rose at oracle.com (John Rose) Date: Thu, 21 May 2020 13:09:33 -0700 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <87zhab3n77.fsf@redhat.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> Message-ID: <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> On May 13, 2020, at 7:48 AM, Roland Westrelin wrote: > ... > After spending some time trying to get this to work, I have to admit my > initial logic was very much wrong and it's a lot less straightforward to > get this to work in a way that I'm confident about than I thought. Thanks for pushing it through. I think this will be very useful for improving loops over jumbo data structures, such as those that could be accessed via the MemorySegment API (JEP 383). > >> Another way to deal with that, would be to pass the expected result type >> as argument to unsigned_min() rather than compute it. > > So I went that way and passed the type as an argument to the min/max > methods in case the caller knows something about the result: > > http://cr.openjdk.java.net/~roland/8244504/webrev.01/ A few comments: In C2 whenever a TypeNode is created, it is given a type you the programmer know is "as good as possible for now", and then maybe it narrows some more later. Often that type is redundant. Case in point: in the new call to set_type on inner_iters_actual_int, the TypeInt should really be derived in a Value call that sees the earlier TypeLong, not made from scratch. I wish there were a way to do this with less manual repetition, because bugs can hide there. I think this kind of repetition is endemic to C2, so I don't have any more to say about this than to complain.
I sort of want a function called PhaseTransform::set_type_Value which starts from set_type_bottom but then sharpens it once, by calling Value (on the assumption that the nearby nodes already have a phase-assigned type). Uses of register_new_node_with_optimizer are places where such a function might reduce redundancy. The assigned types in build_min_max* are also provided "by fiat" rather than by local inference, and it's harder to prove correct because of the way the code is factored (but thank you for doing that!). I really like your asserts that test the "fiat type" against the "physics" of the node being created; I mean those that say "type doesn't match". Would it make sense to add more such asserts on the other branches? You'd need a work-alike for the Value method of MaxINode for all the other cases, so (sigh) maybe it's too much to squeeze into this work. The lack of good asserts here feels like tech. debt, though. Would you mind filing a suitable bug, if you agree with me? Perhaps the right answer is to move the GraphKit functions (one more time) onto polymorphic factory methods MaxNode::make and MinNode::make, styled like LoadNode::make. Then any fiddly min/max interval logic (for assertion checks) can go where it really belongs, in the same file as MaxINode::Value. (Which happens to be addnode.cpp, but whatever.) Also MinNode::make_zero_clipped_difference. With an optional unsigned flag. Another reason I'm thinking this could make sense is that, while I thought GraphKit was a good place for the new functions, the _gvn instance isn't a GK, so you had to make it a static method on GK that takes _gvn instead of "this". That's a smell, isn't it? (My bad.) I think a polymorphic factory method or three on MaxNode/MinNode would work better. That would also allow future C2 work to fill out the set of min and max operations that get handled by intrinsics, with little further disturbance.
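
[Editorial sketch, not part of the original message.] The idea of checking a caller-supplied "fiat" type against the interval "physics" of a min/max node can be illustrated roughly as follows. This is a minimal standalone model, not C2 code: Range, MinMaxKind, and make_min_max_type are hypothetical names invented for the illustration, and the rule used here (intersect the fiat type with the locally inferred interval and reject an empty result) is only one plausible reading of what such an assert could check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for TypeInt/TypeLong: a closed interval [lo, hi].
struct Range {
  int64_t lo;
  int64_t hi;
  bool is_empty() const { return lo > hi; }
};

// Interval "physics" of max: the result is at least the larger lower bound
// and at most the larger upper bound.
inline Range max_range(Range a, Range b) {
  return Range{std::max(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Interval "physics" of min: the mirror image of max_range.
inline Range min_range(Range a, Range b) {
  return Range{std::min(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Intersection of two intervals; empty (lo > hi) when they do not overlap.
inline Range intersect(Range a, Range b) {
  return Range{std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

enum class MinMaxKind { Min, Max };

// Hypothetical factory-style helper. The caller may pass a "fiat" type it
// believes in; the helper infers its own interval from the inputs and keeps
// the intersection. Returns false when the fiat type contradicts the
// inferred one, which is where a "type doesn't match" assert would fire.
inline bool make_min_max_type(MinMaxKind kind, Range a, Range b,
                              const Range* fiat, Range* result) {
  Range inferred = (kind == MinMaxKind::Max) ? max_range(a, b)
                                             : min_range(a, b);
  if (fiat == nullptr) {
    *result = inferred;  // no caller knowledge: use local inference only
    return true;
  }
  Range sharpened = intersect(inferred, *fiat);
  if (sharpened.is_empty()) {
    return false;        // fiat type is inconsistent with the inputs
  }
  *result = sharpened;
  return true;
}
```

Putting the interval rules next to each other like this is the point of the factory suggestion: the same min_range/max_range logic could back both a Value-style computation and the assertion, instead of being repeated at each call site.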
The logic of check_stride_overflow is now clear as crystal, where it used to be very muddy indeed. Thanks for adding the extra comments. I looked for a while at the 32-bit and 64-bit versions of the counted loop pattern match, and thought about whether they could be brought even closer together, perhaps making the comments more similar, calling out where there are features or limitations on one but not the other, but gave up. Further unifying the loop detection would be a larger task, and perhaps not worth the effort. -- John From Xiaohong.Gong at arm.com Fri May 22 02:36:36 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Fri, 22 May 2020 02:36:36 +0000 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com> References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com> Message-ID: Hi Andrew, > On 5/21/20 11:24 AM, Xiaohong Gong wrote: > > I've created a new patch to add the condition when inserting > "DMBs" > > before volatile load for C1/Interpreter. > > The updated webrev: > > http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ > > > > It adds a new function "is_c1_or_interpreter_only()", which can > > decide whether C2/JVMCI is used. Besides, since AOT also uses > Graal > > compiler as the codegen, it always returns false if AOT mode is > enabled. > > Looks good to me, thanks. > > As far as I remember, Graal does optimize volatile accesses to use > ldar/stlr, or at least it will do so in the future, so if we're > using AOT or JVMCI the safe thing to do is add the DMBs. Yes, exactly! There is a patch on the Graal GitHub to do this optimization (https://github.com/oracle/graal/pull/1772).
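
[Editorial sketch, not part of the original message.] The decision discussed in this thread can be modeled roughly as below. This is not the code from the webrev: VmModeSketch and both functions are hypothetical, and the four booleans are a deliberate simplification of the build-time and runtime settings mentioned in the thread (a C2/JVMCI compiler being present, AOT, interpreter-only mode, and flags such as TieredStopAtLevel=1 or quick-only compilation that stop compilation at C1).

```cpp
#include <cassert>

// Hypothetical, simplified view of the VM configuration. The real VM
// consults its own flags and build settings, which are not modeled here.
struct VmModeSketch {
  bool c2_or_jvmci_built_in;     // build time: is a C2 or JVMCI compiler present?
  bool aot_enabled;              // AOT code is generated by Graal
  bool interpreter_only;         // runtime: no JIT compilation at all
  bool compilation_stops_at_c1;  // runtime: compilation is limited to C1
};

// True only when no code in the process can come from C2, JVMCI, or AOT.
inline bool is_c1_or_interpreter_only_sketch(const VmModeSketch& vm) {
  if (vm.aot_enabled) {
    return false;  // AOT uses Graal as its code generator
  }
  if (!vm.c2_or_jvmci_built_in) {
    return true;   // nothing but C1 and the interpreter exists in this build
  }
  return vm.interpreter_only || vm.compilation_stops_at_c1;
}

// The extra DMB before a volatile load in C1/interpreter code is only
// needed when some other compiler may emit ldar/stlr for volatile accesses.
inline bool needs_dmb_before_volatile_load(const VmModeSketch& vm) {
  return !is_c1_or_interpreter_only_sketch(vm);
}
```

The AOT check comes first so that the conservative answer (keep the DMBs) wins whenever Graal-generated code might be present, which matches the "safe thing to do" reasoning quoted above.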
Thanks, Xiaohong -----Original Message----- From: Andrew Haley Sent: Thursday, May 21, 2020 10:06 PM To: Xiaohong Gong ; Andrew Dinn ; Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option On 5/21/20 11:24 AM, Xiaohong Gong wrote: > I'v created a new patch to add the condition when inserting "DMBs" > before volatile load for C1/Interpreter. > The updated webrev: > http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ > > It adds a new function "is_c1_or_interpreter_only()" , which can > decide whether C2/JVMCI is used. Besides, since AOT also uses Graal > compiler as the codegen, it always return false if AOT mode is enabled. Looks good to me, thanks. As far as I remember, Graal does optimize volatile accesses to use ldar/stlr, or at least it will do so in the future, so if we're using AOT or JVMCI the safe thing to do is add the DMBs. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From mikael.vidstedt at oracle.com Fri May 22 03:36:51 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 21 May 2020 20:36:51 -0700 Subject: RFR(XS): 8245521: Remove STACK_BIAS Message-ID: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> Please review this small change which removes the STACK_BIAS constant and its uses: JBS: https://bugs.openjdk.java.net/browse/JDK-8245521 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245521/webrev.00/open/webrev/ Background (from JBS): With Solaris/SPARC removed the STACK_BIAS definition in src/hotspot/share/utilities/globalDefinitions.hpp is now always 0 and can be removed. 
Testing: Tier1 Cheers, Mikael From david.holmes at oracle.com Fri May 22 04:11:09 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 22 May 2020 14:11:09 +1000 Subject: RFR(XS): 8245521: Remove STACK_BIAS In-Reply-To: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> References: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> Message-ID: Hi Mikael, Looks good. I assume the change to GraalHotSpotVMConfig.java is to allow it to work with older VMs? Thanks, David On 22/05/2020 1:36 pm, Mikael Vidstedt wrote: > > Please review this small change which removes the STACK_BIAS constant and its uses: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245521 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245521/webrev.00/open/webrev/ > > Background (from JBS): > > With Solaris/SPARC removed the STACK_BIAS definition in src/hotspot/share/utilities/globalDefinitions.hpp is now always 0 and can be removed. > > > Testing: > > Tier1 > > Cheers, > Mikael > From xxinliu at amazon.com Fri May 22 07:34:43 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Fri, 22 May 2020 07:34:43 +0000 Subject: RFR[M]: 8151779: Some intrinsic flags could be replaced with one general flag In-Reply-To: <1EBE66E6-9AA7-4EC5-9B91-45F884071FAC@amazon.com> References: <19CD3956-4DC6-4908-8626-27D48A9AB4A4@amazon.com> <0EDAAC88-E5D9-424F-A19E-5E20C689C2F3@amazon.com> <801D878C-CAE5-4EBE-8AFE-4E35346CD5BD@amazon.com> <58ff5b66-1dce-d4ad-8f21-254abd1b887b@oracle.com> <65dcfd1f-5e7e-b9e1-8298-5daafcda8a81@oracle.com> <1EBE66E6-9AA7-4EC5-9B91-45F884071FAC@amazon.com> Message-ID: <2982174F-DBB6-4316-93C3-1B4DFDF34C88@amazon.com> Hi, Please allow me to ping for this code review. Here is the latest webrev: http://cr.openjdk.java.net/~xliu/8151779/05/webrev/ Incremental diff: http://cr.openjdk.java.net/~xliu/8151779/r4_to_r5.diff I verified it in submit repo a week ago. I also double-check the patch still can patch to TIP and pass both hotspot:tier1 and gtest:all. 
Here is log message I got from mach-5. Job: mach5-one-phh-JDK-8151779-1-20200513-1821-11015755 BuildId: 2020-05-13-1820211.hohensee.source No failed tests Tasks Summary EXECUTED_WITH_FAILURE: 0 NOTHING_TO_RUN: 0 KILLED: 0 HARNESS_ERROR: 0 FAILED: 0 PASSED: 101 UNABLE_TO_RUN: 0 NA: 0 Thanks, --lx ?On 5/13/20, 12:03 AM, "hotspot-compiler-dev on behalf of Liu, Xin" wrote: Hi, Vladimir, > 2. add +/- UseCRC32Intrinsics to IntrinsicAvailableTest.java > The purpose of that test is not to generate a CRC32 intrinsic. Its purpose is to check if compilers determine to intrinsify _updateCRC32 or not. > Mathematically, "UseCRC32Intrinsics" is a set = [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32]. > "-XX:-UseCRC32Intrinsics" disables all 3 of them. If users use -XX:ControlIntrinsic=+_updateCRC32 and -XX:-UseCRC32Intrinsics, _updateCRC32 should be enabled individually. No, I think we should preserve current behavior when UseCRC32Intrinsics is off then all corresponding intrinsics are also should be off. This is the purpose of such flags - to be able control several intrinsics with one flag. Otherwise you have to check each individual intrinsic if CPU does not support them. Even if code for some of these intrinsics can be generated on this CPU. We should be consistent, otherwise code can become very complex to support. ---- If -XX:ControlIntrinsic=+_updateBytesCRC32 can't win over -XX:-UseCRC32Intrinsics, it will come back the justification of JBS-8151779: Why do we need to support the usage -XX:ControlIntrinsic=+_updateBytesCRC32? If a user doesn't set +updateBytesCRC32, it's still enabled. I read the description of "JBS-8235981" and "JBS-8151779" again. I try to understand in this way. The option 'UseCRC32Intrinsics' is the consolidation of 3 real intrinsics [_updateCRC32, _updateBytesCRC32, _updateByteBufferCRC32]. It represents some sorta hardware capabilities to make those intrinsics optimal. 
If UseCRC32Intrinsics is OFF, it will not make sense to intrinsify them anymore because inliner can deliver the similar result. Quote from JBS-8235981 "Right now, there's no way to introduce experimental intrinsics which are turned off by default and let users enable them on their side. " Currently, once a user declares one new intrinsics in VM_INTRINSICS_DO, it's enabled. It might not be true in the future. i.e. A develop can declare an intrinsic but mark it turn-off by default. He will try it out by -XX:ControlIntrinsic=+_myNewIntrinsic in his development stage. Do I catch up your intention this time? if yes, could you take a look at this new revision? I think I meet the requirement. Webrev: http://cr.openjdk.java.net/~xliu/8151779/05/webrev/ Incremental diff: http://cr.openjdk.java.net/~xliu/8151779/r4_to_r5.diff Here is the change log from rev04. 1) An intrinsic is enabled if and only if neither ControlIntrinsic nor the corresponding UseXXXIntrinsics disables it. The implementation is still in vmIntrinsics::is_disabled_by_flags(vmIntrinsics::ID id). 2) I introduce a compact data structure TriBoolArray. It compresses an array of Tribool. Each tribool only takes 2 bits now. I also took Coleen's suggestion to put TriBool and TriBoolArray in a standalone file "utilities/tribool.hpp". A new gtest is attached. 3) Correct some typos. Thank you David pointed them out. Thanks, --lx On 5/12/20, 12:59 AM, "David Holmes" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Hi, Sorry for the delay in getting back to this. On 5/05/2020 7:37 pm, Liu, Xin wrote: > Hello, David and Nils > > Thank you to review the patch. I went to brush up my English grammar and then update my patch to rev04. 
> https://cr.openjdk.java.net/~xliu/8151779/04/webrev/ > Here is the incremental diff: https://cr.openjdk.java.net/~xliu/8151779/r3_to_r4.diff It reflect changes based on David's feedbacks. I really appreciate that you review so carefully and found so many invaluable suggestions. TBH, I don't understand Amazon's copyright header neither. I choose the simple way to dodge that problem. In vmSymbols.hpp + // 1. Disable/Control Intrinsic accept a list of intrinsic IDs. s/accept/accepts/ + // their final value are subject to hardware inspection (VM_Version::initialize). s/value/values/ Otherwise all my nits have been addressed - thanks. I don't need to see a further webrev. Thanks, David ----- > Nils points out a very tricky question. Yes, I also notice that each TriBool takes 4 bytes on x86_64. It's a natural machine word and supposed to be the most efficient form. As a result, the vector control_words take about 1.3Kb for all intrinsics. I thought it's not a big deal, but Nils brought up that each DirectiveSet will increase from 128b to 1440b. Theoretically, the user may provide a CompileCommandFile which consists of hundreds of directives. Will hotspot have hundreds of DirectiveSet in that case? > > Actually, I do have a compacted container of TriBool. It's like a vector specialization. > https://cr.openjdk.java.net/~xliu/8151779/TriBool.cpp > > The reason I didn't include it because I still feel that a few KiloBytes memories are not a big deal. Nowadays, hotspot allows Java programmers allocate over 100G heap. Is it wise to increase software complexity to save KBs? > > If you think it matters, I can integrate it. May I update TriBoolArray in a standalone JBS? I have made a lot of changes. I hope I can verify them using KitchenSink? > > For the second problem, I think it's because I used 'memset' to initialize an array of objects in rev01. 
Previously, I had code like this: > memset(&_intrinsic_control_words[0], 0, sizeof(_intrinsic_control_words)); > > This kind of usage will be warned as -Werror=class-memaccess in g++-8. I have fixed it since rev02. I use DirectiveSet::fill_in(). Please check out. > > Thanks, > --lx > From manc at google.com Fri May 22 08:13:46 2020 From: manc at google.com (Man Cao) Date: Fri, 22 May 2020 01:13:46 -0700 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> Message-ID: Hi Nils, Thanks for the updated code! Below is my feedback after a closer look, ordered by importance. 1. sweeper_loop() NMethodSweeper::sweeper_loop() seems to be missing an inner loop to check for a wakeup condition. It could suffer from lost wakeup and spurious wakeup problems, as described in [1]. Similar code like ServiceThread::service_thread_entry() also has nested loops to check for wakeup conditions. We could add a boolean variable "should_sweep", guarded by the CodeSweeper_lock. [1] https://www.modernescpp.com/index.php/c-core-guidelines-be-aware-of-the-traps-of-condition-variables One wild idea is to let the ServiceThread handle the code cache sweep work, and remove the sweeper thread. Has anyone considered this idea before? 2. Rank of CodeSweeper_lock In mutexLocker.cpp: def(CodeSweeper_lock , PaddedMonitor, special+1, true, _safepoint_check_never); Should the rank be "special - 1", like the CompiledMethod_lock? We want to check that this lock acquisition order is valid, but not vice versa: { MonitorLocker(CodeCache_lock); MonitorLocker(CodeSweeper_lock); } Reading the code in Mutex::set_owner_implementation(), the deadlock avoidance rule enforces that the inner lock should have a lower rank than the outer lock. "special + 1" has the same value as Mutex::suspend_resume, which disables the deadlock avoidance check. 3. 
Data race on _bytes_changed 74 static volatile int _bytes_changed; The "volatile" keyword likely intends to avoid data races and atomicity issues, but the accesses use "_bytes_changed +=" and "=" to do loads and stores. Should those accesses use Atomic::add(&_bytes_changed, value) and other Atomic functions. 4. NMethodSweeper::_sweep_threshold Is it better to make it a "size_t" instead of "int"? Then we can use "ulong" in metadata.xml, and SIZE_FORMAT in the log_info()). Also it's probably better to name it as _sweep_threshold_bytes or _sweep_threshold_bytes, to differentiate from the SweeperThreshold percentage value. _sweep_threshold's type should probably be consistent with _bytes_changed, so perhaps _bytes_changed could be changed to size_t. 5. 887 void CompileBroker::init_compiler_sweeper_threads() { 888 NMethodSweeper::set_sweep_threshold((SweeperThreshold / 100) * ReservedCodeCacheSize); Is it better to use "static_cast()" to explicitly mark type cast? -Man From aph at redhat.com Fri May 22 13:00:58 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 22 May 2020 14:00:58 +0100 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: References: Message-ID: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> On 5/19/20 11:55 AM, Yang Zhang wrote: > Following up on review requests of API [0], Java implementation and > test [1], General Hotspot changes[2] for Vector API and x86 backend > changes [3]. Here's a request for review of AArch64 backend changes > required for supporting the Vector API: > > JEP: https://openjdk.java.net/jeps/338 > JBS: https://bugs.openjdk.java.net/browse/JDK-8223347 > Webrev: http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.01/ > > Complete implementation resides in vector-unstable branch of panama/dev > repository [4]. This looks great, and it's very impressive. 
Unfortunately, there are few of us sufficiently knowledgeable about Panama to review it in the detail that perhaps it deserves. I'm happy with it. However, we need tests for the new assembly instructions, so please add some to aarch64_asmtest.py and update macroassemler.cpp. Also, aarch64.ad is getting to be far too large, and perhaps all the vector stuff should be moved into a new file. Thanks. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From matthias.baesken at sap.com Fri May 22 06:38:17 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 22 May 2020 06:38:17 +0000 Subject: RFR(XS): 8245521: Remove STACK_BIAS In-Reply-To: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> References: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> Message-ID: Hi Mikael, looks good, thanks for the cleanup . Best regards, Matthias -----Original Message----- From: ppc-aix-port-dev On Behalf Of Mikael Vidstedt Sent: Freitag, 22. Mai 2020 05:37 To: hotspot compiler ; hotspot-runtime-dev at openjdk.java.net runtime ; serviceability-dev ; ppc-aix-port-dev at openjdk.java.net Subject: RFR(XS): 8245521: Remove STACK_BIAS Please review this small change which removes the STACK_BIAS constant and its uses: JBS: https://bugs.openjdk.java.net/browse/JDK-8245521 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245521/webrev.00/open/webrev/ Background (from JBS): With Solaris/SPARC removed the STACK_BIAS definition in src/hotspot/share/utilities/globalDefinitions.hpp is now always 0 and can be removed. 
Testing: Tier1 Cheers, Mikael From paul.sandoz at oracle.com Fri May 22 16:12:03 2020 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 22 May 2020 09:12:03 -0700 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> Message-ID: <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> HI Andrew, Thanks for looking. I am not terribly familiar with the AArch64 code, but I would note the Vector API comes with a bunch of unit tests should exercise the code gen, just not as directly as I presume you would like. To what extent do you feel we can follow up with additional issues and fix them after the initial integration? Paul. > On May 22, 2020, at 6:00 AM, Andrew Haley wrote: > > On 5/19/20 11:55 AM, Yang Zhang wrote: >> Following up on review requests of API [0], Java implementation and >> test [1], General Hotspot changes[2] for Vector API and x86 backend >> changes [3]. Here's a request for review of AArch64 backend changes >> required for supporting the Vector API: >> >> JEP: https://openjdk.java.net/jeps/338 >> JBS: https://bugs.openjdk.java.net/browse/JDK-8223347 >> Webrev: http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.01/ >> >> Complete implementation resides in vector-unstable branch of panama/dev >> repository [4]. > > This looks great, and it's very impressive. Unfortunately, there are few > of us sufficiently knowledgeable about Panama to review it in the detail > that perhaps it deserves. I'm happy with it. > > However, we need tests for the new assembly instructions, so please add some > to aarch64_asmtest.py and update macroassemler.cpp. > > Also, aarch64.ad is getting to be far too large, and perhaps all the vector > stuff should be moved into a new file. Thanks. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. 
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From aph at redhat.com Fri May 22 17:40:12 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 22 May 2020 18:40:12 +0100 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> Message-ID: <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> On 5/22/20 5:12 PM, Paul Sandoz wrote: > I am not terribly familiar with the AArch64 code, but I would note > the Vector API comes with a bunch of unit tests should exercise the > code gen, just not as directly as I presume you would like. Yes, you've understood me: direct is what I want. The assembler tests are intended to make sure we generate exactly the right instructions, rather than having something painfully hard to debug later on. When a patch adds a lot of instructions to the assembler, that IMO is the right time to test that they generate correctly-encoded instructions. But yes, that can go into a follow-up patch, as long as it gets done fairly shortly. > To what extent do you feel we can follow up with additional issues > and fix them after the initial integration? We can do that. Note that after this patch, aarch64.ad is 21762 lines long. I know we don't have any hard-and-fast rule about this, but I'd rather it didn't get forgotten. Maybe I should do that one myself, but I guess I'd rather avoid the problem of version skew between the Panama repo and mainline. That'd make merging rather grim. Yang, against which repo is this webrev intended to be applied? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From paul.sandoz at oracle.com Fri May 22 18:01:01 2020 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 22 May 2020 11:01:01 -0700 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> Message-ID: <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> > On May 22, 2020, at 10:40 AM, Andrew Haley wrote: > > On 5/22/20 5:12 PM, Paul Sandoz wrote: > >> I am not terribly familiar with the AArch64 code, but I would note >> the Vector API comes with a bunch of unit tests should exercise the >> code gen, just not as directly as I presume you would like. > > Yes, you've understood me: direct is what I want. The assembler tests > are intended to make sure we generate exactly the right instructions, > rather than having something painfully hard to debug later on. When a > patch adds a lot of instructions to the assembler, that IMO is the > right time to test that they generate correctly-encoded > instructions. But yes, that can go into a follow-up patch, as long as > it gets done fairly shortly. Ok. > >> To what extent do you feel we can follow up with additional issues >> and fix them after the initial integration? > > We can do that. Note that after this patch, aarch64.ad is 21762 lines > long. I know we don't have any hard-and-fast rule about this, but I'd > rather it didn't get forgotten. We have made changes similar in spirit to the x64 ad file (reducing in size at least), so I think it reasonable request before integration to reduce the cognitive and maintenance burden. (FWIW I don't know to what extent some functionality is utilized by the auto vectorizer and whether that impacts its location or not.)
> Maybe I should do that one myself, but > I guess I'd rather avoid the problem of version skew between the > Panama repo and mainline. That'd make merging rather grim. > > Yang, against which repo is this webrev intended to be applied? > I cannot speak for Yang but we have been generating webrevs from the Panama dev repo, branch vector-unstable (unfortunate name I know!) and doing: hg diff -r default Where the default is regularly, but manually, pulled from jdk/jdk. More specifically: - code is generally pushed to the vectorIntrinsics branch - on a manual but regular basis vectorIntrinsics is synced up to jdk/jdk (via the default branch). - on a manual but regular basis vector-unstable is synced with vectorIntrinsics - occasionally there are fixes pushed directly to vector-unstable for the purposes of integration (e.g. removal of the perf test or the x64 SVML stubs). Hth, Paul. From aph at redhat.com Sat May 23 09:16:07 2020 From: aph at redhat.com (Andrew Haley) Date: Sat, 23 May 2020 10:16:07 +0100 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> Message-ID: <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com> On 5/22/20 7:01 PM, Paul Sandoz wrote: > We have made changes similar in spirit to the x64 ad file (reducing in size at least), so I think it reasonable request before integration to reduce the cognitive and maintenance burden. So here's a question: can the changes to the AArch64 back end be made to mainline now? Or is there a problem in that the C2 patterns being used don't exist? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From vaibhav.x.choudhary at oracle.com Sat May 23 14:48:26 2020 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Sat, 23 May 2020 20:18:26 +0530 Subject: 8245179: [TESTBUG] compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java fails with custom Tiered Level set externally Message-ID: Hi, Please review this trivial patch. CR - https://bugs.openjdk.java.net/browse/JDK-8245179 > Webrev: http://cr.openjdk.java.net/~vaibhav/8245179/webrev.00/ Testcase: compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java Description: 8241232 prevents BootStrapJVMCI to run with TieredStopAtLevel. This issue is a test fix which suggests, not to run the testcase with TieredStopAtLevel. - * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" + * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" & vm.opt.TieredStopAtLevel == null Thanks, Vaibhav Choudhary From Yang.Zhang at arm.com Mon May 25 08:26:45 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Mon, 25 May 2020 08:26:45 +0000 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com> Message-ID: Hi Andrew Please check the following. >> We have made changes similar in spirit to the x64 ad file (reducing in size at least), so I think it reasonable request before integration to reduce the cognitive and maintenance burden. > So here's a question: can the changes to the AArch64 back end be made to mainline now? Or is there a problem in that the C2 patterns being used don't exist? 
X86 ad file has been refactored to reduce code size in a series of JBSs [1]. I also investigated how to make similar changes to AArch64 ad file in Aug 2019. In summary, these changes wouldn't have a great impact on AArch64. If making similar changes to AArch64, about ~200kb (1% of total code size) will be reduced [2]. So I don't think we need to make these changes to AArch64 for now. To reduce maintenance burden, I agree that all the vector stuff should be moved into a new file, just like AArch64 SVE has done [3]. All the SVE instructions are generated by m4 file and placed in aarch64_sve.ad. For newly added NEON instructions in Vector API, they are all generated by m4 file. In jdk master, what we need to do is that writing m4 file for existing vector instructions and placed them to a new file aarch64_neon.ad. If no question, I will do it right away. [1] https://bugs.openjdk.java.net/browse/JDK-8230015 [2] The data came from: I implement an example (merging vaddB/S/I/F) in AArch64 platform. The code size reduction in libjvm.so is ~15kb. If all the vector instructions are merged, the estimated size reduction would be ~200kb. The code size from vector support in AArch64 backend is ~450kb (2% of total size). Original libjvm.so is 20770256, while libjvm.so without vector support is 20317328. Based on the idea of generic vector operands, half of vector stuff can be removed. So the estimated size reduction is also ~200kb. 
[3] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March/037628.html Regards Yang -----Original Message----- From: Andrew Haley Sent: Saturday, May 23, 2020 5:16 PM To: Paul Sandoz Cc: Yang Zhang ; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; nd Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes On 5/22/20 7:01 PM, Paul Sandoz wrote: > We have made changes similar in spirit to the x64 ad file (reducing in size at least), so I think it reasonable request before integration to reduce the cognitive and maintenance burden. So here's a question: can the changes to the AArch64 back end be made to mainline now? Or is there a problem in that the C2 patterns being used don't exist? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rwestrel at redhat.com Mon May 25 09:11:37 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 25 May 2020 11:11:37 +0200 Subject: RFR(XS): 8245714: "Bad graph detected in build_loop_late" when loads are pinned on loop limit check uncommon branch Message-ID: <87tv041ira.fsf@redhat.com> https://bugs.openjdk.java.net/browse/JDK-8245714 http://cr.openjdk.java.net/~roland/8245714/webrev.00/ This triggers when data nodes are pinned on the uncommon trap path of a predicate. When a new predicate is added, a region is created to merge the paths comming from the place holder and the new predicate. Data nodes pinned on the uncommon path for the place holder are then updated to be pinned on the new region. That logic updates the control edge but not the control that loop opts keep track of. This causes a crash with the test case of the webrev where the predicate is a loop limit check. Roland. 
From rwestrel at redhat.com Mon May 25 09:22:43 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 25 May 2020 11:22:43 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> Message-ID: <87r1v81i8s.fsf@redhat.com> > I really like your asserts that > test the "fiat type" against the "physics" of the node being > created; I mean those that say "type doesn't match". > Would it make sense to add more such asserts on the > other branches? You'd need a work-alike for the Value > method of MaxINode for all the other cases, so (sigh) > maybe it's too much to squeeze into this work. The > lack of good asserts here feels like tech. debt, though. > Would you mind filing a suitable bug, if you agree with > me? Do I understand right that the new bug would be: to provide CMoveINode::Value() and CMoveLNode::Value() in the case where the input condition follows the pattern of min/max and then add asserts to GraphKit::build_min_max(), so essentially the integer interval logic I gave up on? > Perhaps the right answer is to move the GraphKit > functions (one more time) onto polymorphic factory > methods MaxNode::make and MinNode::make, styled > like LoadNode::make. Sure but there's no MinNode, only a MaxNode: class MinINode : public MaxNode { Should MaxNode be renamed to MinMaxNode? There's not much in MaxNode so I suppose its body could be cloned to introduce a MinNode too. Roland.
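For readers unfamiliar with the "integer interval logic" mentioned in the message above, here is a minimal sketch of the idea in Java. It is not HotSpot code: the class IntRange and its methods are hypothetical names chosen for illustration, and the real C2 type lattice (TypeInt and the Value() methods) carries more state than this. The sketch only shows the arithmetic: if a lies in [a.lo, a.hi] and b lies in [b.lo, b.hi], then max(a, b) lies in [max(a.lo, b.lo), max(a.hi, b.hi)], and symmetrically for min.

```java
// Hypothetical illustration only; not HotSpot's TypeInt or MaxNode code.
final class IntRange {
    final int lo;
    final int hi;

    IntRange(int lo, int hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("empty range");
        }
        this.lo = lo;
        this.hi = hi;
    }

    // Range of max(a, b) when a is in this range and b is in other.
    IntRange max(IntRange other) {
        return new IntRange(Math.max(lo, other.lo), Math.max(hi, other.hi));
    }

    // Range of min(a, b) when a is in this range and b is in other.
    IntRange min(IntRange other) {
        return new IntRange(Math.min(lo, other.lo), Math.min(hi, other.hi));
    }

    // True if every value of this range also lies in other.
    boolean isSubrangeOf(IntRange other) {
        return other.lo <= lo && hi <= other.hi;
    }
}
```

An assert of the kind discussed in the thread could then, under these assumptions, compare a range supplied by the caller against the computed one, for example by checking that computed.isSubrangeOf(supplied) holds.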
From rwestrel at redhat.com Mon May 25 15:21:27 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 25 May 2020 17:21:27 +0200 Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation In-Reply-To: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com> References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com> Message-ID: <87o8qc11mw.fsf@redhat.com> Hi Patric, > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8229495 > Webrev: http://cr.openjdk.java.net/~phedlin/tr8229495/ Running compiler/loopopts/IterationSplitPredicateInconsistency.java with -XX:LoopUnrollLimit=1024 and your patch causes a crash: # Internal Error (/home/roland/jdk-jdk/src/hotspot/share/opto/loopnode.cpp:4016), pid=4004890, tid=4004904 # assert(!had_error) failed: bad dominance It doesn't crash without. Loop unrolling heuristics must have changed recently. I don't see as much unrolling as I used to when I worked on this. Roland. From john.r.rose at oracle.com Mon May 25 16:22:48 2020 From: john.r.rose at oracle.com (John Rose) Date: Mon, 25 May 2020 09:22:48 -0700 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <87r1v81i8s.fsf@redhat.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> Message-ID: On May 25, 2020, at 2:22 AM, Roland Westrelin wrote: > > >> I really like your asserts that >> test the "fiat type" against the "physics" of the node being >> created; I mean those that say "type doesn't match". >> Would it make sense to add more such asserts on the >> other branches? You'd need a work-alike for the Value >> method of MaxINode for all the other cases, so (sigh) >> maybe it's too much to squeeze into this work.
The >> lack of good asserts here feels like tech. debt, though. >> Would you mind filing a suitable bug, if you agree with >> me? > > Do I understand right that the new bug would be: to provide > CMoveINode::Value() and CMoveLNode::Value() in the case where the input > condition follows the pattern of min/max and then add asserts to > GraphKit::build_min_max(), so essentially the integer interval logic I > gave up on? Maybe that's the right outcome, but I think that interval logic should be co-located with other MaxNode stuff, rather than hanging out in movenode.cpp. > >> Perhaps the right answer is to move the GraphKit >> functions (one more time) onto polymorphic factory >> methods MaxNode::make and MinNode::make, styled >> like LoadNode::make. > > Sure but there's no MinNode, only a MaxNode: > > class MinINode : public MaxNode { > > Should MaxNode be renamed to MinMaxNode? There's not much in MaxNode so > I suppose its body could be cloned to introduce a MinNode too. Kind of like MulNode covers AndNode (IIRC). The node supertype factoring is "lumpy" rather than "splitty". I think min and max are so closely connected that they belong in one lump rather than being split apart. So I think keeping MaxNode as a lump is appropriate here. If you are comfortable renaming it to MinMaxNode, fine, but the existing name is acceptable, just like MulNode. (ExtremumNode feels too pedantic for this code base.) I'm not totally against splitting out MinNode, either, but the two classes would surely be coupled behind the scenes, since the logic is almost the same between the two. We wouldn't want to clone the code and then reverse all the comparisons, right? That's what boolean flags are for. The important thing is to keep the min/max logic in one place, since it is tricky. I think the new bug should suggest factoring min/max for I and L so that they can all be hooked to hardware intrinsics, rather than boiling them down eagerly to CMoves. I suggest (not for now!)
allowing min/max nodes to exist as themselves during GVN passes, and only later turn to CMoves if the matcher can't handle them. (So hardware min/max are optional intrinsics, to be covered by c-move if they are absent.) If that's a bad idea, and we wish to keep the mix of MaxI with CMoveL (although it's awkward) then we'd want CMoveL to have a little side channel to talk to the MaxNode logic when it can tell it is "really" a MaxLNode. That's what I was thinking. But for now, the *factory* logic should factor through MaxNode (or a renamed MinMaxNode), and not be placed in a random location; my GraphKit suggestion was wrong. The rest of your changes are good to go, I think. Thanks. -- John From vaibhav.x.choudhary at oracle.com Tue May 26 03:42:14 2020 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Tue, 26 May 2020 09:12:14 +0530 Subject: 8245179: [TESTBUG] compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java fails with custom Tiered Level set externally In-Reply-To: References: Message-ID: <96EE0FB3-B79A-47A6-8A30-D70CC4381AEF@oracle.com> Gentle reminder. > On 23-May-2020, at 8:18 PM, Vaibhav Choudhary wrote: > > Hi, > > Please review this trivial patch. > > CR - https://bugs.openjdk.java.net/browse/JDK-8245179 > > Webrev: http://cr.openjdk.java.net/~vaibhav/8245179/webrev.00/ > > Testcase: compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java > Description: 8241232 prevents BootStrapJVMCI from running with TieredStopAtLevel. This issue is a test fix that suggests not running the test case with TieredStopAtLevel.
> > - * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" > + * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" & vm.opt.TieredStopAtLevel == null > > > Thanks, > Vaibhav Choudhary > From Xiaohong.Gong at arm.com Tue May 26 06:57:34 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Tue, 26 May 2020 06:57:34 +0000 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled Message-ID: Hi, Could you please help to review this simple patch? It fixes the issue that JVM crashes in debug mode when the vm option "-XX:EnableJVMCIProduct" is enabled repetitively. JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in the command line makes the assertion fail in debug mode: "assert(is_experimental(), sanity)". It happens when the VM iteratively parses the options from command line. When the matched option is "-XX:+EnableJVMCIProduct", the original experimental JVMCI flags will be converted to product mode, with the above assertion before it. So if all the JVMCI flags have been converted to the product mode at the first time parsing "-XX:+EnableJVMCIProduct", the assertion will fail at the second time it is parsed. A simple fix is to just ignoring the conversion if this option has been parsed. Testing: Tested jtreg hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 and jcstress:tests-custom, and all tests pass without new failure. Thanks, Xiaohong Gong From bobpengxie at tencent.com Tue May 26 07:54:41 2020 From: bobpengxie at tencent.com (=?utf-8?B?Ym9icGVuZ3hpZSjpoonpuY8p?=) Date: Tue, 26 May 2020 07:54:41 +0000 Subject: AOT fails to compile jdk.base Message-ID: <13553321-5188-4E20-B9CC-9FE072585245@tencent.com> Hi, AOT fails to compile java.base with jdk15-ea and jdk14. 
Reproduce: jaotc --output libjava.base.so --module java.base Environment: centos 7, mac 10.14.5 Errors: Please see the attachment. But it can compile successfully with jdk13. Is it still supported in the latest jdk? Thanks, Peng xie From aph at redhat.com Tue May 26 08:25:02 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 26 May 2020 09:25:02 +0100 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com> Message-ID: <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com> On 25/05/2020 09:26, Yang Zhang wrote: > In jdk master, what we need to do is that writing m4 file for existing > vector instructions and placed them to a new file aarch64_neon.ad. > If no question, I will do it right away. I'm not entirely sure that such a change is necessary now. In particular, reorganizing the existing vector instructions is IMO excessive, but I admit that it might be an improvement. But to my earlier question. please: can the new instructions be moved into jdk head first, and then merged into the Panama branch, or not? It'd help if this was possible. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Pengfei.Li at arm.com Tue May 26 09:50:57 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Tue, 26 May 2020 09:50:57 +0000 Subject: PING: RFR: 8245158: C2: Enable SLP for some manually unrolled loops Message-ID: Ping - Any reviews of this? -- Thanks, Pengfei > Can I have a review of this enhancement of C2 SLP? 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8245158 > Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/ > > Below Java loop with stride = 1 can be vectorized by C2. > for (int i = start; i < limit; i++) { > c[i] = a[i] + b[i]; > } > > But if it's manually unrolled once, like in the code below, SLP would fail to > vectorize it. > for (int i = start; i < limit; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] + b[i + 1]; > } > > Notably, if the induction variable's initial value "start" is replaced by a > compile-time constant, the vectorization works. > > Root cause of these is that in current C2 SuperWord implementation, > find_adjacent_refs() calls find_align_to_ref() to select a "best align to" > memory reference to create packs, and particularly, the reference selected > must be "pre-loop alignable". In other words, C2 must be able to adjust the > pre-loop trip count such that the vectorized access of this reference is aligned. > Hence, in find_align_to_ref(), unalignable memory references are discarded. > [1] Then SLP packs creation is aborted if no memory reference is eligible to be > the "best align to". [2] > > In current C2 SLP code, the selected "best align to" reference has two uses. > One is to compute alignment info in order to find adjacent memory > references for packs creation. Another use is to facilitate the pre-loop trip > count adjustment to align vector memory accesses in the main-loop. > But on some platforms, aligning vector accesses is not a mandatory > requirement (after Roland's JDK-8215483 [3], this is usually checked by > "!Matcher::misaligned_vectors_ok() || AlignVector"). So the "best align to" > memory reference doesn't have to be "pre-loop alignable" on all platforms. > In this patch, we only discard unalignable references when that platform- > dependent check returns true. > > After this patch, some manually unrolled loops can be vectorized on > platforms with no alignment requirement. 
As almost all modern x86 CPUs > support unaligned vector move, I suspect this can benefit the majority of > today's CPUs. > > Please note that this patch doesn't try to enable SLP for all manually unrolled > loops. If the above case is unrolled more times, vectorization may still not work. > The reason is that current SLP applies only to main-loops produced by > the iteration split. When the loop is manually unrolled many times, node > count may exceed LoopUnrollLimit, resulting in no iteration split at all. > Although this can be worked around by relaxing the unrolling policy by > slp_max_unroll_factor, we don't do it this way since splitting a big loop may > increase code size too much. Anyone who wants to vectorize a super-manually- > unrolled loop can use -XX:LoopUnrollLimit= with a greater value. > > [Tests] > > Jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1 are tested > and no new failure is found. > > Below are the results of the JMH test [4] from the above case. > > Before: > Benchmark Mode Cnt Score Error Units > TestUnrolledLoop.bar thrpt 25 58097.290 ± 128.802 ops/s > > After: > Benchmark Mode Cnt Score Error Units > TestUnrolledLoop.bar thrpt 25 260110.139 ±
10902.284 ops/s > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/a0a21978f3b9/src/hotspot/share/opt > o/superword.cpp#l780 > [2] > http://hg.openjdk.java.net/jdk/jdk/file/a0a21978f3b9/src/hotspot/share/opt > o/superword.cpp#l587 > [3] http://hg.openjdk.java.net/jdk/jdk/rev/da7dc9e92d91 > [4] http://cr.openjdk.java.net/~pli/rfr/8245158/TestUnrolledLoop.java > > -- > Thanks, > Pengfei From rwestrel at redhat.com Tue May 26 12:01:08 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 26 May 2020 14:01:08 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> Message-ID: <87lfle29dn.fsf@redhat.com> Here is an updated webrev: http://cr.openjdk.java.net/~roland/8244504/webrev.02/ I moved the min/max methods to MaxNode and filed https://bugs.openjdk.java.net/browse/JDK-8245787 I noticed the previous webrev incorrectly included some of the changes from 8223051 (support loops with long (64b) trip counts) and removed them. I intend this webrev to only be refactoring in preparation for the long loops support. Anyone else for a review? Roland. From tobias.hartmann at oracle.com Tue May 26 12:52:29 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 14:52:29 +0200 Subject: [15] RFR(XS): 8245801: StressRecompilation triggers assert "redundunt OSR recompilation detected. memory leak in CodeCache!" 
Message-ID: <2afb41f8-d79f-027d-cf8e-6564b91a5851@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8245801 http://cr.openjdk.java.net/~thartmann/8245801/webrev.00/ Running with -XX:+StressRecompilation triggers the assert added by JDK-8222670 [1] because re-compiling an OSR method is not expected. The assert is too strong. I've also fixed the indentation and a typo in the assert ("redundunt" -> "redundant"). Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8222670 From rwestrel at redhat.com Tue May 26 13:05:35 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 26 May 2020 15:05:35 +0200 Subject: RFR: 8245158: C2: Enable SLP for some manually unrolled loops In-Reply-To: References: Message-ID: <87ftbm26e8.fsf@redhat.com> > Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/ That looks reasonable to me. Roland. From david.holmes at oracle.com Tue May 26 13:11:22 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 26 May 2020 23:11:22 +1000 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled In-Reply-To: References: Message-ID: Hi, On 26/05/2020 4:57 pm, Xiaohong Gong wrote: > Hi, > > Could you please help to review this simple patch? It fixes the issue that JVM crashes > in debug mode when the vm option "-XX:EnableJVMCIProduct" is enabled repetitively. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 > Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ > > Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in the command > line makes the assertion fail in debug mode: "assert(is_experimental(), sanity)". > > It happens when the VM iteratively parses the options from command line. When the > matched option is "-XX:+EnableJVMCIProduct", the original experimental JVMCI flags > will be converted to product mode, with the above assertion before it. 
So if all > the JVMCI flags have been converted to the product mode at the first time parsing > "-XX:+EnableJVMCIProduct", the assertion will fail at the second time it is parsed. > > A simple fix is to just ignoring the conversion if this option has been parsed. Seems a reasonable approach given the already complex handling of this flag. > Testing: > Tested jtreg hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 > and jcstress:tests-custom, and all tests pass without new failure. I think adding a regression test in ./compiler/jvmci/TestEnableJVMCIProduct.java would be appropriate. Thanks, David > Thanks, > Xiaohong Gong > From rwestrel at redhat.com Tue May 26 13:18:02 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 26 May 2020 15:18:02 +0200 Subject: [15] RFR(XS): 8245801: StressRecompilation triggers assert "redundunt OSR recompilation detected. memory leak in CodeCache!" In-Reply-To: <2afb41f8-d79f-027d-cf8e-6564b91a5851@oracle.com> References: <2afb41f8-d79f-027d-cf8e-6564b91a5851@oracle.com> Message-ID: <87d06q25th.fsf@redhat.com> > http://cr.openjdk.java.net/~thartmann/8245801/webrev.00/ Looks good to me. Roland. From tobias.hartmann at oracle.com Tue May 26 13:18:47 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 15:18:47 +0200 Subject: [15] RFR(XS): 8245801: StressRecompilation triggers assert "redundunt OSR recompilation detected. memory leak in CodeCache!" In-Reply-To: <87d06q25th.fsf@redhat.com> References: <2afb41f8-d79f-027d-cf8e-6564b91a5851@oracle.com> <87d06q25th.fsf@redhat.com> Message-ID: Thanks Roland! Best regards, Tobias On 26.05.20 15:18, Roland Westrelin wrote: > >> http://cr.openjdk.java.net/~thartmann/8245801/webrev.00/ > > Looks good to me. > > Roland. 
> From felix.yang at huawei.com Tue May 26 13:25:05 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Tue, 26 May 2020 13:25:05 +0000 Subject: RFR(S): 8243670: Unexpected test result caused by C2 MergeMemNode::Ideal References: <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com> Message-ID: Gentle Ping ... > -----Original Message----- > From: Yangfei (Felix) > Sent: Thursday, May 7, 2020 9:42 PM > To: 'Tobias Hartmann' ; hotspot-compiler- > dev at openjdk.java.net > Subject: RE: RFR(S): 8243670: Unexpected test result caused by C2 > MergeMemNode::Ideal > > Hi Tobias, > > > -----Original Message----- > > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > > Sent: Thursday, May 7, 2020 5:10 PM > > To: Yangfei (Felix) ; hotspot-compiler- > > dev at openjdk.java.net > > Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2 > > MergeMemNode::Ideal > > > > Hi Felix, > > > > were you able to figure out how we ended up with two Phis with same > > input but different _adr_type? > > As I remembered, there are two major transformations which leads to this: > > 1. During Iter GVN1, a new phi is created with narrowed memory type > through PhiNode::slice_memory. > The new phi and the old phi have different _adr_type and different input. > > 2. Then C2 peel the first iteration of the given loop through > PhaseIdealLoop::do_peeling. > After that, the new phi and the old phi have same input but different > _adr_type. > > Hope this helps. 
> > Thanks, > Felix From tobias.hartmann at oracle.com Tue May 26 13:29:18 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 15:29:18 +0200 Subject: RFR: 8245158: C2: Enable SLP for some manually unrolled loops In-Reply-To: <87ftbm26e8.fsf@redhat.com> References: <87ftbm26e8.fsf@redhat.com> Message-ID: <47d371c9-d647-4e97-19f5-330831181ceb@oracle.com> +1 Best regards, Tobias On 26.05.20 15:05, Roland Westrelin wrote: > >> Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/ > > That looks reasonable to me. > > Roland. > From tobias.hartmann at oracle.com Tue May 26 13:34:22 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 15:34:22 +0200 Subject: 8245179: [TESTBUG] compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java fails with custom Tiered Level set externally In-Reply-To: <96EE0FB3-B79A-47A6-8A30-D70CC4381AEF@oracle.com> References: <96EE0FB3-B79A-47A6-8A30-D70CC4381AEF@oracle.com> Message-ID: <2d6057e7-5db7-1950-58ca-c0a43da04241@oracle.com> Looks good to me. Best regards, Tobias On 26.05.20 05:42, Vaibhav Choudhary wrote: > Gentle reminder. > > >> On 23-May-2020, at 8:18 PM, Vaibhav Choudhary wrote: >> >> Hi, >> >> Please review this trivial patch. >> >> CR - https://bugs.openjdk.java.net/browse/JDK-8245179 > >> Webrev: http://cr.openjdk.java.net/~vaibhav/8245179/webrev.00/ >> >> Testcase: compiler/jvmci/events/JvmciNotifyBootstrapFinishedEventTest.java >> Description: 8241232 prevents BootStrapJVMCI to run with TieredStopAtLevel. This issue is a test fix which suggests, not to run the testcase with TieredStopAtLevel. 
>> >> - * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" >> + * @requires vm.jvmci & !vm.graal.enabled & vm.compMode == "Xmixed" & vm.opt.TieredStopAtLevel == null >> >> >> Thanks, >> Vaibhav Choudhary >> > From tobias.hartmann at oracle.com Tue May 26 13:51:00 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 15:51:00 +0200 Subject: RFR(XS): 8245714: "Bad graph detected in build_loop_late" when loads are pinned on loop limit check uncommon branch In-Reply-To: <87tv041ira.fsf@redhat.com> References: <87tv041ira.fsf@redhat.com> Message-ID: Hi Roland, looks good and trivial to me. Best regards, Tobias On 25.05.20 11:11, Roland Westrelin wrote: > > https://bugs.openjdk.java.net/browse/JDK-8245714 > http://cr.openjdk.java.net/~roland/8245714/webrev.00/ > > This triggers when data nodes are pinned on the uncommon trap path of a > predicate. When a new predicate is added, a region is created to merge > the paths comming from the place holder and the new predicate. Data > nodes pinned on the uncommon path for the place holder are then updated > to be pinned on the new region. That logic updates the control edge but > not the control that loop opts keep track of. This causes a crash with > the test case of the webrev where the predicate is a loop limit check. > > Roland. > From tobias.hartmann at oracle.com Tue May 26 13:56:58 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 15:56:58 +0200 Subject: [15] RFR(XS): 8239083: C1 assert(known_holder == NULL || (known_holder->is_instance_klass() && (!known_holder->is_interface() || ((ciInstanceKlass*)known_holder)->has_nonstatic_concrete_methods())), "should be non-static concrete method"); In-Reply-To: <1dd061e7-f872-877e-b574-08e578f006ba@oracle.com> References: <1dd061e7-f872-877e-b574-08e578f006ba@oracle.com> Message-ID: <7bf13c9a-a1ce-a6a1-979c-50a1ea7a80bf@oracle.com> Hi Christian, looks reasonable to me. 
Best regards, Tobias On 15.05.20 11:11, Christian Hagedorn wrote: > Hi > > Please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8239083 > http://cr.openjdk.java.net/~chagedorn/8239083/webrev.00/ > > The assert fails in the test case when invoking the only static interface method with a method > handle. In this case, known_holder is non-NULL. However, known_holder would be set to NULL at [1] > since the call returns NULL when known_holder is an interface. > > In the failing test case, known_holder is non-NULL since GraphBuilder::try_method_handle_inline() > calls GraphBuilder::try_inline() with holder_known set to true which eventually lets profile_call() > to be called with a non-NULL known_holder argument. > > On the other hand, when calling a static method without a method handle, known_holder seems to be > always NULL: > profile_call() is called directly at [2] with NULL or indirectly via try_inline() [3]. In the latter > case, cha_monomorphic_target and exact_target are always NULL for static methods and therefore > known_holder will also be always NULL in profile_call(). > > We could therefore just remove the assert which seems to be too strong (not handling this edge > case). Another option would be to change the call to try_inline() in try_method_handle_inline() to > only set holder_known to true if the target is not static. The known_holder is eventually only used > in LIR_Assembler::emit_profile_call() [4] but only if op->should_profile_receiver_type() holds [5]. > This is only true if the callee is not static [6]. The webrev uses the second approach. > > What do you think? 
> > Best regards, > Christian > > > [1] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l4386 > [2] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l3571 > [3] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l2039 > [4] > http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3589 > [5] > http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3584 > [6] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_LIR.hpp#l1916 From tobias.hartmann at oracle.com Tue May 26 14:00:25 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 16:00:25 +0200 Subject: AOT fails to compile jdk.base In-Reply-To: <13553321-5188-4E20-B9CC-9FE072585245@tencent.com> References: <13553321-5188-4E20-B9CC-9FE072585245@tencent.com> Message-ID: Hi Peng Xie, On 26.05.20 09:54, bobpengxie(??) wrote: > Errors: > Please see the attachment. Attachments are stripped. Could you please upload the file somewhere and share the link? Best regards, Tobias From christian.hagedorn at oracle.com Tue May 26 14:19:51 2020 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Tue, 26 May 2020 16:19:51 +0200 Subject: [15] RFR(XS): 8239083: C1 assert(known_holder == NULL || (known_holder->is_instance_klass() && (!known_holder->is_interface() || ((ciInstanceKlass*)known_holder)->has_nonstatic_concrete_methods())), "should be non-static concrete method"); In-Reply-To: <7bf13c9a-a1ce-a6a1-979c-50a1ea7a80bf@oracle.com> References: <1dd061e7-f872-877e-b574-08e578f006ba@oracle.com> <7bf13c9a-a1ce-a6a1-979c-50a1ea7a80bf@oracle.com> Message-ID: Thank you Tobias for your review! Best regards, Christian On 26.05.20 15:56, Tobias Hartmann wrote: > Hi Christian, > > looks reasonable to me. 
> > Best regards, > Tobias > > On 15.05.20 11:11, Christian Hagedorn wrote: >> Hi >> >> Please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8239083 >> http://cr.openjdk.java.net/~chagedorn/8239083/webrev.00/ >> >> The assert fails in the test case when invoking the only static interface method with a method >> handle. In this case, known_holder is non-NULL. However, known_holder would be set to NULL at [1] >> since the call returns NULL when known_holder is an interface. >> >> In the failing test case, known_holder is non-NULL since GraphBuilder::try_method_handle_inline() >> calls GraphBuilder::try_inline() with holder_known set to true which eventually lets profile_call() >> to be called with a non-NULL known_holder argument. >> >> On the other hand, when calling a static method without a method handle, known_holder seems to be >> always NULL: >> profile_call() is called directly at [2] with NULL or indirectly via try_inline() [3]. In the latter >> case, cha_monomorphic_target and exact_target are always NULL for static methods and therefore >> known_holder will also be always NULL in profile_call(). >> >> We could therefore just remove the assert which seems to be too strong (not handling this edge >> case). Another option would be to change the call to try_inline() in try_method_handle_inline() to >> only set holder_known to true if the target is not static. The known_holder is eventually only used >> in LIR_Assembler::emit_profile_call() [4] but only if op->should_profile_receiver_type() holds [5]. >> This is only true if the callee is not static [6]. The webrev uses the second approach. >> >> What do you think? 
>> >> Best regards, >> Christian >> >> >> [1] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l4386 >> [2] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l3571 >> [3] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_GraphBuilder.cpp#l2039 >> [4] >> http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3589 >> [5] >> http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l3584 >> [6] http://hg.openjdk.java.net/jdk/jdk/file/dd0caf00b05c/src/hotspot/share/c1/c1_LIR.hpp#l1916 From tobias.hartmann at oracle.com Tue May 26 14:23:39 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 26 May 2020 16:23:39 +0200 Subject: RFR(S): 8243670: Unexpected test result caused by C2 MergeMemNode::Ideal In-Reply-To: References: <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com> Message-ID: Hi Felix, thanks for the details, makes sense to me. Isn't the root cause that we are loosing type information and wouldn't that be solved by selecting the Phi with the more restrictive _adr_type? Best regards, Tobias On 07.05.20 15:42, Yangfei (Felix) wrote: > Hi Tobias, > >> -----Original Message----- >> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >> Sent: Thursday, May 7, 2020 5:10 PM >> To: Yangfei (Felix) ; hotspot-compiler- >> dev at openjdk.java.net >> Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2 >> MergeMemNode::Ideal >> >> Hi Felix, >> >> were you able to figure out how we ended up with two Phis with same input >> but different _adr_type? > > As I remembered, there are two major transformations which leads to this: > > 1. During Iter GVN1, a new phi is created with narrowed memory type through PhiNode::slice_memory. > The new phi and the old phi have different _adr_type and different input. > > 2. 
Then C2 peel the first iteration of the given loop through PhaseIdealLoop::do_peeling. > After that, the new phi and the old phi have same input but different _adr_type. > > Hope this helps. > > Thanks, > Felix > From richard.reingruber at sap.com Tue May 26 14:31:27 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 26 May 2020 14:31:27 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <32f34616-cf17-8caa-5064-455e013e2313@oracle.com> References: <32f34616-cf17-8caa-5064-455e013e2313@oracle.com> Message-ID: Hi Vladimir, > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > Not an expert in JVMTI code base, so can't comment on the actual changes. > From JIT-compilers perspective it looks good. I put out webrev.1 a while ago [1]: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ You originally suggested to use a handshake to switch a thread into interpreter mode [2]. I'm using a direct handshake now, because I think it is the best fit. May I ask if webrev.1 still looks good to you from JIT-compilers perspective? Can I list you as (partial) Reviewer? Thanks, Richard. [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html [2] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030340.html -----Original Message----- From: Vladimir Ivanov Sent: Freitag, 7. 
Februar 2020 09:19 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ Not an expert in JVMTI code base, so can't comment on the actual changes. From JIT-compilers perspective it looks good. Best regards, Vladimir Ivanov > Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > > The change avoids making all compiled methods on stack not_entrant when switching a java thread to > interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. > > Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. > > Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > Thanks, Richard. > > See also my question if anyone knows a reason for making the compiled methods not_entrant: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html > From vladimir.kozlov at oracle.com Tue May 26 19:01:05 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 26 May 2020 12:01:05 -0700 Subject: RFR(XS): 8245521: Remove STACK_BIAS In-Reply-To: References: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> Message-ID: On 5/21/20 9:11 PM, David Holmes wrote: > Hi Mikael, > > Looks good. +1 > > I assume the change to GraalHotSpotVMConfig.java is to allow it to work with older VMs? Yes. stackBias will be set to 0 if STACK_BIAS is not present. Otherwise it will be set to STACK_BIAS value. 
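The fallback described above (stackBias is 0 when the VM no longer exports STACK_BIAS, otherwise the exported value) can be sketched as a lookup with a default. This is only an illustration: VmConstants and constant_or are hypothetical names and do not reflect the actual GraalHotSpotVMConfig code.

```cpp
#include <map>
#include <string>

// Hypothetical table of constants exported by a VM build; illustrative only,
// not the real GraalHotSpotVMConfig or vmStructs API.
using VmConstants = std::map<std::string, long>;

// Return the named constant, or `fallback` when this VM does not export it.
inline long constant_or(const VmConstants& vm, const std::string& name, long fallback) {
  auto it = vm.find(name);
  return it == vm.end() ? fallback : it->second;
}
```

With this shape, a config reader would ask for constant_or(vm, "STACK_BIAS", 0) and keep working against both older VMs that still export the constant and newer ones that do not.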
Thanks, Vladimir > > Thanks, > David > > On 22/05/2020 1:36 pm, Mikael Vidstedt wrote: >> >> Please review this small change which removes the STACK_BIAS constant and its uses: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8245521 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245521/webrev.00/open/webrev/ >> >> Background (from JBS): >> >> With Solaris/SPARC removed the STACK_BIAS definition in src/hotspot/share/utilities/globalDefinitions.hpp is now >> always 0 and can be removed. >> >> >> Testing: >> >> Tier1 >> >> Cheers, >> Mikael >> From mikael.vidstedt at oracle.com Tue May 26 19:46:25 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 26 May 2020 12:46:25 -0700 Subject: RFR(XS): 8245521: Remove STACK_BIAS In-Reply-To: References: <114B5A9A-DEF8-43AA-814A-3D7E7BD2B001@oracle.com> Message-ID: David/Matthias/Vladimir, thanks for the reviews! Change pushed. Cheers, Mikael > On May 26, 2020, at 12:01 PM, Vladimir Kozlov wrote: > > On 5/21/20 9:11 PM, David Holmes wrote: >> Hi Mikael, >> Looks good. > > +1 > >> I assume the change to GraalHotSpotVMConfig.java is to allow it to work with older VMs? > > Yes. stackBias will be set to 0 if STACK_BIAS is not present. Otherwise it will be set to STACK_BIAS value. > > Thanks, > Vladimir > >> Thanks, >> David >> On 22/05/2020 1:36 pm, Mikael Vidstedt wrote: >>> >>> Please review this small change which removes the STACK_BIAS constant and its uses: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245521 >>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245521/webrev.00/open/webrev/ >>> >>> Background (from JBS): >>> >>> With Solaris/SPARC removed the STACK_BIAS definition in src/hotspot/share/utilities/globalDefinitions.hpp is now always 0 and can be removed. 
>>> >>> >>> Testing: >>> >>> Tier1 >>> >>> Cheers, >>> Mikael >>> From john.r.rose at oracle.com Tue May 26 21:00:22 2020 From: john.r.rose at oracle.com (John Rose) Date: Tue, 26 May 2020 14:00:22 -0700 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <87lfle29dn.fsf@redhat.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> Message-ID: On May 26, 2020, at 5:01 AM, Roland Westrelin wrote: > > > Here is an updated webrev: > > http://cr.openjdk.java.net/~roland/8244504/webrev.02/ > > I moved the min/max methods to MaxNode and filed > https://bugs.openjdk.java.net/browse/JDK-8245787 Perfect; thank you. I added a comment. > I noticed the previous webrev incorrectly included some of the changes > from 8223051 (support loops with long (64b) trip counts) and removed > them. I intend this webrev to only be refactoring in preparation for the > long loops support. I saw those 64b changes and just rolled with them. It won?t hurt to look at them multiple times? Yes, this refactoring alone is much easier to examine. Reviewed! ? John From john.r.rose at oracle.com Tue May 26 21:14:34 2020 From: john.r.rose at oracle.com (John Rose) Date: Tue, 26 May 2020 14:14:34 -0700 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <87y2q55rj4.fsf@redhat.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> <87y2q55rj4.fsf@redhat.com> Message-ID: <497B34CC-BA72-4674-8C5A-CF04DEF0CDC2@oracle.com> On May 6, 2020, at 2:41 AM, Roland Westrelin wrote: > >> So can we arrange to run it more than once, by setting the inner trip >> count to be smaller? 
I'm afraid the optimizer could detect a one-trip loop >> and take it apart (in a later pass), and then the goal of the stress test >> won't be achieved. > > Sure. Here is a new patch that should take all of your comments into > account: > > http://cr.openjdk.java.net/~roland/8223051/webrev.01/ > > I split the change in 2. The counted loop refactoring code is out for > review under 8244504 (I just sent it out for review). Your review of > that other one is very much welcome. So, I'm very happy with the refactoring 8244504 work. Maybe someone else will have comments on that, but assuming that is more or less unchanged, this change set for 8223051 on top still looks good, taking into account renaming of min/max factories. After you post an updated webrev.02 the review should be easy. Tobias might wish to run some regression tests on the final changes. -- John From xxinliu at amazon.com Tue May 26 22:57:34 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Tue, 26 May 2020 22:57:34 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: Hi, I made a new revision of JDK-8230552. May I ask the arm reviewers to take a look? http://cr.openjdk.java.net/~xliu/8230552/01/webrev/ I haven't made aarch64 stop() use the trap mechanism because it deserves a standalone JBS. Another thing is I don't understand what the benefit of using the signal handler for that is. I did manage to reduce the stop() code size per Ningsheng's request. It avoids generating pusha if ShowMessageBoxOnError is off (default). It still acts as expected whether or not ShowMessageBoxOnError is set. Since we have now got rid of the HaltNode right after the Uncommon_trap callnode in release builds, I don't think code bloat is an issue anymore.
Thanks, --lx On 5/6/20, 1:19 AM, "aph at redhat.com" wrote: On 5/6/20 8:25 AM, Liu, Xin wrote: > Currently in AArch64 MacroAssembler::stop(), it will generate many > register saving instructions by pusha() before calling to debug64(). But > I think debug64() only uses the regs[] and pc arguments when > ShowMessageBoxOnError is on. Maybe we should at least only do the saving > and pc arg passing when ShowMessageBoxOnError is on in > MacroAssembler::stop(), as what x86 does in macroAssembler_x86.cpp? Maybe we should think about a better way to do it. All we have to do, after all, is put the reason into, say, r8, and execute a trap. We don't need to push and pop anything because the trap handler will do that. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From bobpengxie at tencent.com Wed May 27 02:41:49 2020 From: bobpengxie at tencent.com (=?utf-8?B?Ym9icGVuZ3hpZSjpoonpuY8p?=) Date: Wed, 27 May 2020 02:41:49 +0000 Subject: AOT fails to compile jdk.base Message-ID: Hi Tobias, Error: Failed compilation: jdk.internal.reflect.UnsafeQualifiedStaticObjectFieldAccessorImpl.get(Ljava/lang/Object;)Ljava/lang/Object;: org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happening directly at lir instruction: B3 at 46 org.graalvm.compiler.hotspot.amd64.AMD64HotSpotMove$HotSpotLoadMetaspaceConstantOp rsi|QWORD = HOTSPOTLOADMETASPACECONSTANT input: meta{HotSpotType} [B0, null, B14, B1, B3, B4, B6, null, null, B2, B8, B10, B12, null, B11] Error: Failed compilation: java.lang.invoke.VarHandleReferences$FieldInstanceReadOnly.getAcquire(Ljava/lang/invoke/VarHandleReferences$FieldInstanceReadOnly;Ljava/lang/Object;)Ljava/lang/Object;: 
org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happening directly at lir instruction: B112 at 914 org.graalvm.compiler.hotspot.amd64.AMD64HotSpotMove$HotSpotLoadMetaspaceConstantOp rax|QWORD = HOTSPOTLOADMETASPACECONSTANT input: meta{HotSpotType}[B0, null, B3, B5, B7, B9, B18, B19, B21, null, B24, null, B27, B111, null, B122, B28, null, B31, B35, B36, null, B39, B40, null, B51, B53, B54, B55, B61, B63, B65, null, B72, B74, B75, null, B90, B110, B8, null, B52, B112, null, B10, B12, B14, B16, B13, null, B113, null, B11, B4, B17, B15, null, B56, B58, B60, null, B67, null, B115, B117, B119, B6, B45, B46, B48, B32, B34, B42, B43, B78, B79, B80, B57, B59, B69, B71, null, B118, B1, B25, B37, B29, null, B50, null, B76, B81, B83, B84, null, B70, null, null, B89, null, null, null, B91, B93, B94, null, B109, null, B97, B98, B99, B95, B100, B102, B103, null, null, B108, null, null, null, B22, B33] Error: Failed compilation: java.lang.invoke.VarHandleReferences$FieldStaticReadOnly.get(Ljava/lang/invoke/VarHandleReferences$FieldStaticReadOnly;)Ljava/lang/Object;: org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happening directly at lir instruction: B8 at 60 org.graalvm.compiler.hotspot.amd64.AMD64HotSpotMove$HotSpotLoadMetaspaceConstantOp rsi|QWORD = HOTSPOTLOADMETASPACECONSTANT input: meta{HotSpotType}[B0, null, B3, B5, B18, B6, B8, B9, B11, B12, B10, B7, B13, B15, B17, B14, B16, B1, B4] Error: Failed compilation: java.lang.invoke.VarHandleReferences$FieldStaticReadOnly.getOpaque(Ljava/lang/invoke/VarHandleReferences$FieldStaticReadOnly;)Ljava/lang/Object;: org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happening directly Best regards, Peng Xie On 2020/5/26 10:01, Tobias Hartmann wrote: Hi Peng Xie, On 26.05.20 09:54, bobpengxie(颉鹏) wrote: > Errors: > Please see the attachment. Attachments are stripped. 
Could you please upload the file somewhere and share the link? Best regards, Tobias From Yang.Zhang at arm.com Wed May 27 02:59:11 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Wed, 27 May 2020 02:59:11 +0000 Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes In-Reply-To: <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com> References: <275eb57c-51c0-675e-c32a-91b198023559@redhat.com> <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com> <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com> <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com> <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com> <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com> Message-ID: > But to my earlier question. please: can the new instructions be moved into jdk head first, and then merged into the Panama branch, or not? The new instructions can be classified as: 1. Instructions that can be matched with NEON instructions directly. MulVB and SqrtVF have been merged into jdk master already. The patch of AbsV is in review [1]. 2. Instructions that Jdk master has middle end support for, but they cannot be matched with NEON instructions directly. Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV These new instructions can be moved into jdk master first, but for auto-vectorization, the performance might not get improved. May I have a new patch for these? 3. Panama/Vector API specific instructions Such as Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend etc. These instructions cannot be moved into jdk master first because there isn't middle-end support. 
Regards Yang [1] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-May/008861.html -----Original Message----- From: Andrew Haley Sent: Tuesday, May 26, 2020 4:25 PM To: Yang Zhang ; Paul Sandoz Cc: hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; nd Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes On 25/05/2020 09:26, Yang Zhang wrote: > In jdk master, what we need to do is that writing m4 file for existing > vector instructions and placed them to a new file aarch64_neon.ad. > If no question, I will do it right away. I'm not entirely sure that such a change is necessary now. In particular, reorganizing the existing vector instructions is IMO excessive, but I admit that it might be an improvement. But to my earlier question. please: can the new instructions be moved into jdk head first, and then merged into the Panama branch, or not? It'd help if this was possible. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ningsheng.jian at arm.com Wed May 27 07:21:01 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Wed, 27 May 2020 15:21:01 +0800 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com> Message-ID: I see CSR review and submit tests are clear, so I pushed. Thanks, Ningsheng On 5/22/20 10:36 AM, Xiaohong Gong wrote: > Hi Andrew, > >> On 5/21/20 11:24 AM, Xiaohong Gong wrote: > > > I'v created a new patch to add the condition when inserting > > "DMBs" > > > before volatile load for C1/Interpreter. 
> > > The updated webrev: > > > http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ > > > > > > It adds a new function "is_c1_or_interpreter_only()" , which can > > > decide whether C2/JVMCI is used. Besides, since AOT also uses > > Graal > > > compiler as the codegen, it always return false if AOT mode is > > enabled. > > > > Looks good to me, thanks. > > > > As far as I remember, Graal does optimize volatile accesses to use > > ldar/stlr, or at least it will do so in the future, so if we're > > using AOT or JVMCI the safe thing to do is add the DMBs. > > Yes, exactly! It has a patch in Graal github to do this optimization (https://github.com/oracle/graal/pull/1772). > > Thanks, > Xiaohong > > -----Original Message----- > From: Andrew Haley > Sent: Thursday, May 21, 2020 10:06 PM > To: Xiaohong Gong ; Andrew Dinn ; Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option > > On 5/21/20 11:24 AM, Xiaohong Gong wrote: >> I'v created a new patch to add the condition when inserting "DMBs" >> before volatile load for C1/Interpreter. >> The updated webrev: >> http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ >> >> It adds a new function "is_c1_or_interpreter_only()" , which can >> decide whether C2/JVMCI is used. Besides, since AOT also uses Graal >> compiler as the codegen, it always return false if AOT mode is enabled. > > Looks good to me, thanks. > > As far as I remember, Graal does optimize volatile accesses to use ldar/stlr, or at least it will do so in the future, so if we're using AOT or JVMCI the safe thing to do is add the DMBs. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. 
https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From ningsheng.jian at arm.com Wed May 27 07:23:13 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Wed, 27 May 2020 15:23:13 +0800 Subject: RFR(L): 8231441: AArch64: Initial SVE backend support In-Reply-To: References: Message-ID: <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com> Hi, I have rebased this patch with some more comments added. And also relaxed the instruction matching conditions for 128-bit vector. I would appreciate if someone could help to review this. Whole patch: http://cr.openjdk.java.net/~njian/8231441/webrev.01 Different parts of changes: 1) SVE feature detection http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature 2) c2 registion allocation http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra 3) SVE c2 backend http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2 (Or should I split this into different JBS?) Thanks, Ningsheng On 3/25/20 2:37 PM, Ningsheng Jian wrote: > Hi, > > Could you please help to review this patch adding AArch64 SVE support? > It also touches c2 compiler shared code. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8231441 > Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00 > > Arm has released new vector ISA extension for AArch64, SVE [1] and > SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In this > patch we have: > > 1) SVE feature enablement and detection > 2) SVE vector register allocation support with initial predicate > register definition > 3) SVE c2 backend for current SLP based vectorizer. (We also have a POC > patch of a new vectorizer using SVE predicate-driven loop control, but > that's still under development.) > > SVE register definition > ======================= > Unlike other SIMD architectures, SVE allows hardware implementations to > choose a vector register length from 128 and 2048 bits, multiple of 128 > bits. So we introduce a new vector type VectorA, i.e. 
length agnostic > (scalable) vector type, and Op_VecA for machine vectora register. In the > meantime, to minimize register allocation code changes, we also take > advantage of one JIT compiler aspect, that is during the compile time we > actually know the real hardware SVE vector register size of current > running machine. So, the register allocator actually knows how many > register slots an Op_VecA ideal reg requires, and could work fine > without much modification. > > Since the bottom 128 bits are shared with the NEON, we extend current > register mask definition of V0-V31 registers. Currently, c2 uses one bit > mask for a 32-bit register slot, so to define at most 2048 bits we will > need to add 64 slots in AD file. That's a really large number, and will > also break current regmask assumption. Considering the SVE vector > register is architecturally scalable for different sizes, we just define > double of original NEON vector register slots, i.e. 8 slots: Vx, Vx_H, > Vx_J ... Vx_O. After adlc, the generated register masks now looks like: > > const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff, > 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ... > > const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303, > 0x3030303, 0x3030303, 0x3030303, 0x3030303, ... > > const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f, > 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ... > > And we use SlotsPerVecA to indicate regmask bit size for a VecA register. > > Although for physical register allocation, register allocator does not > need to know the real VecA register size, while doing spill/unspill, > current register allocation needs to know actual stack slot size to > store/load VecA registers. SVE is able to do vector size agnostic > spilling, but to minimize the code changes, as I mentioned before, we > just let RA know the actual vector register size in current running > machine, by calling scalable_vector_reg_size(). 
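The slot arithmetic in the quoted description above can be checked with a small sketch. It assumes, as the text states, one regmask bit per 32-bit slot and 8 nominal slots per SVE register; all names here (kBitsPerSlot, kSlotsPerVecA, slots_for_bits, mask_word, is_valid_sve_length) are hypothetical and are not HotSpot's actual definitions. The per-register bit patterns it produces are consistent with the quoted masks: 2 slots for a 64-bit VecD (0x03 per register), 4 for a 128-bit VecX (0x0f), and all 8 for VecA (0xff).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration; not HotSpot's real definitions.
constexpr int kBitsPerSlot = 32;   // one regmask bit covers one 32-bit slot
constexpr int kSlotsPerVecA = 8;   // nominal slots per SVE register (Vx, Vx_H, ... Vx_O)

// True if `bits` is a legal SVE vector length: a multiple of 128 in [128, 2048].
inline bool is_valid_sve_length(int bits) {
  return bits >= 128 && bits <= 2048 && bits % 128 == 0;
}

// Number of 32-bit slots needed to hold `bits` bits, e.g. for spill sizing
// once the real hardware vector length is known.
inline int slots_for_bits(int bits) {
  assert(bits > 0 && bits % kBitsPerSlot == 0);
  return bits / kBitsPerSlot;
}

// One 32-bit regmask word covers 32 / 8 = 4 vector registers. Set the low
// `used_slots` bits inside each register's 8-bit group.
inline std::uint32_t mask_word(int used_slots) {
  assert(used_slots >= 1 && used_slots <= kSlotsPerVecA);
  const std::uint32_t per_reg = (1u << used_slots) - 1u;
  std::uint32_t word = 0;
  for (int reg = 0; reg < 32 / kSlotsPerVecA; ++reg) {
    word |= per_reg << (reg * kSlotsPerVecA);
  }
  return word;
}
```

A full 2048-bit register would need slots_for_bits(2048) slots in the mask, which is the number the description avoids by fixing the nominal count at 8 and asking for the real size only when spilling.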
> > In the meantime, since some vector operations do not have unpredicated > SVE1 instructions, but only predicate version, e.g. vector multiply, > vector load/store. We have also defined predicate registers in this > patch, and c2 register allocator will allocate a temp predicate register > to fulfill the expecting unpredicated operations. And this can also be > used for future predicate-driven vectorizer. This is not efficient for > now, as we can see many ptrue instructions in the generated code. One > possible solution I can see, is to block one predicate register, and > preset it to all true. But to preserve/reinitialize a caller save > register value cross calls seems risky to work in this patch. I decide > to defer it to further optimization work. If anyone has any suggestions > on this, I would appreciate. > > SVE feature detection > ===================== > Since we may have some compiled code based on the initial detected SVE > vector register length and the compiled code is compiled only for that > vector register length, we assume that the SVE vector register length > will not be changed during the JVM lifetime. However, SVE vector length > is per-thread and can be changed by system call [3], so we need to make > sure that each jni call will not change the sve vector length. > > Currently, we verify the SVE vector register length on each JNI return, > and if an SVE vector length change is detected, jvm simply reports error > and stops running. The VM running vector length can also be set by > existing VM option MaxVectorSize with c2 enabled. If MaxVectorSize is > specified not the same as system default sve vector length (in > /proc/sys/abi/sve_default_vector_length), JVM will set current process > sve vector length to the specified vector length. > > Compiled code > ============= > We have added all current c2 backend codegen on par with NEON, but only > for vector length larger than 128-bit. 
> > On a 1024 bit SVE environment, for the following simple loop with int > array element type: > > for (int i = 0; i < LENGTH; i++) { > c[i] = a[i] + b[i]; > } > > c2 generated loop: > > 0x0000ffff811c0820: sbfiz x11, x10, #2, #32 > 0x0000ffff811c0824: add x13, x18, x11 > 0x0000ffff811c0828: add x14, x1, x11 > 0x0000ffff811c082c: add x13, x13, #0x10 > 0x0000ffff811c0830: add x14, x14, #0x10 > 0x0000ffff811c0834: add x11, x0, x11 > 0x0000ffff811c0838: add x11, x11, #0x10 > 0x0000ffff811c083c: ptrue p1.s // To be optimized > 0x0000ffff811c0840: ld1w {z16.s}, p1/z, [x14] > 0x0000ffff811c0844: ptrue p0.s > 0x0000ffff811c0848: ld1w {z17.s}, p0/z, [x13] > 0x0000ffff811c084c: add z16.s, z17.s, z16.s > 0x0000ffff811c0850: ptrue p1.s > 0x0000ffff811c0854: st1w {z16.s}, p1, [x11] > 0x0000ffff811c0858: add w10, w10, #0x20 > 0x0000ffff811c085c: cmp w10, w12 > 0x0000ffff811c0860: b.lt 0x0000ffff811c0820 > > Test > ==== > Currently, we don't have real hardware to verify SVE features (and > performance). But we have run jtreg tests with SVE in some emulators. On > QEMU system emulator, which has SVE emulation support, jtreg tier1-3 > passed with different vector sizes. We've also verified it with full > jtreg tests without SVE on both x86 and AArch64, to make sure that > there's no regression. > > The patch has also been applied to Vector API code base, and verified on > emulator. In Vector API, there are more vector related tests and is more > possible to generate vector instructions by intrinsification. > > A simple test can also run in QEMU user emulation, e.g. > > $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD > > ( > To run it in user emulation mode, we will need to bypass SVE feature > detection code in this patch. E.g. 
apply: > http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch > )l > > Others > ====== > Since this patch is a bit large, I've also split it into 3 parts, for > easy review: > > 1) SVE feature detection > http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature > > 2) c2 registion allocation > http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra > > 3) SVE c2 backend > http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2 > > Part of this patch has been contributed by Joshua Zhu and Yang Zhang. > > Refs > ==== > [1] https://developer.arm.com/docs/ddi0584/latest > [2] https://developer.arm.com/docs/ddi0602/latest > [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt > > Thanks, > Ningsheng > From Xiaohong.Gong at arm.com Wed May 27 07:23:10 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Wed, 27 May 2020 07:23:10 +0000 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com> Message-ID: Hi Ningsheng, Thanks for the pushing! Best Regards, Xiaohong Gong -----Original Message----- From: Ningsheng Jian Sent: Wednesday, May 27, 2020 3:21 PM To: Xiaohong Gong ; Andrew Haley ; Andrew Dinn ; Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option I see CSR review and submit tests are clear, so I pushed. Thanks, Ningsheng On 5/22/20 10:36 AM, Xiaohong Gong wrote: > Hi Andrew, > >> On 5/21/20 11:24 AM, Xiaohong Gong wrote: > > > I'v created a new patch to add the condition when inserting > > "DMBs" > > > before volatile load for C1/Interpreter. 
> > > The updated webrev: > > > http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ > > > > > > It adds a new function "is_c1_or_interpreter_only()" , which can > > > decide whether C2/JVMCI is used. Besides, since AOT also uses > > Graal > > > compiler as the codegen, it always return false if AOT mode is > > enabled. > > > > Looks good to me, thanks. > > > > As far as I remember, Graal does optimize volatile accesses to use > > ldar/stlr, or at least it will do so in the future, so if we're > > using AOT or JVMCI the safe thing to do is add the DMBs. > > Yes, exactly! It has a patch in Graal github to do this optimization (https://github.com/oracle/graal/pull/1772). > > Thanks, > Xiaohong > > -----Original Message----- > From: Andrew Haley > Sent: Thursday, May 21, 2020 10:06 PM > To: Xiaohong Gong ; Andrew Dinn > ; Derek White ; > aarch64-port-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete > UseBarriersForVolatile option > > On 5/21/20 11:24 AM, Xiaohong Gong wrote: >> I'v created a new patch to add the condition when inserting "DMBs" >> before volatile load for C1/Interpreter. >> The updated webrev: >> http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ >> >> It adds a new function "is_c1_or_interpreter_only()" , which can >> decide whether C2/JVMCI is used. Besides, since AOT also uses Graal >> compiler as the codegen, it always return false if AOT mode is enabled. > > Looks good to me, thanks. > > As far as I remember, Graal does optimize volatile accesses to use ldar/stlr, or at least it will do so in the future, so if we're using AOT or JVMCI the safe thing to do is add the DMBs. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. 
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From Xiaohong.Gong at arm.com Wed May 27 08:02:21 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Wed, 27 May 2020 08:02:21 +0000 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled In-Reply-To: References: Message-ID: Hi David, > On 26/05/2020 4:57 pm, Xiaohong Gong wrote: > > Hi, > > > > Could you please help to review this simple patch? It fixes the > issue > > that JVM crashes in debug mode when the vm option > "-XX:EnableJVMCIProduct" is enabled repetitively. > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 > > Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ > > > > Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in > the > > command line makes the assertion fail in debug mode: > "assert(is_experimental(), sanity)". > > > > It happens when the VM iteratively parses the options from > command > > line. When the matched option is "-XX:+EnableJVMCIProduct", the > > original experimental JVMCI flags will be converted to product > mode, > > with the above assertion before it. So if all the JVMCI flags > have > > been converted to the product mode at the first time parsing > "-XX:+EnableJVMCIProduct", the assertion will fail at the second > time it is parsed. > > > > A simple fix is to just ignoring the conversion if this option > has been parsed. > > Seems a reasonable approach given the already complex handling of > this flag. > > > Testing: > > Tested jtreg > > hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 > > and jcstress:tests-custom, and all tests pass without new > failure. > > I think adding a regression test in > ./compiler/jvmci/TestEnableJVMCIProduct.java would be appropriate. Thanks for your review and it?s a good idea to add the regression test. I have added the test in the new patch: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.01/ . 
Could you please take a look at it? Thank you! Thanks, Xiaohong Gong From david.holmes at oracle.com Wed May 27 08:57:55 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 27 May 2020 18:57:55 +1000 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled In-Reply-To: References: Message-ID: On 27/05/2020 6:02 pm, Xiaohong Gong wrote: > Hi David, > > > On 26/05/2020 4:57 pm, Xiaohong Gong wrote: > > > Hi, > > > > > > Could you please help to review this simple patch? It fixes the > > issue > > > that JVM crashes in debug mode when the vm option > > "-XX:EnableJVMCIProduct" is enabled repetitively. > > > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 > > > Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ > > > > > > Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in > > the > > > command line makes the assertion fail in debug mode: > > "assert(is_experimental(), sanity)". > > > > > > It happens when the VM iteratively parses the options from > > command > > > line. When the matched option is "-XX:+EnableJVMCIProduct", the > > > original experimental JVMCI flags will be converted to product > > mode, > > > with the above assertion before it. So if all the JVMCI flags > > have > > > been converted to the product mode at the first time parsing > > "-XX:+EnableJVMCIProduct", the assertion will fail at the second > > time it is parsed. > > > > > > A simple fix is to just ignoring the conversion if this option > > has been parsed. > > > > Seems a reasonable approach given the already complex handling of > > this flag. > > > > > Testing: > > > Tested jtreg > > > hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 > > > and jcstress:tests-custom, and all tests pass without new > > failure. > > > > I think adding a regression test in > > ./compiler/jvmci/TestEnableJVMCIProduct.java would be appropriate. > > Thanks for your review and it's a good idea to add the regression test. 
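The fix discussed in the quoted thread above (skip the experimental-to-product conversion when the option has already been parsed) amounts to making the option handler idempotent. A minimal sketch follows, assuming a simplified flag model: Flag, Options and parse_enable_jvmci_product are hypothetical stand-ins, not the code in the webrev or HotSpot's flag classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a VM flag; not HotSpot's flag class.
struct Flag {
  std::string name;
  bool experimental;

  // Convert an experimental flag to a product flag.
  void make_product() {
    assert(experimental && "sanity");  // mirrors assert(is_experimental(), sanity)
    experimental = false;
  }
};

// Hypothetical option state for illustration.
struct Options {
  std::vector<Flag> jvmci_flags;
  bool enable_jvmci_product = false;

  // Handle one occurrence of -XX:+EnableJVMCIProduct on the command line.
  void parse_enable_jvmci_product() {
    if (enable_jvmci_product) {
      return;  // already converted by an earlier occurrence; nothing to do
    }
    for (Flag& f : jvmci_flags) {
      f.make_product();
    }
    enable_jvmci_product = true;
  }
};
```

Without the early return, the second occurrence of the option would reach make_product() on flags that are no longer experimental and trip the assertion, which is the debug-mode crash described in the report.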
> I have added the test in the new patch:
> http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.01/ .
>
> Could you please take a look at it? Thank you!

Looks okay to me but compiler folk need to give the final ok.

Thanks,
David

> Thanks,
> Xiaohong Gong
>

From Xiaohong.Gong at arm.com Wed May 27 09:18:37 2020
From: Xiaohong.Gong at arm.com (Xiaohong Gong)
Date: Wed, 27 May 2020 09:18:37 +0000
Subject: Question about the expected behavior if JVMCI compiler is used on the jvm variant with C2 disabled
Message-ID:

Hi,

Recently we found that the JVM can crash in debug mode when the JVMCI compiler is used on a JVM variant with C2 disabled (configured with "--with-jvm-features=-compiler2"). The JVM crashes with a failed assertion:

Internal Error (jdk/src/hotspot/share/compiler/compileBroker.cpp:891), pid=10824, tid=10825
# assert(_c2_count > 0 || _c1_count > 0) failed: No compilers?

Evidently the JVM cannot find a compiler, since both "_c2_count" and "_c1_count" are zero due to some internal issues. Since "TieredCompilation" is turned off when C2 is disabled, the compile mode should be "interpreter+C1" by default, and that works as expected. However, I'm unsure about the expected behavior when the JVMCI compiler is selected.

On the one hand, I thought it should use "interpreter+JVMCI" as the compile mode. If so, we have to fix the issues. On the other hand, I noticed that there is a VM warning when using the JVMCI compiler and disabling tiered compilation with a normal configuration: "Disabling tiered compilation with non-native JVMCI compiler is not recommended". So considering that "TieredCompilation" is also turned off when C2 is disabled, I thought it would be better to simply disallow the JVMCI compiler in that case.

So my question is: which should be the expected behavior, choosing "interpreter+JVMCI" as the compile mode, or making it invalid to use the JVMCI compiler when C2 is disabled?

I would very much appreciate any opinions!
Thanks,
Xiaohong

From aph at redhat.com Wed May 27 10:21:44 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 27 May 2020 11:21:44 +0100
Subject: AOT fails to compile jdk.base
In-Reply-To:
References:
Message-ID: <24d80805-d0c3-ba78-39b6-739c1ffc1644@redhat.com>

On 27/05/2020 03:41, bobpengxie(??) wrote:
> Error: Failed compilation: jdk.internal.reflect.UnsafeQualifiedStaticObjectFieldAccessorImpl.get(Ljava/lang/Object;)Ljava/lang/Object;: org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happening directly

It's always been the case that jaotc fails in a few cases to do with metaspace constant loads. Unless it fails to produce a working library there's nothing to worry about.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From tobias.hartmann at oracle.com Wed May 27 11:04:42 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 27 May 2020 13:04:42 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To:
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com>
Message-ID: <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com>

Hi Roland,

>> http://cr.openjdk.java.net/~roland/8244504/webrev.02/

Overall looks good to me. Some comments:

addnode.hpp
- Now that you've introduced these helper methods, shouldn't you also replace other usages (not only loop opts)? For example, in ModINode::Ideal

loopPredicate.cpp:
- "kind predicates" -> "kinds of predicates"
- change in line 136 was already fixed as part of 8245714, right?
loopnode.cpp: - please add brackets to lines 487/490/494/497 Best regards, Tobias From nils.eliasson at oracle.com Wed May 27 11:58:45 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 27 May 2020 13:58:45 +0200 Subject: RFR(S): 8235673: [C1, C2] Split inlining control flags In-Reply-To: <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com> References: <1c3dccb8-12b7-0073-83ca-04f910b8d79d@oracle.com> <19d53124-94d1-50f9-f4e6-948640e7c848@oracle.com> <702038f7-7942-9c94-c507-bd36241db180@oracle.com> <2ff562fc-cdbb-1f47-17a0-2f5c9aae487b@oracle.com> Message-ID: <9ee61551-0edb-124d-ec13-a83181bba03e@oracle.com> +1 Best regards, Nils On 2020-05-15 13:54, Tobias Hartmann wrote: > Hi Martin, > > yes, looks good to me. > > Best regards, > Tobias > > On 15.05.20 13:41, Doerr, Martin wrote: >> Hi Vladimir, Nils and Tobias, >> >> Can I consider http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ reviewed by you? >> Submission repo testing was successful. >> >> Thanks and best regards, >> Martin >> >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Donnerstag, 14. Mai 2020 22:29 >>> To: Doerr, Martin ; hotspot-compiler- >>> dev at openjdk.java.net >>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>> >>> On 5/14/20 12:14 PM, Doerr, Martin wrote: >>>> Hi Vladimir, >>>> >>>>> But we can use it in Test5091921.java. C1 compiles the test code with >>>>> specified value before - lets keep it. >>>> Ok. That makes sense for this test. Updated webrev in place. >>> Good. >>> >>>>> And this is not related to these changes but to have range(0, max_jint) for >>> all >>>>> these flags is questionable. I think >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>>> timeout (which is understandable) but in worst >>>>> case they may crash instead of graceful exit. >>>> I was wondering about that, too. But I haven't changed that. The previously >>> global flags already had this range. 
>>>> I had also thought about guessing more reasonable values, but reasonable >>> limits may depend on platform and future changes. >>>> I don't think we can define ranges such that everything works great while >>> we stay inside and also such that nobody will ever want greater values. >>>> So I prefer keeping it this way unless somebody has a better proposal. >>> I did not mean to have that in these change. Current changes are fine for me. >>> >>> I was thinking aloud that it would be nice to investigate this later by >>> someone. At least for some flags. We may keep >>> current range as it is but may be add dynamic checks based on platform and >>> other conditions. This looks like starter >>> task for junior engineer or student intern. >>> >>> Thanks, >>> Vladimir >>> >>>> Thanks and best regards, >>>> Martin >>>> >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Mittwoch, 13. Mai 2020 23:34 >>>>> To: Doerr, Martin ; hotspot-compiler- >>>>> dev at openjdk.java.net >>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>> >>>>> On 5/13/20 1:10 PM, Doerr, Martin wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> thanks for reviewing it. >>>>>> >>>>>>>> Should I set it to proposed? >>>>>>> Yes. >>>>>> I've set it to "Finalized". Hope this was correct. >>>>>> >>>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >>> as >>>>>>> well. >>>>>>> >>>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>>> Looks like MaxInlineSize is only used by tests which test C2 specific >>> things. >>>>> So I think C1MaxInlineSize would be pointless. >>>>>> In addition to that, the C2 values are probably not appropriate for C1 in >>>>> some tests. >>>>>> Would you like to have C1MaxInlineSize configured in some tests? >>>>> You are right in cases when test switch off TieredCompilation and use only >>> C2 >>>>> (Test6792161.java) or tests intrinsics. >>>>> >>>>> But we can use it in Test5091921.java. 
C1 compiles the test code with >>>>> specified value before - lets keep it. >>>>> >>>>> And this is not related to these changes but to have range(0, max_jint) for >>> all >>>>> these flags is questionable. I think >>>>> nobody ran tests with 0 or max_jint values. Bunch of tests may simple >>>>> timeout (which is understandable) but in worst >>>>> case they may crash instead of graceful exit. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Vladimir Kozlov >>>>>>> Sent: Mittwoch, 13. Mai 2020 21:46 >>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>> dev at openjdk.java.net >>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>> >>>>>>> Hi Martin, >>>>>>> >>>>>>> On 5/11/20 6:32 AM, Doerr, Martin wrote: >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>> are you ok with the updated CSR >>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8244507)? >>>>>>>> Should I set it to proposed? >>>>>>> Yes. >>>>>>> >>>>>>>> Here's a new webrev with obsoletion + expiration for C2 flags in >>>>> ClientVM: >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.02/ >>>>>>>> I've added the new C1 flags to the tests which should test C1 compiler >>> as >>>>>>> well. >>>>>>> >>>>>>> Good. Why not do the same for C1MaxInlineSize? >>>>>>> >>>>>>>> And I've added -XX:+IgnoreUnrecognizedVMOptions to all tests which >>>>> set >>>>>>> C2 flags. I think this is the best solution because it still allows running >>> the >>>>> tests >>>>>>> with GraalVM compiler. >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>>> Best regards, >>>>>>>> Martin >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Doerr, Martin >>>>>>>>> Sent: Freitag, 8. 
Mai 2020 23:07 >>>>>>>>> To: Vladimir Kozlov ; hotspot- >>> compiler- >>>>>>>>> dev at openjdk.java.net >>>>>>>>> Subject: RE: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>> >>>>>>>>> Hi Vladimir, >>>>>>>>> >>>>>>>>>> You need update your CSR - add information about this and above >>>>> code >>>>>>>>> change. Example: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>>> I've updated the CSR with obsolete and expired flags as in the >>> example. >>>>>>>>>> I would suggest to fix tests anyway (there are only few) because >>> new >>>>>>>>>> warning output could be unexpected. >>>>>>>>> Ok. I'll prepare a webrev with fixed tests. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Martin >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Vladimir Kozlov >>>>>>>>>> Sent: Freitag, 8. Mai 2020 21:43 >>>>>>>>>> To: Doerr, Martin ; hotspot-compiler- >>>>>>>>>> dev at openjdk.java.net >>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>> >>>>>>>>>> Hi Martin >>>>>>>>>> >>>>>>>>>> On 5/8/20 5:56 AM, Doerr, Martin wrote: >>>>>>>>>>> Hi Vladimir, >>>>>>>>>>> >>>>>>>>>>> thanks a lot for looking at this, for finding the test issues and for >>>>>>>>> reviewing >>>>>>>>>> the CSR. >>>>>>>>>>> For me, C2 is a fundamental part of the JVM. I would usually never >>>>>>> build >>>>>>>>>> without it ?? >>>>>>>>>>> (Except if we want to use C1 + GraalVM compiler only.) >>>>>>>>>> Yes it is one of cases. >>>>>>>>>> >>>>>>>>>>> But your right, --with-jvm-variants=client configuration should still >>> be >>>>>>>>>> supported. >>>>>>>>>> >>>>>>>>>> Yes. 
>>>>>>>>>> >>>>>>>>>>> We can fix it by making the flags as obsolete if C2 is not included: >>>>>>>>>>> diff -r 5f5ed86d7883 src/hotspot/share/runtime/arguments.cpp >>>>>>>>>>> --- a/src/hotspot/share/runtime/arguments.cpp Fri May 08 >>> 11:14:28 >>>>>>>>> 2020 >>>>>>>>>> +0200 >>>>>>>>>>> +++ b/src/hotspot/share/runtime/arguments.cpp Fri May 08 >>>>> 14:41:14 >>>>>>>>>> 2020 +0200 >>>>>>>>>>> @@ -562,6 +562,16 @@ >>>>>>>>>>> { "dup option", JDK_Version::jdk(9), >>>>>>>>> JDK_Version::undefined(), >>>>>>>>>> JDK_Version::undefined() }, >>>>>>>>>>> #endif >>>>>>>>>>> >>>>>>>>>>> +#ifndef COMPILER2 >>>>>>>>>>> + // These flags were generally available, but are C2 only, now. >>>>>>>>>>> + { "MaxInlineLevel", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxRecursiveInlineLevel", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "InlineSmallCode", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxInlineSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "FreqInlineSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> + { "MaxTrivialSize", JDK_Version::undefined(), >>>>>>>>>> JDK_Version::jdk(15), JDK_Version::undefined() }, >>>>>>>>>>> +#endif >>>>>>>>>>> + >>>>>>>>>>> { NULL, JDK_Version(0), JDK_Version(0) } >>>>>>>>>>> }; >>>>>>>>>> Right. I think you should do full process for these product flags >>>>>>> deprecation >>>>>>>>>> with obsoleting in JDK 16 for VM builds >>>>>>>>>> which do not include C2. You need update your CSR - add >>> information >>>>>>>>> about >>>>>>>>>> this and above code change. 
Example: >>>>>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238840 >>>>>>>>>> >>>>>>>>>>> This makes the VM accept the flags with warning: >>>>>>>>>>> jdk/bin/java -XX:MaxInlineLevel=9 -version >>>>>>>>>>> OpenJDK 64-Bit Client VM warning: Ignoring option >>> MaxInlineLevel; >>>>>>>>>> support was removed in 15.0 >>>>>>>>>>> If we do it this way, the only test which I think should get fixed is >>>>>>>>>> ReservedStackTest. >>>>>>>>>>> I think it should be sufficient to add -XX:C1MaxInlineLevel=2 in >>> order >>>>> to >>>>>>>>>> preserve the inlining behavior. >>>>>>>>>>> (TestStringIntrinsics2: C1 doesn't have String intrinsics anymore. >>>>>>>>>> compiler/c2 tests: Also written to test C2 specific things.) >>>>>>>>>>> What do you think? >>>>>>>>>> I would suggest to fix tests anyway (there are only few) because >>> new >>>>>>>>>> warning output could be unexpected. >>>>>>>>>> And it will be future-proof when warning will be converted into >>> error >>>>>>>>>> (if/when C2 goes away). >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Vladimir >>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Martin >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Vladimir Kozlov >>>>>>>>>>>> Sent: Donnerstag, 7. Mai 2020 19:11 >>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>>> >>>>>>>>>>>> I would suggest to build VM without C2 and run tests. >>>>>>>>>>>> >>>>>>>>>>>> I grepped tests with these flags I found next tests where we >>> need >>>>> to >>>>>>> fix >>>>>>>>>>>> test's command (add >>>>>>>>>>>> -XX:+IgnoreUnrecognizedVMOptions) or add @requires >>>>>>>>>>>> vm.compiler2.enabled or duplicate test for C1 with >>> corresponding >>>>> C1 >>>>>>>>>>>> flags (by ussing additional @test block). 
>>>>>>>>>>>> >>>>>>>>>>>> runtime/ReservedStack/ReservedStackTest.java >>>>>>>>>>>> compiler/intrinsics/string/TestStringIntrinsics2.java >>>>>>>>>>>> compiler/c2/Test6792161.java >>>>>>>>>>>> compiler/c2/Test5091921.java >>>>>>>>>>>> >>>>>>>>>>>> And there is issue with compiler/compilercontrol tests which use >>>>>>>>>>>> InlineSmallCode and I am not sure how to handle: >>>>>>>>>>>> >>>>>>>>>>>> >>> http://hg.openjdk.java.net/jdk/jdk/file/55e9cb6b23ec/test/hotspot/jtreg/c >>>>>>>>>>>> ompiler/compilercontrol/share/scenario/Command.java#l36 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Vladimir >>>>>>>>>>>> >>>>>>>>>>>> On 5/4/20 9:04 AM, Doerr, Martin wrote: >>>>>>>>>>>>> Hi Nils, >>>>>>>>>>>>> >>>>>>>>>>>>> thank you for looking at this and sorry for the late reply. >>>>>>>>>>>>> >>>>>>>>>>>>> I've added MaxTrivialSize and also updated the issue >>> accordingly. >>>>>>>>> Makes >>>>>>>>>>>> sense. >>>>>>>>>>>>> Do you have more flags in mind? >>>>>>>>>>>>> >>>>>>>>>>>>> Moving the flags which are only used by C2 into c2_globals >>>>> definitely >>>>>>>>>> makes >>>>>>>>>>>> sense. >>>>>>>>>>>>> Done in webrev.01: >>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/ >>>>>>>>>>>>> Please take a look and let me know when my proposal is ready >>> for >>>>> a >>>>>>>>> CSR. >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Martin >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>> From: hotspot-compiler-dev >>>>>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Nils Eliasson >>>>>>>>>>>>>> Sent: Dienstag, 28. April 2020 18:29 >>>>>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>>>>>>>>>> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for addressing this! This has been an annoyance for a >>>>> long >>>>>>>>> time. >>>>>>>>>>>>>> Have you though about including other flags - like >>>>> MaxTrivialSize? 
>>>>>>>>>>>>>> MaxInlineSize is tested against it. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also - you should move the flags that are now c2-only to >>>>>>>>>> c2_globals.hpp. >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Nils Eliasson >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2020-04-27 15:06, Doerr, Martin wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> while tuning inlining parameters for C2 compiler with JDK- >>>>> 8234863 >>>>>>>>> we >>>>>>>>>>>> had >>>>>>>>>>>>>> discussed impact on C1. >>>>>>>>>>>>>>> I still think it's bad to share them between both compilers. >>> We >>>>>>> may >>>>>>>>>> want >>>>>>>>>>>> to >>>>>>>>>>>>>> do further C2 tuning without negative impact on C1 in the >>> future. >>>>>>>>>>>>>>> C1 has issues with substantial inlining because of the lack of >>>>>>>>>> uncommon >>>>>>>>>>>>>> traps. When C1 inlines a lot, stack frames may get large and >>> code >>>>>>>>> cache >>>>>>>>>>>> space >>>>>>>>>>>>>> may get wasted for cold or even never executed code. The >>>>>>> situation >>>>>>>>>> gets >>>>>>>>>>>>>> worse when many patching stubs get used for such code. >>>>>>>>>>>>>>> I had opened the following issue: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235673 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And my initial proposal is here: >>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Part of my proposal is to add an additional flag which I called >>>>>>>>>>>>>> C1InlineStackLimit to reduce stack utilization for C1 methods. >>>>>>>>>>>>>>> I have a simple example which shows wasted stack space >>> (java >>>>>>>>>> example >>>>>>>>>>>>>> TestStack at the end). >>>>>>>>>>>>>>> It simply counts stack frames until a stack overflow occurs. >>> With >>>>>>> the >>>>>>>>>>>> current >>>>>>>>>>>>>> implementation, only 1283 frames fit on the stack because the >>>>>>> never >>>>>>>>>>>>>> executed method bogus_test with local variables gets inlined. 
>>>>>>>>>>>>>>> Reduced C1InlineStackLimit avoids inlining of bogus_test and >>>>> we >>>>>>> get >>>>>>>>>>>> 2310 >>>>>>>>>>>>>> frames until stack overflow. (I only used C1 for this example. >>> Can >>>>>>> be >>>>>>>>>>>>>> reproduced as shown below.) >>>>>>>>>>>>>>> I didn't notice any performance regression even with the >>>>>>> aggressive >>>>>>>>>>>> setting >>>>>>>>>>>>>> of C1InlineStackLimit=5 with TieredCompilation. >>>>>>>>>>>>>>> I know that I'll need a CSR for this change, but I'd like to get >>>>>>>>> feedback >>>>>>>>>> in >>>>>>>>>>>>>> general and feedback about the flag names before creating a >>>>> CSR. >>>>>>>>>>>>>>> I'd also be glad about feedback regarding the performance >>>>>>> impact. >>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Command line: >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>>> XX:C1InlineStackLimit=20 - >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>>> XX:+PrintInlining >>>>>>>>> - >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>>> TestStack >>>>>>>>>>>>>>> CompileCommand: compileonly >>>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>>> bytes) >>>>>>>>>>>> recursive >>>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >>> inline >>>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>>> 1283 activations were on stack, sum = 0 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> jdk/bin/java -XX:TieredStopAtLevel=1 - >>>>> XX:C1InlineStackLimit=10 - >>>>>>>>>>>>>> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch - >>>>>>> XX:+PrintInlining >>>>>>>>> - >>>>>>> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow >>>>>>>>>>>>>> TestStack >>>>>>>>>>>>>>> CompileCommand: compileonly >>>>> TestStack.triggerStackOverflow >>>>>>>>>>>>>>> @ 8 TestStack::triggerStackOverflow (15 >>>>> bytes) >>>>>>>>>>>> recursive 
>>>>>>>>>>>>>> inlining too deep >>>>>>>>>>>>>>> @ 11 TestStack::bogus_test (33 bytes) >>> callee >>>>>>> uses >>>>>>>>>> too >>>>>>>>>>>>>> much stack >>>>>>>>>>>>>>> caught java.lang.StackOverflowError >>>>>>>>>>>>>>> 2310 activations were on stack, sum = 0 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> TestStack.java: >>>>>>>>>>>>>>> public class TestStack { >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> static long cnt = 0, >>>>>>>>>>>>>>> sum = 0; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void bogus_test() { >>>>>>>>>>>>>>> long c1 = 1, c2 = 2, c3 = 3, c4 = 4; >>>>>>>>>>>>>>> sum += c1 + c2 + c3 + c4; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void triggerStackOverflow() { >>>>>>>>>>>>>>> cnt++; >>>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>>> bogus_test(); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> public static void main(String args[]) { >>>>>>>>>>>>>>> try { >>>>>>>>>>>>>>> triggerStackOverflow(); >>>>>>>>>>>>>>> } catch (StackOverflowError e) { >>>>>>>>>>>>>>> System.out.println("caught " + e); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> System.out.println(cnt + " activations were on stack, >>> sum >>>>> = " >>>>>>> + >>>>>>>>>>>> sum); >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> From tobias.hartmann at oracle.com Wed May 27 12:03:40 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 May 2020 14:03:40 +0200 Subject: [15] RFR(S): 8245957: Remove unused LIR_OpBranch::type after SPARC port removal Message-ID: Hi, please review the following patch that removes LIR_OpBranch::type after the only remaining usage [1] was removed with the SPARC port removal (JDK-8244224). https://bugs.openjdk.java.net/browse/JDK-8245957 We have two options: 1) Keep the type asserts in LIR_OpBranch::branch: http://cr.openjdk.java.net/~thartmann/8245957/webrev.v1.00/ 2) Remove the asserts: http://cr.openjdk.java.net/~thartmann/8245957/webrev.v2.00/ I would prefer 2). 
Best regards,
Tobias

[1] https://hg.openjdk.java.net/jdk/jdk/file/ae7ed29a5f70/src/hotspot/cpu/sparc/c1_LIRAssembler_sparc.cpp#l597

From rwestrel at redhat.com Wed May 27 12:10:45 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 27 May 2020 14:10:45 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com>
Message-ID: <877dwx1su2.fsf@redhat.com>

Thanks for reviewing this.

> addnode.hpp
> - Now that you've introduced these helper methods, shouldn't you also replace other usages (not only
> loop opts)? For example, in ModINode::Ideal

That would make sense but:

int hack_res = (i >= 0) ? divisor : 1;

is not a min, right?

I don't see any others in the C2 code base.

> loopPredicate.cpp:
> - "kind predicates" -> "kinds of predicates"
> - change in line 136 was already fixed as part of 8245714, right?

Right.

Roland.
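
As an illustration of the point about ModINode::Ideal above, here is a minimal Java sketch (not HotSpot code) of why the quoted expression is a select keyed on the sign of i rather than a min of two values. The class and method names (SelectVsMin, hackRes, minOfCandidates) are hypothetical and only mirror the shape of the quoted line; the surrounding logic in ModINode::Ideal is not reproduced here.

```java
public class SelectVsMin {
    // Same shape as the quoted line: the condition tests the sign of i,
    // it does not compare the two candidate results with each other.
    static int hackRes(int i, int divisor) {
        return (i >= 0) ? divisor : 1;
    }

    // What a min helper would return for the same two candidate values.
    static int minOfCandidates(int divisor) {
        return Math.min(divisor, 1);
    }

    public static void main(String[] args) {
        int divisor = 8; // arbitrary value chosen for the illustration
        for (int i : new int[] {-3, 0, 5}) {
            System.out.println("i=" + i
                + " select=" + hackRes(i, divisor)
                + " min=" + minOfCandidates(divisor));
        }
    }
}
```

With a divisor of 8, the select yields 8 whenever i is non-negative, while the min of the two candidates is always 1, so the two forms only agree when i is negative. That matches the observation that a min helper would not be a drop-in replacement there.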
From tobias.hartmann at oracle.com Wed May 27 12:23:33 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 27 May 2020 14:23:33 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <877dwx1su2.fsf@redhat.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com> <877dwx1su2.fsf@redhat.com>
Message-ID: <06df7419-e37f-27ae-9612-47111ea3d637@oracle.com>

Hi Roland,

On 27.05.20 14:10, Roland Westrelin wrote:
> That would make sense but:
>
> int hack_res = (i >= 0) ? divisor : 1;
>
> is not a min, right?
>
> I don't see any other in the C2 code base.

Okay, you are right. Looks good to me then.

Best regards,
Tobias

From rwestrel at redhat.com Wed May 27 12:27:08 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 27 May 2020 14:27:08 +0200
Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop
In-Reply-To: <06df7419-e37f-27ae-9612-47111ea3d637@oracle.com>
References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com> <877dwx1su2.fsf@redhat.com> <06df7419-e37f-27ae-9612-47111ea3d637@oracle.com>
Message-ID: <871rn51s2r.fsf@redhat.com>

> Looks good to me then.

Thanks for the review. Do you think this needs testing other than the submit repo?

Roland.
From rwestrel at redhat.com Wed May 27 12:27:34 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 27 May 2020 14:27:34 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> Message-ID: <87y2pdzhop.fsf@redhat.com> > Yes, this refactoring alone is much easier to examine. > Reviewed! Thanks for the review. Roland. From rwestrel at redhat.com Wed May 27 12:29:02 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 27 May 2020 14:29:02 +0200 Subject: RFR(XS): 8245714: "Bad graph detected in build_loop_late" when loads are pinned on loop limit check uncommon branch In-Reply-To: References: <87tv041ira.fsf@redhat.com> Message-ID: <87v9khzhm9.fsf@redhat.com> > looks good and trivial to me. Thanks for the review. Roland. From tobias.hartmann at oracle.com Wed May 27 12:29:30 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 May 2020 14:29:30 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: <871rn51s2r.fsf@redhat.com> References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com> <877dwx1su2.fsf@redhat.com> <06df7419-e37f-27ae-9612-47111ea3d637@oracle.com> <871rn51s2r.fsf@redhat.com> Message-ID: On 27.05.20 14:27, Roland Westrelin wrote: > Thanks for the review. Do you think this need testing other than the > submit repo? 
I've already submitted tier1-3 and the results look good. Best regards, Tobias From rwestrel at redhat.com Wed May 27 12:32:17 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 27 May 2020 14:32:17 +0200 Subject: RFR(M): 8244504: C2: refactor counted loop code in preparation for long counted loop In-Reply-To: References: <871rnx76go.fsf@redhat.com> <5504FD2E-4D8E-4140-AD37-426C7CC2331E@oracle.com> <00CBC968-A306-4663-B3C8-828DF4FB2E98@oracle.com> <878si45f6p.fsf@redhat.com> <87zhab3n77.fsf@redhat.com> <8A0D03CB-7662-4EA9-A232-1FF07F2ACDCD@oracle.com> <87r1v81i8s.fsf@redhat.com> <87lfle29dn.fsf@redhat.com> <8bdd2091-08da-f8c7-88be-fff0ea7ad437@oracle.com> <877dwx1su2.fsf@redhat.com> <06df7419-e37f-27ae-9612-47111ea3d637@oracle.com> <871rn51s2r.fsf@redhat.com> Message-ID: <87sgflzhgu.fsf@redhat.com> > I've already submitted tier1-3 and the results look good. Excellent. Thanks. Roland. From aph at redhat.com Wed May 27 13:03:54 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 27 May 2020 14:03:54 +0100 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: On 15/05/2020 11:37, Doerr, Martin wrote: > Exactly, we get stop type + stop message + registers + instructions (unfortunately not disassembled for some reason) + nice stack trace. The "some reason" is that you're not calling VMError::report_and_die correctly. Do something like this: + VMError::report_and_die(INTERNAL_ERROR, msg, detail_msg, detail_args, thread, + pc, info, ucVoid, NULL, 0, 0); I'm working on an AArch64 version now. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Wed May 27 13:14:39 2020 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 27 May 2020 15:14:39 +0200 Subject: RFR: 8245452: Clean up compressed pointer logic in lcm.cpp Message-ID: <9e93fcd4-a6ba-0705-05b4-581fa9d39482@oracle.com> Hi, After my change enabling compressed class pointers when compressed oops is disabled, Vladimir Kozlov pointed out that there is potential for simplifying some code in lcm.cpp that uses various checks if there is any form of compressed class/oop pointers with shift 0, as a way of using either the base->get_ptr_type() or base->bottom_type()->is_ptr() of a base pointer. These tests have always had false positives, where the base->get_ptr_type() is used when there is no way it could be a compressed pointer with shift 0. This dance is not really necessary if we just use the base->get_ptr_type() always, instead of carefully figuring out when we can use the bottom type. Because it works in both cases. Bug: https://bugs.openjdk.java.net/browse/JDK-8245452 Webrev: http://cr.openjdk.java.net/~eosterlund/8245452/webrev.00/ Thanks, /Erik From martin.doerr at sap.com Wed May 27 13:42:16 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 27 May 2020 13:42:16 +0000 Subject: [15] RFR(S): 8245957: Remove unused LIR_OpBranch::type after SPARC port removal In-Reply-To: References: Message-ID: Hi Tobias, > I would prefer 2). +1 Thanks for cleaning this up. Looks good to me. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Tobias Hartmann > Sent: Mittwoch, 27. 
Mai 2020 14:04 > To: hotspot compiler > Subject: [15] RFR(S): 8245957: Remove unused LIR_OpBranch::type after > SPARC port removal > > Hi, > > please review the following patch that removes LIR_OpBranch::type after > the only remaining usage [1] > was removed with the SPARC port removal (JDK-8244224). > > https://bugs.openjdk.java.net/browse/JDK-8245957 > > We have two options: > 1) Keep the type asserts in LIR_OpBranch::branch: > http://cr.openjdk.java.net/~thartmann/8245957/webrev.v1.00/ > 2) Remove the asserts: > http://cr.openjdk.java.net/~thartmann/8245957/webrev.v2.00/ > > I would prefer 2). > > Best regards, > Tobias > > [1] > https://hg.openjdk.java.net/jdk/jdk/file/ae7ed29a5f70/src/hotspot/cpu/spa > rc/c1_LIRAssembler_sparc.cpp#l597 From martin.doerr at sap.com Wed May 27 13:45:57 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 27 May 2020 13:45:57 +0000 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: Hi Andrew, I still see "Instructions" section in hex. But I can live with that. PPC change is already pushed: 8244949: [PPC64] Reengineer assembler stop function Thanks for taking care of the AArch64 version. Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Mittwoch, 27. 
Mai 2020 15:04 > To: Doerr, Martin ; Derek White > ; Ningsheng Jian ; Liu, > Xin ; hotspot-compiler-dev at openjdk.java.net > Cc: aarch64-port-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information > when hitting a HaltNode for architectures other than x86 > > On 15/05/2020 11:37, Doerr, Martin wrote: > > Exactly, we get stop type + stop message + registers + instructions > (unfortunately not disassembled for some reason) + nice stack trace. > > The "some reason" is that you're not calling VMError::report_and_die > correctly. > > Do something like this: > > + VMError::report_and_die(INTERNAL_ERROR, msg, detail_msg, > detail_args, thread, > + pc, info, ucVoid, NULL, 0, 0); > > I'm working on an AArch64 version now. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Wed May 27 14:12:07 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 27 May 2020 14:12:07 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: Hi Xin, I think Andrew's trap based version will be much better. But I'll leave the AArch64 part up to other people. > Another thing is I don't understand what's the benefit to > use the signal handler for that. Shorter code in the code cache (signal handler takes care of saving registers), better hs_err file (includes registers and instructions at the point at which the signal occurred, better stack trace: no nasty C-frames). Best regards, Martin > -----Original Message----- > From: Liu, Xin > Sent: Mittwoch, 27. 
Mai 2020 00:58 > To: aph at redhat.com; Ningsheng Jian ; aarch64- > port-dev at openjdk.java.net > Cc: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(XS): Provide information when hitting a HaltNode for > architectures other than x86 > > Hi, > > I make a new revision of JDK-8230552. May I ask the arm reviewers to take a > look? > http://cr.openjdk.java.net/~xliu/8230552/01/webrev/ > > I haven't made aarch64 stop() uses the trap mechanism because it deserves > a standalone JBS. Another thing is I don't understand what's the benefit to > use the signal handler for that. > > I do manage to reduce stop() code size per Ningsheng's request. It avoids > from generating pusha if ShowMessageBoxOnError is off (default). It still acts > as expect no matter ShowMessageBoxOnError is set or not. Since we now > have got rid of the HaltNode right after Uncommon_trap callnode in release > build, I don't think code bloat is an issue anymore. > > Thanks, > --lx > > > > On 5/6/20, 1:19 AM, "aph at redhat.com" wrote: > > CAUTION: This email originated from outside of the organization. Do not > click links or open attachments unless you can confirm the sender and know > the content is safe. > > > > On 5/6/20 8:25 AM, Liu, Xin wrote: > > Currently in AArch64 MacroAssembler::stop(), it will generate many > > register saving instructions by pusha() before calling to debug64(). But > > I think debug64() only uses the regs[] and pc arguments when > > ShowMessageBoxOnError is on. Maybe we should at least only do the > saving > > and pc arg passing when ShowMessageBoxOnError is on in > > MacroAssembler::stop(), as what x86 does in > macroAssembler_x86.cpp? > > Maybe we should think about a better way to do it. All we have > to do, after all, is put the reason into, say, r8, and execute > a trap. We don't need to push and pop anything because the trap > handler will do that. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd.
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From tobias.hartmann at oracle.com Wed May 27 14:16:00 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 May 2020 16:16:00 +0200 Subject: [15] RFR(S): 8245957: Remove unused LIR_OpBranch::type after SPARC port removal In-Reply-To: References: Message-ID: <9a7f4880-f825-6585-05a5-568c06002e28@oracle.com> Hi Martin, thanks for the review! Best regards, Tobias On 27.05.20 15:42, Doerr, Martin wrote: > Hi Tobias, > >> I would prefer 2). > +1 > > Thanks for cleaning this up. Looks good to me. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Tobias Hartmann >> Sent: Mittwoch, 27. Mai 2020 14:04 >> To: hotspot compiler >> Subject: [15] RFR(S): 8245957: Remove unused LIR_OpBranch::type after >> SPARC port removal >> >> Hi, >> >> please review the following patch that removes LIR_OpBranch::type after >> the only remaining usage [1] >> was removed with the SPARC port removal (JDK-8244224). >> >> https://bugs.openjdk.java.net/browse/JDK-8245957 >> >> We have two options: >> 1) Keep the type asserts in LIR_OpBranch::branch: >> http://cr.openjdk.java.net/~thartmann/8245957/webrev.v1.00/ >> 2) Remove the asserts: >> http://cr.openjdk.java.net/~thartmann/8245957/webrev.v2.00/ >> >> I would prefer 2). 
>> >> Best regards, >> Tobias >> >> [1] >> https://hg.openjdk.java.net/jdk/jdk/file/ae7ed29a5f70/src/hotspot/cpu/spa >> rc/c1_LIRAssembler_sparc.cpp#l597 From derekw at marvell.com Wed May 27 14:17:28 2020 From: derekw at marvell.com (Derek White) Date: Wed, 27 May 2020 14:17:28 +0000 Subject: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option In-Reply-To: References: <5c755cf3-8c95-e224-49cf-88c7c8b54812@redhat.com> <2551a58f-bb05-b63d-b8ed-63f120a75eeb@redhat.com> <71630bf9-1cde-69b4-c376-6318957ea672@redhat.com> Message-ID: Xiaohong, thanks for including the ThunderX1 cleanup! - Derek -----Original Message----- From: Xiaohong Gong Sent: Wednesday, May 27, 2020 3:23 AM To: Ningsheng Jian ; Andrew Haley ; Andrew Dinn ; Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: [EXT] RE: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option ---------------------------------------------------------------------- Hi Ningsheng, Thanks for the pushing! Best Regards, Xiaohong Gong -----Original Message----- From: Ningsheng Jian Sent: Wednesday, May 27, 2020 3:21 PM To: Xiaohong Gong ; Andrew Haley ; Andrew Dinn ; Derek White ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete UseBarriersForVolatile option I see CSR review and submit tests are clear, so I pushed. Thanks, Ningsheng On 5/22/20 10:36 AM, Xiaohong Gong wrote: > Hi Andrew, > >> On 5/21/20 11:24 AM, Xiaohong Gong wrote: > > > I'v created a new patch to add the condition when inserting > > "DMBs" > > > before volatile load for C1/Interpreter. 
> > > The updated webrev: > > > http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ > > > > > > It adds a new function "is_c1_or_interpreter_only()" , which can > > > decide whether C2/JVMCI is used. Besides, since AOT also uses > > Graal > > > compiler as the codegen, it always return false if AOT mode is > > enabled. > > > > Looks good to me, thanks. > > > > As far as I remember, Graal does optimize volatile accesses to use > > ldar/stlr, or at least it will do so in the future, so if we're > > using AOT or JVMCI the safe thing to do is add the DMBs. > > Yes, exactly! It has a patch in Graal github to do this optimization (https://github.com/oracle/graal/pull/1772). > > Thanks, > Xiaohong > > -----Original Message----- > From: Andrew Haley > Sent: Thursday, May 21, 2020 10:06 PM > To: Xiaohong Gong ; Andrew Dinn > ; Derek White ; > aarch64-port-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: Re: [aarch64-port-dev ] RFR: 8243339: AArch64: Obsolete > UseBarriersForVolatile option > > On 5/21/20 11:24 AM, Xiaohong Gong wrote: >> I'v created a new patch to add the condition when inserting "DMBs" >> before volatile load for C1/Interpreter.
>> The updated webrev: >> http://cr.openjdk.java.net/~xgong/rfr/8243339/webrev.01/ >> >> It adds a new function "is_c1_or_interpreter_only()" , which can >> decide whether C2/JVMCI is used. Besides, since AOT also uses Graal >> compiler as the codegen, it always return false if AOT mode is enabled. > > Looks good to me, thanks. > > As far as I remember, Graal does optimize volatile accesses to use ldar/stlr, or at least it will do so in the future, so if we're using AOT or JVMCI the safe thing to do is add the DMBs. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From igor.veresov at oracle.com Wed May 27 14:44:13 2020 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 27 May 2020 07:44:13 -0700 Subject: AOT fails to compile jdk.base In-Reply-To: <24d80805-d0c3-ba78-39b6-739c1ffc1644@redhat.com> References: <24d80805-d0c3-ba78-39b6-739c1ffc1644@redhat.com> Message-ID: I'm working on it. The JDK part will be covered under https://bugs.openjdk.java.net/browse/JDK-8245505 . The fix will be out in a day or so. igor > On May 27, 2020, at 3:21 AM, Andrew Haley wrote: > > On 27/05/2020 03:41, bobpengxie(??)
wrote: >> Error: Failed compilation: jdk.internal.reflect.UnsafeQualifiedStaticObjectFieldAccessorImpl.get(Ljava/lang/Object;)Ljava/ >> lang/Object;: org.graalvm.compiler.debug.GraalError: should not reach here: Metaspace constant load should not be happenin >> g directly > > It's always been the case that jaotc fails in a few cases to do with > metaspace constant loads. Unless it fails to produce a working library > there's nothing to worry about. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From aph at redhat.com Wed May 27 15:22:45 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 27 May 2020 16:22:45 +0100 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: On 27/05/2020 14:45, Doerr, Martin wrote: > I still see "Instructions" section in hex. But I can live with that. If you look further down the log, you should see the disassembly. PPC change is already pushed: 8244949: [PPC64] Reengineer assembler stop function -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed May 27 15:54:30 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 27 May 2020 16:54:30 +0100 Subject: RFR: 8245986: AArch64: Provide information when hitting a HaltNode Message-ID: <92db8ab4-84e3-d425-4e9f-d6a77b0fa837@redhat.com> We need to provide a halt reason when hitting a C2 HaltNode on AArch64, and we need to do so without grossly bloating the code. 
http://cr.openjdk.java.net/~aph/8245986/ -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Wed May 27 15:57:02 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 27 May 2020 15:57:02 +0000 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: Indeed. Thanks for figuring this out. These variants of report_and_die are very confusing. Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Mittwoch, 27. Mai 2020 17:23 > To: Doerr, Martin ; Derek White > ; Ningsheng Jian ; Liu, > Xin ; hotspot-compiler-dev at openjdk.java.net > Cc: aarch64-port-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information > when hitting a HaltNode for architectures other than x86 > > On 27/05/2020 14:45, Doerr, Martin wrote: > > I still see "Instructions" section in hex. But I can live with that. > > If you look further down the log, you should see the disassembly. > > > PPC change is already pushed: 8244949: [PPC64] Reengineer assembler stop > function > > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. 
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From xxinliu at amazon.com Wed May 27 20:09:34 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Wed, 27 May 2020 20:09:34 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: Hi, Martin and Andrew, Yes, it's better. I compared the two stack traces, and the C frames in the previous one do indeed distract users from understanding their own problems. I reviewed Andrew's webrev. It looks good to me. I am happy to see that you solved the rscratch1 clobber problem in such an elegant way! Just one thing: for this instruction emit_int64((intptr_t)msg), can we safely say a pointer is always 64-bit on aarch64? According to the Arm documentation, aarch64 has the ILP32 data model in theory, but I don't think we have ever used ILP32 on aarch64. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0490a/ar01s01.html May I ask Andrew to sponsor my patch when you push JDK-8245986? Now it becomes trivial. http://cr.openjdk.java.net/~xliu/8230552/02/webrev/ Thanks, --lx On 5/27/20, 7:13 AM, "Doerr, Martin" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Hi Xin, I think Andrew's trap based version will be much better. But I'll leave the AArch64 part up to other people. > Another thing is I don't understand what's the benefit to > use the signal handler for that. Shorter code in the code cache (signal handler takes care of saving registers), better hs_err file (includes registers and instructions at the point at which the signal occurred, better stack trace: no nasty C-frames).
Best regards, Martin > -----Original Message----- > From: Liu, Xin > Sent: Mittwoch, 27. Mai 2020 00:58 > To: aph at redhat.com; Ningsheng Jian ; aarch64- > port-dev at openjdk.java.net > Cc: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net > Subject: Re: RFR(XS): Provide information when hitting a HaltNode for > architectures other than x86 > > Hi, > > I make a new revision of JDK-8230552. May I ask the arm reviewers to take a > look? > http://cr.openjdk.java.net/~xliu/8230552/01/webrev/ > > I haven't made aarch64 stop() uses the trap mechanism because it deserves > a standalone JBS. Another thing is I don't understand what's the benefit to > use the signal handler for that. > > I do manage to reduce stop() code size per Ningsheng's request. It avoids > from generating pusha if ShowMessageBoxOnError is off (default). It still acts > as expect no matter ShowMessageBoxOnError is set or not. Since we now > have got rid of the HaltNode right after Uncommon_trap callnode in release > build, I don't think code bloat is an issue anymore. > > Thanks, > --lx > > > > On 5/6/20, 1:19 AM, "aph at redhat.com" wrote: > > CAUTION: This email originated from outside of the organization. Do not > click links or open attachments unless you can confirm the sender and know > the content is safe. > > > > On 5/6/20 8:25 AM, Liu, Xin wrote: > > Currently in AArch64 MacroAssembler::stop(), it will generate many > > register saving instructions by pusha() before calling to debug64(). But > > I think debug64() only uses the regs[] and pc arguments when > > ShowMessageBoxOnError is on. Maybe we should at least only do the > saving > > and pc arg passing when ShowMessageBoxOnError is on in > > MacroAssembler::stop(), as what x86 does in > macroAssembler_x86.cpp? > > Maybe we should think about a better way to do it. All we have > to do, after all, is put the reason into, say, r8, and execute > a trap. We don't need to push and pop anything because the trap > handler will do that. 
> > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From nils.eliasson at oracle.com Wed May 27 20:42:17 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 27 May 2020 22:42:17 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> Message-ID: <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> Hi Man, On 2020-05-22 10:13, Man Cao wrote: > Hi Nils, > > Thanks for the updated code! > Below is my feedback after a closer look, ordered by importance. > > 1. sweeper_loop() > NMethodSweeper::sweeper_loop() seems to be missing an inner loop to > check for a wakeup condition. > It could suffer from lost wakeup and spurious wakeup problems, as > described in [1]. > Similar code like ServiceThread::service_thread_entry() also has > nested loops to check for wakeup conditions. > We could add a boolean variable "should_sweep", guarded by the > CodeSweeper_lock. > [1] > https://www.modernescpp.com/index.php/c-core-guidelines-be-aware-of-the-traps-of-condition-variables Reasonable. I re-added _should_sweep and did some minor refactorings. > > One wild idea is to let the ServiceThread handle the code cache sweep > work, and remove the sweeper thread. > Has anyone considered this idea before? I think it has but I don't remember the details. Code cache sweeping would block the service thread for quite some time. > > 2. Rank of CodeSweeper_lock > In mutexLocker.cpp: > def(CodeSweeper_lock, PaddedMonitor, special+1, true, > _safepoint_check_never); > > Should the rank be "special - 1", like the CompiledMethod_lock?
> We want to check that this lock acquisition order is valid, but not > vice versa: > { MonitorLocker(CodeCache_lock); MonitorLocker(CodeSweeper_lock); } > Reading the code in Mutex::set_owner_implementation(), the deadlock > avoidance rule enforces that > the inner lock should have a lower rank than the outer lock. > "special + 1" has the same value as Mutex::suspend_resume, which > disables the deadlock avoidance check. Good catch - but the rank should be "special - 2". There is one code path through InstanceKlass::add_osr_nmethod that calls make_not_entrant on an nmethod while holding the CompiledMethod_lock. So the rank needs to be one less. I updated enum lock_types to add one more level of special. > > 3. Data race on _bytes_changed > 74 static volatile int _bytes_changed; > The "volatile" keyword likely intends to avoid data races and > atomicity issues, > but the accesses use "_bytes_changed +=" and "=" to do loads and stores. > Should those accesses use Atomic::add(&_bytes_changed, value) and > other Atomic functions. Fixed. > > 4. NMethodSweeper::_sweep_threshold > Is it better to make it a "size_t" instead of "int"? Then we can use > "ulong" in metadata.xml, and SIZE_FORMAT in the log_info()). > Also it's probably better to name it as _sweep_threshold_bytes or > _sweep_threshold_bytes, to differentiate from the > SweeperThreshold percentage value. > _sweep_threshold's type should probably be consistent > with _bytes_changed, so perhaps _bytes_changed > could be changed to size_t. ok, fixed. > > 5. > 887 void CompileBroker::init_compiler_sweeper_threads() { > 888 NMethodSweeper::set_sweep_threshold((SweeperThreshold / 100) * > ReservedCodeCacheSize); > Is it better to use "static_cast()" to explicitly mark type cast? sure.
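[Editor's illustration, not part of the original message] Points 1 and 3 above can be sketched in portable C++. This is a minimal sketch under stated assumptions: it uses std::mutex, std::condition_variable and std::atomic as stand-ins for HotSpot's Monitor and Atomic classes, and the class name SweeperSketch and its member functions are hypothetical, not names from the webrev. It shows a wakeup flag guarded by the lock and re-checked in a loop (so neither a lost nor a spurious wakeup causes trouble) and a byte counter updated with atomic read-modify-write operations instead of "+=" on a volatile int.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the sweeper state discussed above; not HotSpot code.
class SweeperSketch {
 public:
  explicit SweeperSketch(std::size_t threshold_bytes)
      : _threshold_bytes(threshold_bytes) {
    assert(threshold_bytes > 0 && "threshold must be positive");
  }

  // Called by any thread that changes the code cache by `bytes`.
  void report_change(std::size_t bytes) {
    // Atomic read-modify-write; fetch_add returns the previous value.
    const std::size_t total = _bytes_changed.fetch_add(bytes) + bytes;
    if (total >= _threshold_bytes) {
      {
        // The flag is only written while holding the lock, so a request
        // made before the sweeper starts waiting is not lost.
        std::lock_guard<std::mutex> guard(_lock);
        _should_sweep = true;
      }
      _cv.notify_one();
    }
  }

  // Sweeper side: blocks until a sweep has been requested, then returns
  // the number of bytes accumulated since the previous request.
  std::size_t wait_for_sweep_request() {
    std::unique_lock<std::mutex> guard(_lock);
    while (!_should_sweep) {  // inner loop: re-check after every wakeup
      _cv.wait(guard);
    }
    _should_sweep = false;
    return _bytes_changed.exchange(0);
  }

 private:
  const std::size_t _threshold_bytes;
  std::atomic<std::size_t> _bytes_changed{0};
  std::mutex _lock;
  std::condition_variable _cv;
  bool _should_sweep = false;  // guarded by _lock
};
```

Because the flag is set before the notification and tested before the wait, a request that arrives while the sweeper is busy is simply picked up on its next call to wait_for_sweep_request().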
> > -Man New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ Best regards, Nils From mikael.vidstedt at oracle.com Wed May 27 21:40:31 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 27 May 2020 14:40:31 -0700 Subject: RFR(XS): 8245864: Obsolete BranchOnRegister Message-ID: Please review this small change which obsoletes the BranchOnRegister flag: JBS: https://bugs.openjdk.java.net/browse/JDK-8245864 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245864/webrev.00/open/webrev/ CSR: https://bugs.openjdk.java.net/browse/JDK-8245865 Background (from JBS): With Solaris removed (JDK-8241787) the BranchOnRegister flag no longer has any effect and should be removed using the normal process. Since the flag was really only useful on Solaris and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. Testing: * Manual verification running with the flag: spits out the warning as expected * tier1: in progress Cheers, Mikael From igor.veresov at oracle.com Wed May 27 21:57:54 2020 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 27 May 2020 14:57:54 -0700 Subject: RFR(S) 8245505: Prelink j.l.ref.Reference when loading AOT library Message-ID: This change addresses a problem in AOT with the class reference (to j.l.ref.Reference) being introduced too late (during GC barrier expansion). The solution is twofold. The first part is to run the constant replacement phase after the barrier expansion and replace it with an indirect load. The replacement phase is run in a new mode that does replacement only for known classes that are pre-linked by the runtime during the library load. We can't do replacement with side effects at that point. The second part is the change in the runtime to do the said pre-linking.
Webrev: http://cr.openjdk.java.net/~iveresov/8245505/webrev.00/ JBS: https://bugs.openjdk.java.net/browse/JDK-8245505 Thanks, igor From dean.long at oracle.com Wed May 27 22:46:11 2020 From: dean.long at oracle.com (Dean Long) Date: Wed, 27 May 2020 15:46:11 -0700 Subject: RFR(S) 8245505: Prelink j.l.ref.Reference when loading AOT library In-Reply-To: References: Message-ID: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> Looks OK to me. dl On 5/27/20 2:57 PM, Igor Veresov wrote: > This change addresses a problem in AOT with the class reference (to j.l.ref.Reference) being introduced too late (during GC barrier expansion). The solution is twofold. The first part is to run the constant replacement phase after the barrier expansion and replace it with an indirect load. The replacement phase is run in a new mode that does replacement only for known classes that are pre-linked by the runtime during the library load. We can't do replacement with side effects at that point. The second part is the change in the runtime to do the said pre-linking. > > Webrev: http://cr.openjdk.java.net/~iveresov/8245505/webrev.00/ > JBS: https://bugs.openjdk.java.net/browse/JDK-8245505 > > Thanks, > igor > > > > From mikael.vidstedt at oracle.com Thu May 28 02:19:44 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 27 May 2020 19:19:44 -0700 Subject: RFR(XS): 8246023: Obsolete LIRFillDelaySlot Message-ID: <4546F21D-390D-49BE-9AA3-5F72003F1D50@oracle.com> Please review this small change which obsoletes the LIRFillDelaySlot flag: JBS: https://bugs.openjdk.java.net/browse/JDK-8246023 webrev: https://cr.openjdk.java.net/~mikael/webrevs/8246023/webrev.00/open/webrev CSR: https://bugs.openjdk.java.net/browse/JDK-8246024 Background (from JBS): With SPARC removed (JDK-8241787) the LIRFillDelaySlot flag no longer has any relevant effect and should be removed using the normal process.
Since the flag was really only useful on SPARC and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. Testing: * Manual verification running with the flag: spits out the warning as expected * tier1: in progress Note: I believe additional cleanup is possible related to delay slots, but I'd like to handle that separately. Cheers, Mikael From aoqi at loongson.cn Thu May 28 04:00:06 2020 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 28 May 2020 12:00:06 +0800 Subject: RFR (trivial): 8246027: Minimal fastdebug build broken after JDK-8245801 Message-ID: Hi all, Could you please review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8246027 webrev: http://cr.openjdk.java.net/~aoqi/8246027/webrev.00/ Thanks, Ao Qi From shade at redhat.com Thu May 28 05:19:39 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 28 May 2020 07:19:39 +0200 Subject: RFR (trivial): 8246027: Minimal fastdebug build broken after JDK-8245801 In-Reply-To: References: Message-ID: <5221339e-e4fc-3d0c-9e66-debc49bcd2cb@redhat.com> On 5/28/20 6:00 AM, Ao Qi wrote: > JBS: https://bugs.openjdk.java.net/browse/JDK-8246027 > webrev: http://cr.openjdk.java.net/~aoqi/8246027/webrev.00/ Looks fine and trivial. -- Thanks, -Aleksey From aoqi at loongson.cn Thu May 28 06:04:33 2020 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 28 May 2020 14:04:33 +0800 Subject: RFR (trivial): 8246027: Minimal fastdebug build broken after JDK-8245801 In-Reply-To: <5221339e-e4fc-3d0c-9e66-debc49bcd2cb@redhat.com> References: <5221339e-e4fc-3d0c-9e66-debc49bcd2cb@redhat.com> Message-ID: On Thu, May 28, 2020 at 1:19 PM Aleksey Shipilev wrote: > > On 5/28/20 6:00 AM, Ao Qi wrote: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8246027 > > webrev: http://cr.openjdk.java.net/~aoqi/8246027/webrev.00/ > > Looks fine and trivial. Thanks for reviewing the patch. Can you please sponsor it?
Thanks, Ao Qi From tobias.hartmann at oracle.com Thu May 28 06:49:49 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 28 May 2020 08:49:49 +0200 Subject: RFR(XS): 8245864: Obsolete BranchOnRegister In-Reply-To: References: Message-ID: <89382b7b-fa35-72fc-966c-ab9f2a03422c@oracle.com> Hi Mikael, looks good and trivial to me. Best regards, Tobias On 27.05.20 23:40, Mikael Vidstedt wrote: > > Please review this small change which obsoletes the BranchOnRegister flag: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245864 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245864/webrev.00/open/webrev/ > CSR: https://bugs.openjdk.java.net/browse/JDK-8245865 > > Background (from JBS): > > With Solaris removed (JDK-8241787) the BranchOnRegister flag no longer has any effect and should be removed using the normal process. Since the flag was really only useful on Solaris and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. > > Testing: > > * Manual verification running with the flag: spits out the warning as expected > * tier1: in progress > > Cheers, > Mikael > From tobias.hartmann at oracle.com Thu May 28 06:51:12 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 May 2020 23:51:12 -0700 (PDT) Subject: RFR(XS): 8246023: Obsolete LIRFillDelaySlot In-Reply-To: <4546F21D-390D-49BE-9AA3-5F72003F1D50@oracle.com> References: <4546F21D-390D-49BE-9AA3-5F72003F1D50@oracle.com> Message-ID: <166e393c-dfef-2b88-2fc4-3f0a45dd446c@oracle.com> Hi Mikael, looks good and trivial to me. 
Best regards, Tobias On 28.05.20 04:19, Mikael Vidstedt wrote: > > Please review this small change which obsoletes the LIRFillDelaySlot flag: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8246023 > webrev: https://cr.openjdk.java.net/~mikael/webrevs/8246023/webrev.00/open/webrev > CSR: https://bugs.openjdk.java.net/browse/JDK-8246024 > > Background (from JBS): > > With SPARC removed (JDK-8241787) the LIRFillDelaySlot flag no longer has any relevant effect and should be removed using the normal process. Since the flag was really only useful on SPARC and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. > > Testing: > > * Manual verification running with the flag: spits out the warning as expected > * tier1: in progress > > > Note: I believe additional cleanup is possible related to delay slots, but I'd like to handle that separately. > > Cheers, > Mikael > From tobias.hartmann at oracle.com Thu May 28 07:15:55 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 28 May 2020 09:15:55 +0200 Subject: RFR (trivial): 8246027: Minimal fastdebug build broken after JDK-8245801 In-Reply-To: References: <5221339e-e4fc-3d0c-9e66-debc49bcd2cb@redhat.com> Message-ID: <4862556a-54d4-44f7-51aa-8992cb6c51f3@oracle.com> Looks good to me too. Thanks for fixing! Sponsored. Best regards, Tobias On 28.05.20 08:04, Ao Qi wrote: > On Thu, May 28, 2020 at 1:19 PM Aleksey Shipilev wrote: >> >> On 5/28/20 6:00 AM, Ao Qi wrote: >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246027 >>> webrev: http://cr.openjdk.java.net/~aoqi/8246027/webrev.00/ >> >> Looks fine and trivial. > > Thanks for reviewing the patch. Can you please sponsor it?
> > Thanks, > Ao Qi > From Pengfei.Li at arm.com Thu May 28 07:56:29 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Thu, 28 May 2020 07:56:29 +0000 Subject: RFR: 8245158: C2: Enable SLP for some manually unrolled loops In-Reply-To: <47d371c9-d647-4e97-19f5-330831181ceb@oracle.com> References: <87ftbm26e8.fsf@redhat.com> <47d371c9-d647-4e97-19f5-330831181ceb@oracle.com> Message-ID: Thanks Roland and Tobias for looking at this. Hi Tobias, I've pushed this patch to the JDK submit repo but didn't get the test report email. Could you or another Oracle engineer help check? -- Thanks, Pengfei > -----Original Message----- > From: Tobias Hartmann > Sent: Tuesday, May 26, 2020 21:29 > To: Roland Westrelin ; Pengfei Li > ; hotspot-compiler-dev at openjdk.java.net > Cc: Vladimir Kozlov ; nd > Subject: Re: RFR: 8245158: C2: Enable SLP for some manually unrolled loops > > +1 > > Best regards, > Tobias > > On 26.05.20 15:05, Roland Westrelin wrote: > > > >> Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/ > > > > That looks reasonable to me. > > > > Roland. > > From dean.long at oracle.com Thu May 28 07:59:18 2020 From: dean.long at oracle.com (Dean Long) Date: Thu, 28 May 2020 00:59:18 -0700 Subject: RFR(XS): 8245126 Kitchensink fails with: assert(!method->is_old()) failed: Should not be installing old methods In-Reply-To: <62e34586-eaac-3200-8f5a-ee12ad654afa@oracle.com> References: <62e34586-eaac-3200-8f5a-ee12ad654afa@oracle.com> Message-ID: This seems OK as long as the memory barriers in the thread state transitions prevent the C++ compiler from doing something like reading is_old before reading redefinition_count. I would feel better if both JVMCI and C1/C2 cached is_old and redefinition_count at the same time (making sure to be in the _thread_in_vm state), then bail out based on the cached value of is_old.
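[Editor's illustration, not part of the original message] The suggestion above, caching is_old and redefinition_count at the same point and deciding from the cached values, might look roughly like the following. This is only a sketch under stated assumptions: MethodSketch, RedefinitionSnapshot, take_snapshot and may_install are hypothetical names invented here, plain values stand in for the HotSpot method and JVMTI state, and the requirement that the snapshot be taken where redefinition cannot run concurrently (the _thread_in_vm state mentioned above) is represented only by a comment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a method that class redefinition can mark old.
struct MethodSketch {
  bool is_old;
};

// Both values are captured together so they describe one consistent moment.
struct RedefinitionSnapshot {
  bool method_was_old;
  std::uint64_t redefinition_count;
};

// Assumption: the caller invokes this at a point where redefinition cannot
// run concurrently, so the two reads cannot be separated by a redefinition.
inline RedefinitionSnapshot take_snapshot(const MethodSketch& method,
                                          std::uint64_t current_redefinition_count) {
  return RedefinitionSnapshot{method.is_old, current_redefinition_count};
}

// The compiled code may be installed only if the method was not already old
// when the snapshot was taken and no redefinition has happened since.
inline bool may_install(const RedefinitionSnapshot& snapshot,
                        std::uint64_t current_redefinition_count) {
  return !snapshot.method_was_old &&
         snapshot.redefinition_count == current_redefinition_count;
}

// Usage example: a redefinition after the snapshot blocks installation.
inline void usage_example() {
  MethodSketch method{false};
  std::uint64_t global_count = 7;
  const RedefinitionSnapshot snapshot = take_snapshot(method, global_count);
  assert(may_install(snapshot, global_count));
  ++global_count;  // simulated class redefinition
  assert(!may_install(snapshot, global_count));
}
```

The first condition covers the case described later in this thread, where the method is already old when the counter is cached, which the counter comparison alone cannot detect.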
dl On 5/26/20 12:04 AM, serguei.spitsyn at oracle.com wrote: > On 5/25/20 23:39, serguei.spitsyn at oracle.com wrote: >> Please, review a fix for: >> https://bugs.openjdk.java.net/browse/JDK-8245126 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.1/ >> >> >> Summary: >> The Kitchensink stress test with the Instrumentation module enabled >> does >> a lot of class retransformations in parallel with all other stressing. >> It provokes the assert at the compiled code installation time: >> assert(!method->is_old()) failed: Should not be installing old >> methods >> >> The problem is that the CompileBroker::invoke_compiler_on_method in >> C2 version >> (non-JVMCI tiered compilation) is missing the check that exists in >> the JVMCI >> part of implementation: >> 2148 // Skip redefined methods >> 2149 if (target_handle->is_old()) { >> 2150 failure_reason = "redefined method"; >> 2151 retry_message = "not retryable"; >> 2152 compilable = ciEnv::MethodCompilable_never; >> 2153 } else { >> . . . >> 2168 } >> >> The fix is to add this check. > > Sorry, forgot to explain one thing. > Compiler code has a special mechanism to ensure the JVMTI class > redefinition did > not happen while the method was compiled, so all the assumptions > remain correct. > 2190 // Cache Jvmti state > 2191 ci_env.cache_jvmti_state(); > Part of this is a check that the value of > JvmtiExport::redefinition_count() is > cached in ciEnv variable: _jvmti_redefinition_count. > The JvmtiExport::redefinition_count() value change means a class > redefinition > happened which also implies some of methods may become old. > However, the method being compiled can be already old at the point > where the > redefinition counter is cached, so the redefinition counter check does > not help much. > > Thanks, > Serguei > >> Testing: >> Ran Kitchensink test with the Instrumentation module enabled in mach5 >> multiple times for 100 times.
Without the fix the test normally fails >> a couple of times in 200 runs. It does not fail with the fix anymore. >> Will also submit hs tiers1-5. >> >> Thanks, >> Serguei > From aoqi at loongson.cn Thu May 28 08:50:49 2020 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 28 May 2020 16:50:49 +0800 Subject: RFR (trivial): 8246027: Minimal fastdebug build broken after JDK-8245801 In-Reply-To: <4862556a-54d4-44f7-51aa-8992cb6c51f3@oracle.com> References: <5221339e-e4fc-3d0c-9e66-debc49bcd2cb@redhat.com> <4862556a-54d4-44f7-51aa-8992cb6c51f3@oracle.com> Message-ID: Thanks, Tobias! On Thu, May 28, 2020 at 3:15 PM Tobias Hartmann wrote: > > Looks good to me too. Thanks for fixing! Sponsored. > > Best regards, > Tobias > > On 28.05.20 08:04, Ao Qi wrote: > > On Thu, May 28, 2020 at 1:19 PM Aleksey Shipilev wrote: > >> > >> On 5/28/20 6:00 AM, Ao Qi wrote: > >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246027 > >>> webrev: http://cr.openjdk.java.net/~aoqi/8246027/webrev.00/ > >> > >> Looks fine and trivial. > > > > Thanks for reviewing the patch. Can you please sponsor it? > > > > Thanks, > > Ao Qi > > From aph at redhat.com Thu May 28 09:17:24 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 28 May 2020 10:17:24 +0100 Subject: [aarch64-port-dev ] [EXT] Re: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> <7abc8ac0-0a1c-b306-8a62-78a94c98845a@redhat.com> Message-ID: <33bbd38d-a15c-8538-8725-ad1f1cfafead@redhat.com> On 27/05/2020 16:57, Doerr, Martin wrote: > Indeed. Thanks for figuring this out. > These variants of report_and_die are very confusing. They certainly are: any function with ten arguments is going to be hell's own confusing, but when you have overloads with different semantics and no explanation the reader is doomed. 
I'd add some comment but I'm worried my own lack of complete understanding might mislead people. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu May 28 10:02:46 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 28 May 2020 11:02:46 +0100 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: On 27/05/2020 21:09, Liu, Xin wrote: > Yes, it's better. I compared the two stack traces, and the C frames in the previous stack trace indeed distract users from understanding their own problems. > I reviewed Andrew's webrev. It looks good to me. > I am happy to see that you solved the rscratch1 clobber problem in such an > elegant way! Great, thanks. > Just one thing: for this instruction emit_int64((intptr_t)msg), can we safely say a pointer is always 64-bit on aarch64? That's a fair point. I'll change it: there's no need for that code to depend on pointer size. > According to the Arm documentation, in theory, aarch64 has the ILP32 data > model, but I don't think we have ever used ILP32 on aarch64. If anyone ever makes ILP32 work, we'll be happy to fix HotSpot to run on it. > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0490a/ar01s01.html > > May I ask Andrew to sponsor my patch when you push JDK-8245986? > Now it becomes trivial. http://cr.openjdk.java.net/~xliu/8230552/02/webrev/ -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Pengfei.Li at arm.com Thu May 28 10:03:12 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Thu, 28 May 2020 10:03:12 +0000 Subject: RFR: 8245158: C2: Enable SLP for some manually unrolled loops In-Reply-To: References: <87ftbm26e8.fsf@redhat.com> <47d371c9-d647-4e97-19f5-330831181ceb@oracle.com> Message-ID: BTW: I've pushed twice to the submit repo http://hg.openjdk.java.net/jdk/submit/rev/6ff334698002 (branch JDK-8245158) http://hg.openjdk.java.net/jdk/submit/rev/b88caaa3f01d (branch JDK-8245158-1) but got no report email from Mach5. -- Thanks, Pengfei > Thanks Roland and Tobias for looking at this. > > Hi Tobias, > > I've pushed this patch to the JDK submit repo but didn't get the test report > email. Could you or another Oracle engineer help check? > > -- > Thanks, > Pengfei > > > -----Original Message----- > > From: Tobias Hartmann > > Sent: Tuesday, May 26, 2020 21:29 > > To: Roland Westrelin ; Pengfei Li > > ; hotspot-compiler-dev at openjdk.java.net > > Cc: Vladimir Kozlov ; nd > > Subject: Re: RFR: 8245158: C2: Enable SLP for some manually unrolled > > loops > > > > +1 > > > > Best regards, > > Tobias > > > > On 26.05.20 15:05, Roland Westrelin wrote: > > > > > >> Webrev: http://cr.openjdk.java.net/~pli/rfr/8245158/webrev.00/ > > > > > > That looks reasonable to me. > > > > > > Roland.
> > > From rwestrel at redhat.com Thu May 28 13:19:44 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 28 May 2020 15:19:44 +0200 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <497B34CC-BA72-4674-8C5A-CF04DEF0CDC2@oracle.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> <87y2q55rj4.fsf@redhat.com> <497B34CC-BA72-4674-8C5A-CF04DEF0CDC2@oracle.com> Message-ID: <87lflcyz67.fsf@redhat.com> > Maybe someone else will have comments on that, but assuming > that is more or less unchanged, this change set for 8223051 on > top still looks good, taking into account renaming of min/max > factories. > > After you post an updated webrev.02 the review should be easy. > Tobias might wish to run some regression tests on the final changes. Thanks for the review. Here is the webrev with the min/max renaming and the hunk that was erroneously included in 8244504. http://cr.openjdk.java.net/~roland/8223051/webrev.02/ Roland. From zhuoren.wz at alibaba-inc.com Thu May 28 12:43:25 2020 From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren)) Date: Thu, 28 May 2020 20:43:25 +0800 Subject: RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap Message-ID: Hi, I found that on aarch64, SIGBUS happens when Unsafe compareAndSwapLong and compareAndSwapInt are used to access an unaligned memory address in interpreter mode. In compiled code, InternalError is thrown. We should fix the crash and throw InternalError in the interpreter too. Please help review this patch.
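As an illustration of the condition involved (not the patch itself), a minimal sketch of an alignment test on a raw address. `is_aligned_for` is a hypothetical helper name, and the requirement that `size` be a non-zero power of two is an assumption of this sketch:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if addr is a multiple of size.
// Assumes size is a non-zero power of two (e.g. 4 for int, 8 for long).
constexpr bool is_aligned_for(std::uintptr_t addr, std::size_t size) {
    return (addr & (static_cast<std::uintptr_t>(size) - 1)) == 0;
}
```

A check of this kind before the atomic access would allow the unaligned case to be reported as an error rather than reaching the compare-and-swap instruction; whether the webrev takes this approach is not stated here.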
BUG: https://bugs.openjdk.java.net/browse/JDK-8246051 CR: http://cr.openjdk.java.net/~wzhuo/8246051/webrev.00/ Regards, Zhuoren From vladimir.kozlov at oracle.com Thu May 28 14:18:41 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 07:18:41 -0700 Subject: RFR(XS): 8245864: Obsolete BranchOnRegister In-Reply-To: <89382b7b-fa35-72fc-966c-ab9f2a03422c@oracle.com> References: <89382b7b-fa35-72fc-966c-ab9f2a03422c@oracle.com> Message-ID: +1 Thanks, Vladimir On 5/27/20 11:49 PM, Tobias Hartmann wrote: > Hi Mikael, > > looks good and trivial to me. > > Best regards, > Tobias > > On 27.05.20 23:40, Mikael Vidstedt wrote: >> >> Please review this small change which obsoletes the BranchOnRegister flag: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8245864 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245864/webrev.00/open/webrev/ >> CSR: https://bugs.openjdk.java.net/browse/JDK-8245865 >> >> Background (from JBS): >> >> With Solaris removed (JDK-8241787) the BranchOnRegister flag no longer has any effect and should be removed using the normal process. Since the flag was really only useful on Solaris and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. >> >> Testing: >> >> * Manual verification running with the flag: spits out the warning as expected >> * tier1: in progress >> >> Cheers, >> Mikael >> From vladimir.kozlov at oracle.com Thu May 28 14:19:21 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 07:19:21 -0700 Subject: RFR(XS): 8246023: Obsolete LIRFillDelaySlot In-Reply-To: <166e393c-dfef-2b88-2fc4-3f0a45dd446c@oracle.com> References: <4546F21D-390D-49BE-9AA3-5F72003F1D50@oracle.com> <166e393c-dfef-2b88-2fc4-3f0a45dd446c@oracle.com> Message-ID: <96c6a30f-3c45-167b-f7d1-bb69d983e981@oracle.com> +1 Thanks, Vladimir On 5/27/20 11:51 PM, Tobias Hartmann wrote: > Hi Mikael, > > looks good and trivial to me. 
> > Best regards, > Tobias > > On 28.05.20 04:19, Mikael Vidstedt wrote: >> >> Please review this small change which obsoletes the LIRFillDelaySlot flag: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8246023 >> webrev: https://cr.openjdk.java.net/~mikael/webrevs/8246023/webrev.00/open/webrev >> CSR: https://bugs.openjdk.java.net/browse/JDK-8246024 >> >> Background (from JBS): >> >> With SPARC removed (JDK-8241787) the LIRFillDelaySlot flag no longer has any relevant effect and should be removed using the normal process. Since the flag was really only useful on SPARC and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. >> >> Testing: >> >> * Manual verification running with the flag: spits out the warning as expected >> * tier1: in progress >> >> >> Note: I believe additional cleanup is possible related to delay slots, but I'd like to handle that separately. >> >> Cheers, >> Mikael >> From tobias.hartmann at oracle.com Thu May 28 14:22:02 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 28 May 2020 16:22:02 +0200 Subject: [15] RFR(S): 8239477: jdk/jfr/jcmd/TestJcmdStartStopDefault.java fails -XX:+VerifyOops with "verify_oop: rsi: broken oop" Message-ID: <21b1a507-6b58-6dbe-ab9a-5b1126089be0@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8239477 http://cr.openjdk.java.net/~thartmann/8239477/webrev.00/ When loading JfrThreadLocal::_java_event_writer which is a jobject (i.e., metadata that does not live in the Java heap), type T_OBJECT is used in the C1 intrinsic for _getEventWriter. As a result, we fail during oop verification emitted by LIR_Assembler::mem2reg. We should use T_METADATA instead but can't due to JDK-8026837 [1]. Similar to [2], I'm therefore using T_ADDRESS as a workaround until JDK-8026837 is fixed.
Thanks, Tobias [1] # Internal Error (/oracle/jdk_jdk/open/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp:1353), pid=8575, tid=8591 # Error: ShouldNotReachHere() [2] http://hg.openjdk.java.net/jdk/jdk/file/02a5a446f8bf/src/hotspot/share/c1/c1_LIRGenerator.cpp#l1287 From rwestrel at redhat.com Thu May 28 14:28:35 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 28 May 2020 16:28:35 +0200 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: <87eery7sr7.fsf@redhat.com> References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> <87wo5s6tvs.fsf@redhat.com> <87eery7sr7.fsf@redhat.com> Message-ID: <87imggyvzg.fsf@redhat.com> > Actually in the review thread for 8223051, John suggested some > refactoring that would apply here as well. I'll update this webrev. Here is the updated webrev: http://cr.openjdk.java.net/~roland/8244086/webrev.01/ It takes advantage of new methods from 8244504. Roland. From vladimir.kozlov at oracle.com Thu May 28 16:02:58 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 09:02:58 -0700 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: <32f34616-cf17-8caa-5064-455e013e2313@oracle.com> Message-ID: <057dfdb4-74df-e0ec-198d-455aeb14d5a1@oracle.com> Vladimir Ivanov is on break currently. It looks good to me. Thanks, Vladimir K On 5/26/20 7:31 AM, Reingruber, Richard wrote: > Hi Vladimir, > >>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > >> Not an expert in JVMTI code base, so can't comment on the actual changes. > >> From JIT-compilers perspective it looks good. 
> > I put out webrev.1 a while ago [1]: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > > You originally suggested to use a handshake to switch a thread into interpreter mode [2]. I'm using > a direct handshake now, because I think it is the best fit. > > May I ask if webrev.1 still looks good to you from JIT-compilers perspective? > > Can I list you as (partial) Reviewer? > > Thanks, Richard. > > [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html > [2] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030340.html > > -----Original Message----- > From: Vladimir Ivanov > Sent: Friday, February 7, 2020 09:19 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > > Not an expert in JVMTI code base, so can't comment on the actual changes. > > From JIT-compilers perspective it looks good. > > Best regards, > Vladimir Ivanov > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >> >> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >> >> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >> >> Thanks, Richard.
>> >> See also my question if anyone knows a reason for making the compiled methods not_entrant: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >> From mikael.vidstedt at oracle.com Thu May 28 16:10:34 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 28 May 2020 09:10:34 -0700 Subject: RFR(XS): 8245864: Obsolete BranchOnRegister In-Reply-To: References: <89382b7b-fa35-72fc-966c-ab9f2a03422c@oracle.com> Message-ID: <6B0A48C1-0DB7-4A2E-8CA8-6A9862B0298D@oracle.com> Tobias/Vladimir, thanks for the reviews! Change pushed. Cheers, Mikael > On May 28, 2020, at 7:18 AM, Vladimir Kozlov wrote: > > +1 > > Thanks, > Vladimir > > On 5/27/20 11:49 PM, Tobias Hartmann wrote: >> Hi Mikael, >> looks good and trivial to me. >> Best regards, >> Tobias >> On 27.05.20 23:40, Mikael Vidstedt wrote: >>> >>> Please review this small change which obsoletes the BranchOnRegister flag: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245864 >>> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8245864/webrev.00/open/webrev/ >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8245865 >>> >>> Background (from JBS): >>> >>> With Solaris removed (JDK-8241787) the BranchOnRegister flag no longer has any effect and should be removed using the normal process. Since the flag was really only useful on Solaris and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. 
>>> >>> Testing: >>> >>> * Manual verification running with the flag: spits out the warning as expected >>> * tier1: in progress >>> >>> Cheers, >>> Mikael >>> From mikael.vidstedt at oracle.com Thu May 28 16:28:46 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 28 May 2020 09:28:46 -0700 Subject: RFR(XS): 8246023: Obsolete LIRFillDelaySlot In-Reply-To: <96c6a30f-3c45-167b-f7d1-bb69d983e981@oracle.com> References: <4546F21D-390D-49BE-9AA3-5F72003F1D50@oracle.com> <166e393c-dfef-2b88-2fc4-3f0a45dd446c@oracle.com> <96c6a30f-3c45-167b-f7d1-bb69d983e981@oracle.com> Message-ID: Tobias/Vladimir, thanks for the reviews! Change pushed. Cheers, Mikael > On May 28, 2020, at 7:19 AM, Vladimir Kozlov wrote: > > +1 > > Thanks, > Vladimir > > On 5/27/20 11:51 PM, Tobias Hartmann wrote: >> Hi Mikael, >> looks good and trivial to me. >> Best regards, >> Tobias >> On 28.05.20 04:19, Mikael Vidstedt wrote: >>> >>> Please review this small change which obsoletes the LIRFillDelaySlot flag: >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246023 >>> webrev: https://cr.openjdk.java.net/~mikael/webrevs/8246023/webrev.00/open/webrev >>> CSR: https://bugs.openjdk.java.net/browse/JDK-8246024 >>> >>> Background (from JBS): >>> >>> With SPARC removed (JDK-8241787) the LIRFillDelaySlot flag no longer has any relevant effect and should be removed using the normal process. Since the flag was really only useful on SPARC and since all the code it once controlled is now gone the flag will go directly to obsolete, skipping the deprecation step. >>> >>> Testing: >>> >>> * Manual verification running with the flag: spits out the warning as expected >>> * tier1: in progress >>> >>> >>> Note: I believe additional cleanup is possible related to delay slots, but I'd like to handle that separately.
>>> >>> Cheers, >>> Mikael >>> From vladimir.kozlov at oracle.com Thu May 28 16:44:58 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 09:44:58 -0700 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled In-Reply-To: References: Message-ID: Looks good to me. Thanks, Vladimir K On 5/27/20 1:02 AM, Xiaohong Gong wrote: > Hi David, > > > On 26/05/2020 4:57 pm, Xiaohong Gong wrote: > > > Hi, > > > > > > Could you please help to review this simple patch? It fixes the > > issue > > > that JVM crashes in debug mode when the vm option > > "-XX:EnableJVMCIProduct" is enabled repetitively. > > > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 > > > Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ > > > > > > Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in > > the > > > command line makes the assertion fail in debug mode: > > "assert(is_experimental(), sanity)". > > > > > > It happens when the VM iteratively parses the options from > > command > > > line. When the matched option is "-XX:+EnableJVMCIProduct", the > > > original experimental JVMCI flags will be converted to product > > mode, > > > with the above assertion before it. So if all the JVMCI flags > > have > > > been converted to the product mode at the first time parsing > > "-XX:+EnableJVMCIProduct", the assertion will fail at the second > > time it is parsed. > > > > > > A simple fix is to just ignore the conversion if this option > > has been parsed. > > > > Seems a reasonable approach given the already complex handling of > > this flag. > > > > > Testing: > > > Tested jtreg > > > hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 > > > and jcstress:tests-custom, and all tests pass without new > > failure. > > > > I think adding a regression test in > > ./compiler/jvmci/TestEnableJVMCIProduct.java would be appropriate.
> > Thanks for your review and it's a good idea to add the regression test. > I have added the test in the new patch: > http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.01/ . > > Could you please take a look at it? Thank you! > > Thanks, > Xiaohong Gong > From aph at redhat.com Thu May 28 16:52:26 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 28 May 2020 17:52:26 +0100 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: On 28/05/2020 11:02, Andrew Haley wrote: >> Just one thing: for this instruction emit_int64((intptr_t)msg), can we safely say a pointer is always 64-bit on aarch64? > That's a fair point. I'll change it: there's no need for that code to > depend on pointer size. http://cr.openjdk.java.net/~aph/8245986-2/ I'm trying to get better at not always assuming that pointers and int64_t can be freely exchanged, or that the world is little endian. I have a long way to go. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu May 28 16:55:55 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 28 May 2020 17:55:55 +0100 Subject: RFR: 8245986: AArch64: Provide information when hitting a HaltNode In-Reply-To: <92db8ab4-84e3-d425-4e9f-d6a77b0fa837@redhat.com> References: <92db8ab4-84e3-d425-4e9f-d6a77b0fa837@redhat.com> Message-ID: <0c322e72-2935-ac1a-f620-54886eec7e5e@redhat.com> On 27/05/2020 16:54, Andrew Haley wrote: > We need to provide a halt reason when hitting a C2 HaltNode on > AArch64, and we need to do so without grossly bloating the code.
> > http://cr.openjdk.java.net/~aph/8245986/ New webrev: http://cr.openjdk.java.net/~aph/8245986-2/ -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From igor.veresov at oracle.com Thu May 28 17:04:12 2020 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 28 May 2020 10:04:12 -0700 Subject: RFR(S) 8245505: Prelink j.l.ref.Reference when loading AOT library In-Reply-To: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> References: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> Message-ID: <62482A76-8A23-4C2A-8814-48635CE2C5D3@oracle.com> Thanks, Dean! igor > On May 27, 2020, at 3:46 PM, Dean Long wrote: > > Looks OK to me. > > dl > > On 5/27/20 2:57 PM, Igor Veresov wrote: >> This change addresses a problem in AOT with the class reference (to j.l.ref.Reference) being introduced too late (during GC barrier expansion). The solution is twofold. The first part is to run the constant replacement phase after the barrier expansion and replace it with an indirect load. The replacement phase is run in a new mode that does replacement only for known classes that are pre-linked by the runtime during the library load. We can't do replacement with side effects at that point. The second part is the change in the runtime to do the said pre-linking. >> >> Webrev: http://cr.openjdk.java.net/~iveresov/8245505/webrev.00/ >> JBS: https://bugs.openjdk.java.net/browse/JDK-8245505 >> >> Thanks, >> igor >> >> >> >> > From vladimir.kozlov at oracle.com Thu May 28 17:08:46 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 10:08:46 -0700 Subject: RFR(S) 8245505: Prelink j.l.ref.Reference when loading AOT library In-Reply-To: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> References: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> Message-ID: +1 on AOT changes. Thanks, Vladimir On 5/27/20 3:46 PM, Dean Long wrote: > Looks OK to me.
> > dl > > On 5/27/20 2:57 PM, Igor Veresov wrote: >> This change addresses a problem in AOT with the class reference (to j.l.ref.Reference) being introduced too late >> (during GC barrier expansion). The solution is twofold. The first part is to run the constant replacement phase after >> the barrier expansion and replace it with an indirect load. The replacement phase is run in a new mode that does >> replacement only for known classes that are pre-linked by the runtime during the library load. We can't do replacement >> with side effects at that point. The second part is the change in the runtime to do the said pre-linking. >> >> Webrev: http://cr.openjdk.java.net/~iveresov/8245505/webrev.00/ >> JBS: https://bugs.openjdk.java.net/browse/JDK-8245505 >> >> Thanks, >> igor >> >> >> >> > From vladimir.kozlov at oracle.com Thu May 28 17:24:41 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 May 2020 10:24:41 -0700 Subject: [15] RFR(S): 8239477: jdk/jfr/jcmd/TestJcmdStartStopDefault.java fails -XX:+VerifyOops with "verify_oop: rsi: broken oop" In-Reply-To: <21b1a507-6b58-6dbe-ab9a-5b1126089be0@oracle.com> References: <21b1a507-6b58-6dbe-ab9a-5b1126089be0@oracle.com> Message-ID: <7355118d-6851-f85e-58e5-3cd5eba6dd90@oracle.com> Good. Thanks, Vladimir On 5/28/20 7:22 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8239477 > http://cr.openjdk.java.net/~thartmann/8239477/webrev.00/ > > When loading JfrThreadLocal::_java_event_writer which is a jobject (i.e., metadata that does not > live in the Java heap), type T_OBJECT is used in the C1 intrinsic for _getEventWriter. As a result, > we fail during oop verification emitted by LIR_Assembler::mem2reg. > > We should use T_METADATA instead but can't due to JDK-8026837 [1]. Similar to [2], I'm therefore > using T_ADDRESS as a workaround until JDK-8026837 is fixed.
> > Thanks, > Tobias > > [1] # Internal Error (/oracle/jdk_jdk/open/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp:1353), > pid=8575, tid=8591 > # Error: ShouldNotReachHere() > > [2] http://hg.openjdk.java.net/jdk/jdk/file/02a5a446f8bf/src/hotspot/share/c1/c1_LIRGenerator.cpp#l1287 > From igor.veresov at oracle.com Thu May 28 17:36:48 2020 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 28 May 2020 10:36:48 -0700 Subject: RFR(S) 8245505: Prelink j.l.ref.Reference when loading AOT library In-Reply-To: References: <75925804-6060-891a-f603-ec36c3887e6c@oracle.com> Message-ID: Thanks, Vladimir! igor > On May 28, 2020, at 10:08 AM, Vladimir Kozlov wrote: > > +1 on AOT changes. > > Thanks, > Vladimir > > On 5/27/20 3:46 PM, Dean Long wrote: >> Looks OK to me. >> dl >> On 5/27/20 2:57 PM, Igor Veresov wrote: >>> This change addresses a problem in AOT with the class reference (to j.l.ref.Reference) being introduced too late (during GC barrier expansion). The solution is twofold. The first part is to run the constant replacement phase after the barrier expansion and replace it with an indirect load. The replacement phase is run in a new mode that does replacement only for known classes that are pre-linked by the runtime during the library load. We can't do replacement with side effects at that point. The second part is the change in the runtime to do the said pre-linking.
>>> >>> Webrev: http://cr.openjdk.java.net/~iveresov/8245505/webrev.00/ >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245505 >>> >>> Thanks, >>> igor >>> >>> >>> >>> From dean.long at oracle.com Thu May 28 17:54:56 2020 From: dean.long at oracle.com (Dean Long) Date: Thu, 28 May 2020 10:54:56 -0700 Subject: RFR(XS): 8245126 Kitchensink fails with: assert(!method->is_old()) failed: Should not be installing old methods In-Reply-To: References: <62e34586-eaac-3200-8f5a-ee12ad654afa@oracle.com> Message-ID: <5d957cae-8911-8572-2b45-048b8d09ae79@oracle.com> Sure, you could just have cache_jvmti_state() return a boolean to bail out immediately for is_old. dl On 5/28/20 7:23 AM, serguei.spitsyn at oracle.com wrote: > Hi Dean, > > Thank you for looking at this! > Okay. Let me check what can be done in this direction. > There is no point to cache is_old. The compilation has to bail out if > it is discovered to be true. > > Thanks, > Serguei > > > On 5/28/20 00:59, Dean Long wrote: >> This seems OK as long as the memory barriers in the thread state >> transitions prevent the C++ compiler from doing something like >> reading is_old before reading redefinition_count. I would feel >> better if both JVMCI and C1/C2 cached is_old and redefinition_count >> at the same time (making sure to be in the _thread_in_vm state), then >> bail out based on the cached value of is_old. >> >> dl >> >> On 5/26/20 12:04 AM, serguei.spitsyn at oracle.com wrote: >>> On 5/25/20 23:39, serguei.spitsyn at oracle.com wrote: >>>> Please, review a fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-8245126 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.1/ >>>> >>>> >>>> Summary: >>>> The Kitchensink stress test with the Instrumentation module >>>> enabled does >>>> a lot of class retransformations in parallel with all other >>>> stressing. >>>> It provokes the assert at the compiled code installation time: >>>>
assert(!method->is_old()) failed: Should not be installing old >>>> methods >>>> >>>> The problem is that the CompileBroker::invoke_compiler_on_method >>>> in C2 version >>>> (non-JVMCI tiered compilation) is missing the check that exists >>>> in the JVMCI >>>> part of implementation: >>>> 2148 // Skip redefined methods >>>> 2149 if (target_handle->is_old()) { >>>> 2150 failure_reason = "redefined method"; >>>> 2151 retry_message = "not retryable"; >>>> 2152 compilable = ciEnv::MethodCompilable_never; >>>> 2153 } else { >>>> . . . >>>> 2168 } >>>> >>>> The fix is to add this check. >>> >>> Sorry, forgot to explain one thing. >>> Compiler code has a special mechanism to ensure the JVMTI class >>> redefinition did >>> not happen while the method was compiled, so all the assumptions >>> remain correct. >>> 2190 // Cache Jvmti state >>> 2191 ci_env.cache_jvmti_state(); >>> Part of this is a check that the value of >>> JvmtiExport::redefinition_count() is >>> cached in ciEnv variable: _jvmti_redefinition_count. >>> The JvmtiExport::redefinition_count() value change means a class >>> redefinition >>> happened which also implies some of methods may become old. >>> However, the method being compiled can be already old at the point >>> where the >>> redefinition counter is cached, so the redefinition counter check >>> does not help much. >>> >>> Thanks, >>> Serguei >>> >>>> Testing: >>>> Ran Kitchensink test with the Instrumentation module enabled in mach5 >>>> multiple times for 100 times. Without the fix the test normally fails >>>> a couple of times in 200 runs. It does not fail with the fix anymore. >>>> Will also submit hs tiers1-5. >>>> >>>> Thanks, >>>> Serguei >>> >> > From evgeny.nikitin at oracle.com Thu May 28 19:22:41 2020 From: evgeny.nikitin at oracle.com (Evgeny Nikitin) Date: Thu, 28 May 2020 21:22:41 +0200 Subject: RFR(S): 8242923: Trigger interface MethodHandle resolve in test without Nashorn.
Message-ID: Hi, Bug: https://bugs.openjdk.java.net/browse/JDK-8242923 Webrev: http://cr.openjdk.java.net/~enikitin/8242923/webrev.01/ The test used Nashorn to trigger an incorrect MethodHandle resolve in linkResolver.cpp (which in turn caused a crash on the MethodHandle invocation). The test's functionality has been checked by rolling back the fix made in https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-October/012155.html: with the fix reverted, the test fails on 4 common platforms in mach5. The version with the bugfix reverted can be found here: http://cr.openjdk.java.net/~enikitin/8242923/webrev.00/ The change has been checked in mach5 for the 4 common platforms (passed). Please review, /Evgeny Nikitin. From evgeny.nikitin at oracle.com Thu May 28 19:33:24 2020 From: evgeny.nikitin at oracle.com (Evgeny Nikitin) Date: Thu, 28 May 2020 21:33:24 +0200 Subject: RFR(L): 8229186: Improve error messages for TestStringIntrinsics failures In-Reply-To: <2fa31b05-c37a-1367-a7dc-5ae2b13133be@oracle.com> References: <2fa31b05-c37a-1367-a7dc-5ae2b13133be@oracle.com> Message-ID: <1558df02-2c3d-6c0e-cb7b-c06d17bb2a66@oracle.com> Forwarding to the compiler mailing list as this changes a test in the Compiler area. On 2020-05-18 16:46, Evgeny Nikitin wrote: > Hi, > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229186 > Webrev: http://cr.openjdk.java.net/~enikitin/8229186/webrev.00/ > > Error reporting was improved by writing C-style escaped string > representations for the variables passed to the methods being tested. > For array comparisons, a dedicated diff-formatter was implemented. > > Sample output for comparing byte arrays (with artificial failure): > ----------System.err:(21/1553)---------- > Result: (false) of 'arrayEqualsB' is not equal to expected (true) > Arrays differ starting from [index: 7]: > ... 5, 6,   7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, > 23, ... > ... 5, 6, 125, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, > 23, ... 
>          ^^^^ > java.lang.RuntimeException: Result: (false) of 'arrayEqualsB' is not > equal to expected (true) >         at > compiler.intrinsics.string.TestStringIntrinsics.invokeAndCheckArrays(TestStringIntrinsics.java:273) > >         at ... stack trace continues - E.N. > > Sample output for comparing char arrays: > ----------System.err:(21/1579)*---------- > Result: (false) of 'arrayEqualsC' is not equal to expected (true) > Arrays differ starting from [index: 7]: > ... \\u0005, \\u0006, \\u0007, \\u0008, \\u0009, \\n, \\u000B, \\u000C, > \\r, \\u000E, ... > ... \\u0005, \\u0006,       }, \\u0008, \\u0009, \\n, \\u000B, \\u000C, > \\r, \\u000E, ... >                    ^^^^^^^ > java.lang.RuntimeException: Result: (false) of 'arrayEqualsC' is not > equal to expected (true) >         at > compiler.intrinsics.string.TestStringIntrinsics.invokeAndCheckArrays(TestStringIntrinsics.java:280) > >         at > ... and so on - E.N. > > Please review. > Thanks in advance, > /Evgeny Nikitin. From xxinliu at amazon.com Thu May 28 19:37:48 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Thu, 28 May 2020 19:37:48 +0000 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: Hi, Andrew, Thank you for this JBS issue for aarch64. I think I understand the mechanism behind it. We always store and load 64 bits for a pointer and let type conversion do the job. Your patch looks good to me. Thanks, --lx On 5/28/20, 9:53 AM, "Andrew Haley" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On 28/05/2020 11:02, Andrew Haley wrote: >> Just one thing: for this instruction emit_int64((intptr_t)msg), can we safely say a pointer is always 64-bit on aarch64? > That's a fair point. I'll change it: there's no need for that code to > depend on pointer size. http://cr.openjdk.java.net/~aph/8245986-2/ I'm trying to get better at not always assuming that pointers and int64_t can be freely exchanged, or that the world is little endian. I have a long way to go. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From nick.gasson at arm.com Fri May 29 01:54:15 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Fri, 29 May 2020 09:54:15 +0800 Subject: [aarch64-port-dev ] RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap In-Reply-To: References: Message-ID: On 05/28/20 20:43 PM, Wang Zhuo wrote: > I found that on aarch64, SIGBUS happens when Unsafe compareAndSwapLong and compareAndSwapInt were used to access unaligned mem address in interpreter mode. In compiled code, InternalError will be thrown. We should fix the crash and throw InternalError in interpreter too. > Please help review this patch. > > BUG: https://bugs.openjdk.java.net/browse/JDK-8246051 > CR: http://cr.openjdk.java.net/~wzhuo/8246051/webrev.00/ > Could you add a jtreg test for this? -- Nick From Xiaohong.Gong at arm.com Fri May 29 02:43:21 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Fri, 29 May 2020 02:43:21 +0000 Subject: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled In-Reply-To: References: Message-ID: Hi Vladimir, > Looks good to me. Thanks so much for your reviewing! 
Best Regards, Xiaohong Gong -----Original Message----- From: Vladimir Kozlov Sent: Friday, May 29, 2020 12:45 AM To: Xiaohong Gong ; David Holmes ; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Cc: nd Subject: Re: RFR: 8245717: VM option "-XX:EnableJVMCIProduct" could not be repetitively enabled Looks good to me. Thanks, Vladimir K On 5/27/20 1:02 AM, Xiaohong Gong wrote: > Hi David, > > > On 26/05/2020 4:57 pm, Xiaohong Gong wrote: > > > Hi, > > > > > > Could you please help to review this simple patch? It fixes the > > issue > > > that JVM crashes in debug mode when the vm option > > "-XX:EnableJVMCIProduct" is enabled repetitively. > > > > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8245717 > > > Webrev: http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.00/ > > > > > > Repetitively enabling the vm option "-XX:+EnableJVMCIProduct" in > > the > > > command line makes the assertion fail in debug mode: > > "assert(is_experimental(), sanity)". > > > > > > It happens when the VM iteratively parses the options from > > command > > > line. When the matched option is "-XX:+EnableJVMCIProduct", the > > > original experimental JVMCI flags will be converted to product > > mode, > > > with the above assertion before it. So if all the JVMCI flags > > have > > > been converted to the product mode at the first time parsing > > "-XX:+EnableJVMCIProduct", the assertion will fail at the second > > time it is parsed. > > > > > > A simple fix is to just ignoring the conversion if this option > > has been parsed. > > > > Seems a reasonable approach given the already complex handling of > > this flag. > > > > > Testing: > > > Tested jtreg > > > hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1 > > > and jcstress:tests-custom, and all tests pass without new > > failure. > > > > I think adding a regression test in > > ./compiler/jvmci/TestEnableJVMCIProduct.java would be appropriate. 
> > Thanks for your review and it's a good idea to add the regression test. > I have added the test in the new patch: > http://cr.openjdk.java.net/~xgong/rfr/8245717/webrev.01/ . > > Could you please take a look at it? Thank you! > > Thanks, > Xiaohong Gong > From Xiaohong.Gong at arm.com Fri May 29 06:25:47 2020 From: Xiaohong.Gong at arm.com (Xiaohong Gong) Date: Fri, 29 May 2020 06:25:47 +0000 Subject: Question about the expected behavior if JVMCI compiler is used on the jvm variant with C2 disabled Message-ID: Adding the hotspot-runtime-dev at openjdk.java.net list. Thanks! From: Xiaohong Gong Sent: Wednesday, May 27, 2020 5:19 PM To: hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Question about the expected behavior if JVMCI compiler is used on the jvm variant with C2 disabled Hi, Recently we found that the JVM can crash in debug mode when the JVMCI compiler is used on a JVM variant in which C2 is disabled (configured with "--with-jvm-features=-compiler2"). The JVM crashes with an assertion failure: Internal Error (jdk/src/hotspot/share/compiler/compileBroker.cpp:891), pid=10824, tid=10825 # assert(_c2_count > 0 || _c1_count > 0) failed: No compilers? It is obvious that the JVM cannot find a compiler, since both "_c2_count" and "_c1_count" are zero due to some internal issues. Since "TieredCompilation" is turned off when C2 is disabled, the compile mode should be "interpreter+C1" by default, and that works as expected. However, I'm confused about the expected behavior if the JVMCI compiler is selected. On the one hand, I thought it should use "interpreter+JVMCI" as the compile mode. If so, we have to fix the issues. On the other hand, I noticed that there is a VM warning when using the JVMCI compiler and disabling tiered compilation with a normal configuration: "Disabling tiered compilation with non-native JVMCI compiler is not recommended".
So considering that "TieredCompilation" is also turned off when C2 is disabled, I thought it would be better to simply disallow the JVMCI compiler in that case. So my question is which should be the expected behavior: choosing "interpreter+JVMCI" as the compile mode, or making it invalid to use the JVMCI compiler when C2 is disabled? I would very much appreciate any opinions! Thanks, Xiaohong From tobias.hartmann at oracle.com Fri May 29 06:32:03 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 29 May 2020 08:32:03 +0200 Subject: [15] RFR(S): 8239477: jdk/jfr/jcmd/TestJcmdStartStopDefault.java fails -XX:+VerifyOops with "verify_oop: rsi: broken oop" In-Reply-To: <7355118d-6851-f85e-58e5-3cd5eba6dd90@oracle.com> References: <21b1a507-6b58-6dbe-ab9a-5b1126089be0@oracle.com> <7355118d-6851-f85e-58e5-3cd5eba6dd90@oracle.com> Message-ID: <5ce9dca9-d66c-9345-8f27-9aa24d834e95@oracle.com> Thanks Vladimir! Best regards, Tobias On 28.05.20 19:24, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 5/28/20 7:22 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8239477 >> http://cr.openjdk.java.net/~thartmann/8239477/webrev.00/ >> >> When loading JfrThreadLocal::_java_event_writer which is a jobject (i.e., metadata that does not >> live in the Java heap), type T_OBJECT is used in the C1 intrinsic for _getEventWriter. As a result, >> we fail during oop verification emitted by LIR_Assembler::mem2reg. >> >> We should use T_METADATA instead but can't due to JDK-8026837 [1]. Similar to [2], I'm therefore >> using T_ADDRESS as a workaround until JDK-8026837 is fixed. >> >> Thanks, >> Tobias >> >> [1] # Internal Error (/oracle/jdk_jdk/open/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp:1353), >> pid=8575, tid=8591 >> # 
Error: ShouldNotReachHere() >> >> [2] >> http://hg.openjdk.java.net/jdk/jdk/file/02a5a446f8bf/src/hotspot/share/c1/c1_LIRGenerator.cpp#l1287 >> From rwestrel at redhat.com Fri May 29 07:30:57 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 29 May 2020 09:30:57 +0200 Subject: RFR(XS): 8245714: "Bad graph detected in build_loop_late" when loads are pinned on loop limit check uncommon branch In-Reply-To: <87tv041ira.fsf@redhat.com> References: <87tv041ira.fsf@redhat.com> Message-ID: <87ftbjyz7y.fsf@redhat.com> > https://bugs.openjdk.java.net/browse/JDK-8245714 > http://cr.openjdk.java.net/~roland/8245714/webrev.00/ > > This triggers when data nodes are pinned on the uncommon trap path of a > predicate. When a new predicate is added, a region is created to merge > the paths coming from the place holder and the new predicate. Data > nodes pinned on the uncommon path for the place holder are then updated > to be pinned on the new region. That logic updates the control edge but > not the control that loop opts keep track of. This causes a crash with > the test case of the webrev where the predicate is a loop limit check. That fix is incomplete. If the Load that's pinned on the uncommon trap path is a LoadN then there's a DecodeN between the uncommon trap and the Load. The control of the DecodeN also needs to be updated. Here is an updated fix: http://cr.openjdk.java.net/~roland/8245714/webrev.01/ This one uses lazy_replace. I'm concerned that other nodes (maybe an AddP) would be assigned the projection as control and need to be moved. With lazy_replace, all nodes are guaranteed to be properly updated. Roland. 
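As an illustration of the point in the message above (a control edge being updated while the control that loop opts track is not), here is a toy C++ model. It is only a sketch under stated assumptions, not HotSpot code: Node, ControlTable, set_ctrl, get_ctrl and this lazy_replace are hypothetical stand-ins that mimic a node-to-control side table with forwarding entries, on the assumption that this is roughly the idea the lazy_replace-based fix relies on.

```cpp
#include <cassert>
#include <unordered_map>

// Toy model only: none of these types are the real HotSpot classes.
struct Node {
    int id;
    Node* control;  // explicit control input of the node (may be null for data nodes)
    explicit Node(int i) : id(i), control(nullptr) {}
};

// Hypothetical side table such as a loop-optimization pass might keep:
// for every node, the control node it is currently assigned to.
class ControlTable {
    std::unordered_map<const Node*, Node*> ctrl_;     // node -> assigned control
    std::unordered_map<const Node*, Node*> forward_;  // superseded control -> replacement

public:
    void set_ctrl(const Node* n, Node* c) { ctrl_[n] = c; }

    // Record that old_ctrl has been superseded by new_ctrl. Nothing is
    // rewritten eagerly; lookups follow the forwarding entry later.
    void lazy_replace(const Node* old_ctrl, Node* new_ctrl) {
        assert(old_ctrl != new_ctrl);  // a self-forward would loop forever in get_ctrl
        forward_[old_ctrl] = new_ctrl;
    }

    // Return the control assigned to n, following forwarding entries and
    // caching the result so that later lookups are direct.
    Node* get_ctrl(const Node* n) {
        auto it = ctrl_.find(n);
        if (it == ctrl_.end()) {
            return nullptr;
        }
        Node* c = it->second;
        for (auto f = forward_.find(c); f != forward_.end(); f = forward_.find(c)) {
            c = f->second;
        }
        it->second = c;
        return c;
    }
};
```

In this model, rewriting only load.control from the old projection to the new region leaves the table entry stale, and a second node that has no control edge of its own (the DecodeN in the message) is missed entirely. A single lazy_replace(old projection, new region) makes every lookup resolve to the region, whichever nodes happened to be assigned to the projection.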
From richard.reingruber at sap.com Fri May 29 08:08:53 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 29 May 2020 08:08:53 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <057dfdb4-74df-e0ec-198d-455aeb14d5a1@oracle.com> References: <32f34616-cf17-8caa-5064-455e013e2313@oracle.com> <057dfdb4-74df-e0ec-198d-455aeb14d5a1@oracle.com> Message-ID: Thanks for the info, Vladimir, and for looking at the webrev. Best regards, Richard. -----Original Message----- From: Vladimir Kozlov Sent: Donnerstag, 28. Mai 2020 18:03 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Vladimir Ivanov is on break currently. It looks good to me. Thanks, Vladimir K On 5/26/20 7:31 AM, Reingruber, Richard wrote: > Hi Vladimir, > >>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > >> Not an expert in JVMTI code base, so can't comment on the actual changes. > >> From JIT-compilers perspective it looks good. > > I put out webrev.1 a while ago [1]: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ > Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ > > You originally suggested to use a handshake to switch a thread into interpreter mode [2]. I'm using > a direct handshake now, because I think it is the best fit. > > May I ask if webrev.1 still looks good to you from JIT-compilers perspective? > > Can I list you as (partial) Reviewer? > > Thanks, Richard. 
> > [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html > [2] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030340.html > > -----Original Message----- > From: Vladimir Ivanov > Sent: Freitag, 7. Februar 2020 09:19 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > > Not an expert in JVMTI code base, so can't comment on the actual changes. > > From JIT-compilers perspective it looks good. > > Best regards, > Vladimir Ivanov > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >> >> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >> >> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >> >> Thanks, Richard. 
>> >> See also my question if anyone knows a reason for making the compiled methods not_entrant: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html >> From john.r.rose at oracle.com Fri May 29 08:11:00 2020 From: john.r.rose at oracle.com (John Rose) Date: Fri, 29 May 2020 01:11:00 -0700 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <87lflcyz67.fsf@redhat.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> <87y2q55rj4.fsf@redhat.com> <497B34CC-BA72-4674-8C5A-CF04DEF0CDC2@oracle.com> <87lflcyz67.fsf@redhat.com> Message-ID: <0CD2D156-D877-40AF-8FE6-CF5C64F127D9@oracle.com> On May 28, 2020, at 6:19 AM, Roland Westrelin wrote: > >> Maybe someone else will have comments on that, but assuming >> that is more or less unchanged, this change set for 8223051 on >> top still looks good, taking into account renaming of min/max >> factories. >> >> After you post an updated webrev.02 the review should be easy. >> Tobias might wish to run some regression tests on the final changes. > > Thanks for the review. Here is the webrev with the min/max renaming and > the hunk that was erroneously included in 8244504. > > http://cr.openjdk.java.net/~roland/8223051/webrev.02/ > > Roland. Good, same as before. I noticed one more corner case that might be good to address. + // We can't iterate for more than max int at a time. + if (stride_con != (jint)stride_con || ABS(stride_con) >= max_jint) { + return false; + } Should be: + // We can't iterate for more than max int at a time. ++ if (stride_con != (jint)stride_con || ABS(stride_con) * (1+MIN_ITER) >= max_jint) { + return false; + } Where MIN_ITER is some constant that defines the minimum number of iterations that the strip-mined int-loop should run. If the stride is very large (nearly max_jint) then the int-loop will only run once, or just a few times. I think MIN_ITER should be at least 10.
I suggest hardwiring it to 10 locally in loopnode.cpp, and making it a tunable parameter later on if we actually run into trouble with it. But we won't; nobody is going to write loops with strides on the order of max_jint. In fact, you can leave out this suggestion altogether, if you are not comfortable with it, and we just take the odd performance hit if someone does something that strange. Either way, I say, ship it! -- John From adinn at redhat.com Fri May 29 08:33:37 2020 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 29 May 2020 09:33:37 +0100 Subject: RFR: 8245986: AArch64: Provide information when hitting a HaltNode In-Reply-To: <0c322e72-2935-ac1a-f620-54886eec7e5e@redhat.com> References: <92db8ab4-84e3-d425-4e9f-d6a77b0fa837@redhat.com> <0c322e72-2935-ac1a-f620-54886eec7e5e@redhat.com> Message-ID: On 28/05/2020 17:55, Andrew Haley wrote: > On 27/05/2020 16:54, Andrew Haley wrote: >> We need to provide a halt reason when hitting a C2 HaltNode on >> AArch64, and we need to do so without grossly bloating the code. >> >> http://cr.openjdk.java.net/~aph/8245986/ > > New webrev: http://cr.openjdk.java.net/~aph/8245986-2/ Yes, very nice. I assume you tested it :-) Reviewed! regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From aph at redhat.com Fri May 29 08:55:08 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 29 May 2020 09:55:08 +0100 Subject: RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap In-Reply-To: References: Message-ID: <02ee2fa8-b649-baf3-1158-0a62299a94b5@redhat.com> On 28/05/2020 13:43, Wang Zhuo(Zhuoren) wrote: > Hi, > I found that on aarch64, SIGBUS happens when Unsafe compareAndSwapLong and compareAndSwapInt were used to access unaligned mem address in interpreter mode. In compiled code, InternalError will be thrown.
We should fix the crash and throw InternalError in interpreter too. > Please help review this patch. > > BUG: https://bugs.openjdk.java.net/browse/JDK-8246051 > CR: http://cr.openjdk.java.net/~wzhuo/8246051/webrev.00/ OK, thanks. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Fri May 29 09:05:20 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 29 May 2020 09:05:20 +0000 Subject: RFR(S): 8244086: Following 8241492, strip mined loop may run extra iterations In-Reply-To: <87imggyvzg.fsf@redhat.com> References: <87wo5y8z2v.fsf@redhat.com> <878sid8jzn.fsf@redhat.com> <87zhat6voh.fsf@redhat.com> <87wo5s6tvs.fsf@redhat.com> <87eery7sr7.fsf@redhat.com> <87imggyvzg.fsf@redhat.com> Message-ID: Hi Roland, thanks for improving it and for adding the comment. I'm fine with it. Best regards, Martin > -----Original Message----- > From: Roland Westrelin > Sent: Donnerstag, 28. Mai 2020 16:29 > To: Doerr, Martin ; Pengfei Li > ; hotspot-compiler-dev at openjdk.java.net > Cc: nd > Subject: RE: RFR(S): 8244086: Following 8241492, strip mined loop may run > extra iterations > > > > Actually in the review thread for 8223051, John suggested some > > refactoring that would apply here as well. I'll update this webrev. > > Here is the updated webrev: > > http://cr.openjdk.java.net/~roland/8244086/webrev.01/ > > It takes advantage of new methods from 8244504. > > Roland. 
From tobias.hartmann at oracle.com Fri May 29 09:41:42 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 29 May 2020 11:41:42 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> Message-ID: <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> Hi Nils, On 27.05.20 22:42, Nils Eliasson wrote: > New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ Very nice findings and cleanup! Looks good to me, below are some minor comments. globals.hpp - line 1723: "controling" -> "controlling" - line 1724: "percent" -> "percentage" sweeper.cpp - line 258: ifs should be merged - line 484: "There can be data races on _bytes_changed. The data races are benign, since it does not matter if we loose a couple of bytes." This is not true anymore with atomic loads/stores, right? - Not sure if 'should_start_sweep' really needs it's own method. Inlining it into the single caller would have all the accesses to _bytes_changed in one place. sweeper.hpp - line 73: empty comment Best regards, Tobias From robbin.ehn at oracle.com Fri May 29 10:35:22 2020 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 29 May 2020 12:35:22 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> Message-ID: Hi Nils, On 2020-05-29 11:41, Tobias Hartmann wrote: > Hi Nils, > > On 27.05.20 22:42, Nils Eliasson wrote: >> New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ > > Very nice findings and cleanup! Looks good to me, below are some minor comments. Looks good, thanks for fixing. 
(minus Tobias nits below) Thanks, Robbin > > globals.hpp > - line 1723: "controling" -> "controlling" > - line 1724: "percent" -> "percentage" > > sweeper.cpp > - line 258: ifs should be merged > - line 484: "There can be data races on _bytes_changed. The data races are benign, since it does not > matter if we loose a couple of bytes." This is not true anymore with atomic loads/stores, right? > - Not sure if 'should_start_sweep' really needs it's own method. Inlining it into the single caller > would have all the accesses to _bytes_changed in one place. > > sweeper.hpp > - line 73: empty comment > > Best regards, > Tobias > From rwestrel at redhat.com Fri May 29 11:15:05 2020 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 29 May 2020 13:15:05 +0200 Subject: RFR(M): 8223051: support loops with long (64b) trip counts In-Reply-To: <0CD2D156-D877-40AF-8FE6-CF5C64F127D9@oracle.com> References: <87lfmd8lip.fsf@redhat.com> <87h7wv7jny.fsf@redhat.com> <601CD9EB-C4E2-413E-988A-03CE5DE9FB00@oracle.com> <87y2q55rj4.fsf@redhat.com> <497B34CC-BA72-4674-8C5A-CF04DEF0CDC2@oracle.com> <87lflcyz67.fsf@redhat.com> <0CD2D156-D877-40AF-8FE6-CF5C64F127D9@oracle.com> Message-ID: <87d06nyoue.fsf@redhat.com> > I noticed one more corner case that might be good to address. > > + // We can't iterate for more than max int at a time. > + if (stride_con != (jint)stride_con || ABS(stride_con) >= max_jint) { > + return false; > + } > > Should be: > > + // We can't iterate for more than max int at a time. > ++ if (stride_con != (jint)stride_con || ABS(stride_con) * (1+MIN_ITER) >= max_jint) { > + return false; > + } > > Where MIN_ITER is some constant that defines the minimum > number of iterations that the strip-mined int-loop should run. > If the stride is very large (nearly max_jint) then the int-loop > will only run once, or just a few times. I think MIN_ITER > should be at least 10. 
> > I suggest hardwiring it to 10 locally in loopnode.cpp, and > making it a tunable parameter later on if we actually > run into trouble with it. But we won't; nobody is going > to write loops with strides on the order of max_jint. > In fact, you can leave out this suggestion altogether, > if you are not comfortable with it, and we just take the > odd performance hit if someone does something that > strange. I thought about this a bit when I prepared the change and I left the code as is, so that as many loop transformations as possible are performed to shake out bugs, thinking it could be revised later. Roland. From tobias.hartmann at oracle.com Fri May 29 12:46:38 2020 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 29 May 2020 14:46:38 +0200 Subject: [15] RFR(XS): 8246153: TestEliminateArrayCopy fails with -XX:+StressReflectiveCode Message-ID: <66126747-f74c-57f2-8960-4cfa603fbf10@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8246153 http://cr.openjdk.java.net/~thartmann/8246153/webrev.00/ With -XX:+StressReflectiveCode, loads from the layout helper emitted by GraphKit::get_layout_helper are not folded (usually done via LoadNode::Value -> LoadNode::load_array_final_field). As a result, the control input of the AllocateNode does not directly point to the MemBar but to the initial_slow_test emitted by GraphKit::new_instance that has not been folded either. Instead of using the control input to find the MemBar when removing allocations after scalar replacement, we should simply use the memory input. Thanks, Tobias From zhuoren.wz at alibaba-inc.com Fri May 29 12:36:21 2020 From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren)) Date: Fri, 29 May 2020 20:36:21 +0800 Subject: Re: [aarch64-port-dev ] RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap In-Reply-To: References: , Message-ID: Update patch.
A jtreg test has been added: http://cr.openjdk.java.net/~wzhuo/8246051/webrev.01/ Regards, Zhuoren ------------------------------------------------------------------ From:Nick Gasson Sent At:2020 May 29 (Fri.) 09:54 To:Sandler Cc:hotspot-compiler-dev@openjdk.java.net ; aarch64-port-dev Subject:Re: [aarch64-port-dev ] RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap On 05/28/20 20:43 PM, Wang Zhuo wrote: > I found that on aarch64, SIGBUS happens when Unsafe compareAndSwapLong and compareAndSwapInt were used to access unaligned mem address in interpreter mode. In compiled code, InternalError will be thrown. We should fix the crash and throw InternalError in interpreter too. > Please help review this patch. > > BUG: https://bugs.openjdk.java.net/browse/JDK-8246051 > CR: http://cr.openjdk.java.net/~wzhuo/8246051/webrev.00/ > Could you add a jtreg test for this? -- Nick From vladimir.kozlov at oracle.com Fri May 29 16:52:02 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 29 May 2020 09:52:02 -0700 Subject: [15] RFR(XS): 8246153: TestEliminateArrayCopy fails with -XX:+StressReflectiveCode In-Reply-To: <66126747-f74c-57f2-8960-4cfa603fbf10@oracle.com> References: <66126747-f74c-57f2-8960-4cfa603fbf10@oracle.com> Message-ID: Looks good. Thanks, Vladimir On 5/29/20 5:46 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8246153 > http://cr.openjdk.java.net/~thartmann/8246153/webrev.00/ > > With -XX:+StressReflectiveCode, loads from the layout helper emitted by GraphKit::get_layout_helper > are not folded (usually done via LoadNode::Value -> LoadNode::load_array_final_field). As a result, > the control input of the AllocateNode does not directly point to the MemBar but to the > initial_slow_test emitted by GraphKit::new_instance that has not been folded either.
> > Instead of using the control input to find the MemBar when removing allocations after scalar > replacement, we should simply use the memory input. > > Thanks, > Tobias > From manc at google.com Fri May 29 19:03:55 2020 From: manc at google.com (Man Cao) Date: Fri, 29 May 2020 12:03:55 -0700 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> Message-ID: Hi Nils, > New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ Thanks for fixing and looks good to me as well! -Man From nils.eliasson at oracle.com Fri May 29 20:12:33 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 29 May 2020 22:12:33 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> Message-ID: On 2020-05-29 11:41, Tobias Hartmann wrote: > Hi Nils, > > On 27.05.20 22:42, Nils Eliasson wrote: >> New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ > Very nice findings and cleanup! Looks good to me, below are some minor comments. > > globals.hpp > - line 1723: "controling" -> "controlling" > - line 1724: "percent" -> "percentage" Fixed. > > sweeper.cpp > - line 258: ifs should be merged > - line 484: "There can be data races on _bytes_changed. The data races are benign, since it does not > matter if we loose a couple of bytes." This is not true anymore with atomic loads/stores, right? > - Not sure if 'should_start_sweep' really needs it's own method. Inlining it into the single caller > would have all the accesses to _bytes_changed in one place. Fixed > sweeper.hpp > - line 73: empty comment Fixed. 
> > Best regards, > Tobias New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.04 Thanks for the review! Nils Eliasson From nils.eliasson at oracle.com Fri May 29 20:13:05 2020 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 29 May 2020 22:13:05 +0200 Subject: RFR(M): 8244660: Code cache sweeper heuristics is broken In-Reply-To: References: <0688678b-986b-082c-425e-543c3c32b094@oracle.com> <1e06ca0e-803a-416f-2313-0f9e53aa94ba@oracle.com> <577ab253-b878-92b0-b170-14bac54173a4@oracle.com> Message-ID: <68b43d17-8427-1883-c665-2931690dec9f@oracle.com> Thank you Man and Robbin! Best regards, Nils On 2020-05-29 21:03, Man Cao wrote: > Hi Nils, > > > New webrev: http://cr.openjdk.java.net/~neliasso/8244660/webrev.03/ > > Thanks for fixing and looks good to me as well! > > -Man From xxinliu at amazon.com Fri May 29 23:24:49 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Fri, 29 May 2020 23:24:49 +0000 Subject: RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> <0B03A385-BC1F-41B9-8B8F-02056BD5A706@amazon.com> <40eed1f3-27b9-5263-16c1-7563a6ff9082@arm.com> Message-ID: Hello, Since JDK-8245986 (aarch64) has been resolved, may I ask a sponsor to push this change? It is the last mile of JDK-8230552. http://cr.openjdk.java.net/~xliu/8230552/02/webrev/ s390 has been reviewed by Martin. Thanks to Volker and OSU, I verified the new stop() mechanism on a ppc64le host. Thanks, --lx On 5/28/20, 3:03 AM, "Andrew Haley" wrote: On 27/05/2020 21:09, Liu, Xin wrote: > Yes, it's better. I compared the two stack traces, and the C frames in the previous stack trace do indeed distract users from understanding their own problems. > I reviewed Andrew's webrev. It looks good to me.
> I am happy to see that you solved the rscratch1 clobber problem in such
> an elegant way!

Great, thanks.

> Just one thing: for this instruction emit_int64((intptr_t)msg), can we safely say a pointer is always 64-bit on aarch64?

That's a fair point. I'll change it: there's no need for that code to depend on pointer size.

> According to the Arm documentation, in theory, aarch64 has the ILP32 data
> model, but I don't think we have ever used ILP32 on aarch64.

If anyone ever makes ILP32 work, we'll be happy to fix HotSpot to run on it.

> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0490a/ar01s01.html
>
> May I ask Andrew to sponsor my patch when you push JDK-8245986?
> Now it becomes trivial. http://cr.openjdk.java.net/~xliu/8230552/02/webrev/

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From igor.ignatyev at oracle.com Sat May 30 00:12:13 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 29 May 2020 17:12:13 -0700
Subject: RFR(L): 8229186: Improve error messages for TestStringIntrinsics failures
In-Reply-To: <1558df02-2c3d-6c0e-cb7b-c06d17bb2a66@oracle.com>
References: <2fa31b05-c37a-1367-a7dc-5ae2b13133be@oracle.com> <1558df02-2c3d-6c0e-cb7b-c06d17bb2a66@oracle.com>
Message-ID:

Hi Evgeny,

in general, I like the idea, yet I feel it would be more useful if ArrayDiff were also used by the Assert class. Do you plan to do that in a separate RFE?
I also have a number of comments regarding the patch:

test/hotspot/jtreg/compiler/intrinsics/string/TestStringIntrinsics.java:
- I'd prefer invokeAndCompareArrays and invokeAndCheck to be as close as possible: have both of them accept either boolean or Object as the 2nd arg; print/throw the same error message
- in invokeAndCompareArrays, it seems to be useful to have at least one array printed out when expectedResult != result and expectedResult is true

test/lib-test/jdk/test/lib/format/ArrayDiffTest.java:
- it's more common in our code to have the opening curly bracket of a class definition on the same line as the class name
- you don't need to @compile ArrayDiffTest.java as jtreg should compile it automatically for you by @run.
- copyright year shouldn't have 2019 listed there, unless you wrote this code in the prev. year and just didn't have a chance to integrate it

test/lib/jdk/test/lib/format/ArrayDiff.java:
- similar comment about copyright year
- I don't like FAILURE_MARKS, you are making an assumption about the maximal length of an object's string representation, you can easily create a mark by `new String("^").repeat(n)`, yes it might be less memory efficient, but that's a test aux library after all
- per code style guidelines, all loops and if-s, even one-line ones, should have { }, e.g. L#185, L#192, L#218
- there shouldn't be a space b/w function name and opening parenthesis, e.g. L#183
- 'else if' should be on the same line as the closing curly bracket, e.g. L#91
- I don't see why ELLIPSIS is defined in Format when it's used in ArrayDiff, I'd rather see it as a private static final field in ArrayDiff, and if someone else needs '...'
  string, they can create one
- the same for PADDINGS; + PADDINGS seems to be highly coupled w/ how ArrayDiff works and has the same problem as FAILURE_MARKS -- you can easily get IOOBE
- ArrayCodec::findMismatchIndex assumes that there are no nulls in source, it's better to use java.util.Objects.equals
- maybe I'm missing smth, but I don't understand why ArrayCodec supports only char and byte arrays; and hence I don't understand why you need ArrayCodec::of methods, as you can simply do new ArrayCodec(Arrays.stream(a).collect(Collectors.toList())) where a is an array of any type
- it'd be appreciated if all public methods had javadoc which describes all parameters using @param
- it seems that ArrayCodec should be an inner static class of ArrayDiff

test/lib/jdk/test/lib/format/Diff.java:
- similar comment about copyright year
- could you please add javadoc to both methods?

test/lib/jdk/test/lib/format/Format.java:
- typo s/maximul/maximal
- shouldn't asLiteral call asLiteral(String.valueOf(o)) at L#58?
- typo s/it's/its at L#45
- it'd be appreciated if all public methods had javadoc which describes all parameters using @param

there are a few other code-style/editorial nonconformities (e.g. space before ')' or javadoc comment doesn't have leading empty line) I've noticed, but I haven't written down the place where I saw them, so I'll make another pass after you address/answer the code-related comments.

Thanks,
-- Igor

> On May 28, 2020, at 12:33 PM, Evgeny Nikitin wrote:
>
> Forwarding to the compiler mailing list as this changes a test in the Compiler area.
>
> On 2020-05-18 16:46, Evgeny Nikitin wrote:
>> Hi,
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229186
>> Webrev: http://cr.openjdk.java.net/~enikitin/8229186/webrev.00/
>> Error reporting was improved by writing C-style escaped string representations for the variables passed to the methods being tested. For array comparisons, a dedicated diff-formatter was implemented.
>> Sample output for comparing byte arrays (with artificial failure):
>> ----------System.err:(21/1553)----------
>> Result: (false) of 'arrayEqualsB' is not equal to expected (true)
>> Arrays differ starting from [index: 7]:
>> ... 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, ...
>> ... 5, 6, 125, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, ...
>>           ^^^^
>> java.lang.RuntimeException: Result: (false) of 'arrayEqualsB' is not equal to expected (true)
>>     at compiler.intrinsics.string.TestStringIntrinsics.invokeAndCheckArrays(TestStringIntrinsics.java:273)
>>     at ... stack trace continues - E.N.
>> Sample output for comparing char arrays:
>> ----------System.err:(21/1579)*----------
>> Result: (false) of 'arrayEqualsC' is not equal to expected (true)
>> Arrays differ starting from [index: 7]:
>> ... \\u0005, \\u0006, \\u0007, \\u0008, \\u0009, \\n, \\u000B, \\u000C, \\r, \\u000E, ...
>> ... \\u0005, \\u0006, }, \\u0008, \\u0009, \\n, \\u000B, \\u000C, \\r, \\u000E, ...
>>                       ^^^^^^^
>> java.lang.RuntimeException: Result: (false) of 'arrayEqualsC' is not equal to expected (true)
>>     at compiler.intrinsics.string.TestStringIntrinsics.invokeAndCheckArrays(TestStringIntrinsics.java:280)
>>     at ... and so on - E.N.
>> Please review.
>> Thanks in advance,
>> /Evgeny Nikitin.

From aph at redhat.com Sat May 30 20:14:55 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 30 May 2020 21:14:55 +0100
Subject: [aarch64-port-dev ] RFR:8246051:[AArch64]SIGBUS by unaligned Unsafe compare_and_swap
In-Reply-To:
References:
Message-ID: <497b376c-561c-c40c-add6-a63af8736a3c@redhat.com>

On 29/05/2020 13:36, Wang Zhuo(Zhuoren) wrote:
> Update patch. A jtreg test added
> http://cr.openjdk.java.net/~wzhuo/8246051/webrev.01/

The test is AArch64-only but the patch is to shared code. This doesn't make sense to me.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
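
Two of the suggestions in the 8229186 review earlier in this digest (build the failure mark on demand with String.repeat instead of indexing a fixed FAILURE_MARKS table, and compare elements with java.util.Objects.equals so that nulls are handled) can be sketched roughly as follows. This is only an illustration under assumptions: the class name ArrayDiffSketch and the methods findMismatchIndex and failureMark are hypothetical and are not taken from the webrev, and the output layout is simplified.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration; none of these names come from the webrev.
final class ArrayDiffSketch {

    /** Returns the first index at which the two lists differ, or -1 if they are equal. */
    static int findMismatchIndex(List<?> first, List<?> second) {
        int common = Math.min(first.size(), second.size());
        for (int i = 0; i < common; i++) {
            // Objects.equals treats two nulls as equal and never throws an NPE.
            if (!Objects.equals(first.get(i), second.get(i))) {
                return i;
            }
        }
        // Identical common prefix: the lists differ only if their sizes differ.
        return first.size() == second.size() ? -1 : common;
    }

    /** Builds a marker line of any width: 'offset' spaces followed by 'width' carets. */
    static String failureMark(int offset, int width) {
        return " ".repeat(offset) + "^".repeat(width);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("5", "6", "7", null);
        List<String> actual = Arrays.asList("5", "6", "125", null);

        int index = findMismatchIndex(expected, actual);
        System.out.println(String.join(", ", expected));
        System.out.println(String.join(", ", actual));
        if (index >= 0) {
            // Column of the mismatching element: preceding elements plus ", " separators.
            int offset = 0;
            for (int i = 0; i < index; i++) {
                offset += String.valueOf(actual.get(i)).length() + 2;
            }
            int width = Math.max(String.valueOf(expected.get(index)).length(),
                                 String.valueOf(actual.get(index)).length());
            System.out.println(failureMark(offset, width));
        }
    }
}
```

Because the mark is generated for the requested width, there is no table bound to overrun, which is the IOOBE concern raised for FAILURE_MARKS and PADDINGS. String.repeat requires JDK 11 or later.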