Hi Gustavo, I had a look at your change. Basically looks good. Some smaller things: 2440 Otherwise, counter will be increment 2441 // more than once. This should read: Otherwise, the counter will be incremented more than once. Can you please put declaration and assignments on one line? 2472 int abortX_offs; 2473 abortX_offs = RTMLockingCounters::abortX_count_offset(); This should read: 2472 int abortX_offs = RTMLockingCounters::abortX_count_offset(); Similar 2479+2481. I don't need a new webrev for these fixes. Reviewed. Best regards, Goetz.
-----Original Message----- From: Gustavo Romero [mailto:gromero@linux.vnet.ibm.com] Sent: Dienstag, 17. Juli 2018 15:45 To: Lindenmaier, Goetz <goetz.lindenmaier@sap.com> Cc: Doerr, Martin <martin.doerr@sap.com>; hotspot-compiler- dev@openjdk.java.net; ppc-aix-port-dev@openjdk.java.net Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Hi,
Could I get a second review for that change please?
Best regards, Gustavo
On 07/17/2018 08:18 AM, Gustavo Romero wrote:
Hi Martin,
OK. Let's continue with v4_B [1] so.
Thanks for reviewing it!
Best regards, Gustavo
[1] http://cr.openjdk.java.net/~gromero/8205582/v4_B
On 07/17/2018 06:02 AM, Doerr, Martin wrote:
Hi Gustavo,
your webrev v4_B looks good. Reviewed.
I think v4_A wouldn't work appropriately when using +1 and -1 bits together.
A generic version could be something like (just to explain what I was thinking about): for (int nbit = 0; nbit < num_failure_bits; nbit++) { Label do_increment, check_abort;
int last_match = -1; for (ncounter = 0; ncounter < num_counters; ncounter++) { if (last_match >= 0) { rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0); int selection = bit_counter_map[last_match][ncounter]; if (selection == 1) { bne(CCR1, do_increment); } else if (selection == -1) { beq(CCR1, do_increment); } } last_match = nbit; }
assert(last_match >= 0, "should have at least one"); rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0); int selection = bit_counter_map[last_match][ncounter]; if (selection == 1) { beq(CCR1, check_abort); } else if (selection == -1) { bne(CCR1, check_abort); } bind(do_increment); ld(temp_Reg, abort_counter_offs, rtm_counters_Reg); addi(temp_Reg, temp_Reg, 1); std(temp_Reg, abort_counter_offs, rtm_counters_Reg); bind(check_abort); }
But I'm fine with webrev v4_B with the comment you have added.
Thanks, Martin
-----Original Message----- From: Gustavo Romero [mailto:gromero@linux.vnet.ibm.com] Sent: Dienstag, 17. Juli 2018 08:55 To: Doerr, Martin <martin.doerr@sap.com>; Lindenmaier, Goetz <goetz.lindenmaier@sap.com>; hotspot-compiler-dev@openjdk.java.net Cc: ppc-aix-port-dev@openjdk.java.net Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Hi Martin,
On 07/16/2018 05:17 AM, Doerr, Martin wrote:
thanks for the new webrev.
Thanks a lot for the thorough review.
You're right, the two bits should be mutual exclusive, so the sum is equivalent to the logical or in this case. However, the loop pretends to be generic, but it's not. The "or" only works for mutual exclusive bits. If you want to keep the code with the loops, I think there should be a comment explaining this.
I see. I've experimented a couple of options following your previous suggestion of using the counter loop as the outer loop. Most difficult point was to work around the constraint of having just R0 available as scratch. Bit/bitfield extracting instrs did not help much since most of them can't perform an OR with its destination operand, which gets worse with the R0 constraint. On the other, afaics 'rldimi' which does an OR with its destination operand can't be used to extract the bit/bitfields in a generic way. That best alternative I found was to use both CCR0 and CCR1 and their EQ bits. I think all cases are covered now, i.e. all bits/conditions for a given counter are ORed. failure_code logic is not inverted any more in the map, which seems more natural. I added also more information to the comment before the bit/counter map. I think now the loop is generic. Webrev for the final result:
http://cr.openjdk.java.net/~gromero/8205582/v4_A/
Please also fix indentation.
Done.
Alternatively, the loop could get replaced by some code for each bit. Given that the loop is not really generic, I think this wouldn't be worse.
Besides that, it looks good. Thanks for improving it.
Due to the tight schedule I also provide the corrections for the last reviewed version:
http://cr.openjdk.java.net/~gromero/8205582/v4_B/
If v4_A also looks good I vouch for pushing it instead of v4_B.
Best regards, Gustavo
Best regards, Martin
-----Original Message----- From: Gustavo Romero [mailto:gromero@linux.vnet.ibm.com] Sent: Freitag, 13. Juli 2018 16:56 To: Doerr, Martin <martin.doerr@sap.com>; Lindenmaier, Goetz <goetz.lindenmaier@sap.com>; hotspot-compiler-dev@openjdk.java.net Cc: ppc-aix-port-dev@openjdk.java.net Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Hi Martin,
On 07/12/2018 12:30 PM, Doerr, Martin wrote:
I think your new code may increment a counter twice when 2 bits are specified for it. I think this should get fixed. Iterating over the RTMLockingCounters in the outer loop and performing several checks before incrementing should fix this, right?
Do you mean increment twice counter #2 (conflict counter) or another counter? non_trans_cf and trans_cf are mutually exclusive. You probably spotted a case I'm missing so I ask.
Besides that, I have some improvement proposals:
- I think using 3 constant tables is not so good to read. For example, inverting could be encoded in your new 2 dimensional table by using -1, 0 , +1 when using int instead of bool. Would this be better?
- I think you can get rid of rtm_counters_Reg increment and restoration by computing the abort_offs relative to the original value.
Cool :)
New (interim) webrev: http://cr.openjdk.java.net/~gromero/8205582/v3/
Thanks.
Best regards, Gustavo
Best regards, Martin
-----Original Message----- From: Gustavo Romero [mailto:gromero@linux.vnet.ibm.com] Sent: Dienstag, 10. Juli 2018 18:59 To: Doerr, Martin <martin.doerr@sap.com>; Lindenmaier, Goetz <goetz.lindenmaier@sap.com>; hotspot-compiler-dev@openjdk.java.net Cc: ppc-aix-port-dev@openjdk.java.net Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Dear Martin,
On 06/25/2018 01:31 PM, Doerr, Martin wrote:
I think it would be better to ignore bit 63 in the macroAssembler code and use the definition from the spec in assembler_ppc.hpp. Somebody may want to use the definition for other purposes.
Done. transactional_level bit is 52 now so bits 52:62 can easily be extracted using 'rldicr' in macroAssembler.
I wonder if Assembler::tm_trans_cf | Assembler::tm_non_trans_cf would be a better match for x86's description for tm_failure_bit[2]. It's also a little unfortunate to print the same bit twice as tm_failure_bit[4].
Done. Now both tm_trans_cf and tm_non_trans_cf failures will increment counter 2 (conflict). Duplicated check code for tm_failure_bit[4] was removed and now counter 4 (debug) is mapped to count traps or syscalls caught in TM events, which seems a reasonable approximation to the original semantics of the debug counter on Intel. Unfortunately I could not confirm on AIX how these two events (trap and syscall in TM) will set the failure code, so the counter will never track any information on AIX. But with the current proposed change that failure code can be easily added in the future.
I also realized that I used previously a wrong ME operand value in:
+ // Extract 11 bits + rldicr_(temp_Reg, abort_status, tm_failure_bit[i], 11);
It should be 10 to extract 11 bits actually, so all extractions must be correct now.
I hope you don't find the array map of failure bits vs counters overkilling.
Finally, I replaced the comment:
tm_tabort, // Note: Seems like signal handler sets this, too.
by:
tm_tabort, // Signal handler will set this too.
because we just enable RTM support on Power if 'htm-nosc' is supported, so treclaim. on aborting the syscall will indeed always set Abort bit in TEXASR afaik. Since debug counter now tracks trap/syscall in TM it's possible to check that counter to verify the number of aborts cause by the kernel (if any).
new webrev: http://cr.openjdk.java.net/~gromero/8205582/v2/
Best regards, Gustavo
Best regards, Martin
-----Original Message----- From: Gustavo Romero [mailto:gromero@linux.vnet.ibm.com] Sent: Montag, 25. Juni 2018 10:19 To: Lindenmaier, Goetz <goetz.lindenmaier@sap.com>; Doerr, Martin <martin.doerr@sap.com>; hotspot-compiler-dev@openjdk.java.net Cc: ppc-aix-port-dev@openjdk.java.net Subject: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Hi,
Could the following change be reviewed please?
bug : https://bugs.openjdk.java.net/browse/JDK-8205582 webrev: http://cr.openjdk.java.net/~gromero/8205582/v1/
It fixes the RTM counter for nested aborts (rtm lock aborts type 5) by extracting and checking bits in the Transactional Level field of TEXASR register.
It also fixes the memory conflict counter (rtm lock aborts type 2). Power TM status register supports two bits to inform two different types of memory conflict between threads: non-transactional and transactional. According to how the jtreg RTM tests are designed the memory conflict counter counts non-transactional conflicts: on TestPrintPreciseRTMLockingStatistics a RTM lock is held on a static variable while another thread without any synchronization (non-trasactional) tries to modify the same variable. Hence that small adjustment satisfies the TestPrintPreciseRTMLockingStatistics making it pass on Power. The memory conflict counter is not used in any other place besides by the RTM precise statistics (no decision is made by the JVM based on that amount).
This change partially fixes some failures in compiler/rtm/print/TestPrintPreciseRTMLockingStatistics.java regarding the nested and memory conflict abort counters. The remaining issue will be fixed by aborting on calling JNI (next RFR).
Thank you and best regards, Gustavo