RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Lindenmaier, Goetz
goetz.lindenmaier at sap.com
Wed Jul 18 07:34:48 UTC 2018
Hi Gustavo,
I had a look at your change. Basically looks good.
Some smaller things:
2440 Otherwise, counter will be increment
2441 // more than once.
This should read:
Otherwise, the counter will be incremented more than once.
Can you please put declaration and assignments on one line?
2472 int abortX_offs;
2473 abortX_offs = RTMLockingCounters::abortX_count_offset();
This should read:
2472 int abortX_offs = RTMLockingCounters::abortX_count_offset();
Similar 2479+2481.
I don't need a new webrev for these fixes. Reviewed.
Best regards,
Goetz.
> -----Original Message-----
> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> Sent: Dienstag, 17. Juli 2018 15:45
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
> Cc: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-
> dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested
> transactions
>
> Hi,
>
> Could I get a second review for that change please?
>
> Best regards,
> Gustavo
>
> On 07/17/2018 08:18 AM, Gustavo Romero wrote:
> > Hi Martin,
> >
> > OK. Let's continue with v4_B [1] so.
> >
> > Thanks for reviewing it!
> >
> >
> > Best regards,
> > Gustavo
> >
> > [1] http://cr.openjdk.java.net/~gromero/8205582/v4_B
> >
> > On 07/17/2018 06:02 AM, Doerr, Martin wrote:
> >> Hi Gustavo,
> >>
> >> your webrev v4_B looks good. Reviewed.
> >>
> >> I think v4_A wouldn't work appropriately when using +1 and -1 bits
> together.
> >>
> >> A generic version could be something like (just to explain what I was
> thinking about):
> >> for (int nbit = 0; nbit < num_failure_bits; nbit++) {
> >> Label do_increment, check_abort;
> >>
> >> int last_match = -1;
> >> for (ncounter = 0; ncounter < num_counters; ncounter++) {
> >> if (last_match >= 0) {
> >> rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0);
> >> int selection = bit_counter_map[last_match][ncounter];
> >> if (selection == 1) {
> >> bne(CCR1, do_increment);
> >> } else if (selection == -1) {
> >> beq(CCR1, do_increment);
> >> }
> >> }
> >> last_match = nbit;
> >> }
> >>
> >> assert(last_match >= 0, "should have at least one");
> >> rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0);
> >> int selection = bit_counter_map[last_match][ncounter];
> >> if (selection == 1) {
> >> beq(CCR1, check_abort);
> >> } else if (selection == -1) {
> >> bne(CCR1, check_abort);
> >> }
> >> bind(do_increment);
> >> ld(temp_Reg, abort_counter_offs, rtm_counters_Reg);
> >> addi(temp_Reg, temp_Reg, 1);
> >> std(temp_Reg, abort_counter_offs, rtm_counters_Reg);
> >> bind(check_abort);
> >> }
> >>
> >> But I'm fine with webrev v4_B with the comment you have added.
> >>
> >> Thanks,
> >> Martin
> >>
> >>
> >> -----Original Message-----
> >> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> >> Sent: Dienstag, 17. Juli 2018 08:55
> >> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
> >> Cc: ppc-aix-port-dev at openjdk.java.net
> >> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on
> nested transactions
> >>
> >> Hi Martin,
> >>
> >> On 07/16/2018 05:17 AM, Doerr, Martin wrote:
> >>> thanks for the new webrev.
> >>
> >> Thanks a lot for the thorough review.
> >>
> >>
> >>> You're right, the two bits should be mutual exclusive, so the sum is
> equivalent to the logical or in this case.
> >>> However, the loop pretends to be generic, but it's not. The "or" only
> works for mutual exclusive bits.
> >>> If you want to keep the code with the loops, I think there should be a
> comment explaining this.
> >>
> >> I see. I've experimented a couple of options following your previous
> suggestion
> >> of using the counter loop as the outer loop. Most difficult point was to
> work
> >> around the constraint of having just R0 available as scratch. Bit/bitfield
> >> extracting instrs did not help much since most of them can't perform an
> OR with
> >> its destination operand, which gets worse with the R0 constraint. On the
> other,
> >> afaics 'rldimi' which does an OR with its destination operand can't be used
> to
> >> extract the bit/bitfields in a generic way. That best alternative I found was
> to
> >> use both CCR0 and CCR1 and their EQ bits. I think all cases are covered
> now,
> >> i.e. all bits/conditions for a given counter are ORed. failure_code logic is
> not
> >> inverted any more in the map, which seems more natural. I added also
> more
> >> information to the comment before the bit/counter map.
> >> I think now the loop is generic.
> >> Webrev for the final result:
> >>
> >> http://cr.openjdk.java.net/~gromero/8205582/v4_A/
> >>
> >>
> >>> Please also fix indentation.
> >>
> >> Done.
> >>
> >>
> >>> Alternatively, the loop could get replaced by some code for each bit.
> Given that the loop is not really generic, I think this wouldn't be worse.
> >>>
> >>> Besides that, it looks good. Thanks for improving it.
> >>
> >> Due to the tight schedule I also provide the corrections for the last
> reviewed
> >> version:
> >>
> >> http://cr.openjdk.java.net/~gromero/8205582/v4_B/
> >>
> >> If v4_A also looks good I vouch for pushing it instead of v4_B.
> >>
> >>
> >> Best regards,
> >> Gustavo
> >>
> >>> Best regards,
> >>> Martin
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> >>> Sent: Freitag, 13. Juli 2018 16:56
> >>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
> >>> Cc: ppc-aix-port-dev at openjdk.java.net
> >>> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on
> nested transactions
> >>>
> >>> Hi Martin,
> >>>
> >>> On 07/12/2018 12:30 PM, Doerr, Martin wrote:
> >>>> I think your new code may increment a counter twice when 2 bits are
> specified for it. I think this should get fixed.
> >>>> Iterating over the RTMLockingCounters in the outer loop and
> performing several checks before incrementing should fix this, right?
> >>>
> >>> Do you mean increment twice counter #2 (conflict counter) or another
> counter?
> >>> non_trans_cf and trans_cf are mutually exclusive.
> >>> You probably spotted a case I'm missing so I ask.
> >>>
> >>>
> >>>> Besides that, I have some improvement proposals:
> >>>>
> >>>> - I think using 3 constant tables is not so good to read. For example,
> inverting could be encoded in your new 2 dimensional table by using -1, 0 , +1
> when using int instead of bool. Would this be better?
> >>>>
> >>>> - I think you can get rid of rtm_counters_Reg increment and restoration
> by computing the abort_offs relative to the original value.
> >>>
> >>> Cool :)
> >>>
> >>> New (interim) webrev:
> >>> http://cr.openjdk.java.net/~gromero/8205582/v3/
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> Best regards,
> >>> Gustavo
> >>>
> >>>> Best regards,
> >>>> Martin
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> >>>> Sent: Dienstag, 10. Juli 2018 18:59
> >>>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
> >>>> Cc: ppc-aix-port-dev at openjdk.java.net
> >>>> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on
> nested transactions
> >>>>
> >>>> Dear Martin,
> >>>>
> >>>> On 06/25/2018 01:31 PM, Doerr, Martin wrote:
> >>>>> I think it would be better to ignore bit 63 in the macroAssembler code
> and use the definition from the spec in assembler_ppc.hpp.
> >>>>> Somebody may want to use the definition for other purposes.
> >>>>
> >>>> Done. transactional_level bit is 52 now so bits 52:62 can easily be
> extracted
> >>>> using 'rldicr' in macroAssembler.
> >>>>
> >>>>
> >>>>> I wonder if Assembler::tm_trans_cf | Assembler::tm_non_trans_cf
> would be a better match for x86's description for tm_failure_bit[2]. It's also a
> little unfortunate to print the same bit twice as tm_failure_bit[4].
> >>>>
> >>>> Done. Now both tm_trans_cf and tm_non_trans_cf failures will
> increment counter 2
> >>>> (conflict). Duplicated check code for tm_failure_bit[4] was removed
> and now
> >>>> counter 4 (debug) is mapped to count traps or syscalls caught in TM
> events,
> >>>> which seems a reasonable approximation to the original semantics of
> the debug
> >>>> counter on Intel. Unfortunately I could not confirm on AIX how these
> two events
> >>>> (trap and syscall in TM) will set the failure code, so the counter will
> never
> >>>> track any information on AIX. But with the current proposed change
> that failure
> >>>> code can be easily added in the future.
> >>>>
> >>>> I also realized that I used previously a wrong ME operand value in:
> >>>>
> >>>> + // Extract 11 bits
> >>>> + rldicr_(temp_Reg, abort_status, tm_failure_bit[i], 11);
> >>>>
> >>>> It should be 10 to extract 11 bits actually, so all extractions must be
> correct
> >>>> now.
> >>>>
> >>>>
> >>>> I hope you don't find the array map of failure bits vs counters
> overkilling.
> >>>>
> >>>> Finally, I replaced the comment:
> >>>>
> >>>> tm_tabort, // Note: Seems like signal handler sets this, too.
> >>>>
> >>>> by:
> >>>>
> >>>> tm_tabort, // Signal handler will set this too.
> >>>>
> >>>> because we just enable RTM support on Power if 'htm-nosc' is
> supported, so
> >>>> treclaim. on aborting the syscall will indeed always set Abort bit in
> TEXASR
> >>>> afaik. Since debug counter now tracks trap/syscall in TM it's possible to
> check
> >>>> that counter to verify the number of aborts cause by the kernel (if
> any).
> >>>>
> >>>> new webrev: http://cr.openjdk.java.net/~gromero/8205582/v2/
> >>>>
> >>>>
> >>>> Best regards,
> >>>> Gustavo
> >>>>
> >>>>> Best regards,
> >>>>> Martin
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> >>>>> Sent: Montag, 25. Juni 2018 10:19
> >>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Doerr, Martin
> <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> >>>>> Cc: ppc-aix-port-dev at openjdk.java.net
> >>>>> Subject: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on
> nested transactions
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Could the following change be reviewed please?
> >>>>>
> >>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8205582
> >>>>> webrev: http://cr.openjdk.java.net/~gromero/8205582/v1/
> >>>>>
> >>>>> It fixes the RTM counter for nested aborts (rtm lock aborts type 5) by
> >>>>> extracting and checking bits in the Transactional Level field of TEXASR
> >>>>> register.
> >>>>>
> >>>>> It also fixes the memory conflict counter (rtm lock aborts type 2).
> Power TM
> >>>>> status register supports two bits to inform two different types of
> memory
> >>>>> conflict between threads: non-transactional and transactional.
> According to how
> >>>>> the jtreg RTM tests are designed the memory conflict counter counts
> >>>>> non-transactional conflicts: on TestPrintPreciseRTMLockingStatistics a
> RTM lock
> >>>>> is held on a static variable while another thread without any
> synchronization
> >>>>> (non-trasactional) tries to modify the same variable. Hence that small
> >>>>> adjustment satisfies the TestPrintPreciseRTMLockingStatistics making
> it pass on
> >>>>> Power. The memory conflict counter is not used in any other place
> besides by the
> >>>>> RTM precise statistics (no decision is made by the JVM based on that
> amount).
> >>>>>
> >>>>> This change partially fixes some failures in
> >>>>> compiler/rtm/print/TestPrintPreciseRTMLockingStatistics.java
> regarding the
> >>>>> nested and memory conflict abort counters. The remaining issue will
> be fixed by
> >>>>> aborting on calling JNI (next RFR).
> >>>>>
> >>>>>
> >>>>> Thank you and best regards,
> >>>>> Gustavo
> >>>>>
> >>>>
> >>>
> >>
> >
More information about the hotspot-compiler-dev
mailing list