RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
Gustavo Romero
gromero at linux.vnet.ibm.com
Tue Jul 17 13:44:33 UTC 2018
Hi,
Could I get a second review for that change please?
Best regards,
Gustavo
On 07/17/2018 08:18 AM, Gustavo Romero wrote:
> Hi Martin,
>
> OK. Let's continue with v4_B [1] so.
>
> Thanks for reviewing it!
>
>
> Best regards,
> Gustavo
>
> [1] http://cr.openjdk.java.net/~gromero/8205582/v4_B
>
> On 07/17/2018 06:02 AM, Doerr, Martin wrote:
>> Hi Gustavo,
>>
>> your webrev v4_B looks good. Reviewed.
>>
>> I think v4_A wouldn't work appropriately when using +1 and -1 bits together.
>>
>> A generic version could be something like (just to explain what I was thinking about):
>> for (int nbit = 0; nbit < num_failure_bits; nbit++) {
>> Label do_increment, check_abort;
>>
>> int last_match = -1;
>> for (ncounter = 0; ncounter < num_counters; ncounter++) {
>> if (last_match >= 0) {
>> rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0);
>> int selection = bit_counter_map[last_match][ncounter];
>> if (selection == 1) {
>> bne(CCR1, do_increment);
>> } else if (selection == -1) {
>> beq(CCR1, do_increment);
>> }
>> }
>> last_match = nbit;
>> }
>>
>> assert(last_match >= 0, "should have at least one");
>> rldicr_(temp_Reg, abort_status_R0, failure_bit[last_match], 0);
>> int selection = bit_counter_map[last_match][ncounter];
>> if (selection == 1) {
>> beq(CCR1, check_abort);
>> } else if (selection == -1) {
>> bne(CCR1, check_abort);
>> }
>> bind(do_increment);
>> ld(temp_Reg, abort_counter_offs, rtm_counters_Reg);
>> addi(temp_Reg, temp_Reg, 1);
>> std(temp_Reg, abort_counter_offs, rtm_counters_Reg);
>> bind(check_abort);
>> }
>>
>> But I'm fine with webrev v4_B with the comment you have added.
>>
>> Thanks,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>> Sent: Dienstag, 17. Juli 2018 08:55
>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
>> Cc: ppc-aix-port-dev at openjdk.java.net
>> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
>>
>> Hi Martin,
>>
>> On 07/16/2018 05:17 AM, Doerr, Martin wrote:
>>> thanks for the new webrev.
>>
>> Thanks a lot for the thorough review.
>>
>>
>>> You're right, the two bits should be mutual exclusive, so the sum is equivalent to the logical or in this case.
>>> However, the loop pretends to be generic, but it's not. The "or" only works for mutual exclusive bits.
>>> If you want to keep the code with the loops, I think there should be a comment explaining this.
>>
>> I see. I've experimented a couple of options following your previous suggestion
>> of using the counter loop as the outer loop. Most difficult point was to work
>> around the constraint of having just R0 available as scratch. Bit/bitfield
>> extracting instrs did not help much since most of them can't perform an OR with
>> its destination operand, which gets worse with the R0 constraint. On the other,
>> afaics 'rldimi' which does an OR with its destination operand can't be used to
>> extract the bit/bitfields in a generic way. That best alternative I found was to
>> use both CCR0 and CCR1 and their EQ bits. I think all cases are covered now,
>> i.e. all bits/conditions for a given counter are ORed. failure_code logic is not
>> inverted any more in the map, which seems more natural. I added also more
>> information to the comment before the bit/counter map.
>> I think now the loop is generic.
>> Webrev for the final result:
>>
>> http://cr.openjdk.java.net/~gromero/8205582/v4_A/
>>
>>
>>> Please also fix indentation.
>>
>> Done.
>>
>>
>>> Alternatively, the loop could get replaced by some code for each bit. Given that the loop is not really generic, I think this wouldn't be worse.
>>>
>>> Besides that, it looks good. Thanks for improving it.
>>
>> Due to the tight schedule I also provide the corrections for the last reviewed
>> version:
>>
>> http://cr.openjdk.java.net/~gromero/8205582/v4_B/
>>
>> If v4_A also looks good I vouch for pushing it instead of v4_B.
>>
>>
>> Best regards,
>> Gustavo
>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>>> Sent: Freitag, 13. Juli 2018 16:56
>>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
>>> Cc: ppc-aix-port-dev at openjdk.java.net
>>> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
>>>
>>> Hi Martin,
>>>
>>> On 07/12/2018 12:30 PM, Doerr, Martin wrote:
>>>> I think your new code may increment a counter twice when 2 bits are specified for it. I think this should get fixed.
>>>> Iterating over the RTMLockingCounters in the outer loop and performing several checks before incrementing should fix this, right?
>>>
>>> Do you mean increment twice counter #2 (conflict counter) or another counter?
>>> non_trans_cf and trans_cf are mutually exclusive.
>>> You probably spotted a case I'm missing so I ask.
>>>
>>>
>>>> Besides that, I have some improvement proposals:
>>>>
>>>> - I think using 3 constant tables is not so good to read. For example, inverting could be encoded in your new 2 dimensional table by using -1, 0 , +1 when using int instead of bool. Would this be better?
>>>>
>>>> - I think you can get rid of rtm_counters_Reg increment and restoration by computing the abort_offs relative to the original value.
>>>
>>> Cool :)
>>>
>>> New (interim) webrev:
>>> http://cr.openjdk.java.net/~gromero/8205582/v3/
>>>
>>> Thanks.
>>>
>>>
>>> Best regards,
>>> Gustavo
>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>>>> Sent: Dienstag, 10. Juli 2018 18:59
>>>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net
>>>> Cc: ppc-aix-port-dev at openjdk.java.net
>>>> Subject: Re: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
>>>>
>>>> Dear Martin,
>>>>
>>>> On 06/25/2018 01:31 PM, Doerr, Martin wrote:
>>>>> I think it would be better to ignore bit 63 in the macroAssembler code and use the definition from the spec in assembler_ppc.hpp.
>>>>> Somebody may want to use the definition for other purposes.
>>>>
>>>> Done. transactional_level bit is 52 now so bits 52:62 can easily be extracted
>>>> using 'rldicr' in macroAssembler.
>>>>
>>>>
>>>>> I wonder if Assembler::tm_trans_cf | Assembler::tm_non_trans_cf would be a better match for x86's description for tm_failure_bit[2]. It's also a little unfortunate to print the same bit twice as tm_failure_bit[4].
>>>>
>>>> Done. Now both tm_trans_cf and tm_non_trans_cf failures will increment counter 2
>>>> (conflict). Duplicated check code for tm_failure_bit[4] was removed and now
>>>> counter 4 (debug) is mapped to count traps or syscalls caught in TM events,
>>>> which seems a reasonable approximation to the original semantics of the debug
>>>> counter on Intel. Unfortunately I could not confirm on AIX how these two events
>>>> (trap and syscall in TM) will set the failure code, so the counter will never
>>>> track any information on AIX. But with the current proposed change that failure
>>>> code can be easily added in the future.
>>>>
>>>> I also realized that I used previously a wrong ME operand value in:
>>>>
>>>> + // Extract 11 bits
>>>> + rldicr_(temp_Reg, abort_status, tm_failure_bit[i], 11);
>>>>
>>>> It should be 10 to extract 11 bits actually, so all extractions must be correct
>>>> now.
>>>>
>>>>
>>>> I hope you don't find the array map of failure bits vs counters overkilling.
>>>>
>>>> Finally, I replaced the comment:
>>>>
>>>> tm_tabort, // Note: Seems like signal handler sets this, too.
>>>>
>>>> by:
>>>>
>>>> tm_tabort, // Signal handler will set this too.
>>>>
>>>> because we just enable RTM support on Power if 'htm-nosc' is supported, so
>>>> treclaim. on aborting the syscall will indeed always set Abort bit in TEXASR
>>>> afaik. Since debug counter now tracks trap/syscall in TM it's possible to check
>>>> that counter to verify the number of aborts cause by the kernel (if any).
>>>>
>>>> new webrev: http://cr.openjdk.java.net/~gromero/8205582/v2/
>>>>
>>>>
>>>> Best regards,
>>>> Gustavo
>>>>
>>>>> Best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>>>>> Sent: Montag, 25. Juni 2018 10:19
>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
>>>>> Cc: ppc-aix-port-dev at openjdk.java.net
>>>>> Subject: RFR(s): 8205582: PPC64: RTM: Fix counter for aborts on nested transactions
>>>>>
>>>>> Hi,
>>>>>
>>>>> Could the following change be reviewed please?
>>>>>
>>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8205582
>>>>> webrev: http://cr.openjdk.java.net/~gromero/8205582/v1/
>>>>>
>>>>> It fixes the RTM counter for nested aborts (rtm lock aborts type 5) by
>>>>> extracting and checking bits in the Transactional Level field of TEXASR
>>>>> register.
>>>>>
>>>>> It also fixes the memory conflict counter (rtm lock aborts type 2). Power TM
>>>>> status register supports two bits to inform two different types of memory
>>>>> conflict between threads: non-transactional and transactional. According to how
>>>>> the jtreg RTM tests are designed the memory conflict counter counts
>>>>> non-transactional conflicts: on TestPrintPreciseRTMLockingStatistics a RTM lock
>>>>> is held on a static variable while another thread without any synchronization
>>>>> (non-trasactional) tries to modify the same variable. Hence that small
>>>>> adjustment satisfies the TestPrintPreciseRTMLockingStatistics making it pass on
>>>>> Power. The memory conflict counter is not used in any other place besides by the
>>>>> RTM precise statistics (no decision is made by the JVM based on that amount).
>>>>>
>>>>> This change partially fixes some failures in
>>>>> compiler/rtm/print/TestPrintPreciseRTMLockingStatistics.java regarding the
>>>>> nested and memory conflict abort counters. The remaining issue will be fixed by
>>>>> aborting on calling JNI (next RFR).
>>>>>
>>>>>
>>>>> Thank you and best regards,
>>>>> Gustavo
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-compiler-dev
mailing list