RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]

Paul Sandoz psandoz at openjdk.java.net
Thu Oct 15 16:02:20 UTC 2020


On Wed, 14 Oct 2020 21:29:55 GMT, CoreyAshford <github.com+51754783+CoreyAshford at openjdk.org> wrote:

>> Hi Corey,
>> 
>>> Are you thinking of a case where that produces a higher iteration count?
>> Sorry for the confusion. This is also fine:
>> length = sl - sp - 12
>> i = length / block_size
>> if (i <= 0) return 0
>> 
>> But I still wonder why we should use 2 branches. Why not
>> srawi_
>> ble(CCR0, return_zero)
>> ?
>> 
>>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it’s not a real problem with the current implementation. But I prefer
>> staying closer to the elf ABI as long as there’s no good reason to do it differently.
>>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.
>> We usually require at least 2 reviews by different people for all non-trivial changes. And I don’t consider the PPC64
>> part as trivial. In addition to that, I’m not familiar with Power 10.
>> Best regards,
>> Martin
>> 
>> 
>> From: CoreyAshford <notifications at github.com>
>> Sent: Dienstag, 13. Oktober 2020 22:59
>> To: openjdk/jdk <jdk at noreply.github.com>
>> Cc: Doerr, Martin <martin.doerr at sap.com>; Mention <mention at noreply.github.com>
>> Subject: Re: [openjdk/jdk] 8248188: Add IntrinsicCandidate and API for Base64 decoding (#293)
>> 
>> 
>> Hi Corey, thanks for taking some stuff out of the “too short” path. There may be a performance regression when decoding
>> many short arrays because of the stub call overhead and the usage of the slower part of the Java implementation. We
>> could do it a little better in many cases to compute the maximum possible iteration count i: i = (sl - sp) / block_size
>> if (i * block_size > sl - 12) i-- if (i <= 0) return 0 What do you think?  Are you thinking of a case where that
>> produces a higher iteration count? It looks effectively the same to me.  I don’t think branch prediction hints are
>> helpful for the “too short” check.
>> My thinking is that most of the time when the intrinsic is called, it will not take the early exit, but I suppose when
>> it is processing a sub-block_size buffer, it will return early every time. I will remove the hints.
>> And we should better use CCR1 instead of CCR2 which is specified as non-volatile.
>> 
>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>> 
>> Did you already find a 2nd reviewer for the PPC64 part?
>> 
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes. So, no,
>> I haven't looked for or found a second reviewer. Any suggestions? The folks on the team here have been busy with other
>> work.  Btw, I'm off today, so I will push commits to the above-mentioned issues tomorrow.
>> 
>>>> You are receiving this because you were mentioned.
>> Reply to this email directly, view it on GitHub<https://github.com/openjdk/jdk/pull/293#issuecomment-708005545>, or
>> unsubscribe<https://github.com/notifications/unsubscribe-auth/AKR64KC2K7IZFAPXVQIVYZDSKS5SLANCNFSM4RVHNW5Q>.
>
>> Hi Corey,
>> Are you thinking of a case where that produces a higher iteration count?
>> Sorry for the confusion. This is also fine: length = sl - sp - 12 i = length / block_size if (i <= 0) return 0 But I
>> still wonder why we should use 2 branches. Why not srawi_ ble(CCR0, return_zero) ?
> 
> You're right!  I originally thought that the `srawi.` was setting only the Zero bit, but it sets others as well.
> 
>> Ah, I should have checked the calling conventions. I thought all of the CR* regs are volatile. I will fix that.
>> Actually, we do save and restore all CRs, so it’s not a real problem with the current implementation. But I prefer
>> staying closer to the elf ABI as long as there’s no good reason to do it differently.
> 
> Looks like I don't need that code at all now, but it's good to know for the future; I have an encode intrinsic in the
> works.
>> Your original comment said "2nd review", so I thought you meant you need to review it again after the changes.
>> We usually require at least 2 reviews by different people for all non-trivial changes. And I don’t consider the PPC64
>> part as trivial. In addition to that, I’m not familiar with Power 10.
> 
> I received permission to request help from the GNU toolchain team here to review it.  Due to family issues and work
> schedule on my end, it will be at least the middle of next week before I can get a reviewer to have a look.
> Thanks for your continued patience and help.

Please update
[compiler/graalunit/HotspotTest.java](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/graalunit/HotspotTest.java),
and add the intrinsic signature.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293


More information about the core-libs-dev mailing list