[EXTERNAL] Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding
Corey Ashford
cjashfor at linux.ibm.com
Fri Sep 25 09:07:10 UTC 2020
Note, this patch set is now on a new thread in the mailing list, due to
the switchover from Mercurial to Git.
Regards,
- Corey
On 9/9/20 4:32 PM, Corey Ashford wrote:
> On 9/9/20 2:04 PM, Roger Riggs wrote:
>> Hi Corey,
>>
>> Right, the continue was so it would go back and check if the
>> conversion was
>> complete. An alternative would be to repeat the check and return if
>> there was
>> no bytes left to process.
>
> Another issue I just discovered is that the way the loop is structured,
> decodeBlock could be called multiple times in the event that isMIME is
> true, and in that case, decodeBlock will try to write into dst[]
> starting at offset 0 again.
>
> My original intention was for the intrinsic to be called a single time
> because it never attempted process bytes in the isMIME==true case, and
> because of that, the offset into the destination buffer would always be
> zero. With this loop, on the second and later calls, the offset into
> dst[] should be non-zero. This means that I also need to pass dp into
> decodeBlock. That necessitates a change in the parameter passing down
> to the intrinsic. Not a big deal, but it is a ripple.
>
> I'll get working on it.
>
> The upside of this change is that it makes the decode and encode
> intrinsics closely mirror each other, and handles the isMIME==true case
> as a happy side-effect. With the overhead of the call to the intrinsic,
> it's not clear there will be a performance gain when isMIME==true, but a
> benchmark should make that clear. I'm guessing maybe 1.5X to 2X is
> about the best that could be expected when linemax is the default 76.
>
> - Corey
>
>>
>> Thanks, Roger
>>
>> On 9/9/20 3:13 PM, Corey Ashford wrote:
>>> On 9/4/20 8:07 AM, Roger Riggs wrote:
>>>> Hi Corey,
>>>>
>>>> The idea I had in mind is refactoring the fast path into the method
>>>> you call decodeBlock.
>>>> Base64: lines 751-768.
>>>>
>>>> It leaves all the unknown/illegal character handling to the Java code.
>>>> And yes, it does not need to handle MIME, except to return on
>>>> illegal characters.
>>>>
>>>> The patch is attached.
>>>
>>> Ah, I see what you mean now, and thanks for the patch! The patch as
>>> presented doesn't work, however, because the intrinsic processes
>>> fewer bytes than are in the src buffer, and then executes a
>>> "continue;", which then proceeds to loop infinitely because the
>>> intrinsic won't process any more bytes after that.
>>>
>>> I tried dropping the continue, but that doesn't work because the Java
>>> (non-intrinsic) code processes all of the bytes, and the line of code
>>> following the loop accesses one byte after the end of the src buffer
>>> causing an array bounds error.
>>>
>>> So this needs to be re-thought a little, but it shouldn't be too
>>> difficult. I will work on it.
>>>
>>> Regards,
>>>
>>> - Corey
>>>
>>>>
>>>> Regards, Roger
>>>>
>>>>
>>>>
>>>> On 8/31/20 6:22 PM, Corey Ashford wrote:
>>>>> On 8/29/20 1:19 PM, Corey Ashford wrote:
>>>>>> Hi Roger,
>>>>>>
>>>>>> Thanks for your reply and thoughts! Comments interspersed below:
>>>>>>
>>>>>> On 8/28/20 10:54 AM, Roger Riggs wrote:
>>>>> ...
>>>>>>> Comparing with the way that the Base64 encoder was intrinsified, the
>>>>>>> method that is intrinsified should have a method body that does
>>>>>>> the same function, so it is interchangable. That likely will
>>>>>>> just shift
>>>>>>> the "fast path" code into the decodeBlock method.
>>>>>>> Keeping the symmetry between encoder and decoder will
>>>>>>> make it easier to maintain the code.
>>>>>>
>>>>>> Good point. I'll investigate what this looks like in terms of the
>>>>>> actual code, and will report back (perhaps in a new webrev).
>>>>>>
>>>>>
>>>>> Having looked at this again, I don't think it makes sense. One
>>>>> thing that differs significantly from the encodeBlock intrinsic is
>>>>> that the decodeBlock intrinsic only needs to process a prefix of
>>>>> the data, and so it can leave virtually any amount of data at the
>>>>> end of the src buffer unprocessed, where as with the encodeBlock
>>>>> intrinsic, if it exists, it must process the entire buffer.
>>>>>
>>>>> In the (common) case where the decodeBlock intrinsic returns not
>>>>> having processed everything, it still needs to call the Java code,
>>>>> and if that Java code is "replaced" by the intrinsic, it's
>>>>> inaccessible.
>>>>>
>>>>> Is there something I'm overlooking here? Basically I want the
>>>>> decode API to behave differently than the encode API, mostly to
>>>>> make the arch-specific intrinsic easier to implement. If that's not
>>>>> acceptable, then I need to rethink the API, and also figure out how
>>>>> to deal with the illegal character case. The latter could perhaps
>>>>> be done by throwing an exception from the intrinsic, or maybe by
>>>>> returning a negative length that specifies the index of the illegal
>>>>> src byte, and then have the Java code throw the exception).
>>>>>
>>>>> Regards,
>>>>>
>>>>> - Corey
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list