[EXTERNAL] Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding

Fri Sep 25 09:07:10 UTC 2020

Note, this patch set is now on a new thread in the mailing list, due to 
the switchover from Mercurial to Git.

Regards,

- Corey

On 9/9/20 4:32 PM, Corey Ashford wrote:
> On 9/9/20 2:04 PM, Roger Riggs wrote:
>> Hi Corey,
>>
>> Right,  the continue was so it would go back and check if the 
>> conversion was
>> complete.  An alternative would be to repeat the check and return if 
>> there was
>> no bytes left to process.
> 
> Another issue I just discovered is that the way the loop is structured, 
> decodeBlock could be called multiple times in the event that isMIME is 
> true, and in that case, decodeBlock will try to write into dst[] 
> starting at offset 0 again.
> 
> My original intention was for the intrinsic to be called a single time 
> because it never attempted process bytes in the isMIME==true case, and 
> because of that, the offset into the destination buffer would always be 
> zero.  With this loop, on the second and later calls, the offset into 
> dst[] should be non-zero.  This means that I also need to pass dp into 
> decodeBlock.  That necessitates a change in the parameter passing down 
> to the intrinsic.  Not a big deal, but it is a ripple.
> 
> I'll get working on it.
> 
> The upside of this change is that it makes the decode and encode 
> intrinsics closely mirror each other, and handles the isMIME==true case 
> as a happy side-effect.  With the overhead of the call to the intrinsic, 
> it's not clear there will be a performance gain when isMIME==true, but a 
> benchmark should make that clear.  I'm guessing maybe 1.5X to 2X is 
> about the best that could be expected when linemax is the default 76.
> 
> - Corey
> 
>>
>> Thanks, Roger
>>
>> On 9/9/20 3:13 PM, Corey Ashford wrote:
>>> On 9/4/20 8:07 AM, Roger Riggs wrote:
>>>> Hi Corey,
>>>>
>>>> The idea I had in mind is refactoring the fast path into the method 
>>>> you call decodeBlock.
>>>> Base64: lines 751-768.
>>>>
>>>> It leaves all the unknown/illegal character handling to the Java code.
>>>> And yes, it does not need to handle MIME, except to return on 
>>>> illegal characters.
>>>>
>>>> The patch is attached.
>>>
>>> Ah, I see what you mean now, and thanks for the patch!  The patch as 
>>> presented doesn't work, however, because the intrinsic processes 
>>> fewer bytes than are in the src buffer, and then executes a 
>>> "continue;", which then proceeds to loop infinitely because the 
>>> intrinsic won't process any more bytes after that.
>>>
>>> I tried dropping the continue, but that doesn't work because the Java 
>>> (non-intrinsic) code processes all of the bytes, and the line of code 
>>> following the loop accesses one byte after the end of the src buffer 
>>> causing an array bounds error.
>>>
>>> So this needs to be re-thought a little, but it shouldn't be too 
>>> difficult.  I will work on it.
>>>
>>> Regards,
>>>
>>> - Corey
>>>
>>>>
>>>> Regards, Roger
>>>>
>>>>
>>>>
>>>> On 8/31/20 6:22 PM, Corey Ashford wrote:
>>>>> On 8/29/20 1:19 PM, Corey Ashford wrote:
>>>>>> Hi Roger,
>>>>>>
>>>>>> Thanks for your reply and thoughts!  Comments interspersed below:
>>>>>>
>>>>>> On 8/28/20 10:54 AM, Roger Riggs wrote:
>>>>> ...
>>>>>>> Comparing with the way that the Base64 encoder was intrinsified, the
>>>>>>> method that is intrinsified should have a method body that does
>>>>>>> the same function, so it is interchangable.  That likely will 
>>>>>>> just shift
>>>>>>> the "fast path" code into the decodeBlock method.
>>>>>>> Keeping the symmetry between encoder and decoder will
>>>>>>> make it easier to maintain the code.
>>>>>>
>>>>>> Good point.  I'll investigate what this looks like in terms of the 
>>>>>> actual code, and will report back (perhaps in a new webrev).
>>>>>>
>>>>>
>>>>> Having looked at this again, I don't think it makes sense. One 
>>>>> thing that differs significantly from the encodeBlock intrinsic is 
>>>>> that the decodeBlock intrinsic only needs to process a prefix of 
>>>>> the data, and so it can leave virtually any amount of data at the 
>>>>> end of the src buffer unprocessed, where as with the encodeBlock 
>>>>> intrinsic, if it exists, it must process the entire buffer.
>>>>>
>>>>> In the (common) case where the decodeBlock intrinsic returns not 
>>>>> having processed everything, it still needs to call the Java code, 
>>>>> and if that Java code is "replaced" by the intrinsic, it's 
>>>>> inaccessible.
>>>>>
>>>>> Is there something I'm overlooking here?  Basically I want the 
>>>>> decode API to behave differently than the encode API, mostly to 
>>>>> make the arch-specific intrinsic easier to implement. If that's not 
>>>>> acceptable, then I need to rethink the API, and also figure out how 
>>>>> to deal with the illegal character case. The latter could perhaps 
>>>>> be done by throwing an exception from the intrinsic, or maybe by 
>>>>> returning a negative length that specifies the index of the illegal 
>>>>> src byte, and then have the Java code throw the exception).
>>>>>
>>>>> Regards,
>>>>>
>>>>> - Corey
>>>>>
>>>>
>>>
>>
>