RFR: 8170785: Excessive allocation in ParseUtil.encodePath

Wed Jan 4 17:29:48 UTC 2017

Hi Roger,

On 2017-01-04 16:57, Roger Riggs wrote:
> Hi Claes,
>
> ok, I didn't spot any bugs so its fine as is.

thanks!

>
>
> On 1/3/2017 5:52 PM, Claes Redestad wrote:
>> Hi Roger,
>>
>> On 2017-01-03 22:22, Roger Riggs wrote:
>>> Hi Claes,
>>>
>>> So Windows didn't suffer from having the '\' separator.
>>
>> relatively speaking Windows won't see any speed-up when dealing
>> with file system paths, but most places where we see encodePath in
>> profiles is actually not files: jar/runtime image paths, HTTP etc...
>>
>> Thus Windows shouldn't be left too far behind for typical workloads. :-)
>>
>>>
>>> ParseUtil:
>>>
>>> firstEncodeIndex:121:
>>>   'a' - 'z' seems more frequent than '/' or '.'; does it improve the
>>> stats to move that range to the beginning of the if.
>>>   (yes the compiler can re-order).
>>>
>>> line 125:
>>>     Since 127 is known to need encoding it could be >= 0x007f
>>>
>>> line  136: I suppose the arraycopy intrinsic already optimizes length
>>> == 0;
>>>
>>> Line 134:  I question the math on  * 2 + 16 -index;  (But this is
>>> pre-existing code)
>>>    if there were lots of characters that needed encoding it might be
>>> possible to overflow the array since 1 char is replaced by at least 3
>>> and up to 9.
>>>    16 seems like a questionable fudge factor; but perhaps it has not
>>> been a problem in practice.
>>
>> As the main objective here is to get rid of the allocations while not
>> regressing throughput when they can't be avoided I resisted *most*
>> urges to micro-optimize. :-)
>>
>> The weird heuristic with len * 2 + 16 is ugly, yes, but I prefer to
>> leave it mostly alone and perhaps revisit this for a throughput
>> performance enhancement in the future.
>>
>> '/' can actually be rather frequent in paths, and while '.' is likely
>> less common and should have been inserted after a-z, adding the check
>> improved performance of encoding to-be-rather-common paths like
>> "/java.base/bla/bla/bla/" remarkably while not visibly regressing
>> anything else.
>>
>> What is clear from micros is that having to load and call into the
>> BitSet has a relatively large overhead compared to an extra comparison
>> (which is likely why there are separate checks for a-z, A-Z and 0-9 in
>> the first place).
> A good optimizer should be able to bisect and get the result efficiently
> given the constant values.
>>
>> I should actually see if we can't remove the BitSet entirely: I think
>> the java.net.URI approach with final long masks and a simple shift and
>> check might be more efficient and relatively easy to duplicate here.
> A couple of fixed 64-bit compile time bitmasks computed from the set of
> encoded chars would be pretty efficient and compact too.
> I did wonder if there was a way to have a common utility for the encoding;
> but that's probably something to save for later.

Right, there are facilities for this in ParseUtil already (see match,
lowMask, highMask etc), and while not (javac) compile time constants
right now they probably could be.

Tested applying this approach to replace the BitSet, which seem to
improve thrpt slightly in isolation, but calling match is still quite a
bit slower than the sequence of range checks (c >= 'a' && c <= 'z')
etc, so in the end it wouldn't really change that much.

I'll defer further investigation to the future and go ahead and push
the patch as-is.

Thanks!

/Claes

>
> Roger
>>
>> Thanks!
>>
>> /Claes
>>
>>>
>>> $.02, Roger
>>>
>>>
>>>
>>> On 1/3/2017 9:46 AM, Claes Redestad wrote:
>>>> Hi,
>>>>
>>>> some users reports high allocation rates in ParseUtil.encodePath,
>>>> regardless of whether paths actually need to be encoded or not.
>>>> Since this is a commonly used utility it makes sense.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~redestad/8170785/webrev.01
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170785
>>>>
>>>> This patch provides a semantically neutral fast-path for cases when
>>>> the path does not need to be encoded (up to 5x speedup), reduces
>>>> allocation when the string has a prefix that does not need to be
>>>> encoded (1-2x speedup) and no regression when using a separator
>>>> that's not '/' or the first char needs encoding.
>>>>
>>>> Interpreted performance is not affected much either: small positive
>>>> when no encoding is needed, neutral or negligible regression
>>>> otherwise.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>
>