Unsafe.{get,put}-X-Unaligned performance
Peter Levart
peter.levart at gmail.com
Thu Mar 12 19:15:30 UTC 2015
On 03/12/2015 07:16 PM, Vitaly Davidovich wrote:
> Right, ok -- just wanted to make sure I wasn't missing something. For
> platforms that don't support unaligned access, is it expected that
> callers will be reading/writing addresses that are unaligned to the
> size of the type they're reading? My hunch is that on such platforms
> folks would tend to align their data layouts so as to avoid unaligned
> operations, in which case checking for "natural" alignment first makes
> sense. But I don't know if that's actually true or not.
It depends on usage, yes. But Java is a platform-independent platform,
and these Unsafe methods are meant to abstract away the platform dependency.
Peter
>
> On Thu, Mar 12, 2015 at 2:07 PM, Peter Levart <peter.levart at gmail.com> wrote:
>
>
>
> On 03/12/2015 06:30 PM, Vitaly Davidovich wrote:
>> Isn't the C2 intrinsic just reading the value starting at the
>> specified offset directly (when unaligned access is supported)
>> and not doing the branching?
>
> It is. This code is for those platforms not supporting unaligned
> accesses.
>
> Peter
>
>
>>
>> On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>>
>>
>> On 03/10/2015 08:02 PM, Andrew Haley wrote:
>>> The new algorithm does an N-way branch, always loading and storing
>>> subwords according to their natural alignment. So, if the address is
>>> random and the size is long it will access 8 bytes 50% of the time, 4
>>> shorts 25% of the time, 2 ints 12.5% of the time, and 1 long 12.5% of
>>> the time. So, for every random load/store we have a 4-way branch.
>>
>>
>> ...so do you think it would be better if the order of checks
>> in if/else chain:
>>
>>     public final long getLongUnaligned(Object o, long offset) {
>>         if ((offset & 7) == 0) {
>>             return getLong(o, offset);
>>         } else if ((offset & 3) == 0) {
>>             return makeLong(getInt(o, offset),
>>                             getInt(o, offset + 4));
>>         } else if ((offset & 1) == 0) {
>>             return makeLong(getShort(o, offset),
>>                             getShort(o, offset + 2),
>>                             getShort(o, offset + 4),
>>                             getShort(o, offset + 6));
>>         } else {
>>             return makeLong(getByte(o, offset),
>>                             getByte(o, offset + 1),
>>                             getByte(o, offset + 2),
>>                             getByte(o, offset + 3),
>>                             getByte(o, offset + 4),
>>                             getByte(o, offset + 5),
>>                             getByte(o, offset + 6),
>>                             getByte(o, offset + 7));
>>         }
>>     }
>>
>>
>> ...was reversed:
>>
>>     if ((offset & 1) == 1) {
>>         // bytes
>>     } else if ((offset & 2) == 2) {
>>         // shorts
>>     } else if ((offset & 4) == 4) {
>>         // ints
>>     } else {
>>         // longs
>>     }
>>
>>
>> ...or are JIT+CPU smart enough and there would be no difference?
>>
>>
>> Peter
>>
>>
>
>
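For anyone who wants to play with the two orderings quoted above, here is a
minimal sketch. It does not touch Unsafe at all; it only models which
sub-word width each if/else chain selects for a given offset and how many
conditions are evaluated before a branch is taken. The class and method
names (BranchOrderSketch, widthOriginal, widthReversed, testsOriginal,
testsReversed, averageTests) are made up for illustration, and the average
assumes offsets are uniformly distributed over the 8 residues modulo 8.

```java
// Illustration only: models the branch selection, not the actual memory access.
// All names in this class are hypothetical and do not exist in the JDK.
final class BranchOrderSketch {

    // Sub-word width (in bytes) selected by the original order of checks.
    static int widthOriginal(long offset) {
        if ((offset & 7) == 0) {
            return 8;   // one long
        } else if ((offset & 3) == 0) {
            return 4;   // two ints
        } else if ((offset & 1) == 0) {
            return 2;   // four shorts
        } else {
            return 1;   // eight bytes
        }
    }

    // Sub-word width (in bytes) selected by the reversed order of checks.
    static int widthReversed(long offset) {
        if ((offset & 1) == 1) {
            return 1;   // eight bytes
        } else if ((offset & 2) == 2) {
            return 2;   // four shorts
        } else if ((offset & 4) == 4) {
            return 4;   // two ints
        } else {
            return 8;   // one long
        }
    }

    // Number of conditions evaluated by the original order for this offset.
    static int testsOriginal(long offset) {
        if ((offset & 7) == 0) return 1;
        if ((offset & 3) == 0) return 2;
        return 3;       // shorts and bytes both need all three checks
    }

    // Number of conditions evaluated by the reversed order for this offset.
    static int testsReversed(long offset) {
        if ((offset & 1) == 1) return 1;
        if ((offset & 2) == 2) return 2;
        return 3;       // ints and longs both need all three checks
    }

    // Average number of conditions over the 8 possible residues modulo 8,
    // i.e. assuming uniformly random offsets.
    static double averageTests(boolean reversed) {
        int total = 0;
        for (long off = 0; off < 8; off++) {
            total += reversed ? testsReversed(off) : testsOriginal(off);
        }
        return total / 8.0;
    }

    public static void main(String[] args) {
        // Both orderings must pick the same width for every offset.
        for (long off = 0; off < 64; off++) {
            if (widthOriginal(off) != widthReversed(off)) {
                throw new AssertionError("orderings disagree at offset " + off);
            }
        }
        System.out.println("average conditions, original order: " + averageTests(false));
        System.out.println("average conditions, reversed order: " + averageTests(true));
    }
}
```

Under the uniform-offset assumption the original order evaluates 2.625
conditions on average and the reversed order 1.75, because the reversed chain
tests the most likely case (an odd offset, 50%) first. This only counts source-level
comparisons; it says nothing about what the JIT emits or how the branch
predictor behaves, which is exactly the open question above.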