Unsafe.{get,put}-X-Unaligned performance
Peter Levart
peter.levart at gmail.com
Thu Mar 12 18:07:03 UTC 2015
On 03/12/2015 06:30 PM, Vitaly Davidovich wrote:
> Isn't the C2 intrinsic just reading the value starting at the
> specified offset directly (when unaligned access is supported) and not
> doing the branching?
It is. This code is for those platforms not supporting unaligned accesses.
Peter
>
> On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart <peter.levart at gmail.com
> <mailto:peter.levart at gmail.com>> wrote:
>
>
>
> On 03/10/2015 08:02 PM, Andrew Haley wrote:
>> The new algorithm does an N-way branch, always loading and storing
>> subwords according to their natural alignment. So, if the address is
>> random and the size is long it will access 8 bytes 50% of the time, 4
>> shorts 25% of the time, 2 ints 12.5% of the time, and 1 long 12.5% of
>> the time. So, for every random load/store we have a 4-way branch.
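
The percentages quoted above follow from counting the low three bits of the offset. Here is a minimal sketch, assuming the offset is uniformly distributed modulo 8; the class and method names are hypothetical and are not part of Unsafe:

```java
final class AccessWidthDistribution {
    // Width in bytes of the naturally aligned subword access selected for
    // an offset, mirroring the alignment checks described in the quote.
    static int accessWidth(long offset) {
        if ((offset & 7) == 0) return 8;   // one long
        if ((offset & 3) == 0) return 4;   // two ints
        if ((offset & 1) == 0) return 2;   // four shorts
        return 1;                          // eight bytes
    }

    // Percentage of the 8 possible low-bit patterns that select this width.
    static double percentForWidth(int width) {
        int hits = 0;
        for (long offset = 0; offset < 8; offset++) {
            if (accessWidth(offset) == width) {
                hits++;
            }
        }
        return 100.0 * hits / 8;
    }

    public static void main(String[] args) {
        for (int width : new int[] {1, 2, 4, 8}) {
            System.out.println(width + "-byte accesses: "
                    + percentForWidth(width) + "%");
        }
    }
}
```

Under that uniformity assumption the counts come out to 4, 2, 1 and 1 of the 8 patterns, which matches the 50%, 25%, 12.5% and 12.5% figures in the quote.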
>
>
> ...so do you think it would be better if the order of checks in the
> if/else chain:
>
> public final long getLongUnaligned(Object o, long offset) {
>     if ((offset & 7) == 0) {
>         return getLong(o, offset);
>     } else if ((offset & 3) == 0) {
>         return makeLong(getInt(o, offset),
>                         getInt(o, offset + 4));
>     } else if ((offset & 1) == 0) {
>         return makeLong(getShort(o, offset),
>                         getShort(o, offset + 2),
>                         getShort(o, offset + 4),
>                         getShort(o, offset + 6));
>     } else {
>         return makeLong(getByte(o, offset),
>                         getByte(o, offset + 1),
>                         getByte(o, offset + 2),
>                         getByte(o, offset + 3),
>                         getByte(o, offset + 4),
>                         getByte(o, offset + 5),
>                         getByte(o, offset + 6),
>                         getByte(o, offset + 7));
>     }
> }
>
>
> ...was reversed:
>
> if ((offset & 1) == 1) {
>     // bytes
> } else if ((offset & 2) == 2) {
>     // shorts
> } else if ((offset & 4) == 4) {
>     // ints
> } else {
>     // longs
> }
>
>
> ...or are JIT+CPU smart enough and there would be no difference?
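
As a rough way to frame that question, the sketch below checks that both orderings pick the same access width for every offset, and counts how many source-level comparisons each chain evaluates. It assumes offsets are uniform modulo 8, and the class and method names are hypothetical. It does not measure anything: the JIT is free to compile either chain differently, and branch prediction is not modelled.

```java
final class BranchOrderComparison {
    // Original order: widest alignment tested first.
    // Returns {access width in bytes, comparisons evaluated}.
    static int[] widestFirst(long offset) {
        if ((offset & 7) == 0) return new int[] {8, 1};
        if ((offset & 3) == 0) return new int[] {4, 2};
        if ((offset & 1) == 0) return new int[] {2, 3};
        return new int[] {1, 3};
    }

    // Reversed order: narrowest alignment tested first.
    static int[] narrowestFirst(long offset) {
        if ((offset & 1) == 1) return new int[] {1, 1};
        if ((offset & 2) == 2) return new int[] {2, 2};
        if ((offset & 4) == 4) return new int[] {4, 3};
        return new int[] {8, 3};
    }

    // True if both chains select the same width for all 8 low-bit patterns.
    static boolean sameWidths() {
        for (long offset = 0; offset < 8; offset++) {
            if (widestFirst(offset)[0] != narrowestFirst(offset)[0]) {
                return false;
            }
        }
        return true;
    }

    // Mean comparisons per call, weighting the 8 patterns equally.
    static double averageComparisons(boolean reversed) {
        int total = 0;
        for (long offset = 0; offset < 8; offset++) {
            int[] result = reversed ? narrowestFirst(offset) : widestFirst(offset);
            total += result[1];
        }
        return total / 8.0;
    }

    public static void main(String[] args) {
        System.out.println("same widths: " + sameWidths());
        System.out.println("widest first: " + averageComparisons(false));
        System.out.println("narrowest first: " + averageComparisons(true));
    }
}
```

With uniform offsets the widest-first chain averages 21/8 = 2.625 comparisons and the narrowest-first chain 14/8 = 1.75, so the reversed order does less work at the source level. Whether that survives compilation is exactly the open question.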
>
>
> Peter
>
>