Unsafe.{get,put}-X-Unaligned performance

Vitaly Davidovich vitalyd at gmail.com
Thu Mar 12 17:30:05 UTC 2015


Isn't the C2 intrinsic just reading the value starting at the specified
offset directly (when unaligned access is supported) and not doing the
branching?

On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart <peter.levart at gmail.com>
wrote:

>
>
> On 03/10/2015 08:02 PM, Andrew Haley wrote:
>
> > The new algorithm does an N-way branch, always loading and storing
> > subwords according to their natural alignment.  So, if the address is
> > random and the size is long it will access 8 bytes 50% of the time, 4
> > shorts 25% of the time, 2 ints 12.5% of the time, and 1 long 12.5% of
> > the time.  So, for every random load/store we have a 4-way branch.
>
>
>
> ...so do you think it would be better if the order of checks in the
> if/else chain:
>
>     public final long getLongUnaligned(Object o, long offset) {
>         if ((offset & 7) == 0) {
>             return getLong(o, offset);
>         } else if ((offset & 3) == 0) {
>             return makeLong(getInt(o, offset),
>                             getInt(o, offset + 4));
>         } else if ((offset & 1) == 0) {
>             return makeLong(getShort(o, offset),
>                             getShort(o, offset + 2),
>                             getShort(o, offset + 4),
>                             getShort(o, offset + 6));
>         } else {
>             return makeLong(getByte(o, offset),
>                             getByte(o, offset + 1),
>                             getByte(o, offset + 2),
>                             getByte(o, offset + 3),
>                             getByte(o, offset + 4),
>                             getByte(o, offset + 5),
>                             getByte(o, offset + 6),
>                             getByte(o, offset + 7));
>         }
>     }
>
>
> ...was reversed:
>
> if ((offset & 1) == 1) {
>     // bytes
> } else if ((offset & 2) == 2) {
>     // shorts
> } else if ((offset & 4) == 4) {
>     // ints
> } else {
>     // longs
> }
>
>
> ...or are JIT+CPU smart enough and there would be no difference?
>
>
> Peter
>
>
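
The percentages Andrew quotes, and the question of whether the two check
orders select the same path, can be worked through with a small
stand-alone sketch. Everything here is illustrative: the class name
BranchOrderSketch, the path indices, and the per-path test counts are
assumptions made for this example and are not part of Unsafe. Sequential
offsets stand in for uniformly random ones, and the "tests" counted are
source-level comparisons only, which says nothing about what the JIT or
the branch predictor actually does.

```java
class BranchOrderSketch {
    // Path indices (hypothetical): 0 = 8 bytes, 1 = 4 shorts, 2 = 2 ints, 3 = 1 long.
    static final String[] PATHS = {"8 bytes", "4 shorts", "2 ints", "1 long"};

    // Comparisons evaluated before each path is reached, per check order.
    static final int[] TESTS_WIDEST_FIRST = {3, 3, 2, 1};
    static final int[] TESTS_NARROWEST_FIRST = {1, 2, 3, 3};

    // Order used in the quoted getLongUnaligned: widest alignment tested first.
    static int widestFirst(long offset) {
        if ((offset & 7) == 0) return 3;
        else if ((offset & 3) == 0) return 2;
        else if ((offset & 1) == 0) return 1;
        else return 0;
    }

    // Reversed order from Peter's question: narrowest alignment tested first.
    static int narrowestFirst(long offset) {
        if ((offset & 1) == 1) return 0;
        else if ((offset & 2) == 2) return 1;
        else if ((offset & 4) == 4) return 2;
        else return 3;
    }

    // Counts how often each path is taken for offsets 0 .. n-1 and
    // checks that both orders agree on every offset.
    static int[] tally(int n) {
        int[] counts = new int[4];
        for (long offset = 0; offset < n; offset++) {
            int path = widestFirst(offset);
            if (path != narrowestFirst(offset)) {
                throw new IllegalStateException("orders disagree at " + offset);
            }
            counts[path]++;
        }
        return counts;
    }

    // Average number of source-level comparisons per call.
    static double averageTests(int[] counts, int[] testsPerPath) {
        long total = 0, weighted = 0;
        for (int p = 0; p < counts.length; p++) {
            total += counts[p];
            weighted += (long) counts[p] * testsPerPath[p];
        }
        return (double) weighted / total;
    }

    public static void main(String[] args) {
        int n = 1024;
        int[] counts = tally(n);
        for (int p = 0; p < counts.length; p++) {
            System.out.printf("%-8s %.1f%%%n", PATHS[p], 100.0 * counts[p] / n);
        }
        System.out.println("widest first:    " + averageTests(counts, TESTS_WIDEST_FIRST));
        System.out.println("narrowest first: " + averageTests(counts, TESTS_NARROWEST_FIRST));
    }
}
```

Under these assumptions the split is 50% / 25% / 12.5% / 12.5%, and the
average number of comparisons is 2.625 with the widest check first versus
1.75 with the narrowest first. Whether that difference survives
compilation is exactly the open question in the thread.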

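The subword assembly in the quoted getLongUnaligned can also be tried
outside the JDK. The sketch below is not the Unsafe code: it swaps in a
little-endian java.nio.ByteBuffer for raw memory, treats the buffer index
as the "address" (real alignment depends on the array's base address), and
the class name SubwordLoadSketch and the shift-based combining are
assumptions, since makeLong itself is not shown in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class SubwordLoadSketch {
    // Reads 8 bytes at 'offset' using only subwords that are naturally
    // aligned relative to the buffer index. Assumes buf is LITTLE_ENDIAN.
    static long getLongBySubwords(ByteBuffer buf, int offset) {
        if ((offset & 7) == 0) {
            return buf.getLong(offset);
        } else if ((offset & 3) == 0) {
            // Two ints: low half first, mask to avoid sign extension.
            return (buf.getInt(offset) & 0xFFFFFFFFL)
                 | ((long) buf.getInt(offset + 4) << 32);
        } else if ((offset & 1) == 0) {
            long result = 0;
            for (int i = 0; i < 4; i++) {
                result |= (buf.getShort(offset + 2 * i) & 0xFFFFL) << (16 * i);
            }
            return result;
        } else {
            long result = 0;
            for (int i = 0; i < 8; i++) {
                result |= (buf.get(offset + i) & 0xFFL) << (8 * i);
            }
            return result;
        }
    }

    // Builds a little-endian buffer filled with distinct byte values.
    static ByteBuffer sampleBuffer(int size) {
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < size; i++) {
            buf.put(i, (byte) (0x81 + 7 * i));
        }
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sampleBuffer(24);
        for (int offset = 0; offset <= 16; offset++) {
            long direct = buf.getLong(offset);           // absolute read at any index
            long pieced = getLongBySubwords(buf, offset);
            System.out.printf("offset %2d: %s%n", offset,
                              direct == pieced ? "match" : "MISMATCH");
        }
    }
}
```

Comparing against the absolute ByteBuffer.getLong(int) read at every
offset is a cheap way to check that each of the four branches assembles
the same value.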

