Unsafe.{get,put}-X-Unaligned performance

Vitaly Davidovich vitalyd at gmail.com
Thu Mar 12 22:15:53 UTC 2015


Switches currently don't profile well (if at all) - John can shed more
light on that as this came up on the compiler list a few weeks ago.

On Mar 12, 2015 6:06 PM, "Peter Levart" <peter.levart at gmail.com> wrote:

>
>
> On 03/12/2015 10:04 PM, Peter Levart wrote:
>
> ... putLongUnaligned in the style of the getLongUnaligned above is trickier
> with current code structure. But there may be a middle ground (or a sweet
> spot):
>
>
>     public final void putLongUnaligned(Object o, long offset, long x) {
>         if (((int) offset & 1) == 1) {
>             putLongParts(o, offset,
>                 (byte) (x >>> 0),
>                 (short) (x >>> 8),
>                 (short) (x >>> 24),
>                 (short) (x >>> 40),
>                 (byte) (x >>> 56));
>         } else if (((int) offset & 2) == 2) {
>             putLongParts(o, offset,
>                 (short)(x >>> 0),
>                 (int)(x >>> 16),
>                 (short)(x >>> 48));
>         } else if (((int) offset & 4) == 4) {
>             putLongParts(o, offset,
>                 (int)(x >>> 0),
>                 (int)(x >>> 32));
>         } else {
>             putLong(o, offset, x);
>         }
>     }
>
>
> ...this has the same number of branches, but fewer instructions. You also
> need the following two:
>
>
>
> At least on Intel (with -XX:-UseUnalignedAccesses), the code above (Unaligned2)
> is no faster than your code (Unaligned) according to a JMH
> random-access test. Neither is the variant with the if/else branches reversed
> (Unaligned1). Unaligned3 is a switch-based variant (get only) and is the
> slowest. Your variant seems to be the fastest by a hair:
>
> Benchmark                               Mode   Samples    Score   Score error   Units
> j.t.UnalignedTest.getLongUnaligned      avgt         5   16.375         0.837   ns/op
> j.t.UnalignedTest.getLongUnaligned1     avgt         5   18.340         0.617   ns/op
> j.t.UnalignedTest.getLongUnaligned2     avgt         5   16.784         0.969   ns/op
> j.t.UnalignedTest.getLongUnaligned3     avgt         5   19.634         0.871   ns/op
> j.t.UnalignedTest.putLongUnaligned      avgt         5   15.521         0.589   ns/op
> j.t.UnalignedTest.putLongUnaligned1     avgt         5   16.676         1.042   ns/op
> j.t.UnalignedTest.putLongUnaligned2     avgt         5   16.394         3.028   ns/op
>
>
> Regards, Peter
>
>
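
For readers who want to try the splitting scheme from the quoted putLongUnaligned without touching Unsafe, here is a minimal, self-contained sketch. It is not the code from the thread: the class name UnalignedSketch, the helpers newMemory and getLongByBytes, and the use of a little-endian heap ByteBuffer as a stand-in for raw memory are all assumptions made for illustration. A heap ByteBuffer does not care about alignment itself, so this only checks that the pieces chosen for each offset reassemble into the original value; it says nothing about performance.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnalignedSketch {

    // Hypothetical stand-in for raw memory: a little-endian view of a byte array.
    // Little-endian layout is an assumption of this sketch.
    public static ByteBuffer newMemory(int size) {
        return ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Store x at any offset, using the widest pieces that stay naturally
    // aligned for that offset (same branch structure as the quoted method).
    public static void putLongUnaligned(ByteBuffer mem, int offset, long x) {
        if ((offset & 1) == 1) {
            // Odd offset: byte, short, short, short, byte.
            mem.put(offset, (byte) x);
            mem.putShort(offset + 1, (short) (x >>> 8));
            mem.putShort(offset + 3, (short) (x >>> 24));
            mem.putShort(offset + 5, (short) (x >>> 40));
            mem.put(offset + 7, (byte) (x >>> 56));
        } else if ((offset & 2) == 2) {
            // 2-byte aligned only: short, int, short.
            mem.putShort(offset, (short) x);
            mem.putInt(offset + 2, (int) (x >>> 16));
            mem.putShort(offset + 6, (short) (x >>> 48));
        } else if ((offset & 4) == 4) {
            // 4-byte aligned only: two ints.
            mem.putInt(offset, (int) x);
            mem.putInt(offset + 4, (int) (x >>> 32));
        } else {
            // 8-byte aligned: a single long store.
            mem.putLong(offset, x);
        }
    }

    // Reference reader: assemble the value one byte at a time, low byte first.
    public static long getLongByBytes(ByteBuffer mem, int offset) {
        long r = 0;
        for (int i = 7; i >= 0; i--) {
            r = (r << 8) | (mem.get(offset + i) & 0xFFL);
        }
        return r;
    }

    public static void main(String[] args) {
        ByteBuffer mem = newMemory(32);
        long x = 0x0102030405060708L;
        for (int offset = 0; offset < 16; offset++) {
            putLongUnaligned(mem, offset, x);
            System.out.println(offset + " -> 0x"
                + Long.toHexString(getLongByBytes(mem, offset)));
        }
    }
}
```

Running main writes the same value at offsets 0 through 15, which exercises all four branches, and reads it back byte by byte.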



More information about the core-libs-dev mailing list