use of Unsafe for ASCII detection

Thu Jan 7 10:30:06 UTC 2016

Hi,

> On 06/01/16 23:21, Martin Buchholz wrote:
> 
> > That is, the Unsafe code is 3x faster than the simple code.  The
> > ByteBuffer code used to be 2x slower and is now 2x faster - well
> > done - crowd goes wild!
> 
> Why, thank you.
> 
> > I see it uses new and well-hidden Unsafe.getLongUnaligned ... all
> > the performance fanatics want to use that, but y'all don't want to
> > give it to us.  And we don't want to allocate a temp ByteBuffer
> > object.  What to do?
> 
> If you allocate a temp ByteBuffer object carefully so that it does not
> escape, it will be removed and you should get the same code as
> directly using Unsafe.  I certainly tested it during the work on
> Unsafe.getXXUnaligned and it was the same.  I'll have a look at your
> example.

I think the benchmark is flawed: it has no warmup phase. When the measurement starts, the ByteBuffer access path has not yet been fully inlined and optimized by the JIT. The ByteBuffer variant therefore looks slower at first, simply because it takes longer to reach fully optimized code.

To fix the benchmark, either use JMH or add an unmeasured extra loop *before* each measurement to warm up the compiler/optimizer. The results should then be almost identical to Unsafe.
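
For illustration, here is a rough sketch of the hand-rolled warmup variant. It is not Martin's benchmark: the class and method names (WarmupBench, isAsciiSimple, isAsciiByteBuffer, timeNanos), the array size and the iteration counts are all made up for this example. JMH remains the more reliable option, since it also guards against dead-code elimination and similar pitfalls.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.function.Predicate;

// Hypothetical example class; names and sizes are assumptions, not from the thread.
class WarmupBench {
    // The high bit of each of the 8 bytes in a long; set for any non-ASCII byte.
    private static final long HIGH_BITS = 0x8080808080808080L;

    // Accumulates results so the JIT cannot discard the measured work entirely.
    static int sink;

    /** Simple variant: inspects one byte at a time. */
    static boolean isAsciiSimple(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) return false;
        }
        return true;
    }

    /** ByteBuffer variant: reads 8 bytes at a time through a temporary wrapper. */
    static boolean isAsciiByteBuffer(byte[] bytes) {
        // The buffer never leaves this method, so escape analysis may remove it.
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder());
        int i = 0;
        for (; i + Long.BYTES <= bytes.length; i += Long.BYTES) {
            if ((buf.getLong(i) & HIGH_BITS) != 0) return false;
        }
        // Check the remaining tail (fewer than 8 bytes) one byte at a time.
        for (; i < bytes.length; i++) {
            if (bytes[i] < 0) return false;
        }
        return true;
    }

    /** Runs the check repeatedly and returns the elapsed time in nanoseconds. */
    static long timeNanos(Predicate<byte[]> check, byte[] data, int iterations) {
        long start = System.nanoTime();
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (check.test(data)) hits++;
        }
        long elapsed = System.nanoTime() - start;
        sink += hits;
        return elapsed;
    }

    public static void main(String[] args) {
        byte[] data = new byte[4096];
        Arrays.fill(data, (byte) 'a');

        // Unmeasured warmup: give the JIT a chance to compile and inline both variants.
        timeNanos(WarmupBench::isAsciiSimple, data, 20_000);
        timeNanos(WarmupBench::isAsciiByteBuffer, data, 20_000);

        // Measured runs: only these numbers should be compared.
        long simple = timeNanos(WarmupBench::isAsciiSimple, data, 20_000);
        long buffer = timeNanos(WarmupBench::isAsciiByteBuffer, data, 20_000);
        System.out.println("simple:     " + simple + " ns");
        System.out.println("ByteBuffer: " + buffer + " ns");
    }
}
```

The warmup calls go through exactly the same code path as the measured calls, so both variants are compiled before any timing is recorded.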

Another option is to try the new VarHandles code (see previous mails). We are looking at it for future Lucene development around our ByteBufferIndexInput and for accessing other byte[] arrays through views of different primitive types.
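
As a rough idea of what such a byte[] view could look like for the ASCII check: the sketch below assumes the API in the form of MethodHandles.byteArrayViewVarHandle, which is the shape it took in Java 9; the prototype available at the time of writing may differ. The class name VarHandleAscii is hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical example class; assumes the Java 9 form of the VarHandle API.
class VarHandleAscii {
    // Views a byte[] as a sequence of longs, addressed by byte index.
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    // The high bit of each of the 8 bytes in a long; set for any non-ASCII byte.
    private static final long HIGH_BITS = 0x8080808080808080L;

    static boolean isAscii(byte[] bytes) {
        int i = 0;
        // Plain get() on a byte[] view is permitted at unaligned byte offsets.
        for (; i + Long.BYTES <= bytes.length; i += Long.BYTES) {
            long word = (long) LONG_VIEW.get(bytes, i);
            if ((word & HIGH_BITS) != 0) return false;
        }
        // Check the remaining tail (fewer than 8 bytes) one byte at a time.
        for (; i < bytes.length; i++) {
            if (bytes[i] < 0) return false;
        }
        return true;
    }
}
```

No wrapper object is allocated here, so the result does not depend on escape analysis removing a temporary ByteBuffer.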

Uwe