Understanding the performance of my FFI-based API

Alan Paxton alan.paxton at gmail.com
Tue Mar 14 11:06:46 UTC 2023


Hi Maurizio,

Thanks very much for taking the time to work through all that. I have now
made the changes you suggested and, like you, I am seeing comparable results
between JNI and FFI. I will update my code and document to reflect this as
soon as possible.

A few thoughts on what I have learned:
1. The importance of making VarHandle calls exact in FFI code, and the
usefulness of .withInvokeExactBehavior() for tracking down the inexact ones.
2. Good old "make it final if you possibly can".
3. I had missed MemorySegment.copy(...) as the efficient way to do the
memcpy out of a native segment, hence my ugly and inefficient attempt to
wrap it in a ByteBuffer.
4. Not allocating objects is always the most efficient thing to do.
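
To make point 3 concrete, here is a minimal sketch of copying out of a native
segment into a preallocated byte[] with MemorySegment.copy(...). It assumes the
finalized java.lang.foreign API (Java 22 or later), not the preview API that
was current for this thread; the class and method names (SegmentCopyDemo,
copyOut, roundTrip) are made up for illustration and are not from the RocksDB
code.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SegmentCopyDemo {

    // Bulk-copies the first `length` bytes of `src` into `dst`, starting at
    // dst[0]. No intermediate ByteBuffer or temporary array is created.
    static void copyOut(MemorySegment src, byte[] dst, int length) {
        MemorySegment.copy(src, ValueLayout.JAVA_BYTE, 0L, dst, 0, length);
    }

    // Hypothetical helper: writes `input` into freshly allocated native
    // memory, then copies it back out into a new array.
    static byte[] roundTrip(byte[] input) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeSeg = arena.allocate(input.length);
            // Array -> segment direction of the same bulk-copy family.
            MemorySegment.copy(input, 0, nativeSeg, ValueLayout.JAVA_BYTE, 0L, input.length);
            byte[] out = new byte[input.length];
            copyOut(nativeSeg, out, input.length);
            return out;
        }
    }

    public static void main(String[] args) {
        byte[] result = roundTrip(new byte[] {1, 2, 3, 4});
        System.out.println(result.length);
    }
}
```

In a benchmark loop the destination array would be allocated once and reused,
which is where point 4 comes in.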

Would you be able to point me at something that explains what goes on under
the covers of invocation, and why exactness matters? My overall takeaway is
that there are a number of rules of thumb for making FFI fast; if you follow
them, you get performance equivalent to JNI, with safety for free.
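
For anyone following along, the exact versus inexact distinction can be seen
without any FFI at all. The sketch below uses a byte-array view VarHandle as a
stand-in for a memory-segment VarHandle (an assumption made to keep it runnable
on Java 16+); the class and method names are illustrative only. With
.withInvokeExactBehavior(), a call whose static argument types do not match the
handle's access type throws WrongMethodTypeException instead of being silently
adapted.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;
import java.nio.ByteOrder;

public class ExactVarHandleDemo {

    // Access type for set is (byte[], int, long)void. Keeping the handle in a
    // static final field follows the "make it final" rule of thumb.
    static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    // Same handle, but calls that would need argument conversion are rejected.
    static final VarHandle LONG_VIEW_EXACT = LONG_VIEW.withInvokeExactBehavior();

    // Inexact call: the int value 42 is widened to long behind the scenes.
    static long inexactRoundTrip(byte[] buf) {
        LONG_VIEW.set(buf, 0, 42);
        return (long) LONG_VIEW.get(buf, 0);
    }

    // The same call shape on the exact handle fails, which pinpoints the
    // call site that needs fixing.
    static boolean inexactCallRejected(byte[] buf) {
        try {
            LONG_VIEW_EXACT.set(buf, 0, 42); // int value, long expected
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        }
    }

    // Exact call: argument types and the cast on the result match the handle.
    static long exactRoundTrip(byte[] buf) {
        LONG_VIEW_EXACT.set(buf, 0, 42L);
        return (long) LONG_VIEW_EXACT.get(buf, 0);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        System.out.println(inexactRoundTrip(buf));
        System.out.println(inexactCallRejected(buf));
        System.out.println(exactRoundTrip(buf));
    }
}
```

This only shows how to find inexact call sites; it says nothing about how much
slower they are, which is what the benchmarks measure.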

--Alan

"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit","Param: columnFamilyTestType","Param: keyCount","Param: keySize","Param: valueSize"
"org.rocksdb.jmh.GetBenchmarks.ffiGet","thrpt",1,5,18964.735543,853.481052,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiGetOutputSlice","thrpt",1,5,25157.246026,69.076723,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiGetPinnableSlice","thrpt",1,5,28124.236270,1087.581497,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiGetRandom","thrpt",1,5,18130.128894,1212.912411,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiIdentity","thrpt",1,5,35359992.737450,294600.268777,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiPreallocatedGet","thrpt",1,5,24029.388397,937.110620,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.ffiPreallocatedGetRandom","thrpt",1,5,23228.230564,1926.594037,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.get","thrpt",1,5,19458.822466,755.447304,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.preallocatedByteBufferGet","thrpt",1,5,25178.037840,310.913780,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.preallocatedByteBufferGetRandom","thrpt",1,5,24022.235825,622.782684,"ops/s",no_column_family,100000,128,65536
"org.rocksdb.jmh.GetBenchmarks.preallocatedGet","thrpt",1,5,25117.231538,1259.112187,"ops/s",no_column_family,100000,128,65536

On Fri, Mar 10, 2023 at 6:46 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

>
> On 10/03/2023 18:05, Maurizio Cimadamore wrote:
>
> I’m not sure how much the update to 20 matters - maybe try to fix all of
> the other stuff first, and see what happens (inexact var handle calls can
> be quite slow compared to Unsafe memory access).
>
> I reverted the Java 20 changes. Numbers still looking good:
>
> ```
> Benchmark                                      (columnFamilyTestType)  (keyCount)  (keySize)  (valueSize)   Mode  Cnt      Score     Error   Units
> GetBenchmarks.ffiGet                                 no_column_family        1000        128         4096  thrpt   30    596.329 ±   9.452  ops/ms
> GetBenchmarks.ffiGet                                 no_column_family        1000        128        65536  thrpt   30     60.368 ±   0.842  ops/ms
> GetBenchmarks.ffiGetPinnableSlice                    no_column_family        1000        128         4096  thrpt   30    752.036 ±   5.655  ops/ms
> GetBenchmarks.ffiGetPinnableSlice                    no_column_family        1000        128        65536  thrpt   30    111.105 ±   2.304  ops/ms
> GetBenchmarks.ffiGetRandom                           no_column_family        1000        128         4096  thrpt   30    582.699 ±   3.379  ops/ms
> GetBenchmarks.ffiGetRandom                           no_column_family        1000        128        65536  thrpt   30     64.546 ±   1.829  ops/ms
> GetBenchmarks.ffiIdentity                            no_column_family        1000        128         4096  thrpt   30  57239.625 ± 674.849  ops/ms
> GetBenchmarks.ffiIdentity                            no_column_family        1000        128        65536  thrpt   30  57802.683 ± 589.983  ops/ms
> GetBenchmarks.ffiPreallocatedGet                     no_column_family        1000        128         4096  thrpt   30    717.237 ±   8.434  ops/ms
> GetBenchmarks.ffiPreallocatedGet                     no_column_family        1000        128        65536  thrpt   30     96.223 ±   1.143  ops/ms
> GetBenchmarks.ffiPreallocatedGetRandom               no_column_family        1000        128         4096  thrpt   30    585.284 ±   5.415  ops/ms
> GetBenchmarks.ffiPreallocatedGetRandom               no_column_family        1000        128        65536  thrpt   30     66.568 ±   0.843  ops/ms
> GetBenchmarks.get                                    no_column_family        1000        128         4096  thrpt   30    553.515 ±   6.278  ops/ms
> GetBenchmarks.get                                    no_column_family        1000        128        65536  thrpt   30     59.999 ±   0.935  ops/ms
> GetBenchmarks.preallocatedByteBufferGet              no_column_family        1000        128         4096  thrpt   30    738.077 ±   8.767  ops/ms
> GetBenchmarks.preallocatedByteBufferGet              no_column_family        1000        128        65536  thrpt   30     99.239 ±   1.398  ops/ms
> GetBenchmarks.preallocatedByteBufferGetRandom        no_column_family        1000        128         4096  thrpt   30    722.680 ±  11.499  ops/ms
> GetBenchmarks.preallocatedByteBufferGetRandom        no_column_family        1000        128        65536  thrpt   30    110.411 ±   1.117  ops/ms
> GetBenchmarks.preallocatedGet                        no_column_family        1000        128         4096  thrpt   30    700.405 ±   8.534  ops/ms
> GetBenchmarks.preallocatedGet                        no_column_family        1000        128        65536  thrpt   30     99.694 ±   2.122  ops/ms
> ```
>
> Maurizio
>

