[External] : Re: Potential performance regression with FFM compared to Unsafe

Tomer Zeltzer tomerr90 at gmail.com
Sat Apr 19 16:09:11 UTC 2025


Tried pastebin.com/YvE02tgj, it's like 100x slower

On Sat, Apr 19, 2025, 01:35 Chen Liang <chen.l.liang at oracle.com> wrote:

> Hi Tomer,
> Note that your way of accessing the layout might not be the best; our
> recommended way of element access is to construct a larger layout (like a
> group layout representing a struct), and then obtain var handles with
> varHandle(PathElement). These var handles perform the same access checks,
> and these duplicate checks might be merged into one by the JIT compiler;
> this is called "loop hoisting" and is seen in JDK benchmarks. I haven't got
> time to try this out on your benchmarks yet, but I hope this might be able
> to address some of the regressions you have observed.
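
A minimal sketch of the struct-layout approach described above, assuming JDK 22 or later (where var handles derived from a layout take a segment plus a base offset). The struct shape, the field names `id`, `flags`, `timestamp`, and the class `StructAccessSketch` are hypothetical and are not taken from the benchmark under discussion:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

final class StructAccessSketch {
    // Hypothetical struct { int id; int flags; long timestamp; } (16 bytes, 8-byte aligned).
    static final StructLayout RECORD = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("id"),
            ValueLayout.JAVA_INT.withName("flags"),
            ValueLayout.JAVA_LONG.withName("timestamp"));

    // One var handle per field, derived from the enclosing layout and kept in
    // static final fields so the JIT can treat them as constants.
    static final VarHandle ID = RECORD.varHandle(PathElement.groupElement("id"));
    static final VarHandle FLAGS = RECORD.varHandle(PathElement.groupElement("flags"));
    static final VarHandle TIMESTAMP = RECORD.varHandle(PathElement.groupElement("timestamp"));

    // 'base' is the byte offset of the struct within the segment.
    static void write(MemorySegment seg, long base, int id, int flags, long timestamp) {
        ID.set(seg, base, id);
        FLAGS.set(seg, base, flags);
        TIMESTAMP.set(seg, base, timestamp);
    }

    static int readId(MemorySegment seg, long base) {
        return (int) ID.get(seg, base);
    }

    static int readFlags(MemorySegment seg, long base) {
        return (int) FLAGS.get(seg, base);
    }

    static long readTimestamp(MemorySegment seg, long base) {
        return (long) TIMESTAMP.get(seg, base);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            long stride = RECORD.byteSize();
            // Room for two records, laid out back to back.
            MemorySegment seg = arena.allocate(stride * 2, RECORD.byteAlignment());
            write(seg, 0L, 1, 0x10, 1000L);
            write(seg, stride, 2, 0x20, 2000L);
            System.out.println(readId(seg, stride) + " " + readTimestamp(seg, stride));
        }
    }
}
```

Whether the JIT actually merges the per-field checks depends on the JDK build and the surrounding code, so this only illustrates the access pattern and makes no performance claim.
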
>
> In addition, for particular structs, we are planning record and interface
> mappers; record mappers perform a single read to copy native data to
> immutable objects, while interface mappers are more memory efficient and
> lazy, but can suffer from memory tearing issues. Those might be useful for
> the different scenarios you have mentioned as well.
>
> Chen
> ------------------------------
> *From:* Tomer Zeltzer <tomerr90 at gmail.com>
> *Sent:* Friday, April 18, 2025 4:45 PM
> *To:* Chen Liang <chen.l.liang at oracle.com>
> *Cc:* panama-dev at openjdk.org <panama-dev at openjdk.org>
> *Subject:* [External] : Re: Potential performance regression with FFM
> compared to Unsafe
>
>
> Thank you for testing this out Chen!
> A number of other people were able to reproduce the on-heap results, so I'm
> not sure what to say here, but that's the less important conclusion, I think.
> For off-heap, having the memory segment as a final field sounds like
> something that is relevant for very few niche use cases, if at all...
> If this can't be optimized further without the final, it means a
> significant performance hit for a lot of use cases... off the top of my
> head, libraries like zstd and gzip that do JNI bindings.
>
> On Fri, Apr 18, 2025, 01:17 Chen Liang <chen.l.liang at oracle.com> wrote:
>
> Hello, I think the observed performance difference is probably due to the
> heap array being static final. I tested on latest mainline, and ffm is
> consistently slower without a static final object that it can constant fold
> against: it exhibited similar performance for auto arena 100 vs byte array
> 100, both having a significant overhead compared to Unsafe, unless the byte
> array is a constant (in a static final field). Meanwhile, I cannot
> reproduce FFM being faster than Unsafe for heap access: in the best case
> FFM is still slightly slower than Unsafe.
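
To make the distinction above concrete, here is a small sketch of the two shapes being compared: a segment held in a static final field (a constant the JIT can fold against) and a segment held in an instance field. The class `SegmentFieldSketch` and its methods are hypothetical and are not the code from the linked benchmark. The sketch only shows the two access shapes and measures nothing; timing them would need a JMH harness.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class SegmentFieldSketch {
    static final int SIZE = 100;

    // Constant case: the segment reference is a static final field.
    static final MemorySegment CONSTANT = Arena.global().allocate(SIZE);

    static {
        fill(CONSTANT);
    }

    // Non-constant case: the segment is reached through an instance field.
    private final MemorySegment segment;

    SegmentFieldSketch() {
        this.segment = Arena.ofAuto().allocate(SIZE);
        fill(this.segment);
    }

    // Writes the byte values 0..SIZE-1 into the segment.
    static void fill(MemorySegment seg) {
        for (int i = 0; i < SIZE; i++) {
            seg.set(ValueLayout.JAVA_BYTE, i, (byte) i);
        }
    }

    // Reads every byte through the static final segment.
    static long sumConstant() {
        long sum = 0;
        for (int i = 0; i < SIZE; i++) {
            sum += CONSTANT.get(ValueLayout.JAVA_BYTE, i);
        }
        return sum;
    }

    // Reads every byte through the instance field.
    long sumInstance() {
        long sum = 0;
        for (int i = 0; i < SIZE; i++) {
            sum += segment.get(ValueLayout.JAVA_BYTE, i);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumConstant() == new SegmentFieldSketch().sumInstance());
    }
}
```
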
>
> For context, I used the source code at
> https://github.com/tomerr90/UnsafeVSFMA/blob/main/src/main/java/org/example/FMASerDe.java
> and
> edited around. I recommend testing against 22 or later releases, where FFM
> has been finalized; the preview feature on 21 is no longer maintained.
>
> Regards, Chen Liang
> ------------------------------
> *From:* panama-dev <panama-dev-retn at openjdk.org> on behalf of Tomer
> Zeltzer <tomerr90 at gmail.com>
> *Sent:* Thursday, April 17, 2025 6:31 AM
> *To:* panama-dev at openjdk.org <panama-dev at openjdk.org>
> *Subject:* Potential performance regression with FFM compared to Unsafe
>
> Hey all!
> First time emailing such a list, so apologies if something is "off
> protocol".
> I wrote the following article where I benchmarked FFM and Unsafe in JDK 21:
> https://itnext.io/javas-new-fma-renaissance-or-decay-372a2aee5f32
>
> Conclusions were that FFM was 42% faster for on-heap accesses while 67%
> slower for off-heap, which is a bit weird.
> Code is also linked in the article.
> Would love hearing your thoughts!
>
>