Re: MemorySegment.ofAddress(...).reinterpret(...)
Brian S O'Neill
bronee at gmail.com
Mon Jul 10 14:33:01 UTC 2023
I decided to run the benchmark against my application twice in the same
JVM, to allow for more "warmup". The entire test takes about 9 minutes
to run for each iteration. With the original unsafe version, I see about
a 2% improvement the second time. With the Panama version, there's no
improvement the second time. The overall performance gap widens to
about 3.5%.
I question the effectiveness of inlining. In a micro benchmark, full
inlining is easy to observe. But in something more complex, how can I be
certain that inlining is working? I don't have the luxury of using the
ForceInline annotation myself. I can enable logging to observe inlining
actions, but the application is quite complex, and this generates an
unending stream of noise.
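
One way to cut the noise down is to capture the inlining log to a file
(for example with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining) and
filter it afterwards for the callees of interest. Below is a minimal
sketch of such a filter. The class name InliningLogFilter and its methods
are hypothetical, the assumed line shape
"@ <bci>  <Class>::<method> (<n> bytes)  <message>" may differ between
JDK builds, and the inlined/not-inlined classification is only a
heuristic based on the start of the message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: filters a captured PrintInlining log down to
// the call sites whose callee class starts with a given prefix.
class InliningLogFilter {

    // ASSUMPTION: inlining lines look like
    //   "@ 14   pkg.Cls::method (19 bytes)   inline (hot)"
    // Lines that do not match this shape are ignored.
    private static final Pattern LINE = Pattern.compile(
        "@\\s+(\\d+)\\s+(\\S+)::(\\S+)\\s+\\((\\d+) bytes\\)\\s*(.*)");

    // One inlining decision reported by the JIT compiler.
    record Decision(String callee, int bci, boolean inlined, String message) {}

    // Heuristic (assumption): messages that begin with one of these
    // markers are treated as successful inlining; anything else is
    // treated as a failure.
    static boolean looksInlined(String message) {
        return message.startsWith("inline")
            || message.startsWith("force inline")
            || message.startsWith("accessor")
            || message.startsWith("(intrinsic)");
    }

    // Returns the decisions whose callee class name starts with classPrefix.
    static List<Decision> filter(List<String> logLines, String classPrefix) {
        List<Decision> result = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = LINE.matcher(line);
            if (!m.find() || !m.group(2).startsWith(classPrefix)) {
                continue;
            }
            String message = m.group(5).trim();
            result.add(new Decision(
                m.group(2) + "::" + m.group(3),
                Integer.parseInt(m.group(1)),
                looksInlined(message),
                message));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: InliningLogFilter <log-file> <class-prefix>");
            return;
        }
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        // Print only the call sites that were NOT inlined.
        for (Decision d : filter(lines, args[1])) {
            if (!d.inlined()) {
                System.out.println(d.callee() + " @ " + d.bci() + ": " + d.message());
            }
        }
    }
}
```

Running it with a prefix such as java.lang.foreign. would list only the
memory access call sites that the log reports as not inlined, which is a
much smaller set to read through than the full output.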
On 2023-07-10 02:31 AM, Maurizio Cimadamore wrote:
> AFAIK, all the work that went into hoisting bounds check with long
> induction variables should already have taken care of eliminating bounds
> checks in the vast majority of cases. I'm skeptical that the difference
> you see is caused by a bound check (especially one against
> Long.MAX_VALUE, effectively a constant). I think a more detailed
> benchmark is required here in order to assess exactly where the
> performance is being lost, as there can be several factors.
>
> I'm very very skeptical that the restricted method check is playing a
> part in all of this. We have taken extra care to make the check fast,
> and to cache the results of such check in a VM @Stable field, which is
> treated as a true constant. We have benchmarks to show that no peak
> performance is lost due to the restricted method check.
>