Obsoleting JavaCritical
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Jul 4 09:53:47 UTC 2022
Hi Wojtek,
thanks for sharing this list, I think this is a good starting point to
understand more about your use case.
Last week I was looking at "getrusage" (as you mentioned it in an
earlier email), and I was surprised to see that the call takes a pointer
to a (fairly big) struct which then needs to be initialized with some
thread-local state:
https://man7.org/linux/man-pages/man2/getrusage.2.html
I've looked at the implementation, and it seems to do a memset on
the user-provided struct pointer, plus assignments to all the fields.
Eyeballing the implementation, this does not seem to me like a "classic"
use case where dropping transitions would help much. I mean, surely
dropping transitions would help shave some nanoseconds off the call,
but it doesn't seem to me that the call would be short-lived enough to
make a difference. Do you have some benchmarks on this one? I did some
[1] and the call overhead seemed to come out at 260ns/op - w/o transitions
you might perhaps be able to get to 250ns, but that's in the noise?
As for getpid, note that you can do (since Java 9):
ProcessHandle.current().pid();
I believe the impl caches the result, so it shouldn't even make the
native call.
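
For illustration, a minimal sketch of the pure-Java route (the class name
PidExample and the helper currentPid() are made up for this example; only
ProcessHandle.current().pid() comes from the JDK). It needs no JNI or
critical natives at all:

```java
public class PidExample {
    // Hypothetical helper: returns the current process id through the
    // standard Java 9+ ProcessHandle API, with no hand-written native code.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        long first = currentPid();
        long second = currentPid();
        System.out.println("pid = " + first);
        // The pid of a running process does not change between calls.
        System.out.println("stable across calls: " + (first == second));
    }
}
```

Whether this is fast enough for a given hot path would still need to be
measured; the sketch only shows that the value is reachable without a
custom native call.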
Maurizio
[1] - http://cr.openjdk.java.net/~mcimadamore/panama/GetrusageTest.java
On 02/07/2022 07:42, Wojciech Kudla wrote:
> Hi Maurizio,
>
> Thanks for staying on this.
>
> > Could you please provide a rough list of the native calls you make
> > where you believe critical JNI is having a real impact in the
> > performance of your application?
>
> Off the top of my head:
> clock_gettime
> recvmsg
> recvmmsg
> sendmsg
> sendmmsg
> select
> getpid
> getcpu
> getrusage
>
> > Also, could you please tell us whether any of these calls need to
> > interact with Java arrays?
> No arrays or objects of any type involved. Everything happens by the
> means of passing raw pointers as longs and using other primitive types
> as function arguments.
>
> > In other words, do you use critical JNI to remove the cost
> > associated with thread transitions, or are you also taking advantage
> > of accessing on-heap memory _directly_ from native code?
> Critical JNI natives are used solely to remove the cost of
> transitions. We don't get anywhere near the Java heap in native code.
>
> In general I think it makes a lot of sense for Java as a
> language/platform to have some guards around unsafe code, but on the
> other hand the popularity of libraries employing Unsafe and their
> success in more performance-oriented corners of software engineering
> is a clear indicator there is a need for the JVM to provide access to
> more low-level primitives and mechanisms.
> I think it's entirely fair to tell developers that all bets are off
> when they get into some non-idiomatic scenarios but please don't take
> away a feature that greatly contributed to Java's success.
>
> Kind regards,
> Wojtek
>
> On Wed, Jun 29, 2022 at 5:20 PM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> Hi Wojciech,
> picking up this thread again. After some internal discussion, we
> realize that we don't know enough about your use case. While
> re-enabling JNI critical would obviously provide a quick fix,
> we're afraid that (a) developers might end up depending on JNI
> critical when they don't need to (perhaps also unaware of the
> consequences of depending on it) and (b) that there might actually
> be _better_ (as in: much faster) solutions than using critical
> native calls to address at least some of your use cases (that
> seemed to be the case with the clock_gettime example you
> mentioned). Could you please provide a rough list of the native
> calls you make where you believe critical JNI is having a real
> impact in the performance of your application? Also, could you
> please tell us whether any of these calls need to interact with
> Java arrays? In other words, do you use critical JNI to remove the
> cost associated with thread transitions, or are you also taking
> advantage of accessing on-heap memory _directly_ from native code?
>
> Regards
> Maurizio
>
> On 13/06/2022 21:38, Wojciech Kudla wrote:
>> Hi Mark,
>>
>> Thanks for your input and apologies for the delayed response.
>>
>> > If the platform included, say, an intrinsified System.nanoRealTime()
>> > method that returned clock_gettime(CLOCK_REALTIME), how much would
>> > that help developers in your unnamed industry?
>>
>> Exposing a realtime clock with nanosecond granularity in the JDK
>> would be a great step forward. I should have made it clear that I
>> represent the fintech corner (investment banking to be exact) but the
>> issues my message touches upon span areas such as HPC, audio
>> processing, gaming, and the defense industry so it's not like we have
>> an isolated case.
>>
>> > In a similar vein, if people are finding it necessary to “replace parts
>> > of NIO with hand-crafted native code” then it would be interesting to
>> > understand what their requirements are
>>
>> As for the other example I provided with making very short-lived
>> syscalls such as recvmsg/recvmmsg, the premise is getting access
>> to hardware timestamps on the ingress and egress ends as well as
>> enabling batch receive with a single syscall and otherwise
>> exploiting features unavailable from the JDK (like access to the CMSG
>> interface, scatter/gather, etc).
>> There are also other examples of calls that we'd love to make
>> often and at the lowest possible cost (e.g. getrusage) but I'm not
>> sure if there's a strong case for some of these ideas, that's why
>> it might be worth looking into a more generic approach for
>> performance-sensitive code.
>> Hope this does a better job at explaining where we're coming from
>> than my previous messages.
>>
>> Thanks,
>> W
>>
>> On Tue, Jun 7, 2022 at 6:31 PM <mark.reinhold at oracle.com> wrote:
>>
>> 2022/6/6 0:24:17 -0700, wkudla.kernel at gmail.com:
>> >> Yes for System.nanoTime(), but System.currentTimeMillis() reports
>> >> CLOCK_REALTIME.
>> >
>> > Unfortunately System.currentTimeMillis() offers only millisecond
>> > granularity which is the reason why our industry has to resort to
>> > clock_gettime.
>>
>> If the platform included, say, an intrinsified System.nanoRealTime()
>> method that returned clock_gettime(CLOCK_REALTIME), how much would
>> that help developers in your unnamed industry?
>>
>> In a similar vein, if people are finding it necessary to “replace parts
>> of NIO with hand-crafted native code” then it would be interesting to
>> understand what their requirements are. Some simple enhancements to
>> the NIO API would be much less costly to design and implement than a
>> generalized user-level native-call intrinsification mechanism.
>>
>> - Mark
>>