<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi,<br>
As Erik explained in his reply, what we call "critical JNI" comes
in two pieces: one removes Java-to-native thread transitions
(which is what Wojciech is referring to), while the other
interacts with the GC locker (basically to allow critical JNI code
to access Java arrays w/o copying). I think the latter part is the
most problematic GC-wise.<br>
</p>
<p>Then, regarding the former, I think there are still questions as
to whether dropping transitions is the best way to get the
required performance boost; for instance, yesterday I ran some
experiments with an experimental patch from Jorn (kudos) which
re-enables an opt-in for "trivial" native calls in the Panama API.
I used it to test clock_gettime and, while there's an
improvement, the results I got were not as conclusive as one might
have expected. This is what I get w/ state transitions:</p>
<p>```<br>
Benchmark                                  Mode  Cnt   Score   Error  Units<br>
ClockgettimeTest.panama_monotonic          avgt   30  27.814 ± 0.165  ns/op<br>
ClockgettimeTest.panama_monotonic_coarse   avgt   30  12.094 ± 0.103  ns/op<br>
ClockgettimeTest.panama_monotonic_raw      avgt   30  27.719 ± 0.393  ns/op<br>
ClockgettimeTest.panama_realtime           avgt   30  27.133 ± 0.280  ns/op<br>
ClockgettimeTest.panama_realtime_coarse    avgt   30  26.812 ± 0.384  ns/op<br>
```<br>
</p>
<p>And this is what I get with transitions removed:</p>
<p>```<br>
Benchmark                                  Mode  Cnt   Score   Error  Units<br>
ClockgettimeTest.panama_monotonic          avgt   30  22.383 ± 0.213  ns/op<br>
ClockgettimeTest.panama_monotonic_coarse   avgt   30   6.312 ± 0.117  ns/op<br>
ClockgettimeTest.panama_monotonic_raw      avgt   30  22.731 ± 0.279  ns/op<br>
ClockgettimeTest.panama_realtime           avgt   30  22.503 ± 0.292  ns/op<br>
ClockgettimeTest.panama_realtime_coarse    avgt   30  21.853 ± 0.100  ns/op<br>
```<br>
</p>
<p>Here we can see a gain of 4-5ns, obtained by dropping the
transition. The only case where this makes a significant
difference is the monotonic_coarse flavor. In the other cases
there's a difference, yes, but not as pronounced, simply because
the term we're comparing against is bigger: it's easy to see a 5ns
gain if your function runs for 10ns in total, but such a gain
starts to get lost in the "noise" when functions run for longer.
And that's the main issue with removing Java->native
transitions: the "window" in which this optimization yields a
positive effect is extremely narrow (anything lasting longer than
~30ns probably won't see much of a difference), but, as you can
see from the PR in [1], the VM changes required to support it
touch quite a bit of stuff!</p>
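<p>For reference, this is roughly the shape such an opt-in takes on the Panama side. The sketch below is not Jorn's experimental patch; it uses the form the opt-in eventually took in later JDK releases (Linker.Option.critical, JDK 22+). The timespec layout (two 8-byte longs) and CLOCK_MONOTONIC == 1 are Linux/x86-64 assumptions:</p>

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

// Sketch: binding clock_gettime via the FFM API, opting out of
// thread-state transitions. Assumes JDK 22+ and Linux/x86-64.
public class ClockGettime {
    private static final int CLOCK_MONOTONIC = 1; // Linux value
    private static final MethodHandle CLOCK_GETTIME;

    static {
        Linker linker = Linker.nativeLinker();
        CLOCK_GETTIME = linker.downcallHandle(
                linker.defaultLookup().find("clock_gettime").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_INT,
                        ValueLayout.JAVA_INT, ValueLayout.ADDRESS),
                // "critical" downcall: no transitions; false = no heap access,
                // so only native segments may be passed
                Linker.Option.critical(false));
    }

    // Returns the monotonic clock reading in nanoseconds.
    public static long monotonicNanos() {
        try (Arena arena = Arena.ofConfined()) {
            // struct timespec { long tv_sec; long tv_nsec; }
            MemorySegment ts = arena.allocate(16, 8);
            int rc;
            try {
                rc = (int) CLOCK_GETTIME.invokeExact(CLOCK_MONOTONIC, ts);
            } catch (Throwable t) {
                throw new AssertionError(t);
            }
            if (rc != 0) throw new IllegalStateException("clock_gettime failed");
            return ts.get(ValueLayout.JAVA_LONG, 0) * 1_000_000_000L
                 + ts.get(ValueLayout.JAVA_LONG, 8);
        }
    }

    public static void main(String[] args) {
        System.out.println(ClockGettime.monotonicNanos());
    }
}
```

<p>Note that downcallHandle is a restricted method, so recent JDKs will warn (and eventually refuse) unless the caller's module is granted native access via --enable-native-access.</p>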
<p>Luckily, selectively disabling transitions from Panama is
slightly more straightforward and, perhaps, for stuff like
recvmsg, where NIO is bypassed, there's not much else we can do: while
one could imagine Panama special-casing calls to clock_gettime, as
that's a known "leaf", the same cannot be done with recvmsg, which
is in general a blocking call. Panama also has a "trusted mode"
flag (--enable-native-access), so there is a way in the Panama API
to distinguish between safe and unsafe API points, which also helps
with this. The risk, of course, is that developers see whatever
mechanism is provided as some kind of "make my code go fast
please" switch and apply it blindly, w/o fully understanding the
consequences. What I said before about the "extremely narrow window"
remains true: in the vast majority of cases (like 99%) dropping
state transitions can result in very big downsides, while the
corresponding upsides are not big enough to even be noticeable
(the Q/A in [2] arrives at a very similar conclusion).<br>
</p>
<p>All this said, selectively disabling state transitions for
native calls made using the Panama foreign API seems the most
straightforward way to offset the performance delta introduced by
the removal of critical JNI. In part it's because the Panama API
is more flexible, e.g. function descriptors allow us to model the
distinction between a trivial and a non-trivial call; in part it's
because, as stated above, Panama can already reason about calls
that are "unsafe" and that require extra permissions. And, finally,
it's also because, if we added back critical JNI, we'd probably
add it back w/o its most problematic GC locker parts (that's what
[1] does AFAIK), which means it wouldn't be a complete code
reversal. So, perhaps, coming up with a fresh mechanism to drop
transitions (only) could also be less confusing for developers. Of
course this would require developers such as Wojciech to rewrite
some of their code to use Panama instead of JNI.</p>
<p>And, coming back to clock_gettime, my feeling is that with the
right tools (e.g. some intrinsics), we can make that go a lot
faster than what is shown above. Being able to quickly get a
timestamp seems a widely-enough applicable use case to deserve
some special treatment. So, perhaps, it's worth considering a
_spectrum of solutions_ for improving the status quo, rather
than investing solely in the removal of thread transitions.<br>
</p>
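<p>As background on why clock_gettime keeps coming up: the JDK has no nanosecond-granularity realtime (wall) clock, which is the gap discussed further down in this thread. A small illustration of what's available today (Instant.now() precision is platform-dependent; this sketch isn't an endorsement of any particular API shape):</p>

```java
import java.time.Instant;

// Illustration of the timestamp gap: System.currentTimeMillis() is
// wall-clock but only millisecond-granular; System.nanoTime() is
// nanosecond-granular but monotonic with an arbitrary origin, so it
// is not a wall-clock time. Instant.now() (JDK 9+) can surface
// sub-millisecond wall-clock precision where the OS provides it.
public class ClockGranularity {
    static long wallMillis() { return System.currentTimeMillis(); }

    static long monoNanos() { return System.nanoTime(); }

    public static void main(String[] args) {
        System.out.println("wall ms:   " + wallMillis());
        System.out.println("mono ns:   " + monoNanos());
        System.out.println("instant:   " + Instant.now()); // precision varies by platform
    }
}
```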
<p>Maurizio<br>
</p>
<p>[1] - <a class="moz-txt-link-freetext" href="https://github.com/openjdk/jdk19/pull/90/files">https://github.com/openjdk/jdk19/pull/90/files<br>
</a>[2] - <a class="moz-txt-link-freetext" href="https://youtu.be/LoyBTqkSkZk?t=742">https://youtu.be/LoyBTqkSkZk?t=742</a></p>
<p><br>
</p>
<div class="moz-cite-prefix">On 04/07/2022 18:38, Vitaly Davidovich
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAHjP37E62eEbrDtS7HF0eZ0wA65xTWVF_eqpZFubP=4PTXEYVg@mail.gmail.com">
<div dir="auto">To not sidetrack this thread with my previous
reply:</div>
<div dir="auto"><br>
</div>
<div dir="auto">Maurizio - are you saying java criticals are
*already* hindering ZGC and/or other planned Hotspot
improvements? Or that theoretically they could and you’d like to
remove/deprecate them now(ish)?</div>
<div dir="auto"><br>
</div>
<div dir="auto">If it’s the latter, perhaps it’s prudent to keep
them around until a compelling case surfaces where they preclude
or severely restrict evolution of the platform? If it’s the
former, I’d be curious what that is but would also understand
the rationale behind wanting to remove them.</div>
<div><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jul 4, 2022 at 1:26
PM Vitaly Davidovich <<a href="mailto:vitalyd@gmail.com" moz-do-not-send="true" class="moz-txt-link-freetext">vitalyd@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div><br>
</div>
<div><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jul 4, 2022 at
1:13 PM Wojciech Kudla <<a href="mailto:wkudla.kernel@gmail.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">wkudla.kernel@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div dir="ltr">
<div>
<div>
<div>Thanks for your input, Vitaly. I'd be
interested to find out more about the nature
of the HW noise you observed in your
benchmarks as our results were very consistent
and it was pretty straightforward to pinpoint
the culprit as JNI call overhead. Maybe it was
just easier for us because we disallow C- and
P-state transitions and put a lot of effort to
eliminate platform jitter in general. Were you
maybe running on a CPU model that doesn't
support constant TSC? I would also suggest
retrying with LAPIC interrupts suppressed
(with: cli/sti) to maybe see if it's the
kernel and not the hardware.</div>
</div>
</div>
</div>
</blockquote>
<div dir="auto">This was on a Broadwell Xeon chipset
with constant tsc. All the typical jitter sources
were reduced: C/P states disabled in bios, max turbo
enabled, IRQs steered away, core isolated, etc. By
the way, by noise I don’t mean the results themselves
were noisy - they were constant run to run. I just
meant the delta between normal vs critical JNI
entrypoints was very minimal - ie “in the noise”,
particularly with rdtsc.</div>
<div dir="auto"><br>
</div>
<div dir="auto">I can try to remeasure on newer Intel
but see below …</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div dir="ltr">
<div>
<div>
<div dir="auto"><br>
<br>
</div>
100% agree on rdtsc(p) and snippets. There are
some narrow use cases where one can get some
substantial speed ups with direct access to
prefetch or by abusing misprediction to keep
icache hot. These scenarios are sadly only
available with inline assembly. I know of a few
shops that go to the length of forking Graal,
etc to achieve that but am quite convinced such
capabilities would be welcome and utilized by
many more groups if they were easily accessible
from java.</div>
</div>
</div>
</blockquote>
<div dir="auto">I’m of the firm (and perhaps
controversial for some :)) opinion these days that
Java is simply the wrong platform/tool for low latency
cases that warrant this level of control. There’re
very strong headwinds even outside of JNI costs. And
the “real” problem with JNI, besides transition costs,
is lack of inlining into the native calls. So even if
JVM transition costs are fully eliminated, there’s
still an optimization fence due to lost inlining (not
unlike native code calling native fns via shared
libs).</div>
<div dir="auto"><br>
</div>
<div dir="auto">That’s not to say that perf regressions are
welcomed - nobody likes those :).</div>
</div>
</div>
<div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div dir="ltr">
<div>
<div dir="auto"><br>
<br>
</div>
Thanks,<br>
</div>
W.<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jul 4,
2022 at 5:51 PM Vitaly Davidovich <<a href="mailto:vitalyd@gmail.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">vitalyd@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div dir="auto">I’d add rdtsc(p) wrapper functions
to the list. These are usually either inline
asm or compiler intrinsic in the JNI
entrypoint. In addition, any native libs
exposed via JNI that have “trivial” functions
are also candidates for faster calling
conventions. There’re sometimes way to mitigate
the call overhead (eg batching) but it’s not
always feasible.</div>
<div dir="auto"><br>
</div>
<div dir="auto">I’ll add that last time I tried to
measure the improvement of Java criticals for
clock_gettime (and rdtsc) it looked to be in the
noise on the hardware I was testing on. It got
the point where I had to instrument the critical
and normal JNI entrypoints to confirm the
critical was being hit. The critical calling
convention isn’t significantly different *if*
basic primitives (or no args at all) are passed
as args. JNIEnv*, IIRC, is loaded from a
register so that’s minor. jclass (for static
calls, which is what’s relevant here) should be
a compiled constant. Critical call still has a
GCLocker check. So I’m not actually sure what
the significant difference is for “lightweight”
(ie few primitive or no args, primitive return
types) calls.</div>
<div dir="auto"><br>
</div>
<div dir="auto">In general, I do think it’d be
nice if there was a faster native call sequence,
even if it comes with a caveat emptor and/or
special requirements on the callee (not unlike
the requirements for criticals). I think
Vladimir Ivanov was working on “snippets” that
allowed dynamic construction of a native call,
possibly including assembly. Not sure where
that exploration is these days, but that would
be a welcome capability.</div>
<div dir="auto"><br>
</div>
<div dir="auto">My $.02. Happy 4th of July for
those celebrating!</div>
<div dir="auto"><br>
</div>
<div dir="auto">Vitaly</div>
<div><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jul
4, 2022 at 12:04 PM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div>
<p>Hi,<br>
while I'm not an expert with some of the
IO calls you mention (some of my
colleagues are more knowledgeable in
this area, so I'm sure they will have
more info), my general sense is that, as
with getrusage, if there is a system
call involved, you already pay a hefty
price for the user to kernel transition.
On my machine this seems to cost around
200ns. In these cases, using JNI
critical to shave off a dozen of
nanoseconds (at best!) seems just not
worth it.</p>
<p>So, of the functions in your list, the
ones in which I *believe* dropping
transitions would have the most effect
are (if we exclude getpid, for which
another approach is possible)
clock_gettime and getcpu, as
they might use vdso [1], which typically
brings the performance of these calls
closer to calls to shared lib functions.<br>
</p>
<p>If you have examples e.g. where
performance of recvmsg (or related
calls) varies significantly between base
JNI and critical JNI, please send them
our way; I'm sure some of my colleagues
would be interested to take a look.<br>
</p>
<p>Popping back a couple of levels, I
think it would be helpful to also define
what's an acceptable regression in this
context. Of course, in an ideal world,
we'd like to see no performance
regression at all. But JNI critical is
an unsupported interface, which might
misbehave with modern garbage collectors
(e.g. ZGC) and that requires quite a bit
of internal complexity which might, in
the medium/long run, hinder the
evolution of the Java platform (all
these things have _some_ cost, even if
the cost is not directly material to
developers). In this vein, I think calls
like clock_gettime tend to be more
problematic: as they complete very
quickly, you see the cost of transitions
a lot more. In other cases, where
syscalls are involved, the costs
associated with transitions are more
likely to be "in the noise". Of course
if we look at absolute numbers, dropping
transitions would always yield "faster"
code; but at the same time, going from
250ns to 245ns is very unlikely to
result in visible performance difference
when considering an application as a
whole, so I think it's critical here to
decide _which_ use cases to prioritize.<br>
</p>
<p>I think a good outcome of this
discussion would be if we could come to
some shared understanding of which
native calls are truly problematic (e.g.
clock_gettime-like), and then for the
JDK to provide better (and more
maintainable) alternatives for those
(which might even be faster than using
critical JNI).<br>
</p>
<p>Thanks<br>
Maurizio<br>
</p>
<p>[1] - <a href="https://man7.org/linux/man-pages/man7/vdso.7.html" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://man7.org/linux/man-pages/man7/vdso.7.html</a><br>
</p>
</div>
<div>
<div>On 04/07/2022 12:23, Wojciech Kudla
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>Thanks Maurizio,<br>
<br>
</div>
I raised this case mainly about
clock_gettime and recvmsg/sendmsg, I
think we're focusing on the wrong
things here. Feel free to drop the
two syscalls from the discussion
entirely, but the main usecases I
have been presenting throughout this
thread definitely stand.<br>
<br>
</div>
<div>Thanks<br>
</div>
<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On
Mon, Jul 4, 2022 at 10:54 AM
Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div>
<p>Hi Wojtek,<br>
thanks for sharing this list, I
think this is a good starting
point to understand more about
your use case.</p>
<p>Last week I've been looking at
"getrusage" (as you mentioned it
in an earlier email), and I was
surprised to see that the call
took a pointer to a (fairly big)
struct which then needed to be
initialized with some
thread-local state:</p>
<p><a href="https://man7.org/linux/man-pages/man2/getrusage.2.html" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://man7.org/linux/man-pages/man2/getrusage.2.html</a></p>
<p>I've looked at the
implementation, and it seems to
be doing memset on the
user-provided struct pointer,
plus all the fields assignment.
Eyeballing the implementation,
this does not seem to me like a
"classic" use case where
dropping transition would help
much. I mean, surely dropping
transitions would help shaving
some nanoseconds off the call,
but it doesn't seem to me that
the call would be short-lived
enough to make a difference. Do
you have some benchmarks on this
one? I did some [1] and the call
overhead seemed to come up at
260ns/op - w/o transition you
might perhaps be able to get to
250ns, but that's in the noise?<br>
</p>
<p>As for getpid, note that you
can do (since Java 9):<br>
<br>
ProcessHandle.current().pid();<br>
<br>
I believe the impl caches the
result, so it shouldn't even
make the native call.<br>
</p>
<p>Maurizio</p>
<p>[1] - <a href="http://cr.openjdk.java.net/~mcimadamore/panama/GetrusageTest.java" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">http://cr.openjdk.java.net/~mcimadamore/panama/GetrusageTest.java</a><br>
</p>
<div>On 02/07/2022 07:42, Wojciech
Kudla wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>Hi Maurizio,<br>
<br>
</div>
Thanks for staying on this.<br>
<br>
> Could you please
provide a rough list of the
native calls you make where
you believe critical JNI is
having a real impact in the
performance of your
application?<br>
</div>
<div><br>
From the top of my head:<br>
</div>
<div>clock_gettime<br>
</div>
<div>recvmsg<br>
</div>
<div>recvmmsg</div>
<div>sendmsg<br>
</div>
<div>sendmmsg</div>
<div>select<br>
</div>
<div>getpid</div>
<div>getcpu<br>
</div>
<div>getrusage<br>
</div>
<div><br>
</div>
<div>> Also, could you
please tell us whether any
of these calls need to
interact with Java arrays?<br>
</div>
<div>No arrays or objects of
any type involved.
Everything happens by the
means of passing raw
pointers as longs and using
other primitive types as
function arguments.<br>
</div>
<div><br>
> In other words, do you
use critical JNI to remove
the cost associated with
thread transitions, or are
you also taking advantage of
accessing on-heap memory
_directly_ from native code?<br>
</div>
<div>Criticial JNI natives are
used solely to remove the
cost of transitions. We
don't get anywhere near java
heap in native code.<br>
<br>
</div>
<div>In general I think it
makes a lot of sense for
Java as a language/platform
to have some guards around
unsafe code, but on the
other hand the popularity of
libraries employing Unsafe
and their success in more
performance-oriented corners
of software engineering is a
clear indicator there is a
need for the JVM to provide
access to more low-level
primitives and mechanisms. <br>
</div>
<div>I think it's entirely
fair to tell developers that
all bets are off when they
get into some non-idiomatic
scenarios but please don't
take away a feature that
greatly contributed to
Java's success.<br>
<br>
</div>
<div>Kind regards,<br>
</div>
<div>Wojtek<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed,
Jun 29, 2022 at 5:20 PM
Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">
<div>
<p>Hi Wojciech,<br>
picking up this thread
again. After some
internal discussion, we
realize that we don't
know enough about your
use case. While
re-enabling JNI critical
would obviously provide
a quick fix, we're
afraid that (a)
developers might end up
depending on JNI
critical when they don't
need to (perhaps also
unaware of the
consequences of
depending on it) and (b)
that there might
actually be _better_ (as
in: much faster)
solutions than using
critical native calls to
address at least some of
your use cases (that
seemed to be the case
with the clock_gettime
example you mentioned).
Could you please provide
a rough list of the
native calls you make
where you believe
critical JNI is having a
real impact in the
performance of your
application? Also, could
you please tell us
whether any of these
calls need to interact
with Java arrays? In
other words, do you use
critical JNI to remove
the cost associated with
thread transitions, or
are you also taking
advantage of accessing
on-heap memory
_directly_ from native
code?</p>
<p>Regards<br>
Maurizio<br>
</p>
<div>On 13/06/2022 21:38,
Wojciech Kudla wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>Hi Mark,<br>
<br>
</div>
Thanks for
your input and
apologies for
the delayed
response.<br>
<br>
> If the
platform
included, say,
an
intrinsified
System.nanoRealTime()<br>
method that
returned
clock_gettime(CLOCK_REALTIME),
how much would<br>
that help
developers in
your unnamed
industry?<br>
<br>
</div>
Exposing
realtime clock
with nanosecond
granularity in
the JDK would be
a great step
forward. I
should have made
it clear that I
represent
fintech corner
(investment
banking to be
exact) but the
issues my
message touches
upon span areas
such as HPC,
audio
processing,
gaming, and
defense industry
so it's not like
we have an
isolated case.<br>
<br>
> In a
similar vein, if
people are
finding it
necessary to
“replace parts<br>
of NIO with
hand-crafted
native code”
then it would be
interesting to<br>
understand what
their
requirements are<br>
<br>
</div>
As for the other
example I provided
with making very
short lived
syscalls such as
recvmsg/recvmmsg
the premise is
getting access to
hardware
timestamps on the
ingress and egress
ends as well as
enabling batch
receive with a
single syscall and
otherwise
exploiting
features
unavailable from
the JDK (like
access to CMSG
interface,
scatter/gather,
etc).<br>
</div>
<div>There are also
other examples of
calls that we'd
love to make often
and at lowest
possible cost (ie.
getrusage) but I'm
not sure if
there's a strong
case for some of
these ideas,
that's why it
might be worth
looking into more
generic approach
for performance
sensitive code.<br>
</div>
<div>Hope this does
better job at
explaining where
we're coming from
than my previous
messages.<br>
</div>
<div><br>
</div>
Thanks,<br>
</div>
W<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On
Tue, Jun 7, 2022 at
6:31 PM <<a href="mailto:mark.reinhold@oracle.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">mark.reinhold@oracle.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">2022/6/6
0:24:17 -0700, <a href="mailto:wkudla.kernel@gmail.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">wkudla.kernel@gmail.com</a>:<br>
>> Yes for
System.nanoTime(),
but
System.currentTimeMillis()
reports<br>
>>
CLOCK_REALTIME.<br>
> <br>
> Unfortunately
System.currentTimeMillis()
offers only
millisecond<br>
> granularity
which is the reason
why our industry has
to resort to<br>
> clock_gettime.<br>
<br>
If the platform
included, say, an
intrinsified
System.nanoRealTime()<br>
method that returned
clock_gettime(CLOCK_REALTIME), how much would<br>
that help developers
in your unnamed
industry?<br>
<br>
In a similar vein,
if people are
finding it necessary
to “replace parts<br>
of NIO with
hand-crafted native
code” then it would
be interesting to<br>
understand what
their requirements
are. Some simple
enhancements to<br>
the NIO API would be
much less costly to
design and implement
than a<br>
generalized
user-level
native-call
intrinsification
mechanism.<br>
<br>
- Mark<br>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
-- <br>
<div dir="ltr">Sent from my phone</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
-- <br>
<div dir="ltr" data-smartmail="gmail_signature">Sent from my
phone</div>
</blockquote>
</div>
</div>
-- <br>
<div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">Sent from my phone</div>
</blockquote>
</body>
</html>