Optimal place for inserting calls to the VM from interpreted java methods

John Rose john.r.rose at oracle.com
Thu Oct 10 01:51:39 UTC 2024


Generally speaking, when trading off complexity against performance in interpreted code paths, performance always loses. We only add complexity after the less complex solution has proven itself to be a measurable problem for real workloads.

On Oct 9, 2024, at 6:43 PM, Mat Carter <Matthew.Carter at microsoft.com> wrote:


Is there a place to encode calls to the VM in the interpreted methods other than the method prologs?
I've recently discovered the interpreted method adaptors, which seem like a candidate, but there are
no examples of the adaptors calling into the VM.

What follows is the rationale for why I'm looking for an alternative to the method prologs.

The AOTEndTrainingOnMethodEntry feature [1] introduces calls from Java methods into the
VM (upcalls) when specific methods are entered. The methods are identified via a pattern in a
similar manner to the CompileOnly option.

Following the initial PR review we’re looking at removing the knowledge of this AOT feature from
the compilers/interpreter and introducing a more generic system (RuntimeUpcalls) that can be
used by other parts of the VM [2]. In building out the RuntimeUpcalls system we've come across
an inefficiency that isn't an immediate problem for this feature, but would be less than optimal
should another feature use this new system.

Interpreted code uses a shared method prolog (there are 8 variants for 'regular' methods, plus more
for some special math/zip methods); the AOTEndTrainingOnMethodEntry feature introduces a
further 5 prolog types. When there is a single upcall (e.g. AOTEndTrainingOnMethodEntry) to the VM,
everything is efficient.

The inefficiency arises as soon as there are two or more upcalls. The mapping of upcalls to
methods is held within the RuntimeUpcalls system; the method flags that the interpreter examines
only indicate whether there are any upcalls (but not how many or which ones).
As the interpreter can't encode which upcalls should be called in the prolog (without an explosion
of new runtime generated prologs), it needs to call the RuntimeUpcalls system which in turn iterates
over the upcalls and calls the appropriate ones; the problem is that during that iteration the methods
need to be compared against the pattern.  So we either pay a memory cost to cache the method to
upcall relationships or we pay a performance cost to repeatedly test the method against the pattern.
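
The two options above can be sketched roughly as follows. This is only an illustration and is not code from either PR: `Upcall`, `matches`, `dispatch_by_matching` and `CachedDispatcher` are hypothetical names, methods are identified by plain strings, and the pattern syntax is reduced to an exact name or a prefix ending in `*`.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registered upcall (not the RuntimeUpcalls API).
struct Upcall {
    std::string pattern;  // exact method name, or a prefix followed by '*'
    int calls;            // counter standing in for the real call into the VM
    explicit Upcall(std::string p) : pattern(std::move(p)), calls(0) {}
};

// Simplified matcher (assumption): "Foo::*" matches any name starting with "Foo::".
bool matches(const std::string& pattern, const std::string& method) {
    if (!pattern.empty() && pattern.back() == '*') {
        const std::size_t n = pattern.size() - 1;
        return method.compare(0, n, pattern, 0, n) == 0;
    }
    return pattern == method;
}

// Option A: no extra memory, but every method entry re-tests every pattern.
// Returns the number of pattern tests performed.
std::size_t dispatch_by_matching(std::vector<Upcall>& upcalls,
                                 const std::string& method) {
    std::size_t tests = 0;
    for (Upcall& u : upcalls) {
        ++tests;
        if (matches(u.pattern, method)) ++u.calls;
    }
    return tests;
}

// Option B: remember which upcalls apply to each method, so the pattern
// tests run only the first time a method is seen. The cache is the memory cost.
class CachedDispatcher {
public:
    explicit CachedDispatcher(std::vector<Upcall>& upcalls) : upcalls_(upcalls) {}

    // Returns the number of pattern tests performed for this entry.
    std::size_t dispatch(const std::string& method) {
        std::size_t tests = 0;
        auto it = cache_.find(method);
        if (it == cache_.end()) {
            std::vector<std::size_t> hits;
            for (std::size_t i = 0; i < upcalls_.size(); ++i) {
                ++tests;
                if (matches(upcalls_[i].pattern, method)) hits.push_back(i);
            }
            it = cache_.emplace(method, std::move(hits)).first;
        }
        for (std::size_t i : it->second) ++upcalls_[i].calls;
        return tests;
    }

    std::size_t cached_methods() const { return cache_.size(); }

private:
    std::vector<Upcall>& upcalls_;
    std::unordered_map<std::string, std::vector<std::size_t>> cache_;
};
```

With three registered upcalls, Option A performs three pattern tests on every entry to a flagged method, while Option B performs them once per distinct method and then holds one cache entry per method for as long as the cache lives.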

This is not a problem for C1 and C2, as we pay this cost only when the methods are compiled and
create the multiple upcalls in those methods, eliminating the need for pattern matching by the
RuntimeUpcalls system during method execution.
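
As a rough illustration of why the compiled case avoids the cost, the sketch below resolves the matching upcalls once at "compile" time and stores them with the method, so running the method involves no pattern tests. The names `RegisteredUpcall`, `CompiledMethod` and `compile` are hypothetical, and matching is reduced to exact string equality; this is not how C1 or C2 actually emit code.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical registration record: a method pattern plus the VM entry to call.
struct RegisteredUpcall {
    std::string pattern;          // assumption: exact method name only
    std::function<void()> entry;  // stands in for the real call into the VM
};

// Hypothetical compiled method: the upcalls that apply are baked in.
struct CompiledMethod {
    std::string name;
    std::vector<std::function<void()>> upcalls;

    // Method entry: call the pre-resolved upcalls, with no pattern matching.
    void run() const {
        for (const auto& call : upcalls) call();
    }
};

// The matching cost is paid here, once per compilation.
CompiledMethod compile(const std::string& name,
                       const std::vector<RegisteredUpcall>& registered) {
    CompiledMethod m{name, {}};
    for (const RegisteredUpcall& u : registered) {
        if (u.pattern == name) m.upcalls.push_back(u.entry);
    }
    return m;
}
```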

Thanks in advance
Mat

[1] https://github.com/openjdk/leyden/pull/21
[2] https://github.com/macarte/leyden/pull/2


More information about the leyden-dev mailing list