Save/load StubRoutines

Fri Jul 12 02:37:49 UTC 2024

>
> LoadStubs should also depends on presence of cached code archive.

I am not sure I get this. I already have this check
in  CDSConfig::check_vm_args_consistency():

  if (!LoadCachedCode && LoadStubs) {
    warning("LoadStubs will be ignored without LoadCachedCode");
    LoadStubs = false;
  }

and SCCache::load_XXX() routines have this check:

  SCCache* cache = open_for_read();
  if (cache == nullptr) {
    return false;
  }

Wouldn't these checks be sufficient?

Can you explain next change in arraycopy stubs on aarch64?:
>      __ b(RuntimeAddress(byte_copy_entry));
>      __ b(byte_copy_entry);

Andrew Dinn can explain this change. I am guessing because byte_copy_entry
is in the same CodeBuffer, we don't need to emit any relocation?

And why is this?:
> #if 0
>      // This block of code is moved to
> StubRoutines::x86::init_SCAddressTable()
>      StubRoutines::x86::generate_CRC32C_table(supports_clmul);
> #endif
>

I moved the call that generates StubRoutines::x86::_crc32c_table to
init_SCAddressTable so as to club its registration together with all other
constants.
I forgot to remove this block of code. I will clean it up.

Another thing worth pointing out (which may have got overlooked in the
large patch) is this change in StubGenerator::array_overlap_test
(in stubGenerator_x86_64_arraycopy.cpp)

   if (NOLp == nullptr) {
-    ExternalAddress no_overlap(no_overlap_target);
+    RuntimeAddress no_overlap(no_overlap_target);
     __ jump_cc(Assembler::belowEqual, no_overlap);
     __ cmpptr(to, end_from);
     __ jump_cc(Assembler::aboveEqual, no_overlap);

It seems like a bug to have ExternalAddress relocation for jump targets. If
this change is reverted then debug build fails with following assertion
failure:

#  Internal Error
(/home/asmehra/data/ashu-mehra/leyden/src/hotspot/cpu/x86/assembler_x86.cpp:1009),
pid=1392250, tid=1392251
#  assert(which == call32_operand) failed: jcc has no disp32 or imm

Same is the case in StubGenerator::generate_cont_thaw():

-  __ jump(ExternalAddress(StubRoutines::throw_StackOverflowError_entry()));
+  __ jump(RuntimeAddress(StubRoutines::throw_StackOverflowError_entry()));

In this case assertion failure is:

#  Internal Error
(/home/asmehra/data/ashu-mehra/leyden/src/hotspot/cpu/x86/assembler_x86.cpp:1171),
pid=1398423, tid=1398424
#  assert(which == call32_operand) failed: call has no disp32 or imm

Mainline is oblivious to this as the relocations are not emitted for
StubRoutines.

Thanks,
- Ashutosh Mehra

On Thu, Jul 11, 2024 at 1:06 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> Great work, Ashutosh
>
> One issue I found recently is that we need to always generate cached code
> file even if there are no cached nmethods.
> I use saved stubs code for that. May be we should select one stub or blob
> which is not under LoadStubs/StoreStubs.
> May be temporary until we start caching adapter.
>
> LoadStubs should also depends on presence of cached code archive.
>
>  > Apart from implementing the above approach, this patch also has changes
> to move JFR stubs and throw_exception stubs to
>  > SharedRuntime and stores/loads them as a RuntimeStub, as they seem to
> be a better fit there.
>
> This should go into mainline (first RFE). I see that it takes majority of
> changed lines in changeset.
>
> There are changes to use InternalAddress instead of ExternalAddress. This
> should go into mainline too (second RFE).
>
> Moved SHA and BASE64 tables in stubGenerator_aarch64.cpp into mainline
> (third RFE).
>
> Can you explain next change in arraycopy stubs on aarch64?:
>      __ b(RuntimeAddress(byte_copy_entry));
>      __ b(byte_copy_entry);
>
> And why is this?:
> #if 0
>      // This block of code is moved to
> StubRoutines::x86::init_SCAddressTable()
>      StubRoutines::x86::generate_CRC32C_table(supports_clmul);
> #endif
>
>
> Thanks,
> Vladimir K
>
>
> On 7/9/24 7:55 AM, Ashutosh Mehra wrote:
> > Hi all,
> >
> > We have been working for some time on storing and loading generated code
> (stubs and blobs) in the hotspot as part of the
> > "premain" project.
> > The code is now in a good enough shape to be shared with a wider
> audience to get some feedback and comments.
> >
> > So here is the link [0] [1] to the changes for storing and loading of
> stubs which are declared in StubRoutines.
> > This patch covers both aarch64 and x86-64 architectures.
> > The code changes are done on top of AndrewD's patch for storing the
> blobs [2].
> >
> > ---
> >
> > A brief description of the approach for storing StubRoutines stubs is
> warranted as it slightly differs from the
> > technique used for storing other runtime blobs.
> >
> > In the mainline StubRoutines are divided into 4 categories depending on
> when they are generated and their purpose -
> > initial, continuation, compiler and final
> > In comparison to a runtime blob which is stored in its own buffer, a
> StubRoutine stub belongs to a category and all the
> > stubs in a category are stored in the same buffer.
> > This makes the code buffer storing StubRoutine to have multiple entry
> points and these entry points are established as
> > the stubs are generated during the runtime.
> > Moreover, the generation of some stubs is dependent on the availability
> of certain cpu features.
> >
> > So when the buffer for a StubRoutine category is stored in the code
> cache, some extra information needs to be store to
> > be able to identify all the entry points
> > and be able to associate them with the correct stubs when loading the
> buffer.
> >
> > To implement this each stub is given a unique static id irrespective of
> whether its code is generated or not (refer to
> > macro STUB_ROUTINES_STUBS_DO in runtime/stubRoutines.hpp)
> > When the stubs are generated the entry point(s) of the stubs are stored
> in an array (see StubArchiveData::_address_array
> > in runtime/stubCodeGenerator.hpp).
> > As the stubs are generated, the entry points are appended to the array.
> Most of the stubs have only one entry point.
> > In addition to the entry point(s), the end address of the stub is also
> recorded in the array.
> > The end address is used to create StubCodeDesc when the stubs are loaded
> from the code cache.
> >
> > To identify the addresses that belong to a stub, we store a tuple of 2
> elements for each stub: first element is the
> > index of the first entry point of the stub in the _address_array
> > and second element is the number of addresses stored by this stub in the
> _address_array (see StubAddrIndexInfo in
> > runtime/stubCodeGenerator.hpp).
> > For stubs that are not generated, -1 is used for both the elements.
> These tuples are stored in an array
> > StubArchiveData::_index_table indexed by the unique stub id.
> >
> > It is easier to visualize this using a simple example:
> > Assume there are 3 stubs. Stub1 has 2 entry points (S1-1 and S1-2) and
> an end address (E1). Stub2 is not generated.
> > Stub3 has one entry point (S3-1) and end address (E3).
> > For this case the _address_array and _index_table in the StubArchiveData
> would have following entries:
> >
> > _address_array:
> > index:          0          1      2       3       4
> > contents: | S1-1 | S1-2 | E1 | S3-1 | E3 |
> >
> > _index_table:
> > index:          0        1        0
> > contents: | 0, 3 | -1, -1 | 3, 2 |
> >
> > When all the stubs of a category are generated, the _address_array and
> _index_table are stored in the code cache (in
> > SCCache::store_stubroutines_blob).
> > During load time the code along with the _address_array and _index_table
> is read back from the code cache (in
> > SCCReader::compile_stubroutines_blob).
> > The stubs entry points are set up in their respective routines that
> generates the stub (such as generate_call_stub)
> > using the _index_table.
> >
> > As the stubs are generated, their entry points are also registered with
> the SCAddressTable in SCCache.
> > To preserve the order of the SCAddressTable during load, the elements of
> the _address_array are registered when the
> > buffer is loaded (in SCCReader::compile_stubroutines_blob).
> >
> > In addition, StubArchiveData also stores the name of the stubs which is
> used to verify the code located in the code
> > cache is indeed for the stub being loaded.
> >
> > ---
> >
> > Apart from implementing the above approach, this patch also has changes
> to move JFR stubs and throw_exception stubs to
> > SharedRuntime and stores/loads them as a RuntimeStub,
> > as they seem to be a better fit there.
> > There are also some minor improvements to logging to trace the stubs
> generation and store/load times.
> >
> > Lastly, the generated code (stubs and blobs) can be controlled using
> these two options - StoreStubs and LoadStubs.
> >
> > Please review the code and provide your feedback. Let us know if any
> clarification is required.
> >
> > There is still scope of improvement. Some of the things that can be done
> are:
> > - In current scheme if a stub is not generated during training run, but
> gets generated during the production run, the
> > order of SCAddressTable can vary resulting in crash/unexpected behavior
> > - Enum used to enumerate stubs can be improved. AndrewD has the idea of
> a global enum that identifies all the
> > stubs/blobs and that can address this aspect of the patch and the
> limitation in the previous point as well.
> > - Exploring other ways to handle external constant data such as tables
> used by trigonometric stubs which need to be
> > registered in SCAddressTable very early. See
> StubRoutines::stubs_SCAddressTable_init).
> > - Using InternalAddress (or something similar) relocation type for
> targets that are in the same blob, so that they don't
> > need to be fixed on load.
>
> >
> >
> > [0] change sets:
> https://github.com/adinn/leyden/compare/d808ea2..7738ed6
> > <https://github.com/adinn/leyden/compare/d808ea2..7738ed6>
> > [1] branch:
> https://github.com/adinn/leyden/commits/premain-stub-routines/
> > <https://github.com/adinn/leyden/commits/premain-stub-routines/>
> > [2] https://github.com/adinn/leyden/commits/premain-save-generated/
> > <https://github.com/adinn/leyden/commits/premain-save-generated/>
> >
> > Thanks,
> > - Ashutosh Mehra
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/leyden-dev/attachments/20240711/da2701b2/attachment.htm>