[External] : Re: Save/load StubRoutines
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 12 04:02:11 UTC 2024
On 7/11/24 7:37 PM, Ashutosh Mehra wrote:
> LoadStubs should also depends on presence of cached code archive.
>
>
> I am not sure I get this. I already have this check in CDSConfig::check_vm_args_consistency():
>
> if (!LoadCachedCode && LoadStubs) {
> warning("LoadStubs will be ignored without LoadCachedCode");
> LoadStubs = false;
> }
You are doing it in check_vm_args_consistency() before we try to open cached code file.
My concern is you may miss open_for_read()/open_for_write() checks on some paths.
Can you check that by -Xshare:auto (to not treat warnings as error) and return `false` from `SCConfig::verify()`?
>
> and SCCache::load_XXX() routines have this check:
>
> SCCache* cache = open_for_read();
> if (cache == nullptr) {
> return false;
> }
>
> Wouldn't these checks be sufficient?
Yes, but we need to make sure they exists in all places where needed.
>
> Can you explain next change in arraycopy stubs on aarch64?:
> __ b(RuntimeAddress(byte_copy_entry));
> __ b(byte_copy_entry);
>
>
> Andrew Dinn can explain this change. I am guessing because byte_copy_entry is in the same CodeBuffer, we don't need to
> emit any relocation?
Make sense if it is the same CodeBlob.
>
> And why is this?:
> #if 0
> // This block of code is moved to StubRoutines::x86::init_SCAddressTable()
> StubRoutines::x86::generate_CRC32C_table(supports_clmul);
> #endif
>
> I moved the call that generates StubRoutines::x86::_crc32c_table to init_SCAddressTable so as to club its registration
> together with all other constants.
> I forgot to remove this block of code. I will clean it up.
My concern is you move general code into SCC specific code.
If we have to remove SCC code we will lose _crc32c_table initialization.
>
> Another thing worth pointing out (which may have got overlooked in the large patch) is this change
> in StubGenerator::array_overlap_test (in stubGenerator_x86_64_arraycopy.cpp)
>
> if (NOLp == nullptr) {
> - ExternalAddress no_overlap(no_overlap_target);
> + RuntimeAddress no_overlap(no_overlap_target);
> __ jump_cc(Assembler::belowEqual, no_overlap);
> __ cmpptr(to, end_from);
> __ jump_cc(Assembler::aboveEqual, no_overlap);
>
> It seems like a bug to have ExternalAddress relocation for jump targets. If this change is reverted then debug build
> fails with following assertion failure:
From what I see no_overlap_target is `entry` address of previously generated arraycopy stub:
StubRoutines::_jbyte_disjoint_arraycopy = generate_disjoint_byte_copy(false, &entry,
"jbyte_disjoint_arraycopy");
StubRoutines::_jbyte_arraycopy = generate_conjoint_byte_copy(false, entry, &entry_jbyte_arraycopy,
"jbyte_arraycopy");
Using RuntimeAddress is correct (I don't like this name and corresponding relocation class - it should be StubAddress or
something to indicate the address of code in CodeCache and not in VM's runtime). ExternalAddress are used for outside
CodeCache addresses.
It is an other change we need to push into mainline.
>
> # Internal Error (/home/asmehra/data/ashu-mehra/leyden/src/hotspot/cpu/x86/assembler_x86.cpp:1009), pid=1392250,
> tid=1392251
> # assert(which == call32_operand) failed: jcc has no disp32 or imm
>
> Same is the case in StubGenerator::generate_cont_thaw():
>
> - __ jump(ExternalAddress(StubRoutines::throw_StackOverflowError_entry()));
> + __ jump(RuntimeAddress(StubRoutines::throw_StackOverflowError_entry()));
Good.
>
> In this case assertion failure is:
>
> # Internal Error (/home/asmehra/data/ashu-mehra/leyden/src/hotspot/cpu/x86/assembler_x86.cpp:1171), pid=1398423,
> tid=1398424
> # assert(which == call32_operand) failed: call has no disp32 or imm
>
> Mainline is oblivious to this as the relocations are not emitted for StubRoutines.
>
> Thanks,
> - Ashutosh Mehra
Thanks,
Vladimir K
>
>
> On Thu, Jul 11, 2024 at 1:06 PM Vladimir Kozlov <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>
> Great work, Ashutosh
>
> One issue I found recently is that we need to always generate cached code file even if there are no cached nmethods.
> I use saved stubs code for that. May be we should select one stub or blob which is not under LoadStubs/StoreStubs.
> May be temporary until we start caching adapter.
>
> LoadStubs should also depends on presence of cached code archive.
>
> > Apart from implementing the above approach, this patch also has changes to move JFR stubs and throw_exception
> stubs to
> > SharedRuntime and stores/loads them as a RuntimeStub, as they seem to be a better fit there.
>
> This should go into mainline (first RFE). I see that it takes majority of changed lines in changeset.
>
> There are changes to use InternalAddress instead of ExternalAddress. This should go into mainline too (second RFE).
>
> Moved SHA and BASE64 tables in stubGenerator_aarch64.cpp into mainline (third RFE).
>
> Can you explain next change in arraycopy stubs on aarch64?:
> __ b(RuntimeAddress(byte_copy_entry));
> __ b(byte_copy_entry);
>
> And why is this?:
> #if 0
> // This block of code is moved to StubRoutines::x86::init_SCAddressTable()
> StubRoutines::x86::generate_CRC32C_table(supports_clmul);
> #endif
>
>
> Thanks,
> Vladimir K
>
>
> On 7/9/24 7:55 AM, Ashutosh Mehra wrote:
> > Hi all,
> >
> > We have been working for some time on storing and loading generated code (stubs and blobs) in the hotspot as part
> of the
> > "premain" project.
> > The code is now in a good enough shape to be shared with a wider audience to get some feedback and comments.
> >
> > So here is the link [0] [1] to the changes for storing and loading of stubs which are declared in StubRoutines.
> > This patch covers both aarch64 and x86-64 architectures.
> > The code changes are done on top of AndrewD's patch for storing the blobs [2].
> >
> > ---
> >
> > A brief description of the approach for storing StubRoutines stubs is warranted as it slightly differs from the
> > technique used for storing other runtime blobs.
> >
> > In the mainline StubRoutines are divided into 4 categories depending on when they are generated and their purpose -
> > initial, continuation, compiler and final
> > In comparison to a runtime blob which is stored in its own buffer, a StubRoutine stub belongs to a category and
> all the
> > stubs in a category are stored in the same buffer.
> > This makes the code buffer storing StubRoutine to have multiple entry points and these entry points are
> established as
> > the stubs are generated during the runtime.
> > Moreover, the generation of some stubs is dependent on the availability of certain cpu features.
> >
> > So when the buffer for a StubRoutine category is stored in the code cache, some extra information needs to be
> store to
> > be able to identify all the entry points
> > and be able to associate them with the correct stubs when loading the buffer.
> >
> > To implement this each stub is given a unique static id irrespective of whether its code is generated or not
> (refer to
> > macro STUB_ROUTINES_STUBS_DO in runtime/stubRoutines.hpp)
> > When the stubs are generated the entry point(s) of the stubs are stored in an array (see
> StubArchiveData::_address_array
> > in runtime/stubCodeGenerator.hpp).
> > As the stubs are generated, the entry points are appended to the array. Most of the stubs have only one entry point.
> > In addition to the entry point(s), the end address of the stub is also recorded in the array.
> > The end address is used to create StubCodeDesc when the stubs are loaded from the code cache.
> >
> > To identify the addresses that belong to a stub, we store a tuple of 2 elements for each stub: first element is the
> > index of the first entry point of the stub in the _address_array
> > and second element is the number of addresses stored by this stub in the _address_array (see StubAddrIndexInfo in
> > runtime/stubCodeGenerator.hpp).
> > For stubs that are not generated, -1 is used for both the elements. These tuples are stored in an array
> > StubArchiveData::_index_table indexed by the unique stub id.
> >
> > It is easier to visualize this using a simple example:
> > Assume there are 3 stubs. Stub1 has 2 entry points (S1-1 and S1-2) and an end address (E1). Stub2 is not generated.
> > Stub3 has one entry point (S3-1) and end address (E3).
> > For this case the _address_array and _index_table in the StubArchiveData would have following entries:
> >
> > _address_array:
> > index: 0 1 2 3 4
> > contents: | S1-1 | S1-2 | E1 | S3-1 | E3 |
> >
> > _index_table:
> > index: 0 1 0
> > contents: | 0, 3 | -1, -1 | 3, 2 |
> >
> > When all the stubs of a category are generated, the _address_array and _index_table are stored in the code cache (in
> > SCCache::store_stubroutines_blob).
> > During load time the code along with the _address_array and _index_table is read back from the code cache (in
> > SCCReader::compile_stubroutines_blob).
> > The stubs entry points are set up in their respective routines that generates the stub (such as generate_call_stub)
> > using the _index_table.
> >
> > As the stubs are generated, their entry points are also registered with the SCAddressTable in SCCache.
> > To preserve the order of the SCAddressTable during load, the elements of the _address_array are registered when the
> > buffer is loaded (in SCCReader::compile_stubroutines_blob).
> >
> > In addition, StubArchiveData also stores the name of the stubs which is used to verify the code located in the code
> > cache is indeed for the stub being loaded.
> >
> > ---
> >
> > Apart from implementing the above approach, this patch also has changes to move JFR stubs and throw_exception
> stubs to
> > SharedRuntime and stores/loads them as a RuntimeStub,
> > as they seem to be a better fit there.
> > There are also some minor improvements to logging to trace the stubs generation and store/load times.
> >
> > Lastly, the generated code (stubs and blobs) can be controlled using these two options - StoreStubs and LoadStubs.
> >
> > Please review the code and provide your feedback. Let us know if any clarification is required.
> >
> > There is still scope of improvement. Some of the things that can be done are:
> > - In current scheme if a stub is not generated during training run, but gets generated during the production run,
> the
> > order of SCAddressTable can vary resulting in crash/unexpected behavior
> > - Enum used to enumerate stubs can be improved. AndrewD has the idea of a global enum that identifies all the
> > stubs/blobs and that can address this aspect of the patch and the limitation in the previous point as well.
> > - Exploring other ways to handle external constant data such as tables used by trigonometric stubs which need to be
> > registered in SCAddressTable very early. See StubRoutines::stubs_SCAddressTable_init).
> > - Using InternalAddress (or something similar) relocation type for targets that are in the same blob, so that
> they don't
> > need to be fixed on load.
>
> >
> >
> > [0] change sets: https://github.com/adinn/leyden/compare/d808ea2..7738ed6
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/compare/d808ea2..7738ed6__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQWUGMDsMA$>
> > <https://github.com/adinn/leyden/compare/d808ea2..7738ed6
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/compare/d808ea2..7738ed6__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQWUGMDsMA$>>
> > [1] branch: https://github.com/adinn/leyden/commits/premain-stub-routines/
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/commits/premain-stub-routines/__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQXDMIPmIw$>
> > <https://github.com/adinn/leyden/commits/premain-stub-routines/
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/commits/premain-stub-routines/__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQXDMIPmIw$>>
> > [2] https://github.com/adinn/leyden/commits/premain-save-generated/
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/commits/premain-save-generated/__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQWotIgI3A$>
> > <https://github.com/adinn/leyden/commits/premain-save-generated/
> <https://urldefense.com/v3/__https://github.com/adinn/leyden/commits/premain-save-generated/__;!!ACWV5N9M2RV99hQ!MEDDLwSXR4Xh2A7ZEycnLEL-7bhuOuUi-lQY8eotK5pXXPsqLHRDSMnniudrn07KAL0J4iVbU6XZoQWotIgI3A$>>
> >
> > Thanks,
> > - Ashutosh Mehra
>
More information about the leyden-dev
mailing list