[crac] RFR: 8369566: CRaC: Record metrics during checkpoint [v3]
Timofei Pushkin
tpushkin at openjdk.org
Thu Oct 16 14:21:17 UTC 2025
On Wed, 15 Oct 2025 09:04:35 GMT, Radim Vansa <rvansa at openjdk.org> wrote:
>> CRaC Engine can support storing additional metadata about the image. This can help the infrastructure to further refine the set of feasible images (constrained by CPU architecture and features) and select the image that is expected to perform best.
>
> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix for PathPatternTest
There should be some way for the user to know which metrics are available because the names will be needed to write a selection policy. Ideally some documentation for the value ranges is also needed.
src/hotspot/share/prims/jvm.cpp line 3726:
> 3724: JVM_END
> 3725:
> 3726:
Redundant empty line
src/hotspot/share/runtime/crac.cpp line 715:
> 713: ResourceMark rm;
> 714: objArrayOop metrics_oops = objArrayOop(JNIHandles::resolve_non_null(metrics));
> 715: typeArrayOop values_oops = typeArrayOop(JNIHandles::resolve_non_null(values));
To be clearer that these are just casts and not construction+copy
Suggestion:
const auto metrics_oops = static_cast<objArrayOop>(JNIHandles::resolve_non_null(metrics));
const auto values_oops = static_cast<typeArrayOop>(JNIHandles::resolve_non_null(values));
src/hotspot/share/runtime/crac.cpp line 733:
> 731:
> 732: double uptime = TimeHelper::counter_to_millis(os::elapsed_counter());
> 733: result = result && _engine->set_score("vm.boot_time", os::javaTimeMillis() - uptime);
Shouldn't `boot_time()` use `TimeHelper` too?
src/hotspot/share/runtime/crac.cpp line 746:
> 744: result = result && _engine->set_score("java.cls.sharedUnloadedClasses", shared_unloaded_classes);
> 745: #endif // INCLUDE_MANAGEMENT
> 746: result = result && _engine->set_score("sun.cls.appClassLoadCount", ClassLoader::perf_app_classload_count()->get_value());
`ClassLoader::perf_app_classload_count()` should return null when `UsePerfData` is false. Worth to check the other counters too.
src/hotspot/share/runtime/crac.hpp line 44:
> 42: static Handle checkpoint(jarray fd_arr, jobjectArray obj_arr, bool dry_run, jlong jcmd_stream, TRAPS);
> 43: static bool is_image_score_supported();
> 44: static bool record_image_score(jobjectArray metrics, jdoubleArray values);
Since they are closer to each other than to the checkpointing methods...
Suggestion:
static bool is_image_score_supported();
static bool record_image_score(jobjectArray metrics, jdoubleArray values);
src/hotspot/share/runtime/crac_engine.cpp line 391:
> 389: return ApiStatus::OK;
> 390:
> 391:
Excessive empty line
src/hotspot/share/runtime/crac_engine.cpp line 396:
> 394: require_method(set_restore_data)
> 395: require_method(get_restore_data)
> 396: complete_extension_api(_restore_data_api);
Suggestion:
complete_extension_api(_restore_data_api)
src/hotspot/share/runtime/crac_engine.hpp line 76:
> 74:
> 75: ApiStatus prepare_image_score_api();
> 76: bool set_score(const char* metric, double value);
Suggestion:
bool set_score(const char* metric, double value) const;
src/java.base/share/classes/jdk/internal/crac/Score.java line 1:
> 1: package jdk.internal.crac;
Missing a license header
src/java.base/share/classes/jdk/internal/crac/Score.java line 41:
> 39: private static native boolean record(String[] metrics, double[] values);
> 40:
> 41: static synchronized void record() {
Probably should also be private as it is intended for the resource only
src/java.base/share/classes/jdk/internal/crac/Score.java line 60:
> 58: * recording metadata this is ignored.
> 59: * On checkpoint the metrics are not reset; if that is desired invoke {@link #resetAll()}
> 60: * manually.
BTW we should also be documenting this on the engine API level: whether something is reset on checkpoint or not. AFAIR user data is reset, constraints are not, scores — I haven't checked yet. This can be postponed until we claim the support of repeated checkpoints, I am just saying.
src/java.base/share/classes/jdk/internal/crac/mirror/impl/GlobalContext.java line 10:
> 8: private static final String GLOBAL_CONTEXT_IMPL_PROP = "jdk.crac.globalContext.impl";
> 9:
> 10: public static class Score implements JDKResource {
Both `Score` and `GlobalContext` should have private constructors for clarity
src/java.base/share/classes/jdk/internal/crac/mirror/impl/GlobalContext.java line 18:
> 16: if (ctx instanceof OrderedContext<?> octx) {
> 17: jdk.internal.crac.Score.setScore("jdk.crac.globalContext.size", octx.size());
> 18: }
I believe it should always be an `OrderedContext`, why not a cast then?
src/java.base/share/native/libcrexec/image_score.cpp line 44:
> 42: }
> 43: _score.foreach([&](const Score& score){
> 44: fprintf(f, "%s=%f\n", score._name, score._value);
There's no validation that `=` and `\n` are not present in the name so there will probably be problems with parsing
src/java.base/share/native/libcrexec/image_score.hpp line 40:
> 38: struct Score {
> 39: const char* _name;
> 40: double _value;
Since these are public it would be better not to have a leading underscore
src/java.base/share/native/libcrexec/image_score.hpp line 61:
> 59: const char* name_copy = strdup(name);
> 60: if (name_copy == nullptr) {
> 61: return false;
A error message is needed
src/java.base/share/native/libcrexec/image_score.hpp line 64:
> 62: }
> 63: return _score.add(Score(name_copy, value));
> 64: }
This does not adhere to the engine API specification that when a metric name is repeated the old value is replaced
-------------
PR Review: https://git.openjdk.org/crac/pull/269#pullrequestreview-3344638001
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435757100
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435946887
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435977554
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435991007
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435924045
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435895292
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435900501
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435910912
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435763187
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435813602
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435793224
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435847852
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2435836395
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2436061378
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2436065260
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2436021600
PR Review Comment: https://git.openjdk.org/crac/pull/269#discussion_r2436037555
More information about the crac-dev
mailing list