EA feedback
Ashutosh Mehra
asmehra at redhat.com
Tue Aug 13 19:42:44 UTC 2024
>
> Being able to trigger assembly/verification via jcmd without
> exiting would make this far easier for us to support.
>
There is a proposed enhancement for doing exactly this (and for exploring other
ways to trigger the end of a training run); see
https://bugs.openjdk.org/browse/JDK-8335358
Thanks,
- Ashutosh Mehra
On Fri, Aug 9, 2024 at 4:38 PM Danny Thomas <dannyt at netflix.com> wrote:
> I tried 24-leydenpremain+2-8 on a few internal applications; some quick
> feedback is below (good to see you folks at the JVM LS!).
>
> If a jar has a Class-Path attribute and one or more of those libraries are
> explicitly on the classpath, it causes the actual and expected classpath to
> always differ. This is also currently the case with CDS, of course, but this
> feature is sure to be deployed far more broadly than CDS is today, so it is
> likely something you will want to look at:
>
> [0.057s][info][class,path] non-existent Class-Path entry lib/failureaccess-1.0.1.jar
> [0.057s][info][class,path] opened: lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
> [0.057s][info][class,path] library = lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
>
> Startup time when training seems to be on par with ArchiveClassesAtExit in
> JDK 21, but it's about a 3.5x startup time penalty for one of our typical
> Spring Boot applications. From a back-to-back run on my machine (AMD EPYC
> 9R14, 32 cores, 123G, Ubuntu 22.04.4 LTS):
>
> Started App in 7.698 seconds (process running for 8.229)
> Started App in 26.247 seconds (process running for 29.262) - w/ CacheDataStore, Training Run
> Started App in 4.341 seconds (process running for 4.917) - w/ CacheDataStore, Production Run
>
> I also got a crash on one attempt; unfortunately, I can't remember what I
> did to cause this:
>
> Stack: [0x00007f3949ab0000,0x00007f3949bb0000], sp=0x00007f3949bae628, free space=1017k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x42ca30] ArchiveBuilder::get_buffered_addr(unsigned char*) const+0x40
> V [libjvm.so+0xce4aa5] VM_PopulateDumpSharedSpace::doit()+0x395
> V [libjvm.so+0x100ae69] VM_Operation::evaluate()+0x109
> V [libjvm.so+0x100e348] VMThread::evaluate_operation(VM_Operation*)+0xe8
> V [libjvm.so+0x10142fb] VMThread::inner_execute(VM_Operation*)+0x35b
> V [libjvm.so+0x101460f] VMThread::run()+0x16f
> V [libjvm.so+0xf6e5cf] Thread::call_run()+0x9f
> V [libjvm.so+0xd74e13] thread_native_entry(Thread*)+0x183
> C [libc.so.6+0x98b07]
>
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000030
>
> Thinking ahead to operationalizing AOT, while a single-shot/on-exit
> workflow is great for iterating locally, requiring the VM to exit makes
> this more difficult to operationalize at scale:
>
> 1. We'll perform training and assembly on test, production canary and
>    production instances on behalf of application owners and handle
>    distribution of the archives. Depending on when we're able to perform a
>    training run, it'll have different benefits. i.e.:
>    1. Test environment will at least improve startup performance, with
>       a mixed benefit for warm up depending on the kind of traffic they
>       take in test
>    2. If an application uses canary deployments we'll have a full
>       production profile prior to the full production deployment, and all
>       instances will come up hot
>    3. If we reach production with only a test environment profile,
>       we'll perform a training run in production, so instances that scale
>       up following that run will come up hot (completely cold instances
>       for an initial deployment is less of a concern, because we deploy
>       immutably and get a natural warm-up period while we have 200%
>       capacity online for a cluster)
> 2. It's currently not a problem if a VM doesn't exit completely due to
>    a dangling non-daemon thread or hung shutdown hook
>
> Being able to trigger assembly/verification via jcmd without
> exiting would make this far easier for us to support. If the overhead of
> the instrumentation for CDS can be avoided, being able to take a snapshot
> at any time on any VM would be better still, but that wouldn't be an
> impediment for us: we'll know that the instance will be used for training
> at boot time.
>
> We build nightlies of all the currently active OpenJDK projects, so if you
> land anything on premain between EA builds that you'd like us to try, let
> us know!
>
> Cheers,
> Danny
>
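
For anyone who wants to check ahead of a training run whether an
application is in the Class-Path situation described in the quoted message,
here is a minimal Java sketch. It reads a jar's Class-Path manifest
attribute and reports the entries that also appear on an explicitly supplied
class path. It is only an illustration and not how the JVM validates the
class path: the class name ClassPathOverlap and all of its method names are
made up for this example, and matching entries by file name alone is a
simplifying assumption.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the JDK or of the Leyden prototype.
final class ClassPathOverlap {

    /** Splits a manifest Class-Path value into its whitespace-separated entries. */
    public static List<String> manifestEntries(String classPathValue) {
        List<String> entries = new ArrayList<>();
        if (classPathValue == null || classPathValue.isBlank()) {
            return entries;
        }
        entries.addAll(Arrays.asList(classPathValue.trim().split("\\s+")));
        return entries;
    }

    /** Returns the part of a path or relative URL after the last '/' or '\'. */
    static String fileName(String path) {
        int i = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(i + 1);
    }

    /**
     * Returns the manifest entries whose file name also occurs on the explicit
     * class path. Assumption: comparing file names is good enough for a quick
     * check; it does not resolve entries relative to the jar's location.
     */
    public static List<String> overlap(List<String> manifestEntries,
                                       List<String> explicitClassPath) {
        List<String> explicitNames = new ArrayList<>();
        for (String p : explicitClassPath) {
            explicitNames.add(fileName(p));
        }
        List<String> result = new ArrayList<>();
        for (String e : manifestEntries) {
            if (explicitNames.contains(fileName(e))) {
                result.add(e);
            }
        }
        return result;
    }

    /** Reads the Class-Path main attribute of a jar, or null if there is none. */
    public static String readClassPathAttribute(File jar) throws IOException {
        try (JarFile jf = new JarFile(jar)) {
            Manifest mf = jf.getManifest();
            return mf == null
                    ? null
                    : mf.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ClassPathOverlap <app.jar> <explicit-class-path>");
            return;
        }
        List<String> declared = manifestEntries(readClassPathAttribute(new File(args[0])));
        List<String> explicit = Arrays.asList(args[1].split(File.pathSeparator));
        for (String e : overlap(declared, explicit)) {
            System.out.println("declared in manifest and on class path: " + e);
        }
    }
}
```

The first argument is the application jar and the second is the same string
that would be passed to -cp. Any entry it prints is a candidate for the
mismatch between the actual and expected classpath.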