RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive
Ioi Lam
ioi.lam at oracle.com
Mon Aug 24 22:53:52 UTC 2020
On 8/20/20 5:10 PM, Mandy Chung wrote:
>
>
> On 8/19/20 10:14 PM, Yumin Qi wrote:
>>
>> HI, Mandy
>>
>> Thanks for the review, I took one day off yesterday so just got a
>> detail look of your reply.
>>
>> On 8/19/20 1:30 PM, Mandy Chung wrote:
>>> On 8/17/20 12:37 PM, Yumin Qi wrote:
>>>> Hi, Ioi
>>>>
>>>> Thanks for review/suggestion. I have updated the webrev at the
>>>> following link:
>>>>
>>>> http://cr.openjdk.java.net/~minqi/2020/8247536/webrev-02/
>>>>
>>>
>>> This patch leverages the TRACE_RESOLVE output and passes the trace
>>> output to VM. VM then calls
>>> GenerateJLIClassesHelper::generateMHHolderClasses to do the parsing
>>> and generate Holder class per the resolved LFs. I think there are
>>> other cleaner alternatives implementing this. jlink
>>> --generate-jli-classes plugin depends the trace output whereas
>>> -Xshare:dump does not. It's cleaner to skip generating the trace
>>> output and parsing for dumping shared archive purpose. In addition,
>>> the implementation needs some cleanup (I can send you feedback on
>>> the next revision)
>>>
>> Current patch did not change the existing code for JLinkPlugin part.
>> I just moved the parsing code from GenerateJLIClassesPlugin.java to
>> GenerateJLIClassesHelper.java since the former is an internal class
>> to which we shouldn't call to generate holder classes.
>>> Instead of relying on a system property
>>> "java.lang.invoke.MethodHandle.CDS_TRACE_RESOLVE", it's better to
>>> use jdk.internal.vm.isCDSDumpingEnabled() to detect if this is CDS
>>> dump time.
>>
>> I remember we have such API to query if flag -Xshare:dump or
>> -Xshare:on used. Do you mean if DumpLoadedClassList flag set? This
>> flag is the one used to log class name to list file. In
>> GenerateLinkOptData.gmk:
>>
>> $(CLASSLIST_FILE): $(INTERIM_IMAGE_DIR)/bin/java$(EXE_SUFFIX)
>> $(CLASSLIST_JAR)
>> $(call MakeDir, $(LINK_OPT_DIR))
>> $(call LogInfo, Generating $(patsubst $(OUTPUTDIR)/%, %, $@))
>> $(call LogInfo, Generating $(patsubst $(OUTPUTDIR)/%, %,
>> $(JLI_TRACE_FILE)))
>> $(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java
>> -XX:DumpLoadedClassList=$@.raw \
>> -Duser.language=en -Duser.country=US \
>> -cp $(SUPPORT_OUTPUTDIR)/classlist.jar \
>> build.tools.classlist.HelloClasslist $(LOG_DEBUG)
>> $(GREP) -v HelloClasslist $@.raw > $@.interim
>> $(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java -Xshare:dump \
>> -XX:SharedClassListFile=$@.interim
>> -XX:SharedArchiveFile=$@.jsa \
>> -Xmx128M -Xms128M $(LOG_INFO)
>> $(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java
>> -XX:DumpLoadedClassList=$@.raw.2 \
>> -XX:SharedClassListFile=$@.interim
>> -XX:SharedArchiveFile=$@.jsa \
>> -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true \
>> -Duser.language=en -Duser.country=US \
>> --module-path $(SUPPORT_OUTPUTDIR)/classlist.jar \
>> -cp $(SUPPORT_OUTPUTDIR)/classlist.jar \
>> build.tools.classlist.HelloClasslist \
>> 2> $(LINK_OPT_DIR)/stderr > $(JLI_TRACE_FILE) \
>> || ( \
>> exitcode=$$? ; \
>> $(ECHO) "ERROR: Failed to generate link optimization
>> data." \
>> "This is likely a problem with the newly built
>> JVM/JDK." ; \
>> $(CAT) $(LINK_OPT_DIR)/stderr $(JLI_TRACE_FILE) ; \
>> exit $$exitcode \
>> )
>> $(GREP) -v HelloClasslist $@.raw.2 > $@
>>
>> The $(JLI_TRACE_FILE) is generated with both -XX:DumpLoadedClassList
>> and -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true, in current
>> implementation, DumpLoadedClassList will turn on property
>> java.lang.invoke.MethodHandle.CDS_TRACE_RESOLVE=true. So the same
>> output sent to stdout and log file DumpLoadedClassList specified.
>>
>
> These entries are duplicated in two different files: one for jlink
> --generate-jli-classes plugin and another for CDS use. CDS
> -Xshare:dump attempts to do what jlink plugin does but writes those
> generated classes in to shared archive.
>
> Like the above make logic to build JDK image, the same entries are
> written in both default-jli-trace.txt via System.out and to classlist
> via JNI call to the VM. I guess VM also implements the logic to do
> some kind of diffing and write to CDS archive.
>
>> Now instead of this property, using a vm interface API to query if
>> this flag is set, I think it is better choice. But here I am NOT sure
>> I understand your suggestion, I think there are two choices:
>>
>> 1) Using DumpLoadedClassList to collect TRACE_RESOLVE but not via
>> CDS_TRACE_RESOLVE, using new API to query if DumpLoadedClassList is set
>>
>> 2) Do not use DumpLoadedClassList, when -Xshare:dump collecting those
>> name, type and holder name to regenerate holder classes?
>>
>
> I misunderstood that this CDS_TRACE_RESOLVE flag is set during
> -Xshare:dump time.
>
> Ioi has clarified to me offline that this step is actually part of
> -XX:DumpLoadedClassList and includes these TRACE_RESOLVE logs to the
> given class list file, i.e. you repurpose the class list file to
> include the log output that was initially designed for jlink plugin.
>
> To me, I'd prefer to see this support depending on `jlink
> --generate-jli-classes` which is an existing functionality and much
> cleaner. This way this does not require any VM change. It will
> generate the holder classes in the custom image per the
> application-specific config file.
>
> What it means is that: a customer would need to create a custom image
> with their application-specific config file. It might need a new CDS
> option to specify a separate TRACE_RESOLVE file. It would turn on
> this feature by default by defining a default path of the log file if
> it helps.
>
> I understand that this is not the existing CDS work flow and CDS
> archive does not require to run on a custom image. I see the value
> of this approach which can prepare customers to start building and
> using its own custom image.
>
> Of course the implementation would be much simpler (adding a flag to
> write these traces to a given file path, that is).
>
Hi Mandy,
jlink already supports everything that's provided by this patch (so why
are we doing it?):
$ java -cp . -Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true TestJLI
| grep RESOLVE > jli-trace.txt
$ grep fail jli-trace.txt
[LF_RESOLVE] java.lang.invoke.DirectMethodHandle$Holder invokeSpecial
LLD3_L (fail)
[LF_RESOLVE] java.lang.invoke.Invokers$Holder linkToTargetMethod D3L_L
(fail)
$ jlink --add-modules java.base --generate-jli-classes=@jli-trace.txt
--output myjdk
$ du -sm myjdk
48 myjdk
$ ./myjdk/bin/java -cp .
-Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true TestJLI > after.txt
$ grep D3 after.txt
[LF_RESOLVE] java.lang.invoke.DirectMethodHandle$Holder invokeSpecial
LLD3_L (success)
[LF_RESOLVE] java.lang.invoke.Invokers$Holder linkToTargetMethod D3L_L
(success)
However, I suspect most people won't do this, because the benefit is
relatively small. Also, jlink doesn't support classpath apps, so you
would need to figure out what to use for "--add-modules". In the worst
case, the custom JDK would be over 100MB in size.
So what we want to accomplish with this patch is to make the
"--generate-jli-classes" feature available to people who are already
using CDS, so that they don't need to create a custom JDK just to get a
few optimized XXX$Holder classes. In fact, CDS users don't need to do
anything extra, as the LF_RESOLVE traces are automatically included in
the output of -XX:DumpLoadedClassList. Once they upgrade to JDK 16, they
will automatically benefit from this patch.
I understand that this is a little messy. We need to move some code from
the jdk.jlink module into java.base, such as
GenerateJLIClassesHelper::generateMethodHandleHolderClasses. These are
simple text parsing functions. They are called by hotspot, and hotspot
is conceptually part of java.base. If we leave them inside jdk.jlink,
then we would have part of java.base depending on jdk.link, which is not
good.
========
In the long term, in project Leyden, we will have better integration
between jlink, AOT, CDS, etc. Today we use "jaotc" to create
AOT-compiled code, and "java -Xshare:dump" to create the CDS archive. I
think eventually we will just use "jlink", which will internally drive
AOT and CDS.
Today jlink create a whole JDK image, but we may want to generate just
an executable, something like
$ jlink --add-modules myapp ..... --trace-file myapp.trace --output
myapp.exe
$ ./myapp.exe options ...
For space saving, we may not copy all the contents of the JDK into
myapp.exe. Instead, some contents may be loaded from $JAVA_HOME. Of
course, there will also be a way to create a "stand-alone" myapp.exe
that doesn't depend on $JAVA_HOME.
(In this new world, we will probably move the
GenerateJLIClassesHelper::generateMethodHandleHolderClasses code back to
jdk.jlink, as they won't be needed by CDS anymore.
Similarly, today we have CDS hooks in the Lambda proxy generation code
in java.lang.invoke. I suspect those will also be unnecessary, as the
pre-resolution of dynamic bytecodes can probably be done with less
intervention from CDS.)
=========
As you can probably see, the changes in jlink in Leyden will be
substantial, and will take multiple JDK releases to roll out. There are
a lot of design that need to happen, and it will be much more
complicated that just invoking "jlink" with "--cds".
While Leyden is making progress, in the short term, we want to keep
making improvements in the existing usage model of CDS. Patches like
this one is a little messy, but I think it's still under control, and
hopefully we won't have many patches of this kind.
Thanks
- Ioi
> Mandy
More information about the hotspot-runtime-dev
mailing list