<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
Hi Aman,<br>
<br>
You may also run into hidden classes (JEP 371: Hidden Classes) that
allow classes to be defined, at runtime, without names.<br>
It has been proposed to use them for generated proxies but that
hasn't been implemented yet.<br>
There are benefits to having nameless classes: because they can't be
referenced by name, only as a capability, they can be better
encapsulated.<br>
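<br>
As a small illustration (a sketch only; the class name HiddenClassDemo is
made up for this example): on JDK 15 and later, lambda classes are defined
as hidden classes, so `Class.isHidden()` can be used to observe one.<br>
<pre>
```java
final class HiddenClassDemo {
    // Returns the runtime class that the JDK generates for a lambda expression.
    static Class<?> lambdaClass() {
        Runnable r = () -> { };
        return r.getClass();
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        // isHidden() is expected to be true on JDK 15+. The reported name
        // carries a '/' suffix, so it cannot be resolved via Class.forName.
        System.out.println(c.isHidden() + " " + c.getName());
    }
}
```
</pre>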
<br>
fyi, Roger Riggs<br>
<br>
<br>
<div class="moz-cite-prefix">On 5/16/24 8:11 AM, Aman Sharma wrote:<br>
</div>
<blockquote type="cite" cite="mid:50405ca9c0a64372bace175b932f9ef7@kth.se">
<style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Garamond,Georgia,serif;" dir="ltr">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Garamond, Georgia, serif, "EmojiFont", "Apple Color Emoji", "Segoe UI Emoji", NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", EmojiSymbols;">
<p>Hi,</p>
<p><br>
</p>
<p>Thanks for your response, Liang!</p>
<p><br>
</p>
<p>> <span>I think you meant CVE-2021-42392 instead of
2022.</span></p>
<p><br>
</p>
<p>Sorry for the error. I indeed meant <a href="https://nvd.nist.gov/vuln/detail/cve-2021-42392" class="OWAAutoLink" moz-do-not-send="true">
<span>CVE-2021-42392</span></a>.<br>
</p>
<p><br>
</p>
<p>> <span>Leyden mainly avoids this unstable generation
by performing a training run to collect classes loaded</span></p>
<p><br>
</p>
<p>I would love to know more about Project Leyden and how
it has approached this goal so far. In our case, the
training run is the test suite.</p>
<p><br>
</p>
<p>> <span>GeneratedConstructorAccessor is already retired
by JEP 416 [2] in Java 18</span></p>
<p><br>
</p>
<p>I did notice that they no longer appear in my allowlist when I ran my
study subject (Apache PDFBox) with Java 21. Thanks for
letting me know about this JEP. I see they are
re-implemented with method handles.</p>
<p><br>
</p>
<p>> <span>How are you checking the classes?</span></p>
<p><br>
</p>
<p>To detect runtime generated code, we have a javaagent that is
hooked statically to the test suite execution. It gives us
all classes that are loaded after the JVM and the
javaagent are loaded. So we only check the classes loaded
for the purpose of running the application. This is also why
we did not choose -agentlib, as it would also report the classes loaded while
setting up the JVM and the javaagent, whereas the user of our tool
should only need to care about the classes they load.</p>
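<p>A minimal sketch of such an agent hook, for readers unfamiliar with
`java.lang.instrument`. The class name `LoadedClassRecorder` and the jar
name are hypothetical, and this is not the actual sbom.exe code: it only
records class names and leaves the bytecode untouched.</p>
<pre class="notranslate">
```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class LoadedClassRecorder implements ClassFileTransformer {
    // Internal names (e.g. "com/sun/proxy/$Proxy2") seen since the agent started.
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    // Entry point named by the Premain-Class manifest attribute when the JVM
    // is started with -javaagent:agent.jar (the jar name is hypothetical).
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadedClassRecorder());
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // Defensive check: the name can be null for classes defined without one.
        if (className != null) {
            SEEN.add(className);
        }
        return null; // null means "leave the bytecode unchanged"
    }
}
```
</pre>
<p>An agent like this would be packaged in a jar whose manifest has a
`Premain-Class` entry and attached with `-javaagent:`, so it only sees
classes defined after `premain` has registered the transformer.</p>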
<p><br>
</p>
<p>Next, we have a `<span>ClassFileTransformer</span>` hook in
the agent where we compute a checksum of the bytecode
and compare it with the one stored in the
allowlist. The checksum computation algorithm is the same for
both steps. Let me describe how I compute the checksum.</p>
<p><br>
</p>
<ol style="margin-bottom: 0px; margin-top: 0px;">
<li>I get the <a href="https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1" class="OWAAutoLink" moz-do-not-send="true">
CONSTANT_Class_info</a> entry corresponding to
`this_class` and rewrite the <a href="https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7" class="OWAAutoLink" moz-do-not-send="true">
<span>CONSTANT_Utf8_info</span></a> it refers to with a
fixed String constant, say "foo".</li>
<li>Since the name of the class is used to refer to its
members (fields/methods), I get all
<a href="https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2" class="OWAAutoLink" moz-do-not-send="true">
<span>CONSTANT_Fieldref_info</span></a> entries and, if an entry's
`class_index` corresponds to the old `this_class`, I
rewrite the UTF8 value of class_index to the same constant
"foo".</li>
<li>Next, since the names of the fields in Proxy classes
are also suffixed by numbers, for example, `private static
Method m4`, we rewrite the UTF8 value of the name in the
<a href="https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6" class="OWAAutoLink" moz-do-not-send="true">
CONSTANT_NameAndType_info</a>.</li>
<li>These fields can also appear in a random order, so we simply
sort the entire bytecode using `Arrays.sort(byte[])` to
eliminate any differences due to the ordering of
fields/methods.</li>
<li>Simply sorting the byte array still left minute
differences. I could not understand why they existed even
though the values in the constant pool of the bytecode in the
allowlist and at runtime were exactly the same after
rewriting. The differences existed in the bytes of the
Code attribute of methods. I concluded that these bytes
store some position information. To avoid this, I created
a subarray where I considered the bytes corresponding to
<span>`<span>CONSTANT_Utf8_info</span>.bytes` only.
Computing a checksum for it resulted in the same
checksums for both classfiles.</span></li>
</ol>
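<div><br>
</div>
<div>The steps above can be approximated by the following sketch. It is
not the actual tool code and it deviates from the description in places:
it only reads the constant pool, replaces the Utf8 name behind `this_class`
with "foo", collapses names of the form m0, m1, ... to "m" (an assumption
about how the numbered Proxy fields look), and sorts whole Utf8 entries
rather than individual bytes before hashing with SHA-256. The class name
`Utf8Checksum` is hypothetical.</div>
<pre class="notranslate">
```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Utf8Checksum {
    /** SHA-256 (hex) over the normalized, sorted CONSTANT_Utf8 entries of a class file. */
    static String checksum(byte[] classFile) {
        ByteBuffer in = ByteBuffer.wrap(classFile);   // big-endian by default
        in.position(8);                               // skip magic, minor_version, major_version
        int count = in.getShort() & 0xFFFF;           // constant_pool_count
        byte[][] utf8 = new byte[count][];            // Utf8 payloads, by pool index
        int[] nameIndex = new int[count];             // name_index of CONSTANT_Class entries
        for (int i = 1; i < count; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1:                               // CONSTANT_Utf8: u2 length + bytes
                    utf8[i] = new byte[in.getShort() & 0xFFFF];
                    in.get(utf8[i]);
                    break;
                case 7:                               // CONSTANT_Class: u2 name_index
                    nameIndex[i] = in.getShort() & 0xFFFF;
                    break;
                case 5: case 6:                       // Long/Double occupy two pool slots
                    skip(in, 8);
                    i++;
                    break;
                case 15:                              // MethodHandle
                    skip(in, 3);
                    break;
                case 8: case 16: case 19: case 20:    // String, MethodType, Module, Package
                    skip(in, 2);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(in, 4);                      // all remaining 4-byte entries
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        in.getShort();                                // access_flags
        int thisClass = in.getShort() & 0xFFFF;       // this_class
        utf8[nameIndex[thisClass]] = "foo".getBytes(StandardCharsets.UTF_8);

        List<byte[]> entries = new ArrayList<>();
        for (byte[] e : utf8) {
            if (e == null) continue;                  // not a Utf8 slot
            String s = new String(e, StandardCharsets.UTF_8);
            // Assumption: numbered proxy fields are named m0, m1, ...
            entries.add(s.matches("m[0-9]+") ? new byte[] {'m'} : e);
        }
        entries.sort((a, b) -> Arrays.compare(a, b)); // make the digest order-independent
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] e : entries) {
                md.update(e);
                md.update((byte) 0);                  // separator between entries
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    private static void skip(ByteBuffer in, int n) {
        in.position(in.position() + n);
    }
}
```
</pre>
<div>Because only Utf8 entries contribute to the digest, the Code
attribute bytes never enter the checksum. The flip side is that two classes
with identical strings but different instructions would hash the same.</div>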
<div><br>
</div>
<div>Let's understand the whole approach with an example of
a Proxy class.</div>
<div><br>
</div>
<div>`
<pre class="notranslate"><span class="pl-k">public</span> <span class="pl-k">final</span> <span class="pl-k">class</span> <span class="pl-smi">$Proxy42</span> <span class="pl-k">extends</span> <span class="pl-smi">Proxy</span> <span class="pl-k">implements</span> <span>org.apache.logging.log4j.core.config.plugins</span>.<span class="pl-smi">Plugin</span> {</pre>
`</div>
<div><br>
</div>
<div>This will go in the allowlist as "Proxy_Plugin: <SHA256
checksum>". <br>
</div>
<div><br>
</div>
<div>When the same class is intercepted at runtime, say as
"$Proxy10", we look for "Proxy_Plugin" in the allowlist and,
since the checksum algorithm is the same in both cases, we get a
match and let the class load.<br>
</div>
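<div><br>
</div>
<div>One way such a name-independent key could be derived is sketched
below. How the key is actually built is not spelled out above, so the rule
used here (simple name of the superclass, then the sorted simple names of
the interfaces, joined by "_") and the class name `AllowlistKey` are
assumptions for illustration.</div>
<pre class="notranslate">
```java
import java.util.Arrays;

final class AllowlistKey {
    // Builds a key such as "Proxy_Plugin" from internal names
    // (e.g. "java/lang/reflect/Proxy") of the superclass and interfaces.
    static String keyFor(String superName, String... interfaceNames) {
        StringBuilder key = new StringBuilder(simpleName(superName));
        String[] sorted = interfaceNames.clone();
        Arrays.sort(sorted);                 // interface order must not affect the key
        for (String itf : sorted) {
            key.append('_').append(simpleName(itf));
        }
        return key.toString();
    }

    // Strips the package part of an internal name.
    private static String simpleName(String internalName) {
        return internalName.substring(internalName.lastIndexOf('/') + 1);
    }
}
```
</pre>
<div>Simple names can collide across packages, so a real key would
probably need the fully qualified interface names.</div>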
<div><br>
</div>
<div>This approach seems to work well for Proxy classes and
GeneratedConstructorAccessor (which has been removed, as you
said). I also looked at the species generated by method
handles. I did not notice any modification in them. Their
name generation seemed okay to me. If some new species are
generated, they are of course detected since they are not in the
allowlist.</div>
<div><br>
</div>
<div>I have not looked into LambdaMetafactory because I have
not encountered it as a problem so far, but I am aware its
name generation is also unstable. I have run my approach
on only a few projects. As for hidden classes, I assume
the agent won't be able to intercept them, so detecting
them would be really hard.<br>
</div>
<p><br>
</p>
<div id="Signature">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:rgb(0,0,0); font-family:Calibri,Helvetica,sans-serif,"EmojiFont","Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<div id="m_4935352394101912768Signature">
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"><span id="divtagdefaultwrapper" style="font-size:12pt">
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Regards,</span></div>
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Aman Sharma</span></div>
</span><br>
</span></font></div>
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"></span><span class="im">PhD Student<br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">KTH
Royal Institute of Technology</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
</span><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">School
of Electrical Engineering and Computer Science
(EECS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">Department
of Theoretical Computer Science (TCS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"></span></font></div>
</div>
<a href="https://www.kth.se/profile/amansha" class="OWAAutoLink" id="LPNoLP" moz-do-not-send="true"><span style="font-size:10pt"></span></a><a href="https://algomaster99.github.io/" class="OWAAutoLink moz-txt-link-freetext" id="LPNoLP" moz-do-not-send="true">https://algomaster99.github.io/</a><br>
</div>
</div>
</div>
<hr style="display:inline-block; width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b>
<a class="moz-txt-link-abbreviated" href="mailto:liangchenblue@gmail.com">liangchenblue@gmail.com</a> <a class="moz-txt-link-rfc2396E" href="mailto:liangchenblue@gmail.com"><liangchenblue@gmail.com></a><br>
<b>Sent:</b> Thursday, May 16, 2024 5:52:03 AM<br>
<b>To:</b> Aman Sharma; core-libs-dev<br>
<b>Cc:</b> Martin Monperrus<br>
<b>Subject:</b> Re: Deterministic naming of subclasses of
`java/lang/reflect/Proxy`</font>
<div> </div>
</div>
<div>
<div dir="ltr">Hi Aman,
<div>I think you meant CVE-2021-42392 instead of 2022.</div>
<div><br>
<div>For your approach of an "allowlist" for Java runtime,
project Leyden is looking to generate a static image
[1], that </div>
<div>> At run time it cannot load classes from outside
the image, nor can it create classes dynamically.</div>
<div>Leyden mainly avoids this unstable generation by
performing a training run to collect classes loaded and
even object graphs; I am not familiar with the details
unfortunately.</div>
<div><br>
</div>
<div>Otherwise, the Proxy discussion belongs on
core-libs-dev, as java.lang.reflect.Proxy is part of
Java's core libraries. I am replying to this thread on
core-libs-dev.</div>
<div><br>
</div>
<div>For your perceived problem that classes don't have
unique names, your description sounds dubious:
GeneratedConstructorAccessor is already retired by JEP
416 [2] in Java 18, and there are many other cases in
which JDK generates classes without stable names,
notoriously LambdaMetafactory (Gradle wished for
cacheable Lambdas); the same applies to the generated
classes for MethodHandle's LambdaForms (which carry
implementation code for LambdaForm). How are you
checking the classes? It seems you are not checking
hidden classes. Proxy and Lambda classes are defined by
the caller's class loader, while LambdaForms are under
JDK's system class loader I think. We need to ensure you
are correctly finding all unstable classes before we can
proceed.</div>
<div><br>
</div>
<div>[1]: <a href="https://openjdk.org/projects/leyden/notes/01-beginnings" moz-do-not-send="true" class="moz-txt-link-freetext">https://openjdk.org/projects/leyden/notes/01-beginnings</a></div>
<div>[2]: <a href="https://openjdk.org/jeps/416" moz-do-not-send="true" class="moz-txt-link-freetext">https://openjdk.org/jeps/416</a><br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, May 15, 2024 at
7:00 PM Aman Sharma <<a href="mailto:amansha@kth.se" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">amansha@kth.se</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div>
<div dir="ltr">
<div id="m_5381525267202685274m_1510169701429416940divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:rgb(0,0,0); font-family:Garamond,Georgia,serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<p>Hi,</p>
<p><br>
</p>
<p>My name is Aman and I am a PhD student at KTH
Royal Institute of Technology, Stockholm, Sweden.
I do research as part of the
<a href="https://chains.proj.kth.se/" target="_blank" moz-do-not-send="true">CHAINS</a>
project to strengthen the software supply chain of
multiple ecosystems. I particularly focus on
runtime integrity in Java. In this email, I want
to write about an issue I have discovered with
<i>dynamic generation of
`java.lang.reflect.Proxy` classes</i>. I will
propose a solution and would love to hear the
feedback from the community. Let me know if this
is the correct mailing-list for such discussions.
It seemed the most relevant from
<a href="https://mail.openjdk.org/mailman/listinfo" target="_blank" moz-do-not-send="true">this list</a>.</p>
<p><br>
</p>
<p><b>My research</b></p>
<p><b><br>
</b></p>
<p>Java has features to load classes on the fly: it
can either download or generate a class at
runtime. These features are useful for the inner
workings of the JDK, for example, implementing
annotations, reflective access, etc. However,
these features have also contributed to critical
vulnerabilities in the past - CVE-2021-44228
(log4shell), CVE-2022-33980, CVE-2022-42392. All
of these vulnerabilities have one thing in common
-
<i>a class that was not known during build time
was downloaded/generated at runtime and loaded
into JVM.</i></p>
<p><br>
</p>
<p>To defend against such vulnerabilities, we
propose a solution to <i>allowlist classes for
runtime</i>. This allowlist will contain an
exhaustive list of classes that can be loaded by
the JVM and it will be enforced at runtime. We
build this allowlist from three sources:</p>
<ol style="margin-bottom:0px; margin-top:0px">
<li>All classes of all modules provided by the
Java Standard Library. We use <a href="https://github.com/classgraph/classgraph" target="_blank" moz-do-not-send="true">
ClassGraph</a> to scan the JDK. </li>
<li>We can take the source code and all
dependencies of an application. We use a
software bill of materials to get all the data.</li>
<li>Finally, we run the test suite to include
any runtime downloaded/generated classes.</li>
</ol>
<div>Such a list is able to prevent the above 3 CVEs
because it does not let "unknown" bytecode
be loaded.</div>
<div><br>
</div>
<div><b>Problem with generating such an allowlist</b></div>
<div><b><br>
</b></div>
<div>The first two parts of the allowlist are easy
to get. The problem is with the third step where
we want to allowlist all the classes that could be
downloaded or generated. Upon running the test
suite and hooking into the classes it loads, we
observe that the list contains classes
named "<span>com/sun/proxy/$Proxy2</span>" and "<span>jdk/internal/reflect/GeneratedConstructorAccessor3</span>",
among many more. The purpose of these classes can
be identified. The proxy class is created to
implement an annotation. The accessor gives the JVM
access to the constructor of a class.</div>
<div><br>
</div>
<div>When enforcing this allowlist at runtime, we
see that the bytecode content for "<span>com/sun/proxy/$Proxy2</span>"
differs between the allowlist and runtime. In our
case, we are experimenting with
<a href="https://github.com/apache/pdfbox" target="_blank" moz-do-not-send="true">pdfbox</a>,
so we created the allowlist using its test suite.
Then we enforced this allowlist while running some
of its subcommands. However, at runtime there was some other
proxy class, say
<span>"com/sun/proxy/$Proxy5", that
implemented the same interfaces and had the same
methods as "<span>com/sun/proxy/$Proxy2" in the
allowlist. They only differed in the name of
the class, the order of fields, and the types in
field references. This can happen because
the order in which classes are loaded is workload
dependent, but it makes it hard to generate
such an allowlist.</span></span></div>
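<div><br>
</div>
<div>The order dependence can be observed with a few lines of code. This
is a sketch; `ProxyNameDemo` is a made-up class name and the exact package
prefix of the generated name differs between JDK versions.</div>
<pre class="notranslate">
```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;

final class ProxyNameDemo {
    // Returns the runtime class name of a proxy implementing the given interface.
    static String proxyName(Class<?> itf) {
        InvocationHandler noop = (proxy, method, args) -> null;
        Object p = Proxy.newProxyInstance(
                ProxyNameDemo.class.getClassLoader(), new Class<?>[] {itf}, noop);
        return p.getClass().getName();
    }

    public static void main(String[] args) {
        // The numeric suffix reflects creation order, so swapping these two
        // calls is expected to swap the suffixes as well.
        System.out.println(proxyName(Runnable.class));
        System.out.println(proxyName(Callable.class));
    }
}
```
</pre>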
<div><span><span><br>
</span></span></div>
<div><span><span><b>Solution <br>
</b></span></span></div>
<p><br>
We propose that naming of subclasses of "<span>java/lang/reflect/Proxy</span>"
should not be dependent upon the order of loading.
In order to do so, two issues can be fixed:</p>
<ol style="margin-bottom:0px; margin-top:0px">
<li><span><a href="https://github.com/openjdk/jdk/blob/b687aa550837830b38f0f0faa69c353b1e85219c/src/java.base/share/classes/java/lang/reflect/Proxy.java#L531" target="_blank" moz-do-not-send="true">The
naming of the class should not be based on
AtomicLong</a></span>. Rather it could be
named based on the interfaces it implements. I
also wonder why AtomicLong is chosen in the
first place.</li>
<li>Methods of the interfaces must be in a
particular order. Right now, <a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Class.java#L2178" target="_blank" moz-do-not-send="true">they
are not sorted in any particular order</a>.<br>
</li>
</ol>
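<p>To illustrate the second point, a reproducible method order could be
obtained by sorting on the name and parameter types. This is only a sketch
of the idea, not a patch against `Proxy`; the class name `StableMethodOrder`
and the choice of sort key are assumptions.</p>
<pre class="notranslate">
```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;

final class StableMethodOrder {
    // Sort key: the method name followed by its parameter type names.
    static String key(Method m) {
        StringBuilder sb = new StringBuilder(m.getName()).append('(');
        for (Class<?> p : m.getParameterTypes()) {
            sb.append(p.getName()).append(';');
        }
        return sb.append(')').toString();
    }

    // Returns the public methods of an interface in a reproducible order.
    static Method[] sortedMethods(Class<?> itf) {
        Method[] methods = itf.getMethods();   // documented as being in no particular order
        Arrays.sort(methods, Comparator.comparing(StableMethodOrder::key));
        return methods;
    }
}
```
</pre>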
<p><br>
</p>
<div id="m_5381525267202685274m_1510169701429416940Signature">
<div id="m_5381525267202685274m_1510169701429416940divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:rgb(0,0,0); font-family:Calibri,Helvetica,sans-serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<div id="m_5381525267202685274m_1510169701429416940m_4935352394101912768Signature">
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"><span id="m_5381525267202685274m_1510169701429416940divtagdefaultwrapper" style="font-size:12pt">
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">These fixes
will make proxy class generation
deterministic with respect to the
order of loading, so the generated classes won't be
flagged at runtime since the test
suite would already have recorded them.</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif"><br>
</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">I would
love to hear from the community
about these ideas. If in
agreement, I would be happy to
produce a patch. I have discovered
this issue with subclasses of <a href="https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/reflect/ConstructorAccessor.java" target="_blank" moz-do-not-send="true">
GeneratedConstructorAccessor</a>
as well and I imagine it will also
apply to some other runtime
generated classes. If you
disagree, please let me know also.
It helps with my research.</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif"><br>
</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">I also have
<a href="https://github.com/chains-project/exploits-for-sbom.exe" target="_blank" moz-do-not-send="true">
PoCs for the above CVEs</a>, and
a proof-of-concept tool is being
developed under the name
<a href="https://github.com/chains-project/sbom.exe" target="_blank" moz-do-not-send="true">sbom.exe</a>
in case anyone wonders about the
implementation. I would also be
happy to explain more.<br>
</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif"><br>
</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Regards,</span></div>
<div style="margin-top:0px; margin-bottom:0px"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Aman Sharma</span></div>
</span><br>
</span></font></div>
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"></span><span>PhD
Student<br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">KTH
Royal Institute of Technology</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
</span><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">School
of Electrical Engineering and Computer
Science (EECS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">Department
of Theoretical Computer Science (TCS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"></span></font></div>
</div>
<a href="https://www.kth.se/profile/amansha" id="m_5381525267202685274m_1510169701429416940LPNoLP" target="_blank" moz-do-not-send="true"><span style="font-size:10pt"></span></a><a href="https://algomaster99.github.io/" id="m_5381525267202685274m_1510169701429416940LPNoLP" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://algomaster99.github.io/</a><br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>