RFR: 8360037: Refactor ImageReader in preparation for Valhalla support
David Beaumont
duke at openjdk.org
Tue Jul 1 14:18:21 UTC 2025
On Mon, 30 Jun 2025 23:27:49 GMT, David Beaumont <duke at openjdk.org> wrote:
> Refactoring `ImageReader` to make it easy to add preview mode functionality for Valhalla.
>
> This PR is a large change to `ImageReader` (effectively a rewrite) but reduces the surface area of the API significantly, reduces code complexity and increases performance/memory efficiency. The need for this sort of change comes from the need to support "preview mode" in Valhalla for classes loaded from core modules.
>
> ### Preview Mode
>
> In the proposed preview mode support for Valhalla, a resource with the name `/modules/<module>/<path>` may come from one of two locations in the underlying jimage file; `/<module>/<path>` or `/<module>/META-INF/preview/<path>`. Furthermore, directories (represented as directory nodes via the API) will have a potentially different number of child nodes depending on whether preview mode is in effect, and some directories may only exist at all in preview mode.
>
> Furthermore, the directories and symbolic link nodes in `/packages/...` will also be affected by the existence of new package directories. To retain consistency and avoid "surprises" later, all of this needs to be taken into account.
>
> Below is a summary of the main changes for mainline JDK to better support preview mode later:
>
> ### 1: Improved encapsulation for preview mode
>
> The current `ImageReader` code exposes the data from the jimage file in several ways; via the `Node` abstraction, but also with methods which return an `ImageLocation` instance directly. In preview mode it would not be acceptable to return the `ImageLocation`, since that would leak the existence of resources in the `META-INF/preview` namespace and lets users see resource nodes with names that don't match the underlying `ImageLocation` from which their contents come.
>
> As such, the PR removes all methods which can leak `ImageLocation` or other "underlying" semantics about the jimage file. Luckily most of these are either used minimally and easily migrated to using nodes, or they were not being used at all. This PR also removes any methods which were already unused across the OpenJDK codebase (if I discover any issues with over-trimming of the API during full CI testing, it will be easy to address).
>
> ### 2. Simplification of directory child node handling
>
> The current `ImageReader` code attempts to create parent directories "on demand" for any child nodes it creates. This results in parent directories having a non-empty but incomplete set of child nodes present. This is referred to as the directory being "incomple...
Anything marked "TODO" is for discussion during the coming review. I don't intend to leave any of these in the code after, but they raise specific issues I'd like address.
With these pre-preview comments, I think this is finally ready for review.
make/test/BuildMicrobenchmark.gmk line 1:
> 1: #
Ignore this file, it's part of the PR to add the benchmark. I'll merge and sort everything out once that's in.
src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 1:
> 1: /*
I think that this class is best reviewed as if it were a new implementation, rather than trying to reason about the specific changes between the versions. Hopefully the comments will make the design clear and the reduced complexity/lines-of-code will help it be understood in a more stand-alone way.
src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 95:
> 93: }
> 94:
> 95: // directory management interface
A large number of the methods below were either:
1. Unused or effectively no-ops.
2. Breaking encapsulation and leaking underlying types such as `ImageLocation` (not a big problem now, but prevents cleanly implementing the preview mode logic for Valhalla).
The new API has essentially 3 top level methods here:
1. findNode()
2. getResource(Node)
3. getResourceBuffer(Node)
Any navigation of the node hierarchy is done via `getChildNames()` and unlike now, no user can obtain a reference to an "incomplete" node.
src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 157:
> 155: }
> 156:
> 157: /**
This method and the `getResourceBuffer()` method below are, I think, only called in one place and could, in theory, be factored out of this class altogether.
src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 255:
> 253: if (openers.isEmpty()) {
> 254: close();
> 255: // TODO (review note): Should this be synchronized by "this" ??
I genuinely this this might be an existing issue.
src/java.base/share/classes/jdk/internal/jimage/ImageReader.java line 601:
> 599: }
> 600:
> 601: /**
Most of these were either never called or no longer needed in the new implementation.
src/java.base/share/classes/jdk/internal/jrtfs/ExplodedImage.java line 122:
> 120:
> 121: @Override
> 122: public List<Node> getChildren() {
ExplodedImage makes a custom subclass of Node to "fake" the ImageReader behaviour. While this will need to change for preview mode, for now it's enough to just modify the one affected API.
src/java.base/share/classes/jdk/internal/jrtfs/JrtFileAttributes.java line 52:
> 50: @Override
> 51: public FileTime creationTime() {
> 52: return node.creationTime();
There's no benefit to having all these one-line getter methods duplicated in `ImageReader` since it already provides the file attributes object directly.
src/java.base/share/classes/jdk/internal/jrtfs/JrtFileSystem.java line 228:
> 226: }
> 227: if (filter == null) {
> 228: return node.getChildren()
Same logic, just based on the child names instead of the nodes.
src/java.base/share/classes/jdk/internal/jrtfs/SystemImage.java line 61:
> 59: // open a .jimage and build directory structure
> 60: final ImageReader image = ImageReader.open(moduleImageFile);
> 61: image.getRootDirectory();
This method no longer serves any purpose (it used to initialize the root directory entries, but it's not needed now.
src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 224:
> 222: ImageReader reader = SystemImage.reader();
> 223: for (String mn : reader.getModuleNames()) {
> 224: ImageLocation loc = reader.findLocation(mn, "module-info.class");
It is unsafe to let users see the underlying `ImageLocation` for a resource in preview mode, so this code had to be migrated to using a `Node` based view of the data. This is probably the most complex change not part of `ImageReader` itself.
src/java.base/share/classes/jdk/internal/module/SystemModuleFinders.java line 509:
> 507: if (node.isDirectory()) {
> 508: // build node
> 509: ImageReader.Node dir = SystemImage.reader().findNode(name);
This is an example of the fragility of the current "partially complete directory" behaviour. The `node` variable is exactly the same instance as the `dir` variable, and the call to `findNode()` serves only to ensure the directory is "complete" before use. In the new API only the name is available, so `findNode()` *must* be called, ensuring there's no risk of accidentally using an incomplete directory node.
test/jdk/jdk/internal/jimage/ImageReaderTest.java line 78:
> 76:
> 77: @Test
> 78: public void testModuleDirectories() throws IOException {
I could make these more data driven (i.e. parametrized) if people want.
test/jdk/jdk/internal/jimage/ImageReaderTest.java line 217:
> 215:
> 216: /// Loads and performs actions on classes stored in a given `ImageReader`.
> 217: private static class ImageClassLoader extends ClassLoader {
Loading and running class bytes to get a `toString()` value will be super important in the preview mode tests to prove that the correct class bytes were used.
test/jdk/jdk/internal/jimage/JImageReadTest.java line 28:
> 26: * @summary Unit test for libjimage JIMAGE_Open/Read/Close
> 27: * @modules java.base/jdk.internal.jimage
> 28: * @run testng/othervm JImageReadTest
Looking back, I don't recall exactly why this was needed. I will see if it's still necessary, but I think it was.
test/jdk/tools/jimage/ImageReaderDuplicateChildNodesTest.java line 1:
> 1: /*
Just uses child names rather than nodes.
-------------
PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2973065956
PR Review: https://git.openjdk.org/jdk/pull/26054#pullrequestreview-2975530216
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177654333
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177702506
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2176142533
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177705221
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177707657
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177710965
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177658457
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177660750
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177664240
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177671307
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177676879
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177687414
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177691003
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177692842
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177694731
PR Review Comment: https://git.openjdk.org/jdk/pull/26054#discussion_r2177697691
More information about the build-dev
mailing list