module structure for ISA-specific sources and classes

Wed Sep 10 00:34:15 UTC 2014

I asked the Jigsaw-dev list for advice on placing ISA-specific code into the OpenJDK source tree.

Below is the fullest answer; this appears to be the consensus.  The full thread is here:
  http://mail.openjdk.java.net/pipermail/jigsaw-dev/2014-September/003574.html

Summary:

Let's use the existing names for OPENJDK_TARGET_CPU_ARCH (which are x86, arm, ppc, s390, sparc), placed as siblings of "share" and "$OS" in the source paths.

I suggest that the assembly code and low-level C code in libffi should go here:
  src/java.base/x86/native/jdk/internal/ffi/...

The main idea is that this would simplify the OpenJDK makefiles' access to cpu-specific code.

The JNI code for JFFI, which is shared, can go here as expected:
  src/java.base/share/native/jdk/internal/ffi/...

Also, this convention gives a couple more options for placing cpu-specific Java code:
  [0] src/java.base/share/classes/jdk/internal/jnr/x86asm/Assembler.java
  [1] src/java.base/x86/classes/jdk/internal/jnr/x86asm/Assembler.java
  [2] src/java.base/x86/classes/jdk/internal/jnr/asm/Assembler.java

The current layout is something like [0], while [1] would give a clearer picture of where to find cpu-specific stuff.  The option [2] may be overkill, unless Assembler implements a platform-independent API and we don't want to go through a provider-type API.

Actually, I don't see a very strong reason to place Java code in cpu-specific top-level folders in the repo.  The strongest reason (IMO) is to avoid building class files for cpus that are irrelevant to the currently-building cpu.  But there aren't many such files, at least not yet, so the effect of building the extra files is a fleabite.

There is a proposal for jointly-specialized folders too, which we can think about:  solaris-x86, linux-sparc, etc., sibling with "share", "x86", and "linux".  Unless there is a lot of native code for all these combinations, we shouldn't make those os-cpu folders.

(Since we are doing JNI enhancements, we should be working under the java.base module, which is a new thing for JDK 9.)

Comments?

— John

Begin forwarded message:

From: Magnus Ihse Bursie <magnus.ihse.bursie at oracle.com>
Subject: Re: need advice on module structure for ISA-specific sources and classes
Date: September 9, 2014 at 1:47:16 AM PDT
To: John Rose <john.r.rose at oracle.com>, jigsaw-dev at openjdk.java.net

On 2014-09-04 23:13, John Rose wrote:
> Straw man proposal:  Allow the folder names "cpu.$CPU" and "$OS.$CPU" to occur as a sibling to "share" and $OS in source paths.

I think the basic idea of putting hardware-dependent code in directories as "siblings" to the share and $OS directories is sound.

In general, I'm a bit afraid of "over-engineering" in these kinds of issues. It is technically easy to make very strict hierarchies and structures, but in the real world it often turns out that:

a) we're talking about a very small subset of the code base that is involved

b) things get messy and do not nicely adhere to nice and strict categorizations. For instance, x86 code is sometimes shared between 32 and 64 bits, and sometimes not. Should you have an x86-common, x86-32 and x86-64? Or just a single x86, and deal with the differences in address size as it happens?

I think it is worth remembering that any way to organize source code must be *helpful*, not just nicely categorized. And it must be helpful both to the engineers looking for code to modify, and to the build system to automatically figure out what should be included when compiling.

So, the answer to the question in b) above is "it depends" -- if there is a lot of shared code and just a few places where the address size differs, putting it all in "x86" seems reasonable. On the other hand, if there is very little shared code, splitting it up might make more sense. And if it is just a few lines in a few files, just using #ifdefs might be simpler. And such situations might change over time, and if the change has been too radical, it might be better to reorganize.

I think the KISS-rule is a good guiding principle, and I think John's proposal mostly follows that. The only thing I'd like to suggest instead is that we drop the "cpu." prefix. Sure, in theory we might get confused when someone releases the "x86 OS" or the "macosx CPU" :-) but in reality, there is no problem in telling the difference between windows and sparc. The only thing that is important is that we keep the same order of $OS and $CPU in the "combined" directories, so we do not mix "solaris.sparc" with "arm64.linux".
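
To make the ordering rule concrete, here is a minimal Python sketch of a check a script could run over the directory names that sit next to "share". It is only an illustration: the function name `classify`, the two name sets, and the error messages are assumptions made for this example, not anything in the build system. The name sets only list names mentioned in this thread.

```python
# Illustrative name sets taken from this thread; not an authoritative list.
# "unix" is an "OS type" rather than an OS, but it sits in the same slot.
OS_NAMES = {"windows", "linux", "macosx", "solaris", "unix"}
CPU_NAMES = {"x86", "arm", "ppc", "s390", "sparc"}


def classify(dirname, sep="."):
    """Classify a directory name that is a sibling of "share".

    Returns "share", "os", "cpu" or "os+cpu". Raises ValueError for
    unknown names and for combined names written CPU-first.
    """
    if dirname == "share":
        return "share"
    if dirname in OS_NAMES:
        return "os"
    if dirname in CPU_NAMES:
        return "cpu"
    first, found, second = dirname.partition(sep)
    if found and first in OS_NAMES and second in CPU_NAMES:
        return "os+cpu"
    if found and first in CPU_NAMES and second in OS_NAMES:
        # Same two components, wrong order: suggest the OS-first spelling.
        raise ValueError("%r puts the CPU before the OS; use %r"
                         % (dirname, second + sep + first))
    raise ValueError("unrecognized directory name: %r" % dirname)
```

With these sets, `classify("solaris.sparc")` is accepted as a combined directory, while `classify("sparc.solaris")` is rejected with a hint to swap the two parts. No prefix is needed to tell "windows" from "sparc", because the two sets do not overlap.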

It is also worth noting that in the existing solution the $OS directory is not strictly just OS. It is either "OS" (windows, linux or macosx) or "OS type" (unix). So we already have different kinds of separators as siblings to share.

As for the build system, we find the source code to compile for a specific platform by combining the share directory with the directories matching the "OS" and the "OS type" (if they exist). It would be easy as pie to add $CPU and $OS.$CPU as well to that search path.
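
As a rough illustration of the search path just described, here is a minimal Python sketch. It is not the actual makefile logic: the function name `source_roots`, its parameters, and the throwaway directory tree are all assumptions made for this example, and the separator for the combined directory is left as a parameter because the thread has not settled on one.

```python
import os
import tempfile


def source_roots(module_dir, target_os, os_type, cpu_arch, sep="-"):
    """Return the existing source roots for one target platform.

    Candidates are tried from most generic to most specific, and only
    directories that actually exist under module_dir are returned.
    """
    candidates = ["share", os_type, target_os, cpu_arch,
                  target_os + sep + cpu_arch]
    roots = []
    for name in candidates:
        if not name:
            continue  # e.g. a platform with no separate "OS type"
        path = os.path.join(module_dir, name)
        if os.path.isdir(path) and path not in roots:
            roots.append(path)
    return roots


if __name__ == "__main__":
    # Build a throwaway tree so the example is self-contained.
    with tempfile.TemporaryDirectory() as top:
        module_dir = os.path.join(top, "src", "java.base")
        for name in ("share", "unix", "linux", "windows", "x86", "sparc"):
            os.makedirs(os.path.join(module_dir, name))
        for root in source_roots(module_dir, "linux", "unix", "x86"):
            print(os.path.relpath(root, top))
```

Because missing directories are skipped, a combined directory such as "linux-x86" costs nothing until someone creates it, and directories for other platforms ("windows", "sparc") are never visited.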

In the build system, we define two variables for each target platform, OPENJDK_TARGET_CPU_ARCH and OPENJDK_TARGET_CPU, where the latter implies a specific address size as well. I would very much appreciate it if the names used for the new directories matched these variables (for example, the values for OPENJDK_TARGET_CPU_ARCH are: x86, arm, ppc, s390 and sparc), since that will allow us to keep the names consistent across the build, and to do the directory matching without any name translations. (We've got too many of those already in legacy code :-/)

Finally, my personal opinion is that a dash is a better separator than a dot, e.g. "solaris-sparc" is more readable than "solaris.sparc" (and aligns better with what we've already done in the makefiles), but that's not a big deal.

/Magnus