need advice on module structure for ISA-specific sources and classes

Wed Sep 10 00:08:26 UTC 2014

On Sep 9, 2014, at 1:47 AM, Magnus Ihse Bursie <magnus.ihse.bursie at oracle.com> wrote:

> On 2014-09-04 23:13, John Rose wrote:
>> Straw man proposal:  Allow the folder names "cpu.$CPU" and "$OS.$CPU" to occur as a sibling to "share" and $OS in source paths.
> 
> I think the basic idea of putting hardware-dependent code in directories as "siblings" to the share and $OS directories is sound.
> 
> In general, I'm a bit afraid of "over-engineering" in these kinds of issues. It is technically easy to make very strict hierarchies and structures, but in the real world it often turns out that:
> 
> a) we're talking about a very small subset of the code base that is involved
> 
> b) things get messy and do not adhere neatly to strict categorizations. For instance, x86 code is sometimes shared between 32 and 64 bits, and sometimes not. Should you have an x86-common, x86-32 and x86-64? Or just a single x86, and deal with the differences in address size as they arise?
> 
> I think it is worth remembering that any way to organize source code must be *helpful*, not just nicely categorized. And it must be helpful both to the engineers looking for code to modify, and to the build system to automatically figure out what should be included when compiling.
> 
> So, the answer to the question in b) above is "it depends" -- if there is a lot of shared code and just a few places where the address size differs, putting it all in "x86" seems reasonable. On the other hand, if there is very little shared code, splitting it up might make more sense. And if it is just a few lines in a few files, just using #ifdefs might be simpler. And such situations might change over time, and if the change has been too radical, it might be better to reorganize.

Given that (a) each CPU_ARCH (e.g., sparc) has conventions for organizing CPU distinctions (e.g., sparc V8 and V9), and that (b) we get a simpler top-level structure, it looks like we want to use CPU_ARCH names at the top level, unless this proves decisively awkward.

(As a pattern, it appears that the CPU_ARCH name is also the name of the 32-bit CPU variant.  So a top-level split would show up as a new special folder for the 64-bit variant, e.g., x86_64 contributing code which no longer fits in x86.  Seems unlikely.)

> I think the KISS rule is a good guiding principle, and I think John's proposal mostly follows it. The only thing I'd like to suggest instead is that we drop the "cpu." prefix. Sure, in theory we might get confused when someone releases the "x86 OS" or the "macosx CPU" :-) but in reality, there is no problem in telling the difference between windows and sparc. The only thing that is important is that we keep the same order of $OS and $CPU in the "combined" directories, so we do not mix "solaris.sparc" with "arm64.linux".

Point taken.  HotSpot gives a little guidance here, by putting the os before the cpu; let's do that.

> It is also worth noting in the existing solution that the $OS directory is not strictly just OS. It is either "OS" (windows, linux or macosx) or "OS type" (unix). So we already have different kinds of separators as siblings to share.

Yep.

> As for the build system, we find the source code to compile for a specific platform by combining the share directory with the directories matching the "OS" and the "OS type" (if they exist). It would be easy as pie to add $CPU and $OS.$CPU as well to that search path.

That's reassuring!
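
A minimal sketch of the search-path idea discussed above, written in Python rather than make so it can be run on its own. It is not the actual build-system logic: the function name source_roots, the ordering of the candidates, the sep parameter (the dot-versus-dash question is still open in this thread) and the exists hook are all assumptions made for illustration.

```python
import os


def source_roots(base, os_name, os_type, cpu_arch, sep=".", exists=os.path.isdir):
    """Return the source directories to search for one target platform.

    Hypothetical helper: candidates are "share", the OS type, the OS,
    the CPU architecture, and the combined OS+CPU directory (OS first).
    Only directories that exist are kept, mirroring "if they exist".
    """
    candidates = [
        "share",
        os_type,                   # e.g. "unix"
        os_name,                   # e.g. "solaris"
        cpu_arch,                  # e.g. "sparc"
        os_name + sep + cpu_arch,  # e.g. "solaris.sparc"
    ]
    roots = []
    for name in candidates:
        path = os.path.join(base, name)
        # Skip duplicates (OS and OS type may be the same name, e.g. "windows").
        if exists(path) and path not in roots:
            roots.append(path)
    return roots
```

For example, source_roots("src", "solaris", "unix", "sparc") would return whichever of src/share, src/unix, src/solaris, src/sparc and src/solaris.sparc are present on disk, in that order.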

> In the build system, we define two variables for each target platform, OPENJDK_TARGET_CPU_ARCH and OPENJDK_TARGET_CPU, where the latter implies a specific address size as well. I would very much appreciate it if the names used for the new directories matched these variables (for example, the values for OPENJDK_TARGET_CPU_ARCH are: x86, arm, ppc, s390 and sparc), since that will allow us to keep the names consistent in the build, and to do the directory matching without any name translations. (We've got too many of those already in legacy code :-/)

I was hoping for a recommendation like this. I can now retire my straw-man. :-)
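
To illustrate naming the directories straight from the build variables, here is a small Python sketch. The CPU_TO_ARCH table is a hypothetical stand-in for the relationship between OPENJDK_TARGET_CPU and OPENJDK_TARGET_CPU_ARCH. Only the architecture values x86 and sparc come from the list above; the CPU names x86_64 and sparcv9, and the helper cpu_dir_names, are assumptions for illustration.

```python
# Assumed, illustrative pairs of OPENJDK_TARGET_CPU -> OPENJDK_TARGET_CPU_ARCH.
# The real build system defines these values itself; this table is not
# copied from it.
CPU_TO_ARCH = {
    "x86": "x86",
    "x86_64": "x86",
    "sparc": "sparc",
    "sparcv9": "sparc",
}


def cpu_dir_names(target_os, target_cpu, sep="."):
    """Return (cpu_dir, os_cpu_dir) for a target, using CPU_ARCH verbatim.

    The directory name is the CPU_ARCH value itself, so no extra name
    translation is needed. The combined name puts the OS first.
    """
    arch = CPU_TO_ARCH[target_cpu]
    return arch, target_os + sep + arch
```

Under these assumptions both the 32-bit and the 64-bit variant of a CPU land in the same CPU_ARCH directory, e.g. cpu_dir_names("solaris", "sparcv9") gives ("sparc", "solaris.sparc").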

> Finally, my personal opinion is that a dash is a better separator than a dot, e.g. "solaris-sparc" is more readable than "solaris.sparc" (and aligns better with what we've already done in the makefiles), but that's not a big deal.
> 
> /Magnus

Thanks for your carefully-considered comments, Magnus.

— John