RFR: 8256541: Sort out what version of awk is used in the build system

Magnus Ihse Bursie ihse at openjdk.java.net
Thu Nov 26 21:03:01 UTC 2020


For historical reasons, there exist a variety of different implementations of awk: awk (the original implementation), gawk (the GNU version), nawk (new awk, iirc) and the lesser-known mawk.

Things are complicated by the fact that the original awk is seldom used, but instead gawk or nawk is typically symlinked to be named "awk". 

In terms of functionality there are very few differences. The original awk is the most limited, while nawk and gawk are mostly interchangeable.

So the conditions here are somewhat messy, but we manage impressively to mess it up even further. :-)

We set up the following definitions: 
`BASIC_REQUIRE_PROGS(NAWK, [nawk gawk awk])`
`BASIC_REQUIRE_SPECIAL(AWK, [AC_PROG_AWK])`
and `AC_PROG_AWK`, according to the documentation, "[c]heck for gawk, mawk, nawk, and awk, in that order". 

So, if you have nawk and awk (but no other) installed, both NAWK and AWK will be set to nawk. If you have only awk, both will be set to awk. The difference arises if you also have gawk installed: then NAWK will be nawk and AWK will be gawk.
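
To make the two search orders concrete, here is a minimal sketch of a "first match wins" lookup. The helper name `first_available` is made up for illustration and is not part of the build system; the two candidate lists are the ones quoted above.

```shell
# Hypothetical helper (not part of the JDK build): print the first
# candidate name that can be found on PATH, and fail if none is found.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Search order used for NAWK in the definitions above.
first_available nawk gawk awk || echo "no awk found"

# Search order documented for AC_PROG_AWK.
first_available gawk mawk nawk awk || echo "no awk found"
```

On a machine with both nawk and gawk installed, the first call would print nawk and the second gawk, which is exactly the mismatch described above.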

As an example, on my Mac, I only have the original awk, so both AWK and NAWK will be awk.

On my Ubuntu box, things are even more confused. I have:
$ ls -l /usr/bin/*awk 
lrwxrwxrwx 1 root root 21 Feb 6 10:36 awk -> /etc/alternatives/awk* 
-rwxr-xr-x 1 root root 658072 Feb 11 2018 gawk* 
-rwxr-xr-x 1 root root 3189 Feb 11 2018 igawk* 
-rwxr-xr-x 1 root root 125416 Apr 3 2018 mawk* 
lrwxrwxrwx 1 root root 22 Feb 6 10:37 nawk -> /etc/alternatives/nawk* 

$ ls -l /etc/alternatives/*awk 
lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/awk -> /usr/bin/gawk* 
lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/nawk -> /usr/bin/gawk* 

So awk, nawk and gawk all execute the same binary, i.e. gawk. Only mawk is different. So on that machine, AWK would be gawk and NAWK would be nawk, but both will execute gawk.
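
A quick way to check this on any machine is to follow the symlink chain for each name. This is only a sketch: the helper name `resolve_tool` is made up, and it assumes a `readlink` that supports `-f` (as in GNU coreutils).

```shell
# Hypothetical helper: print the real file that a command name resolves to,
# following any chain of symlinks (assumes `readlink -f` is available).
resolve_tool() {
  tool_path=$(command -v "$1") || return 1
  readlink -f "$tool_path"
}

# Report what each awk name actually executes on this machine.
for name in awk nawk gawk mawk; do
  if target=$(resolve_tool "$name"); then
    printf '%s -> %s\n' "$name" "$target"
  else
    printf '%s: not installed\n' "$name"
  fi
done
```

Names that print the same target are the same binary under different names.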

I propose that we remove NAWK and only use AWK, but stop using AC_PROG_AWK and instead define the search order ourselves, so that it is transparent to us. I recommend [gawk nawk awk], since on Linux systems nawk (as we've seen) is likely to be gawk in disguise anyway, so it's better to be clear about that.

This reasoning assumes that the awk scripts we write are portable enough to be executed by any awk. If we run into any problem with this, we might have to restrict the variety of awks we support.
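
One way to test that assumption is to run the same awk program under every implementation that happens to be installed and compare the output. The following is a rough sketch only; `check_awk_portability` is a made-up name, and nothing like it is part of this change.

```shell
# Hypothetical check: run one awk program under each installed
# implementation and report any whose output differs from the expected text.
check_awk_portability() {
  program=$1
  expected=$2
  status=0
  for impl in gawk nawk mawk awk; do
    # Skip implementations that are not installed.
    command -v "$impl" >/dev/null 2>&1 || continue
    actual=$("$impl" "$program" 2>/dev/null) || actual="<failed>"
    if [ "$actual" != "$expected" ]; then
      printf '%s: expected "%s", got "%s"\n' "$impl" "$expected" "$actual" >&2
      status=1
    fi
  done
  return "$status"
}

# Example: a small program using split(), checked against every installed awk.
if check_awk_portability 'BEGIN { n = split("a:b:c", parts, ":"); print n, parts[2] }' '3 b'; then
  echo "all installed awks agree"
else
  echo "at least one awk disagrees"
fi
```

A check like this only covers the constructs it exercises, so it would need one sample program per awk feature we rely on.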

To make this work properly, I also needed to get rid of the awk launched by fixpath in CompileCommand. (This only worked since AWK was not resolved to a full path by `AC_PROG_AWK`, but was only `awk` (or whatever). Otherwise this could not work with fixpath, so it was very much a hack to begin with...)

-------------

Commit messages:
 - Remove instance of awk in CompileCommand
 - 8256541: Sort out what version of awk is used in the build system

Changes: https://git.openjdk.java.net/jdk/pull/1470/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1470&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8256541
  Stats: 35 lines in 15 files changed: 4 ins; 9 del; 22 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1470.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1470/head:pull/1470

PR: https://git.openjdk.java.net/jdk/pull/1470


