RFR: 8256541: Sort out what version of awk is used in the build system

Erik Joelsson erikj at openjdk.java.net
Mon Nov 30 15:35:56 UTC 2020


On Thu, 26 Nov 2020 20:58:38 GMT, Magnus Ihse Bursie <ihse at openjdk.org> wrote:

> For historical reasons, there exist a variety of different implementations of awk: awk (the original implementation), gawk (the GNU version), nawk (new awk, iirc) and the lesser-known mawk. 
> 
> Things are complicated by the fact that the original awk is seldom used, but instead gawk or nawk is typically symlinked to be named "awk". 
> 
> In terms of functionality there are very few differences. The original awk is the most limited, while nawk and gawk are mostly interchangeable. 
> 
> So the situation is somewhat messy to begin with, but we impressively manage to mess it up even further. :-) 
> 
> We set up the following definitions: 
> `BASIC_REQUIRE_PROGS(NAWK, [nawk gawk awk])`
> `BASIC_REQUIRE_SPECIAL(AWK, [AC_PROG_AWK])`
> and `AC_PROG_AWK`, according to the documentation, "[c]heck for gawk, mawk, nawk, and awk, in that order". 
> 
> So, if you have nawk and awk (but no other) installed, both NAWK and AWK will be set to nawk. If you have only awk, both will be set to awk. The difference shows up if you also have gawk installed: then NAWK will be nawk and AWK will be gawk. 
> 
> As an example, on my mac, I only have the original awk, so both AWK and NAWK will be awk. 
> 
> On my Ubuntu box, things are even more confused. I have: 
> $ ls -l /usr/bin/*awk 
> lrwxrwxrwx 1 root root 21 Feb 6 10:36 awk -> /etc/alternatives/awk* 
> -rwxr-xr-x 1 root root 658072 Feb 11 2018 gawk* 
> -rwxr-xr-x 1 root root 3189 Feb 11 2018 igawk* 
> -rwxr-xr-x 1 root root 125416 Apr 3 2018 mawk* 
> lrwxrwxrwx 1 root root 22 Feb 6 10:37 nawk -> /etc/alternatives/nawk* 
> 
> $ ls -l /etc/alternatives/*awk 
> lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/awk -> /usr/bin/gawk* 
> lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/nawk -> /usr/bin/gawk* 
> 
> So awk, nawk and gawk all execute the same binary, i.e. gawk. Only mawk is different. So on that machine, AWK would be gawk and NAWK would be nawk, but both will execute gawk. 
> 
> I propose that we remove NAWK and only use AWK, but that we stop using AC_PROG_AWK and instead define the search order ourselves, so that it is transparent to us. I recommend [gawk nawk awk], since on Linux systems nawk (as we've seen) is likely to be gawk in disguise anyway, so it's better to be clear about that. 
> 
> This reasoning assumes that the awk scripts we write are portable enough to be executed by any awk. If we run into any problem with this, we might have to restrict the set of awk variants we support.
> 
> To make this work properly, I also needed to get rid of the awk launched by fixpath in CompileCommand. (This only worked because `AC_PROG_AWK` did not resolve AWK to a full path, leaving it as just `awk` (or whatever). Otherwise it could not have worked with fixpath, so it was very much a hack to begin with...)
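
The first-match behaviour described in the quoted message can be sketched in plain shell. This is only an illustration of how the three search orders resolve, not the actual configure logic: `first_prog` is a hypothetical helper, and the variable names `AWK_OLD` and `AWK_NEW` are made up for the comparison.

```shell
#!/bin/sh
# Hypothetical helper (not part of the JDK build system): print the first
# candidate name that resolves to a command on PATH, i.e. a first-match search.
first_prog() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Search orders taken from the quoted message.
NAWK=$(first_prog nawk gawk awk) || NAWK=          # BASIC_REQUIRE_PROGS(NAWK, ...)
AWK_OLD=$(first_prog gawk mawk nawk awk) || AWK_OLD=   # documented AC_PROG_AWK order
AWK_NEW=$(first_prog gawk nawk awk) || AWK_NEW=        # proposed explicit order

echo "NAWK=$NAWK"
echo "AWK (AC_PROG_AWK order)=$AWK_OLD"
echo "AWK (proposed order)=$AWK_NEW"
```

Running this on a given machine shows which name each order picks first. It does not follow symlinks, so on a setup like the Ubuntu one above, nawk and gawk would still be reported as different names even though they execute the same binary.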

Marked as reviewed by erikj (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/1470
