RFR: 8173970: jar tool should have a way to extract to a directory

Jaikiran Pai jai.forums2013 at gmail.com
Sun Feb 28 04:19:37 UTC 2021


Hello Alan,

On 27/02/21 2:23 pm, Alan Bateman wrote:
>
> Yes, the option name will need to be agreed. It would be useful to 
> enumerate the options that the other tools are using to specify the 
> location where to extract. If you see JBS issues mentioning tar -C not 
> supporting chdir when extracting then it might be Solaris tar, which 
> isn't the same as GNU tar which has different options. It might be 
> better to look at more mainstream tools, like unzip, although jar -d 
> is already taken. It would be nice if there were some consistency with 
> other tools in the JDK that do extraction (the jmod and jimage 
> extract commands use --dir, for example).

I had a look at both the tar and unzip commands on the MacOS and Linux 
(CentOS) setups that I had access to.

--------------
tar on MacOS:
--------------

tar --version
bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6

This version of the tool has:

-C directory, --cd directory, --directory directory
              In c and r mode, this changes the directory before adding
              the following files. In x mode, change directories after
              opening the archive but before extracting entries from the
              archive.

A command like "tar -xzf foo.tar.gz -C /tmp/bar/" works fine: it 
extracts foo.tar.gz, located in the current directory, into the target 
directory /tmp/bar/

--------------
tar on CentOS:
--------------

tar --version
tar (GNU tar) 1.26

This version of the tool has:

Common options:
        -C, --directory=DIR
               change to directory DIR

Although the wording doesn't make it clear that, when used with -x, tar 
extracts into the directory specified by -C, it does indeed behave that 
way.

Specifically, a command like "tar -xzf foo.tar.gz -C /tmp/bar/" works 
fine: it extracts foo.tar.gz, located in the current directory, into the 
target directory /tmp/bar/
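
To make the ordering of -C with -x concrete, here is a minimal Python 
sketch of what the bsdtar man page describes: open the archive first, 
change directory, then extract. It is only an illustration using 
Python's tarfile module, not how tar itself is implemented, and the 
function name extract_like_tar_C is made up for this example.

```python
import os
import tarfile


def extract_like_tar_C(archive, directory):
    """Open the archive, chdir into `directory`, then extract into it."""
    previous = os.getcwd()
    # The archive path is resolved against the original working directory.
    with tarfile.open(archive) as tar:
        # Like the tar behaviour described here, this fails if the
        # directory does not exist; nothing is created on the fly.
        os.chdir(directory)
        try:
            # Use the safer "data" filter where this Python version has it.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tar.extractall(**kwargs)
        finally:
            os.chdir(previous)
```

Because the archive is opened before the directory change, a relative 
archive name such as foo.tar.gz still refers to the file in the 
directory the command was started from.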

-------------------------------
unzip on both MacOS and CentOS:
-------------------------------

unzip -v
UnZip 6.00 of 20 April 2009, by Info-ZIP.  Maintained by C. Spieler.

This version of the tool has:

[-d exdir]
               An optional directory to which to extract files.  By default,
               all files and subdirectories are recreated in the current
               directory; the -d option allows extraction in an arbitrary
               directory (always assuming one has permission to write to the
               directory).  This option need not appear at the end of the
               command line; it is also accepted before the zipfile
               specification (with the normal options), immediately after the
               zipfile specification, or between the file(s) and the -x
               option.  The option and directory may be concatenated without
               any white space between them, but note that this may cause
               normal shell behavior to be suppressed.  In particular,
               ``-d ~'' (tilde) is expanded by Unix C shells into the name of
               the user's home directory, but ``-d~'' is treated as a literal
               subdirectory ``~'' of the current directory.

A command like "unzip foo.zip -d /tmp/bar/" works fine: it extracts 
foo.zip, located in the current directory, into /tmp/bar/

---------------
jimage and jmod
---------------

The jimage and jmod commands, as you note, use the --dir option to 
extract to a specified directory.


Those were the tools I looked at. I think using the -d option with -x 
for the jar command is ruled out since it is already used for a 
different purpose, albeit by a different "main" operation of the jar 
command.

As for using --dir for this new feature, I don't think it alone will be 
enough. Specifically, I couldn't find a "short form" option for the 
--dir option in the jimage or jmod commands. For the jar extract feature 
that we are discussing here, I think having a short form option (in 
addition to the longer form) is necessary so that it matches the usage 
expectations of other similar options that the jar command exposes. So 
even if we do choose --dir as the long form option, we would still need 
a short form for it and since -d is already taken for something else, we 
would still need to come up with a different one. The short form of this 
option could be -C (see below).


I think reusing the -C option for this new feature would be a good 
thing. The -C option is currently used by the update and create "main" 
operations of the jar command, and the man page for this option states:

-C dir
               When creating (c) or updating (u) a JAR file, this option
               temporarily changes the directory while processing files
               specified by the file operands. Its operation is intended to be
               similar to the -C option of the UNIX tar utility. For example,
               the following command changes to the classes directory and adds
               the Bar.class file from that directory to my.jar:

               jar uf my.jar -C classes Bar.class
....

Using the -C option would indeed align it with the tar command. For the 
"long form" of this option, the tar command (both on MacOS and CentOS) 
uses --directory. For this jar extract feature though, we could perhaps 
just use --dir to have it align with the jimage and the jmod tools.

So I think the combination of -C (short form) and --dir (long form) 
would perhaps be suitable for this feature.
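
Just to illustrate the shape of that combination, here is a minimal 
Python sketch in which a short and a long spelling share one 
destination. This is not the jar tool's option parser; the program name 
"jar-sketch", the destination name target_dir and the default of "." are 
all assumptions made for the example.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the proposed option shape only.
    parser = argparse.ArgumentParser(prog="jar-sketch")
    parser.add_argument("-x", "--extract", action="store_true")
    parser.add_argument("-f", "--file", required=True)
    # -C (short form) and --dir (long form) set the same value; when
    # neither is given, extraction targets the current directory.
    parser.add_argument("-C", "--dir", dest="target_dir", default=".")
    return parser


args = build_parser().parse_args(["-x", "-f", "foo.jar", "-C", "/tmp/bar"])
print(args.target_dir)
```

Either spelling ends up in the same place, so users coming from tar and 
users coming from jimage or jmod would both find an option they 
recognise.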


>
> There are other discussion points around the behavior when the target 
> directory exists or does not exist, to ensure there is some 
> consistency with main stream tools.

I'm guessing you mean the behaviour of creating a directory (or a 
hierarchy of directories) if the target directory is not present? My 
testing with the tar tool (both on MacOS and CentOS) shows that if the 
specified target directory doesn't exist, then the extract fails. The 
tar extract command doesn't create the target directory during extract. 
On the other hand, the unzip tool does create the directory if it 
doesn't exist. Interestingly, however, unzip creates only one level of 
that directory. Specifically, if you specify:

unzip foo.zip -d /tmp/blah/

and if "blah/" doesn't exist inside /tmp/, then unzip creates the 
"blah/" directory inside /tmp/ and extracts the contents of the zip 
into it.

However,

unzip foo.zip -d /tmp/blah/hello/

and if "blah/" isn't a directory inside /tmp/, then this command fails 
with an error and doesn't create the target directory hierarchy.

As for the jimage and jmod commands, both create the entire directory 
hierarchy if the target directory specified with --dir during extract 
doesn't exist. So a command like:

jimage extract --dir /tmp/blah/foo/bar/ jdkmodules

will create the blah/foo/bar/ directory hierarchy if blah doesn't exist 
in /tmp/, while extracting the "jdkmodules" image.
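
The three behaviours I observed can be summarised in a small Python 
sketch. The function prepare_target and the policy names are made up for 
this comparison; they only model what I saw in my testing and are not 
taken from any of these tools' sources.

```python
import os


def prepare_target(path, policy):
    """Prepare an extraction target directory under one of three policies.

    "must-exist": fail if the directory is missing (tar -C)
    "one-level":  create only the last path component (unzip -d)
    "hierarchy":  create all missing parent directories (jimage/jmod --dir)
    """
    if os.path.isdir(path):
        return path
    if policy == "must-exist":
        raise FileNotFoundError(path)
    if policy == "one-level":
        # os.mkdir raises FileNotFoundError when the parent is missing.
        os.mkdir(path)
    elif policy == "hierarchy":
        os.makedirs(path)
    else:
        raise ValueError("unknown policy: " + policy)
    return path
```

With "one-level", /tmp/blah/ is created when /tmp/ exists, but 
/tmp/blah/hello/ fails when blah/ is missing. With "hierarchy", both 
succeed.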

From the user's point of view, I think this behaviour of creating the 
directories if the target directory doesn't exist is probably the most 
intuitive and useful. If we did decide to use this approach for the new 
jar extract option, it would align with what we already do in the 
jimage and jmod commands.

One other minor detail, while we are at it: IMO we should let the jar 
extract command continue to behave the way it currently does when it 
comes to overwriting existing files. If the jar being extracted contains 
an entry with the same name as a file that already exists in the target 
directory (hierarchy), then it should continue to overwrite that file. 
In other words, I don't think we should change the current behaviour of 
the jar extract command, which overwrites existing files when extracting.
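
Putting the two points together, the behaviour I have in mind looks 
roughly like the following Python sketch. It uses Python's zipfile 
module only to model the semantics (a jar is a zip file); it is not the 
jar tool's implementation, and the function name extract_jar is 
hypothetical.

```python
import os
import zipfile


def extract_jar(jar_path, target_dir):
    """Extract every entry of jar_path into target_dir.

    Missing directories in target_dir are created, and files that
    already exist under target_dir are overwritten.
    """
    # Create the whole target hierarchy if needed (jimage/jmod style).
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(jar_path) as jar:
        # extractall replaces existing files that have the same name.
        jar.extractall(target_dir)
```

Running this twice against the same target directory leaves the files 
from the jar in place, replacing whatever was there under the same 
names.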


-Jaikiran



