[Request for Comments] File extension redux

Brian Burkhalter brian.burkhalter at oracle.com
Wed Oct 11 15:54:44 UTC 2023


Hi Roger,

Thanks for the detailed, prompt response. It appears that what I have been thinking is pretty much in line with what you wrote. Specifically I have been considering the following proposal:


  *   In the common case, the extension is the sequence of characters after the last period character in the file name.
  *   The extension does not include the period character.
  *   An indeterminate extension is represented by the empty string “”; no nulls.
  *   Three methods could be added to Path:
     *   String getExtension()
     *   Path removeExtension()
     *   Path addExtension(String)
  *   For a given Path p this invariant would be satisfied:
     *   p.equals(p.removeExtension().addExtension(p.getExtension())) == true

The foregoing approach would, I think, obviate the need to directly handle the period character.
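To make the proposal above concrete, here is a minimal sketch of the three operations as static helpers rather than as actual Path methods. The class name PathExtensions is hypothetical. The treatment of a leading dot (".bashrc") and of a trailing dot ("foo.") as having no extension is an assumption chosen so that the invariant holds; none of this is an existing JDK API.

```java
import java.nio.file.Path;

/** Hypothetical helpers sketching the proposed getExtension/removeExtension/addExtension. */
public final class PathExtensions {
    private PathExtensions() {}

    /** Returns the characters after the last period of the file name, or "" if indeterminate. */
    static String getExtension(Path path) {
        Path fileName = path.getFileName();
        if (fileName == null) {
            return ""; // e.g. a root path has no file name
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        // Assumption: no dot, a leading dot only, or a trailing dot all mean "no extension".
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1);
    }

    /** Returns the path without its extension and separating period; unchanged if there is none. */
    static Path removeExtension(Path path) {
        String ext = getExtension(path);
        if (ext.isEmpty()) {
            return path;
        }
        String name = path.getFileName().toString();
        String stem = name.substring(0, name.length() - ext.length() - 1);
        return path.resolveSibling(stem);
    }

    /** Appends a period and the given extension; an empty extension leaves the path unchanged. */
    static Path addExtension(Path path, String extension) {
        Path fileName = path.getFileName();
        if (fileName == null || extension.isEmpty()) {
            return path;
        }
        return path.resolveSibling(fileName.toString() + "." + extension);
    }
}
```

With these definitions, a caller never touches the period character: for any Path p, addExtension(removeExtension(p), getExtension(p)) is expected to equal p, including for names such as "foo" and "foo.".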

Thanks,

Brian

On Oct 10, 2023, at 2:31 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:

Hi Brian,

First the intuition and then the rationale.

Intuition: the extension does not include the period '.'. This comes from years of working with existing systems, both shells and programming languages with file APIs.

Second: Trying to make Path.getExtension() handle every possible use case makes some of them more difficult. In particular, removing or replacing the extension is easier to do with a method for that purpose. It is easier to explain how to use a Path.removeExtension() method than to explain how to convert the path to a string, remove some number of characters from the end, and recreate the Path.

Rationale:

  *   Other Java libraries that provide path name utilities (Guava and Apache Commons) do not include the dot as part of the extension. A mismatch with them (and other common Java libraries) creates friction and room for mistakes in conversions between different representations.
  *   The no-dot representation works just fine except in the case of trailing dot, which is quite uncommon in practice.
  *   The APIs provided should directly do what the programmer needs to do, rather than be building blocks from which the programmer must assemble the API they need, while still avoiding an explosion of APIs for narrow use cases.
  *   Extensions are used in many ways, from switch statements to keys in maps to simple if/then/else; some are hard coded and others are configured or loaded as plugins. The semantics should be simple and direct.

For 1) and 3) I would say that both "foo" and "foo." have an empty ("") extension.
Providing a removeExtension() or replaceExtension(String) method can embed the edge case handling and make the API easy to use without having to burden the Path.getExtension() method with a model that includes that complexity.
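As an illustration of the "simple and direct" usage described above, the following sketch dispatches on a dot-free extension in a switch statement. The class ExtensionDispatch, the helper extensionOf, and the category strings are hypothetical. Lower-casing the extension and treating a leading dot as no extension are assumptions, not part of any proposal in this thread.

```java
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical example of switching on an extension that excludes the period. */
public final class ExtensionDispatch {
    private ExtensionDispatch() {}

    /** Extension without the dot; "" for both "foo" and "foo." (and, by assumption, ".bashrc"). */
    static String extensionOf(Path path) {
        Path fileName = path.getFileName();
        if (fileName == null) {
            return "";
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        return (dot <= 0 || dot == name.length() - 1) ? "" : name.substring(dot + 1);
    }

    /** Maps a path to a coarse category based on its extension. */
    static String describe(Path path) {
        // Assumption: compare case-insensitively so "JPG" and "jpg" are treated alike.
        switch (extensionOf(path).toLowerCase(Locale.ROOT)) {
            case "jpg":
            case "png":
                return "image";
            case "txt":
                return "text";
            case "":
                return "no extension"; // covers "foo" and "foo." uniformly
            default:
                return "unknown";
        }
    }
}
```

Because the returned value never carries the period, the case labels read as plain extension names, and the empty string is the single label needed for both degenerate forms.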

Regards, Roger


On 10/10/23 2:31 PM, Brian Burkhalter wrote:
I would like to resurrect this topic which has languished for quite a while now. It was discussed for probably decades before my involvement with it. A brief outline of its most recent history is included below. The essence is that it would be good to add a java.nio.file.Path.getExtension method, and possibly one or more companion methods. General comments on the topic are welcome regardless of whether the included chronology is reviewed. Things to consider include:

1) whether a getExtension method ever returns null;
2) whether a period character is included as the first character of the returned extension;
3) whether the extension of “foo” (no dot) differs from that of “foo.” (terminal dot).

Thanks,

Brian

--- snip --- TL;DR --- snip ---

[1] Feb. 2018
Path.hasExtension was initially proposed but dropped in favor of Path.getExtension. This method was for the most common cases equivalent to

Path path = ...;
String name = path.getFileName().toString();
String extension = name.substring(name.lastIndexOf('.') + 1);

which is to say the extension is the portion of the file name string after but not including the last period character (dot). This effort faded out.

[2] Apr. 2021 - Mar. 2022
A pull request was proposed and culminated in the proposal of the Path method

String getExtension(String defaultExtension)

where in the common case the extension is once again the portion of the file name string after the last dot. For degenerate cases the “defaultExtension” parameter would be returned.  This effort was superseded by [3].

[3] Mar. 2022 - Dec. 2022

This pull request specified the Path method

String getExtension()

which in the common case returned the portion of the file name string after the last dot, and in unusual cases either the empty string (terminal dot) or null (no dot). For a time this proposal used Optional but that was dropped as unwieldy. This PR was approved and integrated.

[4] Dec. 2022

Due primarily to the undesirability of the possibility of a null return value and the lateness of the integration in the development cycle, the API added in [3] was backed out.

[5] Dec. 2022

A new issue was filed to track the resumption of eventual work on this topic. This issue suggested that the last dot be included in the extension, e.g., the extension of “image.jpg” would be “.jpg” rather than “jpg” and that the value returned by getExtension would never be null. It also proposed a complementary method Path.removeExtension which, together with Path.getExtension, would form an invariant

Path path = ...;
// String concatenation of the stripped path and the dot-inclusive extension
assert path.toString().equals(path.removeExtension() + path.getExtension());

which would obviate the need to handle a period character explicitly.

[1] https://mail.openjdk.org/pipermail/nio-dev/2018-February/004716.html
[2] https://github.com/openjdk/jdk/pull/2319
[3] https://github.com/openjdk/jdk/pull/8066
[4] https://github.com/openjdk/jdk/pull/11566
[5] https://bugs.openjdk.org/browse/JDK-8298318

